SELECTION BY ESSENTIAL-GENE KNOCK-IN

Document Type and Number:

WIPO Patent Application WO/2021/226151

Abstract:

Strategies, systems, compositions, and methods for efficient production of knock-in cellular clones without reporter genes. An essential gene is targeted using a knock-in cassette that comprises an exogenous coding sequence for a gene product of interest (or "cargo sequence") in frame with and downstream (3') of an exogenous coding sequence or partial coding sequence of the essential gene. Undesired targeting events create a non-functional version of the essential gene, in essence a knock-out, which is "rescued" by correct integration of the knock-in cassette, which restores the essential gene coding region so that a functional gene product is produced and positions the cargo sequence in frame with and downstream of the essential gene coding sequence.

Inventors:

ZURIS JOHN ANTHONY (US)
MARGULIES CARRIE MARIE (US)
SOH CHEW-LI (US)
TONGE PETER (US)
TOMISHIMA MARK JAMES (US)
MCAULIFFE CONOR BRIAN (US)
MONETTI CLAUDIO (US)

Application Number:

PCT/US2021/030744

Publication Date:

November 11, 2021

Filing Date:

May 04, 2021

Assignee:

EDITAS MEDICINE INC (US)
BLUEROCK THERAPEUTICS LP (US)

International Classes:

C12N15/11

Attorney, Agent or Firm:

MEDINA, Rolando et al. (US)

Claims:

We claim:

1. A method of editing the genome of a cell, the method comprising contacting the cell with:

(i) a nuclease that causes a break within an endogenous coding sequence of an essential gene in the cell, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell, and

(ii) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3') of an exogenous coding sequence or partial coding sequence of the essential gene, wherein the knock-in cassette is integrated into the genome of the cell by homology-directed repair (HDR) of the break, resulting in a genome-edited cell that expresses:

(a) the gene product of interest, and

(b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof.

2. The method of claim 1, wherein, if the knock-in cassette is not integrated into the genome of the cell by homology-directed repair (HDR) in the correct position or orientation, the cell no longer expresses the gene product encoded by the essential gene, or a functional variant thereof.

3. The method of claim 1 or 2, wherein the break is a double-strand break.

4. The method of any one of claims 1-3, wherein the break is located within the last 1000, 500, 400, 300, 200, 100, or 50 base pairs of the endogenous coding sequence of the essential gene.
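As an illustration of the positional limits recited in claim 4, the following minimal sketch reports which of the recited windows a given break position falls within. The function names, the 0-based coordinate convention, and the reading of "within the last N base pairs" as "at most N coding bases lie 3' of the break" are assumptions made for this example and are not taken from the application.

```python
# Windows recited in claim 4, in base pairs counted from the 3' end of the CDS.
CLAIMED_WINDOWS_BP = (1000, 500, 400, 300, 200, 100, 50)


def distance_from_cds_end(cds_length: int, cut_index: int) -> int:
    """Return the number of coding bases lying 3' of the break.

    cut_index is the 0-based CDS index of the first base 3' of the break
    (an assumed convention for this sketch).
    """
    if not 0 <= cut_index <= cds_length:
        raise ValueError("cut_index must lie within the coding sequence")
    return cds_length - cut_index


def satisfied_windows(cds_length: int, cut_index: int) -> list[int]:
    """List every recited window that contains the break."""
    distance = distance_from_cds_end(cds_length, cut_index)
    return [w for w in CLAIMED_WINDOWS_BP if distance <= w]


# Hypothetical 1008 bp coding sequence cut 58 bp upstream of its end.
print(satisfied_windows(1008, 950))
```

In this hypothetical case the break satisfies every recited window except the 50 base pair window.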

5. The method of any one of claims 1-3, wherein the break is located within the last exon of the essential gene.

6. The method of any one of claims 1-5, wherein the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the cell with a guide molecule for the CRISPR/Cas nuclease.

7. The method of any one of claims 1-5, wherein the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease.

8. The method of any one of claims 1-7, wherein the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded.

9. The method of claim 8, wherein the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

10. The method of any one of claims 1-9, wherein the donor template comprises homology arms on either side of the knock-in cassette.

11. The method of claim 10, wherein the homology arms correspond to sequences located on either side of the break in the genome of the cell.
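The donor layout of claims 10 and 11 can be sketched on plain strings as follows. This is a simplified model under stated assumptions: the genome is a single string, the break is a 0-based index, and HDR is modeled as a simple insertion of the cassette at the break. The helper names `build_donor` and `hdr_product` and the toy sequences are hypothetical.

```python
def build_donor(genome: str, cut: int, arm_len: int, cassette: str) -> str:
    """Flank the cassette with homology arms copied from either side of the break."""
    if cut - arm_len < 0 or cut + arm_len > len(genome):
        raise ValueError("homology arms would run past the ends of the sequence")
    left_arm = genome[cut - arm_len:cut]    # sequence immediately 5' of the break
    right_arm = genome[cut:cut + arm_len]   # sequence immediately 3' of the break
    return left_arm + cassette + right_arm


def hdr_product(genome: str, cut: int, cassette: str) -> str:
    """Model a correct integration as insertion of the cassette at the break."""
    return genome[:cut] + cassette + genome[cut:]


toy_genome = "AAAACCCCGGGGTTTT"   # placeholder sequence, not a real locus
toy_cassette = "ATGCAT"           # placeholder cassette
donor = build_donor(toy_genome, cut=8, arm_len=4, cassette=toy_cassette)
edited = hdr_product(toy_genome, cut=8, cassette=toy_cassette)
print(donor)
print(edited)
```

Because the arms are copied from the sequence flanking the break, the full donor appears as a contiguous stretch of the modeled repair product.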

12. The method of any one of claims 1-11, wherein the knock-in cassette comprises a regulatory element that enables expression of the gene product encoded by the essential gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product.

13. The method of claim 12, wherein the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the essential gene and the exogenous coding sequence for the gene product of interest.

14. The method of claim 13, wherein the 2A element is a T2A element (EGRGSLLTCGDVEENPGP), a P2A element (ATNFSLLKQAGDVEENPGP), an E2A element (QCTNYALLKLAGDVESNPGP), or an F2A element (VKQTLNFDLLKLAGDVESNPGP).

15. The method of claim 13 or 14, wherein the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element.

16. The method of claim 15, wherein the linker peptide comprises the amino acid sequence GSG.
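The arrangement recited in claims 13 to 16 (essential-gene coding sequence, GSG linker, 2A element, cargo) can be sketched at the peptide level as follows. The 2A sequences are those recited in claim 14. The essential-gene fragment and cargo strings are hypothetical placeholders, and the split point reflects the generally described behavior of 2A peptides, in which ribosomal skipping occurs between the final glycine and proline of the element.

```python
# 2A peptide sequences as recited in claim 14.
TWO_A = {
    "T2A": "EGRGSLLTCGDVEENPGP",
    "P2A": "ATNFSLLKQAGDVEENPGP",
    "E2A": "QCTNYALLKLAGDVESNPGP",
    "F2A": "VKQTLNFDLLKLAGDVESNPGP",
}
LINKER = "GSG"  # linker of claim 16, placed upstream of the 2A element


def fusion_orf(essential_cterm: str, cargo: str, element: str = "P2A") -> str:
    """Concatenate essential fragment, linker, 2A element and cargo in one frame."""
    return essential_cterm + LINKER + TWO_A[element] + cargo


def split_at_2a(fusion: str, element: str = "P2A") -> tuple[str, str]:
    """Split the translated fusion between the final G and P of the 2A element."""
    motif = TWO_A[element]
    start = fusion.index(motif)
    cut = start + len(motif) - 1  # the last residue (P) begins the downstream product
    return fusion[:cut], fusion[cut:]


# Placeholder peptides, used only for illustration.
upstream, downstream = split_at_2a(fusion_orf("MKAAE", "MVSK"))
print(upstream)
print(downstream)
```

In this model the essential gene product retains the linker and most of the 2A element as a C-terminal extension, while the cargo begins with the terminal proline of the element.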

17. The method of any one of claims 1-16, wherein the knock-in cassette comprises a polyadenylation sequence, and optionally a 3' UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, wherein, if a 3'UTR sequence is present, the 3'UTR sequence is positioned 3' of the exogenous coding sequence and 5' of the polyadenylation sequence.

18. The method of any one of claims 1-17, wherein the exogenous partial coding sequence of the essential gene in the knock-in cassette encodes a C-terminal fragment of a protein encoded by the essential gene.

19. The method of claim 18, wherein the C-terminal fragment is less than 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length.

20. The method of claim 18 or 19, wherein the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the essential gene that spans the break.

21. The method of any one of claims 1-20, wherein the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the essential gene of the cell.

22. The method of claim 21, wherein the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the essential gene of the cell to prevent further binding of the nuclease to the target site, to reduce the likelihood of recombination after integration of the knock-in cassette into the genome of the cell, and/or to increase expression of the gene product of the essential gene and/or the gene product of interest after integration of the knock-in cassette into the genome of the cell.
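One purpose recited in claim 22, preventing the nuclease from binding the target site again after integration, can be illustrated with a minimal synonymous-recoding sketch. It assumes the standard genetic code, an input already in reading frame, and a target site checked on the given strand only. It does not model codon usage, PAM sequences, or the opposite strand, and the sequences and function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str) -> str:
    """Swap each codon for a different synonymous codon where one exists."""
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        synonyms = [c for c, a in CODON_TABLE.items()
                    if a == CODON_TABLE[codon] and c != codon]
        out.append(synonyms[0] if synonyms else codon)  # Met and Trp stay as they are
    return "".join(out)


endogenous = "GCTGAAGGTAAATGG"   # placeholder fragment encoding A-E-G-K-W
target_site = "GAAGGTAAA"        # placeholder nuclease target within the fragment
recoded = recode(endogenous)
print(recoded, translate(recoded), target_site in recoded)
```

The recoded fragment encodes the same peptide as the endogenous fragment, is less than 100% identical to it at the nucleotide level, and no longer contains the placeholder target site.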

23. The method of any one of claims 1-22, wherein the essential gene is a housekeeping gene, e.g., a gene listed in Table 3.

24. The method of any one of claims 1-22, wherein the cell is an iPS cell or ES cell and the essential gene is involved in differentiation of iPS or ES cells or expansion of iPS- or ES-derived cells, e.g., a gene listed in Table 4.

25. The method of claim 24, wherein the iPS-derived cells are iPS-derived NK cells or iPS-derived T cells.

26. The method of any one of claims 1-25, wherein the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

27. The method of any one of claims 1-26, wherein the gene product of interest is a chimeric antigen receptor (CAR), a non-naturally occurring variant of FcγRIII (CD16), an interleukin (e.g., interleukin 15 (IL-15), interleukin 15 receptor (IL-15R) or a variant thereof, interleukin 12 (IL-12), interleukin-12 receptor (IL-12R) or a variant thereof), a human leukocyte antigen (e.g., human leukocyte antigen G (HLA-G), human leukocyte antigen E (HLA-E)), leukocyte surface antigen cluster of differentiation CD47 (CD47), or any combination of two or more thereof.

28. A genetically modified cell comprising a genome with an exogenous coding sequence for a gene product of interest in frame with and downstream (3') of a coding sequence of an essential gene, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell.

29. An engineered cell comprising a genomic modification, wherein the genomic modification comprises an insertion of an exogenous knock-in cassette within an endogenous coding sequence of an essential gene in the cell’s genome, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell, wherein the knock-in cassette comprises an exogenous coding sequence for a gene product of interest in frame with and downstream (3') of an exogenous coding sequence or partial coding sequence encoding the gene product of the essential gene, or a functional variant thereof, and wherein the cell expresses the gene product of interest and the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof, optionally wherein the gene product of interest and the gene product encoded by the essential gene are expressed from the endogenous promoter of the essential gene.

30. The cell of claim 28 or 29, wherein the cell’s genome comprises a regulatory element that enables expression of the gene product encoded by the essential gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product.

31. The cell of claim 30, wherein the cell’s genome comprises an IRES or 2A element located between the coding sequence of the essential gene and the exogenous coding sequence for the gene product of interest.

32. The cell of any one of claims 28-31, wherein the cell’s genome comprises a polyadenylation sequence, and optionally a 3' UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, wherein, if a 3'UTR sequence is present, the 3'UTR sequence is positioned 3' of the exogenous coding sequence and 5' of the polyadenylation sequence.

33. The cell of any one of claims 28-32, wherein the coding sequence of the essential gene is less than 100% identical to an endogenous coding sequence of the essential gene.

34. The cell of any one of claims 28-33, wherein the essential gene is a housekeeping gene, e.g., a gene listed in Table 3.

35. The cell of any one of claims 28-33, wherein the cell is an iPS cell or ES cell and the essential gene is involved in differentiation of iPS or ES cells or expansion of iPS- or ES-derived cells, e.g., a gene listed in Table 4.

36. The cell of claim 35, wherein the iPS-derived cells are iPS-derived NK cells or iPS-derived T cells.

37. The cell of any one of claims 28-36, wherein the cell’s genome does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

38. The cell of any one of claims 28-37, wherein the gene product of interest is a chimeric antigen receptor (CAR), a non-naturally occurring variant of FcγRIII (CD16), interleukin 15 (IL-15), interleukin 15 receptor (IL-15R) or a variant thereof, interleukin 12 (IL-12), interleukin-12 receptor (IL-12R) or a variant thereof, human leukocyte antigen G (HLA-G), human leukocyte antigen E (HLA-E), leukocyte surface antigen cluster of differentiation CD47 (CD47), or any combination of two or more thereof.

39. The cell of any one of claims 28-38, for use as a medicament.

40. The cell of any one of claims 28-38, for use in the treatment of a disease, disorder, or condition, e.g., a cancer.

41. A cell, or population of cells, produced by the method of any one of claims 1-27 or progeny thereof.

42. A system for editing the genome of a cell, the system comprising the cell, a nuclease that causes a break within an endogenous coding sequence of an essential gene of the cell, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell, and a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3') of an exogenous coding sequence or partial coding sequence of the essential gene.

43. The system of claim 42, wherein the break is a double-strand break.

44. The system of claim 42 or 43, wherein the break is located within the last 1000, 500, 400, 300, 200, 100 or 50 base pairs of the coding sequence of the essential gene.

45. The system of any one of claims 42-44, wherein the break is located within the last exon of the essential gene.

46. The system of any one of claims 42-45, wherein the nuclease is a CRISPR/Cas nuclease and the system further comprises a guide molecule for the CRISPR/Cas nuclease.

47. The system of any one of claims 42-45, wherein the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease.

48. The system of any one of claims 42-47, wherein the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded.

49. The system of claim 48, wherein the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

50. The system of any one of claims 42-49, wherein the donor template comprises homology arms on either side of the knock-in cassette.

51. The system of claim 50, wherein the homology arms correspond to sequences located on either side of the break in the genome of the cell.

52. The system of any one of claims 42-51, wherein the knock-in cassette comprises a regulatory element that enables expression of the gene product encoded by the essential gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product.

53. The system of claim 52, wherein the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the essential gene and the exogenous coding sequence for the gene product of interest.

54. The system of any one of claims 42-53, wherein the knock-in cassette comprises a polyadenylation sequence, and optionally a 3' UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, wherein, if a 3'UTR sequence is present, the 3'UTR sequence is positioned 3' of the exogenous coding sequence and 5' of the polyadenylation sequence.

55. The system of any one of claims 42-54, wherein the exogenous partial coding sequence of the essential gene in the knock-in cassette encodes a C-terminal fragment of a protein encoded by the essential gene.

56. The system of claim 55, wherein the C-terminal fragment is less than 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length.

57. The system of claim 55 or 56, wherein the C-terminal fragment includes an amino acid sequence that is encoded by a region of the coding sequence of the essential gene that spans the break.

58. The system of any one of claims 42-57, wherein the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the essential gene of the cell.

59. The system of claim 58, wherein the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the essential gene of the cell to prevent further binding of a nuclease to the target site, to reduce the likelihood of recombination after integration of the knock-in cassette into the genome of the cell, or to increase expression of the gene product of the essential gene and/or the gene product of interest after integration of the knock-in cassette into the genome of the cell.

60. The system of claim 59, wherein the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette does not comprise a target site for the nuclease.

61. The system of any one of claims 42-60, wherein the essential gene is a housekeeping gene, e.g., a gene listed in Table 3.

62. The system of any one of claims 42-61, wherein the cell is an iPS cell or ES cell and the essential gene is involved in differentiation of iPS or ES cells or expansion of iPS- or ES-derived cells, e.g., a gene listed in Table 4.

63. The system of claim 62, wherein the iPS-derived cells are iPS-derived NK cells or iPS-derived T cells.

64. The system of any one of claims 42-63, wherein the donor DNA template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

65. The system of any one of claims 42-64, wherein the gene product of interest is a chimeric antigen receptor (CAR), a non-naturally occurring variant of FcγRIII (CD16), interleukin 15 (IL-15), interleukin 15 receptor (IL-15R) or a variant thereof, interleukin 12 (IL-12), interleukin-12 receptor (IL-12R) or a variant thereof, human leukocyte antigen G (HLA-G), human leukocyte antigen E (HLA-E), leukocyte surface antigen cluster of differentiation CD47 (CD47), or any combination of two or more thereof.

66. A donor template comprising a knock-in cassette with an exogenous coding sequence for a gene product of interest in frame with and downstream (3') of an exogenous coding sequence or partial coding sequence of an essential gene, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell.

67. The donor template of claim 66, for use in editing the genome of a cell by homology-directed repair (HDR).

68. The donor template of claim 66 or 67, wherein the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded.

69. The donor template of claim 68, wherein the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

70. The donor template of any one of claims 66-69, wherein the donor template comprises homology arms on either side of the knock-in cassette.

71. The donor template of any one of claims 66-70, wherein the knock-in cassette comprises a regulatory element that enables expression of the gene product encoded by the essential gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product.

72. The donor template of claim 71, wherein the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the essential gene and the exogenous coding sequence for the gene product of interest.

73. The donor template of any one of claims 66-72, wherein the knock-in cassette comprises a polyadenylation sequence, and optionally a 3' UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, wherein, if a 3'UTR sequence is present, the 3'UTR sequence is positioned 3' of the exogenous coding sequence and 5' of the polyadenylation sequence.

74. The donor template of any one of claims 66-73, wherein the exogenous partial coding sequence of the essential gene in the knock-in cassette encodes a C-terminal fragment of a protein encoded by the endogenous coding sequence of the essential gene.

75. The donor template of claim 74, wherein the C-terminal fragment is less than 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length.

76. The donor template of any one of claims 66-75, wherein the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the essential gene.

77. The donor template of claim 76, wherein the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the essential gene to prevent further binding of a nuclease to the target site, to reduce the likelihood of recombination after integration of the knock-in cassette into a genome of a cell, or to increase expression of the gene product of the essential gene and/or the gene product of interest after integration of the knock-in cassette into a genome of a cell.

78. The donor template of claim 77, wherein the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette does not comprise a target site for a nuclease.

79. The donor template of any one of claims 66-78, wherein the essential gene is a housekeeping gene, e.g., a gene listed in Table 3.

80. The donor template of any one of claims 66-79, wherein the cell is an iPS cell or ES cell and the essential gene is involved in differentiation of iPS or ES cells or expansion of iPS- or ES-derived cells, e.g., a gene listed in Table 4.

81. The donor template of any one of claims 66-80, wherein the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

82. The donor template of any one of claims 66-81, wherein the gene product of interest is a chimeric antigen receptor (CAR), a non-naturally occurring variant of FcγRIII (CD16), interleukin 15 (IL-15), interleukin 15 receptor (IL-15R) or a variant thereof, interleukin 12 (IL-12), interleukin-12 receptor (IL-12R) or a variant thereof, human leukocyte antigen G (HLA-G), human leukocyte antigen E (HLA-E), leukocyte surface antigen cluster of differentiation CD47 (CD47), or any combination of two or more thereof.

83. A method of generating genetically modified mammalian cells comprising a safety switch comprising: providing at least one donor nucleic acid construct comprising a genetic payload comprising at least one necessary component of a safety switch wherein said genetic payload is flanked by a first homologous region (HR) and a second HR, wherein the first and second HRs are essentially homologous to a first genomic region (GR) and a second GR, respectively, wherein the first GR and the second GR are adjacent to and flank a pre-determined genomic position in an exon of an essential gene in a mammalian cell, providing a gene editing system containing a nuclease that is targeted to the pre-determined genomic position, and adding the at least one donor nucleic acid construct and the gene editing system into a population of mammalian cells wherein a plurality of the mammalian cells incorporate the genetic payload at the pre-determined genomic position, wherein a disruption to the essential gene sequence caused by the nuclease is restored upon integration of the HRs and genetic payload.

84. The method of claim 83 wherein each donor nucleic acid construct comprises at least one necessary component of the safety switch.

85. The method of any of claims 83 or 84, wherein each donor nucleic acid construct comprises all of the necessary components of a safety switch.

86. The method of any one of claims 83 - 85 wherein a combination of the donor nucleic acid constructs contains all of the necessary components of a functional safety switch.

87. The method of any one of claims 83 - 86 wherein the necessary components of the safety switch dimerize to produce a functional suicide switch.

88. The method of any one of claims 83 - 87 wherein the genetic payload from a first donor nucleic acid construct is incorporated into a first allele of the essential gene and the genetic payload from a second donor nucleic acid construct is incorporated into a second allele of the essential gene.

89. The method of any one of claims 83 - 88 wherein one or more of the necessary components of the safety switch are incorporated into a first allele of the essential gene and the rest of the necessary components of the safety switch are incorporated into the second allele of the essential gene.

90. The method of any one of claims 83 - 89 wherein activation of the safety switch is triggered by a cellular event, an environmental event or a chemical agent.

91. The method of any of claims 83 - 90 wherein activation of the safety switch induces apoptosis.

92. The method of any one of claims 83 - 91 wherein activation of the safety switch inhibits growth of cells that have incorporated all of the necessary components of the safety switch.

93. A population of cells made by the method of any one of claims 83 - 92.

94. The population of cells of claim 93, wherein the cells are pluripotent stem cells (PSCs).

95. The population of cells of claim 93, wherein the cells are induced pluripotent stem cells (iPSCs).

96. A cell from the population of cells of any one of claims 93 - 95 wherein the cell is differentiated into a differentiated cell.

97. The differentiated cell of claim 96, wherein the differentiated cell is selected from: a cell in the immune system, optionally selected from a T cell, a T cell expressing a chimeric antigen receptor (CAR), a suppressive T cell, a myeloid cell, a dendritic cell, and a macrophage; a cell in the nervous system, optionally selected from a dopaminergic neuron, a microglia cell, an oligodendrocyte, an astrocyte, a cortical neuron, a spinal or oculomotor neuron, an enteric neuron, a Placode-derived cell, a Schwann cell, and a trigeminal or sensory neuron; a cell in the ocular system, optionally selected from a retinal pigment epithelial cell, a photoreceptor cone cell, a photoreceptor rod cell, a bipolar cell, and a ganglion cell; a cell in the cardiovascular system, optionally selected from a cardiomyocyte, an endothelial cell, and a nodal cell; or a cell in the metabolic system, optionally selected from a hepatocyte, a cholangiocyte, and a pancreatic beta cell.

98. A method of increasing the percentage of cells in the population of cells of any one of claims 93 - 97 that incorporate the genetic payload at the pre-determined genomic position comprising: creating a first population of mammalian cells comprising cells of any of claims 93 - 97 by providing at least one donor nucleic acid construct comprising a specific genetic payload flanked by a first homologous region (HR) and a second HR, wherein the first and second HRs are essentially homologous to a first genomic region (GR) and a second GR, respectively, wherein the first GR and the second GR are adjacent to and flank a pre-determined genomic position in an exon of an essential gene in a mammalian cell, providing a gene editing system containing a nuclease that is targeted to the pre-determined genomic position, providing the at least one donor nucleic acid construct and the gene editing system into the first population of mammalian cells, culturing the first population of mammalian cells, and identifying the percentage of surviving cells that comprise the specific genetic payload, creating a second population of mammalian cells by expanding the surviving cells from the first population of mammalian cells by providing to the surviving cells from the first population of mammalian cells, a gene editing system containing a nuclease that is targeted to the pre-determined genomic position; optionally reintroducing the at least one donor construct; culturing the second population of mammalian cells; and identifying the percentage of surviving cells that comprise the specific exogenous genetic payload, wherein the percentage of surviving cells from the second population of mammalian cells that comprise the specific exogenous genetic payload is higher than the percentage of surviving cells from the first population of mammalian cells that comprise the specific exogenous genetic payload.

99. The method of claim 98 wherein a plurality of the surviving cells from the first population of mammalian cells that do not comprise the specific genetic payloads are killed during the creation of the second population of mammalian cells.

100. The method of any one of claims 98 or 99 wherein a plurality of surviving cells from the first population of mammalian cells that do not comprise the specific genetic payloads incorporate the specific genetic payloads during the creation of the second population of mammalian cells.

101. The method of any one of claims 98 - 100 wherein the percentage of surviving cells from the second population of mammalian cells that comprise the specific genetic payloads is at least three times larger than the percentage of surviving cells from the first population of mammalian cells that comprise the specific genetic payloads.

102. The method of any one of the preceding claims 98 - 101 wherein the percentage of surviving cells from the second population of mammalian cells that do not comprise the specific genetic payloads is at least five (5) times lower than the percentage of surviving cells from the first population of mammalian cells that do not comprise the specific genetic payloads.
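The thresholds recited in claims 101 and 102 reduce to simple ratios of percentages between the first and second populations. The sketch below checks both conditions. The percentages are invented for illustration and are not data from the application, the function name is hypothetical, and the example assumes that every surviving cell either comprises the payload or does not, so that the two percentages in each population sum to 100.

```python
def enrichment_checks(first_pct_payload: float, second_pct_payload: float) -> dict:
    """Compare payload-positive and payload-negative percentages across two rounds."""
    first_pct_without = 100.0 - first_pct_payload
    second_pct_without = 100.0 - second_pct_payload
    fold_increase = second_pct_payload / first_pct_payload
    fold_decrease = first_pct_without / second_pct_without
    return {
        "fold_increase_with_payload": fold_increase,
        "fold_decrease_without_payload": fold_decrease,
        "meets_claim_101": fold_increase >= 3,   # at least three times larger
        "meets_claim_102": fold_decrease >= 5,   # at least five times lower
    }


# Hypothetical example: 20% payload-positive after round one, 90% after round two.
result = enrichment_checks(first_pct_payload=20.0, second_pct_payload=90.0)
print(result)
```

With these hypothetical values the payload-positive fraction rises 4.5-fold and the payload-negative fraction falls 8-fold, so both thresholds would be met.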

103. The method of any one of claims 98 - 102 wherein at least one of the donor nucleic acid constructs has a different genetic payload than at least one other donor nucleic acid construct, and at least a plurality of the second population of mammalian cells incorporate each of the different genetic payloads.

104. The method of any one of claims 98 - 103 wherein at least one of the HR regions contains at least one mutation that prevents the cutting of the genetic payload at the nuclease cutting site.

105. The method of any one of claims 98 - 104 wherein identifying the percentage of surviving cells is accomplished using flow cytometry.

106. An engineered iPSC comprising a genomic modification, wherein the genomic modification comprises an insertion of an exogenous knock-in cassette within an endogenous coding sequence of a GAPDH gene in the iPSC’s genome, wherein the knock-in cassette comprises an exogenous coding sequence for a safety switch in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence encoding GAPDH, or a functional variant thereof, and wherein the iPSC expresses the gene product of interest and GAPDH, or a functional variant thereof, optionally wherein the gene product of interest and the GAPDH are expressed from the endogenous promoter of the GAPDH gene.

107. The iPSC of claim 106, wherein the iPSC’s genome comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product.

108. The iPSC of claim 107, wherein the iPSC’s genome comprises an IRES or 2A element located between the coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest.

109. The iPSC of any one of claims 106 - 108, wherein the iPSC’s genome comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, wherein, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

110. The iPSC of any one of claims 106 - 109, wherein the coding sequence of the GAPDH gene is less than 100% identical to an endogenous coding sequence of the GAPDH gene.

111. The iPSC of any one of claims 106 - 110, wherein the iPSC’s genome does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

112. The iPSC of any one of claims 106 - 111, for use as a medicament.

113. The iPSC of any one of claims 106 - 112, for use in the treatment of a disease, disorder, or condition, e.g., a cancer.

114. A system for editing the genome of an iPSC in a population of iPSCs, the system comprising the population of iPSCs, a nuclease that causes a break within an endogenous coding sequence of a GAPDH gene of the iPSC, and a donor template that comprises a knock-in cassette comprising a safety switch in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene.

115. The system of claim 114, wherein the break is a double-strand break.

116. The system of any one of claims 114-115, wherein the break is located within the last exon of the GAPDH gene.

117. The system of any one of claims 114-116, wherein the nuclease is a CRISPR/Cas nuclease and the system further comprises a guide molecule for the CRISPR/Cas nuclease.

118. The system of any one of claims 114-116, wherein the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease.

119. The system of any one of claims 114-118, wherein the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product.

120. The system of claim 119, wherein the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest.

121. The system of any one of claims 114 - 120, wherein the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, wherein, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

122. The system of any one of claims 114 - 121, wherein the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC.

123. The system of claim 122, wherein the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the DNA nuclease, and/or to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC.

124. The system of claim 123, wherein the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette does not comprise a target site for the nuclease.

125. A method of increasing the percentage of genetically modified mammalian cells with a desired genetic payload within a population of mammalian cells comprising: creating a first population of mammalian cells by providing at least one donor nucleic acid construct comprising a specific genetic payload flanked by a first homologous region (HR) and a second HR, wherein the first and second HRs are essentially homologous to a first genomic region (GR) and a second GR, respectively, wherein the first GR and the second GR are adjacent to and flank a pre-determined genomic position in an exon of an essential gene in a mammalian cell, providing a gene editing system containing a nuclease that is targeted to the pre-determined genomic position, providing the at least one donor nucleic acid construct and the gene editing system into the first population of mammalian cells, culturing the first population of mammalian cells, and identifying the percentage of surviving cells that comprise the specific genetic payload, creating a second population of mammalian cells by providing to the surviving cells from the first population of mammalian cells, a gene editing system containing a nuclease that is targeted to the pre-determined genomic position; optionally reintroducing the at least one donor construct; culturing the second population of mammalian cells; and identifying the percentage of surviving cells that comprise the specific exogenous genetic payload, wherein the percentage of surviving cells from the second population of mammalian cells that comprise the specific exogenous genetic payload is higher than the percentage of surviving cells from the first population of mammalian cells that comprise the specific exogenous genetic payload.

126. The method of claim 125 wherein a plurality of the surviving cells from the first population of mammalian cells that do not comprise the specific genetic payloads are killed during the creation of the second population of mammalian cells.

127. The method of any one of claims 125 - 126 wherein a plurality of surviving cells from the first population of mammalian cells that do not comprise the specific genetic payloads incorporate the specific genetic payloads during the creation of the second population of mammalian cells.

128. The method of any one of claims 125 - 127 wherein the percentage of surviving cells from the second population of mammalian cells that comprise the specific genetic payloads is at least three (3) times larger than the percentage of surviving cells from the first population of mammalian cells that comprise the specific genetic payloads.

129. The method of any one of claims 125 - 128 wherein at least one of the donor nucleic acid constructs has a different genetic payload than at least one other donor nucleic acid construct, and at least a plurality of the second population of mammalian cells incorporate each of the different genetic payloads.

130. The method of any one of claims 125 - 129 wherein the percentage of surviving cells from the second population of mammalian cells that do not comprise the specific genetic payloads is at least five (5) times lower than the percentage of surviving cells from the first population of mammalian cells that do not comprise the specific genetic payloads.

131. The method of any one of claims 125 - 130 wherein the at least one of the HR regions contains at least one mutation that prevents the cutting of the genetic payload at the nuclease cutting site.

132. The method of any one of claims 125 - 131 wherein identifying the percentage of surviving cells is accomplished using flow cytometry.

133. A population of cells made by the method of any one of claims 125 - 132.

134. The population of cells of claim 133, wherein the cells are pluripotent stem cells (PSCs).

135. The population of cells of claim 133, wherein the cells are induced pluripotent stem cells (iPSCs).

136. A cell from the population of cells of any one of claims 133 - 135 wherein the cell is differentiated into a differentiated cell.

137. The differentiated cell of claim 136, wherein the differentiated cell is selected from: a cell in the immune system, optionally selected from a T cell, a T cell expressing a chimeric antigen receptor (CAR), a suppressive T cell, a myeloid cell, a dendritic cell, and a macrophage; a cell in the nervous system, optionally selected from a dopaminergic neuron, a microglia cell, an oligodendrocyte, an astrocyte, a cortical neuron, a spinal or oculomotor neuron, an enteric neuron, a Placode-derived cell, a Schwann cell, and a trigeminal or sensory neuron; a cell in the ocular system, optionally selected from a retinal pigment epithelial cell, a photoreceptor cone cell, a photoreceptor rod cell, a bipolar cell, and a ganglion cell; a cell in the cardiovascular system, optionally selected from a cardiomyocyte, an endothelial cell, and a nodal cell; or a cell in the metabolic system, optionally selected from a hepatocyte, a cholangiocyte, and a pancreatic beta cell.

138. An iPSC of claim 135, wherein the iPSC’s genome comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product.

139. The iPSC of claim 138, wherein the iPSC’s genome comprises an IRES or 2A element located between the coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest.

140. The iPSC of any one of claims 138 - 139, wherein the iPSC’s genome comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, wherein, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

141. The iPSC of any one of claims 138 - 140, wherein the coding sequence of the GAPDH gene is less than 100% identical to an endogenous coding sequence of the GAPDH gene.

142. The iPSC of any one of claims 138 - 141, wherein the iPSC’s genome does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

143. The iPSC of any one of claims 138 - 142, for use as a medicament.

144. The iPSC of any one of claims 138 - 143, for use in the treatment of a disease, disorder, or condition, e.g., a cancer.

Description:

SELECTION BY ESSENTIAL-GENE KNOCK-IN

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application No. 63/019,950, filed May 4, 2020, the contents of which are hereby incorporated in their entirety.

BACKGROUND

[0002] One major problem with targeted integration strategies for the generation of genetically engineered cells is that successful targeted integration events can be rare, especially when using double-stranded DNA (dsDNA) as a template where knock-in efficiencies are often below 5%. There is therefore typically a requirement for a screening or selection strategy that enriches for cellular clones that harbor a successfully integrated allele or gene. Many selection strategies have been devised to identify correctly targeted clones, e.g., by co-integration of reporter genes that confer fluorescence, antibiotic resistance, etc. However, these selection strategies are time consuming, inefficient and not desirable for use in a therapeutic context. Indeed, even for a single targeted integration, it can be necessary to screen hundreds, sometimes thousands, of clones in order to identify a successfully targeted clone. In situations where multiple edits are desired it can be necessary to screen tens of thousands of clones or more.

SUMMARY

[0003] The present disclosure provides strategies, systems, compositions, and methods for genetically engineering cells via targeted integration that do not require external selection markers, such as fluorescent or antibiotic resistance markers, while yielding a high frequency of correctly targeted clones. In general, the strategies, systems, compositions, and methods for genetically engineering cells via targeted integration provided herein feature a targeted break in an essential gene mediated by a nuclease, and integration of an exogenous knock-in cassette that, if inserted correctly, results in a functional variant of the essential gene and also includes an expression construct harboring a cargo sequence.

[0004] In one aspect, the disclosure features a method of editing the genome of a cell (e.g., a cell in a population of cells), the method comprising contacting the cell (or the population of cells) with: (i) a nuclease that causes a break within an endogenous coding sequence of an essential gene in the cell, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell, and (ii) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene, wherein the knock-in cassette is integrated into the genome of the cell by homology-directed repair (HDR) of the break, resulting in a genome-edited cell that expresses:

(a) the gene product of interest, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof.

[0005] In some embodiments, following the contacting step, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable cells of the population of cells are genome-edited cells, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the population of cells lacking an integrated knock-in cassette are viable cells. In some embodiments, following the contacting step, at least about 80% of the viable cells of the population of cells are genome-edited cells, and about 20% or less of the population of cells lacking an integrated knock-in cassette are viable cells. In some embodiments, following the contacting step, at least about 60% of the viable cells of the population of cells are genome-edited cells, and about 40% or less of the population of cells lacking an integrated knock-in cassette are viable cells. In some embodiments, following the contacting step, at least about 90% of the viable cells of the population of cells are genome-edited cells, and about 10% or less of the population of cells lacking an integrated knock-in cassette are viable cells. In some embodiments, following the contacting step, at least about 95% of the viable cells of the population of cells are genome-edited cells, and about 5% or less of the population of cells lacking an integrated knock-in cassette are viable cells.
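The two percentages recited above can be illustrated with a short calculation. The following is a minimal sketch only; the class name, function names, and cell counts are hypothetical and are not taken from the disclosure. It computes (1) the fraction of viable cells that are genome-edited and (2) the fraction of cells lacking an integrated knock-in cassette that are viable.

```python
from dataclasses import dataclass


@dataclass
class PopulationCounts:
    """Hypothetical cell counts after the contacting step."""
    edited_viable: int       # knock-in cassette integrated, cell alive
    edited_nonviable: int    # knock-in cassette integrated, cell dead
    unedited_viable: int     # no integrated cassette, cell alive
    unedited_nonviable: int  # no integrated cassette, cell dead


def fraction_of_viable_cells_edited(c: PopulationCounts) -> float:
    """Fraction of all viable cells that carry the knock-in cassette."""
    viable = c.edited_viable + c.unedited_viable
    return c.edited_viable / viable if viable else 0.0


def fraction_of_unedited_cells_viable(c: PopulationCounts) -> float:
    """Fraction of cells lacking the cassette that are still viable."""
    unedited = c.unedited_viable + c.unedited_nonviable
    return c.unedited_viable / unedited if unedited else 0.0


# Illustrative numbers only (assumed, not experimental data).
counts = PopulationCounts(
    edited_viable=180, edited_nonviable=20,
    unedited_viable=20, unedited_nonviable=780,
)
edited_share = fraction_of_viable_cells_edited(counts)      # 180 / 200
unedited_alive = fraction_of_unedited_cells_viable(counts)  # 20 / 800
meets_90_10 = edited_share >= 0.90 and unedited_alive <= 0.10
```

With these assumed counts, the population would satisfy the embodiment in which at least about 90% of viable cells are genome-edited and about 10% or less of the cells lacking an integrated cassette are viable.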

[0006] In some embodiments, if the knock-in cassette is not integrated into the genome of the cell by homology-directed repair (HDR) in the correct position or orientation, the cell no longer expresses the gene product encoded by the essential gene, or a functional variant thereof.

[0007] In some embodiments, the break is a double-strand break.

[0008] In some embodiments, the break is located within the last 2000, 1500, 1000, 750,

[0009] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of cells contacted with the nuclease. In some embodiments, the nuclease is capable of introducing indels (insertions or deletions) in at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of cells contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the cell (or the population of cells) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the nuclease is a CRISPR/Cas nuclease selected from Table 5. In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the essential gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the essential gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the essential gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. 
In some embodiments, the guide molecule binds to and mediates CRISPR/Cas cleavage at a location within the essential gene that is necessary for function (e.g., functional gene expression or protein function). In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.
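The "differs by no more than 3 nucleotides" criterion for the targeting domain can be checked mechanically. The sketch below is illustrative only: the function names and the 20-nucleotide target sequence are hypothetical, the target portion is assumed to be given 5'-to-3' as DNA, and the targeting domain is compared position by position against the perfect complement of that portion.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(targeting_domain: str, target_portion: str) -> int:
    """Count positions where the targeting domain differs from the
    sequence perfectly complementary to the target portion.

    The targeting domain may be written as RNA (U) or DNA (T).
    """
    guide = targeting_domain.upper().replace("U", "T")
    perfect = reverse_complement(target_portion)
    if len(guide) != len(perfect):
        raise ValueError("targeting domain and target portion differ in length")
    return sum(a != b for a, b in zip(guide, perfect))


def within_tolerance(targeting_domain: str, target_portion: str,
                     max_mismatches: int = 3) -> bool:
    """True if the targeting domain differs by no more than max_mismatches."""
    return count_mismatches(targeting_domain, target_portion) <= max_mismatches


# Hypothetical 20-nt portion of an endogenous coding sequence.
target = "ATGGCCAAGGTTAACGTCGG"
perfect_guide = reverse_complement(target).replace("T", "U")
variant_guide = "AA" + perfect_guide[2:]  # two substitutions at the 5' end
```

Here `perfect_guide` has zero mismatches and `variant_guide` has two, so both would fall within the three-nucleotide tolerance under these assumptions.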

[0010] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0011] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the cell. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the cell. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the cell, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the cell.

[0012] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of the gene product encoded by the essential gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the essential gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.
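The arrangement of essential-gene fragment, GSG linker, 2A element, and cargo can be sketched at the protein level. The example below is a simplified illustration, not a description of any construct in this disclosure: the short "essential" and "cargo" peptides are hypothetical placeholders, and the split point assumes the commonly described behavior of 2A peptides, in which translation yields two products separated between the final glycine and proline of the 2A sequence.

```python
# 2A peptide sequences as listed in the text above.
TWO_A_ELEMENTS = {
    "T2A": "EGRGSLLTCGDVEENPGP",
    "P2A": "ATNFSLLKQAGDVEENPGP",
    "E2A": "QCTNYALLKLAGDVESNPGP",
    "F2A": "VKQTLNFDLLKLAGDVESNPGP",
}
LINKER = "GSG"


def build_fusion(essential_cterm: str, cargo: str,
                 element: str = "P2A", linker: str = LINKER) -> str:
    """Concatenate essential-gene fragment, linker, 2A element, and cargo."""
    return essential_cterm + linker + TWO_A_ELEMENTS[element] + cargo


def separated_products(polypeptide: str, element: str = "P2A") -> tuple[str, str]:
    """Split the encoded polypeptide between the last G and P of the 2A motif.

    Assumption: the upstream product keeps the 2A residues through the
    glycine, and the downstream product begins with the final proline.
    """
    motif = TWO_A_ELEMENTS[element]
    start = polypeptide.index(motif)
    cut = start + len(motif) - 1
    return polypeptide[:cut], polypeptide[cut:]


# Hypothetical placeholder peptides (not real protein sequences).
fusion = build_fusion(essential_cterm="MAKE", cargo="MCARGQ", element="P2A")
upstream, downstream = separated_products(fusion, element="P2A")
```

Under these assumptions the upstream product is the essential-gene fragment followed by the linker and most of the 2A peptide, and the downstream product is the cargo preceded by a single proline.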

[0013] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0014] In some embodiments, the exogenous partial coding sequence of the essential gene in the knock-in cassette encodes a C-terminal fragment of a protein encoded by the essential gene. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the essential gene that spans the break.

[0015] In some embodiments, the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the essential gene of the cell, e.g., less than 99%, less than 95%, less than 90%, less than 85%, or less than 80% identical to the corresponding endogenous coding sequence of the essential gene of the cell. In some embodiments, the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette is 80% to 99% identical to the corresponding endogenous coding sequence of the essential gene of the cell, e.g., 85% to 95% or 90% to 99% identical to the corresponding endogenous coding sequence of the essential gene of the cell. In some embodiments, the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the essential gene of the cell to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the cell, or to increase expression of the gene product of the essential gene and/or the gene product of interest after integration of the knock-in cassette into the genome of the cell.
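The relationship between recoding and percent identity can be made concrete with a small calculation. The sketch below is illustrative only: the two 12-nucleotide sequences are hypothetical, the function names are not from the disclosure, identity is computed position by position over equal-length sequences (no alignment), and translation uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def percent_identity(a: str, b: str) -> float:
    """Position-by-position nucleotide identity of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Hypothetical endogenous fragment and a recoded (synonymous) version.
endogenous = "GCCTCCAAGGAG"  # encodes A-S-K-E
recoded = "GCTAGCAAAGAA"     # also encodes A-S-K-E

identity = percent_identity(endogenous, recoded)
same_protein = translate(endogenous) == translate(recoded)
```

In this toy case the recoded fragment encodes the same amino acids while sharing well under 100% nucleotide identity with the endogenous fragment; real cassettes would be compared over the full exogenous coding sequence.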

[0016] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.
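Whether recoding has removed PAM sites can be checked by scanning both strands of a sequence. The following is a minimal sketch under stated assumptions: the PAM pattern "NGG" is used as the example motif (the PAM commonly associated with S. pyogenes Cas9; other nucleases use other motifs), the sequences are hypothetical, and the scan reports every motif occurrence rather than only those adjacent to a particular guide target site.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "V": "[ACG]"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_pam_sites(seq: str, pam: str = "NGG") -> list[tuple[int, str]]:
    """Return (forward-strand start index, strand) for each PAM occurrence.

    Overlapping occurrences are found by using a lookahead pattern.
    """
    seq = seq.upper()
    pattern = re.compile("(?=(" + "".join(_IUPAC[ch] for ch in pam.upper()) + "))")
    n, k = len(seq), len(pam)
    forward = [(m.start(), "+") for m in pattern.finditer(seq)]
    reverse = [(n - m.start() - k, "-")
               for m in pattern.finditer(reverse_complement(seq))]
    return forward + reverse


# Hypothetical endogenous fragment and a synonymously recoded version.
endogenous = "GCCTCCAAGGAG"
recoded = "GCTAGCAAAGAA"

pams_before = find_pam_sites(endogenous)
pams_after = find_pam_sites(recoded)
```

For these toy sequences the endogenous fragment contains NGG motifs on both strands, while the recoded fragment contains none.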

[0017] In some embodiments, the essential gene is GAPDH, TBP, E2F4, G6PD, or KIF11. In some embodiments, the essential gene is a gene selected from Table 3, Table 4, or Table 17.

[0018] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0019] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited cell comprises knock-in cassettes at one or both alleles of the essential gene. In some embodiments, the genome-edited cell expresses (a) the first and second gene products of interest, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof. In some embodiments, the genome-edited cell expresses (a) the first and second gene products of interest from the same allele of an essential gene, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof. In some embodiments, the genome-edited cell expresses (a) the first and second gene products of interest from different alleles of the essential gene, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof. 
[0020] In some embodiments, the method comprises contacting the cell (or the population of cells) with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene. In some embodiments, the genome-edited cell comprises the first knock-in cassette at a first allele of the essential gene and the second knock-in cassette at the second allele of the essential gene. In some embodiments, the genome-edited cell expresses (a) the first and second gene products of interest, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof.

[0021] In some embodiments, the method comprises contacting the cell (or the population of cells) with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a first essential gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a second essential gene. In some embodiments, the genome-edited cell comprises the first knock-in cassette at one or both alleles of the first essential gene and the second knock-in cassette at one or both alleles of the second essential gene. In some embodiments, the genome-edited cell expresses (a) the first and second gene products of interest, and (b) the gene products encoded by the first and second essential genes required for survival and/or proliferation of the cell, or a functional variant thereof.

[0022] In another aspect, the disclosure features a genetically modified cell comprising a genome with an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of a coding sequence of an essential gene, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell, and wherein at least part of the coding sequence of the essential gene comprises an exogenous coding sequence.

[0023] In some embodiments, the exogenous coding sequence of the essential gene comprises about 2000, 1500, 1000, 750, 500, 400, 300, 200, 100, or 50 base pairs of the coding sequence of the essential gene.

[0024] In some embodiments, the exogenous coding sequence of the essential gene encodes a C-terminal fragment of a protein encoded by the essential gene. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the essential gene that spans the break.

[0025] In some embodiments, the exogenous coding sequence of the essential gene is less than 100% identical to the corresponding endogenous coding sequence of the essential gene of the cell. In some embodiments, the exogenous coding sequence of the essential gene has been codon optimized relative to the corresponding endogenous coding sequence of the essential gene of the cell to remove a target site of a nuclease, e.g., a Cas. In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence of the essential gene includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0026] In some embodiments, the essential gene is GAPDH, TBP, E2F4, G6PD, or KIF11.

[0027] In some embodiments, the cell’s genome comprises a regulatory element that enables expression of the gene product encoded by the essential gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the cell’s genome comprises an IRES or 2A element located between the coding sequence of the essential gene and the exogenous coding sequence for the gene product of interest.

[0028] In some embodiments, the cell’s genome comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0029] In some embodiments, the cell’s genome does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0030] In another aspect, the disclosure features an engineered cell comprising a genomic modification, wherein the genomic modification comprises an insertion of an exogenous knock-in cassette within an endogenous coding sequence of an essential gene in the cell’s genome, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell, wherein the knock-in cassette comprises an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence encoding the gene product of the essential gene, or a functional variant thereof, and wherein the cell expresses the gene product of interest and the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof, optionally wherein the gene product of interest and the gene product encoded by the essential gene are expressed from the endogenous promoter of the essential gene.

[0031] In some embodiments, the exogenous coding sequence or partial coding sequence encoding the gene product of the essential gene comprises about 2000, 1500, 1000, 750, 500, 400, 300, 200, 100, or 50 base pairs of the coding sequence of the essential gene.

[0032] In some embodiments, the exogenous coding sequence or partial coding sequence encoding the gene product of the essential gene encodes a C-terminal fragment of a protein encoded by the essential gene. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the essential gene that spans the break.

[0033] In some embodiments, the exogenous coding sequence or partial coding sequence encoding the gene product of the essential gene is less than 100% identical to the corresponding endogenous coding sequence of the essential gene of the cell. In some embodiments, the exogenous coding sequence or partial coding sequence encoding the gene product of the essential gene has been codon optimized relative to the corresponding endogenous coding sequence of the essential gene of the cell to remove a target site of a nuclease, e.g., a Cas. In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence encoding the gene product of the essential gene includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0034] In some embodiments, the essential gene is GAPDH, TBP, E2F4, G6PD, or KIF11.

[0035] In some embodiments, the cell’s genome comprises a regulatory element that enables expression of the gene product encoded by the essential gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the cell’s genome comprises an IRES or 2A element located between the coding sequence of the essential gene and the exogenous coding sequence for the gene product of interest.

[0036] In some embodiments, the cell’s genome comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0037] In some embodiments, the cell’s genome does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0038] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited cell comprises knock-in cassettes at one or both alleles of the essential gene. In some embodiments, the genome-edited cell expresses (a) the first and second gene products of interest, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof.

[0039] In some embodiments, the engineered cell comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene. In some embodiments, the engineered cell comprises the first knock-in cassette and the second knock-in cassette at a first allele of the essential gene, optionally wherein the engineered cell also comprises the first knock-in cassette and the second knock-in cassette at a second allele of the essential gene. In some embodiments, the engineered cell comprises the first knock-in cassette at a first allele of the essential gene and the second knock-in cassette at the second allele of the essential gene. In some embodiments, the engineered cell expresses (a) the first and second gene products of interest, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof.

[0040] In some embodiments, the engineered cell comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a first essential gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a second essential gene. In some embodiments, the engineered cell comprises the first knock-in cassette at one or both alleles of the first essential gene and the second knock-in cassette at one or both alleles of the second essential gene. In some embodiments, the genome-edited cell expresses (a) the first and second gene products of interest, and (b) the gene products encoded by the first and second essential genes required for survival and/or proliferation of the cell, or a functional variant thereof.

[0041] In another aspect, the disclosure features any of the cells described herein for use as a medicament and/or for use in the treatment of a disease, disorder or condition, e.g., a disease, disorder or condition described herein, e.g., a cancer, e.g., a cancer described herein.

[0042] In another aspect, the disclosure features a cell, or a population of cells, produced by any of the methods described herein, or progeny thereof.

[0043] In another aspect, the disclosure features a system for editing the genome of a cell (or a cell in a population of cells), the system comprising the cell (or the population of cells), a nuclease that causes a break within an endogenous coding sequence of an essential gene of the cell, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell, and a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene.

[0044] In some embodiments, after contacting the population of cells with the nuclease and the donor template, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable cells of the population of cells are genome-edited cells, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the population of cells lacking an integrated knock-in cassette are viable cells. In some embodiments, after contacting the population of cells with the nuclease and the donor template, at least about 80% of the viable cells of the population of cells are genome-edited cells, and about 20% or less of the population of cells lacking an integrated knock-in cassette are viable cells. In some embodiments, after contacting the population of cells with the nuclease and the donor template, at least about 60% of the viable cells of the population of cells are genome-edited cells, and about 40% or less of the population of cells lacking an integrated knock-in cassette are viable cells. In some embodiments, after contacting the population of cells with the nuclease and the donor template, at least about 90% of the viable cells of the population of cells are genome-edited cells, and about 10% or less of the population of cells lacking an integrated knock-in cassette are viable cells. In some embodiments, after contacting the population of cells with the nuclease and the donor template, at least about 95% of the viable cells of the population of cells are genome-edited cells, and about 5% or less of the population of cells lacking an integrated knock-in cassette are viable cells.
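
As a worked illustration of the two percentages recited above, the following minimal Python sketch computes (1) the fraction of viable cells that are genome-edited and (2) the fraction of cells lacking an integrated knock-in cassette that are viable. The function name `selection_metrics`, the reading of the second percentage as "viable cells among cassette-lacking cells", and the cell counts are assumptions made only for illustration; they are not data from the disclosure.

```python
def selection_metrics(viable_edited, viable_unedited, nonviable_unedited):
    """Return a pair of fractions:
    (viable cells that are genome-edited / all viable cells,
     viable cells lacking the cassette / all cells lacking the cassette)."""
    viable_total = viable_edited + viable_unedited
    unedited_total = viable_unedited + nonviable_unedited
    return viable_edited / viable_total, viable_unedited / unedited_total


# Hypothetical counts, chosen only to illustrate the arithmetic.
edited_of_viable, viable_of_unedited = selection_metrics(
    viable_edited=900, viable_unedited=100, nonviable_unedited=900
)
print(f"{edited_of_viable:.0%} of viable cells are genome-edited")
print(f"{viable_of_unedited:.0%} of cells lacking the cassette are viable")
```

With these hypothetical counts, the population would satisfy the "at least about 80%" and "about 20% or less" thresholds described above.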

[0045] In some embodiments, after contacting the cell or population of cells with the nuclease and the donor template, if the knock-in cassette is not integrated into the genome of the cell by homology-directed repair (HDR) in the correct position or orientation, the cell no longer expresses the gene product encoded by the essential gene, or a functional variant thereof.

[0046] In some embodiments, the break is a double-strand break.

[0047] In some embodiments, the break is located within the last 2000, 1500, 1000, 750, 500, 400, 300, 200, 100, or 50 base pairs of the endogenous coding sequence of the essential gene. In some embodiments, the break is located within the last exon of the essential gene.

[0048] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of cells contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the cell (or the population of cells) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the essential gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the essential gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the essential gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.
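
The positional and sequence criteria above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: sequences are written in the DNA alphabet (a guide molecule is typically RNA), the break is described by a 0-based offset into the coding sequence, and the function names and example sequence are hypothetical rather than taken from the disclosure or its sequence listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def break_within_last_bp(cds_length, break_offset, window):
    """True if a break at 0-based offset `break_offset` falls within the
    last `window` base pairs of a coding sequence of length `cds_length`."""
    return cds_length - window <= break_offset < cds_length


def mismatches_to_complement(targeting_domain, genomic_portion):
    """Count positions at which the targeting domain differs from the
    sequence perfectly complementary to `genomic_portion`."""
    perfect = reverse_complement(genomic_portion)
    if len(perfect) != len(targeting_domain):
        raise ValueError("sequences must be the same length")
    return sum(a != b for a, b in zip(targeting_domain, perfect))


# Hypothetical 20-nt portion of a coding sequence and a guide that
# deviates from perfect complementarity at its first position.
portion = "ACGTTGCAAGGCTTAGCATC"
perfect = reverse_complement(portion)
guide = ("A" if perfect[0] != "A" else "C") + perfect[1:]

print(mismatches_to_complement(guide, portion) <= 3)
print(break_within_last_bp(cds_length=1008, break_offset=950, window=100))
```

Counting mismatches addresses only the "no more than 3 nucleotides" criterion; it does not by itself establish that a guide binds specifically or lacks off-target sites.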

[0049] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0050] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the cell. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the cell. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the cell, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the cell.
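
The arrangement of homology arms on either side of the knock-in cassette can be sketched as simple string slicing. In this minimal Python illustration the arms are taken immediately adjacent to the break, which is a simplifying assumption (the paragraph above requires only that each arm be homologous to a sequence located 5’ or 3’ of the break); the function names, arm length, and sequences are hypothetical.

```python
def homology_arms(genome, break_pos, arm_length):
    """Slice arms flanking a break located between genome[break_pos - 1]
    and genome[break_pos]. Assumption: arms abut the break directly."""
    if not (arm_length <= break_pos <= len(genome) - arm_length):
        raise ValueError("arm extends past the available sequence")
    five_prime_arm = genome[break_pos - arm_length:break_pos]
    three_prime_arm = genome[break_pos:break_pos + arm_length]
    return five_prime_arm, three_prime_arm


def build_donor(genome, break_pos, arm_length, cassette):
    """Concatenate 5' arm + knock-in cassette + 3' arm."""
    five_prime_arm, three_prime_arm = homology_arms(genome, break_pos, arm_length)
    return five_prime_arm + cassette + three_prime_arm


# Toy sequences; real homology arms are far longer than 4 nucleotides.
genome = "AAAACCCCGGGGTTTT"
print(build_donor(genome, break_pos=8, arm_length=4, cassette="NNN"))
```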

[0051] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of the gene product encoded by the essential gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the essential gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g.,

[0052] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.
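
The 5’-to-3’ ordering of cassette elements described in the two preceding paragraphs can be summarized in a short Python sketch. The element labels, the function names `cassette_layout` and `order_is_valid`, and the choice to treat the linker and polyadenylation sequence as required are assumptions of this illustration only; in the disclosure these elements are features of some embodiments.

```python
LINKERS = {"2A", "IRES"}


def cassette_layout(include_3utr=True, linker="2A"):
    """Return element labels in 5'-to-3' order for one hypothetical cassette."""
    layout = ["essential_gene_cds", linker, "cargo_cds"]
    if include_3utr:
        layout.append("3_utr")
    layout.append("polyA")
    return layout


def order_is_valid(layout):
    """Check: essential-gene sequence, then linker, then cargo, then
    (optionally) a 3' UTR, then the polyadenylation sequence."""
    pos = {name: i for i, name in enumerate(layout)}
    linker_pos = [pos[name] for name in LINKERS if name in pos]
    required = ("essential_gene_cds", "cargo_cds", "polyA")
    if not linker_pos or not all(name in pos for name in required):
        return False
    ok = (pos["essential_gene_cds"] < min(linker_pos)
          and max(linker_pos) < pos["cargo_cds"] < pos["polyA"])
    if "3_utr" in pos:
        # The 3' UTR sits 3' of the cargo and 5' of the polyA sequence.
        ok = ok and pos["cargo_cds"] < pos["3_utr"] < pos["polyA"]
    return ok


print(cassette_layout())
print(order_is_valid(cassette_layout(include_3utr=False, linker="IRES")))
```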

[0053] In some embodiments, the exogenous partial coding sequence of the essential gene in the knock-in cassette encodes a C-terminal fragment of a protein encoded by the essential gene. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the essential gene that spans the break.

[0054] In some embodiments, the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the essential gene of the cell. In some embodiments, the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the essential gene of the cell to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the cell, or to increase expression of the gene product of the essential gene and/or the gene product of interest after integration of the knock-in cassette into the genome of the cell.
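
To illustrate what "less than 100% identical" can mean for a recoded sequence, the following Python sketch computes percent identity between two equal-length, pre-aligned sequences. The function name `percent_identity` and the nine-nucleotide example are hypothetical; both example sequences encode the same three amino acids (Ala-Arg-Lys) using different codons.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two pre-aligned sequences.
    Assumption: equal length, no gaps (a real comparison would align first)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sketch assumes non-empty, equal-length sequences")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


endogenous = "GCCAGGAAA"  # hypothetical endogenous codons: Ala, Arg, Lys
recoded = "GCTCGTAAG"     # synonymous codons for the same amino acids
print(f"{percent_identity(endogenous, recoded):.1f}% identical")
```

Lower nucleotide identity with an unchanged protein sequence is the property that the codon optimization described above relies on to remove the nuclease target site and to reduce homology with the endogenous sequence.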

[0055] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0056] In some embodiments, the essential gene is GAPDH, TBP, E2F4, G6PD, or KIF11.

[0057] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0058] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, after contacting the population of cells with the nuclease and the donor template, the genome-edited cell comprises knock-in cassettes at one or both alleles of the essential gene. In some embodiments, the genome-edited cell expresses (a) the first and second gene products of interest, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof.

[0059] In some embodiments, the system comprises a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene. In some embodiments, after contacting the population of cells with the nuclease and the donor templates, the genome-edited cell comprises the first knock-in cassette at a first allele of the essential gene and the second knock-in cassette at the second allele of the essential gene. In some embodiments, the genome-edited cell expresses (a) the first and second gene products of interest, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof.

[0060] In some embodiments, the system comprises a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a first essential gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a second essential gene. In some embodiments, after contacting the population of cells with the nuclease and the donor templates, the genome-edited cell comprises the first knock-in cassette at one or both alleles of the first essential gene and the second knock-in cassette at one or both alleles of the second essential gene. In some embodiments, the genome-edited cell expresses (a) the first and second gene products of interest, and (b) the gene products encoded by the first and second essential genes required for survival and/or proliferation of the cell, or a functional variant thereof.

[0061] In another aspect, the disclosure features a donor template comprising a knock-in cassette with an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of an essential gene, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell.

[0062] In some embodiments, the donor template is for use in editing the genome of a cell by homology-directed repair (HDR).

[0063] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0064] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of a target site in the genome of the cell. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of a target site in the genome of the cell. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of a target site in the genome of the cell, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of a target site in the genome of the cell.

[0065] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of the gene product encoded by the essential gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the essential gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g.,

[0066] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0067] In some embodiments, the exogenous partial coding sequence of the essential gene in the knock-in cassette encodes a C-terminal fragment of a protein encoded by the essential gene. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the essential gene.

[0068] In some embodiments, the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the essential gene of the cell. In some embodiments, the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the essential gene of the cell to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the cell, or to increase expression of the gene product of the essential gene and/or the gene product of interest after integration of the knock-in cassette into the genome of the cell.

[0069] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0070] In some embodiments, the essential gene is GAPDH, TBP, E2F4, G6PD, or KIF11.

[0071] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0072] In one aspect, the disclosure features a method of producing a population of modified cells, the method comprising contacting cells with: (i) a nuclease that causes a break within an endogenous coding sequence of an essential gene in a plurality of the cells, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cells, and (ii) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene, wherein the knock-in cassette is integrated into the genome of a plurality of the cells by homology-directed repair (HDR) of the break, resulting in genome-edited cells that express: (a) the gene product of interest, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the plurality of cells, or a functional variant thereof, and wherein following the contacting step, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable cells are genome-edited cells, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the cells lacking an integrated knock-in cassette are viable cells, thereby producing a population of modified cells. In some embodiments, following the contacting step, at least about 80% of the viable cells are genome-edited cells, and about 20% or less of the cells lacking an integrated knock-in cassette are viable cells. In some embodiments, following the contacting step, at least about 60% of the viable cells are genome-edited cells, and about 40% or less of the cells lacking an integrated knock-in cassette are viable cells. In some embodiments, following the contacting step, at least about 90% of the viable cells are genome-edited cells, and about 10% or less of the cells lacking an integrated knock-in cassette are viable cells. In some embodiments, following the contacting step, at least about 95% of the viable cells are genome-edited cells, and about 5% or less of cells lacking an integrated knock-in cassette are viable cells.

[0073] In some embodiments, if the knock-in cassette is not integrated into the genome of the cell by homology-directed repair (HDR) in the correct position or orientation, the cell no longer expresses the gene product encoded by the essential gene, or a functional variant thereof.

[0074] In some embodiments, the break is a double-strand break.

[0075] In some embodiments, the break is located within the last 2000, 1500, 1000, 750, 500, 400, 300, 200, 100, or 50 base pairs of the endogenous coding sequence of the essential gene. In some embodiments, the break is located within the last exon of the essential gene.

[0076] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of cells contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the cell (or the population of cells) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the essential gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the essential gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the essential gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.

[0077] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0078] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the cell. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the cell. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the cell, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the cell.

[0079] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of the gene product encoded by the essential gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the essential gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g.,

[0080] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0081] In some embodiments, the exogenous partial coding sequence of the essential gene in the knock-in cassette encodes a C-terminal fragment of a protein encoded by the essential gene. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the essential gene that spans the break.

[0082] In some embodiments, the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the essential gene of the cell. In some embodiments, the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the essential gene of the cell to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the cell, or to increase expression of the gene product of the essential gene and/or the gene product of interest after integration of the knock-in cassette into the genome of the cell.

[0083] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0084] In some embodiments, the essential gene is GAPDH, TBP, E2F4, G6PD, or KIF11.

[0085] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0086] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited cells comprise knock-in cassettes at one or both alleles of the essential gene. In some embodiments, the genome-edited cells express (a) the first and second gene products of interest, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cells, or a functional variant thereof.

[0087] In some embodiments, the method comprises contacting the cells (or the population of cells) with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene. In some embodiments, the genome-edited cells comprise the first knock-in cassette at a first allele of the essential gene and the second knock-in cassette at the second allele of the essential gene. In some embodiments, the genome-edited cells express (a) the first and second gene products of interest, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cells, or a functional variant thereof.

[0088] In some embodiments, the method comprises contacting the cells (or the population of cells) with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a first essential gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a second essential gene. In some embodiments, the genome-edited cells comprise the first knock-in cassette at one or both alleles of the first essential gene and the second knock-in cassette at one or both alleles of the second essential gene. In some embodiments, the genome-edited cells express (a) the first and second gene products of interest, and (b) the gene products encoded by the first and second essential genes required for survival and/or proliferation of the cells, or a functional variant thereof.

[0089] In another aspect, the disclosure features a method of selecting and/or identifying a cell comprising a knock-in of a gene product of interest within an endogenous coding sequence of an essential gene in the cell, the method comprising contacting a population of cells with: (i) a nuclease that causes a break within an endogenous coding sequence of an essential gene in a plurality of the cells, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cells, and (ii) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene, wherein the knock-in cassette is integrated into the genome of a plurality of the cells by homology-directed repair (HDR) of the break, and identifying a genome-edited cell within the population of cells that expresses: (a) the gene product of interest, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof.

[0090] In some embodiments, following the contacting step, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable cells of the population of cells are genome-edited cells, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the population of cells lacking an integrated knock-in cassette are viable cells. In some embodiments, following the contacting step, at least about 80% of the viable cells of the population of cells are genome-edited cells, and about 20% or less of the population of cells lacking an integrated knock-in cassette are viable cells. In some embodiments, following the contacting step, at least about 60% of the viable cells of the population of cells are genome-edited cells, and about 40% or less of the population of cells lacking an integrated knock-in cassette are viable cells. In some embodiments, following the contacting step, at least about 90% of the viable cells of the population of cells are genome-edited cells, and about 10% or less of the population of cells lacking an integrated knock-in cassette are viable cells. In some embodiments, following the contacting step, at least about 95% of the viable cells of the population of cells are genome-edited cells, and about 5% or less of the population of cells lacking an integrated knock-in cassette are viable cells.
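The arithmetic behind such thresholds can be illustrated with a minimal sketch. It is not part of the disclosed method. It assumes, purely for illustration, that every contacted cell falls into one of three hypothetical classes (unedited, correctly integrated knock-in, or disrupted essential gene without integration), that unedited and knock-in cells all remain viable, and that disrupted cells survive at a caller-supplied rate. The function name, parameter names, and the example inputs are all assumptions.

```python
def viable_knock_in_fraction(p_unedited, p_knock_in, p_disrupted,
                             disrupted_survival=0.0):
    """Fraction of viable cells that carry the knock-in cassette.

    All arguments are hypothetical population fractions (they must sum to 1).
    disrupted_survival is the assumed fraction of cells with a disrupted
    essential gene (and no cassette) that nevertheless stay viable.
    """
    total = p_unedited + p_knock_in + p_disrupted
    if abs(total - 1.0) > 1e-9:
        raise ValueError("population fractions must sum to 1")
    if not 0.0 <= disrupted_survival <= 1.0:
        raise ValueError("disrupted_survival must be between 0 and 1")

    # Assumption: unedited cells and knock-in cells all remain viable.
    viable = p_unedited + p_knock_in + p_disrupted * disrupted_survival
    if viable == 0:
        raise ValueError("no viable cells under these assumptions")
    return p_knock_in / viable


if __name__ == "__main__":
    # Illustrative inputs only, not measured data.
    fraction = viable_knock_in_fraction(0.05, 0.30, 0.65)
    print(f"{fraction:.1%} of viable cells carry the cassette")
```

Under these assumptions, the enrichment depends mainly on how few cells escape cutting, which is consistent with the emphasis on a highly efficient nuclease.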

[0091] In some embodiments, if the knock-in cassette is not integrated into the genome of the cell by homology-directed repair (HDR) in the correct position or orientation, the cell no longer expresses the gene product encoded by the essential gene, or a functional variant thereof.

[0092] In some embodiments, the break is a double-strand break.

[0093] In some embodiments, the break is located within the last 2000, 1500, 1000, 750, 500, 400, 300, 200, 100, or 50 base pairs of the endogenous coding sequence of the essential gene. In some embodiments, the break is located within the last exon of the essential gene.
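As a minimal sketch of how the "last N base pairs" criterion might be checked computationally, the function below measures the distance from a break to the 3' end of a coding sequence. The coordinate convention (a 0-based offset of the first base 3' of the break, counted along the spliced coding sequence) and all names are assumptions made for this illustration.

```python
def break_within_last_n_bp(cds_length, break_offset, n):
    """Return True if the break lies within the last n bp of the coding sequence.

    cds_length:   length of the coding sequence in base pairs.
    break_offset: assumed 0-based index of the first base 3' of the break,
                  in coding-sequence coordinates (0 <= break_offset <= cds_length).
    n:            window size in base pairs, e.g. 200.
    """
    if cds_length <= 0:
        raise ValueError("cds_length must be positive")
    if not 0 <= break_offset <= cds_length:
        raise ValueError("break_offset must lie within the coding sequence")

    # Number of coding bases remaining on the 3' side of the break.
    bases_after_break = cds_length - break_offset
    return bases_after_break <= n


if __name__ == "__main__":
    # Hypothetical 1000-bp coding sequence with a break 150 bp from its 3' end.
    print(break_within_last_n_bp(1000, 850, 200))
    print(break_within_last_n_bp(1000, 850, 100))
```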

[0094] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of cells contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the cell (or the population of cells) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the essential gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the essential gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the essential gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.
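The "differs by no more than 3 nucleotides" criterion for a targeting domain can be sketched as a mismatch count against every same-length window of the coding sequence. The sketch below is illustrative only. It assumes the targeting domain is written 5' to 3', that "complementary" means the reverse complement of the coding-strand window, and that RNA guides are compared after replacing U with T. The function names and the example sequence are hypothetical and are not taken from the sequence listing.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def min_mismatches(targeting_domain, coding_sequence):
    """Fewest mismatches between the targeting domain and the reverse
    complement of any same-length window of the coding sequence."""
    guide = targeting_domain.upper().replace("U", "T")
    seq = coding_sequence.upper()
    k = len(guide)
    if k == 0 or k > len(seq):
        raise ValueError("targeting domain must be non-empty and no longer "
                         "than the coding sequence")
    best = k
    for start in range(len(seq) - k + 1):
        complement = reverse_complement(seq[start:start + k])
        mismatches = sum(a != b for a, b in zip(guide, complement))
        best = min(best, mismatches)
    return best


def targeting_domain_within_tolerance(targeting_domain, coding_sequence,
                                      max_mismatches=3):
    """True if some window is complementary with at most max_mismatches."""
    return min_mismatches(targeting_domain, coding_sequence) <= max_mismatches


if __name__ == "__main__":
    cds = "ATGGCCAAGGTTCACCGTAA"             # hypothetical coding fragment
    guide = reverse_complement(cds[3:13])    # perfectly complementary 10-mer
    print(min_mismatches(guide, cds))
```

A practical guide-design workflow would also consider the PAM and genome-wide off-target sites, which this sketch does not attempt.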

[0095] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0096] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the cell. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the cell. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the cell, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the cell.

[0097] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of the gene product encoded by the essential gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the essential gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g.,

[0098] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.
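The relative positions of the donor template elements described in the preceding paragraphs can be summarized with a small order check. This is a sketch under stated assumptions, not a definitive representation: it assumes a donor in which both homology arms, a separating element, and a polyadenylation sequence are all present, and the element labels (such as "essential_gene_cds" and "cargo_cds") are hypothetical names chosen for illustration.

```python
from typing import Sequence

# Separating elements named in the text (IRES or a 2A element).
ALLOWED_SEPARATORS = {"IRES", "T2A", "P2A", "E2A", "F2A"}


def cassette_order_is_valid(elements: Sequence[str]) -> bool:
    """Check a 5'-to-3' list of element labels against the assumed layout:

    5' arm, essential-gene coding sequence, separator, cargo coding
    sequence, optional 3' UTR, polyadenylation sequence, 3' arm.
    """
    elements = list(elements)
    if len(elements) not in (6, 7):
        return False
    has_utr = len(elements) == 7
    # If a 3' UTR is present it sits 3' of the cargo and 5' of the polyA.
    expected_tail = (["3_utr"] if has_utr else []) + ["polyA", "3_homology_arm"]
    return (
        elements[0] == "5_homology_arm"
        and elements[1] == "essential_gene_cds"
        and elements[2] in ALLOWED_SEPARATORS
        and elements[3] == "cargo_cds"
        and elements[4:] == expected_tail
    )


if __name__ == "__main__":
    donor = ["5_homology_arm", "essential_gene_cds", "P2A",
             "cargo_cds", "3_utr", "polyA", "3_homology_arm"]
    print(cassette_order_is_valid(donor))
```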

[0099] In some embodiments, the exogenous partial coding sequence of the essential gene in the knock-in cassette encodes a C-terminal fragment of a protein encoded by the essential gene. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the essential gene that spans the break.

[0100] In some embodiments, the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the essential gene of the cell. In some embodiments, the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the essential gene of the cell to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the cell, or to increase expression of the gene product of the essential gene and/or the gene product of interest after integration of the knock-in cassette into the genome of the cell.

[0101] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.
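One way to picture a silent mutation of a PAM site is to enumerate single-base substitutions that remove the PAM while leaving the encoded amino acid unchanged. The sketch below is illustrative only and rests on several assumptions: an NGG PAM read on the coding strand, an in-frame coding sequence starting at position 0, the standard genetic code, and hypothetical function and variable names. PAMs of other nucleases, or PAMs on the opposite strand, would need a different check.

```python
# Standard genetic code, codons ordered by base T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def silent_pam_disruptions(cds, pam_start):
    """List (position, new_base, new_codon) for single-base changes to either
    G of an NGG PAM at cds[pam_start:pam_start + 3] that keep the amino acid."""
    cds = cds.upper()
    if pam_start < 0 or pam_start + 3 > len(cds):
        raise ValueError("PAM must lie inside the coding sequence")
    if cds[pam_start + 1:pam_start + 3] != "GG":
        raise ValueError("no NGG PAM at the given position")

    results = []
    for pos in (pam_start + 1, pam_start + 2):
        codon_start = pos - pos % 3
        codon = cds[codon_start:codon_start + 3]
        if len(codon) < 3:
            continue  # incomplete trailing codon, cannot be evaluated
        offset = pos - codon_start
        for base in "ACT":  # any base other than G removes the GG dinucleotide
            mutated = codon[:offset] + base + codon[offset + 1:]
            if CODON_TABLE[mutated] == CODON_TABLE[codon]:
                results.append((pos, base, mutated))
    return results


if __name__ == "__main__":
    # Hypothetical coding sequence: ATG GCA GGG AAA TAA
    print(silent_pam_disruptions("ATGGCAGGGAAATAA", 6))
```

When no synonymous change is available at either G, the returned list is empty, which corresponds to the case where a missense change or recoding of neighbouring codons would have to be considered instead.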

[0102] In some embodiments, the essential gene is GAPDH, TBP, E2F4, G6PD, or KIF11.

[0103] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0104] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited cell comprises knock-in cassettes at one or both alleles of the essential gene. In some embodiments, the genome-edited cell expresses (a) the first and second gene products of interest, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof.

[0105] In some embodiments, the method comprises contacting the population of cells with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene. In some embodiments, the genome-edited cells comprise the first knock-in cassette at a first allele of the essential gene and the second knock-in cassette at the second allele of the essential gene. In some embodiments, the genome-edited cells express (a) the first and second gene products of interest, and (b) the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof.

[0106] In some embodiments, the method comprises contacting the population of cells with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a first essential gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a second essential gene. In some embodiments, the genome-edited cells comprise the first knock-in cassette at one or both alleles of the first essential gene and the second knock-in cassette at one or both alleles of the second essential gene. In some embodiments, the genome-edited cell expresses (a) the first and second gene products of interest, and (b) the gene products encoded by the first and second essential genes required for survival and/or proliferation of the cell, or a functional variant thereof.

[0107] In another aspect, the disclosure features a method of editing the genome of an induced pluripotent stem cell (iPSC) (e.g., an iPSC in a population of iPSCs), the method comprising contacting the iPSC (or the population of iPSCs) with: (i) a nuclease that causes a break within an endogenous coding sequence of a glyceraldehyde 3-phosphate dehydrogenase (GAPDH) gene in the iPSC, and (ii) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, wherein the knock-in cassette is integrated into the genome of the iPSC by homology-directed repair (HDR) of the break, resulting in a genome-edited iPSC that expresses: (a) the gene product of interest, and (b) GAPDH, or a functional variant thereof.

[0108] In some embodiments, following the contacting step, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 80% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 20% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 60% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 40% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 90% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 10% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 95% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 5% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs.

[0109] In some embodiments, if the knock-in cassette is not integrated into the genome of the iPSCs by homology-directed repair (HDR) in the correct position or orientation, the iPSCs no longer express GAPDH, or a functional variant thereof.

[0110] In some embodiments, the break is a double-strand break.

[0111] In some embodiments, the break is located within the last 2000, 1500, 1000, 750,

[0112] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of iPSCs contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the iPSC (or the population of iPSCs) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.

[0113] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0114] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC.

[0115] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.
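The combination of a GSG linker peptide with a 2A element can be sketched as simple string assembly at the peptide level. The example 2A sequences below are the ones given in this paragraph, written as contiguous one-letter strings. The dictionary, the function name, and the check for a C-terminal NPGP motif are assumptions added for illustration, not part of the disclosure.

```python
# Example 2A peptide sequences quoted in the paragraph above.
TWO_A_PEPTIDES = {
    "T2A": "EGRGSLLTCGDVEENPGP",
    "P2A": "ATNFSLLKQAGDVEENPGP",
    "E2A": "QCTNYALLKLAGDVESNPGP",
    "F2A": "VKQTLNFDLLKLAGDVESNPGP",
}

GSG_LINKER = "GSG"


def separator_peptide(element, with_linker=True):
    """Return the peptide encoded between the essential-gene sequence and the
    cargo: an optional GSG linker followed by the chosen 2A peptide."""
    try:
        peptide = TWO_A_PEPTIDES[element]
    except KeyError:
        raise ValueError(f"unknown 2A element: {element!r}") from None
    # Sanity check (an assumption of this sketch): each example ends in NPGP.
    if not peptide.endswith("NPGP"):
        raise ValueError("2A peptide lacks the expected C-terminal NPGP motif")
    return (GSG_LINKER if with_linker else "") + peptide


if __name__ == "__main__":
    print(separator_peptide("P2A"))
    print(separator_peptide("T2A", with_linker=False))
```

Placing the linker on the N-terminal side of the 2A peptide mirrors the statement that the linker is encoded upstream of the 2A element.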

[0116] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0117] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of a protein encoded by the GAPDH gene. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0118] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0119] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0120] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0121] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited iPSC comprises knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0122] In some embodiments, the method comprises contacting the iPSC (or the population of iPSCs) with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, the genome-edited iPSC comprises the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0123] In another aspect, the disclosure features a genetically modified iPSC comprising a genome with an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of a coding sequence of a GAPDH gene, and wherein at least part of the coding sequence of the GAPDH gene comprises an exogenous coding sequence.

[0124] In some embodiments, the exogenous coding sequence of the GAPDH gene comprises about 2000, 1500, 1000, 750, 500, 400, 300, 200, 100, or 50 base pairs of the coding sequence of the GAPDH gene. In some embodiments, the exogenous coding sequence of the GAPDH gene comprises about 200 base pairs of the coding sequence of the GAPDH gene.

[0125] In some embodiments, the exogenous coding sequence of the GAPDH gene encodes a C-terminal fragment of a protein encoded by the GAPDH gene. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0126] In some embodiments, the exogenous coding sequence of the GAPDH gene is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence of the GAPDH gene has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of a nuclease, e.g., a Cas. In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence of the GAPDH gene includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0127] In some embodiments, the iPSC’s genome comprises a regulatory element that enables expression of the gene product encoded by the GAPDH gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the iPSC’s genome comprises an IRES or 2A element located between the coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest.

[0128] In some embodiments, the iPSC’s genome comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0129] In some embodiments, the iPSC’s genome does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0130] In another aspect, the disclosure features an engineered iPSC comprising a genomic modification, wherein the genomic modification comprises an insertion of an exogenous knock-in cassette within an endogenous coding sequence of a GAPDH gene in the iPSC’s genome, wherein the knock-in cassette comprises an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence encoding GAPDH, or a functional variant thereof, and wherein the iPSC expresses the gene product of interest and GAPDH, or a functional variant thereof, optionally wherein the gene product of interest and GAPDH are expressed from the endogenous GAPDH promoter.

[0131] In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH comprises about 2000, 1500, 1000, 750, 500, 400, 300, 200, 100, or 50 base pairs of the coding sequence of the GAPDH gene. In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH comprises about 200 base pairs of the coding sequence of the GAPDH gene.

[0132] In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0133] In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of a nuclease, e.g., a Cas. In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence encoding GAPDH includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0134] In some embodiments, the iPSC’s genome comprises a regulatory element that enables expression of the gene product encoded by the GAPDH gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the iPSC’s genome comprises an IRES or 2A element located between the coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest.

[0135] In some embodiments, the iPSC’s genome comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0136] In some embodiments, the iPSC’s genome does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0137] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited iPSC comprises knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0138] In some embodiments, the engineered iPSC comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, the engineered iPSC comprises the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the engineered iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0139] In another aspect, the disclosure features an immune cell (e.g., an iNK cell or T cell) differentiated from an iPSC described herein.

[0140] In another aspect, the disclosure features any of the iPSCs (or iNK or T cell differentiated from an iPSC) described herein for use as a medicament and/or for use in the treatment of a disease, disorder or condition, e.g., a disease, disorder or condition described herein, e.g., a cancer, e.g., a cancer described herein.

[0141] In another aspect, the disclosure features an iPSC, or a population of iPSCs, produced by any of the methods described herein, or progeny thereof.

[0142] In another aspect, the disclosure features a system for editing the genome of an iPSC (or an iPSC in a population of iPSCs), the system comprising the iPSC (or the population of iPSCs), a nuclease that causes a break within an endogenous coding sequence of a GAPDH gene of the iPSC, and a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene.

[0143] In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 80% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 20% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 60% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 40% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 90% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 10% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 95% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 5% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs.

[0144] In some embodiments, after contacting the iPSC or population of iPSCs with the nuclease and the donor template, if the knock-in cassette is not integrated into the genome of the iPSC by homology-directed repair (HDR) in the correct position or orientation, the iPSC no longer expresses GAPDH or a functional variant thereof.

[0145] In some embodiments, the break is a double-strand break.

[0146] In some embodiments, the break is located within the last 2000, 1500, 1000, 750,

[0147] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of iPSCs contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the iPSC (or the population of iPSCs) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.
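
By way of illustration only, the "differs by no more than 3 nucleotides" criterion for a targeting domain can be checked as in the following minimal sketch. The sequences shown are hypothetical placeholders, not sequences of the disclosure, and the sketch assumes the targeting domain is written in the DNA alphabet (T in place of U) and has the same length as the target portion.

```python
# Map each DNA base to its complement (assumes an A/C/G/T alphabet).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatches_to_perfect_guide(targeting_domain, target_portion):
    """Count positions at which the targeting domain differs from the
    sequence that is perfectly complementary to target_portion."""
    perfect = reverse_complement(target_portion)
    if len(perfect) != len(targeting_domain):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(targeting_domain.upper(), perfect))


# Hypothetical 10-nt target portion and a candidate targeting domain.
target = "ACGTACGTAC"
candidate = "GTACGTACGA"
n_mismatches = mismatches_to_perfect_guide(candidate, target)
within_limit = n_mismatches <= 3
```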

[0148] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0149] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC.

[0150] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.

[0151] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.
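
By way of illustration only, the 5’ to 3’ order of cassette elements described above (GAPDH partial coding sequence, separator element, coding sequence for the gene product of interest, optional 3’ UTR, polyadenylation sequence) can be sketched as follows. All sequences are short hypothetical placeholders, and the frame check assumes that the partial coding sequence begins on a codon boundary.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def assemble_cassette(partial_cds, separator, cargo_cds, polya, utr3=""):
    """Concatenate cassette elements 5' to 3' after basic frame checks.

    partial_cds and separator must be whole codons without an in-frame
    stop, so that the cargo stays in frame with and downstream of them.
    """
    for name, part in (("partial_cds", partial_cds), ("separator", separator)):
        if len(part) % 3 != 0:
            raise ValueError(f"{name} length must be a multiple of 3")
        if STOP_CODONS & set(codons(part)):
            raise ValueError(f"{name} contains an in-frame stop codon")
    if len(cargo_cds) % 3 != 0:
        raise ValueError("cargo_cds length must be a multiple of 3")
    # A 3' UTR, if present, sits 3' of the cargo and 5' of the poly(A) signal.
    return partial_cds + separator + cargo_cds + utr3 + polya


# Hypothetical placeholders: two codons of partial coding sequence, a
# GSG-encoding stand-in for the separator, a three-codon cargo ending in a
# stop codon, and a poly(A) signal hexamer.
cassette = assemble_cassette("GCCAAG", "GGATCCGGA", "ATGGTGTAA", "AATAAA")
```

In practice the separator would encode a full 2A peptide or be an IRES, and the homology arms would flank the returned cassette; those are omitted here to keep the sketch short.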

[0152] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.
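
By way of illustration only, the length of a C-terminal fragment that spans the break can be estimated from the position of the break within the coding sequence, as in the following minimal sketch. The coordinates are hypothetical, and the sketch assumes that the fragment starts at the first nucleotide of the codon immediately 5’ of the break and runs to the last sense codon.

```python
def c_terminal_fragment(cds_length_nt, break_offset_nt):
    """Return (start_nt, length_aa) of a codon-aligned C-terminal fragment.

    cds_length_nt   : coding sequence length in nucleotides, stop codon included
    break_offset_nt : 0-based index of the first nucleotide 3' of the break
    """
    if cds_length_nt % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if not 0 < break_offset_nt < cds_length_nt - 3:
        raise ValueError("break must lie inside the coding sequence, 5' of the stop codon")
    # Start at the codon containing the nucleotide just 5' of the break,
    # so that the encoded fragment spans the break.
    start_nt = ((break_offset_nt - 1) // 3) * 3
    length_aa = (cds_length_nt - 3 - start_nt) // 3
    return start_nt, length_aa


# Hypothetical example: a 1008-nt coding sequence cut at offset 960.
start_nt, length_aa = c_terminal_fragment(1008, 960)
shorter_than_25 = length_aa < 25
```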

[0153] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0154] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.
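
By way of illustration only, silent mutations that disrupt a PAM site within the exogenous coding sequence can be enumerated as in the following minimal sketch. It assumes an NGG PAM (as recognized by S. pyogenes Cas9) located on the coding strand, uses the standard genetic code, and considers only single-nucleotide substitutions; the example sequence is a hypothetical placeholder.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an in-frame coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def silent_pam_disruptions(cds, pam_start):
    """Return variants of cds in which one G of the NGG PAM starting at
    pam_start is substituted without changing the encoded protein."""
    if cds[pam_start + 1:pam_start + 3] != "GG":
        raise ValueError("no NGG PAM at pam_start")
    protein = translate(cds)
    variants = []
    for pos in (pam_start + 1, pam_start + 2):
        for base in "ACT":
            mutated = cds[:pos] + base + cds[pos + 1:]
            if translate(mutated) == protein:
                variants.append(mutated)
    return variants


# Hypothetical 3-codon sequence (Ala-Ser-Glu) with a CGG PAM at index 4.
variants = silent_pam_disruptions("GCCTCGGAG", 4)
```

Here only the first G of the PAM falls on a wobble position, so the three returned variants all change the serine codon TCG to a synonymous codon. The sketch does not check whether a substitution creates a new PAM elsewhere.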

[0155] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0156] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, the genome-edited iPSC comprises knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0157] In some embodiments, the system comprises a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor templates, the genome-edited iPSC comprises the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0158] In another aspect, the disclosure features a donor template comprising a knock-in cassette with an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a GAPDH gene.

[0159] In some embodiments, the donor template is for use in editing the genome of an iPSC by homology-directed repair (HDR).

[0160] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0161] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of a target site in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of a target site in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of a target site in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of a target site in the genome of the iPSC.

[0162] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.

[0163] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0164] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene.

[0165] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0166] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0167] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0168] In another aspect, the disclosure features a method of producing a population of modified iPSCs, the method comprising contacting iPSCs with: (i) a nuclease that causes a break within an endogenous coding sequence of a GAPDH gene in a plurality of the iPSCs, and (ii) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, wherein the knock-in cassette is integrated into the genome of a plurality of the iPSCs by homology-directed repair (HDR) of the break, resulting in genome-edited iPSCs that express: (a) the gene product of interest, and (b) GAPDH, or a functional variant thereof, and wherein following the contacting step, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable iPSCs are genome-edited iPSCs, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the iPSCs lacking an integrated knock-in cassette are viable iPSCs, thereby producing a population of modified iPSCs. In some embodiments, following the contacting step, at least about 80% of the viable iPSCs are genome-edited iPSCs, and about 20% or less of the iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 60% of the viable iPSCs are genome-edited iPSCs, and about 40% or less of the iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 90% of the viable iPSCs are genome-edited iPSCs, and about 10% or less of the iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 95% of the viable iPSCs are genome-edited iPSCs, and about 5% or less of the iPSCs lacking an integrated knock-in cassette are viable iPSCs.

[0169] In some embodiments, if the knock-in cassette is not integrated into the genome of the iPSC by homology-directed repair (HDR) in the correct position or orientation, the iPSC no longer expresses GAPDH, or a functional variant thereof.

[0170] In some embodiments, the break is a double-strand break.

[0171] In some embodiments, the break is located within the last 2000, 1500, 1000, 750,

[0172] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of iPSCs contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the iPSC (or the population of iPSCs) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.

[0173] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0174] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC.

[0175] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.

[0176] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0177] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0178] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0179] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0180] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0181] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited iPSCs comprise knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSCs express (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0182] In some embodiments, the method comprises contacting iPSCs (or the population of iPSCs) with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, the genome-edited iPSCs comprise the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the genome-edited iPSCs express (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0183] In another aspect, the disclosure features a method of selecting and/or identifying an iPSC comprising a knock-in of a gene product of interest within an endogenous coding sequence of a GAPDH gene in the iPSC, the method comprising contacting a population of iPSCs with: (i) a nuclease that causes a break within an endogenous coding sequence of a GAPDH gene in a plurality of the iPSCs, and (ii) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, wherein the knock-in cassette is integrated into the genome of a plurality of the iPSCs by homology-directed repair (HDR) of the break, and identifying a genome-edited iPSC within the population of iPSCs that expresses: (a) the gene product of interest, and (b) GAPDH, or a functional variant thereof.

[0184] In some embodiments, following the contacting step, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 80% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 20% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 60% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 40% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 90% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 10% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 95% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 5% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs.

[0185] In some embodiments, if the knock-in cassette is not integrated into the genome of the iPSC by homology-directed repair (HDR) in the correct position or orientation, the iPSC no longer expresses GAPDH, or a functional variant thereof.

[0186] In some embodiments, the break is a double-strand break.

[0187] In some embodiments, the break is located within the last 2000, 1500, 1000, 750,

[0188] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of iPSCs contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the iPSC (or the population of iPSCs) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.

[0189] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0190] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC.

[0191] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.

[0192] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0193] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0194] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0195] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0196] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0197] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited iPSC comprises knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0198] In some embodiments, the method comprises contacting the population of iPSCs with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, the genome-edited iPSCs comprise the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the genome-edited iPSCs express (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0199] In another aspect, the disclosure features a method of editing the genome of an induced pluripotent stem cell (iPSC) (e.g., an iPSC in a population of iPSCs), the method comprising contacting the iPSC (or the population of iPSCs) with: (i) a nuclease that causes a break within an endogenous coding sequence of a glyceraldehyde 3-phosphate dehydrogenase (GAPDH) gene in the iPSC, and (ii) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, wherein the knock-in cassette is integrated into the genome of the iPSC by homology-directed repair (HDR) of the break, resulting in a genome-edited iPSC that expresses: (a) the gene product of interest, and (b) GAPDH, or a functional variant thereof, wherein the gene product of interest is a chimeric antigen receptor (CAR), a non-naturally occurring variant of FcγRIII (CD16), interleukin 15 (IL-15), interleukin 15 receptor (IL-15R) or a variant thereof, interleukin 12 (IL-12), interleukin-12 receptor (IL-12R) or a variant thereof, human leukocyte antigen G (HLA-G), human leukocyte antigen E (HLA-E), leukocyte surface antigen cluster of differentiation CD47 (CD47), or any combination of two or more thereof.

[0200] In some embodiments, following the contacting step, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 80% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 20% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 60% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 40% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 90% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 10% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 95% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 5% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs.
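
As an illustration of how the two percentages recited above relate to cell counts, the following minimal Python sketch computes them from hypothetical counts. The class name, the function names, and the example numbers are assumptions introduced for illustration only and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PopulationCounts:
    """Hypothetical cell counts after the contacting step."""
    edited_viable: int        # viable cells carrying the integrated knock-in cassette
    unedited_viable: int      # viable cells lacking an integrated cassette
    unedited_nonviable: int   # non-viable cells lacking an integrated cassette


def edited_fraction_of_viable(c: PopulationCounts) -> float:
    """Fraction of the viable cells that are genome-edited."""
    viable = c.edited_viable + c.unedited_viable
    if viable == 0:
        raise ValueError("no viable cells")
    return c.edited_viable / viable


def viable_fraction_of_unedited(c: PopulationCounts) -> float:
    """Fraction of the cells lacking the cassette that are still viable."""
    unedited = c.unedited_viable + c.unedited_nonviable
    if unedited == 0:
        raise ValueError("no unedited cells")
    return c.unedited_viable / unedited


def meets_thresholds(c: PopulationCounts, min_edited: float, max_unedited_viable: float) -> bool:
    """True if both recited conditions hold for the given thresholds."""
    return (edited_fraction_of_viable(c) >= min_edited
            and viable_fraction_of_unedited(c) <= max_unedited_viable)


# Hypothetical example: 900 edited viable, 100 unedited viable, 900 unedited non-viable.
counts = PopulationCounts(edited_viable=900, unedited_viable=100, unedited_nonviable=900)
print(edited_fraction_of_viable(counts), viable_fraction_of_unedited(counts))
print(meets_thresholds(counts, min_edited=0.80, max_unedited_viable=0.20))
```

The two fractions use different denominators (all viable cells versus all cells lacking the cassette), which is why they are computed separately in this sketch.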

[0201] In some embodiments, if the knock-in cassette is not integrated into the genome of the iPSCs by homology-directed repair (HDR) in the correct position or orientation, the iPSCs no longer express GAPDH, or a functional variant thereof.

[0202] In some embodiments, the break is a double-strand break.

[0203] In some embodiments, the break is located within the last 2000, 1500, 1000, 750,

[0204] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of iPSCs contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the iPSC (or the population of iPSCs) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.

[0205] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0206] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC.

[0207] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.
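
By way of illustration only, the following Python sketch models, at the amino acid level, how a GSG linker and a 2A element placed between two coding sequences can yield separate gene products. The four 2A sequences are those listed above. The split point (between the final glycine and proline of the 2A peptide) reflects the generally described ribosomal-skipping behavior of 2A peptides and is an assumption of this sketch rather than a statement of the disclosure. The function names and the short placeholder protein sequences are hypothetical.

```python
# 2A peptide sequences as listed in the paragraph above.
TWO_A_PEPTIDES = {
    "T2A": "EGRGSLLTCGDVEENPGP",
    "P2A": "ATNFSLLKQAGDVEENPGP",
    "E2A": "QCTNYALLKLAGDVESNPGP",
    "F2A": "VKQTLNFDLLKLAGDVESNPGP",
}
GSG_LINKER = "GSG"


def build_fusion(upstream: str, cargo: str, element: str = "P2A", with_linker: bool = True) -> str:
    """Concatenate upstream protein, optional GSG linker, 2A peptide, and cargo."""
    linker = GSG_LINKER if with_linker else ""
    return upstream + linker + TWO_A_PEPTIDES[element] + cargo


def split_at_2a(fusion: str, element: str = "P2A") -> tuple[str, str]:
    """Split the fusion between the last G and P of the 2A peptide (assumed skip site)."""
    two_a = TWO_A_PEPTIDES[element]
    start = fusion.find(two_a)
    if start < 0:
        raise ValueError(f"{element} sequence not found in fusion")
    cut = start + len(two_a) - 1  # index of the final proline
    return fusion[:cut], fusion[cut:]


# Hypothetical placeholder sequences standing in for the GAPDH C-terminus and a cargo protein.
fusion = build_fusion("MKV", "MSK", element="T2A")
upstream_product, downstream_product = split_at_2a(fusion, element="T2A")
print(upstream_product)
print(downstream_product)
```

In this model the upstream product retains the linker and most of the 2A peptide as a C-terminal tail, while the downstream product begins with a proline.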

[0208] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0209] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of a protein encoded by the GAPDH gene. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0210] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0211] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0212] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0213] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited iPSC comprises knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0214] In some embodiments, the method comprises contacting the iPSC (or the population of iPSCs) with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, the genome-edited iPSC comprises the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0215] In some embodiments, the genome-edited iPSC comprises multi-cistronic knock-ins (e.g., at one or both alleles of GAPDH gene) of two or more gene products of interest, e.g., one or more of the following gene products of interest, in order: CD16 + IL15; IL15 + CD16; CD16 + CAR; CAR + CD16; IL15 + CAR; CAR + IL15; CD16 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CD16; IL15 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + IL15; CAR + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CAR.

In some embodiments, the genome-edited iPSC comprises bi-allelic knock-ins (e.g., a first gene product of interest at a first allele of GAPDH gene, and a second gene product of interest at a second allele of GAPDH gene) of the following pairs of gene products of interest: CD16 + IL15; IL15 + CD16; CD16 + CAR; CAR + CD16; IL15 + CAR; CAR + IL15; CD16 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CD16; IL15 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + IL15; CAR + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CAR.

[0216] In some embodiments, the method comprises contacting the iPSC (or the population of iPSCs) with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a GAPDH gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a second essential gene. In some embodiments, the genome-edited iPSC comprises the first knock-in cassette at one or both alleles of the GAPDH gene and the second knock-in cassette at one or both alleles of the second essential gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, (b) GAPDH, and (c) the gene product encoded by the second essential gene required for survival and/or proliferation of the iPSC, or a functional variant thereof. In some embodiments, the second essential gene is a gene listed in Table 3 or 4. In some embodiments, the second essential gene is TBP.

[0217] In another aspect, the disclosure features a genetically modified iPSC comprising a genome with an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of a coding sequence of a GAPDH gene, wherein at least part of the coding sequence of the GAPDH gene comprises an exogenous coding sequence, and wherein the gene product of interest is a chimeric antigen receptor (CAR), a non-naturally occurring variant of FcγRIII (CD16), interleukin 15 (IL-15), interleukin 15 receptor (IL-15R) or a variant thereof, interleukin 12 (IL-12), interleukin-12 receptor (IL-12R) or a variant thereof, human leukocyte antigen G (HLA-G), human leukocyte antigen E (HLA-E), leukocyte surface antigen cluster of differentiation CD47 (CD47), or any combination of two or more thereof.

[0218] In some embodiments, the exogenous coding sequence of the GAPDH gene comprises about 2000, 1500, 1000, 750, 500, 400, 300, 200, 100, or 50 base pairs of the coding sequence of the GAPDH gene. In some embodiments, the exogenous coding sequence of the GAPDH gene comprises about 200 base pairs of the coding sequence of the GAPDH gene.

[0219] In some embodiments, the exogenous coding sequence of the GAPDH gene encodes a C-terminal fragment of a protein encoded by the GAPDH gene. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0220] In some embodiments, the exogenous coding sequence of the GAPDH gene is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence of the GAPDH gene has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of a nuclease, e.g., a Cas. In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence of the GAPDH gene includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0221] In some embodiments, the iPSC’s genome comprises a regulatory element that enables expression of the gene product encoded by the GAPDH gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the iPSC’s genome comprises an IRES or 2A element located between the coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest.

[0222] In some embodiments, the iPSC’s genome comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0223] In some embodiments, the iPSC’s genome does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0224] In another aspect, the disclosure features an engineered iPSC comprising a genomic modification, wherein the genomic modification comprises an insertion of an exogenous knock-in cassette within an endogenous coding sequence of a GAPDH gene in the iPSC’s genome, wherein the knock-in cassette comprises an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence encoding GAPDH, or a functional variant thereof, wherein the iPSC expresses the gene product of interest and GAPDH, or a functional variant thereof, optionally wherein the gene product of interest and GAPDH are expressed from the endogenous GAPDH promoter, and wherein the gene product of interest is a chimeric antigen receptor (CAR), a non-naturally occurring variant of FcγRIII (CD16), interleukin 15 (IL-15), interleukin 15 receptor (IL-15R) or a variant thereof, interleukin 12 (IL-12), interleukin-12 receptor (IL-12R) or a variant thereof, human leukocyte antigen G (HLA-G), human leukocyte antigen E (HLA-E), leukocyte surface antigen cluster of differentiation CD47 (CD47), or any combination of two or more thereof.

[0225] In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH comprises about 2000, 1500, 1000, 750, 500, 400, 300, 200, 100, or 50 base pairs of the coding sequence of the GAPDH gene. In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH comprises about 200 base pairs of the coding sequence of the GAPDH gene.

[0226] In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0227] In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of a nuclease, e.g., a Cas. In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence encoding GAPDH includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0228] In some embodiments, the iPSC’s genome comprises a regulatory element that enables expression of the gene product encoded by the GAPDH gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the iPSC’s genome comprises an IRES or 2A element located between the coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest.

[0229] In some embodiments, the iPSC’s genome comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0230] In some embodiments, the iPSC’s genome does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0231] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited iPSC comprises knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0232] In some embodiments, the engineered iPSC comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, the engineered iPSC comprises the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the engineered iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0233] In some embodiments, the engineered iPSC comprises multi-cistronic knock-ins (e.g., at one or both alleles of GAPDH gene) of two or more gene products of interest, e.g., one or more of the following gene products of interest, in order: CD16 + IL15; IL15 + CD16; CD16 + CAR; CAR + CD16; IL15 + CAR; CAR + IL15; CD16 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CD16; IL15 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + IL15; CAR + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CAR.

In some embodiments, the engineered iPSC comprises bi-allelic knock-ins (e.g., a first gene product of interest at a first allele of GAPDH gene, and a second gene product of interest at a second allele of GAPDH gene) of the following pairs of gene products of interest: CD16 + IL15; IL15 + CD16; CD16 + CAR; CAR + CD16; IL15 + CAR; CAR + IL15; CD16 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CD16; IL15 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + IL15; CAR + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CAR.

[0234] In some embodiments, the engineered iPSC comprises the first knock-in cassette at one or both alleles of the GAPDH gene and the second knock-in cassette at one or both alleles of a second essential gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, (b) GAPDH, and (c) the gene product encoded by the second essential gene required for survival and/or proliferation of the iPSC, or a functional variant thereof. In some embodiments, the second essential gene is a gene listed in Table 3 or 4. In some embodiments, the second essential gene is TBP.

[0235] In another aspect, the disclosure features an immune cell (e.g., an iNK cell or T cell) differentiated from an iPSC described herein.

[0236] In another aspect, the disclosure features any of the iPSCs (or iNK or T cell differentiated from an iPSC) described herein for use as a medicament and/or for use in the treatment of a disease, disorder or condition, e.g., a disease, disorder or condition described herein, e.g., a cancer, e.g., a cancer described herein.

[0237] In another aspect, the disclosure features an iPSC, or a population of iPSCs, produced by any of the methods described herein, or progeny thereof.

[0238] In another aspect, the disclosure features a system for editing the genome of an iPSC (or an iPSC in a population of iPSCs), the system comprising the iPSC (or the population of iPSCs), a nuclease that causes a break within an endogenous coding sequence of a GAPDH gene of the iPSC, and a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and wherein the gene product of interest is a chimeric antigen receptor (CAR), a non-naturally occurring variant of FcγRIII (CD16), interleukin 15 (IL-15), interleukin 15 receptor (IL-15R) or a variant thereof, interleukin 12 (IL-12), interleukin-12 receptor (IL-12R) or a variant thereof, human leukocyte antigen G (HLA-G), human leukocyte antigen E (HLA-E), leukocyte surface antigen cluster of differentiation CD47 (CD47), or any combination of two or more thereof.

[0239] In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 80% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 20% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 60% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 40% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 90% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 10% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 95% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 5% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs.

[0240] In some embodiments, after contacting the iPSC or population of iPSCs with the nuclease and the donor template, if the knock-in cassette is not integrated into the genome of the iPSC by homology-directed repair (HDR) in the correct position or orientation, the iPSC no longer expresses GAPDH or a functional variant thereof.

[0241] In some embodiments, the break is a double-strand break.

[0242] In some embodiments, the break is located within the last 2000, 1500, 1000, 750,

[0243] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of iPSCs contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the iPSC (or the population of iPSCs) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.

[0244] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0245] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC.

[0246] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.

[0247] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.
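
The relative order of donor template elements described in the preceding paragraphs can be sketched as a simple ordered list. The following Python sketch is illustrative only: the function name, parameter names, and element labels are hypothetical, and the sketch covers only the optional features mentioned above (a GSG linker upstream of a 2A element, and a 3’ UTR between the cargo coding sequence and the polyadenylation sequence).

```python
TWO_A_ELEMENTS = {"T2A", "P2A", "E2A", "F2A"}


def donor_layout(separator: str = "P2A", gsg_linker: bool = True,
                 three_prime_utr: bool = True) -> list[str]:
    """Return donor template elements in 5'-to-3' order (illustrative labels)."""
    if separator != "IRES" and separator not in TWO_A_ELEMENTS:
        raise ValueError(f"unsupported separator: {separator}")

    layout = ["5' homology arm", "exogenous GAPDH coding sequence (full or partial)"]
    # The linker peptide is described as sitting upstream of a 2A element.
    if gsg_linker and separator in TWO_A_ELEMENTS:
        layout.append("GSG linker")
    layout.append(separator)
    layout.append("cargo coding sequence")
    # If present, the 3' UTR lies 3' of the cargo and 5' of the polyadenylation sequence.
    if three_prime_utr:
        layout.append("3' UTR")
    layout.append("polyadenylation sequence")
    layout.append("3' homology arm")
    return layout


for element in donor_layout(separator="T2A"):
    print(element)
```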

[0248] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0249] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0250] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.
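One way to picture the recoding described in the two preceding paragraphs is a check that a recoded fragment (a) still encodes the same amino acids and (b) no longer contains the nuclease target site. The sketch below is illustrative only. The 18-nucleotide sequences and the 12-nucleotide "target site" are toy values invented for the example, not GAPDH sequences or real guide targets. It covers only silent changes (the text also contemplates missense mutations) and searches only one strand.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence; trailing partial codons are ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def is_silent_recoding(endogenous: str, recoded: str) -> bool:
    """True if both sequences have the same length and encode the same peptide."""
    return len(endogenous) == len(recoded) and translate(endogenous) == translate(recoded)


def target_site_removed(recoded: str, protospacer_plus_pam: str) -> bool:
    """True if the target site (protospacer followed by PAM) is absent from this strand."""
    return protospacer_plus_pam not in recoded


# Toy values (assumptions for illustration only).
endogenous = "GCTGGGAAGCTCACTGGC"   # encodes A-G-K-L-T-G
recoded = "GCCGGCAAACTGACCGGA"      # synonymous codons throughout
toy_target = "GAAGCTCACTGG"         # shortened protospacer ending in a TGG (NGG) PAM

print(translate(endogenous), translate(recoded))
print(is_silent_recoding(endogenous, recoded), target_site_removed(recoded, toy_target))
```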

[0251] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0252] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, the genome-edited iPSC comprises knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0253] In some embodiments, the system comprises a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor templates, the genome-edited iPSC comprises the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0254] In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template or templates, the iPSCs comprise multi-cistronic knock-ins (e.g., at one or both alleles of the GAPDH gene) of two or more gene products of interest, e.g., one or more of the following gene products of interest, in order: CD16 + IL15; IL15 + CD16; CD16 + CAR; CAR + CD16; IL15 + CAR; CAR + IL15; CD16 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CD16; IL15 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + IL15; CAR + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CAR. In some embodiments, the iPSCs comprise bi-allelic knock-ins (e.g., a first gene product of interest at a first allele of the GAPDH gene, and a second gene product of interest at a second allele of the GAPDH gene) of the following pairs of gene products of interest: CD16 + IL15; IL15 + CD16; CD16 + CAR; CAR + CD16; IL15 + CAR; CAR + IL15; CD16 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CD16; IL15 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + IL15; CAR + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CAR.

[0255] In some embodiments, the iPSCs comprise the first knock-in cassette at one or both alleles of the GAPDH gene and the second knock-in cassette at one or both alleles of a second essential gene. In some embodiments, the iPSCs express (a) the first and second gene products of interest, (b) GAPDH, and (c) the gene product encoded by the second essential gene required for survival and/or proliferation of the iPSC, or a functional variant thereof. In some embodiments, the second essential gene is a gene listed in Table 3 or 4. In some embodiments, the second essential gene is TBP.

[0256] In another aspect, the disclosure features a donor template comprising a knock-in cassette with an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a GAPDH gene, wherein the gene product of interest is a chimeric antigen receptor (CAR), a non-naturally occurring variant of FcγRIII (CD16), interleukin 15 (IL-15), interleukin 15 receptor (IL-15R) or a variant thereof, interleukin 12 (IL-12), interleukin-12 receptor (IL-12R) or a variant thereof, human leukocyte antigen G (HLA-G), human leukocyte antigen E (HLA-E), leukocyte surface antigen cluster of differentiation CD47 (CD47), or any combination of two or more thereof.

[0257] In some embodiments, the donor template is for use in editing the genome of an iPSC by homology-directed repair (HDR).

[0258] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0259] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of a target site in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of a target site in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of a target site in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of a target site in the genome of the iPSC.

[0260] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.

[0261] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0262] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene.

[0263] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0264] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0265] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0266] In another aspect, the disclosure features a method of producing a population of modified iPSCs, the method comprising contacting iPSCs with: (i) a nuclease that causes a break within an endogenous coding sequence of a GAPDH gene in a plurality of the iPSCs, and (ii) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, wherein the knock-in cassette is integrated into the genome of a plurality of the iPSCs by homology-directed repair (HDR) of the break, resulting in genome-edited iPSCs that express: (a) the gene product of interest, and (b) GAPDH, or a functional variant thereof, wherein the gene product of interest is a chimeric antigen receptor (CAR), a non-naturally occurring variant of FcγRIII (CD16), interleukin 15 (IL-15), interleukin 15 receptor (IL-15R) or a variant thereof, interleukin 12 (IL-12), interleukin-12 receptor (IL-12R) or a variant thereof, human leukocyte antigen G (HLA-G), human leukocyte antigen E (HLA-E), leukocyte surface antigen cluster of differentiation CD47 (CD47), or any combination of two or more thereof, and wherein following the contacting step, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable iPSCs are genome-edited iPSCs, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the iPSCs lacking an integrated knock-in cassette are viable iPSCs, thereby producing a population of modified iPSCs. In some embodiments, following the contacting step, at least about 80% of the viable iPSCs are genome-edited iPSCs, and about 20% or less of the iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 60% of the viable iPSCs are genome-edited iPSCs, and about 40% or less of the iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 90% of the viable iPSCs are genome-edited iPSCs, and about 10% or less of the iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 95% of the viable iPSCs are genome-edited iPSCs, and about 5% or less of the iPSCs lacking an integrated knock-in cassette are viable iPSCs.
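The paired percentages in the preceding paragraph describe two different ratios: the share of viable cells that are genome-edited, and the share of cells lacking an integrated cassette that remain viable. A minimal arithmetic sketch follows. The function names and the cell counts are hypothetical values chosen for illustration, not data from the disclosure.

```python
def edited_fraction_of_viable(viable_edited: int, viable_unedited: int) -> float:
    """Fraction of viable cells that carry the integrated knock-in cassette."""
    return viable_edited / (viable_edited + viable_unedited)


def viable_fraction_of_unedited(viable_unedited: int, total_unedited: int) -> float:
    """Fraction of cells lacking the cassette that are still viable."""
    return viable_unedited / total_unedited


def meets_thresholds(viable_edited, viable_unedited, total_unedited,
                     min_edited=0.80, max_unedited_viable=0.20) -> bool:
    """Check both ratios against a chosen pair of thresholds (e.g., 80% and 20%)."""
    return (
        edited_fraction_of_viable(viable_edited, viable_unedited) >= min_edited
        and viable_fraction_of_unedited(viable_unedited, total_unedited) <= max_unedited_viable
    )


# Hypothetical counts: 90 viable edited cells, 10 viable unedited cells,
# and 100 cells in total that lack an integrated cassette.
print(edited_fraction_of_viable(90, 10))      # share of viable cells that are edited
print(viable_fraction_of_unedited(10, 100))   # share of unedited cells that are viable
print(meets_thresholds(90, 10, 100))
```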

[0267] In some embodiments, if the knock-in cassette is not integrated into the genome of the iPSC by homology-directed repair (HDR) in the correct position or orientation, the iPSC no longer expresses GAPDH, or a functional variant thereof.

[0268] In some embodiments, the break is a double-strand break.

[0269] In some embodiments, the break is located within the last 2000, 1500, 1000, 750,

[0270] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of iPSCs contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the iPSC (or the population of iPSCs) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.
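The "differs by no more than 3 nucleotides" condition on the targeting domain amounts to a mismatch count against a sequence complementary to a portion of the coding sequence. The sketch below illustrates that count under simplifying assumptions: sequences are written in the DNA alphabet, lengths are equal, and the 20-nucleotide target portion is a made-up string, not a GAPDH sequence or any of the recited SEQ ID NOs. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def within_mismatch_limit(targeting_domain: str, target_portion: str, limit: int = 3) -> bool:
    """True if the targeting domain differs from the complementary sequence by <= limit."""
    return mismatches(targeting_domain, reverse_complement(target_portion)) <= limit


# Toy 20-nt portion of a coding sequence (an assumption for illustration).
target_portion = "GATTACAGATTACAGATTAC"
perfect = reverse_complement(target_portion)
two_off = "C" + perfect[1:-1] + "G"   # changes the first and last positions

print(perfect)
print(mismatches(two_off, perfect), within_mismatch_limit(two_off, target_portion))
```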

[0271] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0272] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC.

[0273] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.

[0274] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0275] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0276] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0277] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0278] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0279] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited iPSCs comprise knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSCs express (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0280] In some embodiments, the method comprises contacting iPSCs (or the population of iPSCs) with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, the genome-edited iPSCs comprise the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the genome-edited iPSCs express (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0281] In some embodiments, the genome-edited iPSCs comprise multi-cistronic knock-ins (e.g., at one or both alleles of the GAPDH gene) of two or more gene products of interest, e.g., one or more of the following gene products of interest, in order: CD16 + IL15; IL15 + CD16; CD16 + CAR; CAR + CD16; IL15 + CAR; CAR + IL15; CD16 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CD16; IL15 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + IL15; CAR + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CAR. In some embodiments, the genome-edited iPSCs comprise bi-allelic knock-ins (e.g., a first gene product of interest at a first allele of the GAPDH gene, and a second gene product of interest at a second allele of the GAPDH gene) of the following pairs of gene products of interest: CD16 + IL15; IL15 + CD16; CD16 + CAR; CAR + CD16; IL15 + CAR; CAR + IL15; CD16 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CD16; IL15 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + IL15; CAR + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CAR.

[0282] In some embodiments, the method comprises contacting iPSCs (or the population of iPSCs) with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a GAPDH gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a second essential gene. In some embodiments, the genome-edited iPSC comprises the first knock-in cassette at one or both alleles of the GAPDH gene and the second knock-in cassette at one or both alleles of the second essential gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, (b) GAPDH, and (c) the gene product encoded by the second essential gene required for survival and/or proliferation of the iPSC, or a functional variant thereof. In some embodiments, the second essential gene is a gene listed in Table 3 or 4. In some embodiments, the second essential gene is TBP.

[0283] In another aspect, the disclosure features a method of selecting and/or identifying an iPSC comprising a knock-in of a gene product of interest within an endogenous coding sequence of a GAPDH gene in the iPSC, the method comprising contacting a population of iPSCs with: (i) a nuclease that causes a break within an endogenous coding sequence of a GAPDH gene in a plurality of the iPSCs, and (ii) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, wherein the knock-in cassette is integrated into the genome of a plurality of the iPSCs by homology-directed repair (HDR) of the break, and identifying a genome-edited iPSC within the population of iPSCs that expresses: (a) the gene product of interest, and (b) GAPDH, or a functional variant thereof, wherein the gene product of interest is a chimeric antigen receptor (CAR), a non-naturally occurring variant of FcγRIII (CD16), interleukin 15 (IL-15), interleukin 15 receptor (IL-15R) or a variant thereof, interleukin 12 (IL-12), interleukin-12 receptor (IL-12R) or a variant thereof, human leukocyte antigen G (HLA-G), human leukocyte antigen E (HLA-E), leukocyte surface antigen cluster of differentiation CD47 (CD47), or any combination of two or more thereof.

[0284] In some embodiments, following the contacting step, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 80% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 20% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 60% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 40% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 90% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 10% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 95% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 5% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs.

[0285] In some embodiments, if the knock-in cassette is not integrated into the genome of the iPSC by homology-directed repair (HDR) in the correct position or orientation, the iPSC no longer expresses GAPDH, or a functional variant thereof.

[0286] In some embodiments, the break is a double-strand break.

[0287] In some embodiments, the break is located within the last 2000, 1500, 1000, 750, 500, 400, 300, 200, 100, or 50 base pairs of the endogenous coding sequence of the GAPDH gene. In some embodiments, the break is located within the last 200 base pairs of the endogenous coding sequence of the GAPDH gene. In some embodiments, the break is located within the last exon of the GAPDH gene.

[0288] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of iPSCs contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the iPSC (or the population of iPSCs) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.

[0289] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0290] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC.

[0291] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.

[0292] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0293] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0294] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0295] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0296] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0297] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited iPSC comprises knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0298] In some embodiments, the method comprises contacting the population of iPSCs with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, the genome-edited iPSCs comprise the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the genome-edited iPSCs express (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0299] In some embodiments, the genome-edited iPSCs comprise multi-cistronic knock-ins (e.g., at one or both alleles of GAPDH gene) of two or more gene products of interest, e.g., one or more of the following gene products of interest, in order: CD16 + IL15; IL15 + CD16; CD16 + CAR; CAR + CD16; IL15 + CAR; CAR + IL15; CD16 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CD16; IL15 + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + IL15; CAR + (HLA-E or HLA-G or CD47); (HLA-E or HLA-G or CD47) + CAR.

[0300] In some embodiments, the method comprises contacting iPSCs (or a population of iPSCs) with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a GAPDH gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a second essential gene. In some embodiments, the genome-edited iPSCs comprise the first knock-in cassette at one or both alleles of the GAPDH gene and the second knock-in cassette at one or both alleles of the second essential gene. In some embodiments, the genome-edited iPSCs express (a) the first and second gene products of interest, (b) GAPDH, and (c) the gene product encoded by the second essential gene required for survival and/or proliferation of the iPSCs, or a functional variant thereof. In some embodiments, the second essential gene is a gene listed in Table 3 or 4. In some embodiments, the second essential gene is TBP.

[0301] In another aspect, the disclosure features a method of editing the genome of an induced pluripotent stem cell (iPSC) (e.g., an iPSC in a population of iPSCs), the method comprising contacting the iPSC (or the population of iPSCs) with: (i) a nuclease that causes a break within an endogenous coding sequence of a glyceraldehyde 3-phosphate dehydrogenase (GAPDH) gene in the iPSC, and (ii) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, wherein the knock-in cassette is integrated into the genome of the iPSC by homology-directed repair (HDR) of the break, resulting in a genome-edited iPSC that expresses: (a) the gene product of interest, and (b) GAPDH, or a functional variant thereof, wherein the gene product of interest is PD-L1 or leukocyte surface antigen cluster of differentiation CD47 (CD47).

[0302] In some embodiments, following the contacting step, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 80% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 20% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 60% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 40% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 90% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 10% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 95% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 5% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs.
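
As a purely illustrative aid, the two percentages recited above can be computed from cell counts as sketched below. The function names, the three count categories, and the example counts are hypothetical assumptions made for this sketch; they are not measurements or definitions from the disclosure.

```python
def selection_metrics(edited_viable, unedited_viable, unedited_nonviable):
    """Return (fraction of viable cells that are genome-edited,
    fraction of cells lacking an integrated cassette that are viable)."""
    viable_total = edited_viable + unedited_viable
    unedited_total = unedited_viable + unedited_nonviable
    return edited_viable / viable_total, unedited_viable / unedited_total


def meets_thresholds(edited_fraction, unedited_viable_fraction,
                     min_edited=0.80, max_unedited_viable=0.20):
    """Check a paired embodiment, e.g. at least about 80% edited among viable
    cells and about 20% or less viable among cells lacking the cassette."""
    return (edited_fraction >= min_edited
            and unedited_viable_fraction <= max_unedited_viable)


# Hypothetical counts after the contacting step.
edited_frac, unedited_viable_frac = selection_metrics(
    edited_viable=900, unedited_viable=100, unedited_nonviable=900)
passes_80_20 = meets_thresholds(edited_frac, unedited_viable_frac)
```

The other paired embodiments (60%/40%, 90%/10%, 95%/5%) can be checked by passing different values for `min_edited` and `max_unedited_viable`.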

[0303] In some embodiments, if the knock-in cassette is not integrated into the genome of the iPSCs by homology-directed repair (HDR) in the correct position or orientation, the iPSCs no longer express GAPDH, or a functional variant thereof.

[0304] In some embodiments, the break is a double-strand break.

[0305] In some embodiments, the break is located within the last 2000, 1500, 1000, 750,

[0306] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of iPSCs contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the iPSC (or the population of iPSCs) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.
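
For illustration, the "differs by no more than 3 nucleotides" criterion above can be read as a mismatch count between a targeting domain and the perfect complement of a target site. The sketch below assumes equal-length sequences written in the DNA alphabet (an RNA guide would use U in place of T); the helper names and the toy sequences are hypothetical and do not correspond to any SEQ ID NO of the disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (uppercase A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches_to_complement(targeting_domain, target_site):
    """Count positions at which the targeting domain differs from the
    sequence that is perfectly complementary to the target site."""
    if len(targeting_domain) != len(target_site):
        raise ValueError("sequences must have equal length")
    perfect = reverse_complement(target_site)
    return sum(a != b for a, b in zip(targeting_domain, perfect))


def within_tolerance(targeting_domain, target_site, max_mismatches=3):
    """True if the targeting domain differs by no more than max_mismatches."""
    return mismatches_to_complement(targeting_domain, target_site) <= max_mismatches


# Hypothetical 10-nt target site, shortened for readability.
target = "ATGGCCAAGT"
perfect_guide = reverse_complement(target)
```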

[0307] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0308] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC.

[0309] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.
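
The following sketch, offered only as an illustration, models how a GSG linker followed by one of the exemplary 2A peptides listed above (written without whitespace) could yield two separate protein products. It assumes the commonly described ribosome-skipping behavior of 2A peptides, in which the upstream product ends at the glycine of the C-terminal NPGP motif and the downstream product begins with the final proline. The function names and the short upstream and downstream peptides are hypothetical placeholders.

```python
# Exemplary 2A peptide sequences and linker recited in the paragraph above.
TWO_A = {
    "T2A": "EGRGSLLTCGDVEENPGP",
    "P2A": "ATNFSLLKQAGDVEENPGP",
    "E2A": "QCTNYALLKLAGDVESNPGP",
    "F2A": "VKQTLNFDLLKLAGDVESNPGP",
}
LINKER = "GSG"


def linker_2a(name):
    """Return the linker peptide followed by the named 2A peptide."""
    return LINKER + TWO_A[name]


def split_products(upstream, name, downstream):
    """Model ribosome skipping (assumed): the upstream product keeps the linker
    and all 2A residues except the final proline, which starts the downstream product."""
    fused = upstream + linker_2a(name) + downstream
    cut = len(upstream) + len(LINKER) + len(TWO_A[name]) - 1
    return fused[:cut], fused[cut:]


# Hypothetical three-residue placeholders for the two encoded proteins.
first_product, second_product = split_products("MKV", "T2A", "MAS")
```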

[0310] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.
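
As an illustrative sketch, the element order described in the two preceding paragraphs can be expressed as a simple layout check. The sketch assumes a cassette that combines those embodiments: a GAPDH (partial) coding sequence, an IRES or 2A separator, the cargo coding sequence, an optional 3’ UTR, and a polyadenylation sequence. The element labels and the function name are hypothetical.

```python
# Assumed 5'-to-3' order for a cassette combining the embodiments above.
CANONICAL_ORDER = ("gapdh_partial_cds", "separator", "cargo_cds", "utr3", "polya")
OPTIONAL_ELEMENTS = {"utr3"}  # the 3' UTR is optional


def is_valid_layout(elements):
    """True if elements lists the required parts in canonical order, with the
    optional 3' UTR (if present) between the cargo and the polyadenylation sequence."""
    expected = [name for name in CANONICAL_ORDER
                if name not in OPTIONAL_ELEMENTS or name in elements]
    return list(elements) == expected


with_utr = ["gapdh_partial_cds", "separator", "cargo_cds", "utr3", "polya"]
without_utr = ["gapdh_partial_cds", "separator", "cargo_cds", "polya"]
misplaced_utr = ["gapdh_partial_cds", "separator", "cargo_cds", "polya", "utr3"]
```

Under these assumptions `with_utr` and `without_utr` pass the check, while `misplaced_utr` fails because the 3’ UTR follows the polyadenylation sequence.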

[0311] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of a protein encoded by the GAPDH gene. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0312] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0313] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0314] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0315] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited iPSC comprises knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0316] In some embodiments, the method comprises contacting the iPSC (or the population of iPSCs) with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, the genome-edited iPSC comprises the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0317] In some embodiments, the genome-edited iPSC comprises multi-cistronic knock-ins (e.g., at one or both alleles of GAPDH gene) of two or more gene products of interest, e.g., one or more of the following gene products of interest, in order: PD-L1 + CD47; or CD47 + PD-L1. In some embodiments, the genome-edited iPSC comprises bi-allelic knock-ins (e.g., a first gene product of interest at a first allele of GAPDH gene, and a second gene product of interest at a second allele of GAPDH gene) of the following pairs of gene products of interest: PD-L1 + CD47.

[0318] In another aspect, the disclosure features a genetically modified iPSC comprising a genome with an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of a coding sequence of a GAPDH gene, wherein at least part of the coding sequence of the GAPDH gene comprises an exogenous coding sequence, and wherein the gene product of interest is PD-L1 or leukocyte surface antigen cluster of differentiation CD47 (CD47).

[0319] In some embodiments, the exogenous coding sequence of the GAPDH gene comprises about 2000, 1500, 1000, 750, 500, 400, 300, 200, 100, or 50 base pairs of the coding sequence of the GAPDH gene. In some embodiments, the exogenous coding sequence of the GAPDH gene comprises about 200 base pairs of the coding sequence of the GAPDH gene.

[0320] In some embodiments, the exogenous coding sequence of the GAPDH gene encodes a C-terminal fragment of a protein encoded by the GAPDH gene. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0321] In some embodiments, the exogenous coding sequence of the GAPDH gene is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence of the GAPDH gene has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of a nuclease, e.g., a Cas. In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence of the GAPDH gene includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0322] In some embodiments, the iPSC’s genome comprises a regulatory element that enables expression of the gene product encoded by the GAPDH gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the iPSC’s genome comprises an IRES or 2A element located between the coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest.

[0323] In some embodiments, the iPSC’s genome comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0324] In some embodiments, the iPSC’s genome does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0325] In another aspect, the disclosure features an engineered iPSC comprising a genomic modification, wherein the genomic modification comprises an insertion of an exogenous knock-in cassette within an endogenous coding sequence of a GAPDH gene in the iPSC’s genome, wherein the knock-in cassette comprises an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence encoding GAPDH, or a functional variant thereof, wherein the iPSC expresses the gene product of interest and GAPDH, or a functional variant thereof, optionally wherein the gene product of interest and GAPDH are expressed from the endogenous GAPDH promoter, and wherein the gene product of interest is PD-L1 or leukocyte surface antigen cluster of differentiation CD47 (CD47).

[0326] In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH comprises about 2000, 1500, 1000, 750, 500, 400, 300, 200, 100, or 50 base pairs of the coding sequence of the GAPDH gene. In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH comprises about 200 base pairs of the coding sequence of the GAPDH gene.

[0327] In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0328] In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence encoding GAPDH has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of a nuclease, e.g., a Cas. In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence encoding GAPDH includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0329] In some embodiments, the iPSC’s genome comprises a regulatory element that enables expression of the gene product encoded by the GAPDH gene and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the iPSC’s genome comprises an IRES or 2A element located between the coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest.

[0330] In some embodiments, the iPSC’s genome comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0331] In some embodiments, the iPSC’s genome does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0332] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited iPSC comprises knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0333] In some embodiments, the engineered iPSC comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, the engineered iPSC comprises the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the engineered iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0334] In some embodiments, the engineered iPSC comprises multi-cistronic knock-ins (e.g., at one or both alleles of GAPDH gene) of two or more gene products of interest, e.g., one or more of the following gene products of interest, in order: PD-L1 + CD47; CD47 + PD-L1. In some embodiments, the engineered iPSC comprises bi-allelic knock-ins (e.g., a first gene product of interest at a first allele of GAPDH gene, and a second gene product of interest at a second allele of GAPDH gene) of PD-L1 + CD47.

[0335] In another aspect, the disclosure features an immune cell (e.g., an iNK cell or T cell) differentiated from an iPSC described herein.

[0336] In another aspect, the disclosure features any of the iPSCs (or iNK or T cell differentiated from an iPSC) described herein for use as a medicament and/or for use in the treatment of a disease, disorder or condition, e.g., a disease, disorder or condition described herein, e.g., a cancer, e.g., a cancer described herein.

[0337] In another aspect, the disclosure features an iPSC, or a population of iPSCs, produced by any of the methods described herein, or progeny thereof.

[0338] In another aspect, the disclosure features a system for editing the genome of an iPSC (or an iPSC in a population of iPSCs), the system comprising the iPSC (or the population of iPSCs), a nuclease that causes a break within an endogenous coding sequence of a GAPDH gene of the iPSC, and a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and wherein the gene product of interest is PD-L1 or leukocyte surface antigen cluster of differentiation CD47 (CD47).

[0339] In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 80% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 20% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 60% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 40% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 90% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 10% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, at least about 95% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 5% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs.

[0340] In some embodiments, after contacting the iPSC or population of iPSCs with the nuclease and the donor template, if the knock-in cassette is not integrated into the genome of the iPSC by homology-directed repair (HDR) in the correct position or orientation, the iPSC no longer expresses GAPDH or a functional variant thereof.

[0341] In some embodiments, the break is a double-strand break.

[0342] In some embodiments, the break is located within the last 2000, 1500, 1000, 750,

[0343] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of iPSCs contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the iPSC (or the population of iPSCs) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.

[0344] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0345] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC.

[0346] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.

[0347] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0348] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0349] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0350] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0351] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0352] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template, the genome-edited iPSC comprises knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0353] In some embodiments, the system comprises a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, after contacting the population of iPSCs with the nuclease and the donor templates, the genome-edited iPSC comprises the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0354] In some embodiments, after contacting the population of iPSCs with the nuclease and the donor template or templates, the iPSCs comprise multi-cistronic knock-ins (e.g., at one or both alleles of the GAPDH gene) of two or more gene products of interest, e.g., one or more of the following gene products of interest, in order: PD-L1 + CD47; CD47 + PD-L1.

[0355] In another aspect, the disclosure features a donor template comprising a knock-in cassette with an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of a GAPDH gene, wherein the gene product of interest is PD-L1 or leukocyte surface antigen cluster of differentiation CD47 (CD47).

[0356] In some embodiments, the donor template is for use in editing the genome of an iPSC by homology-directed repair (HDR).

[0357] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0358] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of a target site in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of a target site in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of a target site in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of a target site in the genome of the iPSC.

[0359] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.
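
By way of non-limiting illustration only, the arrangement of a C-terminal GAPDH fragment, a GSG linker, a 2A element, and a gene product of interest can be sketched at the amino acid level in Python. The 2A sequences follow those recited above; the function names and the short placeholder peptides are hypothetical and are not taken from the disclosure. The sketch assumes the generally understood behavior of 2A elements, in which ribosomal skipping occurs between the final glycine and proline of the element, so that the upstream product retains most of the 2A sequence and the downstream product begins with proline.

```python
# Illustrative sketch only; names and placeholder peptides are hypothetical.
TWO_A_ELEMENTS = {
    "T2A": "EGRGSLLTCGDVEENPGP",
    "P2A": "ATNFSLLKQAGDVEENPGP",
    "E2A": "QCTNYALLKLAGDVESNPGP",
    "F2A": "VKQTLNFDLLKLAGDVESNPGP",
}
GSG_LINKER = "GSG"


def build_fusion(essential_fragment, cargo, element="P2A", linker=GSG_LINKER):
    """Concatenate essential-gene fragment, linker, 2A element, and cargo."""
    return essential_fragment + linker + TWO_A_ELEMENTS[element] + cargo


def predicted_products(fusion, element="P2A"):
    """Split a fusion at the assumed skipping site (before the final Pro of the 2A)."""
    motif = TWO_A_ELEMENTS[element]
    start = fusion.find(motif)
    if start < 0:
        raise ValueError("2A element not found in fusion sequence")
    skip_site = start + len(motif) - 1  # index of the final proline
    return fusion[:skip_site], fusion[skip_site:]


if __name__ == "__main__":
    # "MKLV" and "MRIF" are placeholders, not real GAPDH or cargo sequences.
    fusion = build_fusion("MKLV", "MRIF", element="P2A")
    upstream, downstream = predicted_products(fusion, "P2A")
    print(upstream)
    print(downstream)
```

In this sketch the upstream product carries the linker and the 2A remnant at its C-terminus, which is one reason a short flexible linker may be placed between the essential gene sequence and the 2A element.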

[0360] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0361] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene.

[0362] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0363] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.
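
By way of non-limiting illustration only, a silent mutation of a PAM site can be sketched in Python as follows. The sketch assumes an NGG PAM (as used by S. pyogenes Cas9) lying on the coding strand and a coding sequence supplied in frame; the function names and the example sequence are hypothetical and are not sequences of the disclosure. It searches the codons overlapping the GG dinucleotide for a synonymous codon that removes the GG, and returns None when no silent change is available (in which case a missense change might be considered instead).

```python
# Illustrative sketch only; names and example sequence are hypothetical.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in T/C/A/G order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
SYNONYMS = {}
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)


def translate(cds):
    """Translate an in-frame coding sequence (length must be a multiple of 3)."""
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def block_pam_silently(cds, pam_start):
    """Return cds with the GG of the NGG PAM at pam_start removed by a silent change."""
    g1, g2 = pam_start + 1, pam_start + 2
    if cds[g1:g2 + 1] != "GG":
        raise ValueError("no NGG PAM at the given position")
    # Codons that overlap either G of the PAM.
    for codon_start in sorted({(p // 3) * 3 for p in (g1, g2)}):
        codon = cds[codon_start:codon_start + 3]
        for alt in sorted(SYNONYMS[CODON_TABLE[codon]]):
            if alt == codon:
                continue
            edited = cds[:codon_start] + alt + cds[codon_start + 3:]
            if edited[g1:g2 + 1] != "GG":
                return edited
    return None  # no synonymous codon removes the PAM


if __name__ == "__main__":
    cds = "ATGGCCAAGGTGAAG"  # hypothetical sequence encoding M-A-K-V-K
    edited = block_pam_silently(cds, pam_start=7)  # PAM "AGG" at positions 7 to 9
    print(edited, translate(edited) == translate(cds))
```

The sketch changes a single codon; saturating a region with silent mutations, as also contemplated above, would apply the same check to every codon in the region.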

[0364] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0365] In another aspect, the disclosure features a method of producing a population of modified iPSCs, the method comprising contacting iPSCs with: (i) a nuclease that causes a break within an endogenous coding sequence of a GAPDH gene in a plurality of the iPSCs, and (ii) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, wherein the knock-in cassette is integrated into the genome of a plurality of the iPSCs by homology-directed repair (HDR) of the break, resulting in genome-edited iPSCs that express: (a) the gene product of interest, and (b) GAPDH, or a functional variant thereof, wherein the gene product of interest is PD-L1 or leukocyte surface antigen cluster of differentiation CD47 (CD47), and wherein following the contacting step, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable iPSCs are genome-edited iPSCs, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the iPSCs lacking an integrated knock-in cassette are viable iPSCs, thereby producing a population of modified iPSCs. In some embodiments, following the contacting step, at least about 80% of the viable iPSCs are genome-edited iPSCs, and about 20% or less of the iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 60% of the viable iPSCs are genome-edited iPSCs, and about 40% or less of the iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 90% of the viable iPSCs are genome-edited iPSCs, and about 10% or less of the iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 95% of the viable iPSCs are genome-edited iPSCs, and about 5% or less of iPSCs lacking an integrated knock-in cassette are viable iPSCs.

[0366] In some embodiments, if the knock-in cassette is not integrated into the genome of the iPSC by homology-directed repair (HDR) in the correct position or orientation, the iPSC no longer expresses GAPDH, or a functional variant thereof.

[0367] In some embodiments, the break is a double-strand break.

[0368] In some embodiments, the break is located within the last 2000, 1500, 1000, 750,

[0369] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of iPSCs contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the iPSC (or the population of iPSCs) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.
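
By way of non-limiting illustration only, the comparison of a guide targeting domain against a sequence complementary to a portion of the coding sequence, with a tolerance of no more than 3 differing nucleotides, can be sketched in Python. The function names and the 20-nucleotide example are hypothetical and are not sequences of the disclosure. The sketch assumes equal-length sequences written 5' to 3', a DNA target strand, and an RNA targeting domain.

```python
# Illustrative sketch only; names and example sequence are hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def perfect_targeting_domain(target_strand_dna):
    """RNA sequence fully complementary to the given DNA strand."""
    return reverse_complement(target_strand_dna).replace("T", "U")


def count_mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def targeting_domain_ok(targeting_domain_rna, target_strand_dna, max_mismatches=3):
    """True if the targeting domain differs from perfect complementarity by <= max_mismatches."""
    perfect = perfect_targeting_domain(target_strand_dna)
    return count_mismatches(targeting_domain_rna.upper(), perfect) <= max_mismatches


if __name__ == "__main__":
    target = "ACGTTGCAAGGCTTACGATC"  # hypothetical 20-nt portion of a coding sequence
    guide = perfect_targeting_domain(target)
    print(guide, targeting_domain_ok(guide, target))
```

A check of this kind addresses only sequence similarity at the intended site; it does not by itself establish that a guide fails to bind elsewhere in the genome.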

[0370] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0371] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC.

[0372] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.

[0373] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0374] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0375] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0376] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0377] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0378] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited iPSCs comprise knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSCs express (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0379] In some embodiments, the method comprises contacting iPSCs (or the population of iPSCs) with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, the genome-edited iPSCs comprise the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the genome-edited iPSCs express (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0380] In some embodiments, the genome-edited iPSCs comprise multi-cistronic knock-ins (e.g., at one or both alleles of the GAPDH gene) of two or more gene products of interest, e.g., one or more of the following gene products of interest, in order: PD-L1 + CD47; CD47 + PD-L1.

[0381] In another aspect, the disclosure features a method of selecting and/or identifying an iPSC comprising a knock-in of a gene product of interest within an endogenous coding sequence of a GAPDH gene in the iPSC, the method comprising contacting a population of iPSCs with: (i) a nuclease that causes a break within an endogenous coding sequence of a GAPDH gene in a plurality of the iPSCs, and (ii) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, wherein the knock-in cassette is integrated into the genome of a plurality of the iPSCs by homology-directed repair (HDR) of the break, and identifying a genome-edited iPSC within the population of iPSCs that expresses: (a) the gene product of interest, and (b) GAPDH, or a functional variant thereof, wherein the gene product of interest is PD-L1 or leukocyte surface antigen cluster of differentiation CD47 (CD47).

[0382] In some embodiments, following the contacting step, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and/or about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, or about 5% or less, of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 80% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 20% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 60% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 40% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 90% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 10% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs. In some embodiments, following the contacting step, at least about 95% of the viable iPSCs of the population of iPSCs are genome-edited iPSCs, and about 5% or less of the population of iPSCs lacking an integrated knock-in cassette are viable iPSCs.

[0383] In some embodiments, if the knock-in cassette is not integrated into the genome of the iPSC by homology-directed repair (HDR) in the correct position or orientation, the iPSC no longer expresses GAPDH, or a functional variant thereof.

[0384] In some embodiments, the break is a double-strand break.

[0385] In some embodiments, the break is located within the last 2000, 1500, 1000, 750,

[0386] In some embodiments, the nuclease is highly efficient, e.g., capable of editing at least about 60%, at least about 65%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more, of iPSCs contacted with the nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease. In some embodiments, the nuclease is a CRISPR/Cas nuclease and the method further comprises contacting the iPSC (or the population of iPSCs) with a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a Cas9 or a Cas12a nuclease, or a variant thereof (e.g., a nuclease comprising the amino acid sequence of any one of SEQ ID NOs: 58-66). In some embodiments, the guide molecule comprises a targeting domain sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule comprises a targeting domain sequence that differs by no more than 3 nucleotides from a sequence that is complementary to a portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule specifically binds to the portion of the endogenous coding sequence of the GAPDH gene. In some embodiments, the guide molecule does not bind to an endogenous coding sequence of another gene, e.g., a different essential gene. In some embodiments, the guide comprises a nucleotide sequence of any one of SEQ ID NOs: 94-157 and 225-1885.

[0387] In some embodiments, the donor template is a donor DNA template, optionally wherein the donor DNA template is double-stranded. In some embodiments, the donor DNA template is a plasmid, optionally wherein the plasmid has not been linearized.

[0388] In some embodiments, the donor template comprises homology arms on either side of the knock-in cassette. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC. In some embodiments, the donor template comprises a 5’ homology arm comprising a sequence homologous to a sequence located 5’ of the break in the genome of the iPSC, and the donor template comprises a 3’ homology arm comprising a sequence homologous to a sequence located 3’ of the break in the genome of the iPSC.
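
By way of non-limiting illustration only, the layout of a donor template with a 5’ homology arm, a knock-in cassette, and a 3’ homology arm can be sketched in Python. The function name, arm length, and toy sequences are hypothetical. The sketch makes the simplifying assumption that each arm is taken from the sequence immediately adjacent to the break; actual arm placement and length are design choices not specified here.

```python
# Illustrative sketch only; names, arm length, and sequences are hypothetical.
def build_donor(genome, break_index, cassette, arm_length):
    """Return (five_prime_arm, three_prime_arm, donor) for a break before genome[break_index]."""
    if arm_length <= 0:
        raise ValueError("arm_length must be positive")
    if break_index - arm_length < 0 or break_index + arm_length > len(genome):
        raise ValueError("homology arm would extend past the supplied sequence")
    # Sequence located 5' of the break.
    five_prime_arm = genome[break_index - arm_length:break_index]
    # Sequence located 3' of the break.
    three_prime_arm = genome[break_index:break_index + arm_length]
    donor = five_prime_arm + cassette + three_prime_arm
    return five_prime_arm, three_prime_arm, donor


if __name__ == "__main__":
    toy_genome = "AAAACCCCGGGGTTTT"  # placeholder, not a real locus
    arms_and_donor = build_donor(toy_genome, break_index=8, cassette="ATGTAA", arm_length=4)
    print(arms_and_donor)
```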

[0389] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of GAPDH and the gene product of interest as separate gene products, optionally, wherein at least one of the gene products is a protein and the regulatory element enables expression of that protein separate from the other gene product. In some embodiments, the knock-in cassette comprises an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the GAPDH gene and the exogenous coding sequence for the gene product of interest. In some embodiments, the 2A element is a T2A element (e.g., EGRGSLLTCGDVEENPGP), a P2A element (e.g., ATNFSLLKQAGDVEENPGP), an E2A element (e.g., QCTNYALLKLAGDVESNPGP), or an F2A element (e.g., VKQTLNFDLLKLAGDVESNPGP). In some embodiments, the knock-in cassette further comprises a sequence encoding a linker peptide upstream of the 2A element. In some embodiments, the linker peptide comprises the amino acid sequence GSG.

[0390] In some embodiments, the knock-in cassette comprises a polyadenylation sequence, and optionally a 3’ UTR sequence, downstream of the exogenous coding sequence for the gene product of interest, and, if a 3’UTR sequence is present, the 3’UTR sequence is positioned 3’ of the exogenous coding sequence and 5’ of the polyadenylation sequence.

[0391] In some embodiments, the exogenous partial coding sequence of the GAPDH gene in the knock-in cassette encodes a C-terminal fragment of GAPDH. In some embodiments, the C-terminal fragment is less than about 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the C-terminal fragment is less than about 25 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the GAPDH gene that spans the break.

[0392] In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC. In some embodiments, the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette has been codon optimized relative to the corresponding endogenous coding sequence of the GAPDH gene of the iPSC to remove a target site of the nuclease, to reduce the likelihood of homologous recombination after integration of the knock-in cassette into the genome of the iPSC, or to increase expression of GAPDH and/or the gene product of interest after integration of the knock-in cassette into the genome of the iPSC.

[0393] In some embodiments, the nuclease is a Cas (e.g., Cas9 or Cas12a), the exogenous coding sequence or partial coding sequence of the GAPDH gene in the knock-in cassette includes at least one PAM site for the Cas, and the at least one PAM site (or all PAM sites) has been codon optimized or saturated with silent and/or missense mutations.

[0394] In some embodiments, the donor template does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0395] In some embodiments, the knock-in cassette is a multi-cistronic (e.g., bi-cistronic) knock-in cassette comprising exogenous coding sequences for two or more gene products of interest. In some embodiments, the knock-in cassette comprises a first exogenous coding sequence for a first gene product of interest, a linker (e.g., T2A, P2A, and/or IRES), and a second exogenous coding sequence for a second gene product of interest. In some embodiments, the genome-edited iPSC comprises knock-in cassettes at one or both alleles of the GAPDH gene. In some embodiments, the genome-edited iPSC expresses (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0396] In some embodiments, the method comprises contacting the population of iPSCs with a first donor template that comprises a first knock-in cassette comprising a first exogenous coding sequence for a first gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene, and with a second donor template that comprises a second knock-in cassette comprising a second exogenous coding sequence for a second gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the GAPDH gene. In some embodiments, the genome-edited iPSCs comprise the first knock-in cassette at a first allele of the GAPDH gene and the second knock-in cassette at the second allele of the GAPDH gene. In some embodiments, the genome-edited iPSCs express (a) the first and second gene products of interest, and (b) GAPDH, or a functional variant thereof.

[0397] In some embodiments, the genome-edited iPSCs comprise multi-cistronic knock-ins (e.g., at one or both alleles of the GAPDH gene) of two or more gene products of interest, e.g., one or more of the following gene products of interest, in order: PD-L1 + CD47; CD47 + PD-L1.

[0398] In another aspect, the disclosure features a method of generating a genetically modified mammalian cell comprising a coding sequence for a gene product of interest at a pre-determined genomic position, comprising: providing at least one donor template comprising the coding sequence for a gene product of interest flanked by a first homology arm and a second homology arm, wherein the first and second homology arms are essentially homologous to a first genomic region (GR) and a second GR, respectively, wherein the first GR and the second GR are adjacent to and flank a pre-determined genomic position in an exon of an essential gene in a mammalian cell, wherein the cell becomes inviable if the exon is disrupted; providing a gene editing system containing a nuclease that is targeted to the pre-determined genomic position; introducing the at least one donor template and the gene editing system into a population of mammalian cells; culturing the population of mammalian cells; and identifying a surviving cell that comprises the coding sequence for the gene product of interest, wherein the identified surviving cell is a genetically modified mammalian cell comprising the coding sequence for the gene product of interest at the pre-determined genomic position.

In another aspect, the disclosure features a method of selecting a mammalian cell comprising a coding sequence for a gene product of interest that has integrated precisely at a pre-determined genomic position, comprising: providing at least one donor template comprising the coding sequence for the gene product of interest flanked by a first homology arm and a second homology arm, wherein the first and second homology arms are essentially homologous to a first genomic region (GR) and a second GR, respectively, wherein the first GR and the second GR are adjacent to and flank a pre-determined genomic position in an exon of an essential gene in a mammalian cell, wherein the cell becomes inviable if the exon is disrupted; providing a gene editing system containing a nuclease that is targeted to the pre-determined genomic position; introducing the donor template and the gene editing system into a population of mammalian cells; culturing the population of mammalian cells; and identifying a surviving cell that comprises the coding sequence for a gene product of interest, wherein the identified surviving cell comprises the coding sequence for a gene product of interest integrated precisely at the pre-determined genomic position.

[0399] In some embodiments, the exon is the last or penultimate exon of the essential gene if the essential gene has more than one exon. In some embodiments, the pre-determined genomic position in the exon of the essential gene is within about 200 bps upstream of a stop codon, or within about 200 bps downstream of a start codon, of the essential gene.
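
By way of non-limiting illustration only, a check of whether a pre-determined position lies within about 200 bps of the start or stop codon can be sketched in Python. The function name, the coordinate convention, and the example length are hypothetical. The sketch assumes 0-based offsets within a spliced coding sequence whose first three bases are the start codon and whose last three bases are the stop codon, and it treats "about 200" as a hard cutoff of 200, which is a simplification.

```python
# Illustrative sketch only; names, coordinates, and cutoff are assumptions.
def classify_cut_site(cut_offset, cds_length, window=200):
    """Classify a cut offset as 'near_stop', 'near_start', or None.

    cut_offset is the 0-based index of the base immediately 3' of the cut.
    """
    if cds_length < 6 or cds_length % 3:
        raise ValueError("cds_length must be a multiple of 3 covering start and stop codons")
    if not 0 <= cut_offset <= cds_length:
        raise ValueError("cut_offset lies outside the coding sequence")
    start_codon_end = 3                # first base after the start codon
    stop_codon_start = cds_length - 3  # first base of the stop codon
    if 0 <= stop_codon_start - cut_offset <= window:
        return "near_stop"   # within the window upstream of the stop codon
    if 0 <= cut_offset - start_codon_end <= window:
        return "near_start"  # within the window downstream of the start codon
    return None


if __name__ == "__main__":
    example_length = 1008  # hypothetical coding sequence length in bases
    for offset in (100, 500, 950):
        print(offset, classify_cut_site(offset, example_length))
```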

[0400] In some embodiments, the gene editing system is a meganuclease based system, a zinc finger nuclease (ZFN) based system, a transcription activator-like effector based nuclease (TALEN) system, a CRISPR based system, or a NgAgo based system.

[0401] In some embodiments, the gene editing system is a CRISPR based system comprising a nuclease, or an mRNA or DNA encoding a nuclease, and a guide RNA (gRNA) that targets the pre-determined genomic position, optionally wherein the gene editing system is a ribonucleoprotein (RNP) complex comprising the nuclease and the gRNA.

[0402] In some embodiments, the nuclease is Cas5, Cas6, Cas7, Cas9 (optionally saCas9 or spCas9), Cas12a, or Csm1.

[0403] In some embodiments, the essential gene is selected from the gene loci listed in Table 3 or 4. In some embodiments, the essential gene is a GAPDH, RPL13A, RPL7, or RPLP0 gene.

[0404] In some embodiments, the first homology arm and/or the second homology arm comprise a silent PAM blocking mutation or a codon modification that prevents cleavage of the donor template by the nuclease such that the essential gene locus, once modified, is not cleaved by the nuclease.

[0405] In some embodiments, the coding sequence for the gene product of interest is linked in frame to the essential gene sequence through a coding sequence for a self-cleaving peptide, or the coding sequence for the gene product of interest contains an internal ribosomal entry site (IRES) at the 5’ end.

[0406] In some embodiments, the gene product of interest is a therapeutic protein (optionally an antibody, an engineered antigen receptor, or an antigen-binding fragment thereof), an immunomodulatory protein, a reporter protein, or a safety switch signal.

[0407] In some embodiments, the method further comprises contacting the population of mammalian cells with an inhibitor of non-homologous end joining.

[0408] In some embodiments, the population of mammalian cells are human cells. In some embodiments, the population of mammalian cells are pluripotent stem cells (PSCs). In some embodiments, the PSCs are embryonic stem cells or induced PSCs (iPSCs).

[0409] In some embodiments, the method comprises providing more than one donor template. In some embodiments, each donor template is targeted to the essential gene. In some embodiments, each donor template comprises a different genomic sequence. In some embodiments, each donor template comprises coding sequence for more than one gene product of interest.

[0410] In some embodiments, the genomic sequences from one donor template are incorporated into one allele of the essential gene and the genomic sequences from the other donor template are incorporated into the other allele of the essential gene. In some embodiments, each donor template comprises coding sequence for more than one gene product of interest.

[0411] In some embodiments, each donor template comprises at least one safety switch. In some embodiments, each donor template comprises at least one component of a safety switch. In some embodiments, the safety switch requires dimerization to function as a suicide switch.

[0412] In some embodiments, the method further comprises the additional steps of providing to the surviving cells the gene editing system containing a nuclease that is targeted to the pre-determined genomic position; optionally reintroducing the at least one donor template, to obtain a second population of mammalian cells; culturing the second population of mammalian cells; and identifying a surviving cell from the second population of mammalian cells that comprises the coding sequences for gene products of interest from the donor templates; wherein the identified surviving cell from the second population of mammalian cells is a genetically modified mammalian cell comprising the coding sequences for gene products of interest from donor templates at the pre-determined genomic position.

[0413] In some embodiments, the percentage of surviving cells from the second culturing step comprising the coding sequences for gene products of interest is enriched at least four-fold from the surviving cells from the first culturing step comprising the coding sequences for gene products of interest. In some embodiments, the percentage of surviving cells from the second culturing step comprising the coding sequences for gene products of interest from the donor templates is at least 2%.
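The enrichment criterion in the preceding paragraph reduces to simple arithmetic, sketched below. The function names and the example percentages are hypothetical, and checking the four-fold and 2% conditions together is an illustrative choice; the text presents them as separate embodiments.

```python
def fold_enrichment(first_pct: float, second_pct: float) -> float:
    """Ratio of cargo-positive percentages after the second vs. first culturing step."""
    if first_pct <= 0:
        raise ValueError("first_pct must be positive to compute a fold change")
    return second_pct / first_pct


def meets_thresholds(first_pct: float, second_pct: float,
                     min_fold: float = 4.0, min_pct: float = 2.0) -> bool:
    """Check both the fold-enrichment and the minimum-percentage conditions."""
    return fold_enrichment(first_pct, second_pct) >= min_fold and second_pct >= min_pct


# Hypothetical values: 0.5% of survivors carry the cargo after round one, 2.5% after round two.
print(fold_enrichment(0.5, 2.5))   # 5.0
print(meets_thresholds(0.5, 2.5))  # True
```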

[0414] In some embodiments, the method further comprises separating a mammalian cell comprising the coding sequences for gene products of interest from the donor templates. In some embodiments, the method further comprises growing the mammalian cell comprising the coding sequences for gene products of interest from the donor templates into a plurality of cells comprising the coding sequences for gene products of interest from the donor templates.

[0415] In some embodiments, the population of mammalian cells are PSCs. In some embodiments, the PSCs are embryonic stem cells or iPSCs.

[0416] In another aspect, the disclosure features a genetically engineered cell obtainable by any of the methods described herein. In some embodiments, the genetically engineered cell is a PSC. In some embodiments, the genetically engineered cell is an iPSC.

[0417] In another aspect, the disclosure features a method of obtaining a differentiated cell, comprising culturing a genetically engineered iPSC obtainable by any of the methods described herein in a culture medium that allows differentiation of the iPSC into the differentiated cell, or a genetically modified differentiated cell obtained by such method. In some embodiments, the differentiated cell is an immune cell, optionally selected from a T cell, a T cell expressing a chimeric antigen receptor (CAR), a suppressive T cell, a myeloid cell, a dendritic cell, and an immunosuppressive macrophage; a cell in the nervous system, optionally selected from dopaminergic neuron, a microglial cell, an oligodendrocyte, an astrocyte, a cortical neuron, a spinal or oculomotor neuron, an enteric neuron, a Placode-derived cell, a Schwann cell, and a trigeminal or sensory neuron; a cell in the ocular system, optionally selected from a retinal pigment epithelial cell, a photoreceptor cone cell, a photoreceptor rod cell, a bipolar cell, and a ganglion cell; a cell in the cardiovascular system, optionally selected from a cardiomyocyte, an endothelial cell, and a nodal cell; or a cell in the metabolic system, optionally selected from a hepatocyte, a cholangiocyte, and a pancreatic beta cell. In some embodiments, the differentiated cell is a human cell.

[0418] In another aspect, the disclosure features a pharmaceutical composition comprising any of the cells described herein. In another aspect, the disclosure features a method of treating a human patient in need thereof, comprising introducing the pharmaceutical composition to the patient, wherein the pharmaceutical composition comprises differentiated human cells. In another aspect, the disclosure features the pharmaceutical composition for use in treating a human patient in need thereof, wherein the pharmaceutical composition comprises differentiated human cells. In another aspect, the disclosure features use of the pharmaceutical composition for the manufacture of a medicament in treating a human patient in need thereof, wherein the pharmaceutical composition comprises differentiated human cells. In some embodiments, the differentiated human cells are autologous or allogenic cells.

[0419] In another aspect, the disclosure features a system for editing the genome of a mammalian cell, the system comprising a population of mammalian cells, a nuclease that causes a break within an endogenous coding sequence of an essential gene of the mammalian cell, and a plurality of donor templates each comprising a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3’) of an exogenous coding sequence or partial coding sequence of the essential gene, and wherein after contacting the population of mammalian cells with the nuclease and the donor templates, and optionally contacting the population of mammalian cells with the nuclease and optionally the donor templates a second time, at least about 2% of the viable cells of the population of mammalian cells are genome-edited cells that express the gene products of interest from the plurality of donor templates. In some embodiments, the essential gene is GAPDH.

[0420] In some embodiments, the mammalian cell is a PSC. In some embodiments, the mammalian cell is an iPSC.

[0421] In some embodiments, the break is a double-strand break. In some embodiments, the break is located within the last 1000, 500, 400, 300, 200, 100 or 50 base pairs of the coding sequence of the GAPDH gene. In some embodiments, the break is located within the last exon of the GAPDH gene.
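The positional windows listed above can be checked with a short helper such as the following sketch. It assumes the cut site is expressed as a 0-based offset into the spliced coding sequence; the function names and the example coding-sequence length are hypothetical and are not taken from the disclosure.

```python
from typing import Optional

# Window sizes (in base pairs from the 3' end of the coding sequence) named in the text.
WINDOWS_BP = (50, 100, 200, 300, 400, 500, 1000)


def distance_from_cds_end(cds_length: int, cut_index: int) -> int:
    """Number of coding bases between the cut site and the end of the coding sequence.

    cut_index is the count of coding bases lying 5' of the cut (0-based offset).
    """
    if not 0 <= cut_index <= cds_length:
        raise ValueError("cut_index must lie within the coding sequence")
    return cds_length - cut_index


def smallest_window(cds_length: int, cut_index: int) -> Optional[int]:
    """Return the smallest listed window containing the cut, or None if outside all of them."""
    distance = distance_from_cds_end(cds_length, cut_index)
    for window in WINDOWS_BP:
        if distance <= window:
            return window
    return None


# Hypothetical 1,200 bp coding sequence with a cut 70 bp before its end.
print(smallest_window(1200, 1130))  # 100
```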

[0422] In some embodiments, the nuclease is a CRISPR/Cas nuclease and the system further comprises a guide molecule for the CRISPR/Cas nuclease. In some embodiments, the nuclease is a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN) or a meganuclease.

[0423] In some embodiments, the donor templates are donor DNA templates, optionally wherein the donor DNA templates are double-stranded. In some embodiments, the donor templates comprise homology arms on either side of the exogenous coding sequences. In some embodiments, the homology arms correspond to sequences located on either side of the break in the genome of the mammalian cell.

BRIEF DESCRIPTION OF THE DRAWING

[0424] The teachings described herein will be more fully understood from the following description of various exemplary embodiments, when read together with the accompanying drawing. It should be understood that the drawing described below is for illustration purposes only and is not intended to limit the scope of the present teachings in any way.

[0425] Fig. 1 shows the locations on the GAPDH gene where exemplary AsCpf1 (AsCas12a) guide RNAs bind, and the results of screening the exemplary guide RNAs that target the GAPDH gene three days after transfection. Results are from gDNA from living cells.

[0426] Fig. 2 shows results of screening the exemplary AsCpf1 (AsCas12a) guide RNAs that target the GAPDH gene, three days after transfection. Results are from gDNA from living cells.

[0427] Fig. 3A shows an exemplary integration strategy that targets an essential gene according to certain embodiments of the present disclosure. In particular embodiments, introducing a double strand break using CRISPR gene editing (e.g., by Cas12a or Cas9) within a terminal exon (e.g., within about 500 bp upstream (5') of the stop codon of the essential gene) and administering a donor plasmid with homology arms designed to mediate homology directed repair (HDR) at the cleavage site, results in a population of viable cells carrying a cargo of interest integrated at the essential gene locus. Those cells that were edited by the CRISPR nuclease, but failed to undergo integration of the cargo at the essential gene locus, do not survive.

[0428] Fig. 3B shows an exemplary integration strategy that targets the GAPDH gene according to certain embodiments of the present disclosure. Although Fig. 3B shows a strategy wherein the GAPDH gene is modified in an induced pluripotent stem cell (iPSC), this strategy can be applied to a variety of cell types, including primary cells, stem cells, and cells differentiated from iPSCs.

[0429] Fig. 3C shows an exemplary integration strategy that targets the GAPDH gene according to certain embodiments of the present disclosure. The diagram shows that the only cells that should survive over time are those cells that underwent targeted integration of a cassette that restores the GAPDH locus and includes a cargo of interest, as well as unedited cells. The population of unedited cells following CRISPR editing should be small if the nuclease and guide RNA are highly effective at cleaving the essential gene target site and introduce indels that significantly reduce the function of the essential gene product.

[0430] Fig. 3D shows an exemplary integration strategy that targets an essential gene according to certain embodiments of the present disclosure. In particular embodiments, introducing a double strand break using CRISPR gene editing (e.g., by Cas12a or Cas9) to target a 5' exon (e.g., within about 500 bp downstream (3') of a start codon of the essential gene) and administering a donor plasmid with homology arms designed to mediate homology directed repair (HDR) at the cleavage site, results in a population of viable cells carrying a cargo of interest integrated at the essential gene locus. Those cells that were edited by the CRISPR nuclease, but failed to undergo integration of the cargo at the essential gene locus, do not survive.
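The selection logic described for Figs. 3A, 3C, and 3D can be summarized in a minimal model, sketched below. It makes the simplifying assumption that every indel abolishes essential-gene function, whereas the text only states that indels significantly reduce function. The outcome names and the cell counts are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    UNEDITED = "unedited"   # nuclease never cut; essential gene intact
    INDEL = "indel"         # cut repaired without the cassette; gene assumed non-functional
    KNOCK_IN = "knock_in"   # cassette integrated by HDR; gene restored with cargo


def survives(outcome: Outcome) -> bool:
    """Only cells with a functional essential gene persist in culture."""
    return outcome is not Outcome.INDEL


def cargo_fraction_among_survivors(counts: dict) -> float:
    """Fraction of surviving cells that carry the cargo."""
    survivors = sum(n for outcome, n in counts.items() if survives(outcome))
    if survivors == 0:
        raise ValueError("no surviving cells")
    return counts.get(Outcome.KNOCK_IN, 0) / survivors


# Hypothetical population: efficient cutting leaves few unedited cells.
population = {Outcome.UNEDITED: 5, Outcome.INDEL: 75, Outcome.KNOCK_IN: 20}
print(cargo_fraction_among_survivors(population))
```

In this toy population the cargo is present in 20% of all cells before selection but in 80% of survivors, which illustrates why a highly effective nuclease and guide RNA raise the purity of the surviving population.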

[0431] Fig. 4 shows editing efficiency at different concentrations (0.625 μM to 4 μM) of an exemplary AsCpf1 (AsCas12a) guide RNA that targets the GAPDH gene.

[0432] Fig. 5 shows the knock-in (KI) efficiency of a CD47 encoding “cargo” in the GAPDH gene 4 days post-electroporation when the dsDNA plasmid (“PLA”) was also present. Knock-in efficiency was measured with two different concentrations of the plasmid. Knock-in was measured using ddPCR targeting the 3' positions of the knock-in “cargo”.

[0433] Fig. 6 shows the knock-in efficiency of a CD47 encoding “cargo” in the GAPDH gene 9 days post-electroporation when the dsDNA plasmid was also present. Knock-in was measured using ddPCR both targeting the 5' and 3' positions of the knock-in “cargo”, increasing the reliability of the result.

[0434] Fig. 7 maps AsCpf1 (AsCas12a) guide RNAs that target terminal exons of the RPLP0 gene.

[0435] Fig. 8 maps AsCpf1 (AsCas12a) guide RNAs that target terminal exons of the RPLP0 gene.

[0436] Fig. 9 maps AsCpf1 (AsCas12a) guide RNAs that target terminal exons of the RPL13A gene.

[0437] Fig. 10 maps AsCpf1 (AsCas12a) guide RNAs that target terminal exons of the RPL13A gene.

[0438] Fig. 11 maps AsCpf1 (AsCas12a) guide RNAs that target terminal exons of the RPL7 gene.

[0439] Fig. 12 maps AsCpf1 (AsCas12a) guide RNAs that target terminal exons of the RPL7 gene.

[0440] Fig. 13 shows the efficiency of integration of a knock-in cassette, comprising a GFP protein encoding “cargo” sequence, into the GAPDH locus of iPSCs, measured 7 days following transfection. (A) Depicts exemplary microscopy (brightfield and fluorescent) images, and (B) depicts exemplary flow cytometry data. Images and flow cytometry data depict insertion rates for cargo transfection alone (PLA1593 or PLA1651) compared to cargo and guide RNA transfections (RSQ22337 + PLA1593 or RSQ24570 + PLA1651). Additionally, insertion rates with an exemplary exonic coding region targeting guide RNA with appropriate cargo (RSQ22337 + PLA1593) are compared to insertion rates with an intronic targeting guide RNA with appropriate cargo (RSQ24570 + PLA1651).

[0441] Fig. 14A depicts a schematic representation of a bicistronic knock-in cassette (e.g., comprising two cistrons separated by a linker) for insertion into the GAPDH locus. The leading GAPDH Exon 9 coding region and exogenous sequences encoding proteins of interest are separated by linker sequences. The second GAPDH allele can comprise a target knock-in cassette insertion, indels, or is wild type (WT).

[0442] Fig. 14B depicts a schematic representation of bi-allelic knock-in cassettes for insertion into the GAPDH locus. Exogenous “cargo” sequences encoding proteins of interest are located on different knock-in cassettes; for each construct, the leading GAPDH Exon 9 coding region is separated from an exogenous sequence encoding a protein of interest by a linker sequence.

[0443] Fig. 15A depicts a schematic representation of a bicistronic knock-in cassette for insertion into the GAPDH locus, with the leading GAPDH Exon 9 coding region and exogenous sequences encoding GFP and mCherry separated by linker sequences P2A, T2A, and/or IRES.

[0444] Fig. 15B is a panel of exemplary microscopic images (brightfield and fluorescent) of iPSCs nine days following nucleofection of RNPs comprising RSQ22337 (SEQ ID NO: 95) targeting GAPDH and Cas12a (SEQ ID NO: 62) and a bicistronic knock-in cassette comprising “cargo” sequence encoding GFP and mCherry molecules inserted at the GAPDH locus. iPSCs comprising exemplary “cargo” molecules PLA1582 (comprising donor template SEQ ID NO: 41) with linkers P2A and T2A, PLA1583 (comprising donor template SEQ ID NO: 42) with linkers T2A and P2A, and PLA1584 (comprising donor template SEQ ID NO: 43) with linkers T2A and IRES are shown. Results show that at least two different cargos can be inserted in a bicistronic manner and expression is detectable irrespective of linker type used. All images were taken at 2X 100 μm on a Keyence Microscope.

[0445] Fig. 15C depicts expression quantification (Y axis) of exemplary “cargo” molecules GFP and mCherry from various bicistronic molecules comprising the described linker pairs (X axis). mCherry as a sole “cargo” protein was utilized as a relative control.

[0446] Fig. 16A depicts exemplary flow cytometry data for bi-allelic GFP and mCherry knock-in at the GAPDH gene.

[0447] Fig. 16B depicts fluorescence imaging of cell populations prior to flow cytometry analysis following bi-allelic GFP and mCherry knock-in at the GAPDH gene.

[0448] Fig. 16C are histograms depicting exemplary flow cytometry analysis data for bi-allelic GFP and mCherry knock-in at the GAPDH gene. Cells were nucleofected with 0.5 μM RNPs comprising Cas12a (SEQ ID NO: 62) and RSQ22337 (SEQ ID NO: 95), and 2.5 μg (5 trials) or 5 μg (1 trial) GFP and mCherry donor templates.

[0449] Fig. 17A depicts exemplary flow cytometry data for GFP expression in iPSCs seven days after being transfected with a gRNA and an appropriate donor template comprising a knock-in cassette with a “cargo” sequence encoding GFP that was recombined into various loci.

[0450] Fig. 17B depicts the percentage of cells having editing events as measured by Inference of CRISPR Edits (ICE) assays 48 hours after being transfected with the noted gRNA.

[0451] Fig. 17C depicts relative integrated “cargo” (GFP) expression intensity as determined by flow cytometry conducted with an FITC channel to filter GFP signal for iPSCs transfected with the noted exemplary gRNA and knock-in cassette combinations.

[0452] Fig. 18 depicts exemplary flow cytometry data highlighting the efficiency of integration of a donor template comprising a knock-in cassette comprising a GFP protein encoding “cargo” sequence, into the TBP locus of iPSCs.

[0453] Fig. 19 is exemplary ddPCR results describing knock-in cassette integration ratios in GAPDH or TBP alleles in an iPSC population.

[0454] Fig. 20 is a histogram representation of exemplary flow cytometry data for AAV6 mediated knock-in of GFP into T cells using RNPs comprising RSQ22337 targeting GAPDH and Cas12a (SEQ ID NO: 62) at various concentrations of RNP and various AAV6 multiplicity of infection (MOI) rates (vg/cell) measured seven days after electroporation and transduction. The Y axis represents percentage cell population expressing GFP, while the X axis depicts AAV6 MOI.

[0455] Fig. 21 is a histogram representation of exemplary flow cytometry data depicting cell viability following AAV6 mediated knock-in of GFP at the GAPDH gene in differentiated cells. Depicted is T cell viability four days after AAV6 mediated transduction of a GFP cargo and electroporated with 1 μM RNPs comprising RSQ22337 and Cas12a (SEQ ID NO: 62); the Y axis notes cell viability as a function of total cell population, while the X axis lists various MOIs used to transduce the cells.

[0456] Fig. 22A depicts exemplary flow cytometry charts for a population of T cells transduced by AAV6 comprising a knock-in GFP cargo targeting GAPDH at 5E4 MOI and transformed with 4 μM RNP comprising Cas12a (SEQ ID NO: 62) and RSQ22337.

[0457] Fig. 22B depicts exemplary control experiment flow cytometry charts for T cells that were not transduced by AAV6, but solely transformed with 4 μM RNP comprising Cas12a (SEQ ID NO: 62) and RSQ22337.

[0458] Fig. 23 are histograms depicting exemplary flow cytometry data for AAV6 mediated knock-in of GFP into T cells at either the GAPDH locus using RNPs comprising RSQ22337 and Cas12a (SEQ ID NO: 62), or at the TRAC locus. Integration constructs each comprised homology arms approximately 500 bp in length, and T cells were transduced with the same concentration of RNP and AAV MOI. The mean and standard deviation of three independent biological replicates is shown; significant differences in targeted integration were observed (p = 0.0022 using unpaired t-test).

[0459] Fig. 24A is a histogram depicting the knock-in efficiency of CD16 encoding “cargo” integrated at the GAPDH gene of iPSCs. Targeted integration (TI) was measured at day 0 and day 19 of bulk edited cell populations using ddPCR targeting the 5' (5' assay) and 3' (3' assay) positions of the knock-in cargo.

[0460] Fig. 24B is a histogram depicting the genotypes of iPSC clones with CD16 encoding “cargo” integrated at the GAPDH gene, measured using ddPCR targeting the 5' (5' CDN probe) and 3' (3' PolyA probe) positions of the knock-in cargo. Shown are results for four exemplary cell lines: two lines were classified as homozygous knock-in with targeted integration (TI) rates of 88.5% (clone 1) and 90.5% (clone 2) respectively, and two lines were classified as heterozygous knock-in with TI rates of 45.6% (clone 1) and 46.5% (clone 2) respectively.

[0461] Fig. 25A depicts exemplary flow cytometry data from day 32 of homozygous clone 1 CD16 knock-in iPSCs differentiation into iNKs. The data highlights the efficiency of integration and high expression (e.g., approximately 98%) of a knock-in cassette comprising a CD16 protein encoding “cargo” sequence, into the GAPDH gene of iPSCs. In addition, the data shows knock-in of a “cargo” at the GAPDH gene does not inhibit the differentiation process, as represented by high CD56+CD45+ population proportions.

[0462] Fig. 25B depicts exemplary flow cytometry data from day 32 of homozygous clone 2 CD16 knock-in iPSCs differentiation into iNKs. The data highlights the efficiency of integration and expression of a knock-in cassette comprising a CD16 protein encoding “cargo” sequence, into the GAPDH gene of iPSCs.

[0463] Fig. 25C depicts exemplary flow cytometry data from day 32 of heterozygous clone 1 CD16 knock-in iPSCs differentiation into iNKs. The data highlights the efficiency of integration and high expression (e.g., approximately 97.8%) of a knock-in cassette comprising a CD16 protein encoding “cargo” sequence, into the GAPDH gene of iPSCs.

[0464] Fig. 25D depicts exemplary flow cytometry data from day 32 of heterozygous clone 2 CD16 knock-in iPSCs differentiation into iNKs. The data highlights the efficiency of integration and expression of a knock-in cassette comprising a CD16 protein encoding “cargo” sequence, into the GAPDH gene of iPSCs.

[0465] Fig. 26 is a schematic representation of an exemplary solid tumor cell killing assay, depicting the use of knock-in iPSCs differentiated into iNK cells to kill 3D spheroids created from a cancer cell line (e.g., SK-OV-3 ovarian cancer cells). Antibodies and/or cytokines may optionally be added during the 3D spheroid killing stage.

[0466] Fig. 27A shows the results of a solid tumor killing assay as described in Fig. 26. Homozygous clones comprising CD16 knock-in at the GAPDH gene were differentiated into iNK cells and functioned to reduce tumor cell spheroid size, particularly following the addition of an antibody, e.g., 10 μg/mL trastuzumab; addition of an antibody promotes antibody dependent cellular cytotoxicity (ADCC) and tumor cell killing by iNKs. Control “WT PCS” cells were bulk unedited parental clones that were electroporated without RNPs or plasmids, and at the same stage of iNK cell differentiation as test cells. The Y axis depicts normalized total integrated red object intensity, a proxy for tumor cell abundance, while the X axis depicts the Effector to Target cell (E:T) ratio.

[0467] Fig. 27B shows the results of a solid tumor killing assay as described in Fig. 26. Heterozygous clones comprising CD16 knock-in at the GAPDH gene were differentiated into iNK cells and functioned to reduce tumor cell spheroid size, particularly following the addition of an antibody, e.g., 10 μg/mL trastuzumab; addition of an antibody promotes ADCC and tumor cell killing by iNKs. Control “WT PCS” cells were bulk unedited parental clones that were electroporated without RNPs or plasmids, and at the same stage of iNK cell differentiation as test cells. The Y axis depicts normalized total integrated red object intensity, a proxy for tumor cell abundance, while the X axis depicts the E:T ratio.

[0468] Fig. 28 shows the results of an in-vitro serial killing assay, where homozygous or heterozygous clones comprising CD16 knock-in at the GAPDH gene were differentiated into iNK cells and were serially challenged with hematological cancer cells (e.g., Raji cells), with or without the addition of antibody (0.1 μg/mL rituximab). The X axis represents time (0-598 hr.), with an additional tumor cell bolus (5,000 cells) being added approximately every 48 hours; the Y axis represents killing efficacy as measured by normalized total red object area (e.g., presence of tumor cells). Star (*) denotes onset of addition of 0.1 μg/mL rituximab in previously rituximab absent trials. The data shows that edited iNK cells (CD16 knock-in at GAPDH gene; clones “Homo_C1”, “Homo_C2”, “Het_C1”, and “Het_C2”) continue to kill hematological cancer cells while unedited (“PCS”) or control edited iNKs (“GFP Bulk”) derived from parental iPSCs lose this function at equivalent time points.

[0469] Fig. 29 depicts a correlation (R² of 0.768) between CD16 expression and reduction in tumor spheroid size at an Effector to Target (E:T) ratio of 3.16:1. Shown are differentiated iNK cells derived from either iPSC bulk edited cells or iPSC individual clones with CD16 knock-in at the GAPDH gene. The Y axis represents normalized tumor cell killing values, while the X axis represents the percentage of a cell population expressing CD16.

[0470] Fig. 30A is a histogram depicting exemplary ddPCR data measured at day 9 post nucleofection of two different iPSC lines with plasmids and 2 μM RNPs comprising RSQ22337 targeting the GAPDH gene and Cas12a (SEQ ID NO: 62), for knock-in of CD16 cargo, a CAR cargo, or a biallelic GFP/mCherry cargo into the GAPDH gene.

[0471] Fig. 30B depicts exemplary flow cytometry data from iPSC lines edited with plasmids and 2 μM RNPs comprising RSQ22337 targeting the GAPDH gene and Cas12a (SEQ ID NO: 62), for knock-in of CXCR2 cargo into the GAPDH gene (GAPDH::CXCR2) or control iPSCs transformed with RNP only (Wild-type). CXCR2 expression is noted on the X axis; edited cells expressing CXCR2 were 29.2% of the bulk edited cell population, while surface expression of CXCR2 was 8.53% of the bulk edited cell populations.

[0472] Fig. 31 is a histogram depicting the knock-in efficiency of a series of knock-in cassette cargo sequences such as CD16-P2A-CAR, CD16-IRES-CAR, CAR-P2A-CD16, CAR-IRES-CD16, and mbIL-15 into the GAPDH gene using RNPs comprising RSQ22337 targeting the GAPDH gene and Cas12a (SEQ ID NO: 62), measured on day 0 post-electroporation using ddPCR targeting the 5' (5' CDN probe) and 3' (3' PolyA probe) positions of the knock-in “cargo”.

[0473] Fig. 32 diagrammatically depicts a membrane-bound IL15.IL15Ra (mbIL-15) construct that can be utilized as a knock-in cargo sequence as described herein.

[0474] Fig. 33 is a histogram depicting the TI of mbIL-15 into the GAPDH gene over time when measured as a percentage of a bulk edited population. Shown are TI rates from iPSCs that are on day 28 of the differentiation to iNK cell process.

[0475] Fig. 34A depicts exemplary flow cytometry data from bulk edited mbIL-15 GAPDH gene knock-in iPSC populations at day 39 of differentiation into iNKs.

[0476] Fig. 34B depicts exemplary flow cytometry data from bulk edited mbIL-15 GAPDH gene knock-in iPSC populations at day 39 of differentiation into iNKs.

[0477] Fig. 34C shows surface expression phenotypes (measured as a percentage of the population) of bulk edited mbIL-15 GAPDH gene knock-in iPSC populations being differentiated into iNK cells as compared to parental clone cells also being differentiated into iNK cells (“WT”) at day 32, day 39, day 42, and day 49 of iPSC differentiation.

[0478] Fig. 35 shows the results from two in-vitro tumor cell killing assays. Two biological replicates of bulk edited iPSC populations (S1 and S2) comprising mbIL-15 knock-in at the GAPDH gene were differentiated into iNK cells (day 56 of differentiation for S2, and day 63 of differentiation for S1) and functioned to reduce hematological cancer cell (e.g., Raji cell) fluorescence signal when compared to WT parental cells also differentiated into iNK cells, measured in the absence or presence of 10 μg/mL rituximab, at E:T ratios of 1 (A) or 2.5 (B) (experiments performed in duplicate, R1 and R2).

[0479] Fig. 36 shows the results of a solid tumor killing assay as described in Fig. 26. Two biological replicates of bulk edited iPSC populations (S1 and S2) comprising mbIL-15 knock-in at the GAPDH gene were differentiated into iNK cells (day 39 of iPSC differentiation) and functioned to reduce tumor cell spheroid size when compared to WT parental cells also differentiated into iNK cells. Addition of 5 ng/mL exogenous IL-15 increased tumor cell killing by iNKs. The Y axis depicts normalized total integrated red object intensity, a proxy for tumor cell abundance, while the X axis depicts E:T ratio.

[0480] Fig. 37A shows the results of solid tumor killing assays as described in Fig. 26. Two biological replicates of bulk edited iPSC populations (S1 and S2) comprising mbIL-15 knock-in at the GAPDH gene were differentiated into iNK cells (day 63 of iPSC differentiation for S1, and day 56 of iPSC differentiation for S2) and functioned to reduce tumor cell spheroid size. The Y axis represents killing efficacy as measured by normalized total red object area (e.g., presence of tumor cells), while the X axis represents the E:T cell ratio; experiments were performed in duplicate or triplicate, R1, R2, and R2.1.

[0481] Fig. 37B shows the results of solid tumor killing assays as described in Fig. 37A, but with the addition of 10 μg/mL Herceptin antibody, an addition that triggers ADCC tumor cell killing.

[0482] Fig. 37C shows the results of solid tumor killing assays as described in Fig. 37A, but with the addition of 5 ng/mL exogenous IL-15.

[0483] Fig. 37D shows the results of solid tumor killing assays as described in Fig. 37A, but with the addition of 5 ng/mL exogenous IL-15 and 10 μg/mL Herceptin antibody, an addition that triggers ADCC tumor cell killing.

[0484] Fig. 38 depicts the cumulative results of two independent sets of cells and 3-5 repeats of solid tumor killing assays as described in Fig. 26. Two independent bulk edited populations (S1 and S2) comprising mbIL-15 knock-in at the GAPDH gene were differentiated into iNK cells (day 39 and 49 of iPSC differentiation for set 1, and day 42 of iPSC differentiation for S2) and functioned to significantly reduce tumor cell spheroid size when compared to differentiated WT parental cell iNKs in the absence of exogenous IL-15 (P=0.034, +/- standard deviation, unpaired t-test); in addition, differentiated knock-in cells trended towards significant reduction of tumor cell spheroid size when compared to differentiated WT parental cells in the presence of 5 ng/mL exogenous IL-15 (P=0.052, +/- standard deviation, unpaired t-test).

[0485] Fig. 39A schematically depicts a knock-in cassette cargo sequence comprising membrane-bound IL15.IL15Ra (mbIL-15) coupled with a GFP sequence, for integration at a target gene as described herein.

[0486] Fig. 39B schematically depicts a knock-in cassette cargo sequence comprising CD16, IL15, and IL15Ra, for integration at a target gene as described herein.

[0487] Fig. 39C schematically depicts a knock-in cassette cargo sequence comprising CD16 and membrane bound IL15.IL15Ra (mbIL-15), for integration at a target gene as described herein.

[0488] Fig. 40A depicts exemplary flow cytometry data from bulk edited iPSC populations seven days after transformation with PLA1829 (see Fig. 39A) comprising a cargo sequence of membrane-bound IL15.IL15Ra (mbIL-15) coupled with a GFP sequence inserted in the GAPDH gene using RNPs comprising RSQ22337 targeting the GAPDH gene and Cas12a (SEQ ID NO: 62), or control WT cells transformed with RNPs only, measured using ddPCR. Shown on the Y axis is IL-15Ra expression, while GFP expression is shown on the X axis.

[0489] Fig. 40B depicts exemplary flow cytometry data from bulk edited iPSC populations seven days after transformation with PLA1832 or PLA1834 (see Fig. 39B and 39C), comprising a cargo sequence of CD16, IL-15, and IL15Ra, or comprising a cargo sequence of CD16 and membrane-bound IL15.IL15Ra (mbIL-15); inserted in the GAPDH gene using RNPs comprising RSQ22337 targeting the GAPDH gene and Cas12a (SEQ ID NO: 62), measured using ddPCR. Shown on the Y axis is IL-15Ra expression, X axis is GFP expression.

[0490] Fig. 41A is a histogram depicting the genotypes of individual colonies following transformation as described in Fig. 40A with PLA1829 (5 μg) and 2 μM RNPs comprising RSQ22337 targeting the GAPDH gene and Cas12a (SEQ ID NO: 62), measured using ddPCR. Shown are individual homozygous (~100% TI), heterozygous (~50% TI), or wild type (~0% TI) cells.

[0491] Fig. 41B is a histogram depicting the genotypes of individual colonies following transformation as described in Fig. 40B with PLA1832 (5 μg) and 2 μM RNPs comprising RSQ22337 targeting the GAPDH gene and Cas12a (SEQ ID NO: 62), measured using ddPCR. Shown are individual homozygous (~100% TI), heterozygous (~50% TI), or wild type (~0% TI) cells.

[0492] Fig. 41C is a histogram depicting the genotypes of individual colonies following transformation as described in Fig. 40B with PLA1834 (5 μg) and 2 μM RNPs comprising RSQ22337 targeting the GAPDH gene and Cas12a (SEQ ID NO: 62), measured using ddPCR. Shown are individual homozygous (~100% TI), heterozygous (~50% TI), or wild type (~0% TI) cells.
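The genotype calls described for the ddPCR data (approximately 100%, 50%, or 0% TI) can be illustrated with a nearest-value classifier, sketched below. The nearest-value rule and the function name are assumptions made for illustration; the disclosure does not state the cutoffs actually used to assign genotypes.

```python
# Expected targeted integration (TI) percentage for each clonal genotype, per the text.
EXPECTED_TI = {"wild type": 0.0, "heterozygous": 50.0, "homozygous": 100.0}


def classify_genotype(ti_percent: float) -> str:
    """Assign the genotype whose expected TI value is closest to the measured one.

    Exact ties (25% or 75%) resolve to the genotype listed first in EXPECTED_TI.
    """
    if not 0.0 <= ti_percent <= 100.0:
        raise ValueError("TI must be expressed as a percentage between 0 and 100")
    return min(EXPECTED_TI, key=lambda genotype: abs(EXPECTED_TI[genotype] - ti_percent))


# TI rates reported for the clones of Fig. 24B.
for ti in (88.5, 90.5, 45.6, 46.5):
    print(ti, classify_genotype(ti))
```

Applied to the TI rates reported for Fig. 24B, this rule reproduces the stated classifications: 88.5% and 90.5% are called homozygous, and 45.6% and 46.5% are called heterozygous.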

[0493] Fig. 42A depicts exemplary flow cytometry data from cells comprising knock-in cargo sequences from PLA1829, PLA1832, or PLA1834 at the GAPDH gene (as described in Fig. 40A-40C) measured at day 32 of differentiation into iNKs; “WT” cells were transformed with RNPs only and were also at day 32 of differentiation into iNKs. The data highlights the efficiency of integration and expression of knock-in cassettes comprising an IL-15Ra protein encoding “cargo” sequence. The Y axis quantifies the percentage of cells from the noted population that are expressing IL-15Ra, while the X axis denotes colony genotype.

[0494] Fig. 42B depicts exemplary flow cytometry data from cells comprising knock-in cargo sequences from PLA1829, PLA1832, or PLA1834 at the GAPDH gene (as described in Fig. 40A-40C) measured at day 32 of differentiation into iNKs; “WT” cells were transformed with RNPs only and were also at day 32 of differentiation into iNKs. The data highlights the efficiency of integration and expression of knock-in cassettes comprising a CD16 protein encoding “cargo” sequence. The Y axis quantifies the percentage of cells from the noted population that are expressing CD16, while the X axis denotes colony genotype.

[0495] Fig. 42C depicts exemplary flow cytometry data from cells comprising knock-in cargo sequences from PLA1829, PLA1832, or PLA1834 at the GAPDH gene (as described in Fig. 40A-40C) measured at day 32 of differentiation into iNKs; “WT” cells were transformed with RNPs only and were also at day 32 of differentiation into iNKs. The data highlights the efficiency of integration and expression of knock-in cassettes comprising an IL-15Ra protein encoding “cargo” sequence. The Y axis quantifies the median fluorescence intensity (MFI) of a cell population expressing IL-15Ra, while the X axis denotes colony genotype.

[0496] Fig. 42D depicts exemplary flow cytometry data from cells comprising knock-in cargo sequences from PLA1829, PLA1832, or PLA1834 at the GAPDH gene (as described in Fig. 40A-40C) measured at day 32 of differentiation into iNKs; “WT” cells were transformed with RNPs only and were also at day 32 of differentiation into iNKs. The data highlights the efficiency of integration and expression of knock-in cassettes comprising a CD16 protein encoding “cargo” sequence. The Y axis quantifies the median fluorescence intensity (MFI) of a cell population expressing CD16, while the X axis denotes colony genotype.

[0497] Fig. 43A is a panel of cytometric dot plots showing further enrichment of PSCs that have been edited for a PDL1-based transgene, edited for a CD47-based transgene, or biallelically edited for a PDL1-based transgene and a CD47-based transgene targeted to the GAPDH gene locus, following a second round of editing with ribonucleoprotein (“RNP”) and PDL1-based and CD47-based donor constructs or RNP alone.

[0498] Fig. 43B is a panel of cytometric dot plots showing further enrichment of PSCs that have been edited for a PDL1-based transgene targeted to the GAPDH gene, following a second round of editing with RNP alone.

[0499] Fig. 44 depicts two cytometric dot plots showing unedited PSCs or enrichment of PSCs that have been edited at the GAPDH locus using two different donor templates, one of which is PDL1-based and the other is CD47-based. When editing using two different donor constructs, cells can be observed that are edited with either one unique donor construct (either PDL1-based or CD47-based) or biallelically edited for both a PDL1-based transgene and a CD47-based transgene targeted to the GAPDH gene.

DETAILED DESCRIPTION

Definitions and Abbreviations

[0500] Unless otherwise specified, each of the following terms has the meaning set forth in this section.

[0501] The indefinite articles “a” and “an” refer to at least one of the associated noun, and are used interchangeably with the terms “at least one” and “one or more.” The conjunctions “or” and “and/or” are used interchangeably as non-exclusive disjunctions.

[0502] The term “cancer” (also used interchangeably with the term “neoplastic”), as used herein, refers to cells having the capacity for autonomous growth, i.e., an abnormal state or condition characterized by rapidly proliferating cell growth. Cancerous disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, e.g., malignant tumor growth, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state, e.g., cell proliferation associated with wound repair.

[0503] The term “CRISPR/Cas nuclease” as used herein refers to any CRISPR/Cas protein with DNA nuclease activity, e.g., a Cas9 or a Cas12 protein that exhibits specific association (or “targeting”) to a DNA target site, e.g., within a genomic sequence in a cell in the presence of a guide molecule. The strategies, systems, and methods disclosed herein can use any combination of CRISPR/Cas nucleases disclosed herein, or known to those of ordinary skill in the art. Those of ordinary skill in the art will be aware of additional CRISPR/Cas nucleases and variants suitable for use in the context of the present disclosure, and it will be understood that the present disclosure is not limited in this respect.

[0504] The term “differentiation” as used herein is the process by which an unspecialized (“uncommitted”) or less specialized cell acquires the features of a specialized cell such as, for example, a blood cell. In some embodiments, a differentiated or differentiation-induced cell is one that has taken on a more specialized (“committed”) position within the lineage of a cell. For example, an iPS cell (iPSC) can be differentiated into various more differentiated cell types, for example, a hematopoietic stem cell, a lymphocyte, and other cell types, upon treatment with suitable differentiation factors in the cell culture medium. Suitable methods, differentiation factors, and cell culture media for the differentiation of pluri- and multipotent cell types into more differentiated cell types are well known to those of skill in the art. In some embodiments, the term “committed” is applied to the process of differentiation to refer to a cell that has proceeded through a differentiation pathway to a point where, under normal circumstances, it would or will continue to differentiate into a specific cell type or subset of cell types, and cannot, under normal circumstances, differentiate into a different cell type (other than a specific cell type or subset of cell types) nor revert to a less differentiated cell type.

[0505] The terms “differentiation marker,” “differentiation marker gene,” or “differentiation gene,” as used herein refer to genes or proteins whose expression is indicative of cell differentiation occurring within a cell, such as a pluripotent cell. In some embodiments, differentiation marker genes include, but are not limited to, the following genes: CD34, CD4, CD8, CD3, CD56 (NCAM), CD49, CD45, NK cell receptor (cluster of differentiation 16 (CD16)), natural killer group-2 member D (NKG2D), CD69, NKp30, NKp44, NKp46, CD158b, FOXA2, FGF5, SOX17, XIST, NODAL, COL3A1, OTX2, DUSP6, EOMES, NR2F2, NR0B1, CXCR4, CYP2B6, GATA3, GATA4, ERBB4, GATA6, HOXC6, INHA, SMAD6, RORA, NIPBL, TNFSF11, CDH11, ZIC4, GAL, SOX3, PITX2, APOA2, CXCL5, CER1, FOXQ1, MLL5, DPP10, GSC, PCDH10, CTCFL, PCDH20, TSHZ1, MEGF10, MYC, DKK1, BMP2, LEFTY2, HES1, CDX2, GNAS, EGR1, COL3A1, TCF4, HEPH, KDR, TOX, FOXA1, LCK, PCDH7, CD1D, FOXG1, LEFTY1, TUJ1, T gene (Brachyury), ZIC1, GATA1, GATA2, HDAC4, HDAC5, HDAC7, HDAC9, NOTCH1, NOTCH2, NOTCH4, PAX5, RBPJ, RUNX1, STAT1 and STAT3.

[0506] The terms “differentiation marker gene profile,” “differentiation gene profile,” “differentiation gene expression profile,” “differentiation gene expression signature,” “differentiation gene expression panel,” “differentiation gene panel,” or “differentiation gene signature” as used herein refer to expression or levels of expression of a plurality of differentiation marker genes.

[0507] The term “nuclease” as used herein refers to any protein that catalyzes the cleavage of phosphodiester bonds. In some embodiments the nuclease is a DNA nuclease. In some embodiments the nuclease is a “nickase” which causes a single-strand break when it cleaves double-stranded DNA, e.g., genomic DNA in a cell. In some embodiments the nuclease causes a double-strand break when it cleaves double-stranded DNA, e.g., genomic DNA in a cell. In some embodiments the nuclease binds a specific target site within the double-stranded DNA that overlaps with or is adjacent to the location of the resulting break. In some embodiments, the nuclease causes a double-strand break that contains overhangs ranging from 0 (blunt ends) to 22 nucleotides in both 3' and 5' orientations. As discussed herein, CRISPR/Cas nucleases, zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and meganucleases are exemplary nucleases that can be used in accordance with the strategies, systems, and methods of the present disclosure.

[0508] The term “embryonic stem cell” as used herein refers to pluripotent stem cells derived from the inner cell mass of the embryonic blastocyst. In some embodiments, embryonic stem cells are pluripotent and give rise during development to all derivatives of the three primary germ layers: ectoderm, endoderm and mesoderm. In some such embodiments, embryonic stem cells do not contribute to the extra-embryonic membranes or the placenta, i.e., are not totipotent.

[0509] The term “endogenous,” as used herein in the context of nucleic acids refers to a native nucleic acid (e.g., a gene, a protein coding sequence) in its natural location, e.g., within the genome of a cell.

[0510] The term “essential gene” as used herein with respect to a cell refers to a gene that encodes at least one gene product that is required for survival and/or proliferation of the cell. An essential gene can be a housekeeping gene that is essential for survival of all cell types or a gene that is required to be expressed in a specific cell type for survival and/or proliferation under particular culture conditions, e.g., for proper differentiation of iPS or ES cells or expansion of iPS- or ES-derived cells. Loss of function of an essential gene results, in some embodiments, in a significant reduction of cell survival, e.g., of the time a cell characterized by a loss of function of an essential gene survives as compared to a cell of the same cell type but without a loss of function of the same essential gene. In some embodiments, loss of function of an essential gene results in the death of the affected cell. In some embodiments, loss of function of an essential gene results in a significant reduction of cell proliferation, e.g., in the ability of a cell to divide, which can manifest in a significant time period the cell requires to complete a cell cycle, or, in some preferred embodiments, in a loss of a cell’s ability to complete a cell cycle, and thus to proliferate at all.

[0511] The term “exogenous,” as used herein in the context of nucleic acids refers to a nucleic acid (whether native or non-native) that has been artificially introduced into a man-made construct (e.g., a knock-in cassette, or a donor template) or into the genome of a cell using, for example, gene editing or genetic engineering techniques, e.g., HDR-based integration techniques.

[0512] The term “guide molecule” or “guide RNA” or “gRNA” when used in reference to a CRISPR/Cas system is any nucleic acid that promotes the specific association (or “targeting”) of a CRISPR/Cas nuclease, e.g., a Cas9 or a Cas12 protein, to a DNA target site such as within a genomic sequence in a cell. While guide molecules are typically RNA molecules, it is well known in the art that chemically modified RNA molecules including DNA/RNA hybrid molecules can be used as guide molecules.

[0513] The terms “hematopoietic stem cell,” or “definitive hematopoietic stem cell” as used herein, refer to CD34-positive (CD34+) stem cells. In some embodiments, CD34-positive stem cells are capable of giving rise to mature myeloid and/or lymphoid cell types. In some embodiments, the myeloid and/or lymphoid cell types include, for example, T cells, natural killer (NK) cells and/or B cells.

[0514] The terms “induced pluripotent stem cell”, “iPS cell” or “iPSC” as used herein refer to a stem cell obtained from a differentiated somatic (e.g., adult, neonatal, or fetal) cell by a process referred to as reprogramming (e.g., dedifferentiation). In some embodiments, reprogrammed cells are capable of differentiating into tissues of all three germ or dermal layers: mesoderm, endoderm, and ectoderm. iPSCs are not found in nature.

[0515] The terms “iPS-derived NK cell” or “iNK cell” as used herein refer to a natural killer cell which has been produced by differentiating an iPS cell, which iPS cell may or may not have a genetic modification.

[0516] The terms “iPS-derived T cell” or “iT cell” as used herein refer to a T cell which has been produced by differentiating an iPS cell, which iPS cell may or may not have a genetic modification.

[0517] The term “multipotent stem cell” as used herein refers to a cell that has the developmental potential to differentiate into cells of one or more germ layers (ectoderm, mesoderm and endoderm), but not all three germ layers. Thus, in some embodiments, a multipotent cell may also be termed a “partially differentiated cell.” Multipotent cells are well-known in the art, and examples of multipotent cells include adult stem cells, such as for example, hematopoietic stem cells and neural stem cells. In some embodiments, “multipotent” indicates that a cell may form many types of cells in a given lineage, but not cells of other lineages. For example, a multipotent hematopoietic cell can form the many different types of blood cells (red, white, platelets, etc.), but it cannot form neurons. Accordingly, in some embodiments, “multipotency” refers to a state of a cell with a degree of developmental potential that is less than totipotent and pluripotent.

[0518] The term “pluripotent” as used herein refers to the ability of a cell to form all lineages of the body or soma (i.e., the embryo proper) or a given organism (e.g., human). For example, embryonic stem cells are a type of pluripotent stem cells that are able to form cells from each of the three germ layers, the ectoderm, the mesoderm, and the endoderm. Generally, pluripotency may be described as a continuum of developmental potencies ranging from an incompletely or partially pluripotent cell (e.g., an epiblast stem cell or EpiSC), which is unable to give rise to a complete organism, to the more primitive, more pluripotent cell, which is able to give rise to a complete organism (e.g., an embryonic stem cell or an induced pluripotent stem cell).

[0519] The term “pluripotency” as used herein refers to a cell that has the developmental potential to differentiate into cells of all three germ layers (ectoderm, mesoderm, and endoderm). In some embodiments, pluripotency can be determined, in part, by assessing pluripotency characteristics of the cells. In some embodiments, pluripotency characteristics include, but are not limited to: (i) pluripotent stem cell morphology; (ii) the potential for unlimited self-renewal; (iii) expression of pluripotent stem cell markers including, but not limited to SSEA1 (mouse only), SSEA3/4, SSEA5, TRA1-60/81, TRA1-85, TRA2-54, GCTM-2, TG343, TG30, CD9, CD29, CD133/prominin, CD140a, CD56, CD73, CD90, CD105, OCT4 (also known as POU5F1), NANOG, SOX2, CD30 and/or CD50; (iv) ability to differentiate to all three somatic lineages (ectoderm, mesoderm and endoderm); (v) teratoma formation consisting of the three somatic lineages; and (vi) formation of embryoid bodies consisting of cells from the three somatic lineages.

[0520] The term “pluripotent stem cell morphology” as used herein refers to the classical morphological features of an embryonic stem cell. In some embodiments, normal embryonic stem cell morphology is characterized as small and round in shape, with a high nucleus-to-cytoplasm ratio, the notable presence of nucleoli, and typical intercell spacing.

[0521] The term “polycistronic” or “multicistronic” when used herein with reference to a knock-in cassette refers to the fact that the knock-in cassette can express two or more proteins from the same mRNA transcript. Similarly, a “bicistronic” knock-in cassette is a knock-in cassette that can express two proteins from the same mRNA transcript.

[0522] The term “polynucleotide” (including, but not limited to “nucleotide sequence”, “nucleic acid”, “nucleic acid molecule”, “nucleic acid sequence”, and “oligonucleotide”) as used herein refers to a series of nucleotide bases (also called “nucleotides”) in DNA and RNA, and means any chain of two or more nucleotides. In some embodiments, polynucleotides, nucleotide sequences, nucleic acids, etc. can be chimeric mixtures or derivatives or modified versions thereof, single-stranded or double-stranded. In some such embodiments, modifications can occur at the base moiety, sugar moiety, or phosphate backbone, for example, to improve stability of the molecule, its hybridization parameters, etc. In general, a nucleotide sequence typically carries genetic information, including, but not limited to, the information used by cellular machinery to make proteins and enzymes. In some embodiments, a nucleotide sequence and/or genetic information comprises double- or single-stranded genomic DNA, RNA, any synthetic and genetically manipulated polynucleotide, and/or sense and/or antisense polynucleotides. In some embodiments, nucleic acids contain modified bases.

[0523] Conventional IUPAC notation is used in nucleotide sequences presented herein, as shown in Table 1, below (see also Cornish-Bowden, Nucleic Acids Res. 1985; 13(9):3021-30, incorporated by reference herein). It should be noted, however, that “T” denotes “Thymine or Uracil” in those instances where a sequence may be encoded by either DNA or RNA, for example in certain CRISPR/Cas guide molecule targeting domains.

Table 1: IUPAC nucleic acid notation
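By way of non-limiting illustration, the conventional IUPAC single-letter nucleotide codes can be expressed as a lookup from each code to the bases it may denote, together with a helper that enumerates the concrete sequences matched by a degenerate sequence. The names `IUPAC_CODES` and `expand_degenerate`, and the example motifs, are hypothetical and are not part of the present disclosure; the mapping follows the standard IUPAC convention rather than being copied from Table 1.

```python
from itertools import product

# Standard IUPAC nucleotide codes mapped to the bases each may denote.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(sequence: str) -> list[str]:
    """List every concrete sequence matched by a degenerate IUPAC sequence."""
    pools = [IUPAC_CODES[code] for code in sequence.upper()]
    return ["".join(bases) for bases in product(*pools)]


if __name__ == "__main__":
    print(expand_degenerate("TTTV"))  # illustrative motif only
```

The number of expanded sequences grows multiplicatively with each degenerate position, so a sketch of this kind is practical only for short motifs.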

[0524] The terms “potency” or “developmental potency” as used herein refer to the sum of all developmental options accessible to the cell (i.e., the developmental potency), particularly, for example in the context of cellular developmental potential. In some embodiments, the continuum of cell potency includes, but is not limited to, totipotent cells, pluripotent cells, multipotent cells, oligopotent cells, unipotent cells, and terminally differentiated cells.

[0525] The terms “prevent,” “preventing,” and “prevention” as used herein refer to the prevention of a disease in a mammal, e.g., in a human, including (a) avoiding or precluding the disease; (b) affecting the predisposition toward the disease; or (c) preventing or delaying the onset of at least one symptom of the disease.

[0526] The terms “protein,” “peptide” and “polypeptide” as used herein are used interchangeably to refer to a sequential chain of amino acids linked together via peptide bonds. The terms include individual proteins, groups or complexes of proteins that associate together, as well as fragments or portions, variants, derivatives and analogs of such proteins. Unless otherwise specified, peptide sequences are presented herein using conventional notation, beginning with the amino or N-terminus on the left, and proceeding to the carboxyl or C-terminus on the right. Standard one-letter or three-letter abbreviations can be used.

[0527] The term “gene product of interest” as used herein can refer to any product encoded by a gene including any polynucleotide or polypeptide. In some embodiments the gene product is a protein which is not naturally expressed by a target cell of the present disclosure. In some embodiments the gene product is a protein which confers a new therapeutic activity to the cell such as, but not limited to, a chimeric antigen receptor (CAR) or antigen-binding fragment thereof, a T cell receptor or antigen-binding portion thereof, a non-naturally occurring variant of FcγRIII (CD16), interleukin 15 (IL-15), interleukin 15 receptor (IL-15R) or a variant thereof, interleukin 12 (IL-12), interleukin-12 receptor (IL-12R) or a variant thereof, human leukocyte antigen G (HLA-G), human leukocyte antigen E (HLA-E), leukocyte surface antigen cluster of differentiation CD47 (CD47), or any combination of two or more thereof. It is to be understood that the methods and cells of the present disclosure are not limited to any particular gene product of interest and that the selection of a gene product of interest will depend on the type of cell and ultimate use of the cells.

[0528] The term “reporter gene” as used herein refers to an exogenous gene that has been introduced into a cell, e.g., integrated into the genome of the cell, that confers a trait suitable for artificial selection. Common reporter genes are fluorescent reporter genes that encode a fluorescent protein, e.g., green fluorescent protein (GFP) and antibiotic resistance genes that confer antibiotic resistance to cells.

[0529] The terms “reprogramming” or “dedifferentiation” or “increasing cell potency” or “increasing developmental potency” as used herein refer to a method of increasing potency of a cell or dedifferentiating a cell to a less differentiated state. For example, in some embodiments, a cell that has an increased cell potency has more developmental plasticity (i.e., can differentiate into more cell types) compared to the same cell in the non-reprogrammed state. That is, in some embodiments, a reprogrammed cell is one that is in a less differentiated state than the same cell in a non-reprogrammed state. In some embodiments, “reprogramming” refers to de-differentiating a somatic cell, or a multipotent stem cell, into a pluripotent stem cell, also referred to as an induced pluripotent stem cell, or iPSC. Suitable methods for the generation of iPSCs from somatic or multipotent stem cells are well known to those of skill in the art.

[0530] The term “subject” as used herein means a human or non-human animal. In some embodiments a human subject can be any age (e.g., a fetus, infant, child, young adult, or adult). In some embodiments a human subject may be at risk of or suffer from a disease, or may be in need of alteration of a gene or a combination of specific genes. Alternatively, in some embodiments, a subject may be a non-human animal, which may include, but is not limited to, a mammal. In some embodiments, a non-human animal is a non-human primate, a rodent (e.g., a mouse, rat, hamster, guinea pig, etc.), a rabbit, a dog, a cat, and so on. In certain embodiments of this disclosure, the non-human animal subject is livestock, e.g., a cow, a horse, a sheep, a goat, etc. In certain embodiments, the non-human animal subject is poultry, e.g., a chicken, a turkey, a duck, etc.

[0531] The terms “treatment,” “treat,” and “treating,” as used herein refer to a clinical intervention aimed to reverse, alleviate, delay the onset of, or inhibit the progress, ameliorate, reduce severity of, prevent or delay the recurrence of a disease, disorder, or condition or one or more symptoms thereof, and/or improve one or more symptoms of a disease, disorder, or condition as described herein. In some embodiments, a condition includes an injury. In some embodiments, an injury may be acute or chronic (e.g., tissue damage from an underlying disease or disorder that causes, e.g., secondary damage such as tissue injury). In some embodiments, treatment, e.g., in the form of an iPSC-derived NK cell or a population of iPSC-derived NK cells as described herein, may be administered to a subject after one or more symptoms have developed and/or after a disease has been diagnosed. Treatment may be administered in the absence of symptoms, e.g., to prevent or delay onset of a symptom or inhibit onset or progression of a disease. For example, in some embodiments, treatment may be administered to a susceptible subject prior to the onset of symptoms (e.g., in light of genetic or other susceptibility factors). In some embodiments, treatment may also be continued after symptoms have resolved, for example to prevent or delay their recurrence. In some embodiments, treatment results in improvement and/or resolution of one or more symptoms of a disease, disorder or condition.

[0532] The term “variant” as used herein refers to an entity such as a polypeptide or polynucleotide that shows significant structural identity with a reference entity but differs structurally from the reference entity in the presence or level of one or more chemical moieties as compared with the reference entity. In many embodiments, a variant also differs functionally from its reference entity. In general, whether a particular entity is properly considered to be a “variant” of a reference entity is based on its degree of structural identity with the reference entity. As used herein, the term “functional variant” refers to a variant that confers the same function as the reference entity, e.g., a functional variant of a gene product of an essential gene is a variant that promotes the survival and/or proliferation of a cell. It is to be understood that a functional variant need not be functionally equivalent to the reference entity as long as it confers the same function as the reference entity.

Methods of Editing the Genome of a Cell

[0533] In one aspect, the present disclosure provides methods of editing the genome of a cell. In certain embodiments, the method comprises contacting the cell with a nuclease that causes a break within an endogenous coding sequence of an essential gene in the cell, wherein the essential gene encodes at least one gene product that is required for survival and/or proliferation of the cell. The cell is also contacted with (i) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3') of an exogenous coding sequence or partial coding sequence of the essential gene and/or (ii) a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and upstream (5') of an exogenous coding sequence or partial coding sequence of the essential gene (Fig. 3D). The knock-in cassette is integrated into the genome of the cell by homology-directed repair (HDR) of the break, resulting in a genome-edited cell that expresses the gene product of interest and the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof. The genetically modified “knock-in” cell survives and proliferates to produce progeny cells with genomes that also include the exogenous coding sequence for the gene product of interest. This is illustrated in Fig. 3A for an exemplary method.

[0534] If the knock-in cassette is not properly integrated into the genome of the cell, undesired editing events that result from the break, e.g., NHEJ-mediated creation of indels, may produce a non-functional, e.g., out of frame, version of the essential gene. This produces a “knock-out” cell when the editing efficiency of the nuclease is high enough to disrupt both alleles. In certain embodiments, this produces a “knock-out” cell when the editing efficiency of the nuclease is high enough to disrupt one allele. Without sufficient functional copies of the essential gene these “knock-out” cells are unable to survive and do not produce any progeny cells.
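By way of non-limiting illustration, the reference above to an “out of frame” version of the essential gene can be sketched as a check on indel length: an insertion or deletion whose length is not a multiple of three shifts the downstream reading frame. The function name is hypothetical. The sketch addresses only the reading frame, and an in-frame indel may still impair function of the gene product, which the sketch does not attempt to predict.

```python
def disrupts_reading_frame(indel_length: int) -> bool:
    """Return True when an indel of this net length shifts the reading frame.

    Positive values denote insertions and negative values denote deletions,
    both counted in base pairs. Codons are three bases long, so only net
    length changes that are multiples of three preserve the frame.
    """
    return indel_length % 3 != 0


if __name__ == "__main__":
    for length in (-1, 2, -3, 6):
        print(length, disrupts_reading_frame(length))
```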

[0535] Since the “knock-in” cells survive and the “knock-out” cells do not survive, the method automatically selects for the “knock-in” cells when it is applied to a population of starting cells. Significantly, in certain embodiments, the method does not require high knock-in efficiencies because of this automatic selection aspect. It is therefore particularly suitable for methods where the donor template is a dsDNA (e.g., a plasmid) where knock-in efficiencies are often below 5%. As noted in the exemplary method of Fig. 3C, in some embodiments some of the cells in the population of starting cells may remain unedited, i.e., unaffected by the nuclease. These cells would also survive and produce progeny with genomes that do not include the exogenous coding sequence for the gene product of interest. When the nuclease editing efficiency is high, e.g., about 60-90% or higher, the percentage of unedited cells will be relatively low as compared to the percentage of genetically modified cells. In some embodiments, high nuclease editing efficiencies (e.g., greater than 65%, greater than 70%, greater than 75%, greater than 80%, greater than 85%, greater than 90%, or greater than 95%) facilitate efficient population-wide transgene integration, as the percentage of unedited cells will be relatively low as compared to the percentage of genetically modified cells. In some embodiments of the methods disclosed herein, at least about 65% of the cells (e.g., about 70%, about 75%, about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, or about 99% of the cells) are edited by a nuclease, e.g., a Cas12a or Cas9. In some embodiments, an RNP containing a CRISPR nuclease (e.g., Cas9 or Cas12a) and a guide is capable of cleaving the locus of an essential gene (e.g., a terminal exon in the locus of any essential gene provided in Table 3) in at least 65% of the cells in a population of cells (e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the cells in a population of cells). In some embodiments, editing efficiency is determined prior to target cell die off, e.g., at day 1 and/or day 2 post transfection or transduction. In some embodiments, editing efficiency measured at day 1 and/or day 2 post transfection or transduction may not capture the complete proportion of cells for which editing occurred, as in some embodiments, certain editing events may result in near immediate and/or swift cell death. In some embodiments, near immediate and/or swift cell death may be any period of time less than 48 hours post transfection or transduction, for example, less than 48 hours, less than 44 hours, less than 40 hours, less than 36 hours, less than 32 hours, less than 28 hours, less than 24 hours, less than 20 hours, less than 16 hours, less than 15 hours, less than 14 hours, less than 13 hours, less than 12 hours, less than 11 hours, less than 10 hours, less than 9 hours, less than 8 hours, less than 7 hours, less than 6 hours, less than 5 hours, less than 4 hours, less than 3 hours, less than 2 hours, or less than 1 hour after transfection or transduction.
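By way of non-limiting illustration, the selection argument above can be sketched with a deliberately simplified, per-cell model. The model assumes (these are assumptions for illustration, not statements of the present disclosure) that every edited cell lacking a correct knock-in dies, that unedited cells and knock-in cells survive and grow equally, and that allele-level outcomes such as heterozygous edits or tolerated in-frame indels are ignored. Under these assumptions the knock-in fraction among survivors is k / (k + 1 - e), where e is the fraction of cells edited by the nuclease and k is the fraction of all starting cells carrying a correct knock-in. The function and parameter names are hypothetical.

```python
def knock_in_fraction_after_selection(editing_efficiency: float,
                                      knock_in_efficiency: float) -> float:
    """Fraction of surviving cells that carry the knock-in (toy model).

    editing_efficiency:  fraction of starting cells cut by the nuclease (e).
    knock_in_efficiency: fraction of starting cells with correct HDR
                         integration (k); cannot exceed e.
    Edited cells without a knock-in (e - k) are assumed to die.
    """
    if not 0.0 <= knock_in_efficiency <= editing_efficiency <= 1.0:
        raise ValueError("require 0 <= knock_in_efficiency <= editing_efficiency <= 1")
    survivors = knock_in_efficiency + (1.0 - editing_efficiency)
    if survivors == 0.0:
        raise ValueError("no cells survive under these inputs")
    return knock_in_efficiency / survivors


if __name__ == "__main__":
    # A 5% knock-in efficiency at several nuclease editing efficiencies.
    for e in (0.60, 0.90, 0.99):
        print(e, round(knock_in_fraction_after_selection(e, 0.05), 3))
```

In this toy model, a 5% knock-in efficiency yields about 11% knock-in cells among survivors at 60% editing efficiency, about 33% at 90%, and about 83% at 99%, which is consistent with the observation above that high nuclease editing efficiency reduces the relative share of unedited cells.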

[0536] In some embodiments, the nuclease causes a double-strand break. In some embodiments the nuclease causes a single-strand break, e.g., in some embodiments the nuclease is a nickase. In some embodiments the nuclease is a prime editor which comprises a nickase domain fused to a reverse transcriptase domain. In some embodiments the nuclease is an RNA-guided prime editor and the gRNA comprises the donor template. In some embodiments a dual-nickase system is used which causes a double-strand break via two single-strand breaks on opposing strands of a double-stranded DNA, e.g., genomic DNA of the cell.

[0537] In some embodiments, the present disclosure provides methods suitable for high-efficiency knock-in (e.g., a high proportion of a cell population comprises a knock-in allele), overcoming a major manufacturing challenge. Historically, gene of interest knock-in using plasmid vectors results in efficiencies typically between 0.1 and 5% (see e.g., Zhu et al., CRISPR/Cas-Mediated Selection-free Knockin Strategy in Human Embryonic Stem Cells. Stem Cell Reports. 2015;4(6):1103-1111). This low knock-in efficiency can result in a need for extensive time and resources devoted to screening potentially edited clones.

[0538] In some embodiments, a gene of interest knocked into a cell may have a role in effector function, specificity, stealth, persistence, homing/chemotaxis, and/or resistance to certain chemicals (see for example, Saetersmoen et al., Seminars in Immunopathology, 2019).

[0539] In certain embodiments, the present disclosure provides methods for creation of knock-in cells that maintain high levels of expression regardless of age, differentiation status, and/or exogenous conditions. For example, in some embodiments, an integrated cargo is expressed at an optimal level with a desired subcellular localization as a function of an insertion site. In some embodiments, the present disclosure provides such cells.

Systems for Editing the Genome of a Cell

[0540] In one aspect the present disclosure provides systems for editing the genome of a cell. In some embodiments, the system comprises the cell, a nuclease that causes a break within an endogenous coding sequence of an essential gene of the cell, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell, and a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3') of an exogenous coding sequence or partial coding sequence of the essential gene.

[0541] In some embodiments, the nuclease causes a double-strand break. In some embodiments the nuclease causes a single-strand break, e.g., in some embodiments the nuclease is a nickase. In some embodiments the nuclease is a prime editor which comprises a nickase domain fused to a reverse transcriptase domain. In some embodiments the nuclease is an RNA-guided prime editor and the gRNA comprises the donor template. In some embodiments a dual-nickase system is used which causes a double-strand break via two single-strand breaks on opposing strands of a double-stranded DNA, e.g., genomic DNA of the cell.

[0542] Genome editing systems can be implemented (e.g., administered or delivered to a cell or a subject) in a variety of ways, and different implementations may be suitable for distinct applications. For instance, a genome editing system is implemented, in certain embodiments, as a protein/RNA complex (a ribonucleoprotein, or RNP). In certain embodiments, a genome editing system is implemented as one or more nucleic acids encoding an RNA-guided nuclease and guide RNA components described herein (optionally with one or more additional components); in certain embodiments, a genome editing system is implemented as one or more vectors comprising such nucleic acids, for instance a viral vector such as an adeno-associated virus; and in certain embodiments, a genome editing system is implemented as a combination of any of the foregoing. Additional or modified implementations that operate according to the principles set forth herein will be apparent to the skilled artisan and are within the scope of this disclosure.

[0543] In some embodiments, methods as described herein include performing certain steps in at least duplicate. For example, in some embodiments, integration of certain gene products of interest, particularly including multiple genes of interest or a large number of exogenous gene sequences, may result in an initial selection round that results in a lower than desired level of targeted integration. In certain embodiments, lower than desirable levels of nuclease activity and/or of knock-in cassette targeted integration may result in a lower than desirable percentage of surviving cells and/or cells comprising the knock-in cassette; this may make identifying a cell with the genetic payload difficult. In some embodiments, to further enrich for the population of edited cells, cells are optionally expanded and then re-edited by providing the pool of edited cells with either both RNP and donor templates (e.g., one or more RNP particles targeting one or more loci, and one or more donor templates designed for targeted integration at one or more loci), or just RNP alone (e.g., one or more RNP that utilize residual donor template).

[0544] In some embodiments, where multiple rounds of RNP and/or donor template editing are performed, enrichment is effected by: i) removing cells that have not incorporated the genetic payload and/or ii) creating more cells with incorporated knock-in cassette. In some embodiments, an additional enrichment step, depending on the cargo, on whether multiple constructs are used, on the target within the essential gene, or on other factors, can lead to at least about a two-fold, three-fold, four-fold, five-fold, or higher improvement in the percentage of cells incorporating the knock-in cassette from the donor template. In some embodiments, such enrichment can lead to uptake of the “cargo” within the essential gene of mammalian cells of greater than 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or greater than 95%.
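
The fold improvement described above is a simple ratio of knock-in percentages measured before and after an additional editing round. A minimal Python sketch follows; the function name and the example percentages are hypothetical and are used only to illustrate the arithmetic, not to report any measured result.

```python
def fold_enrichment(pct_before: float, pct_after: float) -> float:
    """Ratio of the knock-in percentage after an extra editing round
    to the knock-in percentage before it."""
    if pct_before <= 0:
        raise ValueError("pct_before must be positive")
    return pct_after / pct_before


# Hypothetical values: 5% knock-in after round one, 20% after re-editing.
example_fold = fold_enrichment(5.0, 20.0)
print(f"{example_fold:.1f}-fold enrichment")
```

A ratio of 2.0 or more would correspond to the "at least about two-fold" improvement recited above.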

[0545] In some embodiments, donor templates (e.g., donor nucleic acid constructs) comprise the transgene flanked by a first homologous region (HR), e.g., a homology arm, and a second HR, e.g., a second homology arm, designed to anneal to a first genomic region (GR) and a second GR within an essential gene of a cell. To be able to anneal, the HRs and GRs need not be perfectly homologous. In some embodiments, examples include a non-inhibitory small number (less than 6 and as few as 1) of mutations in the PAM 5’ of the transgene in the knock-in cassette. In some embodiments, other non-inhibitory changes include codon optimization, wherein unnecessary nucleotides in the wildtype exon are removed from the nucleotide sequence in the knock-in cassette. In some embodiments, other such silent PAM-blocking mutations or codon modifications that prevent cleavage of the donor nucleic acid construct by the nuclease are further contemplated. In some embodiments, at least about 90% homology is sufficient for functional annealing for purposes of the examples herein. In some embodiments, the level of homology between the HR and GR is more than 90%, more than 92%, more than 94%, more than 96%, more than 98%, or more than 99%. Other embodiments and the concepts set forth in this paragraph are contemplated and subsumed in the term “essentially homologous.”
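
The homology threshold in this paragraph can be illustrated with a short Python sketch. It assumes, as a simplification not stated in the disclosure, that the HR and GR have already been aligned without gaps and are of equal length; the function names and the default 90% threshold parameter are hypothetical.

```python
def percent_identity(hr: str, gr: str) -> float:
    """Percentage of matching positions between two pre-aligned,
    equal-length nucleotide sequences (ungapped comparison)."""
    if not hr or len(hr) != len(gr):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(hr.upper(), gr.upper()))
    return 100.0 * matches / len(hr)


def is_essentially_homologous(hr: str, gr: str, threshold: float = 90.0) -> bool:
    """True when the homologous region meets the identity threshold
    (assumed here to be 90%, following the paragraph above)."""
    return percent_identity(hr, gr) >= threshold


genomic = "ACGTACGTACGTACGTACGT"
arm = "ACGTACGTACCTACGTACGT"  # one hypothetical PAM-blocking mismatch
print(percent_identity(arm, genomic), is_essentially_homologous(arm, genomic))
```

Real homology arms would normally be compared with a pairwise alignment tool that handles insertions and deletions; the ungapped count here only shows how a handful of blocking mutations still leaves an arm above the threshold.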

Genetically Modified Cells

[0546] In one aspect the present disclosure provides genetically modified cells or engineered cells including populations of such cells and progeny of such cells.

[0547] In some embodiments, the cell is produced by a method of the present disclosure, e.g., a method that comprises contacting the cell with a nuclease that causes a break within an endogenous coding sequence of an essential gene in the cell wherein the essential gene encodes at least one gene product that is required for survival and/or proliferation of the cell. The cell is also contacted with a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and downstream (3') of an exogenous coding sequence or partial coding sequence of the essential gene. The knock-in cassette is integrated into the genome of the cell by homology-directed repair (HDR) of the break, resulting in a genome-edited cell that expresses the gene product of interest and the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof. This is illustrated in Fig. 3 for an exemplary method. In some embodiments, a cell is contacted with a donor template that comprises a knock-in cassette comprising an exogenous coding sequence for a gene product of interest in frame with and upstream (5') of an exogenous coding sequence or partial coding sequence of the essential gene.

[0548] In some embodiments, the cell comprises a genome with an exogenous coding sequence for a gene product of interest in frame with and downstream (3') of a coding sequence of an essential gene, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell.

[0549] In some embodiments, the cell comprises a genome with an exogenous coding sequence for a gene product of interest in frame with and upstream (5') of a coding sequence of an essential gene, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell.

[0550] In some embodiments, the cell comprises a genomic modification, wherein the genomic modification comprises an insertion of an exogenous knock-in cassette within an endogenous coding sequence of an essential gene in the cell’s genome, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell, wherein the knock-in cassette comprises an exogenous coding sequence for a gene product of interest in frame with and downstream (3') of an exogenous coding sequence or partial coding sequence encoding the gene product of the essential gene, or a functional variant thereof, and wherein the cell expresses the gene product of interest and the gene product encoded by the essential gene that is required for survival and/or proliferation of the cell, or a functional variant thereof. In some embodiments, the gene product of interest and the gene product encoded by the essential gene are expressed from the endogenous promoter of the essential gene.

Donor template

[0551] In one aspect the present disclosure provides a donor template comprising a knock-in cassette with an exogenous coding sequence for a gene product of interest in frame with and downstream (3') of an exogenous coding sequence or partial coding sequence of an essential gene, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell.

[0552] In one aspect the present disclosure provides an impetus for designing donor templates comprising a knock-in cassette with an exogenous coding sequence for a gene product of interest in frame with and upstream (5') of an exogenous coding sequence or partial coding sequence of an essential gene, wherein the essential gene encodes a gene product that is required for survival and/or proliferation of the cell; see e.g., Fig. 3D.

[0553] In some embodiments, the donor template is for use in editing the genome of a cell by homology-directed repair (HDR).

[0554] Donor template design is described in detail in the literature, for instance in PCT Publication No. WO2016/073990A1. Donor templates can be single-stranded or double-stranded and can be used to facilitate HDR-based repair of double-strand breaks (DSBs), and are particularly useful for inserting a new sequence into the target sequence, or replacing the target sequence altogether. In some embodiments, the donor template is a donor DNA template. In some embodiments the donor DNA template is double-stranded.

[0555] Whether single-stranded or double-stranded, donor templates generally include regions that are homologous to regions of DNA within or near (e.g., flanking or adjoining) a target sequence to be cleaved. These homologous regions are referred to herein as “homology arms,” and are illustrated schematically below relative to the knock-in cassette (which may be separated from one or both of the homology arms by additional spacer sequences that are not shown):

[0556] [5' homology arm] - [knock-in cassette] - [3' homology arm]

[0557] The homology arms can have any suitable length (including 0 nucleotides if only one homology arm is used), and 5' and 3' homology arms can have the same length, or can differ in length. The selection of appropriate homology arm lengths can be influenced by a variety of factors, such as the desire to avoid homologies or microhomologies with certain sequences such as Alu repeats or other very common elements. For example, a 5' homology arm can be shortened to avoid a sequence repeat element. In other embodiments, a 3' homology arm can be shortened to avoid a sequence repeat element. In some embodiments, both the 5' and the 3' homology arms can be shortened to avoid including certain sequence repeat elements.
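
The schematic layout of paragraph [0556] and the arm-length options of paragraph [0557] can be modelled with a small data structure. The Python sketch below is illustrative only; the class name, field names, and placeholder sequences are hypothetical, and spacer sequences are omitted, as in the schematic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DonorTemplate:
    """[5' homology arm] - [knock-in cassette] - [3' homology arm].
    Either arm may be an empty string when only one arm is used."""
    five_prime_arm: str
    knock_in_cassette: str
    three_prime_arm: str

    def sequence(self) -> str:
        # Concatenate the three parts in 5'-to-3' order.
        return self.five_prime_arm + self.knock_in_cassette + self.three_prime_arm

    def arms_symmetrical(self) -> bool:
        # Arms are "symmetrical" when they have the same length.
        return len(self.five_prime_arm) == len(self.three_prime_arm)


# Hypothetical placeholder sequences, far shorter than real arms.
donor = DonorTemplate("ACGTACGT", "ATGGCTTAA", "TTGCA")
print(donor.sequence(), donor.arms_symmetrical())
```

Shortening one arm to avoid a repeat element, as described above, simply changes the length of the corresponding field and makes the arms asymmetrical.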

[0558] In some embodiments, more than one donor template can be administered to a cell population. In some embodiments, the donor templates are different, for example, each donor template facilitates knock-in of “cargo” sequences encoding different gene products of interest. In some embodiments, the donor templates can be provided at the same time and their payloads incorporated into the same essential gene (e.g., one incorporated at one allele, the other incorporated at the other allele). In some embodiments, this may be particularly advantageous when a particular transgene system and/or gene product of interest has functional sequences that require them to be separated into different alleles of an essential gene. Further, in some embodiments, having multiple copies of gene targets of interest that are different but accomplish a similar goal, e.g., copies of safety switches, can be helpful to assure the functionality and creation of a corresponding phenotype. In some embodiments, more than one copy of a safety switch can ensure elimination of cells when necessary. Further, in some embodiments, certain safety switches require dimerization to function as a suicide switch system (e.g., as described herein). In some embodiments, when more than one donor template is administered to a cell population, such donor templates may be designed to integrate at the same genetic locus, or at different genetic loci.

[0559] A donor template can be a nucleic acid vector, such as a viral genome or circular double-stranded DNA, e.g., a plasmid. Nucleic acid vectors comprising donor templates can include other coding or non-coding elements. For example, a donor template nucleic acid can be delivered as part of a viral genome (e.g., in an AAV, adenoviral, Sendai virus, or lentiviral genome) that includes certain genomic backbone elements (e.g., inverted terminal repeats, in the case of an AAV genome). In some embodiments, a donor template is comprised in a plasmid that has not been linearized. In some embodiments, a donor template is comprised in a plasmid that has been linearized. In some embodiments, a donor template is comprised within a linear dsDNA fragment. In some embodiments, a donor template nucleic acid can be delivered as part of an AAV genome. In some embodiments, a donor template nucleic acid can be delivered as a single stranded oligo donor (ssODN), for example, as a long multi-kb ssODN derived from M13 phage synthesis, or alternatively, short ssODNs, e.g., that comprise small genes of interest, tags, and/or probes. In some embodiments, a donor template nucleic acid can be delivered as a Doggybone™ DNA (dbDNA™) template. In some embodiments, a donor template nucleic acid can be delivered as a DNA minicircle. In some embodiments, a donor template nucleic acid can be delivered as an Integration-deficient Lentiviral Particle (IDLV). In some embodiments, a donor template nucleic acid can be delivered as a MMLV-derived retrovirus. In some embodiments, a donor template nucleic acid can be delivered as a piggyBac™ sequence. In some embodiments, a donor template nucleic acid can be delivered as a replicating EBNA1 episome.

[0560] In certain embodiments, the 5' homology arm may be about 25 to about 1,000 base pairs in length, e.g., at least about 100, 200, 400, 600, or 800 base pairs in length. In certain embodiments, the 5' homology arm comprises about 50 to 800 base pairs, e.g., 100 to 800, 200 to 800, 400 to 800, 400 to 600, or 600 to 800 base pairs. In certain embodiments, the 3' homology arm may be about 25 to about 1,000 base pairs in length, e.g., at least about 100, 200, 400, 600, or 800 base pairs in length. In certain embodiments, the 3' homology arm comprises about 50 to 800 base pairs, e.g., 100 to 800, 200 to 800, 400 to 800, 400 to 600, or 600 to 800 base pairs. In certain embodiments, the 5' and 3' homology arms are symmetrical in length. In certain embodiments, the 5' and 3' homology arms are asymmetrical in length.

[0561] In certain embodiments, a 5' homology arm is less than about 3,000 base pairs, less than about 2,900 base pairs, less than about 2,800 base pairs, less than about 2,700 base pairs, less than about 2,600 base pairs, less than about 2,500 base pairs, less than about 2,400 base pairs, less than about 2,300 base pairs, less than about 2,200 base pairs, less than about 2,100 base pairs, less than about 2,000 base pairs, less than about 1,900 base pairs, less than about 1,800 base pairs, less than about 1,700 base pairs, less than about 1,600 base pairs, less than about 1,500 base pairs, less than about 1,400 base pairs, less than about 1,300 base pairs, less than about 1,200 base pairs, less than about 1,100 base pairs, less than about 1,000 base pairs, less than about 900 base pairs, less than about 800 base pairs, less than about 700 base pairs, less than about 600 base pairs, less than about 500 base pairs, or less than about 400 base pairs.

[0562] In certain embodiments, e.g., where a viral vector is utilized to introduce a knock-in cassette through a method described herein, a 5' homology arm is less than about 1,000 base pairs, less than about 900 base pairs, less than about 800 base pairs, less than about 700 base pairs, less than about 600 base pairs, less than about 500 base pairs, less than about 400 base pairs, or less than about 300 base pairs. In certain embodiments, e.g., where a viral vector is utilized to introduce a knock-in cassette through a method described herein, a 5' homology arm is about 400-600 base pairs, e.g., about 500 base pairs.

[0563] In certain embodiments, a 3' homology arm is less than about 3,000 base pairs, less than about 2,900 base pairs, less than about 2,800 base pairs, less than about 2,700 base pairs, less than about 2,600 base pairs, less than about 2,500 base pairs, less than about 2,400 base pairs, less than about 2,300 base pairs, less than about 2,200 base pairs, less than about 2,100 base pairs, less than about 2,000 base pairs, less than about 1,900 base pairs, less than about 1,800 base pairs, less than about 1,700 base pairs, less than about 1,600 base pairs, less than about 1,500 base pairs, less than about 1,400 base pairs, less than about 1,300 base pairs, less than about 1,200 base pairs, less than about 1,100 base pairs, less than 1,000 base pairs, less than about 900 base pairs, less than about 800 base pairs, less than about 700 base pairs, less than about 600 base pairs, less than about 500 base pairs, or less than about 400 base pairs.

[0564] In certain embodiments, e.g., where a viral vector is utilized to introduce a knock-in cassette through a method described herein, a 3' homology arm is less than about 1,000 base pairs, less than about 900 base pairs, less than about 800 base pairs, less than about 700 base pairs, less than about 600 base pairs, less than about 500 base pairs, less than about 400 base pairs, or less than about 300 base pairs. In certain embodiments, e.g., where a viral vector is utilized to introduce a knock-in cassette through a method described herein, a 3' homology arm is about 400-600 base pairs, e.g., about 500 base pairs.

[0565] In certain embodiments, the 5' and 3' homology arms flank the break and are less than 100, 75, 50, 25, 15, 10 or 5 base pairs away from an edge of the break. In certain embodiments, the 5' and 3' homology arms flank an endogenous stop codon. In certain embodiments, the 5' and 3' homology arms flank a break located within about 500 base pairs (e.g., about 500 base pairs, about 450 base pairs, about 400 base pairs, about 350 base pairs, about 300 base pairs, about 250 base pairs, about 200 base pairs, about 150 base pairs, about 100 base pairs, about 50 base pairs, or about 25 base pairs) upstream (5') of an endogenous stop codon, e.g., the stop codon of an essential gene. In certain embodiments, the 5' homology arm encompasses an edge of the break.

Knock-in cassette

[0566] In some embodiments, a knock-in cassette within the donor template comprises an exogenous coding sequence for the gene product of interest in frame with and downstream (3') of an exogenous coding sequence or partial coding sequence of the essential gene. In some embodiments, a knock-in cassette within a donor template comprises an exogenous coding sequence for the gene product of interest in frame with and upstream (5') of an exogenous coding sequence or partial coding sequence of an essential gene. In some embodiments, the knock-in cassette is a polycistronic knock-in cassette. In some embodiments, the knock-in cassette is a bicistronic knock-in cassette. In some embodiments, the knock-in cassette does not comprise a reporter gene, e.g., a fluorescent reporter gene or an antibiotic resistance gene.

[0567] In some embodiments, a single essential gene locus will be targeted by two knock-in cassettes comprising different “cargo” sequences. In some embodiments, one allele will incorporate one knock-in cassette, while the other allele will incorporate the other knock-in cassette. In some embodiments, a gRNA utilized to generate an appropriate DNA break may be the same for each of the two different knock-in cassettes. In some embodiments, gRNAs utilized to generate appropriate DNA breaks for each of the two different knock-in cassettes may be different, such that the “cargo” sequence is incorporated at a different position for each allele. In some embodiments, such a different position for each allele may still be within the ultimate exon's coding region. In some embodiments, such a different position for each allele may be within the coding region of the penultimate (second to last) and/or ultimate (last) exon. In some embodiments, such a different position for at least one of the alleles may be within the first exon. In some embodiments, such a different position for at least one of the alleles may be within the first or second exon.

[0568] In order to properly restore the essential gene coding region in the genetically modified cell (so that a functioning gene product is produced) the knock-in cassette does not need to comprise an exogenous coding sequence that corresponds to the entire coding sequence of the essential gene. Indeed, depending on the location of the break in the endogenous coding sequence of the essential gene it may be possible to restore the essential gene by providing a knock-in cassette that comprises a partial coding sequence of the essential gene, e.g., that corresponds to a portion of the endogenous coding sequence of the essential gene that spans the break and the entire region downstream of the break (minus the stop codon), and/or that corresponds to a portion of the endogenous coding sequence of the essential gene that spans the break and the entire region upstream of the break (up to and optionally including the start codon).
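
The relationship between break position and the size of the partial coding sequence can be sketched in Python. The sketch assumes a spliced, intron-free coding sequence that ends in a stop codon, uses a 0-based break index, and starts the fragment at the codon containing the break; these are simplifying assumptions for illustration, and the function name and toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def rescue_fragment(cds: str, break_pos: int) -> str:
    """Return the in-frame portion of `cds` running from the codon that
    contains the break to the last sense codon (stop codon excluded)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    if not 0 <= break_pos < len(cds) - 3:
        raise ValueError("break must lie upstream of the stop codon")
    codon_start = (break_pos // 3) * 3  # snap back to the codon boundary
    return cds[codon_start:-3]


# Toy coding sequence: ATG GCT GAA AAG CTG TAA
fragment = rescue_fragment("ATGGCTGAAAAGCTGTAA", 10)
print(fragment, len(fragment) // 3, "codons")
```

The closer the break lies to the stop codon, the shorter the fragment the cassette must carry, which leaves more capacity for the cargo sequence placed in frame downstream of it.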

[0569] In order to minimize the size of the knock-in cassette it may in fact be advantageous, in some embodiments, to have the break located within the last 1500, 1000, 750, 500, 400, 300, 200, 100, or 50 base pairs of the endogenous coding sequence of the essential gene, i.e., towards the 3' end of the coding sequence. In some embodiments, a base pair’s location in a coding sequence may be defined 3'-to-5' from an endogenous translational stop signal (e.g., a stop codon). In some embodiments, as used herein, an “endogenous coding sequence” can include both exonic and intronic base pairs, and refers to gene sequence occurring 5' to an endogenous functional translational stop signal. In some embodiments, a break within an endogenous coding sequence comprises a break within one DNA strand. In some embodiments, a break within an endogenous coding sequence comprises a break within both DNA strands. In some embodiments, a break is located within the last 1000 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the last 750 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the last 600 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the last 500 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the last 400 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the last 300 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the last 250 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the last 200 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the last 150 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the last 100 base pairs of the endogenous coding sequence. 
In some embodiments, a break is located within the last 75 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the last 50 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the last 21 base pairs of the endogenous coding sequence.

[0570] In some embodiments, the exogenous partial coding sequence of the essential gene in the knock-in cassette encodes a C-terminal fragment of a protein encoded by the essential gene, e.g., a fragment that is less than 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, the exogenous partial coding sequence of the essential gene in the knock-in cassette is codon optimized. In some embodiments, the exogenous partial coding sequence of the essential gene in the knock-in cassette is codon optimized to eliminate at least one PAM site. In some embodiments, the exogenous partial coding sequence of the essential gene in the knock-in cassette is codon optimized to eliminate more than one PAM site. In some embodiments, the exogenous partial coding sequence of the essential gene in the knock-in cassette is codon optimized to eliminate all relevant nuclease specific PAM sites. In some embodiments, a C-terminal fragment of a protein encoded by the essential gene is about 140 amino acids in length. In some embodiments, a C-terminal fragment of a protein encoded by the essential gene is about 130 amino acids in length. In some embodiments, a C-terminal fragment of a protein encoded by the essential gene is about 120 amino acids in length. In some embodiments, the C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the essential gene that spans the break. In some embodiments, a C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence within 1 exon of the essential gene.
In some embodiments, a C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence within 2 exons of the essential gene. In some embodiments, a C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence within 3 exons of the essential gene. In some embodiments, a C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence within 4 exons of the essential gene. In some embodiments, a C-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence within 5 exons of the essential gene.
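
Codon optimization that eliminates a PAM site without changing the encoded protein can be checked programmatically. The Python sketch below assumes an NGG PAM (as recognized by SpCas9) read on the displayed strand only, and uses the standard genetic code; both are assumptions made for illustration, and the function names and toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def has_ngg_pam(seq: str) -> bool:
    """True if any NGG trinucleotide occurs on the given strand."""
    seq = seq.upper()
    return any(seq[i + 1:i + 3] == "GG" for i in range(len(seq) - 2))


def is_silent_pam_block(endogenous: str, recoded: str) -> bool:
    """True if recoding removes every NGG site present on this strand
    while leaving the encoded amino acid sequence unchanged."""
    return (translate(endogenous) == translate(recoded)
            and has_ngg_pam(endogenous)
            and not has_ngg_pam(recoded))


# Toy example: CTG AAG GAT (L K D) contains AGG; AAG -> AAA keeps lysine.
print(is_silent_pam_block("CTGAAGGAT", "CTGAAAGAT"))
```

A fuller check would also scan the reverse complement and use the PAM of whichever nuclease is actually employed.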

[0571] In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette encodes a C-terminal fragment of a protein encoded by an essential gene, e.g., a fragment that is less than 500, 250, 150, 125, 100, 75, 50, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, or 7 amino acids in length. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette encodes a 20 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette encodes a 19 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette encodes an 18 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette encodes a 17 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette encodes a 16 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette encodes a 1 amino acid C-terminal fragment of a protein encoded by an essential gene.

[0572] In some embodiments, e.g., when the essential gene includes many exons as shown in the exemplary method of Fig. 3A, it may be advantageous to have the break within the last exon of the essential gene. In some embodiments, e.g., when the essential gene includes many exons as shown in the exemplary method of Fig. 3A, it may be advantageous to have the break within the penultimate exon of the essential gene. It is to be understood however that the present disclosure is not limited to any particular location for the break and that the available positions will vary depending on the nature and length of the essential gene and the length of the exogenous coding sequence for the gene product of interest. For example, for essential genes that include a few exons or when the gene product of interest is small it may be possible to locate the break in an upstream exon.

[0573] In order to minimize the size of the knock-in cassette it may in fact be advantageous, in some embodiments, to have the break located within the first 1500, 1000, 750, 500, 400, 300, 200, 100, or 50 base pairs of an endogenous coding sequence of the essential gene, i.e., starting from the 5' end of a coding sequence. In some embodiments, a base pair’s location in a coding sequence may be defined 5'-to-3' from an endogenous translational start signal (e.g., a start codon). In some embodiments, as used herein, an “endogenous coding sequence” can include both exonic and intronic base pairs, and refers to gene sequence occurring 3' to an endogenous functional translational start signal. In some embodiments, a break within an endogenous coding sequence comprises a break within one DNA strand. In some embodiments, a break within an endogenous coding sequence comprises a break within both DNA strands. In some embodiments, a break is located within the first 1000 base pairs of the endogenous coding sequence.
In some embodiments, a break is located within the first 750 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the first 600 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the first 500 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the first 400 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the first 300 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the first 250 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the first 200 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the first 150 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the first 100 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the first 75 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the first 50 base pairs of the endogenous coding sequence. In some embodiments, a break is located within the first 21 base pairs of the endogenous coding sequence.

[0574] In some embodiments, the exogenous partial coding sequence of the essential gene in the knock-in cassette encodes an N-terminal fragment of a protein encoded by the essential gene, e.g., a fragment that is less than 500, 250, 150, 125, 100, 75, 50, 25, 20, 15 or 10 amino acids in length. In some embodiments, an N-terminal fragment of a protein encoded by the essential gene is about 140 amino acids in length. In some embodiments, an N-terminal fragment of a protein encoded by the essential gene is about 130 amino acids in length. In some embodiments, an N-terminal fragment of a protein encoded by the essential gene is about 120 amino acids in length. In some embodiments, an N-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence of the essential gene that spans the break. In some embodiments, an N-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence within 1 exon of the essential gene. In some embodiments, an N-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence within 2 exons of the essential gene. In some embodiments, an N-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence within 3 exons of the essential gene. In some embodiments, an N-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence within 4 exons of the essential gene. In some embodiments, an N-terminal fragment includes an amino acid sequence that is encoded by a region of the endogenous coding sequence within 5 exons of the essential gene.

[0575] In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette encodes an N-terminal fragment of a protein encoded by an essential gene, e.g., a fragment that is less than 500, 250, 150, 125, 100, 75, 50, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, or 7 amino acids in length. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette encodes a 20 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette encodes a 19 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette encodes an 18 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette encodes a 17 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette encodes a 16 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette encodes a 1 amino acid N-terminal fragment of a protein encoded by an essential gene.

[0576] In some embodiments, the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette is less than 100% identical to the corresponding endogenous coding sequence of the essential gene of the cell, e.g., less than 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55% or less than 50% (i.e., when the two sequences are aligned using a standard pairwise sequence alignment tool that maximizes the alignment between the corresponding sequences). For example, in some embodiments, the exogenous coding sequence or partial coding sequence of the essential gene in the knock-in cassette is codon optimized relative to the corresponding endogenous coding sequence of the essential gene of the cell, e.g., to prevent further binding of a nuclease to the target site. Alternatively or additionally, it may be codon optimized to reduce the likelihood of recombination after integration of the knock-in cassette into the genome of the cell and/or to increase expression of the gene product of the essential gene and/or the gene product of interest after integration of the knock-in cassette into the genome of the cell.
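The percent identity comparison described above can be sketched as follows. This Python example is illustrative only and is not part of the disclosure. It assumes the two sequences have already been aligned by a pairwise alignment tool (with gaps written as "-"), and it assumes one common convention in which identity is the number of matching columns divided by the alignment length; alignment tools differ in the denominator they use. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identity = matching columns / alignment columns, where a
    column containing a gap in either sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")  # ignore columns that are gaps in both
    ]
    if not columns:
        raise ValueError("alignment contains no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    endogenous = "ATGGCCAAG"  # toy sequence encoding Met-Ala-Lys
    recoded = "ATGGCTAAA"     # same peptide, two synonymous base changes
    print(round(percent_identity(endogenous, recoded), 1))
```

In this toy case two of nine bases differ, so the recoded sequence is roughly 77.8% identical to the original while encoding the same three amino acids.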

[0577] In some embodiments, a knock-in cassette comprises one or more nucleotides or base pairs that differ (e.g., are mutations) relative to an endogenous knock-in site. In some embodiments, such mutations in a knock-in cassette provide resistance to cutting by a nuclease. In some embodiments, such mutations in a knock-in cassette prevent a nuclease from cutting the target loci following homologous recombination. In some embodiments, such mutations in a knock-in cassette occur within one or more coding and/or non-coding regions of a target gene. In some embodiments, such mutations in a knock-in cassette are silent mutations. In some embodiments, such mutations in a knock-in cassette are silent and/or missense mutations.

[0578] In some embodiments, such mutations in a knock-in cassette occur within a target protospacer motif and/or a target protospacer adjacent motif (PAM) site. In some embodiments, a knock-in cassette includes a target protospacer motif and/or a PAM site that are saturated with silent mutations. In some embodiments, a knock-in cassette includes a target protospacer motif and/or a PAM site that are approximately 30%, 40%, 50%, 60%, 70%, 80%, or 90% saturated with silent mutations. In some embodiments, a knock-in cassette includes a target protospacer motif and/or a PAM site that are saturated with silent and/or missense mutations. In some embodiments, a knock-in cassette includes a target protospacer motif and/or a PAM site that comprise at least one mutation, at least 2 mutations, at least 3 mutations, at least 4 mutations, at least 5 mutations, at least 6 mutations, at least 7 mutations, at least 8 mutations, at least 9 mutations, at least 10 mutations, at least 11 mutations, at least 12 mutations, at least 13 mutations, at least 14 mutations, or at least 15 mutations.
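The distinction between silent and missense mutations in a recoded target site can be illustrated with a small classifier. This Python sketch is illustrative only and is not part of the disclosure. It assumes the standard genetic code, that both sequences are in frame and of equal length, and that the toy sequences shown are hypothetical; the function name `classify_mutations` is likewise an assumption.

```python
# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_mutations(endogenous: str, recoded: str) -> dict:
    """Count base changes and classify each altered codon.

    Both inputs are assumed to be in-frame coding sequence of equal length.
    """
    endogenous, recoded = endogenous.upper(), recoded.upper()
    if len(endogenous) != len(recoded) or len(endogenous) % 3:
        raise ValueError("sequences must be equal length and a multiple of 3")
    counts = {"base_changes": 0, "silent_codons": 0,
              "missense_codons": 0, "stop_changes": 0}
    for i in range(0, len(endogenous), 3):
        old, new = endogenous[i:i + 3], recoded[i:i + 3]
        if old == new:
            continue
        counts["base_changes"] += sum(a != b for a, b in zip(old, new))
        old_aa, new_aa = CODON_TABLE[old], CODON_TABLE[new]
        if old_aa == new_aa:
            counts["silent_codons"] += 1    # same amino acid encoded
        elif "*" in (old_aa, new_aa):
            counts["stop_changes"] += 1     # stop codon gained or lost
        else:
            counts["missense_codons"] += 1  # different amino acid encoded
    return counts


if __name__ == "__main__":
    # Toy target site (Gly-Tyr-Ser-Asn-Arg) and a synonymous recoding of it.
    print(classify_mutations("GGCTACAGCAACAGG", "GGATATAGCAATAGA"))
```

In the toy example, four of the five codons are changed at their third position and all four changes are silent, so the encoded peptide is unchanged while the nucleotide sequence a nuclease would recognize differs at four positions.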

[0579] In some embodiments, certain codons encoding certain amino acids in a target site cannot be mutated through codon-optimization without losing some portion of an endogenous protein's natural function. In some embodiments, certain codons encoding certain amino acids in a target site cannot be mutated through codon-optimization.

[0580] In some embodiments, the knock-in cassette is codon optimized in only a portion of the coding sequence. For example, in some embodiments, a knock-in cassette encodes a C-terminal fragment of a protein encoded by an essential gene, e.g., a fragment that is less than 500, 250, 150, 125, 100, 75, 50, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, or 7 amino acids in length. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 20 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 19 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes an 18 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 17 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 16 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 15 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 14 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 13 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 12 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes an 11 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 10 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 9 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes an 8 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 7 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 6 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 5 amino acid C-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a C-terminal fragment, less than 5 amino acids in length, of a protein encoded by an essential gene.

[0581] In some embodiments, the knock-in cassette is codon optimized in only a portion of the coding sequence. For example, in some embodiments, a knock-in cassette encodes an N-terminal fragment of a protein encoded by an essential gene, e.g., a fragment that is less than 500, 250, 150, 125, 100, 75, 50, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, or 7 amino acids in length. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 20 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 19 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes an 18 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 17 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 16 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 15 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 14 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 13 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 12 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes an 11 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 10 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 9 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes an 8 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 7 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 6 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes a 5 amino acid N-terminal fragment of a protein encoded by an essential gene. In some embodiments, the exogenous partial coding sequence of an essential gene in a knock-in cassette that has been codon optimized encodes an N-terminal fragment, less than 5 amino acids in length, of a protein encoded by an essential gene.

[0582] In some embodiments, the knock-in cassette comprises one or more sequences encoding a linker peptide, e.g., between an exogenous coding sequence or partial coding sequence of the essential gene and a “cargo” sequence and/or a regulatory element described herein. Such linker peptides are known in the art, any of which can be included in a knock-in cassette described herein. In some embodiments, the linker peptide comprises the amino acid sequence GSG.

[0583] In some embodiments, the knock-in cassette comprises other regulatory elements such as a polyadenylation sequence, and optionally a 3' UTR sequence, downstream of the exogenous coding sequence for the gene product of interest. If a 3' UTR sequence is present, the 3' UTR sequence is positioned 3' of the exogenous coding sequence and 5' of the polyadenylation sequence.

[0584] In some embodiments, the knock-in cassette comprises other regulatory elements such as a 5' UTR and a start codon, upstream of the exogenous coding sequence for the gene product of interest. If a 5' UTR sequence is present, the 5' UTR sequence is positioned 5' of the “cargo” sequence and/or exogenous coding sequence.

Exemplary Homology Arms (HA)

[0585] In certain embodiments, a donor template comprises a 5' and/or 3' homology arm homologous to a region of a GAPDH locus. In some embodiments, a donor template comprises a 5' homology arm comprising or consisting of the sequence of SEQ ID NO: 1, 2, or 3. In some embodiments, a 5' homology arm comprises or consists of a sequence that is at least 85%, 90%, 95%, 98% or 99% identical to the sequence of SEQ ID NO: 1, 2, or 3. In some embodiments, a donor template comprises a 3' homology arm comprising or consisting of the sequence of SEQ ID NO: 4 or 5. In certain embodiments, a 3' homology arm comprises or consists of a sequence that is at least 85%, 90%, 95%, 98% or 99% identical to the sequence of SEQ ID NO: 4 or 5.

[0586] In some embodiments, a donor template comprises a 5' homology arm comprising SEQ ID NO: 1, and a 3' homology arm comprising SEQ ID NO: 4. In some embodiments, a donor template comprises a 5' homology arm comprising SEQ ID NO: 2, and a 3' homology arm comprising SEQ ID NO: 4. In some embodiments, a donor template comprises a 5' homology arm comprising SEQ ID NO: 3, and a 3' homology arm comprising SEQ ID NO: 5.

[0587] In some embodiments, a stretch of sequence flanking a nuclease cleavage site may be duplicated in both a 5' and 3' homology arm. In some embodiments, such a duplication is designed to optimize HDR efficiency. In some embodiments, one of the duplicated sequences may be codon optimized, while the other sequence is not codon optimized. In some embodiments, both of the duplicated sequences may be codon optimized. In some embodiments, codon optimization may remove a target PAM site. In some embodiments, a duplicated sequence may be no more than: 100 bp in length, 90 bp in length, 80 bp in length, 70 bp in length, 60 bp in length, 50 bp in length, 40 bp in length, 30 bp in length, or 20 bp in length.

SEQ ID NO: 1 - exemplary 5' HA for knock-in cassette insertion at GAPDH locus

SEQ ID NO: 2 - exemplary 5' HA for knock-in cassette insertion at GAPDH locus

SEQ ID NO: 3 - exemplary 5' HA for knock-in cassette insertion at GAPDH locus

GGCTTTCCCATAATTTCCTTTCAAGGTGGGGAGGGAGGTAGAGGGGTGATGTGGGGA GTACGCT

GCAGGGCCTCACTCCTTTTGCAGACCACAGTCCATGCCATCACTGCCACCCAGAAGA CTGTGGA TGGCCCCTCCGGGAAACTGTGGCGTGATGGCCGCGGGGCTCTCCAGAACATCATCCCTGC CTCT ACTGGCGCTGCCAAGGCTGTGGGCAAGGTCATCCCTGAGCTGAACGGGAAGCTCACTGGC ATGG CCTTCCGTGTCCCCACTGCCAACGTGTCAGTGGTGGACCTGACCTGCCGTCTAGAAAAAC CTGC CAAATATGATGACATCAAGAAGGTGGTGAAGCAGGCGTCGGAGGGCCCCCTCAAGGGCAT CCTG GGCTACACTGAGCACCAGGTGGTCTCCTCTGACTTCAACAGCGACACCCACTCCTCCACC TTTG ACGCTGGGGCTGGCATTGCCCTCAACGACCACTTTGTCAAGCTCATCTCTTGGTACGACA ATGA G T T C G GAT AT AG C AAT AGAG T G G T C GAT C T GAT G G C T CAT AT G G C TAG C AAAGAG

SEQ ID NO: 4 - exemplary 3' HA for knock-in cassette insertion at GAPDH locus

ATTTGGCTACAGCAACAGGGTGGTGGACCTCATGGCCCACATGGCCTCCAAGGAGTA AGACCCC T G G AC CACCAGCCCCAG C AAG AG C AC AAG AG G AAG AG AG AG AC CCTCACTGCTGGG GAG T C C C T GCCACACTCAGTCCCCCACCACACTGAATCTCCCCTCCTCACAGTTGCCATGTAGACCCC TTGA AGAGGGGAGGGGCCTAGGGAGCCGCACCTTGTCATGTACCATCAATAAAGTACCCTGTGC TCAA CCAGTTACTTGTCCTGTCTTATTCTAGGGTCTGGGGCAGAGGGGAGGGAAGCTGGGCTTG TGTC AAGGTGAGACATTCTTGCTGGGGAGGGACCTGGTATGTTCTCCTCAGACTGAGGGTAGGG CCTC CAAACAGCCTTGCTTGCTTCGAGAACCATTTGCTTCCCGCTCAGACGTCTTGAGTGCTAC AGGA AGCTGGCACCACTACTTCAGAGAACAAGGCCTTTTCCTCTCCTCGCTCCAGT

SEQ ID NO: 5 - exemplary 3' HA for knock-in cassette insertion at GAPDH locus

AGACTGGCTCTTAAAAAGTGCAGGGTCTGGCGCCCTCTGGTGGCTGGCTCAGAAAAA GGGCCCT GACAACTCTTTTCATCTTCTAGGTATGACAACGAATTTGGCTACAGCAACAGGGTGGTGG ACCT CAT G G C C C AC AT G G C C T C C AAG GAG T AAGAC C C C T G GAC C AC C AG C C C C AG C AAGAG C AC AAGA GGAAGAGAGAGACCCTCACTGCTGGGGAGTCCCTGCCACACTCAGTCCCCCACCACACTG AATC TCCCCTCCTCACAGTTGCCATGTAGACCCCTTGAAGAGGGGAGGGGCCTAGGGAGCCGCA CCTT GTCATGTACCATCAATAAAGTACCCTGTGCTCAACCAGTTACTTGTCCTGTCTTATTCTA GGGT CTGGGGCAGAGGGGAGGGAAGCTGGGCTTGTGTCAAGGTGAGACATTCTTGCTGGGGAGG GACC TGGTATGTTCTCCTCAGACTGAGGGTAGGGCCTCCAAACAGCCTTGCTTGCT

[0588] In some embodiments, a donor template comprises a 5' and/or 3' homology arm homologous to a region of a TBP locus. In some embodiments, a donor template comprises a 5' homology arm comprising or consisting of the sequence of SEQ ID NO: 6, 7, or 8. In some embodiments, a 5' homology arm comprises or consists of a sequence that is at least 85%, 90%, 95%, 98% or 99% identical to the sequence of SEQ ID NO: 6, 7, or 8. In some embodiments, a donor template comprises a 3' homology arm comprising or consisting of the sequence of SEQ ID NO: 9, 10, or 11. In certain embodiments, a 3' homology arm comprises or consists of a sequence that is at least 85%, 90%, 95%, 98% or 99% identical to the sequence of SEQ ID NO: 9, 10, or 11.

[0589] In some embodiments, a donor template comprises a 5' homology arm comprising SEQ ID NO: 6, and a 3' homology arm comprising SEQ ID NO: 9. In some embodiments, a donor template comprises a 5' homology arm comprising SEQ ID NO: 7, and a 3' homology arm comprising SEQ ID NO: 10. In some embodiments, a donor template comprises a 5' homology arm comprising SEQ ID NO: 8, and a 3' homology arm comprising SEQ ID NO: 11.

SEQ ID NO: 6 - exemplary 5' HA for knock-in cassette insertion at TBP locus

SEQ ID NO: 7 - exemplary 5' HA for knock-in cassette insertion at TBP locus

SEQ ID NO: 8 - exemplary 5' HA for knock-in cassette insertion at TBP locus

SEQ ID NO: 9 - exemplary 3' HA for knock-in cassette insertion at TBP locus

CAGAAAT T T AT GAAG C AT T T GAAAAC AT CTACCCTATTC TAAAG G GAT T C AG GAAGAC GAC G T A ATGGCTCTCATGTACCCTTGCCTCCCCCACCCCCTTCTTTTTTTTTTTTTAAACAAATCA GTTT GTTTTGGTACCTTTAAATGGTGGTGTTGTGAGAAGATGGATGTTGAGTTGCAGGGTGTGG CACC AGGTGATGCCCTTCTGTAAGTGCCCACCGCGGGATGCCGGGAAGGGGCATTATTTGTGCA CTGA GAACACCGCGCAGCGTGACTGTGAGTTGCTCATACCGTGCTGCTATCTGGGCAGCGCTGC CCAT T TAT T TAT AT G T AGAT T T T AAAC AC T G C T G T T GAC AAG T T G G T T T GAG G GAGAAAAC T T T AAG T GTTAAAGCCACCTCTATAATTGATTGGACTTTTTAATTTTAATGTTTTTCCCCATGAACC ACAG

TTTTTATATTTCTACCAGAAAAGTAAAAATCTTTTTTAAAAGTGTTGTTTTT

SEQ ID NO: 10 - exemplary 3' HA for knock-in cassette insertion at TBP locus

TAGGT GC TAAAGT CAGAGCAGAAAT T TAT GAAGCAT T T GAAAACAT C TACCC TAT T C TAAAGGG ATTCAGGAAGACGACGTAATGGCTCTCATGTACCCTTGCCTCCCCCACCCCCTTCTTTTT TTTT TTTTAAACAAATCAGTTTGTTTTGGTACCTTTAAATGGTGGTGTTGTGAGAAGATGGATG TTGA GTTGCAGGGTGTGGCACCAGGTGATGCCCTTCTGTAAGTGCCCACCGCGGGATGCCGGGA AGGG GCATTATTTGTGCACTGAGAACACCGCGCAGCGTGACTGTGAGTTGCTCATACCGTGCTG CTAT CTGGGCAGCGCTGCCCATTTATTTATATGTAGATTTTAAACACTGCTGTTGACAAGTTGG TTTG AG G GAGAAAAC T T T AAG T G T T AAAG C C AC C T C T AT AAT T GAT T G GAC T T T T T AAT T T T AAT G T T T T T C C C C AT GAAC C AC AG T T T T T AT AT T T C T AC C AGAAAAG T AAAAAT C T T T

SEQ ID NO: 11 - exemplary 3' HA for knock-in cassette insertion at TBP locus

AAGGGATTCAGGAAGACGACGTAATGGCTCTCATGTACCCTTGCCTCCCCCACCCCC TTCTTTT TTTTTTTTTAAACAAATCAGTTTGTTTTGGTACCTTTAAATGGTGGTGTTGTGAGAAGAT GGAT GTTGAGTTGCAGGGTGTGGCACCAGGTGATGCCCTTCTGTAAGTGCCCACCGCGGGATGC CGGG AAGGGGCATTATTTGTGCACTGAGAACACCGCGCAGCGTGACTGTGAGTTGCTCATACCG TGCT GCTATCTGGGCAGCGCTGCCCATTTATTTATATGTAGATTTTAAACACTGCTGTTGACAA GTTG GTTTGAGGGAGAAAACTTTAAGTGTTAAAGCCACCTCTATAATTGATTGGACTTTTTAAT TTTA AT G T T T T T C C C CAT GAAC C AC AG T T T T TAT AT T T C T AC C AGAAAAG T AAAAAT C T T T T T TAAAA GTGTTGTTTTTCTAATTTATAACTCCTAGGGGTTATTTCTGTGCCAGACACA

[0590] In some embodiments, a donor template comprises a 5' and/or 3' homology arm homologous to a region of a G6PD locus. In some embodiments, a donor template comprises a 5' homology arm comprising or consisting of the sequence of SEQ ID NO: 12. In some embodiments, a 5' homology arm comprises or consists of a sequence that is at least 85%, 90%, 95%, 98% or 99% identical to the sequence of SEQ ID NO: 12. In some embodiments, a donor template comprises a 3' homology arm comprising or consisting of the sequence of SEQ ID NO: 13. In certain embodiments, a 3' homology arm comprises or consists of a sequence that is at least 85%, 90%, 95%, 98% or 99% identical to the sequence of SEQ ID NO: 13.

[0591] In some embodiments, a donor template comprises a 5' homology arm comprising SEQ ID NO: 12, and a 3' homology arm comprising SEQ ID NO: 13.

SEQ ID NO: 12 - exemplary 5' HA for knock-in cassette insertion at G6PD locus

GGCCCGGGGGACTCCACATGGTGGCAGGCAGTGGCATCAGCAAGACACTCTCTCCCT CACAGAA

CGTGAAGCTCCCTGACGCCTATGAGCGCCTCATCCTGGACGTCTTCTGCGGGAGCCA GATGCAC

TTCGTGCGCAGGTGAGGCCCAGCTGCCGGCCCCTGCATACCTGTGGGCTATGGGGTG GCCTTTG

CCCTCCCTCCCTGTGTGCCACCGGCCTCCCAAGCCATACCATGTCCCCTCAGCGACG AGCTCCG

TGAGGCCTGGCGTATTTTCACCCCACTGCTGCACCAGATTGAGCTGGAGAAGCCCAA GCCCATC

CCCTATATTTATGGCAGGTGAGGAAAGGGTGGGGGCTGGGGACAGAGCCCAGCGGGC AGGGGCG GGGTGAGGGTGGAGCTACCTCATGCCTCTCCTCCACCCGTCACTCTCCAGCCGAGGCCCC ACGG AG G C AGAC GAG C T GAT GAAGAGAG TGGGCTTC C AG T AC GAG G GAAC C T AC AAAT G G G T C AAC C C TCACAAGCTG

SEQ ID NO: 13 - exemplary 3' HA for knock-in cassette insertion at G6PD locus

GTGGGTGAACCCCCACAAGCTCTGAGCCCTGGGCACCCACCTCCACCCCCGCCACGG CCACCCT

CCTTCCCGCCGCCCGACCCCGAGTCGGGAGGACTCCGGGACCATTGACCTCAGCTGC ACATTCC

TGGCCCCGGGCTCTGGCCACCCTGGCCCGCCCCTCGCTGCTGCTACTACCCGAGCCC AGCTACA

TTCCTCAGCTGCCAAGCACTCGAGACCATCCTGGCCCCTCCAGACCCTGCCTGAGCC CAGGAGC

TGAGTCACCTCCTCCACTCACTCCAGCCCAACAGAAGGAAGGAGGAGGGCGCCCATT CGTCTGT

CCCAGAGCTTATTGGCCACTGGGTCTCACTCCTGAGTGGGGCCAGGGTGGGAGGGAG GGACGAG

GGGGAGGAAAGGGGCGAGCACCCACGTGAGAGAATCTGCCTGTGGCCTTGCCCGCCA GCCTCAG

TGCCACTTGACATTCCTTGTCACCAGCAACATCTCGAGCCCCCTGGATGTCC

[0592] In some embodiments, a donor template comprises a 5' and/or 3' homology arm homologous to a region of an E2F4 locus. In some embodiments, a donor template comprises a 5' homology arm comprising or consisting of the sequence of SEQ ID NO: 14, 15, or 16. In some embodiments, a 5' homology arm comprises or consists of a sequence that is at least 85%, 90%, 95%, 98% or 99% identical to the sequence of SEQ ID NO: 14, 15, or 16. In some embodiments, a donor template comprises a 3' homology arm comprising or consisting of the sequence of SEQ ID NO: 17, 18, or 19. In certain embodiments, a 3' homology arm comprises or consists of a sequence that is at least 85%, 90%, 95%, 98% or 99% identical to the sequence of SEQ ID NO: 17, 18, or 19.

[0593] In some embodiments, a donor template comprises a 5' homology arm comprising SEQ ID NO: 14, and a 3' homology arm comprising SEQ ID NO: 17. In some embodiments, a donor template comprises a 5' homology arm comprising SEQ ID NO: 15, and a 3' homology arm comprising SEQ ID NO: 18. In some embodiments, a donor template comprises a 5' homology arm comprising SEQ ID NO: 16, and a 3' homology arm comprising SEQ ID NO: 19.

SEQ ID NO: 14 - exemplary 5' HA for knock-in cassette insertion at E2F4 locus

CCAGGGGGCTGTAGTGGGGCCAGGCTGGACCTCTGTGCCCTGAGCATGGCTTTCTTG TTTTTCA

GTTTTGGAACTCCCCAAAGAGCTGTCAGAAATCTTTGATCCCACACGAGGTAGGCTG CTGCATT

CCTCCCTGAGGCTAGGGGTAAGGGACACAGCTCATTGGGTCCTATGGCTGTTTTCTT GCCCTTT

TGAGGACCTTGTTGTGGCGCTTATGGTAACTGGGGCAAAGGGTGAAGTTCCTGATGG GCAGGTG

GGGTTCCCTTTCCTGGGCTTTGGTGGGTGGAGAGGTGGGAGCTGGAATGTTAGTAAC TGAGCTC

CCTCCATTCCCAGAGTGCATGAGCTCGGAGCTGCTGGAGGAGTTGATGTCCTCAGAA GGTGGGT

GGCCCTGGAAGGTGGGAGTGGGTGTGGGCAGGGGTTGGGCTGCTGCTAGGGGAGCCC TGGCCCA

GGGCCTGAGACTAGTGCTCTCTGCAGTGTTCGCCCCTCTGCTGAGACTTTCTCCTCC TCCTGGC

GACCACGACTACATCTACAACCTGGACGAGAGCGAGGGCGTGTGCGACCTGTTTGAT GTGCCCG

TGCTGAACCTG

SEQ ID NO: 15 - exemplary 5' HA for knock-in cassette insertion at E2F4 locus

CCAGGCTGGACCTCTGTGCCCTGAGCATGGCTTTCTTGTTTTTCAGTTTTGGAACTC CCCAAAG

AGCTGTCAGAAATCTTTGATCCCACACGAGGTAGGCTGCTGCATTCCTCCCTGAGGC TAGGGGT

AAGGGACACAGCTCATTGGGTCCTATGGCTGTTTTCTTGCCCTTTTGAGGACCTTGT TGTGGCG

CTTATGGTAACTGGGGCAAAGGGTGAAGTTCCTGATGGGCAGGTGGGGTTCCCTTTC CTGGGCT

TTGGTGGGTGGAGAGGTGGGAGCTGGAATGTTAGTAACTGAGCTCCCTCCATTCCCA GAGTGCA

TGAGCTCGGAGCTGCTGGAGGAGTTGATGTCCTCAGAAGGTGGGTGGCCCTGGAAGG TGGGAGT

GGGTGTGGGCAGGGGTTGGGCTGCTGCTAGGGGAGCCCTGGCCCAGGGCCTGAGACT AGTGCTC

TCTGCAGTGTTTGCCCCTCTGCTTCGTCTTAGTCCTCCTCCGGGCGACCACGACTAC ATCTACA

ACCTGGACGAGAGCGAGGGCGTGTGCGACCTGTTTGATGTGCCCGTGCTGAACCTG

SEQ ID NO: 16 - exemplary 5' HA for knock-in cassette insertion at E2F4 locus

GTCAGAAATCTTTGATCCCACACGAGGTAGGCTGCTGCATTCCTCCCTGAGGCTAGG GGTAAGG

GACACAGCTCATTGGGTCCTATGGCTGTTTTCTTGCCCTTTTGAGGACCTTGTTGTG GCGCTTA

TGGTAACTGGGGCAAAGGGTGAAGTTCCTGATGGGCAGGTGGGGTTCCCTTTCCTGG GCTTTGG

TGGGTGGAGAGGTGGGAGCTGGAATGTTAGTAACTGAGCTCCCTCCATTCCCAGAGT GCATGAG

CTCGGAGCTGCTGGAGGAGTTGATGTCCTCAGAAGGTGGGTGGCCCTGGAAGGTGGG AGTGGGT

GTGGGCAGGGGTTGGGCTGCTGCTAGGGGAGCCCTGGCCCAGGGCCTGAGACTAGTG CTCTCTG

CAGTGTTTGCCCCTCTGCTTCGTCTTTCTCCACCCCCGGGAGACCACGATTATATCT ACAACCT

GGACGAGAGTGAAGGTGTCTGTGACCTCTTCGACGTGCCCGTGCTCAACCTC

SEQ ID NO: 17 - exemplary 3' HA for knock-in cassette insertion at E2F4 locus

CCACCCCCGGGAGACCACGATTATATCTACAACCTGGACGAGAGTGAAGGTGTCTGT GACCTCT

TTGATGTGCCTGTTCTCAACCTCTGACTGACAGGGACATGCCCTGTGTGGCTGGGAC CCAGACT

GTCTGACCTGGGGGTTGCCTGGGGACCTCTCCCACCCGACCCCTACAGAGCTTGAGA GCCACAG

ACGCCTGGCTTCTCCGGCCTCCCCTCACCGCACAGTTCTGGCCACAGCTCCCGCTCC TGTGCTG

GCACTTCTGTGCTCGCAGAGCAGGGGAACAGGACTCAGCCCCCATCACCGTGGAGCC AAAGTGT

TTGCTTCTCCCTTTCTGCGGCCTTCGCCAGCCCAGGCTCGGCTGCCACCCAGTGGCA CAGAACC

GAGGAGCTGCCATTACCCCCCATAGGGGGCAGTGTCTTGTTCCTGCCAGCCTCAGTG TCTTGCT

TCTGCCAGCTCCTTCCCCTAGGAGGGAAGGGTGGGGTGGAACTGGGCACATG

SEQ ID NO: 18 - exemplary 3' HA for knock-in cassette insertion at E2F4 locus

ATTATATCTACAACCTGGACGAGAGTGAAGGTGTCTGTGACCTCTTTGATGTGCCTG TTCTCAA

CCTCTGACTGACAGGGACATGCCCTGTGTGGCTGGGACCCAGACTGTCTGACCTGGG GGTTGCC

TGGGGACCTCTCCCACCCGACCCCTACAGAGCTTGAGAGCCACAGACGCCTGGCTTC TCCGGCC

TCCCCTCACCGCACAGTTCTGGCCACAGCTCCCGCTCCTGTGCTGGCACTTCTGTGC TCGCAGA

GCAGGGGAACAGGACTCAGCCCCCATCACCGTGGAGCCAAAGTGTTTGCTTCTCCCT TTCTGCG

GCCTTCGCCAGCCCAGGCTCGGCTGCCACCCAGTGGCACAGAACCGAGGAGCTGCCA TTACCCC

CCATAGGGGGCAGTGTCTTGTTCCTGCCAGCCTCAGTGTCTTGCTTCTGCCAGCTCC TTCCCCT

AGGAGGGAAGGGTGGGGTGGAACTGGGCACATGCCAGCACCACTTCTAGCTT

SEQ ID NO: 19 - exemplary 3' HA for knock-in cassette insertion at E2F4 locus

TGACTGACAGGGACATGCCCTGTGTGGCTGGGACCCAGACTGTCTGACCTGGGGGTT GCCTGGG

GACCTCTCCCACCCGACCCCTACAGAGCTTGAGAGCCACAGACGCCTGGCTTCTCCG GCCTCCC CTCACCGCACAGTTCTGGCCACAGCTCCCGCTCCTGTGCTGGCACTTCTGTGCTCGCAGA GCAG

GGGAACAGGACTCAGCCCCCATCACCGTGGAGCCAAAGTGTTTGCTTCTCCCTTTCT GCGGCCT

TCGCCAGCCCAGGCTCGGCTGCCACCCAGTGGCACAGAACCGAGGAGCTGCCATTAC CCCCCAT

AGGGGGCAGTGTCTTGTTCCTGCCAGCCTCAGTGTCTTGCTTCTGCCAGCTCCTTCC CCTAGGA

GGGAAGGGTGGGGTGGAACTGGGCACATGCCAGCACCACTTCTAGCTTCCTTCGCTA TCCCCCA

CCCCCTGACCCTCCAGCTCCTCCTGGCCCTCTCACGTGCCCACTTCTGCTGG

[0594] In some embodiments, a donor template comprises a 5' and/or 3' homology arm homologous to a region of a KIF11 locus. In some embodiments, a donor template comprises a 5' homology arm comprising or consisting of the sequence of SEQ ID NO: 20, 21, or 22. In some embodiments, a 5' homology arm comprises or consists of a sequence that is at least 85%, 90%, 95%, 98% or 99% identical to the sequence of SEQ ID NO: 20, 21, or 22. In some embodiments, a donor template comprises a 3' homology arm comprising or consisting of the sequence of SEQ ID NO: 23, 24, or 25. In certain embodiments, a 3' homology arm comprises or consists of a sequence that is at least 85%, 90%, 95%, 98% or 99% identical to the sequence of SEQ ID NO: 23, 24, or 25.

[0595] In some embodiments, a donor template comprises a 5' homology arm comprising SEQ ID NO: 20, and a 3' homology arm comprising SEQ ID NO: 23. In some embodiments, a donor template comprises a 5' homology arm comprising SEQ ID NO: 21, and a 3' homology arm comprising SEQ ID NO: 24. In some embodiments, a donor template comprises a 5' homology arm comprising SEQ ID NO: 22, and a 3' homology arm comprising SEQ ID NO: 25.

SEQ ID NO: 20 - exemplary 5' HA for knock-in cassette insertion at KIF11 locus

SEQ ID NO: 21 - exemplary 5' HA for knock-in cassette insertion at KIF11 locus

TTCCTGTG GAC TGTACTATGTTGG T AC AC AAGAAAAAC AG TGTACTATGT GAAT AC T C AC T C AA AGCCAGTAGCACTCCCTGATTGTAACACCAAAAAAGTCTCTCAGCATTGCCAAATGTCCC CTGT G G C AG CAGAAT C AC T C C C T GAT GAGAAC C AC T AC C C T G GAG TAAAAT C T AT AAC TATGTCTTAG AAAAT AAC AC AGAAAAT TAATATTTCTTT C AC TCTACTCCTTCCATTAGTGAT C AAAT AAAGAA GGCATTTGGCGCTACTTGCCAAATTGTTGGCTCAAACTTGTGCTGAACCTTTTTTGGTTT TCTA C AC T T AAG T T T T T T T G C C T AT AAC C C AGAGAAC T T T GAAAAT AGAG T G T AG T T AAT G T G T AT C T AATGTTACTTTGTATT GAC TTAATTTTCCCGCCT T AAAT C C AC AG CAT AAAAAAT C AC AT G GAA AAGAC AAAGAAAAC AGAG G CAT TAACACAC T GGAGAGGT C T AAAG T G GAAGAAAC AAC C GAG C A C C T G G T C AC CAAGAG CAGAC T G C C T C T GAGAG C C C AGAT C AAC C T G

SEQ ID NO: 22 - exemplary 5' HA for knock-in cassette insertion at KIF11 locus

SEQ ID NO: 23 - exemplary 3' HA for knock-in cassette insertion at KIF11 locus

AAAAAAT C AC AT G GAAAAGAC AAAGAAAAC AGAG G CAT TAACACAC T GGAGAGGT C TAAAGTGG AAGAAAC T AC AGAG C AC T T G G T T AC AAAGAG C AGAT T AC C T C T G C GAG C C C AGAT C AAC C T T T A ATTCACTTGGGGGTTGGCAATTTTATTTTTAAAGAAAACTTAAAAATAAAACCTGAAACC CCAG AACTTGAGCCTTGTGTATAGATTTTAAAAGAATATATATATCAGCCGGGCGCGGTGGCTC ATGC CTGTAATCCCAGCACTTTGGGAGGCTGAGGCGGGTGGATTGCTTGAGCCCAGGAGTTTGA GACC AGCCTGGCCAACGTGGCAAAACCTCGTCTCTGTTAAAAATTAGCCGGGCGTGGTGGCACA CTCC TGTAATCCCAGCTACTGGGGAGGCTGAGGCACGAGAATCACTTGAACCCAGGAAGCGGGG TTGC AG T GAG C CAAAG G T AC AC C AC T AC AC T C C AG C C T G G G C AAC AGAG C AAGAC T

SEQ ID NO: 24 - exemplary 3' HA for knock-in cassette insertion at KIF11 locus

AAC T AC AGAG C AC T T G G C T AC AT AGAG C AGAT T AC C T C T G C GAG C C C AGAT C AAC C T T T AAT T C ACTTGGGGGTTGGCAATTTTATTTTTAAAGAAAACTTAAAAATAAAACCTGAAACCCCAG AACT TGAGCCTTGTGTATAGATTTTAAAAGAATATATATATCAGCCGGGCGCGGTGGCTCATGC CTGT AATCCCAGCACTTTGGGAGGCTGAGGCGGGTGGATTGCTTGAGCCCAGGAGTTTGAGACC AGCC TGGCCAACGTGGCAAAACCTCGTCTCTGTTAAAAATTAGCCGGGCGTGGTGGCACACTCC TGTA ATCCCAGCTACTGGGGAGGCTGAGGCACGAGAATCACTTGAACCCAGGAAGCGGGGTTGC AGTG AG C CAAAG G T AC AC C AC T AC AC T C C AG C C T G G G C AAC AGAG C AAGAC T C G G T C T C AAAAAC AAA ATT T AAAAAAGAT AT AAG G C AG T AC T G T AAAT T C AG T T GAAT TTTGATATCT

SEQ ID NO: 25 - exemplary 3' HA for knock-in cassette insertion at KIF11 locus

AT TAACACAC TG GAGAG TT C T GAAGT G GAAGAAAC T AC AGAG C AC T T G G T T AC AAAGAG C AGAT TACCTCTGCGAGCCCAGATCAACCTTTAATTCACTTGGGGGTTGGCAATTTTATTTTTAA AGAA AAC T T AAAAAT AAAAC C T GAAAC C C C AGAAC T T GAG C C T T G T G TAT AGAT T T T AAAAGAAT AT A TATATCAGCCGGGCGCGGTGGCTCATGCCTGTAATCCCAGCACTTTGGGAGGCTGAGGCG GGTG GATTGCTTGAGCCCAGGAGTTTGAGACCAGCCTGGCCAACGTGGCAAAACCTCGTCTCTG TTAA AAATTAGCCGGGCGTGGTGGCACACTCCTGTAATCCCAGCTACTGGGGAGGCTGAGGCAC GAGA ATCACTTGAACCCAGGAAGCGGGGTTGCAGTGAGCCAAAGGTACACCACTACACTCCAGC CTGG G C AAC AG AG C AAG AC TCGGTCT CAAAAACAAAAT T T AAAAAAGAT AT AAG G C

Inverted Terminal Repeats (ITRs)

[0596] In certain embodiments, a donor template comprises an AAV derived sequence. In certain embodiments, a donor template comprises AAV derived sequences that are typical of an AAV construct, such as cis-acting 5' and 3' inverted terminal repeats (ITRs) (See, e.g., B. J. Carter, in “Handbook of Parvoviruses”, ed., P. Tijsser, CRC Press, pp. 155-168 (1990), which is incorporated in its entirety herein by reference). Generally, ITRs are able to form a hairpin. The ability to form a hairpin can contribute to an ITR's ability to self-prime, allowing primase-independent synthesis of a second DNA strand. ITRs also play a role in integration of an AAV construct (e.g., a coding sequence) into a genome of a target cell. ITRs can also aid in efficient encapsidation of an AAV construct in an AAV particle.

[0597] In some embodiments, a donor template described herein is included within an rAAV particle (e.g., an AAV6 particle). In some embodiments, an ITR is or comprises about 145 nucleotides. In some embodiments, all or substantially all of a sequence encoding an ITR is used. In some embodiments, an AAV ITR sequence may be obtained from any known AAV, including presently identified mammalian AAV types. In some embodiments, an ITR is an AAV6 ITR.

[0598] An example of an AAV construct employed in the present disclosure is a “cis-acting” construct containing a cargo sequence (e.g., a donor template described herein), in which the donor template is flanked by 5' or “left” and 3' or “right” AAV ITR sequences. 5' and left designations refer to a position of an ITR sequence relative to an entire construct, read left to right, in a sense direction. For example, in some embodiments, a 5' or left ITR is an ITR that is closest to a target locus promoter (as opposed to a polyadenylation sequence) for a given construct, when a construct is depicted in a sense orientation, linearly. Likewise, 3' and right designations refer to a position of an ITR sequence relative to an entire construct, read left to right, in a sense direction. For example, in some embodiments, a 3' or right ITR is an ITR that is closest to a polyadenylation sequence in a target locus (as opposed to a promoter sequence) for a given construct, when a construct is depicted in a sense orientation, linearly. ITRs as provided herein are depicted in 5' to 3' order in accordance with a sense strand. Accordingly, one of skill in the art will appreciate that a 5' or “left” orientation ITR can also be depicted as a 3' or “right” ITR when converting from sense to antisense direction. Further, it is well within the ability of one of skill in the art to transform a given sense ITR sequence (e.g., a 5'/left AAV ITR) into an antisense sequence (e.g., 3'/right ITR sequence). One of ordinary skill in the art would understand how to modify a given ITR sequence for use as either a 5'/left or 3'/right ITR, or an antisense version thereof.
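As a non-limiting illustration of converting a sense sequence into its antisense counterpart, the sketch below computes a reverse complement. The function name and the short test fragment are hypothetical and chosen only for this example; the script handles the four standard DNA bases and leaves any other character unchanged.

```python
# Translation table for the four standard DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the antisense (reverse-complement) strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    sense_fragment = "AGGAACCCCT"  # hypothetical 10-nt fragment
    print(reverse_complement(sense_fragment))
```

Applying the function twice returns the original sequence, which offers a simple sanity check when converting between a 5'/left and a 3'/right depiction.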

[0599] For example, in some embodiments an ITR (e.g., a 5' ITR) can have a sequence according to SEQ ID NO: 158. In some embodiments, an ITR (e.g., a 3' ITR) can have a sequence according to SEQ ID NO: 159. In some embodiments, an ITR includes one or more modifications, e.g., truncations, deletions, substitutions or insertions, as is known in the art. In some embodiments, an ITR comprises fewer than 145 nucleotides, e.g., 127, 130, 134 or 141 nucleotides. For example, in some embodiments, an ITR comprises 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, or 145 nucleotides.

[0600] A non-limiting example of 5' AAV ITR sequences includes SEQ ID NO: 158. A non-limiting example of 3' AAV ITR sequences includes SEQ ID NO: 159. In some embodiments, the 5' and 3' AAV ITRs (e.g., SEQ ID NO: 158 and 159) flank a donor template described herein (e.g., a donor template comprising a 5' HA, a knock-in cassette, and a 3' HA). The ability to modify ITR sequences is within the skill of the art. (See, e.g., texts such as Sambrook et al. “Molecular Cloning. A Laboratory Manual”, 2d ed., Cold Spring Harbor Laboratory, New York (1989); and K. Fisher et al., J Virol., 70:520-532 (1996), each of which is incorporated in its entirety herein by reference). In some embodiments, a 5' ITR sequence is at least 85%, 90%, 95%, 98% or 99% identical to a 5' ITR sequence represented by SEQ ID NO: 158. In some embodiments, a 3' ITR sequence is at least 85%, 90%, 95%, 98% or 99% identical to a 3' ITR sequence represented by SEQ ID NO: 159.

SEQ ID NO: 158 - exemplary 5' ITR for knock-in cassette insertion

CCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCAAAGCCCGGGC GTCGGGC

GACCTTTGGTCGCCCGGCCTCAGTGAGCGAGCGAGCGCGCAGAGAGGGAGTGGCCAA CTCCATC

ACTAGGGGTTCCT

SEQ ID NO: 159 - exemplary 3' ITR for knock-in cassette insertion

AGGAACCCCTAGTGATGGAGTTGGCCACTCCCTCTCTGCGCGCTCGCTCGCTCACTG AGGCCGG

GCGACCAAAGGTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCCTCAGTGAGCGAGCG AGCGCGC

AGCTGCCTGCAGG

Flanking untranslated regions. 5' UTRs and 3' UTRs

[0601] In some embodiments, a knock-in cassette described herein includes all or a portion of an untranslated region (UTR), such as a 5' UTR and/or a 3' UTR. UTRs of a gene are transcribed but not translated. A 5' UTR starts at a transcription start site and continues to the start codon but does not include the start codon. A 3' UTR starts immediately following the stop codon and continues until the transcriptional termination signal. The regulatory and/or control features of a UTR can be incorporated into any of the knock-in cassettes described herein to enhance or otherwise modulate the expression of an essential target gene locus and/or a cargo sequence.

[0602] Natural 5' UTRs include a sequence that plays a role in translation initiation. In some embodiments, a 5' UTR comprises sequences, like Kozak sequences, which are commonly known to be involved in the process by which the ribosome initiates translation of many genes. Kozak sequences have the consensus sequence CCRCCAUGG, where R is a purine (A or G) three bases upstream of the start codon (AUG), and the start codon is followed by another “G”. The 5' UTRs have also been known to form secondary structures that are involved in elongation factor binding. Non-limiting examples of 5' UTRs include those from the following genes: albumin, serum amyloid A, Apolipoprotein A/B/E, transferrin, alpha fetoprotein, erythropoietin, and Factor VIII.
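For illustration, the Kozak consensus described above (a purine three bases upstream of the AUG, and a G immediately after it) can be located in a transcript with a regular expression. The following is a minimal sketch; the function name and the example transcript are hypothetical, and the pattern matches only the strict consensus rather than the weaker variants found in many natural genes.

```python
import re

# Strict consensus: C C [A|G] C C A U G G, with the AUG captured as group 1.
_KOZAK = re.compile(r"CC[AG]CC(AUG)G")


def find_kozak_starts(transcript: str) -> list[int]:
    """Return 0-based positions of each AUG that sits in a strict Kozak context.

    DNA input is accepted; T is converted to U before matching.
    """
    rna = transcript.upper().replace("T", "U")
    return [match.start(1) for match in _KOZAK.finditer(rna)]


if __name__ == "__main__":
    print(find_kozak_starts("GGGCCACCAUGGCU"))  # hypothetical transcript
```

A transcript whose -3 position holds a pyrimidine is not reported by this strict pattern, even though translation may still initiate there at reduced efficiency.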

[0603] In some embodiments, a UTR may comprise a non-endogenous regulatory region. In some embodiments, a UTR that comprises a non-endogenous regulatory region is a 3' UTR. In some embodiments, a UTR that comprises a non-endogenous regulatory region is a 5' UTR. In some embodiments, a non-endogenous regulatory region may be a target of at least one inhibitory nucleic acid. In some embodiments, an inhibitory nucleic acid inhibits expression and/or activity of a target gene. In some embodiments, an inhibitory nucleic acid is a short interfering RNA (siRNA), a short hairpin RNA (shRNA), a microRNA (miRNA), an antisense oligonucleotide, a guide RNA (gRNA), or a ribozyme. In some embodiments, an inhibitory nucleic acid is an endogenous molecule. In some embodiments, an inhibitory nucleic acid is a non-endogenous molecule. In some embodiments, an inhibitory nucleic acid displays a tissue specific expression pattern. In some embodiments, an inhibitory nucleic acid displays a cell specific expression pattern.

[0604] In some embodiments, a knock-in cassette may comprise more than one non-endogenous regulatory region, e.g., two, three, four, five, six, seven, eight, nine, or ten regulatory regions. In some embodiments, a knock-in cassette may comprise four non-endogenous regulatory regions. In some embodiments, a construct may comprise more than one non-endogenous regulatory region, wherein at least one of the non-endogenous regulatory regions is not the same as at least one of the other non-endogenous regulatory regions.

[0605] In some embodiments, a 3' UTR is found immediately 3' to the stop codon of a gene of interest. In some embodiments, a 3' UTR from an mRNA that is transcribed by a target cell can be included in any knock-in cassette described herein. In some embodiments, a 3' UTR is derived from an endogenous target locus and may include all or part of the endogenous sequence. In some embodiments, a 3' UTR sequence is at least 85%, 90%, 95% or 98% identical to the sequence of SEQ ID NO: 26.

SEQ ID NO: 26 - exemplary 3' UTR for knock-in cassette insertion

GCGGCCGCGTCGAGTCTAGAGGGCCCGTTTAAACCCGCTGATCAGCCTCGA

Polyadenylation Sequences

[0606] In some embodiments, a knock-in cassette construct provided herein can include a polyadenylation (poly(A)) signal sequence. Most nascent eukaryotic mRNAs possess a poly(A) tail at their 3' end, which is added during a complex process that includes cleavage of the primary transcript and a coupled polyadenylation reaction driven by the poly(A) signal sequence (see, e.g., Proudfoot et al., Cell 108:501-512, 2002, which is incorporated herein by reference in its entirety). A poly(A) tail confers mRNA stability and transferability (Molecular Biology of the Cell, Third Edition by B. Alberts et al., Garland Publishing, 1994, which is incorporated herein by reference in its entirety). In some embodiments, a poly(A) signal sequence is positioned 3' to a coding sequence.

[0607] As used herein, “polyadenylation” refers to the covalent linkage of a polyadenylyl moiety, or its modified variant, to a messenger RNA molecule. In eukaryotic organisms, most messenger RNA (mRNA) molecules are polyadenylated at the 3' end. A 3' poly(A) tail is a long sequence of adenine nucleotides (e.g., 50, 60, 70, 100, 200, 500, 1000, 2000, 3000, 4000, or 5000) added to the pre-mRNA through the action of an enzyme, polyadenylate polymerase. In some embodiments, a poly(A) tail is added onto transcripts that contain a specific sequence, e.g., a polyadenylation (or poly(A)) signal. A poly(A) tail and associated proteins aid in protecting mRNA from degradation by exonucleases. Polyadenylation also plays a role in transcription termination, export of the mRNA from the nucleus, and translation. Polyadenylation typically occurs in the nucleus immediately after transcription of DNA into RNA, but also can occur later in the cytoplasm. After transcription has been terminated, an mRNA chain is cleaved through the action of an endonuclease complex associated with RNA polymerase. A cleavage site is usually characterized by the presence of the base sequence AAUAAA near the cleavage site. After the mRNA has been cleaved, adenosine residues are added to the free 3' end at the cleavage site.

[0608] As used herein, a “poly(A) signal sequence” or “polyadenylation signal sequence” is a sequence that triggers the endonuclease cleavage of an mRNA and the addition of a series of adenosines to the 3' end of the cleaved mRNA.

[0609] There are several poly(A) signal sequences that can be used, including those derived from bovine growth hormone (bGH) (Woychik et al., Proc. Natl. Acad. Sci. U.S.A. 81(13):3944-3948, 1984; U.S. Patent No. 5,122,458, each of which is incorporated herein by reference in its entirety), mouse-β-globin, mouse-α-globin (Orkin et al., EMBO J 4(2):453-456, 1985; Thein et al., Blood 71(2):313-319, 1988, each of which is incorporated herein by reference in its entirety), human collagen, polyoma virus (Batt et al., Mol. Cell Biol. 15(9):4783-4790, 1995, which is incorporated herein by reference in its entirety), the Herpes simplex virus thymidine kinase gene (HSV TK), IgG heavy-chain gene polyadenylation signal (US 2006/0040354, which is incorporated herein by reference in its entirety), human growth hormone (hGH) (Szymanski et al., Mol. Therapy 15(7):1340-1347, 2007, which is incorporated herein by reference in its entirety), and the group comprising an SV40 poly(A) site, such as the SV40 late and early poly(A) site (Schek et al., Mol. Cell Biol. 12(12):5386-5393, 1992, which is incorporated herein by reference in its entirety).

[0610] The poly(A) signal sequence can be AATAAA. The AATAAA sequence may be substituted with other hexanucleotide sequences with homology to AATAAA and that are capable of signaling polyadenylation, including ATTAAA, AGTAAA, CATAAA, TATAAA, GATAAA, ACTAAA, AATATA, AAGAAA, AATAAT, AAAAAA, AATGAA, AATCAA, AACAAA, AATCAA, AATAAC, AATAGA, AATTAA, or AATAAG (see, e.g., WO 06/12414, which is incorporated herein by reference in its entirety).

[0611] In some embodiments, a poly(A) signal sequence can be a synthetic polyadenylation site (see, e.g., the pCI-neo expression construct of Promega that is based on Levitt et al., Genes Dev. 3(7):1019-1025, 1989, which is incorporated herein by reference in its entirety). In some embodiments, a poly(A) signal sequence is the polyadenylation signal of soluble neuropilin-1 (sNRP) (AAATAAAATACGAAATG) (see, e.g., WO 05/073384, which is incorporated herein by reference in its entirety). In some embodiments, a poly(A) signal sequence comprises or consists of the SV40 poly(A) site. In some embodiments, a poly(A) signal sequence comprises or consists of SEQ ID NO: 27. In some embodiments, a poly(A) signal sequence comprises or consists of bGHpA. In some embodiments, a poly(A) signal sequence comprises or consists of SEQ ID NO: 28. Additional examples of poly(A) signal sequences are known in the art. In some embodiments, a poly(A) sequence is at least 85%, 90%, 95%, 98% or 99% identical to the sequence of SEQ ID NOs: 27 or 28.
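By way of example only, the canonical AATAAA hexanucleotide and the substitute hexanucleotides listed in paragraph [0610] can be located in a candidate DNA sequence by a simple sliding-window scan. The sketch below is illustrative; the function name and return format are assumptions for this example, and a hexanucleotide match alone does not establish that a sequence functions as a poly(A) signal in a given context.

```python
CANONICAL_SIGNAL = "AATAAA"

# Substitute hexanucleotides as listed in paragraph [0610].
VARIANT_SIGNALS = frozenset({
    "ATTAAA", "AGTAAA", "CATAAA", "TATAAA", "GATAAA", "ACTAAA",
    "AATATA", "AAGAAA", "AATAAT", "AAAAAA", "AATGAA", "AATCAA",
    "AACAAA", "AATAAC", "AATAGA", "AATTAA", "AATAAG",
})


def find_polya_signals(dna: str) -> list[tuple[int, str, str]]:
    """Return (0-based position, hexamer, kind) for every signal-like hexamer."""
    seq = dna.upper()
    hits = []
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer == CANONICAL_SIGNAL:
            hits.append((i, hexamer, "canonical"))
        elif hexamer in VARIANT_SIGNALS:
            hits.append((i, hexamer, "variant"))
    return hits


if __name__ == "__main__":
    print(find_polya_signals("GGCCAATAAAGGCC"))  # hypothetical sequence
```

Overlapping windows are reported separately, so an A-rich stretch may yield several adjacent hits.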

SEQ ID NO: 27 - exemplary SV40 poly(A) signal sequence

AACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTA

SEQ ID NO: 28 - exemplary bGH poly(A) signal sequence

CTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGA CCCTGGA

AGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCT GAGTAGG

TGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAA GACAATA

GCAGGCATGCTGGGGATGCGGTGGGCTCTATGG

IRES and 2A Elements

[0612] In some embodiments, the knock-in cassette comprises a regulatory element that enables expression of the gene product encoded by the essential gene and the gene product of interest as separate gene products, e.g., an IRES or 2A element located between the exogenous coding sequence or partial coding sequence of the essential gene and the exogenous coding sequence for the gene product of interest.

[0613] In some embodiments, a knock-in cassette may comprise multiple gene products of interest (e.g., at least two gene products of interest). In some embodiments, gene products of interest may be separated by a regulatory element that enables expression of the at least two gene products of interest as more than one gene product, e.g., an IRES or 2A element located between the at least two coding sequences, facilitating creation of at least two peptide products.

[0614] Internal Ribosome Entry Site (IRES) elements are one type of regulatory element that is commonly used for this purpose. As is well known in the art, IRES elements allow for initiation of translation from an internal region of the mRNA and hence expression of two separate proteins from the same mRNA transcript. IRES was originally discovered in poliovirus RNA, where it promotes translation of the viral genome in eukaryotic cells. Since then, a variety of IRES sequences have been discovered, many from viruses, but also some from cellular mRNAs, e.g., see Mokrejs et al., Nucleic Acids Res. 2006; 34(Database issue):D125-D130.

[0615] 2A elements are another type of regulatory element that is commonly used for this purpose. These 2A elements encode so-called “self-cleaving” 2A peptides, which are short peptides (about 20 amino acids) that were first discovered in picornaviruses. The term “self-cleaving” is not entirely accurate, as these peptides are thought to function by making the ribosome skip the synthesis of a peptide bond at the C-terminus of a 2A element, leading to separation between the end of the 2A sequence and the next peptide downstream. The “cleavage” occurs between the Glycine (G) and Proline (P) residues found at the C-terminus, meaning the upstream cistron, i.e., the protein encoded by the essential gene, will have a few additional residues from the 2A peptide added to the end, while the downstream cistron, i.e., the gene product of interest, will start with the Proline (P).

[0616] Table 2 below lists the four commonly used 2A peptides (an optional GSG sequence is sometimes added to the N-terminal end of the peptide to improve cleavage efficiency). There are many potential 2A peptides that may be suitable for methods and compositions described herein (see e.g., Luke et al., Occurrence, function and evolutionary origins of ‘2A-like’ sequences in virus genomes. J Gen Virol. 2008). Those skilled in the art know that the choice of specific 2A peptide for a particular knock-in cassette will ultimately depend on a number of factors such as cell type or experimental conditions. Those skilled in the art will recognize that nucleotide sequences encoding specific 2A peptides can vary while still encoding a peptide suitable for inducing a desired cleavage event.
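As a purely illustrative sketch of the ribosome-skipping outcome described above, the following script splits a translated fusion sequence between the Glycine and Proline at the C-terminus of a 2A peptide. The motif D(V/I)ExNPGP is a commonly cited 2A consensus and is an assumption here, as is the P2A sequence used in the example; neither is taken from Table 2. The upstream and downstream peptide fragments are hypothetical placeholders.

```python
import re

# Assumed 2A consensus: D-[V/I]-E-x-N-P-G followed by P; the split falls
# between the final G and the P (the P is matched by lookahead only).
_2A_MOTIF = re.compile(r"D[VI]E.NPG(?=P)")


def split_at_2a(fusion_protein: str) -> list[str]:
    """Split a fusion sequence into the products expected after 2A skipping."""
    products, start = [], 0
    for match in _2A_MOTIF.finditer(fusion_protein):
        products.append(fusion_protein[start:match.end()])  # ends with ...NPG
        start = match.end()                                  # next begins with P
    products.append(fusion_protein[start:])
    return products


if __name__ == "__main__":
    upstream = "MKTAYIAK"             # hypothetical end of essential-gene product
    linker = "GSG"                    # optional N-terminal GSG
    p2a = "ATNFSLLKQAGDVEENPGP"       # commonly cited P2A sequence (assumption)
    downstream = "MVSKGEEL"           # hypothetical start of cargo product
    print(split_at_2a(upstream + linker + p2a + downstream))
```

Consistent with paragraph [0615], the first product retains the 2A residues through the Glycine, and the second product begins with the Proline.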

Table 2: Exemplary 2A peptide sequences

Essential genes

[0617] An essential gene can be any gene that is essential for the survival and/or the proliferation of the cell. In some embodiments, an essential gene is a housekeeping gene that is essential for survival of all cell types, e.g., a gene listed in Table 3. See also other housekeeping genes discussed in Eisenberg, Trends in Gen. 2014; 30(3):119-20 and Moein et al., Adv. Biomed. Res. 2017, 6:15. Additional genes that are essential for various cell types, including iPSCs/ESCs, are listed in Table 4 (see also the essential genes discussed in Yilmaz et al., Nat. Cell Biol. 2018; 20:610-619, the entire contents of which are incorporated herein by reference).

[0618] In some embodiments the essential gene is GAPDH and the DNA nuclease causes a break in exon 9, e.g., a double-strand break. In some embodiments the essential gene is TBP and the DNA nuclease causes a break in exon 7, or exon 8, e.g., a double-strand break. In some embodiments the essential gene is E2F4 and the DNA nuclease causes a break in exon 10, e.g., a double-strand break. In some embodiments the essential gene is G6PD and the DNA nuclease causes a break in exon 13, e.g., a double-strand break. In some embodiments the essential gene is KIF11 and the DNA nuclease causes a break in exon 22, e.g., a double-strand break.

Table 3: Exemplary housekeeping genes

Table 4: Additional exemplary essential genes

[0619] The gene symbols used herein (including in Tables 3 and 4) are based on those found in the HUGO Gene Nomenclature Committee (HGNC) database, which is searchable on the world-wide web at www.genenames.org. Ensembl IDs are provided for each gene symbol and are searchable on the world-wide web at www.ensembl.org.

[0620] The genes provided in Tables 3 and 4 are non-limiting examples of essential genes. Although additional essential genes will be apparent to the skilled artisan based on the knowledge in the art, the suitability of a particular gene for use according to the present disclosure can be determined, e.g., as discussed herein. For example, in some embodiments, a particular essential gene can be selected by analysis of potential off-target sites elsewhere in the genome. In some embodiments, only essential genes with one or more gRNA target sites that are unique in the human genome are selected for methods described herein. In some embodiments, only essential genes with one or more gRNA target sites that are found in only one other locus in the human genome are selected for methods described herein. In some embodiments, only essential genes with one or more gRNA target sites found in only two other loci in the human genome are selected for methods described herein.
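For illustration, the selection criteria above (a gRNA target site that is unique, or that is found in only one or two other loci) can be approximated by counting exact occurrences of a target site on both strands of a reference sequence. The sketch below is a simplification: it counts exact matches only, ignores PAM requirements and mismatch-tolerant off-target sites, and uses hypothetical function names and toy sequences. A practical analysis would rely on a dedicated off-target prediction tool run against a full genome assembly.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def _count_overlapping(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, allowing overlaps."""
    count, pos = 0, text.find(pattern)
    while pos != -1:
        count += 1
        pos = text.find(pattern, pos + 1)
    return count


def count_target_sites(genome: str, target: str) -> int:
    """Exact-match count of a target site on both strands of `genome`."""
    genome, target = genome.upper(), target.upper()
    total = _count_overlapping(genome, target)
    antisense = _reverse_complement(target)
    if antisense != target:  # avoid double-counting palindromic sites
        total += _count_overlapping(genome, antisense)
    return total


def classify_target_site(genome: str, target: str) -> str:
    """Map the site count onto the categories recited in paragraph [0620]."""
    labels = {0: "absent", 1: "unique", 2: "one other locus", 3: "two other loci"}
    return labels.get(count_target_sites(genome, target), "more than two other loci")
```

For example, a toy reference containing the target once on the sense strand and once on the antisense strand would be classified as having one other locus.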

Gene product of interest

[0621] The methods, systems and cells of the present disclosure enable the integration of a gene of interest at an essential gene of a cell. The gene of interest can encode any gene product of interest. In certain embodiments, a gene product of interest comprises an antibody, an antigen, an enzyme, a growth factor, a receptor (e.g., cell surface, cytoplasmic, or nuclear), a hormone, a lymphokine, a cytokine, a chemokine, a reporter, a functional fragment of any of the above, or a combination of any of the above.

[0622] In some embodiments, sequence for a gene product of interest can include, but is not limited to, prokaryotic sequences, cDNA from eukaryotic mRNA, genomic DNA sequences from eukaryotic (e.g., mammalian) DNA, and synthetic DNA sequences. For example, a gene of interest may encode an miRNA, an shRNA, a native polypeptide (i.e. a polypeptide found in nature) or fragment thereof; a variant polypeptide (i.e. a mutant of the native polypeptide having less than 100% sequence identity with the native polypeptide) or fragment thereof; an engineered polypeptide or peptide fragment, a therapeutic peptide or polypeptide, an imaging marker, a selectable marker, a degradation signal, and the like.

[0623] In some embodiments, a gene product of interest may be, but is not limited to, e.g., a therapeutic protein or a gene product that confers a desired feature to the modified cell. In some embodiments, the transgene encodes a reporter protein, such as a fluorescent protein (e.g., as described herein) or an enzyme (e.g., luciferase and lacZ). In some embodiments, a reporter gene may aid the tracking of therapeutic cells once they are introduced to a subject.

[0624] In some embodiments, a gene product of interest may be but is not limited to therapeutic proteins such as a protein deficient in a patient. In some embodiments, for example, therapeutic proteins include, but are not limited to, those deficient in lysosomal storage disorders, such as alpha-L-iduronidase, arylsulfatase A, beta-glucocerebrosidase, acid sphingomyelinase, and alpha- and beta-galactosidase; and those deficient in hemophilia such as Factor VIII and Factor IX. Other examples of therapeutic proteins include, but are not limited to, antibodies or antibody fragments (e.g., scFv) such as those targeting pathogenic proteins (e.g., tau, alpha-synuclein, and beta-amyloid protein) and those targeting cancer cells (e.g., chimeric antigen receptors (CAR) as described herein).

[0625] In some embodiments, a gene product of interest may be a protein involved in immune regulation, or an immunomodulatory protein. In some embodiments, for example, such proteins include PD-L1, CTLA-4, M-CSF, IL-4, IL-6, IL-10, IL-11, IL-13, TGF-β1, and various isoforms thereof. By way of example, in some embodiments, a gene product of interest may be an isoform of HLA-G (e.g., HLA-G1, -G2, -G3, -G4, -G5, -G6, or -G7) or HLA-E; allogeneic cells expressing such a nonclassical MHC class I molecule may be less immunogenic and better tolerated when transplanted into a human patient who is not the source of the cells, making “universal” cell therapy possible.

[0626] In some embodiments, an exemplary gene product of interest is one that confers therapeutic value, e.g., a new therapeutic activity to the cell. In some embodiments, exemplary gene products of interest are polypeptides such as a chimeric antigen receptor (CAR) or antigen-binding fragment thereof, a T cell receptor or antigen binding fragment thereof, a non-naturally occurring variant of FcγRIII (CD16), interleukin 15 (IL-15), interleukin 15 receptor (IL-15R) or a variant thereof, interleukin 12 (IL-12), interleukin-12 receptor (IL-12R) or a variant thereof, human leukocyte antigen G (HLA-G), human leukocyte antigen E (HLA-E), leukocyte surface antigen cluster of differentiation CD47 (CD47), or any combination of two or more thereof. It is to be understood that the methods and cells of the present disclosure are not limited to any particular gene product of interest and that the selection of a gene product of interest will depend on the type of cell and ultimate use of the cells.

[0627] In some embodiments, a gene product of interest may be a cytokine. In some embodiments, expression of a cytokine from a modified cell generated using a method as described herein allows for localized dosing of the cytokine in vivo (e.g., within a subject in need thereof) and/or avoids a need to systemically administer a high-dose of the cytokine to a subject in need thereof (e.g., a lower dose of the cytokine may be administered). In some embodiments, the risk of dose-limiting toxicities associated with administering a cytokine is reduced while cytokine mediated cell functions are maintained. In some embodiments, to facilitate cell function without the need to additionally administer high-doses of soluble cytokines, a partial or full peptide of one or more of IL2, IL4, IL6, IL7, IL9, IL10, IL11, IL12, IL15, IL18, IL21, IFN-α, IFN-β and/or their respective receptor is introduced to the cell to enable cytokine signaling with or without the expression of the cytokine itself, thereby maintaining or improving cell growth, proliferation, expansion, and/or effector function with reduced risk of cytokine toxicities. In some embodiments, the introduced cytokine and/or its respective native or modified receptor for cytokine signaling are expressed on the cell surface. In some embodiments, the cytokine signaling is constitutively activated. In some embodiments, the activation of the cytokine signaling is inducible. In some embodiments, the activation of the cytokine signaling is transient and/or temporal. In some embodiments, a gene product of interest can be IL2, IL3, IL4, IL6, IL7, IL9, IL10, IL11, IL12, IL13, IL15, IL21, GM-CSF, IFN-α, IFN-β, IFN-γ, erythropoietin, and/or the respective cytokine receptor. In some embodiments, a gene product of interest can be CCL3, TNFα, CCL23, IL2RB, IL12RB2, or IRF7.

[0628] In some embodiments, a gene product of interest can be a chemokine and/or the respective chemokine receptor. In some embodiments, a chemokine receptor can be, but is not limited to, CCR2, CCR5, CCR8, CX3C1, CX3CR1, CXCR1, CXCR2, CXCR3A, CXCR3B, or CXCR2. In some embodiments, a chemokine can be, but is not limited to, CCL7, CCL19, or CXCL14.

[0629] As used herein, the term “chimeric antigen receptor” or “CAR” refers to a receptor protein that has been modified to give cells expressing the CAR the new ability to target a specific protein. Within the context of the disclosure, a cell modified to comprise a CAR or an antigen binding fragment may be used for immunotherapy to target and destroy cells associated with a disease or disorder, e.g., cancer cells. In some embodiments, the CAR can bind to any antigen of interest.

[0630] CARs of interest can include, but are not limited to, a CAR targeting mesothelin, EGFR, HER2 and/or MICA/B. To date, mesothelin-targeted CAR T-cell therapy has shown early evidence of efficacy in a phase I clinical trial of subjects having mesothelioma, non-small cell lung cancer, and breast cancer (NCT02414269). Similarly, CARs targeting EGFR, HER2 and MICA/B have shown promise in early studies (see, e.g., Li et al. (2018), Cell Death & Disease, 9(177); Han et al. (2018) Am. J. Cancer Res., 8(1):106-119; and Demoulin (2017) Future Oncology, 13(8); the entire contents of each of which are expressly incorporated herein by reference in their entireties).

[0631] CARs are well-known to those of ordinary skill in the art and include those described in, for example: WO13/063419 (mesothelin), WO15/164594 (EGFR), WO13/063419 (HER2), WO16/154585 (MICA and MICB), the entire contents of each of which are expressly incorporated herein by reference in their entireties. In some embodiments, a gene product of interest is any suitable CAR, NK cell specific CAR (NK-CAR), T cell specific CAR, or other binder that targets a cell, e.g., an NK cell, to a target cell, e.g., a cell associated with a disease or disorder, and may be expressed in the modified cells provided herein. Exemplary CARs, and binders, include, but are not limited to, bi-specific antigen binding CARs, switchable CARs, dimerizable CARs, split CARs, multi-chain CARs, inducible CARs, CARs and binders that bind BCMA, androgen receptor, PSMA, PSCA, Muc1, HPV viral peptides (i.e., E7), EBV viral peptides, WT1, CEA, EGFR, EGFRvIII, IL13Rα2, GD2, CA125, EpCAM, Muc16, carbonic anhydrase IX (CAIX), CCR1, CCR4, carcinoembryonic antigen (CEA), CD3, CD5, CD7, CD10, CD19, CD20, CD22, CD23, CD24, CD26, CD30, CD33, CD34, CD35, CD38, CD41, CD44, CD44V6, CD49f, CD56, CD70, CD92, CD99, CD123, CD133, CD135, CD148, CD150, CD261, CD362, CLEC12A, MDM2, CYP1B, livin, cyclin 1, NKp30, NKp46, DNAM1, NKp44, CA9, PD1, PDL1, an antigen of cytomegalovirus (CMV), epithelial glycoprotein-40 (EGP-40), GPRC5D, receptor tyrosine kinases erb-B2,3,4, EGFIR, ERBB folate binding protein (FBP), fetal acetylcholine receptor (AChR), folate receptor-α, ganglioside G3 (GD3), human Epidermal Growth Factor Receptor 2 (HER-2), human telomerase reverse transcriptase (hTERT), ICAM-1, Integrin B7, Interleukin-13 receptor subunit alpha-2 (IL-13Rα2), K-light chain, kinase insert domain receptor (KDR), Lewis A (CA19.9), Lewis Y (LeY), L1 cell adhesion molecule (L1-CAM), LILRB2, melanoma
antigen family A1 (MAGE-A1), MICA/B, Mucin 16 (Muc-16), NKCSI, NKG2D ligands, c-Met, cancer-testis antigen NYESO-1, oncofetal antigen (h5T4), PRAME, prostate stem cell antigen (PSCA), PRAME, prostate-specific membrane antigen (PSMA), tumor-associated glycoprotein 72 (TAG-72), TIM-3, TRBC1, TRBC2, vascular endothelial growth factor R2 (VEGF-R2), Wilms tumor protein (WT-1), a pathogen antigen, or any suitable combination thereof. Additional suitable CARs and binders for use in the modified cells provided herein will be apparent to those of skill in the art based on the present disclosure and the general knowledge in the art. Such additional suitable CARs include those described in Figure 3 of Davies and Maher, Adoptive T-cell Immunotherapy of Cancer Using Chimeric Antigen Receptor-Grafted T Cells, Archivum Immunologiae et Therapiae Experimentalis 58(3):165-78 (2010), the entire contents of which are incorporated herein by reference. Additional CARs suitable for methods described herein include: CD171-specific CARs (Park et al., Mol Ther (2007) 15(4):825-833), EGFRvIII-specific CARs (Morgan et al., Hum Gene Ther (2012) 23(10):1043-1053), EGF-R-specific CARs (Kobold et al., J Natl Cancer Inst (2014) 107(1):364), carbonic anhydrase IX-specific CARs (Lamers et al., Biochem Soc Trans (2016) 44(3):951-959), FR-α-specific CARs (Kershaw et al., Clin Cancer Res (2006) 12(20):6106-6015), HER2-specific CARs (Ahmed et al., J Clin Oncol (2015) 33(15):1688-1696; Nakazawa et al., Mol Ther (2011) 19(12):2133-2143; Ahmed et al., Mol Ther (2009) 17(10):1779-1787; Luo et al., Cell Res (2016) 26(7):850-853; Morgan et al., Mol Ther (2010) 18(4):843-851; Grada et al., Mol Ther Nucleic Acids (2013) 9(2):32), CEA-specific CARs (Katz et al., Clin Cancer Res (2015) 21(14):3149-3159), IL13Rα2-specific CARs (Brown et al., Clin Cancer Res (2015) 21(18):4062-4072), GD2-specific CARs (Louis et al., Blood (2011) 118(23):6050-6056; Caruana et al., Nat Med (2015) 21(5):524-529), ErbB2-specific CARs (Wilkie et al., J Clin Immunol (2012) 32(5):1059-1070), VEGF-R-specific CARs (Chinnasamy et al., Cancer Res (2016) 22(2):436-447), FAP-specific CARs (Wang et al., Cancer Immunol Res (2014) 2(2):154-166), MSLN-specific CARs (Moon et al., Clin Cancer Res (2011) 17(14):4719-30), and CD19-specific CARs (Axicabtagene ciloleucel (Yescarta®) and Tisagenlecleucel (Kymriah®)). See also, Li et al., J Hematol and Oncol (2018) 11(22), reviewing clinical trials of tumor-specific CARs.

[0632] As used herein, the term “CD16” refers to a receptor (FcγRIII) for the Fc portion of immunoglobulin G, and it is involved in the removal of antigen-antibody complexes from the circulation, as well as other antibody-dependent responses. In some embodiments, a CD16 protein is an hCD16 variant. In some embodiments, an hCD16 variant is a high affinity F158V variant.

[0633] In some embodiments, a gene product of interest comprises a high affinity non-cleavable CD16 (hnCD16) or a variant thereof. In some embodiments, a high affinity non-cleavable CD16 or a variant thereof comprises at least any one of the following: (a) F176V and S197P in the ectodomain of CD16 (see e.g., Jing et al., Identification of an ADAM17 Cleavage Region in Human CD16 (FcγRIII) and the Engineering of a Non-Cleavable Version of the Receptor in NK Cells; PLOS One, 2015); (b) a full or partial ectodomain originated from CD64; (c) a non-native (or non-CD16) transmembrane domain; (d) a non-native (or non-CD16) intracellular domain; (e) a non-native (or non-CD16) signaling domain; (f) a non-native stimulatory domain; and (g) transmembrane, signaling, and stimulatory domains that are not originated from CD16, and are originated from a same or different polypeptide. In some embodiments, the non-native transmembrane domain is derived from CD3D, CD3E, CD3G, CD3ζ, CD4, CD8, CD8a, CD8b, CD27, CD28, CD40, CD84, CD166, 4-1BB, OX40, ICOS, ICAM-1, CTLA-4, PD-1, LAG-3, 2B4, BTLA, CD16, IL7, IL12, IL15, KIR2DL4, KIR2DS1, NKp30, NKp44, NKp46, NKG2C, NKG2D, or T cell receptor (TCR) polypeptide. In some embodiments, the non-native stimulatory domain is derived from CD27, CD28, 4-1BB, OX40, ICOS, PD-1, LAG-3, 2B4, BTLA, DAP10, DAP12, CTLA-4, or NKG2D polypeptide. In some other embodiments, the non-native signaling domain is derived from CD3ζ, 2B4, DAP10, DAP12, DNAM1, CD137 (41BB), IL21, IL7, IL12, IL15, NKp30, NKp44, NKp46, NKG2C, or NKG2D polypeptide. In some particular embodiments of a hnCD16 variant, the non-native transmembrane domain is derived from NKG2D, the non-native stimulatory domain is derived from 2B4, and the non-native signaling domain is derived from CD3ζ. In some embodiments, a gene product of interest comprises a high affinity cleavable CD16 (hnCD16) or a variant thereof. In some embodiments, a high affinity cleavable CD16 or a variant thereof comprises at least F176V. In some embodiments, a high affinity cleavable CD16 or a variant thereof does not comprise an S197P amino acid substitution.

[0634] As used herein, the term “IL-15/IL15RA” or “Interleukin-15” (IL-15) refers to a cytokine with structural similarity to Interleukin-2 (IL-2). Like IL-2, IL-15 binds to and signals through a complex composed of IL-2/IL-15 receptor beta chain (CD122) and the common gamma chain (gamma-C, CD132). IL-15 is secreted by mononuclear phagocytes (and some other cells) following infection by virus(es). This cytokine induces cell proliferation of natural killer cells. IL-15 Receptor alpha (IL15RA) specifically binds IL-15 with very high affinity, and is capable of binding IL-15 independently of other subunits (see e.g., Mishra et al., Molecular pathways: Interleukin-15 signaling in health and in cancer, Clinical Cancer Research, 2014). It is suggested that this property allows IL-15 to be produced by one cell, endocytosed by another cell, and then presented to a third party cell. IL15RA is reported to enhance cell proliferation and expression of apoptosis inhibitor BCL2L1/BCL2-XL and BCL2. Exemplary sequences of IL-15 are provided in NG_029605.2, and exemplary sequences of IL-15RA are provided in NM_002189.4. In some embodiments, the IL-15R variant is a constitutively active IL-15R variant. In some embodiments, the constitutively active IL-15R variant is a fusion between IL-15R and an IL-15R agonist, e.g., an IL-15 protein or IL-15R-binding fragment thereof. In some embodiments, the IL-15R agonist is IL-15, or an IL-15R-binding variant thereof. Exemplary suitable IL-15R variants include, without limitation, those described, e.g., in Mortier E et al., The Journal of Biological Chemistry 2006 281:1612-1619; or in Bessard A et al., Mol Cancer Ther. 2009 Sep;8(9):2736-45, the entire contents of each of which are incorporated by reference herein. In some embodiments, membrane bound trans-presentation of IL-15 is a more potent activation pathway than soluble IL-15 (see e.g., Imamura et al., Autonomous growth and increased cytotoxicity of natural killer cells expressing membrane-bound interleukin-15, Blood, 2014). In some embodiments, IL-15R expression comprises: IL15 and IL15Ra expression using a self-cleaving peptide; a fusion protein of IL15 and IL15Ra; an IL15/IL15Ra fusion protein with intracellular domain of IL15Ra truncated; a fusion protein of IL15 and membrane bound Sushi domain of IL15Ra; a fusion protein of IL15 and IL15Rβ; a fusion protein of IL15 and common receptor γC, wherein the common receptor γC is native or modified; and/or a homodimer of IL15Rβ.

[0635] As used herein, the term “IL-12” refers to interleukin-12, a cytokine that acts on T and natural killer cells. In some embodiments, a genetically engineered stem cell and/or progeny cell comprises a genetic modification that leads to expression of one or more of an interleukin 12 (IL12) pathway agonist, e.g., IL-12, interleukin 12 receptor (IL-12R) or a variant thereof (e.g., a constitutively active variant of IL-12R, e.g., an IL-12R fused to an IL-12R agonist (IL-12RA)).

[0636] In some embodiments, the gene product of interest comprises a protein or polypeptide whose expression within a cell, e.g., a cell modified as described herein, enables the cell to inhibit or evade immune rejection after transplant or engraftment into a subject. In some embodiments, the gene product of interest is HLA-E, HLA-G, CTL4, CD47, or an associated ligand.

[0637] In some embodiments, the gene product of interest is a T cell receptor (TCR) or an antigen-binding fragment thereof, e.g., a recombinant TCR. In some embodiments, the recombinant TCR can bind to an antigen of interest, e.g., an antigen selected from, but not limited to, CD279, CD2, CD95, CD152, CD223, CD272, TIM3, KIR, A2aR, SIRPα, CD200, CD200R, CD300, LPA5, NY-ESO, PD1, PDL1, or MAGE-A3/A6. In some embodiments, the TCR or antigen-binding fragment thereof can bind to a viral antigen, e.g., an antigen from hepatitis A, hepatitis B, hepatitis C (HCV), human papilloma virus (HPV) (e.g., HPV-16 (such as HPV-16 E6 or HPV-16 E7), HPV-18, HPV-31, HPV-33, or HPV-35), Epstein-Barr virus (EBV), human herpes virus 8 (HHV-8), human T-cell leukemia virus-1 (HTLV-1), human T-cell leukemia virus-2 (HTLV-2) or a cytomegalovirus (CMV).

[0638] In some embodiments, the gene product of interest comprises a single-chain variable fragment that can bind to CD47, PD1, CTLA4, CD28, OX40, 4-1BB, and ligands thereof.

[0639] As used herein, the term “HLA-G” refers to the HLA non-classical class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. HLA-G is expressed on fetal derived placental cells. HLA-G is a ligand for NK cell inhibitory receptor KIR2DL4, and therefore expression of this HLA by the trophoblast defends it against NK cell-mediated death. See e.g., Favier et al., Tolerogenic Function of Dimeric Forms of HLA-G Recombinant Proteins: A Comparative Study In Vivo, PLOS One 2011, the entire contents of which are incorporated herein by reference. An exemplary sequence of HLA-G is set forth as NG_029039.1.

[0640] As used herein, the term “HLA-E” refers to the HLA class I histocompatibility antigen, alpha chain E, also sometimes referred to as MHC class I antigen E. The HLA-E protein in humans is encoded by the HLA-E gene. The human HLA-E is a non-classical MHC class I molecule that is characterized by a limited polymorphism and a lower cell surface expression than its classical paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. HLA-E binds a restricted subset of peptides derived from the leader peptides of other class I molecules. HLA-E expressing cells escape allogeneic responses and lysis by NK cells. See e.g., Gornalusse G et al., Nature Biotechnology 2017 35(8), the entire contents of which are incorporated herein by reference. Exemplary sequences of the HLA-E protein are provided in NM_005516.6.

[0641] As used herein, the term “CD47,” also sometimes referred to as “integrin associated protein” (IAP), refers to a transmembrane protein that in humans is encoded by the CD47 gene. CD47 belongs to the immunoglobulin superfamily, partners with membrane integrins, and also binds the ligands thrombospondin-1 (TSP-1) and signal-regulatory protein alpha (SIRPα). CD47 acts as a signal to macrophages that allows CD47-expressing cells to escape macrophage attack. See, e.g., Deuse T et al., Nature Biotechnology 2019 37: 252-258, the entire contents of which are incorporated herein by reference.

[0642] In some embodiments, a gene product of interest comprises a chimeric switch receptor (see e.g., WO2018094244A1 - TGFBeta Signal Converter; Ankri et al., Human T cells Engineered to express a programmed death 1/28 costimulatory retargeting molecule display enhanced antitumor activity, The Journal of Immunology, October 15, 2013, 191; Roth et al., Pooled knockin targeting for genome engineering of cellular immunotherapies, Cell. 2020 Apr 30;181(3):728-744.e21; and Boyerinas et al., A Novel TGF-β2/Interleukin Receptor Signal Conversion Platform That Protects CAR/TCR T Cells from TGF-β2-Mediated Immune Suppression and Induces T Cell Supportive Signaling Networks, Blood, 2017). In some embodiments, chimeric switch receptors are engineered cell-surface receptors comprising an extracellular domain from an endogenous cell-surface receptor and a heterologous intracellular signaling domain, such that ligand recognition by the extracellular domain results in activation of a different signaling cascade than that activated by the wild type form of the cell-surface receptor. In some embodiments, a chimeric switch receptor comprises an extracellular domain of an inhibitory cell-surface receptor fused to an intracellular domain that leads to the transmission of an activating signal rather than the inhibitory signal normally transduced by the inhibitory cell-surface receptor. In some embodiments, extracellular domains derived from cell-surface receptors known to inhibit immune effector cell activation can be fused to activating intracellular domains. In such an embodiment, engagement of the corresponding ligand may then activate signaling cascades that increase, rather than inhibit, the activation of the immune effector cell.

For example, in some embodiments, a gene product of interest is a PD1-CD28 switch receptor, wherein the extracellular domain of PD1 is fused to the intracellular signaling domain of CD28 (see e.g., Liu et al., Cancer Res 76:6 (2016), 1578-1590 and Moon et al., Molecular Therapy 22 (2014), S201). In some embodiments, a gene product of interest is or comprises the extracellular domain of CD200R and the intracellular signaling domain of CD28 (see Oda et al., Blood 130:22 (2017), 2410-2419).

[0643] In some embodiments, a gene product of interest is a reporter gene (e.g., GFP, mCherry, etc.). In some embodiments, a reporter gene is utilized to confirm the suitability of a knock-in cassette’s expression capacity. In certain embodiments, a gene product of interest may be a colored or fluorescent protein such as: blue/UV proteins, e.g. TagBFP, mTagBFP2, Azurite, EBFP2, mKalama1, Sirius, Sapphire, T-Sapphire; cyan proteins, e.g. ECFP, Cerulean, SCFP3A, mTurquoise, mTurquoise2, monomeric Midoriishi-Cyan, TagCFP, mTFP1; green proteins, e.g. EGFP, Emerald, Superfolder GFP, Monomeric Azami Green, TagGFP2, mUKG, mWasabi, Clover, mNeonGreen; yellow proteins, e.g. EYFP, Citrine, Venus, SYFP2, TagYFP; orange proteins, e.g. Monomeric Kusabira-Orange, mKOκ, mKO2, mOrange, mOrange2; red proteins, e.g. mRaspberry, mStrawberry, mTangerine, tdTomato, TagRFP, TagRFP-T, mApple, mRuby, mRuby2; far-red proteins, e.g. mPlum, HcRed-Tandem, mKate2, mNeptune, NirFP; near-IR proteins, e.g. TagRFP657, IFP1.4, iRFP; long stokes shift proteins, e.g. mKeima Red, LSS-mKate1, LSS-mKate2, mBeRFP; photoactivatable proteins, e.g. PA-GFP, PAmCherry1, PATagRFP; photoconvertible proteins, e.g. Kaede (green), Kaede (red), KikGR1 (green), KikGR1 (red), PS-CFP2, PS-CFP2, mEos2 (green), mEos2 (red), mEos3.2 (green), mEos3.2 (red), PSmOrange, PSmOrange; photoswitchable proteins, e.g. Dronpa; and combinations thereof.

[0644] In some embodiments, a gene of interest provided herein can optionally include a sequence encoding a destabilizing domain (“a destabilizing sequence”) for temporal and/or spatial control of protein expression. Non-limiting examples of destabilizing sequences include sequences encoding a FK506 sequence, a dihydrofolate reductase (DHFR) sequence, or other exemplary destabilizing sequences.

[0645] In the absence of a stabilizing ligand, a protein sequence operatively linked to a destabilizing sequence is degraded by ubiquitination. In contrast, in the presence of a stabilizing ligand, protein degradation is inhibited, thereby allowing the protein sequence operatively linked to the destabilizing sequence to be actively expressed. As a positive control for stabilization of protein expression, protein expression can be detected by conventional means, including enzymatic, radiographic, colorimetric, fluorescence, or other spectrographic assays; fluorescence-activated cell sorting (FACS) assays; and immunological assays (e.g., enzyme linked immunosorbent assay (ELISA), radioimmunoassay (RIA), and immunohistochemistry).

[0646] Additional examples of destabilizing sequences are known in the art. In some embodiments, the destabilizing sequence is a FK506- and rapamycin-binding protein (FKBP12) sequence, and the stabilizing ligand is Shield-1 (Shld1) (Banaszynski et al. (2006) Cell 126(5): 995-1004, which is incorporated in its entirety herein by reference). In some embodiments, a destabilizing sequence is a DHFR sequence, and a stabilizing ligand is trimethoprim (TMP) (Iwamoto et al. (2010) Chem Biol 17:981-988, which is incorporated in its entirety herein by reference). In some embodiments, a destabilizing domain is small molecule-assisted shutoff (SMASh), where a constitutive degron with a protease and its corresponding cleavage site derived from hepatitis C virus are combined. In some embodiments, a destabilizing domain comprises a HaloTag system, dTag system, and/or nanobody (see e.g., Luh et al., Prey for the proteasome: targeted protein degradation - a medicinal chemist’s perspective; Angewandte Chemie, 2020).

[0647] In some embodiments, a destabilizing sequence can be used to temporally control a cell modified as described herein.

[0648] In some embodiments, a gene product of interest may be a suicide gene (see e.g., Zarogoulidis et al., Suicide Gene Therapy for Cancer - Current Strategies; J Genet Syndr Gene Ther. 2013). In some embodiments, a suicide gene can use a gene-directed enzyme prodrug therapy (GDEPT) approach, a dimerization inducing approach, and/or therapeutic monoclonal antibody mediated approach. In some embodiments, a suicide gene is biologically inert, has an adequate bio-availability profile, an adequate bio-distribution profile, and can be characterized by intrinsically acceptable toxicity and/or absence of toxicity. In some embodiments, a suicide gene codes for a protein able to convert, at a cellular level, a non-toxic prodrug into a toxic product. In some embodiments, a suicide gene may improve the safety profile of a cell described herein (see e.g., Greco et al., Improving the safety of cell therapy with the TK-suicide gene; Front Pharmacology. 2015; Jones et al., Improving the safety of cell therapy products by suicide gene transfer; Frontiers Pharmacology, 2014). In some embodiments, a suicide gene is a herpes simplex virus thymidine kinase (HSV-TK). In some embodiments, a suicide gene is a cytosine deaminase (CD). In some embodiments, a suicide gene is an apoptotic gene (e.g., a caspase). In some embodiments, a suicide gene is dimerization inducing, e.g., comprising an inducible FAS (iFAS) or inducible Caspase9 (iCasp9)/AP1903 system. In some embodiments, a suicide gene is a CD20 antigen, and cells expressing such an antigen can be eliminated by clinical-grade anti-CD20 antibody administration. In some embodiments, a suicide gene is a truncated human EGFR polypeptide (huEGFRt) which confers sensitivity to a pharmaceutical-grade anti-EGFR monoclonal antibody, e.g., cetuximab. In some embodiments, a suicide gene is a c-myc tag, which confers sensitivity to pharmaceutical-grade anti-c-myc antibodies.

[0649] In some embodiments, a gene product of interest may be a safety switch signal. In cell therapy, a safety switch can be used to stop proliferation of the genetically modified cells when their presence in the patient is not desired, for example, if the cells do not function properly, if planned therapeutic interventions change, or if the therapeutic goal has been achieved. In some embodiments, a safety switch may, for example, be a so-called suicide gene, or suicide switch, which upon administration of a pharmaceutical compound to the patient, will be activated or inactivated such that the cells enter apoptosis. Suicide genes, sometimes called suicide switches or safety switches, can be triggered or activated by a cellular event, environmental event or chemical agent resulting in a cellular response by cells that have the suicide gene incorporated in their genome. In some embodiments, activation of a safety switch induces cellular apoptosis. In some embodiments, activation of the safety switch inhibits growth of cells incorporated with the safety switch. In some embodiments, a suicide switch may encode an enzyme not found in humans (e.g., a bacterial or viral enzyme) that converts a harmless substance into a toxic metabolite in the human cell. Examples of suicide switches include, without limitation, genes for thymidine kinases, cytosine deaminases, intracellular antibodies, telomerases, toxins, caspases (e.g., iCaspase9), HSV-TK, and DNases. In some embodiments, the suicide gene may be a thymidine kinase (TK) gene from the Herpes Simplex Virus (HSV) and the suicide TK gene becomes toxic to the cell upon administration of ganciclovir, valganciclovir, famciclovir, or the like to the patient.

[0650] In some embodiments, a safety switch may be a rapamycin-inducible human Caspase 9-based (RapaCasp9) cellular suicide switch in which a truncated caspase 9 gene, which has its CARD domain removed, is linked after either the FRB (FKBP12-rapamycin binding) domain of mTOR, or FKBP12 (FK506-binding protein 12). Addition of the drug rapamycin enables heterodimerization of FRB and FKBP12 which subsequently causes homodimerization of truncated caspase 9 and induction of apoptosis. In some embodiments, using a two construct and/or biallelic approach as described herein, FRB and FKBP12 are separated onto different alleles by incorporating two donor constructs, one with one or more transgenes plus FRB, the other with one or more transgenes plus FKBP12. When referring to a safety switch in this application, it should be interpreted to include all components necessary for the function of the safety switch (e.g., FRB domain and FKBP12 domain and truncated caspase 9 gene are all components of, and make up, the safety switch).

SEQ ID NO: 160 - Exemplary DHFR destabilizing amino acid sequence

MISLIAALAVDYVIGMENAMPWNLPADLAWFKRNTLNKPVIMGRHTWESIGRPLPGRKNIILSSQPSTDDRVTWVKSVDEAIAACGDVPEIMVIGGGRVIEQFLPKAQKLYLTHIDAEVEGDTHFPDYEPDDWESVFSEFHDADAQNSHSYCFEILERR

SEQ ID NO: 161 - Exemplary DHFR destabilizing nucleotide sequence

GGTACCATCAGTCTGATTGCGGCGTTAGCGGTAGATTACGTTATCGGCATGGAAAACGCCATGC
CGTGGAACCTGCCTGCCGATCTCGCCTGGTTTAAACGCAACACCTTAAATAAACCCGTGATTAT
GGGCCGCCATACCTGGGAATCAATCGGTCGTCCGTTGCCAGGACGCAAAAATATTATCCTCAGC
AGTCAACCGAGTACGGACGATCGCGTAACGTGGGTGAAGTCGGTGGATGAAGCCATCGCGGCGT
GTGGTGACGTACCAGAAATCATGGTGATTGGCGGCGGTCGCGTTATTGAACAGTTCTTGCCAAA
AGCGCAAAAACTGTATCTGACGCATATCGACGCAGAAGTGGAAGGCGACACCCATTTCCCGGAT
TACGAGCCGGATGACTGGGAATCGGTATTCAGCGAATTCCACGATGCTGATGCGCAGAACTCTC
ACAGCTATTGCTTTGAGATTCTGGAGCGGCGATAA

SEQ ID NO: 162 - Exemplary destabilizing domain

ATCAGTCTGATTGCGGCGTTAGCGGTAGATTACGTTATCGGCATGGAAAACGCCATG CCGTGGA ACCTGCCTGCCGATCTCGCCTGGTTTAAACGCAACACCTTAAATAAACCCGTGATTATGG GCCG CCATACCTGGGAATCAATCGGTCGTCCGTTGCCAGGACGCAAAAATATTATCCTCAGCAG TCAA CCGAGTACGGACGATCGCGTAACGTGGGTGAAGTCGGTGGATGAAGCCATCGCGGCGTGT GGTG ACGTACCAGAAATCATGGTGATTGGCGGCGGTCGCGTTATTGAACAGTTCTTGCCAAAAG CGCA AAAACTGTATCTGACGCATATCGACGCAGAAG TGGAAGGCGACACCCATTTCCCGGATTACGAG CCGGATGACTGGGAATCGGTATTCAGCGAATTCCACGATGCTGATGCGCAGAACTCTCAC AGCT ATTGCTTTGAGATTCTGGAGCGGCGA
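As an illustrative cross-check (not part of the disclosed methods), a nucleotide listing such as this one can be stripped of extraction whitespace and translated with the standard genetic code to see whether it is consistent with an amino acid listing. The Python sketch below uses only the first 30 nucleotides of SEQ ID NO: 162 as printed above; the helper names `clean` and `translate` are hypothetical and introduced here for illustration.

```python
# Illustrative sketch only; helper names are assumptions, not from the source.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, with codons enumerated in T, C, A, G order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq: str) -> str:
    """Remove all whitespace (spaces, line breaks) and upper-case the listing."""
    return "".join(seq.split()).upper()


def translate(seq: str) -> str:
    """Translate frame 1 of a DNA listing; '*' marks a stop codon."""
    seq = clean(seq)
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First 30 nucleotides of SEQ ID NO: 162, with spaces as a scraped page might show.
prefix_162 = "ATCAGTCTGA TTGCGGCGTT AGCGGTAGAT"
print(translate(prefix_162))  # prints ISLIAALAVD
```

The translated prefix corresponds to residues 2 to 11 of SEQ ID NO: 160 as printed (the residues following the initial methionine), which suggests the two listings describe the same DHFR-derived domain. A full comparison would apply the same function to the complete listings.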

SEQ ID NO: 163 - Exemplary FKBP12 destabilizing peptide amino acid sequence

MGVEKQVIRPGNGPKPAPGQTVTVHCTGFGKDGDLSQKFWSTKDEGQKPFSFQIGKGAVIKGWD
EGVIGMQIGEVARLRCSSDYAYGAGGFPAWGIQPNSVLDFEIEVLSVQ

[0651] In some embodiments, a coding sequence for a single gene product of interest may be included in a knock-in cassette. In some embodiments, coding sequences for two gene products of interest may be included in a single knock-in cassette; in some embodiments, this may be referred to as a bicistronic or multicistronic construct. In some embodiments, coding sequences for more than two gene products of interest may be included in a single knock-in cassette; in some embodiments, this may be referred to as a multicistronic construct. In some embodiments, when more than one coding sequence for more than one gene product of interest is included in a knock-in cassette, these sequences may have a linker sequence connecting them. Linker sequences are generally known in the art; an exemplary linker sequence is identified in SEQ ID NO: 164. In some embodiments, where more than one coding sequence for more than one gene product of interest is included in a knock-in cassette, these sequences may be connected by a linker sequence, an IRES, and/or 2A element.

[0652] In some embodiments, an oligonucleotide encoding a gene product of interest comprises or consists of the sequence of any one of SEQ ID NOs: 161, 162, or 164-182. In some embodiments, a gene product of interest comprises or consists of a sequence that is at least 85%, 90%, 95%, 98% or 99% identical to any one of SEQ ID NOs: 161, 162, or 164-182.
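The percent-identity thresholds above can be made concrete with a small calculation. This passage does not state which alignment method or denominator is intended, so the following Python sketch makes its own assumptions: a simple Needleman-Wunsch global alignment with hypothetical scores (match +1, mismatch -1, gap -1), and identity computed as matching columns divided by total alignment columns. The function names are illustrative and not taken from the source.

```python
# Illustrative sketch only; scoring scheme and identity definition are assumptions.
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned columns as a percentage of all alignment columns."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    aligned_a, aligned_b = global_align(a, b)
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(a: str, b: str, threshold: float) -> bool:
    """True if the sequences are at least `threshold` percent identical."""
    return percent_identity(a, b) >= threshold


print(percent_identity("ACGT", "ACGA"))       # prints 75.0
print(meets_threshold("ACGT", "ACGA", 85.0))  # prints False
```

Other conventions (for example, dividing by the length of the reference sequence, or using affine gap penalties) give different values for the same pair of sequences, so the convention should be fixed before a threshold such as 85% is applied.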

SEQ ID NO: 164 - exemplary linker sequence

TCTGGCGGAGGAAGCGGAGGCGGAGGATCTGGTGGTGGTGGATCTGGCGGCGGTGGTAGTGGCG
GAGGTTCTCTGCAA
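When coding sequences are connected by a linker such as SEQ ID NO: 164, the joined segments need to remain in one reading frame. The Python sketch below illustrates one way to check this: each segment must be a whole number of codons, and no segment other than the last may contain an in-frame stop codon. The linker is copied from the listing above; the `upstream` and `downstream` segments are short hypothetical placeholders, not sequences from this document, and the function names are illustrative.

```python
# Illustrative sketch only; placeholder segments and helper names are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def clean(seq: str) -> str:
    """Remove whitespace and upper-case a nucleotide listing."""
    return "".join(seq.split()).upper()


def codons(seq: str) -> list:
    """Split a cleaned sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def join_in_frame(parts) -> str:
    """Concatenate segments, rejecting frameshifts and premature stops."""
    cleaned = [clean(p) for p in parts]
    for idx, part in enumerate(cleaned):
        if len(part) % 3 != 0:
            raise ValueError(f"segment {idx} is not a whole number of codons")
        is_last = idx == len(cleaned) - 1
        # The final codon of the final segment is allowed to be a stop codon.
        body = codons(part)[:-1] if is_last else codons(part)
        if any(c in STOP_CODONS for c in body):
            raise ValueError(f"segment {idx} has a premature in-frame stop codon")
    return "".join(cleaned)


# SEQ ID NO: 164 as printed, including a stray extraction space and line break.
LINKER_164 = """
TCTGGCGGAGGAAGCGGAGGCGGAGGATCTGGTGGTGGTGGATCTGGCGGCGGTGGT AGTGGCG
GAGGTTCTCTGCAA
"""

upstream = "ATGGCCAAC"    # hypothetical placeholder for a first coding sequence
downstream = "ATCACCTAA"  # hypothetical placeholder ending in a stop codon

fusion = join_in_frame([upstream, LINKER_164, downstream])
print(len(clean(LINKER_164)), len(fusion))  # prints 78 96
```

The linker is 78 nucleotides (26 codons) long, so inserting it between two in-frame segments does not shift the frame of the downstream segment.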

SEQ ID NO: 165 - exemplary CD16 knock-in cassette sequence

ATGTGGCAACTGCTGCTGCCTACAGCTCTGCTGCTTCTGGTGTCTGCCGGCATGAGA ACCGAGG ATCTGCCTAAGGCCGTGGTGTTCCTGGAACCTCAGTGGTACAGAGTGCTGGAAAAGGACA GCGT GACCCTGAAGTGCCAGGGCGCCTATTCTCCCGAGGACAATAGCACCCAGTGGTTCCACAA CGAG AGCCTGATCAGCAGCCAGGCCAGCAGCTACTTTATCGATGCCGCCACCGTGGACGACAGC GGCG AGTACAGATGCCAGACCAATCTGAGCACCCTGAGCGACCCTGTGCAGCTGGAAGTGCACA TTGG ATGGTTGCTGCTGCAAGCCCCTAGATGGGTGTTCAAAGAAGAGGACCCCATCCACCTGAG ATGC C AC T C T T G GAAGAAC AC AG C C C T G C AC AAAG T GAC C T AC C T G C AGAAC G G C AAG G G C AGAAAG T ACTTCCACCACAACAGCGACTTCTACATCCCCAAGGCCACACTGAAGGACTCCGGCTCCT ACTT C T G C AGAG GCCTGGTCGG C AG C AAGAAC G T G T C C AG C GAGAC AG T GAAC AT C AC CAT C AC AC AG GGCCTCGCCGTGTCTACCATCAGCAGCTTTTTCCCACCTGGCTATCAGGTGTCCTTCTGC CTGG TCATGGTGCTGCTGTTCGCCGTGGATACCGGCCTGTACTTCAGCGTCAAGACCAACATCC GGTC C AG C AC C AGAGAC T G GAAG GAC C AC AAG T T C AAG T G G C G GAAG GAC C C T C AG GAC AAG T AA

SEQ ID NO: 166 - exemplary CD16 knock-in cassette sequence

SEQ ID NO: 167 - exemplary CD47 knock-in cassette sequence

ATGTGGCCCCTGGTAGCGGCGCTGTTGCTGGGCTCGGCGTGCTGCGGATCAGCTCAG CTACTAT TTAATAAAACAAAATCTGTAGAATTCACGTTTTGTAATGACACTGTCGTCATTCCATGCT TTGT TAC T AAT AT G GAG G C AC AAAAC AC TAC T GAAG TAT AC GT AAAG T GGAAAT T T AAAG GAAGAGAT ATTTACACCTTTGATGGAGCTCTAAACAAGTCCACTGTCCCCACTGACTTTAGTAGTGCA AAAA T T GAAG T C T CACAAT TAC T AAAAG GAGAT G C C T C T T T GAAGAT G GAT AAGAG T GAT GC T G T C T C AC AC AC AG GAAAC TAC AC T T G T GAAG TAACAGAAT T AAC C AGAGAAG G T GAAAC GAT CAT C GAG CTAAAATATCGTGTTGTTTCATGGTTTTCTCCAAATGAAAATATTCTTATTGTTATTTTC CCAA TTTTTGCTATACTCCTGTTCTGGGGACAGTTTGGTATTAAAACACTTAAATATAGATCCG GTGG TATGGATGAGAAAACAATTGCTTTACTTGTTGCTGGACTAGTGATCACTGTCATTGTCAT TGTT GGAGCCATTCTTTTCGTCCCAGGTGAATATTCATTAAAGAATGCTACTGGCCTTGGTTTA ATTG T GAC T T C T AC AG G GAT AT T AAT AT T AC T T CAC T AC TAT G T G T T TAG T ACAGC GAT TGGAT T AAC CTCCTTCGTCATTGCCATATTGGTTATTCAGGTGATAGCCTATATCCTCGCTGTGGTTGG ACTG AGTCTCTGTATTGCGGCGTGTATACCAATGCATGGCCCTCTTCTGATTTCAGGTTTGAGT ATCT TAGCT C TAG CAC AAT TAC T TGGAC TAGTT TAT AT GAAAT TTGTGGCT TCCAAT CAGAAGAC TAT AC AAC C T C C TAG GAAAG C T G T AGAG GAAC C C C T T AAT G CAT T C AAAGAAT C AAAAG GAAT GAT G AATGATGAATGA

SEQ ID NO: 168 - exemplary IL15 knock-in cassette sequence

AAT T G G G T C AAC G T GAT C AG C GAC C T GAAGAAGAT C GAG GAC C T GAT C C AGAG CAT G CAC AT C G ACGCCACACTGTACACCGAGTCCGATGTGCACCCTAGCTGCAAAGTGACCGCCATGAAGT GCTT TCTGCTGGAACTGCAAGTGATCAGCCTGGAAAGCGGCGACGCCAGCATCCACGATACCGT GGAA AACCTGATCATCCTGGCCAACAACAGCCTGAGCAGCAACGGCAATGTGACCGAGAGCGGC TGCA AAGAG T G C GAG GAAC T G GAAGAGAAGAAC AT CAAAGAG T T C C T C C AGAG C T T C G T C CAC AT C G T G C AGAT G T T CAT C AAC AC C AG C

SEQ ID NO: 169 - exemplary IgE-IL15 knock-in cassette sequence

SEQ ID NO: 170 - exemplary IgE-IL15 pro-peptide cargo sequence

SEQ ID NO: 171 - exemplary IL15Ra cargo sequence

ATCACCTGTCCTCCACCTATGAGCGTGGAACACGCCGACATCTGGGTCAAGAGCTAC AGCCTGT AC AG C AGAGAG C G G TAC AT C T G C AAC AG C G G C T T C AAGAGAAAG G C C G G C AC AAG CAG C C T GAC CGAGTGTGTGCTGAACAAGGCCACAAACGTGGCCCACTGGACCACACCTAGCCTGAAGTG CATC AGAGATCCCGCTCTGGTTCATCAGAGGCCTGCCCCTCCATCTACAGTGACAACAGCTGGC GTGA CCCCTCAGCCTGAGTCTCTGTCTCCATCTGGAAAAGAGCCTGCCGCCAGCTCTCCCAGCT CTAA CAATACTGCTGCCACCACAGCCGCTATCGTGCCTGGATCTCAGCTGATGCCTAGCAAGAG CCCT AG C AC C G G C AC AAC AGAGAT C AG C T C T C AC GAGAG C AG C C AC G GAAC AC C T T C T C AGAC C AC C G CCAAGAATTGGGAGCTGACAGCCTCTGCCTCTCATCAGCCACCTGGCGTGTACCCACAGG GCCA CTCTGATACAACAGTGGCCATCAGCACCAGCACCGTTCTGCTGTGTGGCCTGTCTGCTGT TAGC CTGCTGGCCTGCTACCTGAAGTCTAGACAGACACCTCCTCTGGCCAGCGTGGAAATGGAA GCCA TGGAAGCTCTGCCTGTCACATGGGGCACCAGCAGCAGAGATGAGGACCTCGAGAATTGCA GCCA CCACCTG

SEQ ID NO: 172 - exemplary mbIL-15 cargo sequence

ATGGATTGGACCTGGATCCTGTTTCTGGTGGCCGCTGCCACAAGAGTGCACAGCAAT TGGGTCA AC G T GAT C AG C GAC C T GAAGAAGAT C GAG GAC C T GAT C C AGAG CAT G C AC AT C GAC G C C AC AC T GTACACCGAGTCCGATGTGCACCCTAGCTGCAAAGTGACCGCCATGAAGTGCTTTCTGCT GGAA CTGCAAGTGATCAGCCTGGAAAGCGGCGACGCCAGCATCCACGATACCGTGGAAAACCTG ATCA TCCTGGCCAACAACAGCCTGAGCAGCAACGGCAATGTGACCGAGAGCGGCTGCAAAGAGT GCGA G GAAC T G GAAGAGAAGAAC AT CAAAGAG T T C C T C C AGAG C T T C G T C C AC AT C G T G C AGAT G T T C ATCAACACCAGCTCTGGCGGAGGAAGCGGAGGCGGAGGATCTGGTGGTGGTGGATCTGGC GGCG GTGGTAGTGGCGGAGGTTCTCTGCAAATCACCTGTCCTCCACCTATGAGCGTGGAACACG CCGA CAT C T G G G T C AAGAG C T AC AG C C T G T AC AG C AGAGAG C G G T AC AT C T G C AAC AG C G G C T T C AAG AGAAAGGCCGGCACAAGCAGCCTGACCGAGTGTGTGCTGAACAAGGCCACAAACGTGGCC CACT GGACCACACCTAGCCTGAAGTGCATCAGAGATCCCGCTCTGGTTCATCAGAGGCCTGCCC CTCC ATCTACAGTGACAACAGCTGGCGTGACCCCTCAGCCTGAGTCTCTGTCTCCATCTGGAAA AGAG CCTGCCGCCAGCTCTCCCAGCTCTAACAATACTGCTGCCACCACAGCCGCTATCGTGCCT GGAT C T C AG C T GAT G C C TAG C AAGAG C C C TAG C AC C G G C AC AAC AGAGAT C AG C T C T C AC GAGAG C AG CCACGGAACACCTTCTCAGACCACCGCCAAGAATTGGGAGCTGACAGCCTCTGCCTCTCA TCAG CCACCTGGCGTGTACCCACAGGGCCACTCTGATACAACAGTGGCCATCAGCACCAGCACC GTTC TGCTGTGTGGCCTGTCTGCTGTTAGCCTGCTGGCCTGCTACCTGAAGTCTAGACAGACAC CTCC TCTGGCCAGCGTGGAAATGGAAGCCATGGAAGCTCTGCCTGTCACATGGGGCACCAGCAG CAGA GAT GAG GAC C T C GAGAAT T G C AG C C AC C AC C T G

SEQ ID NO: 173 - exemplary mbIL-15 cargo sequence

ATGGACTGGACCTGGATTCTGTTCCTGGTCGCGGCTGCAACGCGAGTCCATAGCGGT ATCCATG TTTTTATTCTTGGGTGTTTTTCTGCTGGGCTGCCTAAGACCGAGGCCAACTGGGTAAATG TCAT CAG T GAC C T CAAGAAAAT AGAAGAC C T T AT ACAAAGCAT GCACAT T GAT GC T AC T C T C T ACAC T GAGTCAGATGTACATCCCTCATGCAAAGTGACGGCCATGAAATGTTTCCTCCTCGAACTT CAAG TCATATCTCTGGAAAGTGGCGACGCGTCCATCCACGACACGGTCGAAAACCTGATAATAC TCGC TAATAAT AG T C T C T C T T CAAAT G G T AAC G T AAC C GAG T CAG G T T G CAAAGAG T G C GAAGAG T T G GAAGAAAAAAAC AT AAAG GAG T T C C T G CAAAG T T T C G T G C AC AT T G T G C AGAT G T T CAT T AAT A CCTCTAGCGGCGGAGGATCAGGTGGCGGTGGAAGCGGAGGTGGAGGCTCCGGTGGAGGAG GTAG TGGCGGAGGTTCTCTTCAAATAACTTGTCCTCCACCGATGTCCGTAGAACATGCGGATAT TTGG GTAAAATCCTATAGCTTGTACAGCCGAGAGCGGTATATCTGCAACAGCGGCTTCAAGCGG AAGG CCGGCACAAGCAGCCTGACCGAGTGCGTGCTGAACAAGGCCACCAACGTGGCCCACTGGA CCAC CCCTAGCCTGAAGTGCATCAGAGATCCCGCCCTGGTGCATCAGCGGCCTGCCCCTCCAAG CACA GTGACAACAGCTGGCGTGACCCCCCAGCCTGAGAGCCTGAGCCCTTCTGGAAAAGAGCCT GCCG CCAGCAGCCCCAGCAGCAACAATACTGCCGCCACCACAGCCGCCATCGTGCCTGGATCTC AGCT GAT G C C CAG C AAGAG C C C TAG C AC C G G C AC C AC C GAGAT CAG CAG C C AC GAG T C TAG C C AC G G C ACCCCATCTCAGACCACCGCCAAGAACTGGGAGCTGACAGCCAGCGCCTCTCACCAGCCT CCAG GCGTGTACCCTCAGGGCCACAGCGATACCACAGTGGCCATCAGCACCTCCACCGTGCTGC TGTG TGGACTGAGCGCCGTGTCACTGCTGGCCTGCTACCTGAAGTCCAGACAGACCCCTCCACT GGCC

AGCGTGGAAATGGAAGCCATGGAAGCACTGCCCGTGACCTGGGGCACCAGCTCCAGA GATGAGG

ATCTGGAAAACTGCTCCCACCACCTG

SEQ ID NO: 174 - exemplary multicistronic CD16, mbIL-15 cargo sequence

ATGTGGCAGCTGTTGCTGCCGACAGCCCTCCTGTTGCTGGTCTCCGCTGGCATGAGA ACCGAGG ATCTGCCTAAGGCCGTGGTGTTCCTGGAACCTCAGTGGTACAGAGTGCTGGAAAAGGACA GCGT GACCCTGAAGTGCCAGGGCGCCTATTCTCCCGAGGACAATAGCACCCAGTGGTTCCACAA CGAG AGCCTGATCAGCAGCCAGGCCAGCAGCTACTTTATCGATGCCGCCACCGTGGACGACAGC GGCG AGTACAGATGCCAGACCAATCTGAGCACCCTGAGCGACCCTGTGCAGCTGGAAGTGCACA TTGG ATGGTTGCTGCTGCAAGCCCCTAGATGGGTGTTCAAAGAAGAGGACCCCATCCACCTGAG ATGC C AC T C T T G GAAGAAC AC AG C C C T G C AC AAAG T GAC C T AC C T G C AGAAC G G C AAG G G C AGAAAG T ACTTCCACCACAACAGCGACTTCTACATCCCCAAGGCCACACTGAAGGACTCCGGCTCCT ACTT C T G C AGAG GCCTGGTCGG C AG C AAGAAC G T G T C C AG C GAGAC AG T GAAC AT C AC CAT C AC AC AG GGCCTCGCCGTGTCTACCATCAGCAGCTTTTTCCCACCTGGCTATCAGGTGTCCTTCTGC CTGG TCATGGTGCTGCTGTTCGCCGTGGATACCGGCCTGTACTTCAGCGTCAAGACCAACATCC GGTC C AG C AC C AG AG AC T G G AAG GAC C AC AAG T T C AAG T G G C G G AAG GAC C C T C AG GAC AAG G G AAG C GGAGCCACAAACTTCTCTCTGCTGAAGCAGGCAGGAGATGTTGAAGAAAACCCTGGACCT ATGG ATTGGACCTGGATCCTGTTTCTGGTGGCCGCTGCCACAAGAGTGCACAGCAATTGGGTCA ACGT GAT C AG C GAC C T GAAGAAGAT C GAG GAC C T GAT C C AGAG CAT G C AC AT C GAC G C C AC AC T G T AC ACCGAGTCCGATGTGCACCCTAGCTGCAAAGTGACCGCCATGAAGTGCTTTCTGCTGGAA CTGC AAGTGATCAGCCTGGAAAGCGGCGACGCCAGCATCCACGATACCGTGGAAAACCTGATCA TCCT G G C C AAC AAC AG C C T GAG C AG C AAC G G C AAT G T GAC C GAG AG C G G C T G C AAAG AG T G C GAG G AA C T G GAAGAGAAGAAC AT CAAAGAG T T C C T C C AGAG C T T C G T C C AC AT C G T G C AGAT G T T C AT C A ACACCAGCTCTGGCGGAGGAAGCGGAGGCGGAGGATCTGGTGGTGGTGGATCTGGCGGCG GTGG TAGTGGCGGAGGTTCTCTGCAAATCACCTGTCCTCCACCTATGAGCGTGGAACACGCCGA CATC T G G G T C AAGAG C T AC AG C C T G T AC AG C AGAGAG C G G T AC AT C T G C AAC AG C G G C T T C AAGAGAA AGGCCGGCACAAGCAGCCTGACCGAGTGTGTGCTGAACAAGGCCACAAACGTGGCCCACT GGAC CACACCTAGCCTGAAGTGCATCAGAGATCCCGCTCTGGTTCATCAGAGGCCTGCCCCTCC ATCT ACAGTGACAACAGCTGGCGTGACCCCTCAGCCTGAGTCTCTGTCTCCATCTGGAAAAGAG CCTG CCGCCAGCTCTCCCAGCTCTAACAATACTGCTGCCACCACAGCCGCTATCGTGCCTGGAT CTCA G C T GAT G C C TAG C AAGAG C C C TAG C 
AC C G G C AC AAC AGAGAT C AG C T C T C AC GAGAG C AG C C AC GGAACACCTTCTCAGACCACCGCCAAGAATTGGGAGCTGACAGCCTCTGCCTCTCATCAG CCAC CTGGCGTGTACCCACAGGGCCACTCTGATACAACAGTGGCCATCAGCACCAGCACCGTTC TGCT GTGTGGCCTGTCTGCTGTTAGCCTGCTGGCCTGCTACCTGAAGTCTAGACAGACACCTCC TCTG GCCAGCGTGGAAATGGAAGCCATGGAAGCTCTGCCTGTCACATGGGGCACCAGCAGCAGA GATG AG GAC C T C GAGAAT T G C AG C C AC C AC C T G
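In a multicistronic cargo sequence, coding sequences may be joined by a 2A element. As an illustration, the Python sketch below translates a 66-nucleotide stretch that appears in the SEQ ID NO: 174 listing as printed above, between the end of the CD16 coding portion and the ATG that begins the mbIL-15 portion, and searches the result for a 2A-like motif. The pattern `D[VI]E.NPGP` is an assumption based on the commonly described 2A consensus, not a definition from this document, and the helper names are hypothetical.

```python
# Illustrative sketch only; motif pattern and helper names are assumptions.
import re

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, with codons enumerated in T, C, A, G order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str) -> str:
    """Translate frame 1 after removing whitespace; '*' marks a stop codon."""
    seq = "".join(seq.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Excerpt of SEQ ID NO: 164-style cleanup applied to the SEQ ID NO: 174 listing:
# extraction spaces removed, 66 nucleotides, in frame with the upstream ORF.
excerpt_174 = (
    "GGAAGCGGAGCCACAAACTTCTCTCTGCTG"
    "AAGCAGGCAGGAGATGTTGAAGAAAACCCTGGACCT"
)

# Assumed 2A-like consensus: D, then V or I, E, any residue, then N-P-G-P.
TWO_A_MOTIF = re.compile(r"D[VI]E.NPGP")

peptide = translate(excerpt_174)
hit = TWO_A_MOTIF.search(peptide)
print(peptide)                      # prints GSGATNFSLLKQAGDVEENPGP
print(hit.group(0) if hit else None)  # prints DVEENPGP
```

A motif hit of this kind only suggests that a 2A-like element is present; it does not by itself confirm how the element behaves in a given cell type.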

SEQ ID NO: 175 - exemplary CD19 CAR cargo sequence

ATGCTTCTCCTGGTGACAAGCCTTCTGCTCTGTGAGTTACCACACCCAGCATTCCTC CTGATCC C AGAC AT C C AGAT GAC AC AGAC T AC AT CCTCCCTGTCTGCCTCTCTGG GAGAC AGAG T C AC CAT C AG T T G C AG G G C AAG T C AG GAC AT TAG T AAAT AT T T AAAT T G G T AT C AG C AGAAAC C AGAT G GA AC T G T TAAAC TCCTGATCTAC CAT AC AT CAAGAT T AC AC T C AG GAG T C C C AT C AAG G T T C AG T G G C AG TGGGTCTG GAAC AGAT TATTCTCT C AC CAT TAG C AAC C T G GAG C AAGAAGAT AT T G C C AC TTACTTTTGCCAACAGGGTAATACGCTTCCGTACACGTTCGGAGGGGGGACTAAGTTGGA AATA ACAGGCTCCACCTCTGGATCCGGCAAGCCCGGATCTGGCGAGGGATCCACCAAGGGCGAG GTGA AACTGCAGGAGTCAGGACCTGGCCTGGTGGCGCCCTCACAGAGCCTGTCCGTCACATGCA CTGT CTCAGGGGTCTCATTACCCGACTATGGTGTAAGCTGGATTCGCCAGCCTCCACGAAAGGG TCTG GAG T G G C T G G GAG T AAT AT G G G G TAG T GAAAC C AC AT AC T AT AAT T C AG C T C T C AAAT C C AGAC T GAC CAT CAT CAAG GAC AAC T C CAAGAG C CAAG T T T T C T TAAAAAT GAAC AG T C T G CAAAC T GA T GAC AC AG C CAT T T AC T AC T G T G C C AAAC AT TAT T AC T AC G G T G G TAG CTATGCTATG GAC T AC TGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAGCGGCCGCAATTGAAGTTATGTATCCT CCTC C T T AC C T AGAC AAT GAGAAGAG C AAT G GAAC CAT T AT C CAT G T GAAAG G GAAAC AC CTTTGTCC AAGTCCCCTATTTCCCGGACCTTCTAAGCCCTTTTGGGTGCTGGTGGTGGTTGGGGGAGT CCTG GCTTGCTATAGCTTGCTAGTAACAGTGGCCTTTATTATTTTCTGGGTGAGGAGTAAGAGG AGCA GGCTCCTGCACAGTGACTACATGAACATGACTCCCCGCCGCCCCGGGCCCACCCGCAAGC ATTA CCAGCCCTATGCCCCACCACGCGACTTCGCAGCCTATCGCTCCAGAGTGAAGTTCAGCAG GAGC GCAGACGCCCCCGCGTACCAGCAGGGCCAGAACCAGCTCTATAACGAGCTCAATCTAGGA CGAA GAG AG GAG T AC GAT G T T T T G G AC AAG AG AC GTGGCCGG GAC C C T GAG AT G G G G G GAAAG C C GAG AAG GAAGAAC C C T C AG GAAG G C C T G T AC AAT GAAC T G C AGAAAGAT AAGAT G G C G GAG G C C T AC AGTGAGATTGGGATGAAAGGCGAGCGCCGGAGGGGCAAGGGGCACGATGGCCTTTACCAG GGTC TCAGTACAGCCACCAAGGACACCTACGACGCCCTTCACATGCAGGCCCTGCCCCCTCGCT AA

SEQ ID NO: 176 - exemplary EGFR CAR cargo sequence

ATGGCACTCCCCGTCACCGCCCTTCTCTTGCCCCTCGCCCTGCTGCTGCATGCTGCC AGGCCCA TGGACGAAGTGCAGCTCGTGGAGTCCGGTGGAGGACTCGTCCAACCGGGCGGATCCCTTC GCTT GTCCTGCGCCGCATCAGGCTTCAGCTTCACCAACTATGGCGTCCACTGGGTCAGACAGGC CCCC GGAAAGGGACTGGAATGGGTGTCCGTGATCTGGAGCGGCGGGAACACCGACTACAACACC TCCG TGAAGGGCCGGTTCACTATTAGCCGCGACAACTCCAAGAACACTCTGTACCTCCAAATGA ACTC CCTGAGGGCCGAAGATACTGCTGTGTACTATTGCGCGAGAGCCCTGACCTACTACGACTA CGAG TTCGCGTACTGGGGCCAGGGGACTCTCGTGACCGTGTCCAGCGGTGGTGGAGGTTCCGGA GGCG GAGGTTCTGGTGGCGGGGGATCAGAAATCGTGCTGACTCAGTCCCCTGCGACCTTGTCCC TGAG CCCTGGAGAACGGGCCACCCTGAGCTGTAGAGCCAGCCAGAGCATCGGGACAAATATTCA CTGG TACCAGCAGAAACCCGGACAAGCACCACGGCTGCTGATCTACTACGCCTCCGAGTCGATT TCCG GAATCCCGGCTCGCTTTTCGGGGTCTGGATCGGGAACGGACTTCACTCTGACCATCTCGT CGCT GGAACCCGAGGATTTCGCCGTGTACTACTGCCAACAGAACAACAATTGGCCGACCACGTT CGGC CAGGGCACCAAGCTCGAGATTAAGGGATCACTGGAAGCGGCCGCAACCACAACACCTGCT CCAA GGCCCCCCACACCCGCTCCAACTATAGCCAGCCAACCATTGAGCCTCAGACCTGAAGCTT GCAG GCCCGCAGCAGGAGGCGCCGTCCATACGCGAGGCCTGGACTTCGCGTGTGATATTTATAT TTGG GCCCCTTTGGCCGGAACATGTGGGGTGTTGCTTCTCTCCCTTGTGATCACTCTGTATTGT AAGC GCGGGAGAAAGAAGCTCCTGTACATCTTCAAGCAGCCTTTTATGCGACCTGTGCAAACCA CTCA GGAAGAAGATGGGTGTTCATGCCGCTTCCCCGAGGAGGAAGAAGGAGGGTGTGAACTGAG GGTG AAAT T T T C T AGAAG C G C C GAT G C T C C C G CAT AT C AG C AG G G T C AGAAT C AG C T C T AC AAT GAAT T GAAT C T C G G C AG G C GAGAAGAG T AC GAT G T T C T G GAC AAGAGAC G G G G C AG G GAT C C C GAGAT GGGGGGAAAGCCCCGGAGAAAAAATCCTCAGGAGGGGTTGTACAATGAGCTGCAGAAGGA CAAG AT G G C T GAAG C C TAT AG C GAGAT C G GAAT GAAAG G C GAAAGAC G C AGAG G CAAG G G G CAT GAC G GTCTGTACCAGGGTCTCTCTACAGCCACCAAGGACACTTATGATGCGTTGCATATGCAAG CCTT GCCACCCCGCTAA

SEQ ID NO: 177 - exemplary GFP cargo sequence

ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTG GACGGCG

ACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACG GCAAGCT GACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGAC CACC CTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTC TTCA AGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCA ACTA CAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAA GGGC AT C GAC T T C AAG GAG GAC G G C AAC AT C C T G G G G C AC AAG C T G GAG T AC AAC T AC AAC AG C C AC A AC GTCTATAT CAT G G C C GAC AAG C AGAAGAAC G G CAT C AAG G T GAAC T T C AAGAT C C G C C AC AA CATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGA CGGC CCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCC AACG AGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCA TGGA C GAG C T G T AC AAG T GA

SEQ ID NO: 178 - exemplary CXCR1 cargo sequence

AT GT CAAATAT TACAGAT CCACAGAT GT GGGAT T T T GAT GAT C TAAAT T T CAC T GGCAT GCCAC C T G C AGAT G AAGAT T AC AG CCCCTGTATGC TAGAAAC T GAGAC AC T C AAC AAG TATGTTGTGAT CATCGCCTATGCCCTAGTGTTCCTGCTGAGCCTGCTGGGAAACTCCCTGGTGATGCTGGT CATC TTATACAGCAGGGTCGGCCGCTCCGTCACTGATGTCTACCTGCTGAACCTGGCCTTGGCC GACC TACTCTTTGCCCTGACCTTGCCCATCTGGGCCGCCTCCAAGGTGAATGGCTGGATTTTTG GCAC ATTCCTGTGCAAGGTGGTCTCACTCCTGAAGGAAGTCAACTTCTACAGTGGCATCCTGCT GTTG GCCTGCATCAGTGTGGACCGTTACCTGGCCATTGTCCATGCCACACGCACACTGACCCAG AAGC GTCACTTGGTCAAGTTTGTTTGTCTTGGCTGCTGGGGACTGTCTATGAATCTGTCCCTGC CCTT CTTCCTTTTCCGCCAGGCTTACCATCCAAACAATTCCAGTCCAGTTTGCTATGAGGTCCT GGGA AATGACACAGCAAAATGGCGGATGGTGTTGCGGATCCTGCCTCACACCTTTGGCTTCATC GTGC CGCTGTTTGTCATGCTGTTCTGCTATGGATTCACCCTGCGTACACTGTTTAAGGCCCACA TGGG GCAGAAGCACCGAGCCATGAGGGTCATCTTTGCTGTCGTCCTCATCTTCCTGCTTTGCTG GCTG CCCTACAACCTGGTCCTGCTGGCAGACACCCTCATGAGGACCCAGGTGATCCAGGAGAGC TGTG AGCGCCGCAACAACATCGGCCGGGCCCTGGATGCCACTGAGATTCTGGGATTTCTCCATA GCTG CCTCAACCCCATCATCTACGCCTTCATCGGCCAAAATTTTCGCCATGGATTCCTCAAGAT CCTG GCTATGCATGGCCTGGTCAGCAAGGAGTTCTTGGCACGTCATCGTGTTACCTCCTACACT TCTT CGTCTGTCAATGTCTCTTCCAACCTCTGA

SEQ ID NO: 179 - exemplary CXCR3B cargo sequence

ATGGAGTTGAGGAAGTACGGCCCTGGAAGACTGGCGGGGACAGTTATAGGAGGAGCT GCTCAGA G TAAAT C AC AGAC TAAAT CAGAC T CAAT C AC AAAAGAG T T C C T G C C AG G C C T T T AC AC AG C C C C TTCCTCCCCGTTCCCGCCCTCACAGGTGAGTGACCACCAAGTGCTAAATGACGCCGAGGT TGCC GCCCTCCTGGAGAACTTCAGCTCTTCCTATGACTATGGAGAAAACGAGAGTGACTCGTGC TGTA CCTCCCCGCCCTGCCCACAGGACTTCAGCCTGAACTTCGACCGGGCCTTCCTGCCAGCCC TCTA CAGCCTCCTCTTTCTGCTGGGGCTGCTGGGCAACGGCGCGGTGGCAGCCGTGCTGCTGAG CCGG CGGACAGCCCTGAGCAGCACCGACACCTTCCTGCTCCACCTAGCTGTAGCAGACACGCTG CTGG TGCTGACACTGCCGCTCTGGGCAGTGGACGCTGCCGTCCAGTGGGTCTTTGGCTCTGGCC TCTG CAAAGTGGCAGGTGCCCTCTTCAACATCAACTTCTACGCAGGAGCCCTCCTGCTGGCCTG CATC AGCTTTGACCGCTACCTGAACATAGTTCATGCCACCCAGCTCTACCGCCGGGGGCCCCCG GCCC GCGTGACCCTCACCTGCCTGGCTGTCTGGGGGCTCTGCCTGCTTTTCGCCCTCCCAGACT TCAT CTTCCTGTCGGCCCACCACGACGAGCGCCTCAACGCCACCCACTGCCAATACAACTTCCC ACAG GTGGGCCGCACGGCTCTGCGGGTGCTGCAGCTGGTGGCTGGCTTTCTGCTGCCCCTGCTG GTCA TGGCCTACTGCTATGCCCACATCCTGGCCGTGCTGCTGGTTTCCAGGGGCCAGCGGCGCC TGCG GGCCATGCGGCTGGTGGTGGTGGTCGTGGTGGCCTTTGCCCTCTGCTGGACCCCCTATCA CCTG GTGGTGCTGGTGGACATCCTCATGGACCTGGGCGCTTTGGCCCGCAACTGTGGCCGAGAA AGCA

GGGTAGACGTGGCCAAGTCGGTCACCTCAGGCCTGGGCTACATGCACTGCTGCCTCA ACCCGCT

GCTCTATGCCTTTGTAGGGGTCAAGTTCCGGGAGCGGATGTGGATGCTGCTCTTGCG CCTGGGC

TGCCCCAACCAGAGAGGGCTCCAGAGGCAGCCATCGTCTTCCCGCCGGGATTCATCC TGGTCTG

AGACCTCAGAGGCCTCCTACTCGGGCTTGTGA

SEQ ID NO: 180 - exemplary CXCR3A cargo sequence

ATGGTCCTTGAGGTGAGTGACCACCAAGTGCTAAATGACGCCGAGGTTGCCGCCCTC CTGGAGA

ACTTCAGCTCTTCCTATGACTATGGAGAAAACGAGAGTGACTCGTGCTGTACCTCCC CGCCCTG

CCCACAGGACTTCAGCCTGAACTTCGACCGGGCCTTCCTGCCAGCCCTCTACAGCCT CCTCTTT

CTGCTGGGGCTGCTGGGCAACGGCGCGGTGGCAGCCGTGCTGCTGAGCCGGCGGACA GCCCTGA

GCAGCACCGACACCTTCCTGCTCCACCTAGCTGTAGCAGACACGCTGCTGGTGCTGA CACTGCC

GCTCTGGGCAGTGGACGCTGCCGTCCAGTGGGTCTTTGGCTCTGGCCTCTGCAAAGT GGCAGGT

GCCCTCTTCAACATCAACTTCTACGCAGGAGCCCTCCTGCTGGCCTGCATCAGCTTT GACCGCT

ACCTGAACATAGTTCATGCCACCCAGCTCTACCGCCGGGGGCCCCCGGCCCGCGTGA CCCTCAC

CTGCCTGGCTGTCTGGGGGCTCTGCCTGCTTTTCGCCCTCCCAGACTTCATCTTCCT GTCGGCC

CACCACGACGAGCGCCTCAACGCCACCCACTGCCAATACAACTTCCCACAGGTGGGC CGCACGG

CTCTGCGGGTGCTGCAGCTGGTGGCTGGCTTTCTGCTGCCCCTGCTGGTCATGGCCT ACTGCTA

TGCCCACATCCTGGCCGTGCTGCTGGTTTCCAGGGGCCAGCGGCGCCTGCGGGCCAT GCGGCTG

GTGGTGGTGGTCGTGGTGGCCTTTGCCCTCTGCTGGACCCCCTATCACCTGGTGGTG CTGGTGG

ACATCCTCATGGACCTGGGCGCTTTGGCCCGCAACTGTGGCCGAGAAAGCAGGGTAG ACGTGGC

CAAGTCGGTCACCTCAGGCCTGGGCTACATGCACTGCTGCCTCAACCCGCTGCTCTA TGCCTTT

GTAGGGGTCAAGTTCCGGGAGCGGATGTGGATGCTGCTCTTGCGCCTGGGCTGCCCC AACCAGA

GAGGGCTCCAGAGGCAGCCATCGTCTTCCCGCCGGGATTCATCCTGGTCTGAGACCT CAGAGGC

CTCCTACTCGGGCTTGTGA

SEQ ID NO: 181 - exemplary CCR5 cargo sequence

ATGGATTAT CAAG T G T CAAG T C CAAT C T AT GAC AT CAAT TAT TAT AC AT C G GAG C C C T G C C AAA AAATCAATGTGAAGCAAATCGCAGCCCGCCTCCTGCCTCCGCTCTACTCACTGGTGTTCA TCTT TGGTTTTGTGGGCAACATGCTGGTCATCCTCATCCTGATAAACTGCAAAAGGCTGAAGAG CATG ACTGACATCTACCTGCTCAACCTGGCCATCTCTGACCTGTTTTTCCTTCTTACTGTCCCC TTCT GGGCTCACTATGCTGCCGCCCAGTGGGACTTTGGAAATACAATGTGTCAACTCTTGACAG GGCT CTATTTTATAGGCTTCTTCTCTGGAATCTTCTTCATCATCCTCCTGACAATCGATAGGTA CCTG GCTGTCGTCCATGCTGTGTTTGCTTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACA AGTG TGATCACTTGGGTGGTGGCTGTGTTTGCGTCTCTCCCAGGAATCATCTTTACCAGATCTC AAAA AGAAG G T C T T C AT T AC AC C T G C AG CTCTCATTTTC CAT AC AG T C AG TAT CAAT T C T G GAAGAAT TTCCAGACATTAAAGATAGTCATCTTGGGGCTGGTCCTGCCGCTGCTTGTCATGGTCATC TGCT ACTCGGGAATCCTAAAAACTCTGCTTCGGTGTCGAAATGAGAAGAAGAGGCACAGGGCTG TGAG GCTTATCTTCACCATCATGATTGTTTATTTTCTCTTCTGGGCTCCCTACAACATTGTCCT TCTC CTGAACACCTTCCAGGAATTCTTTGGCCTGAATAATTGCAGTAGCTCTAACAGGTTGGAC CAAG CTATGCAGGTGACAGAGACTCTTGGGATGACGCACTGCTGCATCAACCCCATCATCTATG CCTT TGTCGGGGAGAAGTTCAGAAACTACCTCTTAGTCTTCTTCCAAAAGCACATTGCCAAACG CTTC TGCAAATGCTGTTCTATTTTCCAGCAAGAGGCTCCCGAGCGAGCAAGCTCAGTTTACACC CGAT CCACTGGGGAGCAGGAAATATCTGTGGGCTTGTGA

SEQ ID NO: 182 - exemplary CCR2 cargo sequence ATGCTGTCCACATCTCGTTCTCGGTTTATCAGAAATACCAACGAGAGCGGTGAAGAAGTC ACCA CCTTTTTTGATTATGATTACGGTGCTCCCTGTCATAAATTTGACGTGAAGCAAATTGGGG CCCA ACTCCTGCCTCCGCTCTACTCGCTGGTGTTCATCTTTGGTTTTGTGGGCAACATGCTGGT CGTC CTCATCTTAATAAACTGCAAAAAGCTGAAGTGC TTGACTGACATTTACCTGCTCAACCTGGCCA TCTCTGATCTGCTTTTTCTTATTACTCTCCCATTGTGGGCTCACTCTGCTGCAAATGAGT GGGT CTTTGGGAATGCAATGTGCAAATTATTCACAGGGCTGTATCACATCGGTTATTTTGGCGG AATC TTCTTCATCATCCTCCTGACAATCGATAGATACCTGGCTATTGTCCATGCTGTGTTTGCT TTAA AAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGTGTGATCACCTGGTTGGTGGCTGTGT TTGC TTCTGTCCCAGGAATCATCTTTACTAAATGCCAGAAAGAAGATTCTGTTTATGTCTGTGG CCCT TATTTTCCACGAGGATGGAATAATTTCCACACAATAATGAGGAACATTTTGGGGCTGGTC CTGC CGCTGCTCATCATGGTCATCTGCTACTCGGGAATCCTGAAAACCCTGCTTCGGTGTCGAA ACGA GAAGAAGAGGCATAGGGCAGTGAGAGTCATCTTCACCATCATGATTGTTTACTTTCTCTT CTGG ACTCCCTATAATATTGTCATTCTCCTGAACACCTTCCAGGAATTCTTCGGCCTGAGTAAC TGTG AAAGCACCAGTCAACTGGACCAAGCCACGCAGGT GACAGAGACTCTTGGGATGACTCACTGCTG CATCAATCCCATCATCTATGCCTTCGTTGGGGAGAAGTTCAGAAGCCTTTTTCACATAGC TCTT GGCTGTAGGATTGCCCCACTCCAAAAACCAGTGTGTGGAGGTCCAGGAGTGAGACCAGGA AAGA ATGTGAAAGTGACTACACAAGGACTCCTCGATGGTCGTGGAAAAGGAAAGTCAATTGGCA GAGC CCCTGAAGCCAGTCTTCAGGACAAAGAAGGAG CCTAG

[0653] In some embodiments, a gene product of interest comprises or consists of an amino acid sequence of any one of SEQ ID NOs: 161, 164, or 183-200. In some embodiments, a gene product of interest comprises or consists of an amino acid sequence that is at least 85%, 90%, 95%, 98% or 99% identical to any one of SEQ ID NOs: 161, 164, or 183-200.
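The percent-identity thresholds above can be illustrated with a short calculation. The following is a minimal sketch only: this section does not state which alignment method or identity convention is intended, so the sketch assumes a simple global (Needleman-Wunsch) alignment with hypothetical scoring values (match +1, mismatch -1, gap -1) and defines identity as identical positions divided by alignment length. The function names `global_align`, `percent_identity`, and `meets_threshold` are illustrative and not part of the disclosure.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length (assumed convention)."""
    aligned_a, aligned_b = global_align(a, b)
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(candidate, reference, threshold=85.0):
    """True if the candidate is at least `threshold` percent identical to the reference."""
    return percent_identity(candidate, reference) >= threshold


# Example: a fragment of the linker of SEQ ID NO: 183 with one substitution.
reference = "SGGGSGGGGS"
variant = "SGGGAGGGGS"
print(percent_identity(reference, variant), meets_threshold(variant, reference))
```

Other identity conventions (for example, dividing by the length of the shorter sequence) give different values for sequences of unequal length, so the convention should be fixed before comparing a result against a stated threshold.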

SEQ ID NO: 183 - exemplary linker amino acid sequence

SGGGSGGGGSGGGGSGGGGSGGGSLQ

SEQ ID NO: 184 - exemplary CD16 amino acid sequence

MWQLLLPTALLLLVSAGMRTEDLPKAVVFLEPQWYRVLEKDSVTLKCQGAYSPEDNSTQWFHNE
SLISSQASSYFIDAATVDDSGEYRCQTNLSTLSDPVQLEVHIGWLLLQAPRWVFKEEDPIHLRC
HSWKNTALHKVTYLQNGKGRKYFHHNSDFYIPKATLKDSGSYFCRGLVGSKNVSSETVNITITQ
GLAVSTISSFFPPGYQVSFCLVMVLLFAVDTGLYFSVKTNIRSSTRDWKDHKFKWRKDPQDK

SEQ ID NO: 185 - exemplary CD47 amino acid sequence

MWPLVAALLLGSACCGSAQLLFNKTKSVEFTFCNDTW IPCFVTNMEAQNTTEVYVKWKFKGRD IYTFDGALNKSTVPTDFSSAKIEVSQLLKGDASLKMDKSDAVSHTGNYTCEVTELTREGE TI IE LKYRVVSWFSPNENILIVIFPIFAILLFWGQFGIKTLKYRSGGMDEKTIALLVAGLVITV IVIV GAILFVPGEYSLKNATGLGLIVTSTGILILLHYYVFSTAIGLTSFVIAILVIQVIAYILA W GL SLCIAACIPMHGPLLISGLSILALAQLLGLVYMKFVASNQKT IQPPRKAVEEPLNAFKESKGMM NDE

SEQ ID NO: 186 - exemplary IL15 amino acid sequence

NWVNVISDLKKIEDLIQSMHIDATLYTESDVHPSCKVTAMKCFLLELQVISLESGDASIHDTVENLIILANNSLSSNGNVTESGCKECEELEEKNIKEFLQSFVHIVQMFINTS

SEQ ID NO: 187 - exemplary IgE-IL15 amino acid sequence

MDWTWILFLVAAATRVHSNWVNVISDLKKIEDLIQSMHIDATLYTESDVHPSCKVTA MKCFLLE LQVISLESGDASIHDTVENLIILANNSLSSNGNVTESGCKECEELEEKNIKEFLQSFVHI VQMF INTS

SEQ ID NO: 188 - exemplary IgE-IL15 pro-peptide amino acid sequence

MDWTWILFLVAAATRVHSGIHVFILGCFSAGLPKTEANWVNVISDLKKIEDLIQSMH IDATLYT ESDVHPSCKVTAMKCFLLELQVISLESGDAS IHDTVENLIILANNSLSSNGNVTESGCKECEEL EEKNIKEFLQSFVHIVQMFINTS

SEQ ID NO: 189 - exemplary IL15Ra amino acid sequence

ITCPPPMSVEHADIWVKSYSLYSRERYICNSGFKRKAGTSSLTECVLNKATNVAHWT TPSLKCI

RDPALVHQRPAPPSTVTTAGVTPQPESLSPSGKEPAASSPSSNNTAATTAAIVPGSQ LMPSKSP

STGTTEISSHESSHGTPSQTTAKNWELTASASHQPPGVYPQGHSDTTVAISTSTVLL CGLSAVS

LLACYLKSRQTPPLASVEMEAMEALPVTWGTSSRDEDLENCSHHL

SEQ ID NO: 190 - exemplary mbIL-15 amino acid sequence

MDWTWILFLVAAATRVHSNWVNVISDLKKIEDLIQSMHIDATLYTESDVHPSCKVTA MKCFLLE LQVISLESGDASIHDTVENLIILANNSLSSNGNVTESGCKECEELEEKNIKEFLQSFVHI VQMF INTSSGGGSGGGGSGGGGSGGGGSGGGSLQITCPPPMSVEHADIWVKSYSLYSRERYICN SGFK RKAGTSSLTECVLNKATNVAHWTTPSLKCIRDPALVHQRPAPPSTVTTAGVTPQPESLSP SGKE PAASSPSSNNTAATTAAIVPGSQLMPSKSPSTGTTEISSHESSHGTPSQTTAKNWELTAS ASHQ PPGVYPQGHSDTTVAISTSTVLLCGLSAVSLLACYLKSRQTPPLASVEMEAMEALPVTWG TSSR DEDLENCSHHL

SEQ ID NO: 191 - exemplary mbIL-15 amino acid sequence

MDWTWILFLVAAATRVHSGIHVFILGCFSAGLPKTEANWVNVISDLKKIEDLIQSMH IDATLYT ESDVHPSCKVTAMKCFLLELQVISLESGDAS IHDTVENLIILANNSLSSNGNVTESGCKECEEL EEKNIKEFLQSFVHIVQMFINTSSGGGSGGGGSGGGGSGGGGSGGGSLQITCPPPMSVEH ADIW VKSYSLYSRERYICNSGFKRKAGTSSLTECVLNKATNVAHWTTPSLKCIRDPALVHQRPA PPST VTTAGVTPQPESLSPSGKEPAASSPSSNNTAATTAAIVPGSQLMPSKSPSTGTTEISSHE SSHG TPSQTTAKNWELTASASHQPPGVYPQGHSDTTVAISTSTVLLCGLSAVSLLACYLKSRQT PPLA SVEMEAMEALPVTWGTSSRDEDLENCSHHL

SEQ ID NO: 192 - exemplary multicistronic CD16, mbIL-15 amino acid sequence

MWQLLLPTALLLLVSAGMRTEDLPKAW FLEPQWYRVLEKDSVTLKCQGAYSPEDNSTQWFHNE SLISSQASSYFIDAATVDDSGEYRCQTNLSTLSDPVQLEVHIGWLLLQAPRWVFKEEDPI HLRC HSWKNTALHKVTYLQNGKGRKYFHHNSDFYIPKATLKDSGSYFCRGLVGSKNVSSETVNI TITQ GLAVSTISSFFPPGYQVSFCLVMVLLFAVDTGLYFSVKTNIRSSTRDWKDHKFKWRKDPQ DKGS GATNFSLLKQAGDVEENPGPMDWTWILFLVAAATRVHSNWVNVISDLKKIEDLIQSMHID ATLY TESDVHPSCKVTAMKCFLLELQVISLESGDAS IHDTVENLIILANNSLSSNGNVTESGCKECEE LEEKNIKEFLQSFVHIVQMFINTSSGGGSGGGGSGGGGSGGGGSGGGSLQITCPPPMSVE HADI WVKSYSLYSRERYICNSGFKRKAGTSSLTECVLNKATNVAHWTTPSLKCIRDPALVHQRP APPS TVTTAGVTPQPESLSPSGKEPAASSPSSNNTAATTAAIVPGSQLMPSKSPSTGTTEISSH ESSH

GTPSQTTAKNWELTASASHQPPGVYPQGHSDTTVAISTSTVLLCGLSAVSLLACYLK SRQTPPL

ASVEMEAMEALPVTWGTSSRDEDLENCSHHL

SEQ ID NO: 193 - exemplary CD19 CAR amino acid sequence

MLLLVTSLLLCELPHPAFLLIPDIQMTQTTSSLSASLGDRVTISCRASQDISKYLNW YQQKPDG TVKLLIYHTSRLHSGVPSRFSGSGSGTDYSLTISNLEQEDIATYFCQQGNTLPYTFGGGT KLEI TGSTSGSGKPGSGEGSTKGEVKLQESGPGLVAPSQSLSVTCTVSGVSLPDYGVSWIRQPP RKGL EWLGVIWGSETTYYNSALKSRLTI IKDNSKSQVFLKMNSLQTDDTAIYYCAKHYYYGGSYAMDY WGQGTSVTVSSAAAIEVMYPPPYLDNEKSNGTI IHVKGKHLCPSPLFPGPSKPFWVLVVVGGVL ACYSLLVTVAFIIFWVRSKRSRLLHSDYMNMTPRRPGPTRKHYQPYAPPRDFAAYRSRVK FSRS ADAPAYQQGQNQLYNELNLGRREEYDVLDKRRGRDPEMGGKPRRKNPQEGLYNELQKDKM AEAY SEIGMKGERRRGKGHDGLYQGLSTATKDTYDALHMQALPPR

SEQ ID NO: 194 - exemplary EGFR CAR amino acid sequence

MALPVTALLLPLALLLHAARPMDEVQLVESGGGLVQPGGSLRLSCAASGFSFTNYGV HWVRQAP GKGLEWVSVIWSGGNTDYNTSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARALTY YDYE FAYWGQGTLVTVSSGGGGSGGGGSGGGGSEIVLTQSPATLSLSPGERATLSCRASQS IGTNIHW YQQKPGQAPRLLIYYASESISGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQQNNNWP TTFG QGTKLEIKGSLEAAATTTPAPRPPTPAPTIASQPLSLRPEACRPAAGGAVHTRGLDFACD I YIW APLAGTCGVLLLSLVITLYCKRGRKKLLYI FKQPEMRPVQTTQEEDGCSCRFPEEEEGGCELRV KFSRSADAPAYQQGQNQLYNELNLGRREEYDVLDKRRGRDPEMGGKPRRKNPQEGLYNEL QKDK MAEAYSEIGMKGERRRGKGHDGLYQGLSTATKDTYDALHMQALPPR

SEQ ID NO: 195 - exemplary GFP amino acid sequence

MVSKGEELFTGW PILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTT LTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTI FFKDDGNYKTRAEVKFEGDTLVNRIELKG IDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTP IGDG PVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYK

SEQ ID NO: 196 - exemplary CXCR1 amino acid sequence

MSNITDPQMWDFDDLNFTGMPPADEDYSPCMLETETLNKYVVI IAYALVFLLSLLGNSLVMLVI LYSRVGRSVTDVYLLNLALADLLFALTLPIWAASKVNGWIFGTFLCKW SLLKEVNFYSGILLL ACISVDRYLAIVHATRTLTQKRHLVKFVCLGCWGLSMNLSLPFFLFRQAYHPNNSSPVCY EVLG NDTAKWRMVLRILPHTFGFIVPLFVMLFCYGFTLRTLFKAHMGQKHRAMRVIFAW LIFLLCWL PYNLVLLADTLMRTQVIQESCERRNNIGRALDATEILGFLHSCLNPI IYAFIGQNFRHGFLKIL AMHGLVSKEFLARHRVTSYTSSSVNVSSNL

SEQ ID NO: 197 - exemplary CXCR3B amino acid sequence

MELRKYGPGRLAGTVIGGAAQSKSQTKSDS ITKEFLPGLYTAPSSPFPPSQVSDHQVLNDAEVA ALLENFSSSYDYGENESDSCCTSPPCPQDFSLNFDRAFLPALYSLLFLLGLLGNGAVAAV LLSR RTALSSTDTFLLHLAVADTLLVLTLPLWAVDAAVQWVFGSGLCKVAGAL FNINFYAGALLLACI SFDRYLNIVHATQLYRRGPPARVTLTCLAVWGLCLLFALPDFI FLSAHHDERLNATHCQYNFPQ VGRTALRVLQLVAGFLLPLLVMAYCYAHILAVLLVSRGQRRLRAMRLW W W AFALCWTPYHL W LVDILMDLGALARNCGRESRVDVAKSVTSGLGYMHCCLNPLLYAFVGVKFRERMWMLLLR LG

CPNQRGLQRQPSSSRRDSSWSETSEASYSGL

SEQ ID NO: 198 - exemplary CXCR3A amino acid sequence

MVLEVSDHQVLNDAEVAALLENFSSSYDYGENESDSCCTSPPCPQDFSLNFDRAFLP ALYSLLF LLGLLGNGAVAAVLLSRRTALSSTDTFLLHLAVADTLLVLTLPLWAVDAAVQWVFGSGLC KVAG ALFNINFYAGALLLACISFDRYLNIVHATQLYRRGPPARVTLTCLAVWGLCLLFALPDFI FLSA HHDERLNATHCQYNFPQVGRTALRVLQLVAGFLLPLLVMAYCYAHILAVLLVSRGQRRLR AMRL W W W AFALCWTPYHLW LVDILMDLGALARNCGRESRVDVAKSVTSGLGYMHCCLNPLLYAF VGVKFRERMWMLLLRLGCPNQRGLQRQPSSSRRDSSWSETSEASYSGL

SEQ ID NO: 199 - exemplary CCR5 amino acid sequence

MDYQVSSPIYDINYYTSEPCQKINVKQIAARLLPPLYSLVFI FGFVGNMLVILILINCKRLKSM TDIYLLNLAISDLFFLLTVPFWAHYAAAQWDFGNTMCQLLTGLYFIGFFSGI FFIILLTIDRYL AW HAVFALKARTVTFGW TSVITWW AVFASLPGIIFTRSQKEGLHYTCSSHFPYSQYQFWKN FQTLKIVILGLVLPLLVMVICYSGILKTLLRCRNEKKRHRAVRLI FTIMIVYFLFWAPYNIVLL LNTFQEFFGLNNCSSSNRLDQAMQVTETLGMTHCC INP11YAFVGEKFRNYLLVFFQKHIAKRF CKCCSIFQQEAPERASSVYTRSTGEQE ISVGL

SEQ ID NO: 200 - exemplary CCR2 amino acid sequence

MLSTSRSRFIRNTNESGEEVTTFFDYDYGAPCHKFDVKQIGAQLLPPLYSLVFI FGFVGNMLW LILINCKKLKCLTDIYLLNLAISDLLFLITLPLWAHSAANEWVFGNAMCKLFTGLYHIGY FGGI FFIILLTIDRYLAIVHAVFALKARTVTFGVVTSVITWLVAVFASVPGI IFTKCQKEDSVYVCGP YFPRGWNNFHTIMRNILGLVLPLLIMVICYSGILKTLLRCRNEKKRHRAVRVI FTIMIVYFLFW TPYNIVILLNTFQEFFGLSNCESTSQLDQATQVTETLGMTHCCINPI IYAFVGEKFRSLFHIAL GCRIAPLQKPVCGGPGVRPGKNVKVTTQGLLDGRGKGKS IGRAPEASLQDKEGA

AAV Capsids

[0654] In some embodiments, the present disclosure provides one or more polynucleotide constructs (e.g., knock-in cassettes) packaged into an AAV capsid. In some embodiments, an AAV capsid is from or derived from an AAV capsid of an AAV2, 3, 4, 5, 6, 7, 8, 9, or 10 serotype, or one or more hybrids thereof. In some embodiments, an AAV capsid is from an AAV ancestral serotype. In some embodiments, an AAV capsid is an ancestral (Anc) AAV capsid. An Anc capsid is created from a construct sequence that is derived using evolutionary probabilities and evolutionary modeling to determine a probable ancestral sequence. In some embodiments, an AAV capsid has been modified in a manner known in the art (see, e.g., Büning and Srivastava, Capsid modifications for targeting and improving the efficacy of AAV vectors, Mol Ther Methods Clin Dev. 2019).

[0655] In some embodiments, as provided herein, any combination of AAV capsids and AAV constructs (e.g., comprising AAV ITRs) may be used in recombinant AAV (rAAV) particles of the present disclosure. In some embodiments, an AAV ITR is from or derived from an AAV ITR of AAV2, 3, 4, 5, 6, 7, 8, 9, or 10. For example, an rAAV particle may combine wild-type or variant AAV6 ITRs with an AAV6 capsid, wild-type or variant AAV2 ITRs with an AAV6 capsid, etc. In some embodiments of the present disclosure, an AAV particle is wholly comprised of AAV6 components (e.g., capsid and ITRs are AAV6 serotype). In some embodiments, an AAV particle is an AAV6/2, AAV6/8 or AAV6/9 particle (e.g., an AAV2, AAV8 or AAV9 capsid with an AAV construct having AAV6 ITRs).

Exemplary AAV Constructs

[0656] In some embodiments, a donor template is included within an AAV construct. In some embodiments, an AAV construct sequence comprises or consists of the sequence of any one of SEQ ID NOs: 201-204. In some embodiments, an exemplary AAV construct is represented by SEQ ID NO: 201. In some embodiments, an exemplary AAV construct is represented by SEQ ID NO: 202. In some embodiments, an exemplary AAV construct is represented by SEQ ID NO: 203. In some embodiments, an exemplary AAV construct is represented by SEQ ID NO: 204. In some embodiments, an exemplary AAV construct is at least 80%, 85%, 90%, 95%, 98%, or 99% identical to a sequence represented by any one of SEQ ID NOs: 201-204.
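As an illustration of a donor template being included within an AAV construct, the following minimal sketch strips listing whitespace from two sequences and checks whether one occurs inside the other. The fragments are copied from the opening of SEQ ID NO: 201 and the opening of SEQ ID NO: 38 as printed in this document; the helper names `clean_sequence` and `locate_donor` are hypothetical, and an exact substring search is an assumption that would not tolerate sequence variants.

```python
import re


def clean_sequence(raw: str) -> str:
    """Remove all whitespace from a sequence copied from a listing and upper-case it."""
    return re.sub(r"\s+", "", raw).upper()


def locate_donor(construct: str, donor: str) -> int:
    """Return the 0-based offset of the donor within the construct, or -1 if absent."""
    return clean_sequence(construct).find(clean_sequence(donor))


# Opening of SEQ ID NO: 201, including the line-break spacing of the listing.
construct_fragment = """
CCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCAAAGCCCGGGC GTCGGGC
GACCTTTGGTCGCCCGGCCTCAGTGAGCGAGCGAGCGCGCAGAGAGGGAGTGGCCAACTC CATC
ACTAGGGGTTCCTGTCGACGAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATG GCCG
"""

# Opening of the 5' end of SEQ ID NO: 38 (donor template for the GAPDH locus).
donor_fragment = "GAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATGGCCG"

offset = locate_donor(construct_fragment, donor_fragment)
print("donor fragment found" if offset >= 0 else "donor fragment not found")
```

For comparisons against variant constructs, the exact search would need to be replaced by an alignment-based comparison.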

SEQ ID NO: 201 - exemplary AAV construct for donor template insertion at GAPDH locus

CCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCAAAGCCCGGGC GTCGGGC GACCTTTGGTCGCCCGGCCTCAGTGAGCGAGCGAGCGCGCAGAGAGGGAGTGGCCAACTC CATC ACTAGGGGTTCCTGTCGACGAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATG GCCG CGGGGCTCTCCAGAACATCATCCCTGCCTCTACTGGCGCTGCCAAGGCTGTGGGCAAGGT CATC CCTGAGCTGAACGGGAAGCTCACTGGCATGGCCTTCCGTGTCCCCACTGCCAACGTGTCA GTGG TGGACCTGACCTGCCGTCTAGAAAAACCTGCCAAATATGATGACATCAAGAAGGTGGTGA AGCA GGCGTCGGAGGGCCCCCTCAAGGGCATCCTGGGCTACACTGAGCACCAGGTGGTCTCCTC TGAC TTCAACAGCGACACCCACTCCTCCACCTTTGACGCTGGGGCTGGCATTGCCCTCAACGAC CACT TTGTCAAGCTCATTTCCTGGTATGTGGCTGGGGCCAGAGACTGGCTCTTAAAAAGTGCAG GGTC TGGCGCCCTCTGGTGGCTGGCTCAGAAAAAGGGCCCTGACAACTCTTTACATCTTCTAGG TATG AC AAC GAG T T C G GAT AT AG C AAT AGAG T G G T C GAT C T GAT G G C T CAT AT G G C TAG C AAAGAG G G AAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAACCCTGG ACCT ATGTGGCAACTGCTGCTGCCTACAGCTCTGCTGCTTCTGGTGTCTGCCGGCATGAGAACC GAGG ATCTGCCTAAGGCCGTGGTGTTCCTGGAACCTCAGTGGTACAGAGTGCTGGAAAAGGACA GCGT GACCCTGAAGTGCCAGGGCGCCTATTCTCCCGAGGACAATAGCACCCAGTGGTTCCACAA CGAG AGCCTGATCAGCAGCCAGGCCAGCAGCTACTTTATCGATGCCGCCACCGTGGACGACAGC GGCG AGTACAGATGCCAGACCAATCTGAGCACCCTGAGCGACCCTGTGCAGCTGGAAGTGCACA TTGG ATGGTTGCTGCTGCAAGCCCCTAGATGGGTGTTCAAAGAAGAGGACCCCATCCACCTGAG ATGC C AC T C T T G GAAGAAC AC AG C C C T G C AC AAAG T GAC C T AC C T G C AGAAC G G C AAG G G C AGAAAG T ACTTCCACCACAACAGCGACTTCTACATCCCCAAGGCCACACTGAAGGACTCCGGCTCCT ACTT C T G C AGAG GCCTGGTCGG C AG C AAGAAC G T G T C C AG C GAGAC AG T GAAC AT C AC CAT C AC AC AG GGCCTCGCCGTGTCTACCATCAGCAGCTTTTTCCCACCTGGCTATCAGGTGTCCTTCTGC CTGG TCATGGTGCTGCTGTTCGCCGTGGATACCGGCCTGTACTTCAGCGTCAAGACCAACATCC GGTC C AG C AC C AGAGAC T G GAAG GAC C AC AAG T T C AAG T G G C G GAAG GAC C C T C AG GAC AAG T AAG C G GCCGCGTCGAGTCTAGAGGGCCCGTTTAAACCCGCTGATCAGCCTCGACTGTGCCTTCTA GTTG CCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCC CACT GTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATT CTGG GGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATGCTG GGGA 
TGCGGTGGGCTCTATGGATTTGGCTACAGCAACAGGGTGGTGGACCTCATGGCCCACATG GCCT C C AAG GAG T AAGAC C C C T G GAC C AC C AG C C C C AG C AAGAG C AC AAGAG GAAGAGAGAGAC C C T C ACTGCTGGGGAGTCCCTGCCACACTCAGTCCCCCACCACACTGAATCTCCCCTCCTCACA GTTG CCATGTAGACCCCTTGAAGAGGGGAGGGGCCTAGGGAGCCGCACCTTGTCATGTACCATC AATA AAGTACCCTGTGCTCAACCAGTTACTTGTCCTGTCTTATTCTAGGGTCTGGGGCAGAGGG GAGG GAAGCTGGGCTTGTGTCAAGGTGAGACATTCTTGCTGGGGAGGGACCTGGTATGTTCTCC TCAG ACTGAGGGTAGGGCCTCCAAACAGCCTTGCTTGCTTCGAGAACCATTTGCTTCCCGCTCA GACG TCTTGAGTGCTACAGGAAGCTGGCACCACTACTTCAGAGAACAAGGCCTTTTCCTCTCCT CGCT CCAGTAGATCTAGGAACCCCTAGTGATGGAGTTGGCCACTCCCTCTCTGCGCGCTCGCTC GCTC ACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCCTCAGTG AGCG AGCGAGCGCGCAGCTGCCTGCAGG

SEQ ID NO: 202 - exemplary AAV construct for donor template insertion at GAPDH locus

CCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCAAAGCCCGGGC GTCGGGC GACCTTTGGTCGCCCGGCCTCAGTGAGCGAGCGAGCGCGCAGAGAGGGAGTGGCCAACTC CATC ACTAGGGGTTCCTGTCGACGAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATG GCCG CGGGGCTCTCCAGAACATCATCCCTGCCTCTACTGGCGCTGCCAAGGCTGTGGGCAAGGT CATC CCTGAGCTGAACGGGAAGCTCACTGGCATGGCCTTCCGTGTCCCCACTGCCAACGTGTCA GTGG TGGACCTGACCTGCCGTCTAGAAAAACCTGCCAAATATGATGACATCAAGAAGGTGGTGA AGCA GGCGTCGGAGGGCCCCCTCAAGGGCATCCTGGGCTACACTGAGCACCAGGTGGTCTCCTC TGAC TTCAACAGCGACACCCACTCCTCCACCTTTGACGCTGGGGCTGGCATTGCCCTCAACGAC CACT TTGTCAAGCTCATTTCCTGGTATGTGGCTGGGGCCAGAGACTGGCTCTTAAAAAGTGCAG GGTC TGGCGCCCTCTGGTGGCTGGCTCAGAAAAAGGGCCCTGACAACTCTTTACATCTTCTAGG TATG AC AAC GAG T T C G GAT AT AG C AAT AGAG T G G T C GAT C T GAT G G C T CAT AT G G C TAG C AAAGAG G G AAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAACCCTGG ACCT ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGAC GGCG ACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCA AGCT GACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGAC CACC CTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTC TTCA AGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCA ACTA CAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAA GGGC AT C GAC T T C AAG GAG GAC G G C AAC AT C C T G G G G C AC AAG C T G GAG T AC AAC T AC AAC AG C C AC A AC GTCTATAT CAT G G C C GAC AAG C AGAAGAAC G G CAT C AAG G T GAAC T T C AAGAT C C G C C AC AA CATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGA CGGC CCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCC AACG AGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCA TGGA CGAGCTGTACAAGTGAGCGGCCGCGTCGAGTCTAGAGGGCCCGTTTAAACCCGCTGATCA GCCT CGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGA CCCT GGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCT GAGT AGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAA GACA ATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGATTTGGCTACAGCAACAGGGTGGT GGAC CTCATGGCCCACATGGCCTC C AAG GAG T AAG AC C C C T G G 
AC CACCAGCCCCAG C AAG AG C AC AA GAG G AAG AG AG AG AC CCTCACTGCTGGG GAG TCCCTGCCACACTCAGTCCCCCACCACACT G AA TCTCCCCTCCTCACAGTTGCCATGTAGACCCCTTGAAGAGGGGAGGGGCCTAGGGAGCCG CACC TTGTCATGTACCATCAATAAAGTACCCTGTGCTCAACCAGTTACTTGTCCTGTCTTATTC TAGG GTCTGGGGCAGAGGGGAGGGAAGCTGGGCTTGTGTCAAGGTGAGACATTCTTGCTGGGGA GGGA CCTGGTATGTTCTCCTCAGACTGAGGGTAGGGCCTCCAAACAGCCTTGCTTGCTTCGAGA ACCA TTTGCTTCCCGCTCAGACGTCTTGAGTGCTACAGGAAGCTGGCACCACTACTTCAGAGAA CAAG GCCTTTTCCTCTCCTCGCTCCAGTAGATCTAGGAACCCCTAGTGATGGAGTTGGCCACTC CCTC TCTGCGCGCTCGCTCGCTCACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGCTT TGCC CGGGCGGCCTCAGTGAGCGAGCGAGCGCGCAGCTGCCTGCAGG

SEQ ID NO: 203 - exemplary AAV construct for donor template insertion at GAPDH locus

CCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCAAAGCCCGGGC GTCGGGC GACCTTTGGTCGCCCGGCCTCAGTGAGCGAGCGAGCGCGCAGAGAGGGAGTGGCCAACTC CATC ACTAGGGGTTCCTGTCGACGAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATG GCCG CGGGGCTCTCCAGAACATCATCCCTGCCTCTACTGGCGCTGCCAAGGCTGTGGGCAAGGT CATC CCTGAGCTGAACGGGAAGCTCACTGGCATGGCCTTCCGTGTCCCCACTGCCAACGTGTCA GTGG TGGACCTGACCTGCCGTCTAGAAAAACCTGCCAAATATGATGACATCAAGAAGGTGGTGA AGCA GGCGTCGGAGGGCCCCCTCAAGGGCATCCTGGGCTACACTGAGCACCAGGTGGTCTCCTC TGAC TTCAACAGCGACACCCACTCCTCCACCTTTGACGCTGGGGCTGGCATTGCCCTCAACGAC CACT TTGTCAAGCTCATTTCCTGGTATGTGGCTGGGGCCAGAGACTGGCTCTTAAAAAGTGCAG GGTC TGGCGCCCTCTGGTGGCTGGCTCAGAAAAAGGGCCCTGACAACTCTTTACATCTTCTAGG TATG AC AAC GAG T T C G GAT AT AG C AAT AGAG T G G T C GAT C T GAT G G C T CAT AT G G C TAG C AAAGAG G G AAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAACCCTGG ACCT ATGCTTCTCCTGGTGACAAGCCTTCTGCTCTGTGAGTTACCACACCCAGCATTCCTCCTG ATCC C AGAC AT C C AGAT GAC AC AGAC T AC AT CCTCCCTGTCTGCCTCTCTGG GAGAC AGAG T C AC CAT C AG T T G C AG G G C AAG T C AG GAC AT TAG T AAAT AT T T AAAT T G G T AT C AG C AGAAAC C AGAT G GA AC T G T TAAAC TCCTGATCTAC CAT AC AT CAAGAT T AC AC T C AG GAG T C C C AT C AAG G T T C AG T G G C AG TGGGTCTG GAAC AGAT TATTCTCT C AC CAT TAG C AAC C T G GAG C AAGAAGAT AT T G C C AC TTACTTTTGCCAACAGGGTAATACGCTTCCGTACACGTTCGGAGGGGGGACTAAGTTGGA AATA ACAGGCTCCACCTCTGGATCCGGCAAGCCCGGATCTGGCGAGGGATCCACCAAGGGCGAG GTGA AACTGCAGGAGTCAGGACCTGGCCTGGTGGCGCCCTCACAGAGCCTGTCCGTCACATGCA CTGT CTCAGGGGTCTCATTACCCGACTATGGTGTAAGCTGGATTCGCCAGCCTCCACGAAAGGG TCTG GAG T G G C T G G GAG T AAT AT G G G G TAG T GAAAC C AC AT AC TAT AAT T C AG C T C T C AAAT C C AGAC T GAC CAT CAT C AAG GAC AAC T C CAAGAG C C AAG T T T T C T TAAAAAT GAAC AG T C T G CAAAC T GA T GAC AC AG C CAT T T AC T AC T G T G C C AAAC AT TAT T AC T AC G G T G G TAG CTATGCTATG GAC T AC TGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAGCGGCCGCAATTGAAGTTATGTATCCT CCTC C T T AC C T AGAC AAT GAGAAGAG C AAT G GAAC CAT T AT C CAT G T GAAAG G GAAAC AC CTTTGTCC 
AAGTCCCCTATTTCCCGGACCTTCTAAGCCCTTTTGGGTGCTGGTGGTGGTTGGGGGAGT CCTG GCTTGCTATAGCTTGCTAGTAACAGTGGCCTTTATTATTTTCTGGGTGAGGAGTAAGAGG AGCA GGCTCCTGCACAGTGACTACATGAACATGACTCCCCGCCGCCCCGGGCCCACCCGCAAGC ATTA CCAGCCCTATGCCCCACCACGCGACTTCGCAGCCTATCGCTCCAGAGTGAAGTTCAGCAG GAGC GCAGACGCCCCCGCGTACCAGCAGGGCCAGAACCAGCTCTATAACGAGCTCAATCTAGGA CGAA GAG AG GAG T AC GAT G T T T T G G AC AAG AG AC GTGGCCGG G AC C C T GAG AT G G G G G G AAAG C C GAG AAG GAAGAAC C C T C AG GAAG G C C T G T AC AAT GAAC T G C AGAAAGAT AAGAT G G C G GAG G C C T AC AGTGAGATTGGGATGAAAGGCGAGCGCCGGAGGGGCAAGGGGCACGATGGCCTTTACCAG GGTC TCAGTACAGCCACCAAGGACACCTACGACGCCCTTCACATGCAGGCCCTGCCCCCTCGCT AAAG CGGCCGCGTCGAGTCTAGAGGGCCCGTTTAAACCCGCTGATCAGCCTCGACTGTGCCTTC TAGT TGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACT CCCA CTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTA TTCT GGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATGC TGGG GATGCGGTGGGCTCTATGGATTTGGCTACAGCAACAGGGTGGTGGACCTCATGGCCCACA TGGC C T C C AAG GAG T AAGAC C C C T G GAC C AC C AG C C C C AG C AAGAG C AC AAGAG GAAGAGAGAGAC C C TCACTGCTGGGGAGTCCCTGCCACACTCAGTCCCCCACCACACTGAATCTCCCCTCCTCA CAGT TGCCATGTAGACCCCTTGAAGAGGGGAGGGGCCTAGGGAGCCGCACCTTGTCATGTACCA TCAA TAAAGTACCCTGTGCTCAACCAGTTACTTGTCCTGTCTTATTCTAGGGTCTGGGGCAGAG GGGA GGGAAGCTGGGCTTGTGTCAAGGTGAGACATTCTTGCTGGGGAGGGACCTGGTATGTTCT CCTC AGACTGAGGGTAGGGCCTCCAAACAGCCTTGCTTGCTTCGAGAACCATTTGCTTCCCGCT CAGA CGTCTTGAGTGCTACAGGAAGCTGGCACCACTACTTCAGAGAACAAGGCCTTTTCCTCTC CTCG CTCCAGTAGATCTAGGAACCCCTAGTGATGGAGTTGGCCACTCCCTCTCTGCGCGCTCGC TCGC TCACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCCTCAG TGAG CGAGCGAGCGCGCAGCTGCCTGCAGG

SEQ ID NO: 204 - exemplary AAV construct for donor template insertion at GAPDH locus

CCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCAAAGCCCGGGC GTCGGGC GACCTTTGGTCGCCCGGCCTCAGTGAGCGAGCGAGCGCGCAGAGAGGGAGTGGCCAACTC CATC ACTAGGGGTTCCTGTCGACGAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATG GCCG CGGGGCTCTCCAGAACATCATCCCTGCCTCTACTGGCGCTGCCAAGGCTGTGGGCAAGGT CATC CCTGAGCTGAACGGGAAGCTCACTGGCATGGCCTTCCGTGTCCCCACTGCCAACGTGTCA GTGG TGGACCTGACCTGCCGTCTAGAAAAACCTGCCAAATATGATGACATCAAGAAGGTGGTGA AGCA GGCGTCGGAGGGCCCCCTCAAGGGCATCCTGGGCTACACTGAGCACCAGGTGGTCTCCTC TGAC TTCAACAGCGACACCCACTCCTCCACCTTTGACGCTGGGGCTGGCATTGCCCTCAACGAC CACT TTGTCAAGCTCATTTCCTGGTATGTGGCTGGGGCCAGAGACTGGCTCTTAAAAAGTGCAG GGTC TGGCGCCCTCTGGTGGCTGGCTCAGAAAAAGGGCCCTGACAACTCTTTACATCTTCTAGG TATG AC AAC GAG T T C G GAT AT AG C AAT AGAG T G G T C GAT C T GAT G G C T CAT AT G G C TAG C AAAGAG G G AAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAACCCTGG ACCT ATGGCACTCCCCGTCACCGCCCTTCTCTTGCCCCTCGCCCTGCTGCTGCATGCTGCCAGG CCCA TGGACGAAGTGCAGCTCGTGGAGTCCGGTGGAGGACTCGTCCAACCGGGCGGATCCCTTC GCTT GTCCTGCGCCGCATCAGGCTTCAGCTTCACCAACTATGGCGTCCACTGGGTCAGACAGGC CCCC GGAAAGGGACTGGAATGGGTGTCCGTGATCTGGAGCGGCGGGAACACCGACTACAACACC TCCG TGAAGGGCCGGTTCACTATTAGCCGCGACAACTCCAAGAACACTCTGTACCTCCAAATGA ACTC CCTGAGGGCCGAAGATACTGCTGTGTACTATTGCGCGAGAGCCCTGACCTACTACGACTA CGAG TTCGCGTACTGGGGCCAGGGGACTCTCGTGACCGTGTCCAGCGGTGGTGGAGGTTCCGGA GGCG GAGGTTCTGGTGGCGGGGGATCAGAAATCGTGCTGACTCAGTCCCCTGCGACCTTGTCCC TGAG CCCTGGAGAACGGGCCACCCTGAGCTGTAGAGCCAGCCAGAGCATCGGGACAAATATTCA CTGG TACCAGCAGAAACCCGGACAAGCACCACGGCTGCTGATCTACTACGCCTCCGAGTCGATT TCCG GAATCCCGGCTCGCTTTTCGGGGTCTGGATCGGGAACGGACTTCACTCTGACCATCTCGT CGCT GGAACCCGAGGATTTCGCCGTGTACTACTGCCAACAGAACAACAATTGGCCGACCACGTT CGGC CAGGGCACCAAGCTCGAGATTAAGGGATCACTGGAAGCGGCCGCAACCACAACACCTGCT CCAA GGCCCCCCACACCCGCTCCAACTATAGCCAGCCAACCATTGAGCCTCAGACCTGAAGCTT GCAG GCCCGCAGCAGGAGGCGCCGTCCATACGCGAGGCCTGGACTTCGCGTGTGATATTTATAT TTGG GCCCCTTTGGCCGGAACATGTGGGGTGTTGCTTCTCTCCCTTGTGATCACTCTGTATTGT AAGC GCGGGAGAAAGAAGCTCCTGTACATCTTCAAGCAGCCTTTTATGCGACCTGTGCAAACCA CTCA 
GGAAGAAGATGGGTGTTCATGCCGCTTCCCCGAGGAGGAAGAAGGAGGGTGTGAACTGAG GGTG AAAT T T T C T AGAAG C G C C GAT G C T C C C G CAT AT C AG C AG G G T C AGAAT C AG C T C T AC AAT GAAT T GAAT C T C G G C AG G C GAGAAGAG T AC GAT G T T C T G GAC AAGAGAC G G G G C AG G GAT C C C GAGAT GGGGGGAAAGCCCCGGAGAAAAAATCCTCAGGAGGGGTTGTACAATGAGCTGCAGAAGGA CAAG AT G G C T GAAG C C TAT AG C GAGAT C G GAAT GAAAG G C GAAAGAC G C AGAG G C AAG G G G CAT GAC G GTCTGTACCAGGGTCTCTCTACAGCCACCAAGGACACTTATGATGCGTTGCATATGCAAG CCTT GCCACCCCGCTAAAGCGGCCGCGTCGAGTCTAGAGGGCCCGTTTAAACCCGCTGATCAGC CTCG ACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACC CTGG AAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGA GTAG GTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGA CAAT AGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGATTTGGCTACAGCAACAGGGTGGTGG ACCT CAT G G C C C AC AT G G C C T C C AAG GAG T AAGAC C C C T G GAC C AC C AG C C C C AG C AAGAG C AC AAGA GGAAGAGAGAGACCCTCACTGCTGGGGAGTCCCTGCCACACTCAGTCCCCCACCACACTG AATC TCCCCTCCTCACAGTTGCCATGTAGACCCCTTGAAGAGGGGAGGGGCCTAGGGAGCCGCA CCTT GTCATGTACCATCAATAAAGTACCCTGTGCTCAACCAGTTACTTGTCCTGTCTTATTCTA GGGT CTGGGGCAGAGGGGAGGGAAGCTGGGCTTGTGTCAAGGTGAGACATTCTTGCTGGGGAGG GACC TGGTATGTTCTCCTCAGACTGAGGGTAGGGCCTCCAAACAGCCTTGCTTGCTTCGAGAAC CATT TGCTTCCCGCTCAGACGTCTTGAGTGCTACAGGAAGCTGGCACCACTACTTCAGAGAACA AGGC CTTTTCCTCTCCTCGCTCCAGTAGATCTAGGAACCCCTAGTGATGGAGTTGGCCACTCCC TCTC TGCGCGCTCGCTCGCTCACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGCTTTG CCCG GGCGGCCTCAGTGAGCGAGCGAGCGCGCAGCTGCCTGCAGG

Exemplary Donor Template Sequences

[0657] In some embodiments, a donor template comprises, in 5' to 3' order, a target sequence 5' homology arm (which optionally comprises an optimized sequence that is not a wild-type sequence), a first regulatory element that enables expression of a cargo sequence as a separate translational product (e.g., an IRES sequence and/or a 2A element), a cargo sequence (e.g., a gene product of interest), optionally a second regulatory element that enables expression of a cargo sequence as a separate translational product (e.g., an IRES sequence and/or a 2A element), optionally a second cargo sequence (e.g., a gene product of interest), optionally a 3' UTR, a polyadenylation signal (e.g., a BGHpA signal), and a target sequence 3' homology arm (which optionally comprises an optimized sequence that is not a wild-type sequence).
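The 5' to 3' ordering described above can be illustrated with a minimal Python sketch. The function name, parameter names, and the short toy sequences are hypothetical placeholders chosen for illustration; they are not taken from the specification and do not correspond to any SEQ ID NO.

```python
from typing import Optional


def assemble_donor_template(
    homology_arm_5: str,
    regulatory_element: str,
    cargo: str,
    polya_signal: str,
    homology_arm_3: str,
    second_regulatory_element: Optional[str] = None,
    second_cargo: Optional[str] = None,
    utr_3: Optional[str] = None,
) -> str:
    """Concatenate donor template elements in 5' to 3' order.

    Optional elements (second regulatory element, second cargo, 3' UTR)
    are skipped when they are not supplied.
    """
    ordered_parts = [
        homology_arm_5,             # target sequence 5' homology arm
        regulatory_element,         # e.g., a 2A element or an IRES
        cargo,                      # first cargo sequence
        second_regulatory_element,  # optional
        second_cargo,               # optional
        utr_3,                      # optional 3' UTR
        polya_signal,               # e.g., a BGHpA signal
        homology_arm_3,             # target sequence 3' homology arm
    ]
    return "".join(part for part in ordered_parts if part)


# Toy placeholder sequences (assumptions, not real elements).
single_cargo = assemble_donor_template("AAAA", "CC", "ATGTAA", "TTTT", "GGGG")
dual_cargo = assemble_donor_template(
    "AAAA", "CC", "ATGTAA", "TTTT", "GGGG",
    second_regulatory_element="CC",
    second_cargo="ATGTGA",
)
```

Keeping the optional elements as keyword arguments mirrors the "optionally" language of the paragraph: the same function covers single-cargo and dual-cargo layouts.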

[0658] In some embodiments, a donor template comprises or consists of the sequence of any one of SEQ ID NOs: 38-57 and 205-218. In some embodiments, a donor template comprises or consists of a sequence that is at least 85%, 90%, 95%, 98% or 99% identical to any one of SEQ ID NOs: 38-57 and 205-218.
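As a rough illustration of the percent identity thresholds recited above, the following Python sketch compares two sequences that are assumed to be already aligned to equal length (gaps written as "-"). The helper names are hypothetical, and identity is computed here as matching positions divided by alignment length; other conventions exist, and the specification's own definition of percent identity governs.

```python
THRESHOLDS = (85, 90, 95, 98, 99)  # percent identity levels recited above


def normalize(seq: str) -> str:
    """Remove whitespace (e.g., listing line breaks) and upper-case."""
    return "".join(seq.split()).upper()


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same non-gap residue."""
    a, b = normalize(aligned_a), normalize(aligned_b)
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def thresholds_met(aligned_a: str, aligned_b: str) -> list:
    """Return the recited thresholds that the pair satisfies."""
    pid = percent_identity(aligned_a, aligned_b)
    return [t for t in THRESHOLDS if pid >= t]


# Toy example: 9 of 10 aligned positions match.
example_identity = percent_identity("ACGT ACGT AC", "ACGTACGTAA")
example_levels = thresholds_met("ACGT ACGT AC", "ACGTACGTAA")
```

The `normalize` step is included because sequences copied from a published listing often carry spaces and line breaks that are not part of the sequence.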

SEQ ID NO: 38 - exemplary donor template for insertion at GAPDH locus GAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATGGCCGCGGGGCTCTCCAGAA CATC ATCCCTGCCTCTACTGGCGCTGCCAAGGCTGTGGGCAAGGTCATCCCTGAGCTGAACGGG AAGC TCACTGGCATGGCCTTCCGTGTCCCCACTGCCAACGTGTCAGTGGTGGACCTGACCTGCC GTCT AGAAAAACCTGCCAAATATGATGACATCAAGAAGGTGGTGAAGCAGGCGTCGGAGGGCCC CCTC AAGGGCATCCTGGGCTACACTGAGCACCAGGTGGTCTCCTCTGACTTCAACAGCGACACC CACT CCTCCACCTTTGACGCTGGGGCTGGCATTGCCCTCAACGACCACTTTGTCAAGCTCATTT CCTG GTATGTGGCTGGGGCCAGAGACTGGCTCTTAAAAAGTGCAGGGTCTGGCGCCCTCTGGTG GCTG G C T C AGAAAAAG G G C C C T GAC AAC T C T T T AC AT C T T C TAG G T AT GAC AAC GAG T T C G GAT AT AG C AAT AGAG T G G T C GAT C T GAT G G C T CAT AT G G C TAG C AAAGAG GAG G G C AGAG GAAG T C T T C T A ACATGCGGTGACGTGGAGGAGAATCCTGGCCCGATGGTGAGCAAGGGCGAGGAGCTGTTC ACCG GGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGT CCGG CGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGG CAAG CTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGC CGCT ACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCC AGGA GCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGA GGGC GACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATC CTGG G G C AC AAG C T G GAG T AC AAC T AC AAC AG C C AC AAC GTCTATAT CAT G G C C GAC AAG C AGAAGAA CGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGC CGAC CACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTAC CTGA GCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGG AGTT CGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGGAGGGCAGAGGAAG TCTT CTAACATGCGGTGACGTGGAGGAGAATCCTGGCCCGATGGTGAGCAAGGGCGAGGAGGAT AACA TGGCCATCATCAAGGAGTTCATGCGCTTCAAGGTGCACATGGAGGGCTCCGTGAACGGCC ACGA GTTCGAGATCGAGGGCGAGGGCGAGGGCCGCCCCTACGAGGGCACCCAGACCGCCAAGCT GAAG GTGACCAAGGGTGGCCCCCTGCCCTTCGCCTGGGACATCCTGTCCCCTCAGTTCATGTAC GGCT CCAAGGCCTACGTGAAGCACCCCGCCGACATCCCCGACTACTTGAAGCTGTCCTTCCCCG AGGG CTTCAAGTGGGAGCGCGTGATGAACTTCGAGGACGGCGGCGTGGTGACCGTGACCCAGGA CTCC 
TCCCTGCAGGACGGCGAGTTCATCTACAAGGTGAAGCTGCGCGGCACCAACTTCCCCTCC GACG GCCCCGTAATGCAGAAGAAGACAATGGGCTGGGAGGCCTCCTCCGAGCGGATGTACCCCG AGGA CGGCGCCCTGAAGGGCGAGATCAAGCAGAGGCTGAAGCTGAAGGACGGCGGCCACTACGA CGCT GAGGTCAAGACCACCTACAAGGCCAAGAAGCCCGTGCAGCTGCCCGGCGCCTACAACGTC AACA T C AAG T T G GAC AT C AC C T C C C AC AAC GAG GAC T AC AC CAT C G T G GAAC AG T AC GAAC G C G C C GA GGGCCGCCACTCCACCGGCGGCATGGACGAGCTGTACAAGTAAGCGGCCGCGTCGAGTCT AGAG GGCCCGTTTAAACCCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTG TTTG CCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATA AAAT GAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGG CAGG ACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTA TGGA TTTGGCTACAGCAACAGGGTGGTGGACCTCATGGCCCACATGGCCTCCAAGGAGTAAGAC CCCT G GAC CACCAGCCCCAG C AAG AG C AC AAG AG GAAG AG AG AG AC CCTCACTGCTGGG GAG T C C C T G CCACACTCAGTCCCCCACCACACTGAATCTCCCCTCCTCACAGTTGCCATGTAGACCCCT TGAA GAGGGGAGGGGCCTAGGGAGCCGCACCTTGTCATGTACCATCAATAAAGTACCCTGTGCT CAAC CAGTTACTTGTCCTGTCTTATTCTAGGGTCTGGGGCAGAGGGGAGGGAAGCTGGGCTTGT GTCA AGGTGAGACATTCTTGCTGGGGAGGGACCTGGTATGTTCTCCTCAGACTGAGGGTAGGGC CTCC AAACAGCCTTGCTTGCTTCGAGAACCATTTGCTTCCCGCTCAGACGTCTTGAGTGCTACA GGAA GCTGGCACCACTACTTCAGAGAACAAGGCCTTTTCCTCTCCTCGCTCCAGT

SEQ ID NO: 39 - exemplary donor template for insertion at GAPDH locus GAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATGGCCGCGGGGCTCTCCAGAA CATC ATCCCTGCCTCTACTGGCGCTGCCAAGGCTGTGGGCAAGGTCATCCCTGAGCTGAACGGG AAGC TCACTGGCATGGCCTTCCGTGTCCCCACTGCCAACGTGTCAGTGGTGGACCTGACCTGCC GTCT AGAAAAACCTGCCAAATATGATGACATCAAGAAGGTGGTGAAGCAGGCGTCGGAGGGCCC CCTC AAGGGCATCCTGGGCTACACTGAGCACCAGGTGGTCTCCTCTGACTTCAACAGCGACACC CACT CCTCCACCTTTGACGCTGGGGCTGGCATTGCCCTCAACGACCACTTTGTCAAGCTCATTT CCTG GTATGTGGCTGGGGCCAGAGACTGGCTCTTAAAAAGTGCAGGGTCTGGCGCCCTCTGGTG GCTG G C T C AGAAAAAG G G C C C T GAC AAC T C T T T AC AT C T T C TAG G T AT GAC AAC GAG T T C G GAT AT AG CAATAGAGTGGTCGATCTGATGGCTCATATGGCTAGCAAAGAGGGAAGCGGAGCTACTAA CTTC AGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAACCCTGGACCTATGGTGAGCAAGGGC GAGG AGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACA AGTT CAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCAT CTGC ACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTG CAGT GCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCG AAGG CTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGA GGTG AAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAG GACG G C AAC AT C C T G G G G C AC AAG C T G GAG T AC AAC T AC AAC AG C C AC AAC GTCTATAT CAT G G C C GA C AAG C AGAAGAAC G G CAT C AAG G T GAAC T T C AAGAT C C G C C AC AAC AT C GAG GAC G G C AG C G T G CAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCC GACA ACCACTACCT GAG CACCCAGTCCGCCCT GAG C AAAG AC C C C AAC GAG AAG C G C GAT CACATGGT CCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGTA ACCC CTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGAAGCCGCTTGGAATAAGGCCGGTGTGC GTTT GTCTATATGTTATTTTCCACCATATTGCCGTCTTTTGGCAATGTGAGGGCCCGGAAACCT GGCC CTGTCTTCTTGACGAGCATTCCTAGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTC TGTT GAATGTCGTGAAGGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGC GACC CTTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGT GTAT AAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTGTGAGTTGGATAGTTGTGG AAAG 
AGTCAAATGGCTCTCCTCAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAGAAGGTACC CCAT TGTATGGGATCTGATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTA AAAA AACGTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAAA TGGT GAG C AAG G G C GAG GAG GAT AAC AT G G C CAT CAT C AAG GAG T T CAT G C G C T T C AAG G T G C AC AT G GAGGGCTCCGTGAACGGCCACGAGTTCGAGATCGAGGGCGAGGGCGAGGGCCGCCCCTAC GAGG GCACCCAGACCGCCAAGCTGAAGGTGACCAAGGGTGGCCCCCTGCCCTTCGCCTGGGACA TCCT GTCCCCTCAGTTCATGTACGGCTCCAAGGCCTACGTGAAGCACCCCGCCGACATCCCCGA CTAC TTGAAGCTGTCCTTCCCCGAGGGCTTCAAGTGGGAGCGCGTGATGAACTTCGAGGACGGC GGCG TGGTGACCGTGACCCAGGACTCCTCCCTGCAGGACGGCGAGTTCATCTACAAGGTGAAGC TGCG CGGCACCAACTTCCCCTCCGACGGCCCCGTAATGCAGAAGAAGACAATGGGCTGGGAGGC CTCC TCCGAGCGGATGTACCCCGAGGACGGCGCCCTGAAGGGCGAGATCAAGCAGAGGCTGAAG CTGA AGGACGGCGGCCACTACGACGCTGAGGTCAAGACCACCTACAAGGCCAAGAAGCCCGTGC AGCT GCCCGGCGCC T AC AAC G T C AAC AT C AAG T T G GAC AT C AC C T C C C AC AAC GAG GAC T AC AC CAT C GTGGAACAGTACGAACGCGCCGAGGGCCGCCACTCCACCGGCGGCATGGACGAGCTGTAC AAGT AAGCGGCCGCGTCGAGTCTAGAGGGCCCGTTTAAACCCGCTGATCAGCCTCGACTGTGCC TTCT AGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCC ACTC CCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATT CTAT TCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCA TGCT GGGGATGCGGTGGGCTCTATGGATTTGGCTACAGCAACAGGGTGGTGGACCTCATGGCCC ACAT G G C C T C C AAG GAG T AAGAC C C C T G GAC C AC C AG C C C C AG C AAGAG C AC AAGAG GAAGAGAGAGA CCCTCACTGCTGGGGAGTCCCTGCCACACTCAGTCCCCCACCACACTGAATCTCCCCTCC TCAC

AGTTGCCATGTAGACCCCTTGAAGAGGGGAGGGGCCTAGGGAGCCGCACCTTGTCATGTACCAT
CAATAAAGTACCCTGTGCTCAACCAGTTACTTGTCCTGTCTTATTCTAGGGTCTGGGGCAGAGG
GGAGGGAAGCTGGGCTTGTGTCAAGGTGAGACATTCTTGCTGGGGAGGGACCTGGTATGTTCTC
CTCAGACTGAGGGTAGGGCCTCCAAACAGCCTTGCTTGCTTCGAGAACCATTTGCTTCCCGCTC
AGACGTCTTGAGTGCTACAGGAAGCTGGCACCACTACTTCAGAGAACAAGGCCTTTTCCTCTCC
TCGCTCCAGT

SEQ ID NO: 40 - exemplary donor template for insertion at GAPDH locus

GAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATGGCCGCGGGGCTCTCCA GAACATC ATCCCTGCCTCTACTGGCGCTGCCAAGGCTGTGGGCAAGGTCATCCCTGAGCTGAACGGG AAGC TCACTGGCATGGCCTTCCGTGTCCCCACTGCCAACGTGTCAGTGGTGGACCTGACCTGCC GTCT AGAAAAACCTGCCAAATATGATGACATCAAGAAGGTGGTGAAGCAGGCGTCGGAGGGCCC CCTC AAGGGCATCCTGGGCTACACTGAGCACCAGGTGGTCTCCTCTGACTTCAACAGCGACACC CACT CCTCCACCTTTGACGCTGGGGCTGGCATTGCCCTCAACGACCACTTTGTCAAGCTCATTT CCTG GTATGTGGCTGGGGCCAGAGACTGGCTCTTAAAAAGTGCAGGGTCTGGCGCCCTCTGGTG GCTG G C T C AGAAAAAG G G C C C T GAC AAC T C T T T AC AT C T T C TAG G T AT GAC AAC GAG T T C G GAT AT AG CAATAGAGTGGTCGATCTGATGGCTCATATGGCTAGCAAAGAGGGAAGCGGAGCTACTAA CTTC AGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAACCCTGGACCTATGGTGAGCAAGGGC GAGG AGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACA AGTT CAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCAT CTGC ACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTG CAGT GCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCG AAGG CTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGA GGTG AAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAG GACG G C AAC AT C C T G G G G C AC AAG C T G GAG T AC AAC T AC AAC AG C C AC AAC GTCTATAT CAT G G C C GA C AAG C AGAAGAAC G G CAT C AAG G T GAAC T T C AAGAT C C G C C AC AAC AT C GAG GAC G G C AG C G T G CAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCC GACA ACCACTACCT GAG CACCCAGTCCGCCCT GAG C AAAG AC C C C AAC GAG AAG C G C GAT CACATGGT CCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGGG AAGC GGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAACCCTGGACCT ATGG T GAG C AAG G G C GAG GAG GAT AAC AT G G C CAT CAT C AAG GAG T T CAT G C G C T T C AAG G T G C AC AT GGAGGGCTCCGTGAACGGCCACGAGTTCGAGATCGAGGGCGAGGGCGAGGGCCGCCCCTA CGAG GGCACCCAGACCGCCAAGCTGAAGGTGACCAAGGGTGGCCCCCTGCCCTTCGCCTGGGAC ATCC TGTCCCCTCAGTTCATGTACGGCTCCAAGGCCTACGTGAAGCACCCCGCCGACATCCCCG ACTA CTTGAAGCTGTCCTTCCCCGAGGGCTTCAAGTGGGAGCGCGTGATGAACTTCGAGGACGG CGGC GTGGTGACCGTGACCCAGGACTCCTCCCTGCAGGACGGCGAGTTCATCTACAAGGTGAAG 
CTGC GCGGCACCAACTTCCCCTCCGACGGCCCCGTAATGCAGAAGAAGACAATGGGCTGGGAGG CCTC CTCCGAGCGGATGTACCCCGAGGACGGCGCCCTGAAGGGCGAGATCAAGCAGAGGCTGAA GCTG AAGGACGGCGGCCACTACGACGCTGAGGTCAAGACCACCTACAAGGCCAAGAAGCCCGTG CAGC TGCCCGGCGCC T AC AAC G T C AAC AT C AAG T T G GAC AT C AC C T C C C AC AAC GAG GAC T AC AC CAT CGTGGAACAGTACGAACGCGCCGAGGGCCGCCACTCCACCGGCGGCATGGACGAGCTGTA CAAG TAAGCGGCCGCGTCGAGTCTAGAGGGCCCGTTTAAACCCGCTGATCAGCCTCGACTGTGC CTTC TAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGC CACT CCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCAT TCTA TTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGC ATGC TGGGGATGCGGTGGGCTCTATGGATTTGGCTACAGCAACAGGGTGGTGGACCTCATGGCC CACA T G G C C T C C AAG GAG T AAGAC C C C T G GAC C AC C AG C C C C AG C AAGAG C AC AAGAG GAAGAGAGAG ACCCTCACTGCTGGGGAGTCCCTGCCACACTCAGTCCCCCACCACACTGAATCTCCCCTC CTCA CAGTTGCCATGTAGACCCCTTGAAGAGGGGAGGGGCCTAGGGAGCCGCACCTTGTCATGT ACCA TCAATAAAGTACCCTGTGCTCAACCAGTTACTTGTCCTGTCTTATTCTAGGGTCTGGGGC AGAG GGGAGGGAAGCTGGGCTTGTGTCAAGGTGAGACATTCTTGCTGGGGAGGGACCTGGTATG TTCT CCTCAGACTGAGGGTAGGGCCTCCAAACAGCCTTGCTTGCTTCGAGAACCATTTGCTTCC CGCT CAGACGTCTTGAGTGCTACAGGAAGCTGGCACCACTACTTCAGAGAACAAGGCCTTTTCC TCTC CTCGCTCCAGT

SEQ ID NO: 41 - exemplary donor template for insertion at GAPDH locus

GAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATGGCCGCGGGGCTCTCCA GAACATC ATCCCTGCCTCTACTGGCGCTGCCAAGGCTGTGGGCAAGGTCATCCCTGAGCTGAACGGG AAGC TCACTGGCATGGCCTTCCGTGTCCCCACTGCCAACGTGTCAGTGGTGGACCTGACCTGCC GTCT AGAAAAACCTGCCAAATATGATGACATCAAGAAGGTGGTGAAGCAGGCGTCGGAGGGCCC CCTC AAGGGCATCCTGGGCTACACTGAGCACCAGGTGGTCTCCTCTGACTTCAACAGCGACACC CACT CCTCCACCTTTGACGCTGGGGCTGGCATTGCCCTCAACGACCACTTTGTCAAGCTCATTT CCTG GTATGTGGCTGGGGCCAGAGACTGGCTCTTAAAAAGTGCAGGGTCTGGCGCCCTCTGGTG GCTG G C T C AGAAAAAG G G C C C T GAC AAC T C T T T AC AT C T T C TAG G T AT GAC AAC GAG T T C G GAT AT AG CAATAGAGTGGTCGATCTGATGGCTCATATGGCTAGCAAAGAGGGAAGCGGAGCTACTAA CTTC AGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAACCCTGGACCTATGGTGAGCAAGGGC GAGG AGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACA AGTT CAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCAT CTGC ACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTG CAGT GCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCG AAGG CTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGA GGTG AAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAG GACG G C AAC AT C C T G G G G C AC AAG C T G GAG T AC AAC T AC AAC AG C C AC AAC GTCTATAT CAT G G C C GA C AAG C AGAAGAAC G G CAT C AAG G T GAAC T T C AAGAT C C G C C AC AAC AT C GAG GAC G G C AG C G T G CAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCC GACA ACCACTACCT GAG CACCCAGTCCGCCCT GAG C AAAG AC C C C AAC GAG AAG C G C GAT CACATGGT CCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGGA GGGC AGAGGAAGTCTTCTAACATGCGGTGACGTGGAGGAGAATCCTGGCCCGATGGTGAGCAAG GGCG AGGAGGATAACATGGCCATCATCAAGGAGTTCATGCGCTTCAAGGTGCACATGGAGGGCT CCGT GAACGGCCACGAGTTCGAGATCGAGGGCGAGGGCGAGGGCCGCCCCTACGAGGGCACCCA GACC GCCAAGCTGAAGGTGACCAAGGGTGGCCCCCTGCCCTTCGCCTGGGACATCCTGTCCCCT CAGT TCATGTACGGCTCCAAGGCCTACGTGAAGCACCCCGCCGACATCCCCGACTACTTGAAGC TGTC CTTCCCCGAGGGCTTCAAGTGGGAGCGCGTGATGAACTTCGAGGACGGCGGCGTGGTGAC CGTG ACCCAGGACTCCTCCCTGCAGGACGGCGAGTTCATCTACAAGGTGAAGCTGCGCGGCACC AACT 
TCCCCTCCGACGGCCCCGTAATGCAGAAGAAGACAATGGGCTGGGAGGCCTCCTCCGAGC GGAT GTACCCCGAGGACGGCGCCCTGAAGGGCGAGATCAAGCAGAGGCTGAAGCTGAAGGACGG CGGC CACTACGACGCTGAGGTCAAGACCACCTACAAGGCCAAGAAGCCCGTGCAGCTGCCCGGC GCCT AC AAC G T C AAC AT C AAG T T G GAC AT C AC C T C C C AC AAC GAG GAC T AC AC C AT C G T G GAAC AG T A CGAACGCGCCGAGGGCCGCCACTCCACCGGCGGCATGGACGAGCTGTACAAGTAAGCGGC CGCG TCGAGTCTAGAGGGCCCGTTTAAACCCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCC AGCC ATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGT CCTT TCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGG GGTG GGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATG CGGT GGGCTCTATGGATTTGGCTACAGCAACAGGGTGGTGGACCTCATGGCCCACATGGCCTCC AAGG AG T AAGAC C C C T G GAC C AC C AG C C C C AG C AAGAG C AC AAGAG GAAGAGAGAGAC C C T C AC T G C T GGGGAGTCCCTGCCACACTCAGTCCCCCACCACACTGAATCTCCCCTCCTCACAGTTGCC ATGT AGACCCCTTGAAGAGGGGAGGGGCCTAGGGAGCCGCACCTTGTCATGTACCATCAATAAA GTAC CCTGTGCTCAACCAGTTACTTGTCCTGTCTTATTCTAGGGTCTGGGGCAGAGGGGAGGGA AGCT GGGCTTGTGTCAAGGTGAGACATTCTTGCTGGGGAGGGACCTGGTATGTTCTCCTCAGAC TGAG GGTAGGGCCTCCAAACAGCCTTGCTTGCTTCGAGAACCATTTGCTTCCCGCTCAGACGTC TTGA GTGCTACAGGAAGCTGGCACCACTACTTCAGAGAACAAGGCCTTTTCCTCTCCTCGCTCC AGT

SEQ ID NO: 42 - exemplary donor template for insertion at GAPDH locus

GAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATGGCCGCGGGGCTCTCCA GAACATC ATCCCTGCCTCTACTGGCGCTGCCAAGGCTGTGGGCAAGGTCATCCCTGAGCTGAACGGG AAGC TCACTGGCATGGCCTTCCGTGTCCCCACTGCCAACGTGTCAGTGGTGGACCTGACCTGCC GTCT AGAAAAACCTGCCAAATATGATGACATCAAGAAGGTGGTGAAGCAGGCGTCGGAGGGCCC CCTC AAGGGCATCCTGGGCTACACTGAGCACCAGGTGGTCTCCTCTGACTTCAACAGCGACACC CACT CCTCCACCTTTGACGCTGGGGCTGGCATTGCCCTCAACGACCACTTTGTCAAGCTCATTT CCTG GTATGTGGCTGGGGCCAGAGACTGGCTCTTAAAAAGTGCAGGGTCTGGCGCCCTCTGGTG GCTG G C T C AGAAAAAG G G C C C T GAC AAC T C T T T AC AT C T T C TAG G T AT GAC AAC GAG T T C G GAT AT AG C AAT AGAG T G G T C GAT C T GAT G G C T CAT AT G G C TAG C AAAGAG GAG G G C AGAG GAAG T C T T C T A ACATGCGGTGACGTGGAGGAGAATCCTGGCCCGATGGTGAGCAAGGGCGAGGAGCTGTTC ACCG GGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGT CCGG CGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGG CAAG CTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGC CGCT ACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCC AGGA GCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGA GGGC GACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATC CTGG G G C AC AAG C T G GAG T AC AAC T AC AAC AG C C AC AAC GTCTATAT CAT G G C C GAC AAG C AGAAGAA CGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGC CGAC CACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTAC CTGA GCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGG AGTT CGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGGGAAGCGGAGCTAC TAAC TTCAGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAACCCTGGACCTATGGTGAGCAAG GGCG AGGAGGATAACATGGCCATCATCAAGGAGTTCATGCGCTTCAAGGTGCACATGGAGGGCT CCGT GAACGGCCACGAGTTCGAGATCGAGGGCGAGGGCGAGGGCCGCCCCTACGAGGGCACCCA GACC GCCAAGCTGAAGGTGACCAAGGGTGGCCCCCTGCCCTTCGCCTGGGACATCCTGTCCCCT CAGT TCATGTACGGCTCCAAGGCCTACGTGAAGCACCCCGCCGACATCCCCGACTACTTGAAGC TGTC CTTCCCCGAGGGCTTCAAGTGGGAGCGCGTGATGAACTTCGAGGACGGCGGCGTGGTGAC CGTG ACCCAGGACTCCTCCCTGCAGGACGGCGAGTTCATCTACAAGGTGAAGCTGCGCGGCACC AACT 
TCCCCTCCGACGGCCCCGTAATGCAGAAGAAGACAATGGGCTGGGAGGCCTCCTCCGAGC GGAT GTACCCCGAGGACGGCGCCCTGAAGGGCGAGATCAAGCAGAGGCTGAAGCTGAAGGACGG CGGC CACTACGACGCTGAGGTCAAGACCACCTACAAGGCCAAGAAGCCCGTGCAGCTGCCCGGC GCCT AC AAC G T C AAC AT C AAG T T G GAC AT C AC C T C C C AC AAC GAG GAC T AC AC C AT C G T G GAAC AG T A CGAACGCGCCGAGGGCCGCCACTCCACCGGCGGCATGGACGAGCTGTACAAGTAAGCGGC CGCG TCGAGTCTAGAGGGCCCGTTTAAACCCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCC AGCC ATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGT CCTT TCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGG GGTG GGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATG CGGT GGGCTCTATGGATTTGGCTACAGCAACAGGGTGGTGGACCTCATGGCCCACATGGCCTCC AAGG AG T AAGAC C C C T G GAC C AC C AG C C C C AG C AAGAG C AC AAGAG GAAGAGAGAGAC C C T C AC T G C T GGGGAGTCCCTGCCACACTCAGTCCCCCACCACACTGAATCTCCCCTCCTCACAGTTGCC ATGT AGACCCCTTGAAGAGGGGAGGGGCCTAGGGAGCCGCACCTTGTCATGTACCATCAATAAA GTAC CCTGTGCTCAACCAGTTACTTGTCCTGTCTTATTCTAGGGTCTGGGGCAGAGGGGAGGGA AGCT GGGCTTGTGTCAAGGTGAGACATTCTTGCTGGGGAGGGACCTGGTATGTTCTCCTCAGAC TGAG GGTAGGGCCTCCAAACAGCCTTGCTTGCTTCGAGAACCATTTGCTTCCCGCTCAGACGTC TTGA GTGCTACAGGAAGCTGGCACCACTACTTCAGAGAACAAGGCCTTTTCCTCTCCTCGCTCC AGT

SEQ ID NO: 43 - exemplary donor template for insertion at GAPDH locus

GAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATGGCCGCGGGGCTCTCCA GAACATC ATCCCTGCCTCTACTGGCGCTGCCAAGGCTGTGGGCAAGGTCATCCCTGAGCTGAACGGG AAGC TCACTGGCATGGCCTTCCGTGTCCCCACTGCCAACGTGTCAGTGGTGGACCTGACCTGCC GTCT AGAAAAACCTGCCAAATATGATGACATCAAGAAGGTGGTGAAGCAGGCGTCGGAGGGCCC CCTC AAGGGCATCCTGGGCTACACTGAGCACCAGGTGGTCTCCTCTGACTTCAACAGCGACACC CACT CCTCCACCTTTGACGCTGGGGCTGGCATTGCCCTCAACGACCACTTTGTCAAGCTCATTT CCTG GTATGTGGCTGGGGCCAGAGACTGGCTCTTAAAAAGTGCAGGGTCTGGCGCCCTCTGGTG GCTG G C T C AGAAAAAG G G C C C T GAC AAC T C T T T AC AT C T T C TAG G T AT GAC AAC GAG T T C G GAT AT AG C AAT AGAG T G G T C GAT C T GAT G G C T CAT AT G G C TAG C AAAGAG GAG G G C AGAG GAAG T C T T C T A ACATGCGGTGACGTGGAGGAGAATCCTGGCCCGATGGTGAGCAAGGGCGAGGAGCTGTTC ACCG GGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGT CCGG CGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGG CAAG CTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGC CGCT ACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCC AGGA GCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGA GGGC GACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATC CTGG G G C AC AAG C T G GAG T AC AAC T AC AAC AG C C AC AAC GTCTATAT CAT G G C C GAC AAG C AGAAGAA CGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGC CGAC CACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTAC CTGA GCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGG AGTT CGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGTAACCCCTCTCCCT CCCC CCCCCCTAACGTTACTGGCCGAAGCCGCTTGGAATAAGGCCGGTGTGCGTTTGTCTATAT GTTA TTTTCCACCATATTGCCGTCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTC TTGA CGAGCATTCCTAGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCG TGAA GGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAG GCAG CGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGTGTATAAGATACA CCTG CAAAGGCGGCACAACCCCAGTGCCACGTTGTGAGTTGGATAGTTGTGGAAAGAGTCAAAT GGCT CTCCTCAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAGAAGGTACCCCATTGTATGGG ATCT 
GATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAAACGTCTA GGCC CCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAAATGGTGAGCAAGG GCGA GGAGGATAACATGGCCATCATCAAGGAGTTCATGCGCTTCAAGGTGCACATGGAGGGCTC CGTG AACGGCCACGAGTTCGAGATCGAGGGCGAGGGCGAGGGCCGCCCCTACGAGGGCACCCAG ACCG CCAAGCTGAAGGTGACCAAGGGTGGCCCCCTGCCCTTCGCCTGGGACATCCTGTCCCCTC AGTT CATGTACGGCTCCAAGGCCTACGTGAAGCACCCCGCCGACATCCCCGACTACTTGAAGCT GTCC TTCCCCGAGGGCTTCAAGTGGGAGCGCGTGATGAACTTCGAGGACGGCGGCGTGGTGACC GTGA CCCAGGACTCCTCCCTGCAGGACGGCGAGTTCATCTACAAGGTGAAGCTGCGCGGCACCA ACTT CCCCTCCGACGGCCCCGTAATGCAGAAGAAGACAATGGGCTGGGAGGCCTCCTCCGAGCG GATG TACCCCGAGGACGGCGCCCTGAAGGGCGAGATCAAGCAGAGGCTGAAGCTGAAGGACGGC GGCC ACTACGACGCTGAGGTCAAGACCACCTACAAGGCCAAGAAGCCCGTGCAGCTGCCCGGCG CCTA C AAC G T C AAC AT C AAG T T G GAC AT C AC C T C C C AC AAC GAG GAC T AC AC C AT C G T G GAAC AG T AC GAACGCGCCGAGGGCCGCCACTCCACCGGCGGCATGGACGAGCTGTACAAGTAAGCGGCC GCGT CGAGTCTAGAGGGCCCGTTTAAACCCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCA GCCA TCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTC CTTT CCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGG GTGG GGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATGC GGTG GGCTCTATGGATTTGGCTACAGCAACAGGGTGGTGGACCTCATGGCCCACATGGCCTCCA AGGA G T AAGAC C C C T G GAC C AC C AG C C C C AG C AAGAG C AC AAGAG GAAGAGAGAGAC C C T C AC T G C T G GGGAGTCCCTGCCACACTCAGTCCCCCACCACACTGAATCTCCCCTCCTCACAGTTGCCA TGTA GACCCCTTGAAGAGGGGAGGGGCCTAGGGAGCCGCACCTTGTCATGTACCATCAATAAAG TACC CTGTGCTCAACCAGTTACTTGTCCTGTCTTATTCTAGGGTCTGGGGCAGAGGGGAGGGAA GCTG GGCTTGTGTCAAGGTGAGACATTCTTGCTGGGGAGGGACCTGGTATGTTCTCCTCAGACT GAGG GTAGGGCCTCCAAACAGCCTTGCTTGCTTCGAGAACCATTTGCTTCCCGCTCAGACGTCT TGAG TGCTACAGGAAGCTGGCACCACTACTTCAGAGAACAAGGCCTTTTCCTCTCCTCGCTCCA GT

SEQ ID NO: 44 - exemplary donor template for insertion at GAPDH locus

GAAGCTGGGCTTGTGTCAAGGTGAGACATTCTTGCTGGGGAGGGACCTGGTATGTTCTCCTCAG
ACTGAGGGTAGGGCCTCCAAACAGCCTTGCTTGCTTCGAGAACCATTTGCTTCCCGCTCAGACG
TCTTGAGTGCTACAGGAAGCTGGCACCACTACTTCAGAGAACAAGGCCTTTTCCTCTCCTCGCT
CCAGT

SEQ ID NO: 45 - exemplary donor template for insertion at GAPDH locus

GAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATGGCCGCGGGGCTCTCCA GAACATC ATCCCTGCCTCTACTGGCGCTGCCAAGGCTGTGGGCAAGGTCATCCCTGAGCTGAACGGG AAGC TCACTGGCATGGCCTTCCGTGTCCCCACTGCCAACGTGTCAGTGGTGGACCTGACCTGCC GTCT AGAAAAACCTGCCAAATATGATGACATCAAGAAGGTGGTGAAGCAGGCGTCGGAGGGCCC CCTC AAGGGCATCCTGGGCTACACTGAGCACCAGGTGGTCTCCTCTGACTTCAACAGCGACACC CACT CCTCCACCTTTGACGCTGGGGCTGGCATTGCCCTCAACGACCACTTTGTCAAGCTCATTT CCTG GTATGTGGCTGGGGCCAGAGACTGGCTCTTAAAAAGTGCAGGGTCTGGCGCCCTCTGGTG GCTG G C T C AGAAAAAG G G C C C T GAC AAC T C T T T AC AT C T T C TAG G T AT GAC AAC GAG T T C G GAT AT AG CAATAGAGTGGTCGATCTGATGGCTCATATGGCTAGCAAAGAGGGAAGCGGAGCTACTAA CTTC AGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAACCCTGGACCTATGGATTGGACCTGG ATCC TGTTTCTGGTGGCCGCTGCCACAAGAGTGCACAGCAATTGGGTCAACGTGATCAGCGACC TGAA GAAGAT C GAG GAC C T GAT C C AGAG C AT G C AC AT C GAC G C C AC AC T G T AC AC C GAG TCCGATGTG CACCCTAGCTGCAAAGTGACCGCCATGAAGTGCTTTCTGCTGGAACTGCAAGTGATCAGC CTGG AAAGCGGCGACGCCAGCATCCACGATACCGTGGAAAACCTGATCATCCTGGCCAACAACA GCCT GAG C AG C AAC G G C AAT G T GAC C GAGAG C G G C T G C AAAGAG T G C GAG GAAC T G GAAGAGAAGAAC ATCAAAGAGTTCCTCCAGAGCTTCGTCCACATCGTGCAGATGTTCATCAACACCAGCTCT GGCG GAGGAAGCGGAGGCGGAGGATCTGGTGGTGGTGGATCTGGCGGCGGTGGTAGTGGCGGAG GTTC TCTGCAAATCACCTGTCCTCCACCTATGAGCGTGGAACACGCCGACATCTGGGTCAAGAG CTAC AG C C T G T AC AG C AGAGAG C G G T AC AT C T G C AAC AG C G G C T T C AAGAGAAAG G C C G G C AC AAG C A GCCTGACCGAGTGTGTGCTGAACAAGGCCACAAACGTGGCCCACTGGACCACACCTAGCC TGAA GTGCATCAGAGATCCCGCTCTGGTTCATCAGAGGCCTGCCCCTCCATCTACAGTGACAAC AGCT GGCGTGACCCCTCAGCCTGAGTCTCTGTCTCCATCTGGAAAAGAGCCTGCCGCCAGCTCT CCCA GCTCTAACAATACTGCTGCCACCACAGCCGCTATCGTGCCTGGATCTCAGCTGATGCCTA GCAA GAG C C C TAG C AC C G G C AC AAC AGAGAT C AG C T C T C AC GAGAG C AG C C AC G GAAC AC C T T C T C AG ACCACCGCCAAGAATTGGGAGCTGACAGCCTCTGCCTCTCATCAGCCACCTGGCGTGTAC CCAC AGGGCCACTCTGATACAACAGTGGCCATCAGCACCAGCACCGTTCTGCTGTGTGGCCTGT CTGC TGTTAGCCTGCTGGCCTGCTACCTGAAGTCTAGACAGACACCTCCTCTGGCCAGCGTGGA AATG GAAG C CAT G GAAG CTCTGCCTGT C AC AT G G G G C AC 
C AG C AG C AGAGAT GAG GAC C T C GAGAAT T GCAGCCACCACCTGTAGGCGGCCGCGTCGAGTCTAGAGGGCCCGTTTAAACCCGCTGATC AGCC TCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTG ACCC TGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTC TGAG TAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGA AGAC AATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGATTTGGCTACAGCAACAGGGTGG TGGA CCTCATGGCCCACATGGCCTCCAAGGAGTAAGACCCCTGGACCACCAGCCCCAGCAAGAG CACA AGAG GAAG AG AG AG AC CCTCACTGCTGGG GAG TCCCTGCCACACTCAGTCCCCCACCACACT G A ATCTCCCCTCCTCACAGTTGCCATGTAGACCCCTTGAAGAGGGGAGGGGCCTAGGGAGCC GCAC CTTGTCATGTACCATCAATAAAGTACCCTGTGCTCAACCAGT TACT TGTCCTGTCT TAT TCTAG GGTCTGGGGCAGAGGGGAGGGAAGCTGGGCTTGTGTCAAGGTGAGACATTCTTGCTGGGG AGGG ACCTGGTATGTTCTCCTCAGACTGAGGGTAGGGCCTCCAAACAGCCTTGCTTGCTTCGAG AACC ATTTGCTTCCCGCTCAGACGTCTTGAGTGCTACAGGAAGCTGGCACCACTACTTCAGAGA ACAA

GGCCTTTTCCTCTCCTCGCTCCAGT

SEQ ID NO: 46 - exemplary donor template for insertion at GAPDH locus

GGCTTTCCCATAATTTCCTTTCAAGGTGGGGAGGGAGGTAGAGGGGTGATGTGGGGA GTACGCT GCAGGGCCTCACTCCTTTTGCAGACCACAGTCCATGCCATCACTGCCACCCAGAAGACTG TGGA TGGCCCCTCCGGGAAACTGTGGCGTGATGGCCGCGGGGCTCTCCAGAACATCATCCCTGC CTCT ACTGGCGCTGCCAAGGCTGTGGGCAAGGTCATCCCTGAGCTGAACGGGAAGCTCACTGGC ATGG CCTTCCGTGTCCCCACTGCCAACGTGTCAGTGGTGGACCTGACCTGCCGTCTAGAAAAAC CTGC CAAATATGATGACATCAAGAAGGTGGTGAAGCAGGCGTCGGAGGGCCCCCTCAAGGGCAT CCTG GGCTACACTGAGCACCAGGTGGTCTCCTCTGACTTCAACAGCGACACCCACTCCTCCACC TTTG ACGCTGGGGCTGGCATTGCCCTCAACGACCACTTTGTCAAGCTCATCTCTTGGTACGACA ATGA G T T C G GAT AT AG C AAT AGAG T G G T C GAT C T GAT G G C T CAT AT G G C TAG C AAAGAG G GAAG C G GA GCTACTAACTTCAGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAACCCTGGACCTATG GTGA GCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACG TAAA CGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGAC CCTG AAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTG ACCT ACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGT CCGC CATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAA GACC CGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATC GACT T C AAG GAG GAC G G C AAC AT C C T G G G G C AC AAG C T G GAG T AC AAC T AC AAC AG C C AC AAC G T C T A TAT CAT G G C C GAC AAG C AGAAGAAC G G CAT C AAG G T GAAC T T C AAGAT C C G C C AC AAC AT C GAG GACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCC GTGC TGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGA AGCG CGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGA GCTG TACAAGTGAGCGGCCGCGTCGAGTCTAGAGGGCCCGTTTAAACCCGCTGATCAGCCTCGA CTGT GCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGA AGGT GCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGG TGTC ATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATA GCAG GCATGCTGGGGATGCGGTGGGCTCTATGGAGACTGGCTCTTAAAAAGTGCAGGGTCTGGC GCCC TCTGGTGGCTGGCTCAGAAAAAGGGCCCTGACAACTCTTTTCATCTTCTAGGTATGACAA CGAA TTTGGCTACAGCAACAGGGTGGTGGACCTCATGGCCCACATGGCCTCCAAGGAGTAAGAC CCCT G GAC CACCAGCCCCAG C AAG AG C AC AAG AG GAAG AG 
AG AG AC CCTCACTGCTGGG GAG T C C C T G CCACACTCAGTCCCCCACCACACTGAATCTCCCCTCCTCACAGTTGCCATGTAGACCCCT TGAA GAGGGGAGGGGCCTAGGGAGCCGCACCTTGTCATGTACCATCAATAAAGTACCCTGTGCT CAAC CAGTTACTTGTCCTGTCTTATTCTAGGGTCTGGGGCAGAGGGGAGGGAAGCTGGGCTTGT GTCA AGGTGAGACATTCTTGCTGGGGAGGGACCTGGTATGTTCTCCTCAGACTGAGGGTAGGGC CTCC AAACAGCCTTGCTTGCT

SEQ ID NO: 47 - exemplary donor template for insertion at TBP locus

G C AGAC T T C CAT T T AC AG T GAG GAG G T GAG CAT T G CAT T GAAC AAAAGAT GGCGTTTT C AC T T G GAAT TAGT TAT C T GAAGCT T TAGGAT T CC T C AG C AAT AT GAT TAT GAGAC AAGAAAG GAAGAT T CAGAAAT GAG T C T AG T T GAAG G C AG C AAT T C AGAGAAGAAGAT T C AG TTGTTATCATTGCCGTC CTGCTTGGTTTATGGCCTGGTTCAGGACCAAGGAGAGAAGTGTGAATACATGCCTCTTGA GCTA TAGAAT GAGAC G C T G GAG T C AC T AAGAT GAT T T T T TAAAAG TATTGTTT T AT AAAC AAAAAT AA GATTGTGACAAGGGATTCCACTATTAATGTTTTCATGCCTGTGCCTTAATCTGACTGGGT ATGG TGAGAATTGTGCTTGCAGCTTTAAGGTAAGAATTTTACCATCTTAATATGTTAAGAAGTG CCAT TTCAGTCTCTCATCTCTACTCCAACTTGTCTTCTTAGGTGCTAAAGTCAGAGCCGAAATC TACG AGGCCTTCGAGAACATCTACCCCATCCTGAAGGGCTTCAGAAAGACCACCGGAAGCGGAG CTAC TAACTTCAGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAACCCTGGACCTATGGTGAG CAAG GGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAAC GGCC ACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGA AGTT CATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTA CGGC GTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCC ATGC CCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCC GCGC CGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTT CAAG GAG GAC G G C AAC AT C C T G G G G CAC AAG C T G GAG T AC AAC T AC AAC AG C CAC AAC GTCTATAT C A T G G C C GAC AAG C AGAAGAAC G G CAT C AAG G T GAAC T T C AAGAT C C G C CAC AAC AT C GAG GAC G G CAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCT GCTG C C C GAC AAC CACTACCT GAG CACCCAGTCCGCCCT GAG C AAAG AC C C C AAC GAG AAG C G C GAT C ACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGT ACAA GTGAGCGGCCGCGTCGAGTCTAGAGGGCCCGTTTAAACCCGCTGATCAGCCTCGACTGTG CCTT CTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTG CCAC TCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCA TTCT ATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGG CATG CTGGGGATGCGGTGGGCTCTATGGCAGAAATTTATGAAGCATTTGAAAACATCTACCCTA TTCT AAAGGGATTCAGGAAGACGACGTAATGGCTCTCATGTACCCTTGCCTCCCCCACCCCCTT CTTT 
TTTTTTTTTTAAACAAATCAGTTTGTTTTGGTACCTTTAAATGGTGGTGTTGTGAGAAGA TGGA TGTTGAGTTGCAGGGTGTGGCACCAGGTGATGCCCTTCTGTAAGTGCCCACCGCGGGATG CCGG GAAGGGGCATTATTTGTGCACTGAGAACACCGCGCAGCGTGACTGTGAGTTGCTCATACC GTGC TGCTATCTGGGCAGCGCTGCCCATTTATTTATATGTAGATTTTAAACACTGCTGTTGACA AGTT GGTTTGAGGGAGAAAACTTTAAGTGTTAAAGCCACCTCTATAATTGATTGGACTTTTTAA TTTT AAT G T T T T T C C C CAT GAAC CAC AG T T T T TAT AT T T C T AC CAGAAAAG T AAAAAT C T T T T T T AAA AGTGTTGTTTTT

SEQ ID NO: 49 - exemplary donor template for insertion at TBP locus

C T GACCACAGCT C T GCAAGCAGAC T TCCAT T T AC AG T GAG GAG G T GAG CAT T GCAT T GAACAAA AGAT GGCGT T T T CAC T TGGAAT TAGT TAT C T GAAGCT T TAGGAT T CC T CAGCAATAT GAT TAT G AGAC AAGAAAG GAAGAT T CAGAAAT GAGTC TAGT T GAAGGCAGCAAT T CAGAGAAGAAGAT T CA GTTGTTATCATTGCCGTCCTGCTTGGTTTATGGCCTGGTTCAGGACCAAGGAGAGAAGTG TGAA T ACAT GC C T C T T GAG C T AT AGAAT GAG AC G C T G GAG T CAC TAAGAT GAT T T T T TAAAAG TAT T G T T T T AT AAAC AAAAAT AAGAT T G T GAC AAG G GAT T C C AC T AT T AAT G T T T T C AT G C C T G T G C C T TAATCTGACTGGGTATGGTGAGAATTGTGCTTGCAGCTTTAAGGTAAGAATTTTACCATC TTAA TATGTTAAGAAGTGCCATTTCAGTCTCTCATCTCTACTCCAACTTGTCTTCTTAGGGGCT AAAG TGCGGGCCGAGATCTACGAGGCCTTCGAGAATATCTACCCCATCCTGAAGGGCTTCAGAA AGAC CACCGGAAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAA CCCT GGACCTATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAG CTGG ACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCT ACGG CAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCT CGTG ACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCAC GACT TCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACG ACGG CAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGA GCTG AAG G G CAT C GAC T T C AAG GAG GAC G G C AAC AT C C T G G G G CAC AAG C T G GAG T AC AAC T AC AAC A G C CAC AAC GTCTATAT CAT G G C C GAC AAG C AGAAGAAC G G CAT C AAG G T GAAC T T C AAGAT C C G C CAC AAC AT C GAG GAC GGCAGCGTGCAGCTCGCC GAC CACTACCAG C AG AAC AC CCCCATCGGC GACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAA GACC CCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTC TCGG CATGGACGAGCTGTACAAGTGAGCGGCCGCGTCGAGTCTAGAGGGCCCGTTTAAACCCGC TGAT CAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTT CCTT GACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCA TTGT CTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGAT TGGG AAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGTAGGTGCTAAAGTCAGAG CAGA AAT T T AT GAAG CAT T T GAAAAC AT C T AC C C T AT T C T AAAG G GAT T C AG GAAGAC GAC G T 
AAT G G CTCTCATGTACCCTTGCCTCCCCCACCCCCTTCTTTTTTTTTTTTTAAACAAATCAGTTT GTTT TGGTACCTTTAAATGGTGGTGTTGTGAGAAGATGGATGTTGAGTTGCAGGGTGTGGCACC AGGT GATGCCCTTCTGTAAGTGCCCACCGCGGGATGCCGGGAAGGGGCATTATTTGTGCACTGA GAAC ACCGCGCAGCGTGACTGTGAGTTGCTCATACCGTGCTGCTATCTGGGCAGCGCTGCCCAT TTAT T T AT AT G T AGAT T T T AAAC AC T G C T G T T GAC AAG T T G G T T T GAG G GAGAAAAC T T T AAG T G T T A AAGCCACCTCTATAATTGATTGGACTTTTTAATTTTAATGTTTTTCCCCATGAACCACAG TTTT TAT AT T T C T AC C AGAAAAG T AAAAAT C T T T

SEQ ID NO: 50 - exemplary donor template for insertion at TBP locus

ACAAAAGATGGCGTTTTCACTTGGAATTAGTTATCTGAAGCTTTAGGATTCCTCAGC AATATGA T TAT GAGAC AAGAAAG GAAGAT T CAGAAAT GAGTC TAGTT GAAGGCAGCAAT T CAGAGAAGAAG ATTCAGTTGTTATCATTGCCGTCCTGCTTGGTTTATGGCCTGGTTCAGGACCAAGGAGAG AAGT G T GAAT ACAT GC C T C T T GAG C T AT AGAAT GAGAC GC T GGAG T CAC TAAGAT GAT T T T T TAAAAG TATTGTTT T AT AAAC AAAAAT AAGAT T G T GAC AAG G GAT T C CAC TATTAATGTTTTCATGCCTG TGCCTTAATCTGACTGGGTATGGTGAGAATTGTGCTTGCAGCTTTAAGGTAAGAATTTTA CCAT CTTAATATGTTAAGAAGTGCCATTTCAGTCTCTCATCTCTACTCCAACTTGTCTTCTTAG GTGC T AAAG T C AGAG CAGAAAT T TAT G AAG CAT T CGAGAACAT C TACCC TAT T C TAAAGGGAT T CAGG AAG AC GAC G G G AAG C G GAG C T AC T AAC TTCAGCCTGCT G AAG C AG G C T G GAGAC G T G GAG GAGA ACCCTGGACCTATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGG TCGA GCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGC CACC TACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCC ACCC TCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGC AGCA CGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAA GGAC GACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGC ATCG AGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACA ACTA C AAC AG C CAC AAC GTCTATAT CAT G G C C GAC AAG C AGAAGAAC G G CAT C AAG G T GAAC T T C AAG AT C C G C CAC AAC AT C GAG GAC GGCAGCGTGCAGCTCGCC GAC CACTACCAG C AG AAC AC C C C C A TCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGA GCAA AGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGAT CACT CTCGGCATGGACGAGCTGTACAAGTGAGCGGCCGCGTCGAGTCTAGAGGGCCCGTTTAAA CCCG CTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGT GCCT TCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCA TCGC ATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGG AGGA TTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGAAGGGATTCAGGA AGAC GACGTAATGGCTCTCATGTACCCTTGCCTCCCCCACCCCCTTCTTTTTTTTTTTTTAAAC AAAT CAGTTTGTTTTGGTACCTTTAAATGGTGGTGTTGTGAGAAGATGGATGTTGAGTTGCAGG GTGT GGCACCAGGTGATGCCCTTCTGTAAGTGCCCACCGCGGGATGCCGGGAAGGGGCATTATT TGTG 
CACTGAGAACACCGCGCAGCGTGACTGTGAGTTGCTCATACCGTGCTGCTATCTGGGCAG CGCT GCCCATTTATTTATATGTAGATTTTAAACACTGCTGTTGACAAGTTGGTTTGAGGGAGAA AACT TTAAGTGTTAAAGCCACCTCTATAATTGATTGGACTTTTTAATTTTAATGTTTTTCCCCA TGAA CCACAGTTTTTATATTTCTACCAGAAAAGTAAAAATCTTTTTTAAAAGTGTTGTTTTTCT AATT TATAACTCCTAGGGGTTATTTCTGTGCCAGACACA

SEQ ID NO: 51 - exemplary donor template for insertion at G6PD locus

GGCCCGGGGGACTCCACATGGTGGCAGGCAGTGGCATCAGCAAGACACTCTCTCCCT CACAGAA CGTGAAGCTCCCTGACGCCTATGAGCGCCTCATCCTGGACGTCTTCTGCGGGAGCCAGAT GCAC TTCGTGCGCAGGTGAGGCCCAGCTGCCGGCCCCTGCATACCTGTGGGCTATGGGGTGGCC TTTG CCCTCCCTCCCTGTGTGCCACCGGCCTCCCAAGCCATACCATGTCCCCTCAGCGACGAGC TCCG TGAGGCCTGGCGTATTTTCACCCCACTGCTGCACCAGATTGAGCTGGAGAAGCCCAAGCC CATC CCCTATATTTATGGCAGGTGAGGAAAGGGTGGGGGCTGGGGACAGAGCCCAGCGGGCAGG GGCG GGGTGAGGGTGGAGCTACCTCATGCCTCTCCTCCACCCGTCACTCTCCAGCCGAGGCCCC ACGG AG G C AGAC GAG C T GAT GAAGAGAG TGGGCTTC C AG T AC GAG G GAAC C T AC AAAT G G G T C AAC C C TCACAAGCTGGGAAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGAGACGTGGA GGAG AACCCTGGACCTATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTG GTCG AGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATG CCAC CTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCC CACC CTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAG CAGC ACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCA AGGA CGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCG CATC GAG C T G AAG G G C AT C G AC T T C AAG GAG G AC G G C AAC AT C C T G G G G C AC AAG C T G GAG T AC AAC T AC AAC AG C C AC AAC GTCTATAT CAT G G C C GAC AAG C AGAAGAAC G G CAT C AAG G T GAAC T T C AA GAT C C G C C AC AAC AT C GAG G AC GGCAGCGTGCAGCTCGCC G AC CACTACCAG C AG AAC AC C C C C ATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTG AGCA AAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGA TCAC TCTCGGCATGGACGAGCTGTACAAGTGAGCGGCCGCGTCGAGTCTAGAGGGCCCGTTTAA ACCC GCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCG TGCC TTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGC ATCG CATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGG GAGG ATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGGTGGGTGAACCC CCAC AAGCTCTGAGCCCTGGGCACCCACCTCCACCCCCGCCACGGCCACCCTCCTTCCCGCCGC CCGA CCCCGAGTCGGGAGGACTCCGGGACCATTGACCTCAGCTGCACATTCCTGGCCCCGGGCT CTGG CCACCCTGGCCCGCCCCTCGCTGCTGCTACTACCCGAGCCCAGCTACATTCCTCAGCTGC CAAG 
CACTCGAGACCATCCTGGCCCCTCCAGACCCTGCCTGAGCCCAGGAGCTGAGTCACCTCC TCCA CTCACTCCAGCCCAACAGAAGGAAGGAGGAGGGCGCCCATTCGTCTGTCCCAGAGCTTAT TGGC CACTGGGTCTCACTCCTGAGTGGGGCCAGGGTGGGAGGGAGGGACGAGGGGGAGGAAAGG GGCG AGCACCCACGTGAGAGAATCTGCCTGTGGCCTTGCCCGCCAGCCTCAGTGCCACTTGACA TTCC TTGTCACCAGCAACATCTCGAGCCCCCTGGATGTCC

SEQ ID NO: 52 - exemplary donor template for insertion at E2F4 locus

CCAGGGGGCTGTAGTGGGGCCAGGCTGGACCTCTGTGCCCTGAGCATGGCTTTCTTG TTTTTCA GTTTTGGAACTCCCCAAAGAGCTGTCAGAAATCTTTGATCCCACACGAGGTAGGCTG CTGCATT CCTCCCTGAGGCTAGGGGTAAGGGACACAGCTCATTGGGTCCTATGGCTGTTTTCTT GCCCTTT
TGAGGACCTTGTTGTGGCGCTTATGGTAACTGGGGCAAAGGGTGAAGTTCCTGATGG GCAGGTG GGGTTCCCTTTCCTGGGCTTTGGTGGGTGGAGAGGTGGGAGCTGGAATGTTAGTAACTGA GCTC CCTCCATTCCCAGAGTGCATGAGCTCGGAGCTGCTGGAGGAGTTGATGTCCTCAGAAGGT GGGT GGCCCTGGAAGGTGGGAGTGGGTGTGGGCAGGGGTTGGGCTGCTGCTAGGGGAGCCCTGG CCCA GGGCCTGAGACTAGTGCTCTCTGCAGTGTTCGCCCCTCTGCTGAGACTTTCTCCTCCTCC TGGC GACCACGACTACATCTACAACCTGGACGAGAGCGAGGGCGTGTGCGACCTGTTTGATGTG CCCG TGCTGAACCTGGGAAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGAGACGTGG AGGA GAACCCTGGACCTATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCT GGTC GAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGAT GCCA CCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGC CCAC CCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAA GCAG CACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTC AAGG ACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACC GCAT C GAG C T G AAG G G C AT C G AC T T C AAG GAG G AC G G C AAC AT C C T G G G G C AC AAG C T G GAG T AC AAC T AC AAC AG C C AC AAC GTCTATAT CAT G G C C GAC AAG C AGAAGAAC G G CAT C AAG G T GAAC T T C A AGAT C C G C C AC AAC AT C GAG GAC G G C AG C G T G C AG C T C G C C GAC C AC T AC C AG C AGAAC AC C C C CATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCT GAGC AAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGG ATCA CTCTCGGCATGGACGAGCTGTACAAGTGAGCGGCCGCGTCGAGTCTAGAGGGCCCGTTTA AACC CGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCC GTGC CTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTG CATC GCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGG GGAG GATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGCCACCCCCGGG AGAC CACGATTATATCTACAACCTGGACGAGAGTGAAGGTGTCTGTGACCTCTTTGATGTGCCT GTTC TCAACCTCTGACTGACAGGGACATGCCCTGTGTGGCTGGGACCCAGACTGTCTGACCTGG GGGT TGCCTGGGGACCTCTCCCACCCGACCCCTACAGAGCTTGAGAGCCACAGACGCCTGGCTT CTCC GGCCTCCCCTCACCGCACAGTTCTGGCCACAGCTCCCGCTCCTGTGCTGGCACTTCTGTG CTCG CAGAGCAGGGGAACAGGACTCAGCCCCCATCACCGTGGAGCCAAAGTGTTTGCTTCTCCC TTTC 
TGCGGCCTTCGCCAGCCCAGGCTCGGCTGCCACCCAGTGGCACAGAACCGAGGAGCTGCC ATTA CCCCCCATAGGGGGCAGTGTCTTGTTCCTGCCAGCCTCAGTGTCTTGCTTCTGCCAGCTC CTTC CCCTAGGAGGGAAGGGTGGGGTGGAACTGGGCACATG

SEQ ID NO: 53 - exemplary donor template for insertion at E2F4 locus