

Title:
SPLIT PRIME EDITING ENZYME
Document Type and Number:
WIPO Patent Application WO/2022/234051
Kind Code:
A1
Abstract:
The invention relates to a system comprising two viral vectors encoding the components of a prime editor complex for delivery into cells. The prime editor complex is assembled by split intein assembly and consists of a Cas9 nickase and a reverse transcriptase.

Inventors:
SCHWANK GERALD (CH)
VILLIGER LUKAS (CH)
BÖCK DESIREE (CH)
Application Number:
PCT/EP2022/062223
Publication Date:
November 10, 2022
Filing Date:
May 05, 2022
Assignee:
UNIV ZUERICH (CH)
International Classes:
C12N9/12; C12N9/22; C12N15/10; C12N15/11; C12N15/86
Domestic Patent References:
WO2020191246A1 2020-09-24
WO2021072328A1 2021-04-15
Foreign References:
EP21172542A 2021-05-06
EP21206002A 2021-11-02
Other References:
ZHENG CHUNWEI ET AL: "Development of a flexible split prime editor using truncated reverse transcriptase", BIORXIV, 29 August 2021 (2021-08-29), XP055915218, Retrieved from the Internet [retrieved on 20220425], DOI: 10.1101/2021.08.26.457801
DONG-JIUNN JEFFERY TRUONG ET AL: "Development of an intein-mediated split-Cas9 system for gene therapy", NUCLEIC ACIDS RESEARCH, OXFORD UNIVERSITY PRESS, GB, vol. 43, no. 13, 1 January 2015 (2015-01-01), pages 6450 - 6458, XP002758945, ISSN: 0305-1048, [retrieved on 20150616], DOI: 10.1093/NAR/GKV601
ANZALONE ANDREW V ET AL: "Search-and-replace genome editing without double-strand breaks or donor DNA", NATURE, NATURE PUBLISHING GROUP UK, LONDON, vol. 576, no. 7785, 21 October 2019 (2019-10-21), pages 149 - 157, XP036953141, ISSN: 0028-0836, [retrieved on 20191021], DOI: 10.1038/S41586-019-1711-4
YAN JUN ET AL: "Prime Editing: Precision Genome Editing by Reverse Transcription", MOLECULAR CELL, ELSEVIER, AMSTERDAM, NL, vol. 77, no. 2, 16 January 2020 (2020-01-16), pages 210 - 212, XP086007503, ISSN: 1097-2765, [retrieved on 20200116], DOI: 10.1016/J.MOLCEL.2019.12.016
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual", 2012, COLD SPRING HARBOR LABORATORY PRESS
AUSUBEL ET AL.: "Short Protocols in Molecular Biology", 2002, JOHN WILEY & SONS, INC.
SMITH, WATERMAN, ADV. APPL. MATH, vol. 2, 1981, pages 482
NEEDLEMAN, WUNSCH, J. MOL. BIOL., vol. 48, 1970, pages 443
PEARSON, LIPMAN, PROC. NAT. ACAD. SCI., vol. 85, 1988, pages 2444
ALTSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 403 - 410
ANZALONE ET AL., NATURE, vol. 576, 2019, pages 149 - 157
YANEZ-MUNOZ ET AL., NATURE MEDICINE, vol. 12, 2006, pages 348 - 353
HAMILTON ET AL., HUMAN GENE THERAPY, October 2018 (2018-10-01), pages 1213 - 1225
SHAW, CORNETTA, BIOMEDICINES, vol. 2, 2014, pages 14 - 35
LIN ET AL., NATURE BIOTECHNOLOGY, vol. 38, 2020, pages 582 - 585
Attorney, Agent or Firm:
JUNGHANS, Claas (DE)
Claims:

1. A prime editing protein RNA complex, said complex comprising a. a first fusion sequence comprising CAS9 nickase activity and b. a second fusion sequence comprising reverse transcriptase activity but lacking RNAseH activity, said first fusion sequence and said second fusion sequence forming a contiguous polypeptide sequence, and c. a pegRNA comprising, from 5’ to 3’ end, i. a guide RNA sequence tract capable of hybridizing to a genomic DNA target adjacent sequence, ii. a structural RNA sequence tract facilitating interaction and trans-activation of the first fusion polypeptide comprising CAS9 nickase activity, iii. a template strand sequence tract containing a sequence reverse complementary to an edited target sequence and iv. a hybridizing sequence tract.

2. A combination medicament (a composition) comprising a first AAV vector and a second AAV vector, a. the first AAV vector encoding a first vector polypeptide having a first N-terminal end and a first C-terminal end, and said first vector polypeptide comprising an N-terminal part of a split intein component at the first C-terminal end; b. the second AAV vector encoding a second vector polypeptide having a second N-terminal end and a second C-terminal end, and said second vector polypeptide comprising a C-terminal part of a split intein component at its N-terminal end; c. the combination medicament further comprising an AAV vector, particularly the first or second AAV vector, further encoding a pegRNA; wherein the first vector polypeptide and the second vector polypeptide, when both are present within a target cell, are capable of forming a fusion polypeptide comprising a first fusion sequence characterized by CAS9 nickase activity and a second fusion sequence characterized by reverse transcriptase activity, and wherein d. the pegRNA comprises, or essentially consists of, from 5’ to 3’ end, i. a guide RNA sequence tract capable of hybridizing to a genomic DNA target adjacent sequence, ii. a structural RNA sequence tract facilitating interaction with the first fusion part and trans-activation of the first fusion part, iii. a template sequence tract containing a sequence reverse complementary to an edited target sequence and iv. a hybridizing sequence tract; and v. the pegRNA is capable of interacting with the fusion polypeptide to yield a prime editing protein RNA complex.

3. The combination medicament according to claim 2, wherein the first AAV vector and the second AAV vector are selected from an Adeno-associated virus-based vector and a nonintegrating lentiviral vector, particularly wherein the first AAV vector and the second AAV vector are an Adeno-associated virus-based vector, more particularly an AAV2 vector.

4. The combination medicament according to any one of claims 2 to 3, wherein a. said first vector polypeptide comprises SEQ ID NO 001, or a sequence at least 90% identical to SEQ ID NO 001 and having -when joined to the second vector polypeptide- the same biological activity as SEQ ID NO 001 when joined to SEQ ID NO 002; and b. said second vector polypeptide comprises SEQ ID NO 002, or a sequence at least 90% identical to SEQ ID NO 002 and having -when joined to the first vector polypeptide- the same biological activity as SEQ ID NO 002 when joined to SEQ ID NO 001.

5. The combination medicament according to any one of claims 2 to 4, wherein said N-terminal part of a split intein component and said C-terminal part of a split intein component are, or are derived from, a split intein system found in an organism of the group comprising the cyanobacterium Nostoc punctiforme (Npu), Mxe intein from Mycobacterium xenopi GyrA, DnaE and Rma intein from Rhodothermus marinus, particularly wherein the split intein is the Nostoc punctiforme split intein.

6. The combination medicament according to any one of claims 2 to 5, wherein a. said first vector polypeptide is or comprises SEQ ID NO 003, or a sequence at least 90% identical to SEQ ID NO 003 and having -when joined to the second vector polypeptide- the same biological activity as SEQ ID NO 003 when joined to SEQ ID NO 004; and b. said second vector polypeptide comprises SEQ ID NO 004, or a sequence at least 90% identical to SEQ ID NO 004 and having -when joined to the first vector polypeptide- the same biological activity as SEQ ID NO 004 when joined to SEQ ID NO 003.

7. The prime editing protein RNA complex according to claim 1, or the combination medicament according to claim 2 to 6, wherein a. the first fusion sequence is or comprises SEQ ID NO 005, or a sequence at least 90% identical to SEQ ID NO 005 and having -when joined to the second vector polypeptide- the same biological activity as SEQ ID NO 005 when joined to SEQ ID NO 006; and b. the second fusion sequence is or comprises SEQ ID NO 006, or a sequence at least 90% identical to SEQ ID NO 006 and having -when joined to the second vector polypeptide- the same biological activity as SEQ ID NO 006 when joined to SEQ ID NO 005.

8. The prime editing protein RNA complex according to claim 1 or 7, or the combination medicament according to claim 2 to 7, wherein: a. the first fusion sequence characterized by CAS9 nickase activity is Streptococcus pyogenes CAS9 H840A; and b. the second fusion polypeptide sequence is Moloney murine leukemia virus reverse transcriptase lacking the RNAseH domain.

9. The prime editing protein RNA complex according to any one of the preceding claims 1 or 7 to 8, wherein the prime editing protein RNA complex comprises a. a pegRNA comprising the structural RNA sequence tract is SEQ ID NO. 008; and b. a polypeptide comprising or consisting of SEQ ID NO 007, or a sequence at least 85% identical, particularly >90% identical, more particularly >95% identical to SEQ ID NO 007 and having the same biological activity as SEQ ID NO 007.

10. The combination medicament, or the prime editing protein RNA complex according to any one of claims 1 to 12, wherein: a. the guide RNA sequence tract is SEQ ID NO 009 and/or b. the template sequence tract is SEQ ID NO 010 and/or c. the hybridizing sequence tract is SEQ ID NO 011.

11. The combination medicament, or the prime editing protein RNA complex according to any one of claims 1 to 13, wherein a. the target sequence is characteristic of a genetic condition in a mammal, particularly a human, characterized by a transition or transversion mutation of a wild type sequence, b. and the template sequence tract is characteristic of the reverse complementary sequence of the wild type sequence.

12. The combination medicament, or the prime editing protein RNA complex, according to claim 14, wherein the genetic condition is associated to expression of the target sequence in the eye, liver, CNS / brain, myocard, lung, or muscle, particularly in the liver, the brain or the eye.

13. The combination medicament, or the prime editing protein RNA complex, according to any of the preceding claims, for use in the treatment of phenylketonuria.

14. A nucleic acid encoding the prime editing protein RNA complex according to any one of claims 8 to 11, comprising a first nucleic acid sequence encoding a polypeptide comprising said first fusion sequence and said second fusion sequence, and a second nucleic acid sequence encoding said pegRNA, both first and second nucleic acid sequences being under control of a promoter operable in a mammalian cell.

15. A viral vector comprising the nucleic acid according to claim 14, particularly wherein the viral vector is an Adenovirus (AdV), more particularly a human AdV, even more particularly a human AdV5.

Description:
Split Prime Editing Enzyme

The present invention relates to a vector system and enzyme for prime editing. Particular embodiments disclose AAV vector systems encoding two parts of a fusion protein comprising a CAS9 H840A nickase and a reverse transcriptase shortened by deletion of the RNAseH domain. In one aspect, the two components are delivered in two separate vectors and assembled inside of a cell by action of a split intein system in order to fit into the AAV system.

This application claims the benefit of priority of European patent applications EP21172542.9 filed 2 May 2021 and EP21206002.4 filed 2 November 2021, both of which are incorporated by reference herein.

Background of the Invention

Prime editing represents a promising approach for the treatment of various genetic diseases. Prime editors are composed of a fusion protein between a Cas9 H840A nickase, or a protein having an analogous function, and a modified reverse transcriptase (RT), and the so-called pegRNA. The pegRNA contains a sgRNA domain that guides Cas9 to the locus of interest, and a tail domain at the 3’ end that is used as a primer and template for the RT. Upon DNA binding, the Cas9 element of the prime editor nicks the genomic DNA, and the RT element polymerises DNA onto the nicked strand based on the pegRNA sequence. Several studies have shown that prime editing can be used to correct disease-causing mutations in vitro in cell lines. However, to enable the translation of prime editing into the clinic and to treat patients with monogenic diseases, safe and efficient delivery strategies are needed. Here the large size of prime editors (>6 kb) becomes a major problem. The most promising delivery vectors are Adeno-associated viruses (AAVs), which have already been approved by the FDA, for example for a gene addition therapy to treat spinal muscular atrophy (Zolgensma®). Their packaging size, however, is limited to 4.8 kb, and thus they cannot be used to deliver prime editors.

Based on the above-mentioned state of the art, the objective of the present invention is to provide means and methods to enable prime editing technology to treat disease in humans. This objective is attained by the subject-matter of the independent claims of the present specification, with further advantageous embodiments described in the dependent claims, examples, figures and general description of this specification.

The inventors generated prime editors that are significantly reduced in size, in order to facilitate in vivo delivery. As an illustration of the principle of the invention, the inventors first generated a novel variant of the SpCas9 prime editor that lacks the RNAseH domain. Deletion of the RNAseH domain in the reverse transcriptase of the prime editor reduces its size by 10%, while retaining editing activity comparable to the original prime editor across various genomic sites. Second, the inventors used a split-intein system to divide the prime editor in two parts, allowing expression from two separate AAV vectors (each protein-encoding part is < 3 kb), leading to constructs containing all regulatory and structural sequences in two packages, each being equal to or smaller than 4.8 kb, thereby fitting within the packaging limit of AAV.
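The packaging argument above is simple arithmetic, and a minimal sketch in Python may make it concrete. The 4.8 kb limit is the value stated in this specification; every element size in the example dictionaries is a hypothetical placeholder chosen only for illustration, not a value disclosed in this application.

```python
# Packaging limit of AAV as stated in this specification (4.8 kb).
AAV_PACKAGING_LIMIT_BP = 4800


def construct_size(elements):
    """Return the total length in base pairs of all elements of a construct."""
    return sum(elements.values())


def fits_in_aav(elements, limit_bp=AAV_PACKAGING_LIMIT_BP):
    """True if the construct is equal to or smaller than the packaging limit."""
    return construct_size(elements) <= limit_bp


# Hypothetical element sizes (bp), for illustration only.
unsplit_editor = {"ITRs": 290, "promoter": 500, "coding": 6300, "polyA": 200}
split_half = {"ITRs": 290, "promoter": 500, "coding": 2900,
              "polyA": 200, "pegRNA cassette": 400}

print(construct_size(unsplit_editor), fits_in_aav(unsplit_editor))
print(construct_size(split_half), fits_in_aav(split_half))
```

With these placeholder numbers the unsplit construct exceeds the limit while the split half does not, which is the situation the split-intein design is meant to achieve.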

The split-intein variants identified by the inventors display editing efficiencies comparable to the full-length prime editor across various loci.

Summary of the Invention

A first aspect of the invention relates to a combination medicament comprising a first viral vector and a second viral vector.

The first viral vector encodes a first vector polypeptide having a first N-terminal end and a first C-terminal end, and said first vector polypeptide comprises an N-terminal part of a split intein component at the first C-terminal end.

The second viral vector encodes a second vector polypeptide having a second N-terminal end and a second C-terminal end, and said second vector polypeptide comprises a C-terminal part of a split intein component at its N-terminal end. The combination medicament further comprises, as part of the first or second viral vector, or encoded by a third vector, a sequence encoding a pegRNA.

The first polypeptide and the second polypeptide, when both are expressed (i.e. operably present) within a target cell, are capable of forming, under the conditions prevailing in the target cell, a fusion polypeptide comprising a first fusion polypeptide sequence characterized by CAS9 nickase activity (corresponding to mutation H840A in the HNH domain of spCas9) and a second fusion sequence characterized by reverse transcriptase activity but lacking RNAseH activity. The pegRNA is capable of interacting with the fusion polypeptide to yield a prime editing protein RNA complex.

Likewise, the present invention relates to a pharmaceutical composition comprising the first and second vectors of the invention formulated with a pharmaceutically acceptable carrier, diluent or excipient.

Another aspect of the invention relates to a prime editing protein RNA complex comprising a. a first fusion polypeptide sequence comprising CAS9 nickase activity (corresponding to mutation H840A in the HNH domain of spCas9) and b. a second fusion polypeptide sequence comprising reverse transcriptase activity but lacking RNAseH activity, and c. a pegRNA comprising, from 5’ to 3’ end, i. a guide RNA sequence tract capable of hybridizing to a genomic DNA target adjacent sequence forming one strand of a dsDNA target, ii. a partially stem loop double stranded structural RNA sequence tract facilitating interaction and trans-activation of the first fusion polypeptide comprising CAS9 nickase activity, iii. a template strand sequence tract containing a sequence reverse complementary to an edited target sequence and iv. a hybridizing sequence tract.

Both first and second fusion polypeptide sequences are part of a single polypeptide chain.

The invention further provides the use of the combination or prime editing complex in the therapy of genetic diseases.

Terms and definitions

For purposes of interpreting this specification, the following definitions will apply and whenever appropriate, terms used in the singular will also include the plural and vice versa. In the event that any definition set forth below conflicts with any document incorporated herein by reference, the definition set forth shall control.

The terms “comprising,” “having,” “containing,” and “including,” and other similar forms, and grammatical equivalents thereof, as used herein, are intended to be equivalent in meaning and to be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. For example, an article “comprising” components A, B, and C can consist of (i.e., contain only) components A, B, and C, or can contain not only components A, B, and C but also one or more other components. As such, it is intended and understood that “comprises” and similar forms thereof, and grammatical equivalents thereof, include disclosure of embodiments of “consisting essentially of” or “consisting of.”

Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit, unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the disclosure, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the disclosure.

Reference to “about” a value or parameter herein includes (and describes) variations that are directed to that value or parameter per se. For example, description referring to “about X” includes description of “X.” As used herein, including in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art (e.g., in cell culture, molecular genetics, nucleic acid chemistry, hybridization techniques and biochemistry). Standard techniques are used for molecular, genetic and biochemical methods (see generally, Sambrook et al., Molecular Cloning: A Laboratory Manual, 4th ed. (2012) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. and Ausubel et al., Short Protocols in Molecular Biology (2002) 5th Ed, John Wiley & Sons, Inc.) and chemical methods.

The term CAS9 in the context of the present specification relates to CRISPR associated protein 9, formerly called Cas5, Csn1, or Csx12, a 160 kilodalton protein which plays a vital role in the immunological defence of certain bacteria against DNA viruses and plasmids, and is heavily utilized in genetic engineering applications.

The abbreviation AAV in the context of the present specification relates to adeno-associated virus.

The term AAV vector in the context of the present specification relates to a viral vector composed of 60 AAV capsid proteins and an encapsidated AAV nucleic acid. An AAV vector is derived from an AAV virion, but the AAV vector is engineered to be replication-incompetent in the presence of a helper virus by removing the rep and cap genes from the AAV genome. The encapsidated AAV nucleic acid may comprise a transgene which is to be delivered into a target cell.

Amino acid residue sequences are given from amino to carboxyl terminus. Capital letters for sequence positions refer to L-amino acids in the one-letter code (Stryer, Biochemistry, 3rd ed. p. 21). Lower case letters for amino acid sequence positions refer to the corresponding D- or (2R)-amino acids. Sequences are written left to right in the direction from the amino to the carboxy terminus. In accordance with standard nomenclature, amino acid residue sequences are denominated by either a three letter or a single letter code as indicated as follows: Alanine (Ala, A), Arginine (Arg, R), Asparagine (Asn, N), Aspartic Acid (Asp, D), Cysteine (Cys, C), Glutamine (Gln, Q), Glutamic Acid (Glu, E), Glycine (Gly, G), Histidine (His, H), Isoleucine (Ile, I), Leucine (Leu, L), Lysine (Lys, K), Methionine (Met, M), Phenylalanine (Phe, F), Proline (Pro, P), Serine (Ser, S), Threonine (Thr, T), Tryptophan (Trp, W), Tyrosine (Tyr, Y), and Valine (Val, V).
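The nomenclature above can be written down as a lookup table. The following Python sketch is offered only as an illustration; the function names are hypothetical, and lower-case one-letter symbols (which this specification reserves for D-amino acids) are deliberately not handled.

```python
# Three-letter to one-letter codes for the 20 L-amino acids listed above.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(residues):
    """Convert a list of three-letter codes (N- to C-terminus) to a string."""
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues)


def to_three_letter(sequence):
    """Convert a capital one-letter sequence to a list of three-letter codes.

    Lower-case letters raise KeyError, because they do not denote
    L-amino acids under the convention of this specification.
    """
    return [ONE_TO_THREE[c] for c in sequence]


print(to_one_letter(["Gly", "Gly", "Gly", "Gly", "Ser"]))
print(to_three_letter("HA"))
```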

The term gene refers to a polynucleotide containing at least one open reading frame (ORF) that is capable of encoding a particular polypeptide or protein after being transcribed and translated. A polynucleotide sequence can be used to identify larger fragments or full-length coding sequences of the gene with which they are associated. Methods of isolating larger fragment sequences are known to those of skill in the art.

The term transgene in the context of the present specification relates to a gene or genetic material that has been transferred from one organism to another. In the present context, the term may also refer to transfer of the natural or physiologically intact variant of a genetic sequence into tissue of a patient where it is missing. It may further refer to transfer of a natural encoded sequence the expression of which is driven by a promoter absent or silenced in the targeted tissue.

The terms gene expression or expression, or alternatively the term gene product, may refer to either of, or both of, the processes - and products thereof - of generation of nucleic acids (RNA) or the generation of a peptide or polypeptide, also referred to as transcription and translation, respectively, or any of the intermediate processes that regulate the processing of genetic information to yield polypeptide products. The term gene expression may also be applied to the transcription and processing of a RNA gene product, for example a regulatory RNA or a structural (e.g. ribosomal) RNA. If an expressed polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell. Expression may be assayed both on the level of transcription and translation, in other words mRNA and/or protein product.

The term variant refers to a polypeptide that differs from a reference polypeptide, but retains essential properties. A typical variant of a polypeptide differs in its primary amino acid sequence from another, reference polypeptide. Generally, differences are limited so that the sequences of the reference polypeptide and the variant are closely similar overall and, in many regions, identical. A variant and reference polypeptide may differ in amino acid sequence by one or more modifications (e.g., substitutions, additions, and/or deletions). A substituted or inserted amino acid residue may or may not be one encoded by the genetic code. A variant of a polypeptide may be naturally occurring such as an allelic variant, or it may be a variant that is not known to occur naturally.

The terms capable of forming a hybrid or hybridizing sequence in the context of the present specification relate to sequences that, under the conditions existing within the cytosol of a mammalian cell, are able to bind selectively to their target sequence. Such hybridizing sequences may be contiguously reverse-complementary to the target sequence, or may comprise gaps, mismatches or additional non-matching nucleotides. The minimal length for a sequence to be capable of forming a hybrid depends on its composition, with C or G nucleotides contributing more to the energy of binding than A or T/U nucleotides, and on the backbone chemistry.

The term sgRNA (single guide RNA) in the context of the present specification relates to an RNA molecule capable of sequence-specific repression of gene expression via the CRISPR (clustered regularly interspaced short palindromic repeats) mechanism.
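As an illustration of what "contiguously reverse-complementary" and "mismatches" mean for a hybridizing sequence, the following Python sketch computes a reverse complement and counts mismatched positions between a probe and its target. The function names are hypothetical, only Watson-Crick pairing is modelled, and the sequences are arbitrary examples; binding energy and backbone chemistry are not considered.

```python
# Watson-Crick complements on the DNA alphabet; U is treated as T.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA or RNA sequence as DNA letters."""
    dna = seq.upper().replace("U", "T")
    return "".join(_COMPLEMENT[base] for base in reversed(dna))


def count_mismatches(probe, target):
    """Count positions where `probe` fails to pair with `target`.

    Both sequences are read 5' to 3' and must have equal length. Zero means
    the probe is contiguously reverse-complementary to the target.
    """
    expected = reverse_complement(target)
    probe_dna = probe.upper().replace("U", "T")
    if len(probe_dna) != len(expected):
        raise ValueError("probe and target must have the same length")
    return sum(1 for a, b in zip(probe_dna, expected) if a != b)


print(reverse_complement("ATGC"))        # GCAT
print(count_mismatches("GCAU", "ATGC"))  # 0
print(count_mismatches("GCAA", "ATGC"))  # 1
```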

The term nucleic acid expression vector in the context of the present specification relates to a polynucleotide, for example a plasmid, a viral genome or a synthetic RNA molecule, which is used to transfect (in case of a plasmid or an RNA) or transduce (in case of a viral genome) a target cell with a certain gene of interest. In the case of a DNA expression construct, the gene of interest is under control of a promoter sequence and the promoter sequence is operational inside the target cell; thus, the gene of interest is transcribed either constitutively or in response to a stimulus or dependent on the cell’s status. In the case of an RNA expression construct, it is understood that the term expression relates to translation of the RNA and the construct can be employed by the target cell as an mRNA. In certain embodiments, the viral genome is packaged into a capsid to become a viral vector, which is able to transduce the target cell.

In the context of the present specification, the terms sequence identity and percentage of sequence identity refer to a single quantitative parameter representing the result of a sequence comparison determined by comparing two aligned sequences position by position. Methods for alignment of sequences for comparison are well-known in the art. Alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the global alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman, Proc. Nat. Acad. Sci. 85:2444 (1988) or by computerized implementations of these algorithms, including, but not limited to: CLUSTAL, GAP, BESTFIT, BLAST, FASTA and TFASTA. Software for performing BLAST analyses is publicly available, e.g., through the National Center for Biotechnology Information (http://blast.ncbi.nlm.nih.gov/).
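The position-by-position comparison of two aligned sequences can be sketched as follows. This is a simplified illustration in Python, not the BLAST implementation: it assumes the two sequences have already been aligned (gaps written as "-") and uses the number of aligned columns as the denominator, which is one common convention among several. The function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of aligned columns in which both sequences carry the same residue.

    Columns that are gaps in both sequences are ignored; a column with a gap
    in only one sequence counts as a non-identical position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Arbitrary example: nine of ten aligned residues are identical.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))  # 90.0
```

Under this convention, a variant meeting an "at least 90% identical" criterion would return a value of 90.0 or higher against the reference sequence.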

One example for comparison of amino acid sequences is the BLASTP algorithm that uses the default settings: Expect threshold: 10; Word size: 3; Max matches in a query range: 0; Matrix: BLOSUM62; Gap Costs: Existence 11, Extension 1; Compositional adjustments: Conditional compositional score matrix adjustment. One such example for comparison of nucleic acid sequences is the BLASTN algorithm that uses the default settings: Expect threshold: 10; Word size: 28; Max matches in a query range: 0; Match/Mismatch Scores: 1, -2; Gap costs: Linear. Unless stated otherwise, sequence identity values provided herein refer to the value obtained using the BLAST suite of programs (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) using the above identified default parameters for protein and nucleic acid comparison, respectively.

Reference to identical sequences without specification of a percentage value implies 100% identical sequences (i.e. the same sequence). The term having substantially the same biological activity in the context of the present invention relates to the capability of a protein complex to function in prime editing in a clinically relevant fashion. While small deviations from the sequences disclosed herein may lead to a variation of prime editing efficacy, a prime editing complex that achieves editing rates of >1% is considered clinically relevant. In some conditions, correction rates of 1% or even below would already be sufficient, one example being tyrosinemia.

In the context of the present specification, the term amino acid linker refers to a polypeptide of variable length that is used to connect two polypeptides in order to generate a single chain polypeptide. Exemplary embodiments of linkers useful for practicing the invention specified herein are oligopeptide chains consisting of 1, 2, 3, 4, 5, 10, 20, 30, 40 or 50 amino acids. A non-limiting example of an amino acid linker is a monomer or di-, tri- or tetramer of a tetraglycine-serine peptide linker.

The term “prime editing protein RNA complex” in the context of the present specification refers to a composition in which a protein component, and an RNA component are bound in a way allowing prime editing to occur as described herein.

Detailed Description of the Invention

A first aspect of the invention relates to a combination medicament combining two individually formulated viral vectors, a first viral vector and a second viral vector, or to a composition comprising the two viral vectors in combination.

The first viral vector encodes a first vector polypeptide, characterized by a first N-terminal end and a first C-terminal end. The first vector polypeptide comprises an N-terminal part of a split intein component at its C-terminal end. It may also comprise a nuclear localisation signal in order to assure that once translated and processed, the payload delivered by the first viral vector is delivered to the target cell’s nucleus.

The second viral vector encodes a second vector polypeptide having a second N-terminal end and a second C-terminal end, and the second vector polypeptide comprises a C-terminal part of a split intein component at its N-terminal end. The second vector polypeptide may also comprise a nuclear localisation signal on either end of the peptide chain, or on both (prior to split intein joining).

In certain embodiments, the split intein sequence is connected to the prime editing component by a flexible linker, particularly by a flexible linker 12-18 amino acids in length, the amino acids being selected from G and S.
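The linker criterion in the preceding paragraph (12-18 residues, each G or S) can be checked mechanically. The Python sketch below is illustrative only; it uses multimers of the tetraglycine-serine unit named in the definitions as example inputs, and both function names are hypothetical.

```python
def g4s_linker(repeats):
    """Return a multimer of the tetraglycine-serine unit GGGGS."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return "GGGGS" * repeats


def is_gs_linker(seq, min_len=12, max_len=18):
    """True if `seq` consists only of G and S and is min_len..max_len residues long."""
    return min_len <= len(seq) <= max_len and set(seq) <= {"G", "S"}


for n in (1, 2, 3, 4):
    linker = g4s_linker(n)
    print(n, len(linker), is_gs_linker(linker))
```

Of the four multimers, only the trimer (15 residues) falls inside the 12-18 window; other G/S compositions of suitable length would pass the same check.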

The combination medicament or composition of the invention further comprises a sequence encoding a pegRNA, which is transcribed (expressed) in the target cell to facilitate association of the pegRNA with the protein formed by reaction of the first and second vector polypeptides to yield a functional prime editing complex.

The first polypeptide and the second polypeptide, when both are expressed within a target cell, are capable of forming under the conditions prevailing in the target cell a fusion polypeptide comprising a first fusion polypeptide sequence characterized by CAS9 nickase activity (corresponding to mutation H840A in the HNH domain of spCas9) and a second fusion sequence characterized by reverse transcriptase activity but lacking RNAseH activity. The pegRNA is capable of interacting with the fusion polypeptide to yield a prime editing protein RNA complex.

The two vectors do not necessarily need to be present in the same vial / administration form at the time of delivery, but for the invention to have the desired effect, the two vectors need to be administered to the same tissues / cells and cells need to be transduced simultaneously with both vectors. While pharmaceutical manufacturing and packaging regulations may require that the vectors are made and quality-tested separately, a pragmatic approach to assure combined administration and transduction of cells may be to join the two vector preparations immediately prior to administration to the patient. This may be of particular importance in settings where the physico-chemical conditions of administration are difficult (e.g. intraocular delivery).

The pegRNA component is advantageously comprised on the vector encoding the smaller part of the full sequence. Equivalently, the pegRNA might be encoded on a third vector for uses where administration of a third vector is feasible, which again has to be expressed in any cell transduced by the first and second vector for the invention to work.

The pegRNA comprises, or essentially consists of, from 5’ to 3’ end, i. a guide RNA sequence tract capable of hybridizing to a genomic DNA target adjacent sequence forming one strand of a dsDNA target, ii. a partially stem loop double stranded structural RNA sequence tract facilitating interaction with and trans-activation of the first fusion polypeptide part characterized by CAS9 nickase activity, iii. a template strand sequence tract containing a sequence reverse complementary to an edited target sequence and iv. a hybridizing sequence tract.
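The 5’ to 3’ order of the four pegRNA tracts can be represented as a small data structure. The following Python sketch is an illustration only: the class and method names are hypothetical, and the example sequences are arbitrary placeholders, not SEQ ID NO 008 to 011 of this application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PegRNA:
    """The four tracts of a pegRNA, listed in 5' to 3' order."""
    guide: str        # i.   guide RNA sequence tract
    scaffold: str     # ii.  structural RNA sequence tract
    template: str     # iii. template strand sequence tract
    hybridizing: str  # iv.  hybridizing sequence tract

    _ORDER = ("guide", "scaffold", "template", "hybridizing")

    def sequence(self):
        """Concatenate the tracts from the 5' end to the 3' end."""
        return "".join(getattr(self, name) for name in self._ORDER)

    def tract_spans(self):
        """Return {tract name: (start, end)} as 0-based half-open intervals."""
        spans, start = {}, 0
        for name in self._ORDER:
            end = start + len(getattr(self, name))
            spans[name] = (start, end)
            start = end
        return spans


# Arbitrary placeholder sequences, for illustration only.
peg = PegRNA(guide="GACU" * 5, scaffold="GGCCAAUUGGCC",
             template="ACGUACGUAC", hybridizing="UGCAUGCAUGCAU")
print(len(peg.sequence()))
print(peg.tract_spans())
```

The un-annotated class attribute `_ORDER` is not treated as a dataclass field, so the constructor takes exactly the four tracts.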

Fig. 12 shows a schematic overview of the components and the terminology used herein to describe them (the graphic is adapted from the graphic by Letitia Dinatto in the English Wikipedia entry on prime editing):

1. The nick is created in one of the two strands of genomic DNA (a first strand), here called the “edited” strand. The other strand is here called the opposite strand. Both the transcribed (antisense) and the sense strand can be edited.

2. The edited sequence is 5’ of the nicked position on the edited strand.

3. The guide hybridizes to a “target adjacent” sequence on the opposite strand, 3’ (in the polarity of the opposite strand) of the edited position. The nicked position is at position (counted from the 3’ end of the guide sequence) of the guide on the edited strand, so the guide overlaps with the nicking position. The edit can be introduced 3’ downstream of the nicked position (on the edited strand), so it can be within these 3 nucleotides - in that case the edit would overlap with the guide - or the edit can be outside the guide sequence, in that case the guide hybridizes indeed to the opposite strand, 3’ of the edited position.

4. The system accommodates variations of the template strand’s length. Anzalone et al. (Nature 576, 149-157 (2019)) tested different template lengths, the longest distance between nick and substituted position being 34 nucleotides, and citing an 80mer deletion. Several bases can be edited, and both insertions and deletions (up to 80-bp) can be introduced. In these inventors’ experience, edits installed closer to the nicked position seem more efficient. It remains to be analyzed in detail how many bases can be edited simultaneously and whether there is a preference for binding to specific sequence motifs or mutating some bases more efficiently over others.

5. The hybridizing sequence tract (opposite the primer generated from the edited strand at the nick site) is at least partially reverse complementary to the guide strand.

Generally, as both first and second vector need to be expressed in the same cells in similar amounts, the first and second vector systems will be derived from the same virus and will be essentially identical except for their encoded “payload”. In certain embodiments, the first viral vector and the second viral vector are selected from an Adeno associated virus-based vector and a nonintegrating lentiviral vector. In particular embodiments, the first viral vector and the second viral vector are an Adeno associated virus-based vector.

AAV is a small non-pathogenic virus that infects humans and other primate species.

The AAV2 infection starts by docking to the cell surface receptor heparan sulphate proteoglycan (HSPG). Its low-affinity binding to glycans induces a reversible structural rearrangement of the capsid that promotes binding to the co-receptor αVβ5 or α5β1 integrin, inducing formation of a clathrin-coated pit. The clathrin-coated pit becomes internalized via endocytosis and the viral particles are transported to the nucleus. The pH drops due to acidification of the endosomal compartments, which is a feature of the endosomal vesicle maturation. An acidification-triggered conformational change takes place in the capsid, and the virus escapes from the late endosome by lipolytic pore formation.

In wild-type AAV the genome is built of a 4.7 kilobase long single stranded DNA (ssDNA), either positive- or negative-sensed. The genome comprises three open reading frames (ORFs) flanked by inverted terminal repeats (ITRs). The ITRs are self-complementary, CG-rich, T-shaped hairpins at the 5’ and 3’-end of the AAV genome and the only necessary viral component present in recombinant vector genomes. The ITRs include a terminal resolution site (TRS) and a Rep binding element (RBE), which facilitate replication and encapsidation of the viral genome. The ORFs encode the genes rep, cap, AAP. Four multifunctional non-structural Rep proteins encoded by rep are required for the AAV life cycle. Cap encodes the capsid proteins VP1, VP2 and VP3, which interact together to form a capsid of an icosahedral symmetry, and the assembly-activating protein (AAP), which is required for stabilizing and transporting newly produced VP proteins from the cytoplasm into the cell nucleus. All three VPs are translated from one mRNA and spliced differently. The largest 90 kDa VP1 is an unspliced transcript, the 72 kDa VP2 is translated from a non-conventional ACG start codon whereas the smallest 60 kDa VP3 is translated from an AUG codon. All the three VPs have overlapping C-termini.

In certain more particular embodiments, both first and second vector are an AAV2 vector.

Alternatively, nonintegrating lentiviral vectors can be employed. Such vectors have been developed to combine gene transfer efficiency of lentiviral systems with the safety of non-integrating vectors (see Yanez-Munoz et al., Nature Medicine 12, 348-353 (2006); Hamilton et al. Human Gene Therapy Oct 2018, 1213-1225 (https://doi.org/10.1089/hum.2018.111); Shaw and Cornetta, Biomedicines 2014, 2, 14-35).

In certain embodiments, said first vector polypeptide comprises SEQ ID NO 001 , or a sequence at least 90% identical to SEQ ID NO 001 and having -when joined to the second vector polypeptide- the same biological activity as SEQ ID NO 001 when joined to SEQ ID NO 002; and said second vector polypeptide comprises SEQ ID NO 002, or a sequence at least 90% identical to SEQ ID NO 002 and having -when joined to the first vector polypeptide- the same biological activity as SEQ ID NO 002 when joined to SEQ ID NO 001 .

Two pairings form the basis of the invention as encompassed by this first aspect of the invention: a nickase enzyme activity needs to be associated to a reverse transcriptase (RT) activity, and the resulting fusion protein needs to be split in such way that the two parts cooperatively form a prime editing complex together with the pegRNA.

While individual amino acids of the components may be mutated without affecting the function of the whole complex, it is understood that the criterion of functionality can only be assessed for the resulting complex as a whole. In other words, while a single change in an amino acid in the first or second fusion polypeptide may affect the functionality of that part, compared to the wild type disclosed as a sequence herein, a corresponding change in the partner may revert the full complex to functionality.

In certain embodiments, the N-terminal part of a split intein component and the C-terminal part of a split intein component are, or are derived from, a split intein system found in an organism of the group comprising the cyanobacterium Nostoc punctiforme (Npu). Other systems that can be employed include Mxe intein from Mycobacterium xenopi GyrA, DnaE and Rma intein from Rhodothermus marinus.

In certain embodiments, said first vector polypeptide is or comprises SEQ ID NO 003, or a sequence at least 90% identical to SEQ ID NO 003 and having -when joined to the second vector polypeptide- the same biological activity as SEQ ID NO 003 when joined to SEQ ID NO 004; and said second vector polypeptide comprises SEQ ID NO 004, or a sequence at least 90% identical to SEQ ID NO 004 and having -when joined to the first vector polypeptide- the same biological activity as SEQ ID NO 004 when joined to SEQ ID NO 003.

In certain embodiments, the first vector polypeptide and the second vector polypeptide of the combination medicament according to the invention comprise a nuclear localization sequence (NLS). A number of NLS sequences is known in the art that lead to transfer of the protein they are attached to into the nucleus. In certain particular embodiments, the NLS is the SV40 NLS. In even more particular embodiments, each first and second vector polypeptide comprise two SV40 NLS. The prime editing protein RNA complex requires nuclear localization in order to be effective.

Another aspect of the invention relates to a combination medicament prime editing protein RNA complex, said complex comprising a. a first fusion polypeptide sequence comprising CAS9 nickase activity (corresponding to mutation H840A in the RuvC domain of spCas9) and b. a second fusion polypeptide sequence comprising reverse transcriptase activity but lacking RNAseH activity, and c. a pegRNA comprising, from 5’ to 3’ end, v. a guide RNA sequence tract capable of hybridizing to a genomic DNA target adjacent sequence forming one strand of a dsDNA target, vi. a partially stem loop double stranded structural RNA sequence tract facilitating interaction and trans-activation of the first fusion polypeptide comprising CAS9 nickase activity, vii. a template strand sequence tract containing a sequence reverse complementary to an edited target sequence and viii. a hybridizing sequence tract.

The retroviral ribonuclease H (retroviral RNase H) is a catalytic domain of the retroviral reverse transcriptase (RT) enzyme. The RT enzyme is used to generate complementary DNA (cDNA) from the retroviral RNA genome. This process is called reverse transcription. To complete this complex process, the retroviral RT enzymes need to adopt a multifunctional nature. They therefore possess the following three biochemical activities: RNA-dependent DNA polymerase, ribonuclease H, and DNA-dependent DNA polymerase. Like all RNase H enzymes, the retroviral RNase H domain cleaves the RNA strand of DNA/RNA duplexes and will not degrade DNA or unhybridized RNA.

In certain embodiments of the combination medicament or composition as specified in the first aspect of the invention and its particular embodiments, or of the prime editing protein RNA complex according to the second aspect of the invention, the first fusion polypeptide sequence is or comprises SEQ ID NO 005, or a sequence at least 90% identical to SEQ ID NO 005 and having -when joined to the second vector polypeptide- the same biological activity as SEQ ID NO 005 when joined to SEQ ID NO 006; and the second fusion polypeptide sequence is or comprises SEQ ID NO 006, or a sequence at least 90% identical to SEQ ID NO 006 and having -when joined to the second vector polypeptide- the same biological activity as SEQ ID NO 006 when joined to SEQ ID NO 005.

In certain embodiments of the combination medicament or composition as specified in the first aspect of the invention and its particular embodiments, or of the prime editing protein RNA complex according to the second aspect of the invention, the first fusion polypeptide sequence is or comprises SEQ ID NO 111, or a sequence at least 90% identical to SEQ ID NO 111 and having -when joined to the second vector polypeptide- the same biological activity as SEQ ID NO 111 when joined to SEQ ID NO 112; and the second fusion polypeptide sequence is or comprises SEQ ID NO 112, or a sequence at least 90% identical to SEQ ID NO 112 and having -when joined to the second vector polypeptide- the same biological activity as SEQ ID NO 112 when joined to SEQ ID NO 111.

In certain embodiments of the combination medicament or composition, or of the prime editing protein RNA complex, the first fusion polypeptide sequence characterized by CAS9 nickase activity is streptococcus pyogenes CAS9 H840A; and the second fusion polypeptide sequence is Moloney murine leukemia virus reverse transcriptase lacking the RNAseH domain.

Other RT systems that have worked in the context of prime editing are disclosed in Lin et al., Nature Biotechnology volume 38, pages 582-585 (2020) and include the enzyme from cauliflower mosaic virus (RT-CaMV) and a retron-derived RT (RT-retron) from E. coli BL21. In certain embodiments, the prime editing complex according to the invention comprises or consists of SEQ ID NO 007, or a sequence at least 85% identical, particularly >90% identical, more particularly >95% identical to SEQ ID NO 007 and having the same biological activity as SEQ ID NO 007.

In certain embodiments, the prime editing complex according to the invention comprises a nuclear localization sequence (NLS). In certain particular embodiments, the NLS is the SV40 NLS. As stated above, the prime editing protein RNA complex requires nuclear localization.

The invention also relates to a nucleic acid encoding the prime editing RNA complex according to the invention as defined herein, comprising a first nucleic acid sequence encoding polypeptide comprising said first fusion sequence and said second fusion sequence, and a second nucleic acid sequence encoding said pegRNA, both first and second nucleic acid sequences being under control of a promoter operable in a mammalian cell.

In certain embodiments, the nucleic acid as defined in the previous paragraph is contained in and expressed by a viral vector. As related in the examples, in one particular embodiment found to be of advantage, the viral vector is an Adenovirus (AdV), more particularly a human AdV, even more particularly a human AdV5.

In certain embodiments the prime editing complex comprises or consists of SEQ ID NO 007, or a sequence at least 85% identical, particularly >90% identical, more particularly >95% identical to SEQ ID NO 007 and having the same biological activity as SEQ ID NO 007.

In certain embodiments, the structural RNA sequence tract is SEQ ID NO. 008.

In certain embodiments, the guide RNA sequence tract is SEQ ID NO 009 and/or the template sequence tract is SEQ ID NO 010 and/or the hybridizing sequence tract is SEQ ID NO 011.

In certain embodiments, the target sequence is characteristic of a genetic condition in a mammal, particularly a human, characterized by a transition or transversion mutation of a functioning wild type sequence, and the template sequence tract (providing the instruction to synthesize the edited sequence) is characteristic of the reverse complementary sequence of the wild type sequence.

In certain embodiments, the genetic condition is associated to expression of the target sequence in the liver, the brain and the eye.

AAV vector systems for specific targeting of the eye have been published widely. To the inventors’ knowledge, AAV is the only clinically feasible vector capable of crossing the blood-brain barrier.

In certain embodiments, the genetic condition is associated to expression of the target sequence in the eye, liver, CNS / brain, myocard, lung, or muscle. Colloquially, the target sequence causing a genetically transmitted condition will be referred to as a mutated wild type sequence, and the edit function according to the invention aims to “back-mutate” this mutation to the wild type, thereby restoring the edited sequence’s original function and ameliorating the condition. The invention further relates to the use of a combination medicament or composition as specified in the first aspect of the invention and its particular embodiments, or the prime editing protein RNA complex, for use in the treatment of a disease caused by a genetic defect leading to loss of function or impaired function of a gene, whereby the genetic defect is characterized by a mutation of a functioning wild type sequence. Particular genetic diseases for which the invention holds promise are laid out in the second to left column of Table 1. The disease is characterized by a mutation as laid out in the right column of Table 1 , or its reverse complementary sequence.

Table 1 :

The mutation can be corrected by prime editing using the tools provided herein, choosing an appropriate pegRNA to edit the mutated sequence back to the wild type.

The mutations addressed by the invention include transition and transversion mutations and indel mutations. Currently, the editing rates in vivo (up to 15%) leave room for further optimization, but in vitro in cell lines the inventors have observed editing rates in excess of 90%.

Wherever alternatives for single separable features are laid out herein as “embodiments”, it is to be understood that such alternatives may be combined freely to form discrete embodiments of the invention disclosed herein.

The invention is further illustrated by the following examples and figures, from which further embodiments and advantages can be drawn. These examples are meant to illustrate the invention but not to limit its scope.

Description of the Figures

Fig. 1 shows establishment of the size-reduced PE variant PE ΔRnH. (a) Schematic representation of the full-length PE2 and the inventors’ size-reduced PE2 variant PE2 ΔRnH, lacking the RNaseH (RnH) domain of the RT. (b) rSTOP reporter: conversion of a TAG stop codon results in GFP expression. (c) TLR reporter: correction of a 2-bp frameshift results in tagRFP expression. Editing efficiency can be scored by flow cytometry (b, c). (d, e) Performance of PE2 and PE2 ΔRnH in rSTOP (d) and TLR (e) reporter cells by targeting 2 different protospacers. (f) Comparative analyses of editing efficiency and accuracy of PE2 and PE2 ΔRnH at seven genomic sites. Incorrectly edited reads contain all base conversions other than the target nucleotide. Unless indicated otherwise, differences between PE2 and PE2 ΔRnH were not significant (P>0.05; two-tailed Student’s t-test). Data from all experiments are represented as mean ± s.d. of at least two independent experiments. PE, prime editor; M-MLV, Moloney murine leukemia virus; RT, reverse transcriptase; NLS, nuclear localization signal; bGH, bovine growth hormone polyadenylation signal; EF-1α, eukaryotic translation elongation factor 1α; rSTOP, remove stop codon; adrb1, β1-adrenergic receptor; app, amyloid β-precursor protein; eIF2B, eukaryotic translation initiation factor 2B; otc, ornithine carbamoyltransferase; gabaRla, gamma-aminobutyric acid receptor subunit α-1.

Fig. 2 shows establishment of an intein-split PE variant for dual AAV-mediated delivery. (a) Depiction of the two intein-split PE moieties (N-int-PE2 and C-int-PE2), forming the full-length prime editor after protein trans-splicing. (b) Editing efficiencies of various intein-split PEs with [GGGGS]3-linkers (SEQ ID NO 052) in the rSTOP reporter. Untreated reporter cells or transfected pegRNA were used as controls. Efficiencies of intein-split PEs were compared to CMV-PE2 and analyzed using a two-tailed Student’s t-test with Welch’s correction. (c) Optimization of the linker length and the intein-split splice-acceptor site at position p.1153. Left panel: Schematic maps of N- and C-terminal intein-split PE halves, highlighting the linker position. Right panel: Comparison of editing performance of different combinations of optimized N- and C-int-PE2 in the rSTOP GFP reporter. Data of all experiments are depicted as means of at least three independent biological replicates. N-int, N-intein; C-int, C-intein; CMV, human cytomegalovirus promoter; N-term, N-terminal PE half; C-term, C-terminal PE half. ** P<0.005, *** P<0.0005, **** P<0.0001.

Fig. 3 shows AAV-mediated prime editing at the Dnmt1 locus in the mouse liver. (a) Schematic outline of the experimental setup in new-born and adult mice. (b) PE3b correction rates in new-born and adult animals after AAV8-mediated delivery. (c) Percentage of sequencing reads with bystander base substitutions (any base substitution within the protospacer), and indels within the protospacer region determined by deep sequencing. (d) Distribution of precise edits and deletions at the Dnmt1 target site (P2A mutation) after in vivo PE3b editing in new-born (top) and adult mice (bottom). Data are represented as mean ± s.d. (n = 3-4 mice per group) and were analyzed using a two-tailed t-test (b) or a two-way ANOVA with Tukey’s multiple comparisons test (c). Unless indicated otherwise, differences between new-born and adult animals (b, c) were not significant (* P<0.05; **** P<0.0001). The sequences depicted in Figure 3 correspond to SEQ ID NO 053 to SEQ ID NO 056 within the ST.25 sequence listing.

Fig. 4 shows correction of the Pah enu2 allele in vivo in mice using prime editing. (a) Three pegRNAs were designed to target the mutant Pah enu2 allele that harbors the disease-causing c.835T>C (p.F263S) mutation on exon 7, indicated in blue. pegRNAs mPKU-1.* and -2.* allow for binding of SpCas to NGG PAMs (indicated in green); pegRNAs mPKU-3.* allow for binding of SpRY to NRN and NYN PAMs (indicated in purple). Conversion of the target C (c.835) leads to the desired serine (Ser) to phenylalanine (Phe) change at amino acid position 263, restoring Pah enzyme activity. (b, c) In vitro optimization of pegRNAs for PE2 and PE3b approaches. Experiments were performed in reporter HEK293T cells in which the mutated exon 7 of the Pah enu2 gene was stably integrated. Data are represented as mean ± s.d. of two independent experiments, and PE2 and PE3b were compared using a two-tailed t-test. (d) PE2 and PE3b correction rates in new-born and adult animals after AdV5-mediated delivery of full-length PE. (e) Percentage of sequencing reads with bystander base substitutions (any base substitution within the protospacer), and with indels within the protospacer region determined by deep sequencing. (f) Blood L-Phe levels after in vivo prime editing compared to untreated heterozygous and homozygous control animals. L-Phe levels below 360 µmol/L are considered therapeutic. L-Phe concentrations below 120 µmol/L represent physiological levels. (g) H&E-stained liver sections of untreated and treated mice (AdV5). Scale bars, 100 µm. (h) Deep amplicon sequencing of 10 computationally predicted off-target sites in untreated and PE3b-treated mice (AdV5). Data are represented as mean ± s.d. of at least three animals and were analyzed using a two-way ANOVA with Tukey’s multiple comparisons test. Unless indicated otherwise, differences were not significant (P>0.05). The sequences shown in Figure 4 correspond to SEQ ID NO 057 to SEQ ID NO 058 within the ST.25 sequence listing.

Fig. 5 shows optimization of intein-split PEs for dual AAV-mediated delivery. (a) The effect of a [GGGGS]3-linker (SEQ ID NO 052) on prime editing efficiencies at different split sites. Statistical analysis was performed using a two-tailed Student’s t-test. (b) Schematic maps of N- and C-terminal intein-split PE halves for AAV delivery. Positioning of NLSs alters the performance of intein-split PE2-p.1153. Data were analyzed using a two-way ANOVA with Tukey’s multiple comparisons test. (c) Editing efficiencies of different circular permutant (CP) PE2 compared to full-length PE2. Amino acid positions, at which N- and C-termini were interchanged, are indicated. Data were analyzed using a two-tailed Student’s t-test. Data in (a-c) are represented as mean ± s.d. of at least three independent biological replicates. **** P<0.0001; *** P<0.001; * P<0.05; ITR, inverted terminal repeat sequences; W3, Woodchuck Hepatitis Virus post-transcriptional regulatory element; SV40, polyadenylation signal.

Fig. 6 shows orthogonal PE systems display low editing efficiencies in HEK293T cells. (a, b) Schematic maps and sizes of orthogonal PE2 ΔRnH where the RT is fused to Cas9 from S. aureus, S. auricularis, and C. jejuni. (c, d) Editing efficiencies of the different PE2 ΔRnH variants in rSTOP reporter cells (c) and on endogenous sites (d). Data are represented as mean ± s.d. of at least three independent biological replicates. Data were analyzed using a two-tailed Student’s t-test. n.s., non-significant; ** P<0.005; *** P<0.001; nT, N-terminal RT fusion; cT, C-terminal RT fusion.

Fig. 7 shows partial hepatectomy (PHx) does not increase Dnmt1 editing rates. (a) Schematic outline of the experimental setup in adult mice with and without PHx. (b, c) PHx did not result in increased prime editing rates in adult mouse livers.

Fig. 8 shows in vitro correction of the Pah enu2 locus using SpCas9-PE2 and -PE3b. (a) Correction of the disease-causing mutation (c.835C>T; p.263S>F; blue) changes the PAM from NGG to NGA, reducing potential re-cutting of the locus after introduction of the desired edits. An ectopic NGG-PAM for the PE3b nicking sgRNA is introduced as a synonymous mutation (c.830A>G; p.261R>R; grey) by the pegRNA, leading to nicking of the non-edited strand only after the edit has been installed. (b) Percentage of sequencing reads with bystander base substitutions and indels at any position within the protospacer regions of PE2 and PE3b determined by deep amplicon sequencing. Experiments were performed in reporter HEK293T cells, in which the mutated exon 7 of the Pah enu2 gene was stably integrated. Unless indicated otherwise, differences between PE2 and PE3b were not significant (* P<0.05). (c) Distribution of precise edits and deletions at the Pah enu2 target site (S263F correction) after in vitro PE2 and PE3b editing. Sequencing reads of pegRNA mPKU-2.1 are shown as a representative. The PE target site is highlighted in green. PAM sequences are underlined and nucleotide substitutions are labelled in blue and grey. Deleted bases are indicated by dashes. The sequences shown in Fig. 8 correspond to SEQ ID NO 059 to SEQ ID NO 082 within the ST.25 sequence listing.

Fig. 9 shows in vivo prime editing at the Pah enu2 locus in the mouse liver. (a) Schematic outline of the experimental setup for AAV8- and AdV5-mediated treatment in new-born and adult PKU mice. (b) PE2 and PE3b correction rates in new-born and adult animals after AAV8-mediated delivery. (c) Percentage of sequencing reads with bystander substitutions and indels at any position within the protospacer region determined by deep amplicon sequencing. (d) Blood L-Phe levels after in vivo prime editing compared to untreated, heterozygous and homozygous control animals. L-Phe levels below 360 µmol/L are considered therapeutic. L-Phe concentrations below 120 µmol/L represent physiological levels. Data are represented as mean ± s.d. (n = 3-4 mice per group) and were analyzed using a two-way ANOVA with Tukey’s multiple comparisons test. Unless indicated otherwise, differences between new-born and adult animals (b-d) were not significant (* P<0.05; ** P<0.005; **** P<0.0001).

Fig. 10 shows distribution of precise edits and deletions at the Pah enu2 target site (a, b) Representative sequencing reads of pegRNA mPKU-2.1 are shown after in vivo AAV8 (a) or AdV5 (b) delivery to new-born and adult mice. The PE target site is highlighted in green. PAM sequences are underlined and nucleotide substitutions are labelled in blue and grey. Deleted bases are indicated by dashes. The sequences shown in Fig. 10 correspond to SEQ ID NO 083 to SEQ ID NO 110 of the ST.25 sequence listing.

Fig. 11 shows in vivo prime editing does not induce extensive liver damage. (a) Representative H&E-stained liver sections of untreated, AAV8- and AdV5-treated mice. Vectors were injected into neonatal and adult mice and tissues were analyzed after 4 or 12 weeks. Scale bars, 100 µm. (b) Serum transaminases alanine aminotransferase (ALT) and aspartate aminotransferase (AST) at experimental end points. ALT and AST levels are calculated relative to untreated C57BL/6 mice (average 68 U/L for ALT and 133 U/L for AST).

Fig. 12 shows a schematic view of the prime editing complex and the terminology employed to define the pegRNA sections.

Fig. 13: Equimolar amounts of plasmid DNA were transfected into HEK293T GFP reporter cells using Lipofectamine 2000 (PE:pegRNA ratio = 3:1). Cells were harvested 72 h after transfection and DNA was extracted. Deep sequencing of the GFP reporter amplicon was performed on a MiSeq using locus-specific oligonucleotides and Illumina adapters. All experimental conditions are the same between the individual samples and experiments. The difference is not statistically significant. The experiment illustrates that both the 713/714 split and the 1153/1154 split (SEQ ID NO 111 and 112) deliver comparable rates of conversion.

Examples

Example 1: Establishment of the size-reduced PE variant PE ΔRnH

In vivo prime editing holds great potential for therapeutic applications. However, the large size of PEs (~6.6 kb; Fig. 1a) is detrimental for in vivo delivery via viral and non-viral vectors. Since the RNaseH domain (~0.6 kb) of the RT is used to degrade DNA-RNA heteroduplexes, the inventors speculated that it could be dispensable in the context of prime editing and tested whether its removal leads to a size-optimized but fully functional PE variant (PE ΔRnH). First, the inventors compared editing rates of PE2 ΔRnH and PE2 in two different HEK293T fluorescent reporter cell lines: the rSTOP reporter, which contains a premature TAG stop codon abolishing translation of a functional GFP (Fig. 1b), and the traffic light reporter (TLR), where translation of a functional tagRFP is prevented by a 2-bp frameshift (Fig. 1c). Using multiple pegRNAs, the inventors observed comparable editing efficiencies between PE2 and PE2 ΔRnH in either reporter cell line by FACS (Fig. 1d, e). Confirming these results, no difference in correction rates and accuracy was found when seven different disease loci were targeted with PE2 and PE2 ΔRnH and analyzed by deep sequencing (Fig. 1f). The inventors’ data therefore demonstrate that size-reduced PE2 ΔRnH retains the efficiency and accuracy of full-length PE2.

Example 2: Establishment of an intein-split PE variant for dual AAV-mediated delivery

Due to their low immunogenicity and broad range of serotype specificity, AAVs are promising candidates for in vivo delivery of genome editing tools. Their limited packaging capacity of 4.7 kb, nevertheless, represents a major obstacle for the delivery of Cas9-based genome editing tools, including PEs. The inventors have previously adapted the intein-mediated protein trans-splicing system from the cyanobacterium Nostoc punctiforme (Npu) to split the coding sequence of Cas9 nucleases or BEs on two separate AAVs - with both halves of the protein being reconstituted to the full-length Cas9 or BE upon cell transduction. To optimize the intein-split approach for AAV-mediated PE delivery, the inventors assessed activity of SpCas9-PE variants split at different sites. Within the region where both generated PE segments would not exceed the packaging limit of an AAV, the inventors identified eight surface-exposed positions with either a Cys, Ser, or Thr at the N-terminal position of the C-intein PE moiety. Two of these positions have previously been used to split SpCas9 nucleases and base editors (p.573 and p.714). However, when applied to the rSTOP reporter both variants showed a substantial reduction in editing activity compared to full-length PE2 (Fig. 2b), and were outperformed by the PE2-p.1153 variant that maintained 75% of the activity of full-length PE2 (Fig. 2b). Further analysis of the intein-split architecture of PE2-p.1153 revealed the importance of flexible [GGGGS]3-linkers (SEQ ID NO 052) between the intein domains and PE segments (Fig. 5a), and showed that removal of the nuclear localization signal (NLS) from the N-intein splice-donor is beneficial for the activity (Fig. 2c; Fig. 5b). Notably, the inventors also intended to evaluate potential split sites for circular permutant SpCas9-PE (Fig. 5c), and for PE variants based on orthogonal Cas9 from Staphylococcus aureus (SaCas), Staphylococcus auricularis (SauriCas), and Campylobacter jejuni (CjCas) (Fig. 6). However, none of the tested full-length variants led to editing rates comparable to SpCas9-PE, prompting the inventors to continue with linker- and NLS-optimized SpCas9-PE ΔRnH -p.1153 for in vivo experiments.
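The two sequence-level criteria named in this example (a Cys, Ser or Thr as the first residue of the C-terminal half, and both halves remaining within the AAV packaging limit) can be sketched as a simple filter. This is an illustrative sketch only: the function name, the per-construct overhead parameters (promoter, intein, pegRNA cassette and similar elements) and the position convention are assumptions, and surface exposure, which requires structural data, is not modelled.

```python
# Hypothetical sketch: list candidate intein split sites in a protein sequence.
def candidate_split_sites(protein: str, overhead_n_bp: int, overhead_c_bp: int,
                          limit_bp: int = 4700) -> list:
    """Return 1-based positions of the first residue of the C-terminal half.

    protein:       one-letter amino acid sequence of the full-length editor.
    overhead_*_bp: assumed non-coding/accessory sequence per construct, in bp.
    limit_bp:      packaging limit (4.7 kb as stated in the text).
    """
    sites = []
    for i in range(1, len(protein)):
        if protein[i] not in "CST":  # Cys, Ser or Thr required at this position
            continue
        n_half_bp = 3 * i + overhead_n_bp
        c_half_bp = 3 * (len(protein) - i) + overhead_c_bp
        if n_half_bp <= limit_bp and c_half_bp <= limit_bp:
            sites.append(i + 1)
    return sites
```

Candidates returned by such a filter would still need to be checked for surface exposure and tested experimentally, as the differing activities of the p.573, p.714 and p.1153 variants show.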

Example 3: AAV-mediated prime editing in the mouse liver

To assess the potential of intein-split PE2 ΔRnH -p.1153 for in vivo prime editing in the liver, the inventors decided to target Dnmt1. This locus has previously been edited with high efficiency in mouse embryos using the PE3b system, which sequentially nicks the non-edited strand after installing the desired edit on the target strand. The inventors first replaced the CMV promoters of both intein-split PE ΔRnH -p.1153 expression vectors with the synthetic liver-specific P3 promoter, and added the locus-specific PE3b pegRNA and sgRNA to the C-terminal construct. Due to the use of the size-reduced PE ΔRnH variant both expression cassettes remained within the size limit of AAV vectors (N-terminal construct: 4.6 kb; C-terminal construct: 4.5 kb), and could be packaged into AAV2 serotype 8 particles (Fig. 3a). These were then systemically delivered into one-day-old C57BL/6J pups in a one-to-one ratio via the temporal vein at a dose of 1×10^12 vector genomes (vg) per construct and animal (Fig. 3a). To assess editing rates, the inventors isolated primary hepatocytes from perfused livers after 4 weeks, and extracted genomic DNA for deep amplicon sequencing. Importantly, the intended G-to-C conversion was observed in 10.7%-15.0% of Dnmt1 alleles (Fig. 3b). Confirming high accuracy of PE3b, the inventors did not identify bystander base substitutions or indel mutations at the target locus above background (Fig. 3c, 3d). To next assess editing efficiencies in adult animals, the inventors injected 5-week-old C57BL/6J mice via the tail vein, again at a dose of 1×10^12 vg per construct and animal. After 12 weeks hepatocytes were isolated and analyzed for on-target editing by deep sequencing. While prime editing accuracy was again high with bystander mutations remaining below background (Fig. 3c), editing rates were significantly lower compared to treated neonatal animals, with the intended G-to-C conversion being present in only 1.42%-4.01% of Dnmt1 alleles (Fig. 3b, d). Since the inventors hypothesized that the lack of hepatocyte proliferation in adult animals might have caused the lower editing efficiency, the inventors next induced ectopic hepatocyte proliferation after PE administration via partial hepatectomy (PHx). This procedure, however, did not further increase editing rates (Fig. 7). Taken together, the inventors demonstrate functionality of AAV-delivered intein-split PE ΔRnH -p.1153 in vivo in the liver, with editing efficiencies of up to 15% in neonatal mice.

Example 4: Establishment of prime editing strategies to correct the Pah enu2 allele

For a number of genetic liver diseases, including PKU, correction rates of 10-15% would be sufficient for cure. Therefore, the inventors reasoned that in vivo prime editing could be used to correct the disease phenotype of a murine PKU disease model. PKU is an autosomal recessive metabolic liver disease, which is caused by mutations in the phenylalanine hydroxylase (Pah) gene. These result in a lowering of functional Pah enzyme levels, leading to toxic accumulation of phenylalanine (L-Phe) and its byproducts in the blood. The Pah enu2 mouse model for PKU carries a homozygous point mutation in exon 7 (c.835 T>C; p.F263S), resulting in abnormally high blood L-Phe levels of >1500 μmol/L. Correction of the mutation using cytidine base editors has led to a reduction in blood L-Phe levels below the therapeutic threshold, and full reversal of the PKU phenotype. To investigate the ability of PEs to correct this pathogenic mutation, the inventors first evaluated editing efficiency and accuracy of various pegRNAs in vitro in a HEK293T cell line with stably integrated exon 7 of the Pah enu2 allele. The inventors designed pegRNAs for SpCas-PE2 and SpRY-PE2 (Fig. 4a), harboring a 13-nucleotide long PBS domain combined with 16- (mPKU-*.1) or 19- (mPKU-*.2) nucleotide long RT templates. In addition, the inventors designed nicking sgRNAs for testing the PE3b approach (Fig. 4a; Fig. 8a). Plasmids were transfected into the cell line and after 3 days genomic DNA was isolated for deep amplicon sequencing. Using the PE2 approach, the inventors achieved highest editing efficiencies with pegRNAs mPKU-1.1 (19.6%) and -2.1 (19.7%) (Fig. 4b). These could be further increased to 22.4% and 21.4% by co-transfecting the corresponding PE3b nicking sgRNAs. Since bystander base substitutions and indel mutations were slightly lower with mPKU-2.1 (0.7% and 1.0%, Fig. 8b, 8c), the inventors decided to use this pegRNA together with the corresponding PE3b nicking sgRNA for subsequent in vivo experiments.

Example 5: In vivo correction of the Pah enu2 allele restores physiological L-Phe levels

To elucidate whether PE2 and/or PE3b can reach therapeutic correction of the pathogenic Pah enu2 allele in vivo, the inventors cloned the specified pegRNA with and without the PE3b sgRNA on the C-terminal intein-split PE ΔRnH -p.1153 expression vector for packaging into AAV2 serotype 8 particles. The inventors then delivered the N- and C-terminal intein-split PE ΔRnH vectors systemically in a 1:1 ratio into new-born and adult PKU mice at a dose of 1×10^12 vg per construct and animal (Fig. 9a). Editing rates, however, were consistently lower at the Pah enu2 locus compared to the Dnmt1 locus (<1.3%; Fig. 9b, c), and neither PE2 nor PE3b treatment led to a lowering of L-Phe levels below the therapeutic threshold of 360 μmol/L (Fig. 9d). Due to the relatively low abundance of reconstituted full-length proteins from intein-split moieties, the inventors hypothesized that the delivery of full-length PE ΔRnH might increase editing rates at the Pah enu2 locus. Therefore, the inventors next generated a human Adenovirus 5 (AdV5) vector expressing full-length PE ΔRnH under the liver-specific P3 promoter, and containing the mPKU-2.1 pegRNA alone (AdV5-PE2 ΔRnH ) or combined with the corresponding PE3b sgRNA (AdV5-PE3b ΔRnH ; Fig. 9a). Vectors were systemically delivered via the tail vein into 5-week-old mice, or via the temporal vein into new-born pups at a dose of 1.5×10^10 vg per mouse. Editing at the Pah enu2 locus was analyzed by deep amplicon sequencing 4 weeks after injection. Importantly, the inventors observed substantially higher C-to-T correction rates than after AAV-mediated delivery of intein-split PE ΔRnH -p.1153. With AdV5-PE2 ΔRnH the inventors obtained on average 2.0% editing in mice injected at 5 weeks of age, and 5.6% editing in mice injected as new-borns (Fig. 4d). Using AdV5-PE3b ΔRnH the inventors achieved on average correction rates of 0.5% in adult mice, and 9.1% (with a maximum of 13.9%) in new-born pups (Fig. 4d). Despite the high editing rates in neonatal mice, neither PE2 ΔRnH nor PE3b ΔRnH resulted in bystander substitutions or indel mutations above the levels of untreated control animals (Fig. 4e; Fig. 9c and 10). To assess the therapeutic efficacy of the inventors' in vivo prime editing approach, the inventors next quantified blood L-Phe levels at experimental endpoints. Compared to L-Phe levels of untreated homozygous animals (1701-2543 μmol/L), the inventors observed a significant, though not therapeutic, reduction of L-Phe levels in adult mice injected with AdV5-PE2 (1029-1185 μmol/L) and AdV5-PE3b (973 and 1157 μmol/L). However, in animals injected as new-borns the L-Phe concentrations were further reduced to 187 μmol/L in AdV5-PE2-treated animals, and to 59 μmol/L in AdV5-PE3b-treated animals (Fig. 4f). Notably, the values for AdV5-PE3b-treated Pah enu2 pups are well below the therapeutic L-Phe threshold of 360 μmol/L and the physiological threshold of 120 μmol/L.
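The comparison of the measured L-Phe values with the two thresholds named above can be illustrated with a short sketch. It assumes all concentrations are in μmol/L, as in Example 4; the function name and the three category labels are illustrative choices and do not appear in the application.

```python
# Thresholds as stated in the text (assumed unit: micromol/L).
THERAPEUTIC_UMOL_L = 360.0
PHYSIOLOGICAL_UMOL_L = 120.0


def classify_l_phe(level_umol_l: float) -> str:
    """Place a blood L-Phe value relative to the two thresholds.

    The category labels are hypothetical names chosen for this sketch.
    """
    if level_umol_l < 0:
        raise ValueError("concentration cannot be negative")
    if level_umol_l < PHYSIOLOGICAL_UMOL_L:
        return "physiological"
    if level_umol_l < THERAPEUTIC_UMOL_L:
        return "therapeutic"
    return "above therapeutic threshold"


# Values reported in Example 5; for the adult AdV5-PE2 group only the
# upper bound of the reported range is used.
reported = {
    "AdV5-PE2, adult (upper bound)": 1185.0,
    "AdV5-PE2, new-born": 187.0,
    "AdV5-PE3b, new-born": 59.0,
}

for group, level in reported.items():
    print(f"{group}: {level:.0f} umol/L -> {classify_l_phe(level)}")
```

Read this way, only the AdV5-PE3b-treated new-borns fall below both thresholds, while the AdV5-PE2-treated new-borns fall below the therapeutic threshold but above the physiological one.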

Example 6: In vivo prime editing does not induce extensive liver damage or off-target mutations

For clinical application of in vivo prime editing, persistent liver damage and extensive off-target editing triggered by PE expression could be critical limitations. Therefore, the inventors first performed histological assessment of liver sections from AdV5- and AAV8-treated animals. In none of the animals, however, were mononuclear cell infiltrates or other signs of hepatocyte necrosis observed (Fig. 4g; Fig. 11a). Likewise, the inventors only detected a very mild elevation of the serum transaminases alanine aminotransferase (ALT) and aspartate aminotransferase (AST) at the timepoints when animals were sacrificed (Fig. 11b). Next, the inventors assessed whether editing with AdV5-PE2 and AdV5-PE3b was restricted to the targeted Pah enu2 locus. The inventors therefore computationally predicted potential off-target loci with homology to the protospacer sequence of pegRNA mPKU-2.1, and analyzed the top 10 sites in animals treated with AdV5-PE3b by deep amplicon sequencing. In line with previous in vitro studies that demonstrated higher specificities of PEs compared to CRISPR-Cas9 nucleases, neither SNVs nor indel mutations above background were observed (Fig. 4h; n=3 mice per group). Taken together, the inventors' data demonstrate that prime editing corrects the pathogenic Pah enu2 allele in vivo without altering liver physiology or inducing substantial off-target mutations at sites homologous to the target locus.

Methods

Generation of plasmids

To generate pegRNA plasmids, annealed spacer, scaffold, and 3’ extension oligos were cloned into pU6-pegRNA-GG-acceptor by Golden Gate assembly. To generate nicking sgRNA plasmids, annealed and phosphorylated oligos were ligated into BsmBI-digested lentiGuide-Puro backbone. For the generation of split-intein and orthogonal PEs, inserts were ordered as gBlocks from Integrated DNA Technologies (IDT) and cloned into pCMV-PE2 backbone using HiFi DNA assembly MasterMix (NEB). To generate piggyBac disease reporter plasmids, inserts with homology overhangs for cloning were ordered from IDT and cloned into the pPB-Zeocin backbone using HiFi DNA assembly MasterMix (NEB). To engineer plasmids for virus production, inserts were ordered as gBlocks (IDT) and cloned into AAV backbones using HiFi DNA assembly MasterMix (NEB). All PCR reactions were performed using Q5 High-Fidelity DNA polymerase (New England Biolabs). pU6-pegRNA-GG-acceptor (Addgene plasmid no. 132777) and pCMV-PE2 (Addgene plasmid no. 132775) were gifts from David Liu. lentiGuide-Puro and PX404 Campylobacter jejuni Cas9 were gifts from Feng Zhang (Addgene plasmid no. 52963 and 68338). pCMV-VSV-G was a gift from B. Weinberg (Addgene plasmid no. 8454) and psPAX2 was a gift from D. Trono (Addgene plasmid no. 12260). SauriABEmax was a gift from Yongming Wang (Addgene plasmid no. 135968).

Cell culture transfection and genomic DNA preparation

HEK293T (ATCC CRL-3216) cells were maintained in Dulbecco’s Modified Eagle’s Medium (DMEM) plus GlutaMax (Thermo Fisher), supplemented with 10% (v/v) fetal bovine serum (FBS) and 1% penicillin/streptomycin (Thermo Fisher) at 37°C and 5% CO2. Cells were maintained at confluency below 90% and seeded on 48-well cell culture plates (Greiner). Cells were transfected at 70% confluency using 1.5 μL of Lipofectamine 2000 (Thermo Fisher) with 375 ng PE, 125 ng pegRNA, and 40 ng sgRNA according to the manufacturer’s instructions. When intein-split PEs were transfected, 375 ng of each PE half was used for transfection. Unless otherwise noted, cells were incubated for 3 days and genomic DNA was isolated by direct lysis. For lentivirus production, HEK293T cells were seeded into a T75 flask (Greiner) with Opti-MEM (Thermo Fisher) and transfected at 70% confluency using polyethylenimine (PEI). Briefly, 60 μL PEI (0.1 mg/mL) was mixed with 370 μL Opti-MEM, incubated at RT for 5 min, and added to 4.4 μg PAX2, 1.5 μg VSV-G, and 5.9 μg lentiviral vector plasmid (filled up to 430 μL with Opti-MEM). Following 20 min incubation at RT, cells were transfected. The culture medium was changed one day after transfection. After two days, the cell culture supernatant was harvested and lentiviral particles were purified by filtration (0.20 μm, Sarstedt). Fresh HEK293T cells were subsequently transduced with lentiviral particles in a 24-well cell culture plate (Greiner). Two days after transduction, cells were enriched for 7 days using 2.5 μg/mL Puromycin.

For generation of disease reporter cell lines with the PiggyBac transposon, 30,000 HEK293T cells were seeded into a 24-well culture plate (Greiner) and transfected at 70% confluency using Lipofectamine 2000 (Thermo Fisher) according to the manufacturer’s instructions. Briefly, 1.5 μL Lipofectamine was mixed with 23.5 μL Opti-MEM, incubated at RT for 10 min, and added to 225 ng transposon plasmid and 25 ng transposon helper plasmid (filled up to 25 μL with Opti-MEM). Following 30 min incubation at RT, cells were transfected. Three days after transfection, cells were enriched for 10 days using 150 μg/mL Zeocin.

Fluorescence reporter assays and fluorescence-activated cell sorting

Reporter cells were transfected with prime editing tools that are programmed to restore the expression of a fluorescent protein. Cells were incubated for 3 days post-transfection and trypsinized with TrypLE (Gibco). Cells were washed twice with phosphate-buffered saline (PBS) and resuspended in FACS Buffer (PBS supplemented with 2% FBS and 2 mM EDTA). Cell suspensions were filtered through 35 μm nylon mesh cell strainer snap caps (Corning) and kept on ice until analysis. For each sample, 100,000 events were counted on an LSR Fortessa (BD Biosciences) using the FACSDiva software version 8.0.1 (BD Biosciences). Experiments were performed in up to four replicates on different days. Data are reported as mean values ± standard deviation (s.d.).

AAV and Adenovirus production

All pseudo-typed AAV2/8 vectors were produced by the Viral Vector Facility of the Neuroscience Center Zurich. All human Adenovirus 5 vectors were produced by ViraQuest.

Animal studies

Animal experiments were performed in accordance with protocols approved by the Kantonales Veterinäramt Zürich and in compliance with all relevant ethical regulations. Pah enu2 and C57BL/6 mice were housed in a pathogen-free animal facility at the EPIC Institute of Pharmacology and Toxicology of the University of Zurich. Mice were kept in a temperature- and humidity-controlled room on a 12-h light-dark cycle. Mice were fed a standard laboratory chow (Kliba Nafag no. 3437 with 18.5% crude protein) and genotyped at weaning. Heterozygous Pah enu2 littermates were used as controls for physiological L-Phe levels in the blood (<120 μM). For sampling of blood for L-Phe determination, mice were fasted for 3-4 h and blood was collected from the tail vein. Unless otherwise noted, new-born animals (P1) received 1.5×10^10 (Adenovirus) or 1×10^12 (for each AAV) vg per animal and construct via the temporal vein. Adult mice were injected with 1.5×10^11 (Adenovirus) per animal or 1×10^12 (for each AAV) vg per animal and construct via the tail vein. Average weight of neonatal and adult mice (5 weeks) was 1.5 g and 20 g, respectively. New-born mice were euthanized 4 weeks after injection. Adult mice were euthanized 4 weeks (Adenovirus) or 12 weeks (AAV) after injection.
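Because the absolute doses and the average body weights are both given above, the doses can be normalized to body weight for comparison between age groups. The following is a minimal sketch of that arithmetic, assuming the average weights of 1.5 g and 20 g apply to every animal; the function and variable names are illustrative and not part of the application.

```python
def vg_per_kg(dose_vg: float, body_weight_g: float) -> float:
    """Convert an absolute dose (vector genomes per animal) to vg per kg body weight."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return dose_vg / (body_weight_g / 1000.0)


# Average weights stated in the Methods (grams).
NEONATE_WEIGHT_G = 1.5
ADULT_WEIGHT_G = 20.0

# Absolute doses as stated in the Methods (vg per animal; per construct for AAV).
doses_vg_per_kg = {
    ("Adenovirus", "new-born"): vg_per_kg(1.5e10, NEONATE_WEIGHT_G),
    ("Adenovirus", "adult"): vg_per_kg(1.5e11, ADULT_WEIGHT_G),
    ("AAV, per construct", "new-born"): vg_per_kg(1e12, NEONATE_WEIGHT_G),
    ("AAV, per construct", "adult"): vg_per_kg(1e12, ADULT_WEIGHT_G),
}

for (vector, age), value in doses_vg_per_kg.items():
    print(f"{vector:>20} | {age:>8} | {value:.2e} vg/kg")
```

Under these averages the adenoviral doses correspond to about 1×10^13 vg/kg in new-borns and 7.5×10^12 vg/kg in adults, whereas the same absolute AAV dose corresponds to a more than tenfold higher weight-normalized dose in new-borns than in adults.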

Primary hepatocyte isolation

Primary hepatocytes were isolated using a two-step perfusion method. Briefly, pre-perfusion with Hank’s buffer (HBSS supplemented with 0.5 mM EDTA, 25 mM HEPES) was performed by inserting the cannula through the superior vena cava and cutting the portal vein. Next, livers were perfused at low speed for approximately 10 min with Digestion Buffer (low glucose DMEM supplemented with 1 mM HEPES) containing freshly added Liberase TM (32 μg/mL; Roche). Digestion was stopped using Isolation Buffer (low glucose DMEM supplemented with 10% FBS) and cells were separated from the matrix by gently pushing with a cell scraper. The cell suspension was filtered through a 100 μm filter (Corning) and hepatocytes were purified by two low-speed centrifugation steps (50 × g for 2 min).

Amplification for deep sequencing

Genomic DNA from mouse liver tissue was isolated from whole-liver lysate by direct lysis. Locus-specific primers were used to generate targeted amplicons for deep sequencing. First, input genomic DNA was amplified in a 20 μL reaction for 25 cycles using NEBNext High-Fidelity 2X PCR Master Mix (NEB). PCR products were purified using AMPure XP beads (Beckman Coulter) and subsequently amplified for 8 cycles using primers with sequencing adapters. Approximately equal amounts of PCR products from each sample were pooled, gel-purified and quantified using a Qubit 3.0 fluorometer and the dsDNA HS assay kit (Thermo Fisher). Paired-end sequencing of purified libraries was performed on an Illumina MiSeq.

HTS data analysis

Sequencing reads were demultiplexed using MiSeq Reporter (Illumina). Amplicon sequences were aligned to their reference sequences using CRISPResso2 47. Prime editing efficiencies were calculated as percentage of (number of reads containing only the desired edit)/(number of total reads). Incorrect editing was quantified as percentage of (number of reads containing bystander base substitutions)/(number of total reads). Indel yields were calculated as percentage of (number of indel-containing reads)/(total reads).
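The three ratios defined above can be written out as a short sketch. It assumes the reads of one amplicon have already been classified (for example by the alignment step) into the four counts used below; the class, the field names, and the read counts in the example are hypothetical and are not data from the application.

```python
from dataclasses import dataclass


@dataclass
class AmpliconCounts:
    """Classified read counts for one amplicon (field names are illustrative)."""
    total_reads: int
    desired_edit_only: int  # reads containing only the desired edit
    bystander_subs: int     # reads containing bystander base substitutions
    indels: int             # indel-containing reads


def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage."""
    if denominator <= 0:
        raise ValueError("total read count must be positive")
    return 100.0 * numerator / denominator


def summarize(counts: AmpliconCounts) -> dict:
    """Apply the three definitions from the HTS data analysis paragraph."""
    return {
        "prime_editing_pct": percent(counts.desired_edit_only, counts.total_reads),
        "incorrect_editing_pct": percent(counts.bystander_subs, counts.total_reads),
        "indel_pct": percent(counts.indels, counts.total_reads),
    }


# Invented read counts, used only to exercise the formulas.
example = AmpliconCounts(total_reads=20000, desired_edit_only=2140,
                         bystander_subs=140, indels=200)
result = summarize(example)
for name, value in result.items():
    print(f"{name}: {value:.1f}%")
```

All three quantities share the same denominator (total reads), so they can be compared directly for a given sample.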

Histology

Livers were fixed using 4% paraformaldehyde (PFA, Sigma-Aldrich), followed by ethanol dehydration and paraffinization. Paraffin blocks were cut into 5-μm thick sections, deparaffinized with xylene, and rehydrated. Sections were H&E-stained and examined for histopathological changes.

Statistical analysis

A priori power calculations to determine sample sizes for animal experiments were performed using the R “pwr” package. Statistical analyses were performed using GraphPad Prism 9.0.0 for macOS. Data are represented as biological replicates and are depicted as mean ± s.d. as indicated in the corresponding figure legends. Likewise, sample sizes and the statistical tests used are described in detail in the respective figure legends. For all analyses, P values < 0.05 were considered statistically significant.

Sequences

Sequences as stated in the sequence protocol (ST.25) are incorporated by reference herein.

Sequences relevant to claims are stated here; in case of discrepancies with the protocol, the below sequence information shall prevail.

SEQ ID NO 001, spCAS9 Nickase N-fragment, 1st vector (partial sequence first vector polypeptide):

DKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNL!GALLFDSGET AEATRLKRTARRRYTRRKNR

ICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHP!FGNIVDEVAYHEKYPT! YHLRKKLVDSTDKADLRLIYL

ALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILS ARLSKSRRLENLIAQLPGEKK

NGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFL AAKNLSDAILLSDILRVNTEIT

KAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQE EFYKFIKPILEKMDGTEELL

VKLNREDLLRKQRTFDNGSIPHGIHLGELHAILRRQEDFYPFLKDNREKIEKILTFR IPYYVGPLARGNSRFAWMTR

KSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNEL TKVKYVTEGMRKPAFLSGE

QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKI IKDKDFLDNEENED!LEDIVLT LTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKT!LDF LKSDGFANRNFMQLIH

DDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHK PENIVIEMARENQTTQKGQ

KNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDIN RLSDYDVDAIVPQSFLKDD

SIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGG LSELDKAGFIKRQLVETR

QITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYH HAHDAYLNAVVGTALIKKYPK

LESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKR PLIETNGETGEIVWDKGRDF

ATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSP TVAYSVLVVAKVEKGK

SEQ ID NO 002, spCAS9 Nickase C-fragment and RTΔRNAseH, 2nd vector (partial sequence 2nd vector polypeptide)

SKKLKSVKELLGmMERSSFEKNPlDFLEAKGYKEVKKDUlKLPKYSLFELENGRKRM LASAGELQKGNELALPSK

YVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE!!EQ!SEFSKRVILADANLD KVLSAYNKHRDKPIREQAENI iHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDS GGSSGGSSGSETPGTSES

SEQ ID NO 003, spCAS9 Nickase N-fragment and N terminal intein, 1st vector (full sequence 1st vector polypeptide):

MKRTADGSEFESPKKKRKVDKKYSIGLDIGTNSVGWAViTDEYKVPSKKFKVLGNTD RHSIKKNLIGALLFDSGETA

EATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERH PIFGNIVDEVAYHEKYPTIY

HLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLF!QLVQTYN QLFEENPINASGVDAKAILSA

RLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTY DDDLDNLLAQIGDQYADLFL

AAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYK EIFFDQSKNGYAGYIDGGASQ

EEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQ EDFYPFLKDNREKIEKILTFRIP

YYVGPLARGNSRFAWlVITRKSEETiTPA .YEYFTVYNEL

TKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEIS GVEDRFNASLGTYHDLLKIIK

DKDFLDNEENEDILEDIVLTLTLFEDREM!EERLKTYAHLFDDKVMKQLKRRRYTGW GRLSRKLINGIRDKQSGKTI

LDFLKSDGFANRNFIV!QLIHDDSLTFKEDIQKAQVSGQGDSLHEH!ANLAGSPAIK KGILQTVKVVDELVKVMGRHK

PENIVIEMARENQTTQKGGKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLY LYYLQNGRDMYVDQELDIN

RLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLN AKLITQRKFDNLTKAERG

GLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLV SDFRKDFQFYKVREINNYHHA

HDAYLNAVVGTAUKKYPKLESEf r VYGDYKVYDVRKlVllAKSEQElGKATAKYFFYSNIlV!NFFKTElTLANGElR KRPL lETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIAR KKDWDPKKYGGFDSPT

VAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLI IKLPKYSLFELENGRKRMLAS

AGELQKGNELALPSKYVNFLYLASHYEKLKGSF 5 EDNEQKQLFVEQHKHYLDEllEQlSEFSKRVlLADANLDKVLSA

YNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQ SITGLYETRIDLSQLGGDSGGS

SGGSSGSETPGTSESATPESSGGSSGGSSTLN I EDEYRLL

QAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGIL

VEDIHPTVPNPYNLLSGLPPSHGWYTVLDLKDAFFCLRLHF

TLFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQC KEGGRWLTEARKETVMGGPTPKTPRGLREFLGK; .YPLTK -IWGPDQQKAYQEI

KQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLT .DPVAAGWPPCLRMVAAIAVLTKD

AGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALI VGFGPVVALNPATLLPGSEFEPKKKRKV

SEQ ID NO 004, spCAS9 Nickase C-fragment and RTΔRNAseH, 2nd vector (full sequence 2nd vector polypeptide)

MKRTADGSEFESPKKKRKVIKIATRKYLGKQNVYDIGVERDHNFALKNGFIASNSGG GGSGGGGSGGGGSSKKL

KSVKELLGITIMERSSFEKNP!DFLEAKGYKEVKKDLI!KLPKYSLFELENGRKRML ASAGELQKGNELALPSKYVNF

LYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLS AYNKHRDKPIREQAENIIHLFT

LTNLGAPAAFKYFDTT!DRKRYTSTKEVLDATLIHGS!TGLYETRIDLSQLGGDSGG SSGGSSGSETPGTSESATPE

SEQ ID NO 005

MKRTADGSEFESPKKKRKVDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTD RHSIKKNLIGALLFDSGETA

EATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERH PIFGNIVDEVAYHEKYPTIY

HLRKKLVDSTDKADLRL!YLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVGTYN QLFEENPINASGVDAKAILSA

RLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTY DDDLDNLLAQIGDQYADLFL

AAKNLSDA!LLSDILRVNTEITKAPLSASIVilKRYDEHHQDLTLLKALVRQQLPEK YKEIFFDQSKNGYAGY!DGGASQ

EEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQ EDFYPFLKDNREKIEKILTFRIP

YYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEK VLPKHSLLYEYFTVYNEL

TKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEIS GVEDRFNASLGTYHDLLKIIK

DKDFLDNEENEDILEDIVLTLTLFEDREM!EERLKTYAHLFDDKVMKQLKRRRYTGW GRLSRKLINGIRDKQSGKTI

LDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKG ILQTVKVVDELVKVMGRHK LYYLQNGRDMYVDQELDIN

RLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLN AKLITQRKFDNLTKAERG

GLSELDKAGFIKRQLVETRG!TKHVAQ!LDSRMNTKYDENDKL!REVKV!TLKSKLV SDFRKDFGFYKVREiNNYHHA

HDAYLNAVVGTAL!KKYPKLESEFVYGDYKVYDVRKM!AKSEQEIGKATAKYFFYSN IMNFFKTE!TLANGEIRKRPL lETNGETGEIVWDKGRDFATVRKVLSMPQVNiVKKTEVQTGGFSKESILPKRNSDKLIAR KKDWDPKKYGGFDSPT

VAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLI IKLPKYSLFELENGRKRMLAS

AGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQI SEFSKRVILADANLDKVLSA

YNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQ SITGLYETRIDLSQLGGD

SEQ ID NO 006

EFLGKAGFI GLTAPALGLPDL : ¾ KGVLTQKLGPWRRPVAYLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAP HAVEALVKQPPDRW

LSNARMTHYQALLLDTDRVQFGPVVALNPATLLPGSEFEPKKKRKV

SEQ ID NO 007

MKRTADGSEFESPKKKRKVDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTD RHSIKKNLIGALLFDSGETA

EATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERH PIFGNIVDEVAYHEKYPTIY

HLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYN QLFEENPINASGVDAKAILSA

RLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTY DDDLDNLLAQIGDQYADLFL

AAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYK EIFFDQSKNGYAGYIDGGASQ

EEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQ EDFYPFLKDNREKIEKiLTFRIP

SEQ ID NO 008 (structural part of pegRNAs)

GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC

SEQ ID NO 009 (guide RNA sequence tract):

CCTAATGTACTGTGTGCAGT

SEQ ID NO 010 (template sequence tract)

GCCTTCCGaGTCTcCCACT

SEQ ID NO 011 (hybridizing sequence tract):

GCACACAGTACAT
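The relationship between SEQ ID NOs 8 to 11 can be illustrated with a short sketch that derives the hybridizing tract from the guide tract and joins the four parts. It assumes the whitespace within the printed sequences is an extraction artifact, that SpCas9 nicks 3 nucleotides upstream of the PAM (general knowledge, not stated in this section), and that the parts are ordered guide, structural part, template, hybridizing tract from 5' to 3', as is usual for pegRNAs. The function names are illustrative only.

```python
# Translation table for complementing DNA while preserving letter case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def derive_pbs(spacer: str, pbs_length: int, nick_offset: int = 3) -> str:
    """Reverse complement of the pbs_length nt directly 5' of the nick site.

    nick_offset is the number of protospacer nt lying 3' of the nick
    (assumed to be 3 for SpCas9).
    """
    nick_index = len(spacer) - nick_offset
    if pbs_length > nick_index:
        raise ValueError("PBS longer than the protospacer region 5' of the nick")
    return reverse_complement(spacer[nick_index - pbs_length:nick_index])


# Sequences as listed in SEQ ID NOs 8 to 11, with whitespace removed.
SCAFFOLD = ("GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTT"
            "GAAAAAGTGGCACCGAGTCGGTGC")   # SEQ ID NO 008
SPACER = "CCTAATGTACTGTGTGCAGT"           # SEQ ID NO 009
RT_TEMPLATE = "GCCTTCCGaGTCTcCCACT"       # SEQ ID NO 010, case kept as printed
PBS = "GCACACAGTACAT"                     # SEQ ID NO 011

derived_pbs = derive_pbs(SPACER, len(PBS))
pegrna_dna = SPACER + SCAFFOLD + RT_TEMPLATE + PBS

print("derived PBS matches SEQ ID NO 011:", derived_pbs == PBS)
print("pegRNA length (nt):", len(pegrna_dna))
```

For a different target, only the guide tract, template, and hybridizing tract would change, while the structural part stays the same.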

SEQ ID NOs 9 to 11 are specific for the edit performed in the example and will need to be adapted to the edited sequence and the intended edit for each individual application.

N-intein split PE 713/714 (SEQ ID NO 111)

GSSGGSSTLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLII PLKAT

STPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLPVKKPGTNDYRPV QDLRE

VNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRD PEMG

ISGQLTWTRLPQGFKNSPTLFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELD CQQGTR

ALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKTPR QLRE

FLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQALLTAPALGL PDLTK

PFELFVDEKQGYAKGVLTQKLGPWRRPVAYLSKKLDPVAAGWPPCLRMVAAIAVLTK DAGK LTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPWALNPATLLPL P EEGLQHNCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVI W I

Discussion

In vivo prime editing in somatic tissues holds great potential for therapeutic application in patients with genetic diseases. Like base editing, prime editing works independently of double-strand (ds)DNA break formation, which can trigger excessive genetic damage, including translocations, inversions, and large deletions. With base editing, efficient and precise correction of disease-causing mutations in vivo in somatic tissues has already been demonstrated for a number of loci. This includes the Pah enu2 locus, where the inventors achieved curative correction with SaCas9-CBE by AAV- and lipid nanoparticle (LNP)-mediated delivery into adult mice. Interestingly, in vivo prime editing was less efficient at this locus, and therapeutic correction required AdV5-mediated delivery of full-length PE into new-born mice. These results are in line with previous ex vivo studies, which also observed a trend for lower editing rates with PEs compared to BEs at a number of targeted loci. Nevertheless, PEs are more flexible with regard to PAM localization relative to the edit, they induce fewer bystander mutations, and they are also able to correct indel mutations and transversion substitutions. Hence, PEs complement BEs by significantly increasing the number of pathogenic alleles that could be targeted by dsDNA-break-independent genome editing.

In our study, systemic injection of a single dose of AAV-8 or AdV5 encoding the PE, pegRNA, and sgRNA resulted in >10% editing at the Dnmt1 and Pah enu2 loci, respectively. Correction rates in Pah enu2 mice were sufficient to reduce blood L-Phe to physiological levels, and neither unwanted bystander mutations at the target site nor editing at potential off-target sites were observed. Our findings therefore support previous studies that have demonstrated high specificity of PEs in vitro, and provide proof of concept for the efficacy of prime editing in treating PKU. Moreover, correction rates of 10% in hepatocytes should also be sufficient to treat a variety of other genetic liver diseases, including Tyrosinemia and Urea Cycle Disorders.

Prime editing was more efficient in new-born mice than in adult mice. The inventors therefore hypothesized that hepatocyte proliferation in neonatal mice had positively influenced editing rates. Ectopic induction of hepatocyte proliferation by PHx in treated adult mice, however, did not increase editing efficiencies at the Dnmt1 locus. Future work should therefore test whether increasing AAV or AdV5 doses in adult animals could elevate editing rates to those observed in pups. However, since the inventors already applied relatively high vector doses in our study, alternative strategies of enhancing PE activity would be desired in order to sustain clinical viability of in vivo prime editing. Examples of such strategies could be to increase the nuclear import capacity of PE through an optimized NLS design, or to increase the reactivity of the RT domain via directed protein evolution.