
Title:
REPROGRAMMING TROPISM VIA DISPLAYED PEPTIDES TILING RECEPTOR-LIGANDS
Document Type and Number:
WIPO Patent Application WO/2024/055020
Kind Code:
A2
Abstract:
Described herein are methods for engineering proteins and viruses to improve tropism, and proteins and viruses made by using said methods.

Inventors:
MALI PRASHANT (US)
PORTELL ANDREW (US)
FORD KYLE (US)
SUHARDJO AMANDA (US)
Application Number:
PCT/US2023/073808
Publication Date:
March 14, 2024
Filing Date:
September 08, 2023
Assignee:
UNIV CALIFORNIA (US)
International Classes:
C12N15/86; C07K14/075
Attorney, Agent or Firm:
BAKER, Joseph R., Jr. (US)
Claims:
WHAT IS CLAIMED IS:

1. A method of improving tropism of a virus or other delivery agent, the method comprising: identifying ligand protein sequences derived from all known receptor-interacting ligands; systematically tiling the ligand protein sequences into 5-50, 10-20, or 20 amino acid peptides, which are inserted into surface-exposed loops of AAV capsids; and assessing the engineered capsids for their packaging capacity, in vivo tropism, and enhanced protein interactions.

2. The method of claim 1, wherein the virus is an adeno-associated virus (AAV).

3. The method of claim 2, wherein the AAV is selected from the group consisting of AAV1, AAV2, AAV5, AAV6, AAV7, AAV8, and AAV9.

4. The method of claim 3, wherein the AAV is AAV5 or AAV9.

5. The method of claim 3, wherein peptide sequences were generated via pooled oligonucleotide synthesis and inserted into 4 distinct loop regions: AAV5-Loop1 (N443), AAV5-Loop2 (S576), AAV9-Loop1 (Q456), and AAV9-Loop2 (A587) to generate over 1 million AAV variants.

6. The method of claim 3, wherein a 20-mer ligand peptide is inserted into one or both of 2 surface-exposed loops of AAV5 (SEQ ID NO: 2) or AAV9 (SEQ ID NO: 4).

7. A recombinant vector comprising capsid proteins containing one or two peptides inserted into one or both of two surface-exposed loops in the capsid protein, wherein the vector has a desired tropism or immune-orthogonality and wherein the peptides are independently selected from the group consisting of SEQ ID NOs: 5 to 820 and 870-911.

8. The recombinant vector of claim 7, wherein the vector is an adeno-associated virus (AAV).

9. The recombinant vector of claim 8, wherein the AAV vector is an AAV5 serotype.

10. The recombinant vector of claim 9, wherein the AAV5 comprises a capsid protein having a sequence as set forth in SEQ ID NO: 2.

11. The recombinant vector of claim 8, wherein the AAV vector is an AAV9 serotype.

12. The recombinant vector of claim 11, wherein the AAV9 comprises a capsid protein having a sequence as set forth in SEQ ID NO: 4.

13. The recombinant vector of claim 7, wherein the vector has tropism to pancreas, heart, brain, lung, liver, kidney, muscle, spleen or intestine.

14. The recombinant vector of claim 10 or 12, wherein a peptide of SEQ ID NOs: 530-820, 870-910 or 911 is inserted into loop 1 and/or loop 2 of an AAV5 capsid or an AAV9 capsid.

15. The recombinant vector of claim 7, wherein the vector is immune orthogonal.

16. The recombinant vector of claim 15, wherein the vector is an adeno-associated virus (AAV).

17. The recombinant vector of claim 16, wherein the AAV comprises a capsid protein of any one of SEQ ID NOs: 822, 824, 826, 828, 830, 832, 834, 836, 838, 840, 842, 844, 846, 848, 850, 852, 854, 856, 858, 860, 862, 864, or 866, or a sequence that is at least 85% to 99% identical to any of the foregoing sequences.

18. A method of making a delivery vehicle or vector with a desired tropism, the method comprising selecting a peptide sequence from any one of SEQ ID NOs: 5-820 or 870-911 and (i) cloning a nucleic acid sequence encoding the peptide into a coding sequence for a capsid protein at an exposed loop site to obtain a recombinant capsid coding sequence and producing the vector using the recombinant capsid coding sequence or (ii) inserting the peptide into an exposed surface of the delivery vehicle.

19. The method of claim 18, wherein the vector is an adeno-associated virus (AAV).

20. The method of claim 19, wherein the AAV is selected from the group consisting of AAV1, AAV2, AAV5, AAV6, AAV7, AAV8, and AAV9.

21. The method of claim 20, wherein the AAV is AAV5 or AAV9.

22. The method of claim 21, wherein the AAV5 capsid coding sequence comprises SEQ ID NO: 1.

23. The method of claim 21, wherein the AAV9 capsid coding sequence comprises SEQ ID NO: 2.

24. An AAV vector wherein the capsid protein has been modified according to the method of claim 18.

25. An AAV vector comprising a capsid protein wherein the capsid protein expresses a peptide of any one of SEQ ID NO: 5-820 or 870-911 in surface exposed loop 1 and/or loop 2 of the capsid protein.

26. The AAV vector of claim 25, wherein the wild-type capsid protein sequence comprises SEQ ID NO: 2 or 4.

27. The AAV vector of claim 25, wherein the AAV vector has an AAV5 or AAV9 serotype.

28. The AAV vector of claim 25, wherein the vector has a desired tropism.

29. The AAV vector of claim 28, wherein the AAV vector has a tropism to pancreas, heart, brain, lung, liver, kidney, muscle, spleen or intestine.

30. A viral vector having a capsid protein comprising a heterologous targeting peptide in a range of 10-30 amino acids in length inserted into a surface exposed portion of a capsid protein and wherein the targeting peptide is set forth in any one of SEQ ID NOs: 5-820 or 870-911.

31. The viral vector of claim 30, wherein the heterologous targeting peptide is about 15-25 amino acids in length.

32. The viral vector of claim 31, wherein the heterologous targeting peptide is about 20 amino acids in length.

33. The viral vector of claim 30, wherein the viral vector is an adeno-associated virus (AAV).

34. The viral vector of claim 30, wherein the viral vector is a lentiviral vector.

35. The viral vector of claim 30, wherein the capsid protein is a VP1 capsid protein.

36. The viral vector of claim 30, wherein the capsid protein is a VP2 capsid protein.

37. The viral vector of claim 30, wherein the capsid protein is a VP3 capsid protein.

38. The viral vector of claim 30, wherein the heterologous targeting peptide is inserted into an AAV capsid protein at loop 1 and/or loop 2.

39. The viral vector of claim 30, wherein the viral vector is an AAV5.

40. The viral vector of claim 30, wherein the viral vector is an AAV9.

41. The viral vector of claim 39, wherein the heterologous targeting peptide is flanked by a linker peptide at the N-terminal and C-terminal ends of the heterologous targeting peptide.

42. The viral vector of claim 40, wherein the heterologous targeting peptide is flanked by a linker peptide at the N-terminal and C-terminal ends of the heterologous targeting peptide.

43. The viral vector of claim 30, wherein the heterologous targeting peptide targets the viral vector to hepatocytes or liver tissue.

44. The viral vector of claim 30, wherein the heterologous targeting peptide targets the viral vector to neuronal cells or brain tissue.

45. The viral vector of claim 30, wherein the heterologous targeting peptide targets the viral vector to pancreatic cells or pancreas tissue.

46. The viral vector of claim 30, wherein the heterologous targeting peptide targets the viral vector to cardiac cells or heart tissue.

47. The viral vector of claim 30, wherein the heterologous targeting peptide targets the viral vector to lung tissue.

48. The viral vector of claim 30, wherein the heterologous targeting peptide targets the viral vector to intestinal tissue.

49. The viral vector of claim 30, wherein the heterologous targeting peptide targets the viral vector to spleen tissue.

50. The viral vector of claim 30, wherein the heterologous targeting peptide targets the viral vector to renal cells or kidney tissue.

51. The viral vector of claim 30, wherein the heterologous targeting peptide targets the viral vector to muscle cells or tissue.

52. An adeno-associated virus (AAV) capsid protein comprising a heterologous targeting peptide cloned into loop 1 and/or loop 2 of the capsid protein, wherein the heterologous targeting peptide is about 10-30 amino acids in length and is contained within or comprises any one of the peptides of SEQ ID NO: 5-820 or 870-911.

53. The AAV capsid protein of claim 52, wherein the capsid protein is a VP1 capsid protein.

54. The AAV capsid protein of claim 52, wherein the capsid protein is a VP2 capsid protein.

55. The AAV capsid protein of claim 52, wherein the capsid protein is a VP3 capsid protein.

56. The AAV capsid protein of claim 52, wherein the heterologous targeting peptide is about 15-25 amino acids in length.

57. The AAV capsid protein of claim 52, wherein the heterologous targeting peptide is about 20 amino acids in length.

58. The AAV capsid protein of claim 52, wherein the heterologous targeting peptide is flanked by a linker peptide at the N-terminal and C-terminal ends of the heterologous targeting peptide.

59. The AAV capsid protein of claim 52, wherein the heterologous targeting peptide targets hepatocytes or liver tissue.

60. The AAV capsid protein of claim 52, wherein the heterologous targeting peptide targets neuronal cells or brain tissue.

61. The AAV capsid protein of claim 52, wherein the heterologous targeting peptide targets pancreatic cells or pancreas tissue.

62. The AAV capsid protein of claim 52, wherein the heterologous targeting peptide targets cardiac cells or heart tissue.

63. The AAV capsid protein of claim 52, wherein the heterologous targeting peptide targets lung tissue.

64. The AAV capsid protein of claim 52, wherein the heterologous targeting peptide targets intestinal tissue.

65. The AAV capsid protein of claim 52, wherein the heterologous targeting peptide targets spleen tissue.

66. The AAV capsid protein of claim 52, wherein the heterologous targeting peptide targets renal cells or kidney tissue.

67. The AAV capsid protein of claim 52, wherein the heterologous targeting peptide targets muscle cells or tissue.

68. A recombinant AAV (rAAV) comprising a capsid protein of any one of claims 52-67.

69. A recombinant AAV (rAAV) comprising a capsid protein having a targeting peptide in loop 1 and/or loop 2 wherein the targeting peptide is independently selected from SEQ ID Nos: 5-820 or 870-911.

70. The recombinant AAV of claim 69, wherein the recombinant AAV further comprises a heterologous polynucleotide for gene delivery.

71. The recombinant AAV of claim 70, wherein the heterologous polynucleotide is a therapeutic gene.

72. A composition comprising the recombinant rAAV of any one of claims 69-71.

73. The composition of claim 72 further comprising a pharmaceutically acceptable carrier.

73. A method for delivering a transgene to a subject comprising: administering a recombinant AAV (rAAV) to a subject, wherein the rAAV comprises:

(i) a capsid protein of any one of claims 52-67, and

(ii) at least one transgene, and wherein the rAAV infects cells of a target tissue of the subject.

74. The method of claim 73, wherein the at least one transgene encodes a protein.

75. The method of claim 74, wherein the protein is an immunoglobulin heavy chain or light chain or fragment thereof.

76. The method of claim 73, wherein the at least one transgene encodes a small interfering nucleic acid.

77. The method of claim 76, wherein the small interfering nucleic acid is a miRNA.

78. The method of claim 76, wherein the small interfering nucleic acid is a miRNA sponge or TuD RNA that inhibits the activity of at least one miRNA in the subject or animal.

79. The method of claim 77, wherein the miRNA is expressed in a cell of the target tissue.

80. The method of claim 73, wherein the target tissue is skeletal muscle, heart, liver, pancreas, brain or lung.

81. The method of claim 73, wherein the transgene expresses a transcript that comprises at least one binding site for a miRNA, wherein the miRNA inhibits activity of the transgene, in a tissue other than the target tissue, by hybridizing to the binding site.

82. The method of claim 73, wherein the at least one transgene encodes a gene product that mediates genome editing.

83. The method of claim 73, wherein the transgene comprises a tissue specific promoter or inducible promoter.

84. The method of claim 83, wherein the tissue specific promoter is a liver-specific thyroxin binding globulin (TBG) promoter, an insulin promoter, a glucagon promoter, a somatostatin promoter, a pancreatic polypeptide (PPY) promoter, a synapsin-1 (Syn) promoter, a creatine kinase (MCK) promoter, a mammalian desmin (DES) promoter, an α-myosin heavy chain (α-MHC) promoter, or a cardiac Troponin T (cTnT) promoter.

85. The method of claim 73, wherein the rAAV is administered intravenously, intravascularly, transdermally, intraocularly, intrathecally, orally, intramuscularly, subcutaneously, intranasally, or by inhalation.

86. The method of claim 73, wherein the subject is selected from a mouse, a rat, a rabbit, a dog, a cat, a sheep, a pig, and a non-human primate.

59. The method of claim 73, wherein the subject is a human.

87. An isolated nucleic acid encoding an AAV capsid protein containing an amino acid sequence selected from the group consisting of SEQ ID No: 5-820 and 870-911.

88. A delivery vehicle for delivery of a small molecule drug or biological agent having a desired tropism, wherein the delivery vehicle comprises a peptide or peptide fragment of at least 10-20 amino acids of any one of SEQ ID NOs:5-820 or 870-911.

89. The delivery vehicle of claim 88, wherein the delivery vehicle is selected from the group consisting of a liposome, a nanoparticle, a bacterium, a bacteriophage, a virus-like particle (VLP), an erythrocyte ghost, and an exosome.

90. The delivery vehicle of claim 88, wherein the biological agent comprises an siRNA, an antisense molecule, a protein or polypeptide, insulin, a vaccine, or an antibody.

91. The delivery vehicle of claim 88, wherein the small molecule drug comprises a chemotherapeutic agent, an anti-inflammatory, a steroid, and an antibiotic.

92. A biological agent having a desired tropism, the biological agent linked to a peptide or peptide fragment of at least 10-20 amino acids of any one of SEQ ID NOs:5-820 or 870-911.

93. The biological agent of claim 92, wherein the biological agent is a nucleic acid, a protein, a polypeptide, a peptide, an antibody, an antibody fragment, a non-immunoglobulin binding agent, or an enzyme.

Description:
REPROGRAMMING TROPISM VIA DISPLAYED PEPTIDES TILING RECEPTOR-LIGANDS

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority under 35 U.S.C. §119 from Provisional Application Serial No. 63/405,360, filed September 9, 2022, the disclosure of which is incorporated herein by reference in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

[0002] This invention was made with Government support under OD032742, CA222826, and GM123313 awarded by the National Institutes of Health and W81XWH-22-1-0401 awarded by the Department of Defense. The government has certain rights in the invention.

TECHNICAL FIELD

[0003] Described herein are methods for engineering proteins and viruses to improve tropism, and proteins and viruses made by using said methods.

INCORPORATION BY REFERENCE OF SEQUENCE LISTING

[0004] Accompanying this filing is a Sequence Listing entitled "00015-417W01_SL.xml", created on September 8, 2023 and having 872,924 bytes of data, machine formatted on IBM-PC, MS-Windows operating system. The sequence listing is hereby incorporated herein by reference in its entirety for all purposes.

BACKGROUND

[0005] Nucleic acid and protein based therapeutics to modulate healthy and diseased states are poised to enable the next frontier of human medicine. However, their successful deployment is contingent on the ability to deliver them efficiently and in a safe and targeted manner. In this regard, a host of viral and non-viral delivery formulations have been developed, but the ability to directly modulate their tropism remains challenging.

SUMMARY

[0006] The disclosure provides a method of improving tropism of a virus or other delivery agent, the method comprising: identifying ligand protein sequences derived from all known receptor-interacting ligands; systematically tiling the ligand protein sequences into 5-50, 10-20, or 20 amino acid peptides, which are inserted into surface-exposed loops of AAV capsids; and assessing the engineered capsids for their packaging capacity, in vivo tropism, and enhanced protein interactions. In one embodiment, the virus is an adeno-associated virus (AAV). In a further embodiment, the AAV is selected from the group consisting of AAV1, AAV2, AAV5, AAV6, AAV7, AAV8, and AAV9. In still a further embodiment, the AAV is AAV5 or AAV9. In another embodiment, peptide sequences were generated via pooled oligonucleotide synthesis and inserted into 4 distinct loop regions: AAV5-Loop1 (N443), AAV5-Loop2 (S576), AAV9-Loop1 (Q456), and AAV9-Loop2 (A587) to generate over 1 million AAV variants. In another embodiment, a 20-mer ligand peptide is inserted into one or both of 2 surface-exposed loops of AAV5 (SEQ ID NO: 2) or AAV9 (SEQ ID NO: 4).

[0007] The disclosure provides a recombinant vector comprising capsid proteins containing one or two peptides inserted into one or both of two surface-exposed loops in the capsid protein, wherein the vector has a desired tropism or immune-orthogonality and wherein the peptides are independently selected from the group consisting of SEQ ID NOs: 5 to 820 and 870-911. In one embodiment, the vector is an adeno-associated virus (AAV). In a further embodiment, the AAV vector is an AAV5 serotype. In still a further embodiment, the AAV5 comprises a capsid protein having a sequence as set forth in SEQ ID NO: 2. In another embodiment, the AAV vector is an AAV9 serotype. In a further embodiment, the AAV9 comprises a capsid protein having a sequence as set forth in SEQ ID NO: 4. In another embodiment, the vector has tropism to pancreas, heart, brain, lung, liver, kidney, muscle, spleen or intestine. In another or further embodiment, a peptide of SEQ ID NOs: 530-820, 870-910 or 911 is inserted into loop 1 and/or loop 2 of an AAV5 capsid or an AAV9 capsid. In another embodiment, the vector is immune orthogonal. In a further embodiment, the vector is an adeno-associated virus (AAV). In still a further embodiment, the AAV comprises a capsid protein of any one of SEQ ID NOs: 822, 824, 826, 828, 830, 832, 834, 836, 838, 840, 842, 844, 846, 848, 850, 852, 854, 856, 858, 860, 862, 864, or 866, or a sequence that is at least 85% to 99% identical to any of the foregoing sequences.

[0008] The disclosure also provides a method of making a delivery vehicle or vector with a desired tropism, the method comprising selecting a peptide sequence from any one of SEQ ID NOs: 5-820 or 870-911 and (i) cloning a nucleic acid sequence encoding the peptide into a coding sequence for a capsid protein at an exposed loop site to obtain a recombinant capsid coding sequence and producing the vector using the recombinant capsid coding sequence or (ii) inserting the peptide into an exposed surface of the delivery vehicle. In one embodiment, the vector is an adeno-associated virus (AAV). In a further embodiment, the AAV is selected from the group consisting of AAV1, AAV2, AAV5, AAV6, AAV7, AAV8, and AAV9. In still a further embodiment, the AAV is AAV5 or AAV9. In a further embodiment, the AAV5 capsid coding sequence comprises SEQ ID NO: 1. In another embodiment, the AAV9 capsid coding sequence comprises SEQ ID NO: 2. The disclosure also provides an AAV vector comprising a capsid protein modified by any of the foregoing embodiments.

[0009] The disclosure also provides an AAV vector comprising a capsid protein wherein the capsid protein expresses a peptide of any one of SEQ ID NO: 5-820 or 870-911 in surface exposed loop 1 and/or loop 2 of the capsid protein. In one embodiment, the wild-type capsid protein sequence comprises SEQ ID NO: 2 or 4. In another embodiment, the AAV vector has an AAV5 or AAV9 serotype. In still another embodiment, the vector has a desired tropism. In a further embodiment, the AAV vector has a tropism to pancreas, heart, brain, lung, liver, kidney, muscle, spleen or intestine.

[0010] The disclosure also provides a viral vector having a capsid protein comprising a heterologous targeting peptide of 10-30 amino acids in length inserted into a surface exposed portion of the capsid protein and wherein the targeting peptide is set forth in any one of SEQ ID NOs: 5-820 or 870-911. In one embodiment, the heterologous targeting peptide is about 15-25 amino acids in length. In a further embodiment, the heterologous targeting peptide is about 20 amino acids in length. In another embodiment, the viral vector is an adeno-associated virus (AAV). In still another embodiment, the viral vector is a lentiviral vector. In another embodiment, the capsid protein is a VP1 capsid protein. In still another embodiment, the capsid protein is a VP2 capsid protein. In yet another embodiment, the capsid protein is a VP3 capsid protein. In another embodiment, the heterologous targeting peptide is inserted into an AAV capsid protein at loop 1 and/or loop 2. In yet another embodiment, the viral vector is an AAV5. In another embodiment, the viral vector is an AAV9. In still another embodiment, the heterologous targeting peptide is flanked by a linker peptide at the N-terminal and C-terminal ends of the heterologous targeting peptide. In another embodiment, the heterologous targeting peptide targets the viral vector to hepatocytes or liver tissue. In yet still another embodiment, the heterologous targeting peptide targets the viral vector to neuronal cells or brain tissue. In still another embodiment, the heterologous targeting peptide targets the viral vector to pancreatic cells or pancreas tissue. In yet another embodiment, the heterologous targeting peptide targets the viral vector to cardiac cells or heart tissue. In another embodiment, the heterologous targeting peptide targets the viral vector to lung tissue. In still yet another embodiment, the heterologous targeting peptide targets the viral vector to intestinal tissue. In yet another embodiment, the heterologous targeting peptide targets the viral vector to spleen tissue. In still another embodiment, the heterologous targeting peptide targets the viral vector to renal cells or kidney tissue. In another embodiment, the heterologous targeting peptide targets the viral vector to muscle cells or tissue.

[0011] The disclosure also provides an adeno-associated virus (AAV) capsid protein comprising a heterologous targeting peptide cloned into loop 1 and/or loop 2 of the capsid protein, wherein the heterologous targeting peptide is about 10-30 amino acids in length and is contained within or comprises any one of the peptides of SEQ ID NO: 5-820 or 870-911. In one embodiment, the capsid protein is a VP1 capsid protein. In another embodiment, the capsid protein is a VP2 capsid protein. In still another embodiment, the capsid protein is a VP3 capsid protein. In another embodiment, the heterologous targeting peptide is about 15-25 amino acids in length. In yet another embodiment, the heterologous targeting peptide is about 20 amino acids in length. In still another embodiment, the heterologous targeting peptide is flanked by a linker peptide at the N-terminal and C-terminal ends of the heterologous targeting peptide. In another embodiment, the heterologous targeting peptide targets hepatocytes or liver tissue. In still another embodiment, the heterologous targeting peptide targets neuronal cells or brain tissue. In another embodiment, the heterologous targeting peptide targets pancreatic cells or pancreas tissue. In yet another embodiment, the heterologous targeting peptide targets cardiac cells or heart tissue. In another embodiment, the heterologous targeting peptide targets lung tissue. In yet another embodiment, the heterologous targeting peptide targets intestinal tissue. In another embodiment, the heterologous targeting peptide targets spleen tissue. In another embodiment, the heterologous targeting peptide targets renal cells or kidney tissue. In yet another embodiment, the heterologous targeting peptide targets muscle cells or tissue. The disclosure also provides a recombinant AAV (rAAV) comprising a capsid protein of any of the foregoing embodiments.

[0012] The disclosure provides a recombinant AAV (rAAV) comprising a capsid protein having a targeting peptide in loop 1 and/or loop 2 wherein the targeting peptide is independently selected from SEQ ID Nos: 5-820 or 870-911. In one embodiment, a targeting peptide is present in both Loop 1 and Loop 2. In a further embodiment, the targeting peptide has the same tropism. In another embodiment, the recombinant AAV further comprises a heterologous polynucleotide for gene delivery. In a further embodiment, the heterologous polynucleotide is a therapeutic gene. In still another or further embodiment, of any of the foregoing embodiments, the rAAV is present in a pharmaceutical composition.

[0013] The disclosure also provides a method for delivering a transgene to a subject comprising: administering a recombinant AAV (rAAV) to a subject, wherein the rAAV comprises: (i) a capsid protein of the disclosure, and (ii) at least one transgene, and wherein the rAAV infects cells of a target tissue of the subject. In one embodiment, the at least one transgene encodes a protein. In a further embodiment, the protein is an immunoglobulin heavy chain or light chain or fragment thereof. In another embodiment, the at least one transgene encodes a small interfering nucleic acid. In a further embodiment, the small interfering nucleic acid is a miRNA. In another embodiment, the small interfering nucleic acid is a miRNA sponge or TuD RNA that inhibits the activity of at least one miRNA in the subject or animal. In yet another embodiment, the miRNA is expressed in a cell of the target tissue. In still another embodiment, the target tissue is skeletal muscle, heart, liver, pancreas, brain or lung. In another embodiment, the transgene expresses a transcript that comprises at least one binding site for a miRNA, wherein the miRNA inhibits activity of the transgene, in a tissue other than the target tissue, by hybridizing to the binding site. In yet another embodiment, the at least one transgene encodes a gene product that mediates genome editing. In another embodiment, the transgene comprises a tissue specific promoter or inducible promoter. In a further embodiment, the tissue specific promoter is a liver-specific thyroxin binding globulin (TBG) promoter, an insulin promoter, a glucagon promoter, a somatostatin promoter, a pancreatic polypeptide (PPY) promoter, a synapsin-1 (Syn) promoter, a creatine kinase (MCK) promoter, a mammalian desmin (DES) promoter, an α-myosin heavy chain (α-MHC) promoter, or a cardiac Troponin T (cTnT) promoter. In another embodiment, the rAAV is administered intravenously, intravascularly, transdermally, intraocularly, intrathecally, orally, intramuscularly, subcutaneously, intranasally, or by inhalation. In still another embodiment, the subject is selected from a mouse, a rat, a rabbit, a dog, a cat, a sheep, a pig, and a non-human primate. In yet another embodiment, the subject is a human.

[0014] The disclosure also provides an isolated nucleic acid encoding an AAV capsid protein containing an amino acid sequence selected from the group consisting of SEQ ID No: 5-820 and 870-911.

[0015] The disclosure also provides a delivery vehicle for delivery of a small molecule drug or biological agent having a desired tropism, wherein the delivery vehicle comprises a peptide or peptide fragment of at least 10-20 amino acids of any one of SEQ ID NOs: 5-820 or 870-911. In one embodiment, the delivery vehicle is selected from the group consisting of a liposome, a nanoparticle, a bacterium, a bacteriophage, a virus-like particle (VLP), an erythrocyte ghost, and an exosome. In another embodiment, the biological agent comprises an siRNA, an antisense molecule, a protein or polypeptide, insulin, a vaccine, or an antibody. In still another embodiment, the small molecule drug comprises a chemotherapeutic agent, an anti-inflammatory, a steroid, and an antibiotic.

[0016] The disclosure also provides a biological agent having a desired tropism, the biological agent linked to a peptide or peptide fragment of at least 10-20 amino acids of any one of SEQ ID NOs: 5-820 or 870-911. In one embodiment, the biological agent is a nucleic acid, a protein, a polypeptide, a peptide, an antibody, an antibody fragment, a non-immunoglobulin binding agent, or an enzyme.

DESCRIPTION OF DRAWINGS

[0017] Figure 1A-B. Design of AAV libraries displaying peptides tiling receptor-ligands. (a) Schematic of approach for rationally engineering and characterizing AAV variants. Ligand protein sequences derived from all known receptor-interacting ligands are systematically tiled into 20 amino acid peptides which are inserted into surface-exposed loops of AAV capsids. These engineered variants were then assessed for their packaging capacity, in vivo tropism, and enhanced protein interactions. (b) Protein class distribution of the ligands which were tiled to compose the screening library. These include known receptor-interacting ligands (orange), cell membrane permeable proteins (green) and protein domains (blue), and stop codon containing negative controls (purple). Peptide sequences were generated via pooled oligonucleotide synthesis and inserted into 4 distinct loop regions: AAV5-Loop1 (N443), AAV5-Loop2 (S576), AAV9-Loop1 (Q456), and AAV9-Loop2 (A587) to generate over 1 million AAV variants. Capsid surface residues are colored according to their distance from the core of the capsid with the insertion site flanking residues shown in red.
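The tiling step described for Figure 1 can be illustrated with a minimal sketch. It assumes a simple sliding window over a ligand's amino acid sequence; the function name `tile_sequence`, the `stride` parameter, and the toy sequence are hypothetical and are not taken from the disclosure, which specifies the peptide length (20 amino acids) but not the offset between consecutive tiles.

```python
def tile_sequence(protein: str, width: int = 20, stride: int = 1) -> list[tuple[int, str]]:
    """Return (start_position, peptide) pairs covering `protein` with fixed-width windows.

    `stride` is an assumed parameter; the source does not state the tile offset.
    """
    if width <= 0 or stride <= 0:
        raise ValueError("width and stride must be positive")
    protein = protein.strip().upper()
    if len(protein) < width:
        return []  # sequence too short to yield a full-length tile
    last_start = len(protein) - width
    starts = list(range(0, last_start + 1, stride))
    # Add a final tile so the C-terminal end is covered even when the
    # stride does not land exactly on the last possible window.
    if starts[-1] != last_start:
        starts.append(last_start)
    # Report 1-based start positions, as is conventional for protein residues.
    return [(s + 1, protein[s:s + width]) for s in starts]


if __name__ == "__main__":
    # Hypothetical toy sequence, not one of the SEQ ID NOs of the disclosure.
    toy_ligand = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
    for start, peptide in tile_sequence(toy_ligand, width=20, stride=5):
        print(start, peptide)
```

Each resulting peptide would then be encoded as an oligonucleotide for insertion at one of the four loop sites; that cloning step is outside the scope of this sketch.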

[0018] Figure 2A-F. AAV library packaging analyses reveal biophysical features contributing to capsid fitness. (a) Schematic illustrating recombinant production of pooled AAV libraries in HEK293T cells. (b) Normalized abundance for each inserted peptide in the plasmid libraries versus DNA isolated from recombinantly produced AAV variant capsid libraries. Dotted-red line shows where the plasmid abundance is equal to the capsid abundance. (c) Normalized abundance for negative control peptides containing stop codons, in both the plasmid libraries and recombinantly produced AAV capsid libraries. (d) Distribution of peptide biophysical parameters relevant to packaging. Peptide charge, alpha-helical content, flexibility, and hydrophobicity distributions are shown for peptides which are enriched in the capsid pool ("Packaging") and those which are depleted ("Non Packaging"). Statistical significance between groups was calculated via a t-test (****p<0.0001). (e) Biophysical parameters of inserted peptides were used as features to train a support vector machine classifier predicting which AAV variants successfully package into capsids. The receiver operating characteristic curve is shown for the resulting model, with an area under the curve of 0.89. (f) UMAP embedding for each AAV variant, colored by packaging status. Inserted peptide charge, alpha-helical content, flexibility, and hydrophobicity were used as input features for the embedding.
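Two of the peptide features named for Figure 2 (charge and hydrophobicity) can be computed as in the following sketch. The specific definitions are assumptions, since the disclosure does not state how each parameter was calculated: net charge is approximated by counting Lys/Arg as +1 and Asp/Glu as -1 (ignoring His and the termini), and hydrophobicity is taken as the mean Kyte-Doolittle hydropathy value. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy scale (one common choice; assumed here, not
# confirmed by the source as the scale actually used).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def net_charge(peptide: str) -> int:
    """Approximate net charge near neutral pH: +1 per K/R, -1 per D/E."""
    peptide = peptide.upper()
    positive = sum(peptide.count(aa) for aa in "KR")
    negative = sum(peptide.count(aa) for aa in "DE")
    return positive - negative


def mean_hydropathy(peptide: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues of the peptide."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    # A KeyError is raised for any non-standard residue letter.
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide.upper()) / len(peptide)


def feature_vector(peptide: str) -> list[float]:
    """Two-element feature vector; helicity and flexibility are omitted here."""
    return [float(net_charge(peptide)), mean_hydropathy(peptide)]


if __name__ == "__main__":
    # Hypothetical 20-mer, not one of the SEQ ID NOs of the disclosure.
    print(feature_vector("MKTAYIAKQRQISFVKSHFS"))
```

Feature vectors of this kind, extended with alpha-helical content and flexibility, are the sort of input a support vector machine classifier could be trained on to separate packaging from non-packaging variants.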

[0019] Figure 3A-D. In vivo screen analyses enable predictive computational models of tropism. (a) Overview of in vivo screening methodology. The four AAV variant libraries were injected retro-orbitally into C57/BL6 mice in duplicate. Two weeks post-injection, nine organs were harvested from each mouse and the inserted peptide-containing region of the AAV capsid was amplified and subjected to next generation sequencing. (b) Replicate Pearson correlation for the log2 fold change (log2FC) values (organ vs. capsid) for each library and organ. (c) Overview of screen results. Significantly enriched AAV variants in a particular organ are defined as those with a log2FC > 1 and an FDR-adjusted p-value < 0.05. Bar plots show the number of significantly enriched variants detected per organ, as well as a comparison of peptide hits for each loop insertion site, for both AAV5 and AAV9. (d) Overview of classification model predicting AAV tissue tropism from peptide sequence alone. Inserted peptide sequences were converted to a binary one-hot encoding (for each peptide, 20 rows corresponding to position, and 20 columns corresponding to presence of a particular amino acid). This one-hot encoding scheme was then used as input to a convolutional neural network (CNN) to predict organ targeting. Model performance was separately evaluated on each organ, via accuracy, area under the receiver operator characteristic curve (AUROC), F1 score, and Matthews Correlation Coefficient (MCC). Models were trained on a portion of the data, and the remainder was held out as a validation dataset to evaluate performance. Figure discloses SEQ ID NOs: 867-869, respectively, in order of appearance.
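The one-hot encoding and the hit definition described for Figure 3 can be sketched as follows. The column order (alphabetical by one-letter code) and the function names are assumptions made for illustration; the disclosure specifies only the 20 x 20 shape of the encoding and the log2FC and FDR thresholds.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed column order, alphabetical by letter


def one_hot_encode(peptide: str, length: int = 20) -> list[list[int]]:
    """Encode a peptide as a `length` x 20 binary matrix (rows: position, columns: residue)."""
    peptide = peptide.upper()
    if len(peptide) != length:
        raise ValueError(f"expected a peptide of length {length}, got {len(peptide)}")
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(length)]
    for position, residue in enumerate(peptide):
        # str.index raises ValueError for residues outside the 20 standard amino acids.
        matrix[position][AMINO_ACIDS.index(residue)] = 1
    return matrix


def is_significantly_enriched(log2fc: float, fdr_adjusted_p: float) -> bool:
    """Apply the stated hit criteria: log2FC > 1 and FDR-adjusted p-value < 0.05."""
    return log2fc > 1 and fdr_adjusted_p < 0.05


if __name__ == "__main__":
    # Hypothetical 20-mer, not one of the SEQ ID NOs of the disclosure.
    encoded = one_hot_encode("MKTAYIAKQRQISFVKSHFS")
    print(len(encoded), len(encoded[0]))
    print(is_significantly_enriched(log2fc=1.7, fdr_adjusted_p=0.01))
```

A matrix of this shape could be passed to a convolutional network as a single-channel input, with one binary label per organ derived from the hit criteria.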

[0020] Figure 4A-D. In vivo screen identified AAV variants demonstrate reprogrammed tropism. (a) Heatmap showing log2FC values for each AAV variant which was significantly enriched in at least one organ. Rows are individual variants, and columns are organs (n=2 per organ). (b) UMAP embedding of significantly enriched AAV variants. Each dot represents a variant, colored by the organ with max log2FC. (c) AAVs were chosen for validation from the pool of significant hits on the basis of their tissue specificity, broad tropism, and/or internal consistency. Internal consistency was quantified by counting the number of similar (>50% homology) inserted peptides also detected as hits for a given organ. AAV variants were characterized structurally via transmission electron microscopy, and functionally via delivery of the mCherry transgene in vivo. Heatmaps depict all variants chosen for validation (n=21), with the left heatmap showing the Z-normalized log2FC values from the pooled in vivo screen (n=2), and the right heatmap showing the Z-normalized mCherry expression quantified by RT-qPCR, relative to AAV9 (n=2). (d) Comparison between liver RT-qPCR quantification of mCherry delivery, versus protein level quantification via fluorescent microscopy.

[0021] Figure 5A-D. Characterization and mechanistic exploration of AAV variants with displayed ligand peptides. (a) Full characterization experiments for the variant AAV9.DKK1. In the top left, lung screen count values for all AAV9Loop2 variants with inserted DKK1 derived peptides are shown. The x-axis indicates the position in the DKK1 structure a given peptide starts on. Shown in blue are the lung counts, and shown in orange are the capsid counts. Red arrow indicates the location of the peptide inserted in AAV9.DKK1. Bar plot shows the RT-qPCR quantification (n=2) of mCherry transgene expression via individual validation, normalized to that of AAV9. Also shown are electron microscope images for AAV variant capsids, as well as fluorescent microscopy of mCherry protein expression levels in the lung and liver. (b) Knockout experiments to validate AAV9.DKK1 receptor dependency. Cas9-containing lentivirus was produced with either non-targeting control (NTC) or LRP6-targeting sgRNA (n=2 each). HEK293T cells were then transduced with lentivirus and positively selected with puromycin. Lentivirally transduced cells were then plated in a 24-well plate and transduced with either wild-type AAV9 (4x10^9 viral genomes) or AAV9.DKK1 virus (1x10 s viral genomes). Flow cytometry was then used to quantify the mean fluorescence intensity (MFI) of the infected cells. Shown to the right is the crystal structure of a 7-mer DKK1 peptide (contained within AAV9.DKK1) in complex with LRP6 (58). (c) Full characterization experiments for the variant AAV9.PDGFC. In the top left, muscle screen count values for all AAV9Loop1 variants with inserted PDGFC derived peptides are shown. The x-axis indicates the position in the PDGFC structure a given peptide starts on. Shown in blue are the muscle counts, and shown in orange are the capsid counts. Red line indicates the location of the peptide inserted in AAV9.PDGFC. Bar plot shows the RT-qPCR quantification (n=2) of mCherry transgene expression via individual validation, normalized to that of AAV9. Also shown are electron micrographs for variant capsids, as well as fluorescent microscopy of mCherry protein expression levels in the heart and muscle. (d) Receptor overexpression studies for the variant AAV9.PDGFC. Known receptors for the PDGFC ligand were cloned into an overexpression plasmid and transfected into HEK293T cells in a 24-well plate. After 24 hours, either AAV9 (4x10 s viral genomes) or AAV9.PDGFC (4x10 s viral genomes) were used to transduce the cells. Following 24 hours of transduction, the cells were collected and mCherry expression was quantified via flow cytometry. Bar plots show the MFI normalized to the average MFI of the AAV9 transduced cells with an empty vector overexpressed. Statistical significance between groups was calculated via a T-test (*p<0.05, **p<0.01, ***p<0.001, ****p<0.0001).
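Panels (a) and (c) index each inserted peptide by the position in the parent ligand at which it starts. A minimal sketch of how such start-indexed tiles could be enumerated from a ligand sequence is given below. The 20-residue default length follows the 20-mer peptides recited in the claims; the stride parameter, the 0-based start index, and the function name are assumptions for illustration, since the tiling step size is not stated here.

```python
def tile_ligand(sequence, length=20, stride=1):
    """Enumerate (start, peptide) tiles of a fixed length across a ligand.

    `start` is a 0-based index into `sequence`. `stride` (an assumed
    parameter) controls the offset between consecutive tiles. Ligands
    shorter than `length` yield no tiles.
    """
    last_start = len(sequence) - length
    return [
        (start, sequence[start:start + length])
        for start in range(0, last_start + 1, stride)
    ]
```

Plotting per-tile counts against `start` gives the kind of position-resolved profile shown in the top-left plots of panels (a) and (c).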

[0022] Figure 6A-D. Inserted peptide facilitates efficient liver de-targeting across multiple AAV scaffolds in a mouse strain-independent manner. (a) Significantly enriched AAV variants from all capsids are projected into 2 dimensions via UMAP. The distance between points is calculated explicitly via the Levenshtein distance of the amino acid sequences of the inserted peptides, and subsequently embedded via UMAP. AAV variants are colored by their log2 fold change in the brain. Select variant clusters which are highly enriched in the brain are highlighted. AAV variants are also colored by capsid insertion site in the embedding on the right. (b) Brain and capsid counts for all AAV5Loop2 and AAV9Loop1 variants with inserted APOA1 derived peptides. Shown in blue are the brain counts, and shown in orange are the capsid counts. Red arrow indicates the location of the peptide inserted in AAV5.APOA1 and AAV9.APOA1 respectively. Also shown are electron micrographs for AAV5.APOA1 and AAV9.APOA1. (c) RT-qPCR values (transduction relative to AAV9) for individual in vivo validations of AAV5.APOA1 and AAV9.APOA1 in C57BL/6 mice, confirming liver detargeting. (d) Performance of AAV5.APOA1 in BALB/c mice. Bar graph shows RT-qPCR values (transduction relative to AAV9 in BALB/c), for AAV5.APOA1 across 8 organs, confirming liver detargeting is strain independent.

[0023] Figure 7A-E. Novel immune-orthogonal AAV serotype mining, characterization and tropism reprogramming via displayed ligand peptides. (a) Schematic of the computational pipeline used to identify novel AAV serotypes for testing. Basic local alignment search tool (BLAST) was used to identify 687 initial capsids with sequence homology to the AAV2 cap gene. This initial list was then filtered to exclude truncated genomes, redundant samples, human and non-mammalian serotypes, as well as close orthologs. The final resulting list contained 23 AAV capsid sequences for subsequent investigation. (b) Hierarchical clustering dendrogram of AAV capsid sequences. Shown in red are previously identified AAV serotypes currently in use. Novel AAVs identified and used for downstream testing are shown in black. (c) Two stage filtering of novel AAVs. AAVs were first assessed by measuring their ability to package, and then by their ability to transduce the liver (using an mCherry transgene) in vivo. All values shown relative to the orthologous wild-type AAV5. (d) The four novel AAVs which could infect the liver (AAV MM2, AAV MG2, AAV MG1 and AAV CHI) were tested for immune cross-reactivity with AAV8. Mice were immunized with the indicated AAV, and then 3 weeks post-injection tested for antibody cross reactivity via an ELISA. (e) The PDGFC peptide from AAV9.PDGFC was inserted onto loop1 of AAV MG2 to yield AAV MG2.PDGFC. AAV MG2 and AAV MG2.PDGFC were injected into C57BL/6 mice, quantifying muscle transduction via RT-qPCR after three weeks. Bar plots show muscle transduction relative to wild-type AAV MG2. Statistical significance between groups was calculated via a T-test (*p<0.05, **p<0.01, ***p<0.001, ****p<0.0001).
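The Levenshtein distance used for the embedding in Figure 6(a) counts the minimum number of single-residue insertions, deletions, and substitutions needed to turn one peptide into another. The following Python sketch shows the standard dynamic-programming computation and a pairwise distance matrix that could serve as precomputed input to an embedding method. The function names are hypothetical, and the UMAP step itself is not shown.

```python
def levenshtein(a, b):
    """Edit distance between two peptide sequences (row-by-row DP)."""
    previous = list(range(len(b) + 1))
    for i, residue_a in enumerate(a, start=1):
        current = [i]
        for j, residue_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                            # deletion
                current[j - 1] + 1,                         # insertion
                previous[j - 1] + (residue_a != residue_b)  # substitution
            ))
        previous = current
    return previous[-1]


def distance_matrix(peptides):
    """Symmetric matrix of pairwise Levenshtein distances."""
    return [[levenshtein(p, q) for q in peptides] for p in peptides]
```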

DETAILED DESCRIPTION

[0024] As used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a cell" includes a plurality of such cells and reference to "the fragment" includes reference to one or more fragments and equivalents thereof known to those skilled in the art, and so forth.

[0025] Also, the use of "or" means "and/or" unless stated otherwise. Similarly, "comprise," "comprises," "comprising," "include," "includes," and "including" are interchangeable and not intended to be limiting.

[0026] It is to be further understood that where descriptions of various embodiments use the term "comprising," those skilled in the art would understand that in some specific instances, an embodiment can be alternatively described using language "consisting essentially of" or "consisting of."

[0027] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood to one of ordinary skill in the art to which this disclosure belongs. Although many methods and reagents are similar or equivalent to those described herein, the exemplary methods and materials are disclosed herein.

[0028] All publications mentioned herein are incorporated herein by reference in full for the purpose of describing and disclosing the methodologies, which might be used in connection with the description herein. Moreover, with respect to any term that is presented in one or more publications that is similar to, or identical with, a term that has been expressly defined in this disclosure, the definition of the term as expressly provided in this disclosure will control in all respects.

[0029] It should be understood that this disclosure is not limited to the particular methodology, protocols, and reagents, etc., described herein and as such may vary. The terminology used herein is for the purpose of describing particular embodiments or aspects only and is not intended to limit the scope of the present disclosure.

[0030] Other than in the operating examples, or where otherwise indicated, all numbers expressing quantities of ingredients or reaction conditions used herein should be understood as modified in all instances by the term "about". The term "about" when used to describe the present invention, in connection with percentages means ±1%. The term "about", as used herein can mean within an acceptable error range for the particular value as determined by one of ordinary skill in the art, which can depend in part on how the value is measured or determined, e.g., the limitations of the measurement system. Alternatively, "about" can mean a range of plus or minus 20%, plus or minus 10%, plus or minus 5%, or plus or minus 1% of a given value. Alternatively, particularly with respect to biological systems or processes, the term can mean within an order of magnitude, within 5-fold, or within 2-fold, of a value. Where particular values are described in the application and claims, unless otherwise stated the term "about" meaning within an acceptable error range for the particular value can be assumed. Also, where ranges and/or subranges of values are provided, the ranges and/or subranges can include the endpoints of the ranges and/or subranges. In some cases, variations can include an amount or concentration of 20%, 10%, 5%, 1 %, 0.5%, or even 0.1 % of the specified amount.

[0031] For the recitation of numeric ranges herein, each intervening number there between with the same degree of precision is explicitly contemplated. For example, for the range of 6-9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the number 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, and 7.0 are explicitly contemplated.
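The enumeration rule in the preceding paragraph can be illustrated with a short Python sketch that lists every value between two endpoints at the precision in which the endpoints are written. The function name is hypothetical, and the sketch assumes both endpoints are given as strings with the same number of decimal places.

```python
from decimal import Decimal


def contemplated_values(low, high):
    """List every value from `low` to `high` (inclusive) at the precision
    in which `low` is written, e.g. "6.0" implies steps of 0.1."""
    start, stop = Decimal(low), Decimal(high)
    # The exponent of the written value gives its precision: 0 for "6",
    # -1 for "6.0", so the step is 1 or 0.1 respectively.
    step = Decimal(10) ** start.as_tuple().exponent
    values = []
    value = start
    while value <= stop:
        values.append(str(value))
        value += step
    return values
```

Called with "6" and "9" this gives the four integers of the first example, and with "6.0" and "7.0" the eleven values of the second.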

[0032] The term "adeno-associated virus" or "AAV" as used herein refers to a member of the class of viruses associated with this name and belonging to the genus Dependoparvovirus, family Parvoviridae. Multiple serotypes of this virus are known to be suitable for gene delivery; all known serotypes can infect cells from various tissue types. Non-limiting exemplary serotypes useful in the methods disclosed herein include any of the 11 or 12 serotypes, e.g., AAV2, AAV5, and AAV8, or variant serotypes such as AAV-DJ. The AAV structural particle is composed of 60 protein molecules made up of VP1, VP2 and VP3. Each particle contains approximately 5 VP1 proteins, 5 VP2 proteins and 50 VP3 proteins ordered into an icosahedral structure. Non-limiting exemplary VP1 sequences useful in the methods disclosed herein are provided below.

[0033] The term "amino acid" refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. In some embodiments, an amino acid analog refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α-carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. In some embodiments, an amino acid mimetic refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that function in a manner similar to a naturally occurring amino acid. The terms "non-naturally occurring amino acid" and "unnatural amino acid" refer to amino acid analogs, synthetic amino acids, and amino acid mimetics which are not found in nature. In certain instances, one or more D-amino acids can be used in various peptide compositions of the disclosure. The disclosure provides various peptides that are useful for treating various diseases and infections. These peptides can comprise naturally occurring amino acids. In other embodiments, the peptides can comprise non-natural amino acids. The use of non-natural amino acids can improve the peptide's stability, decrease degradation and/or improve biological activity. For example, in some embodiments, the peptides comprise one or more D-amino acids. In other embodiments, retroinverso peptides are contemplated using various amino acid configurations.

[0034] Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.

[0035] The term "Cas9" refers to a CRISPR-associated, RNA-guided endonuclease such as Streptococcus pyogenes Cas9 (spCas9; see Accession Number Q99ZW2.1, the sequence of which is incorporated herein by reference) and orthologs and biological equivalents thereof. Biological equivalents of Cas9 include, but are not limited to, C2c1 from Alicyclobacillus acidoterrestris and Cpf1 (which performs cutting/cleaving functions analogous to Cas9) from various bacterial species including Acidaminococcus spp. and Francisella novicida U112. Cas9 may refer to an endonuclease that causes double stranded breaks in DNA, a nickase variant such as a RuvC or HNH mutant that causes a single stranded break in DNA, as well as other variations such as dead Cas9 ("dCas9"), which lacks endonuclease activity. Cas9 may also refer to "split-Cas9" in which Cas9 is split into two halves, a C-terminal Cas9 (C-Cas9) and an N-terminal Cas9 (N-Cas9), which can be fused with two intein moieties. See, e.g., U.S. Pat. No. 9,074,199 B1; Zetsche et al. (2015) Nat Biotechnol. 33(2): 139-42; Wright et al. (2015) PNAS 112(10): 2984-89. Non-limiting examples of commercially available sources of SpCas9 comprising plasmids can be found under the following Addgene reference numbers:

42230: PX330; SpCas9 and single guide RNA;

48138: PX458; SpCas9-2A-EGFP and single guide RNA;

62988: PX459; SpCas9-2A-Puro and single guide RNA;

48873: PX460; SpCas9n (D10A nickase) and single guide RNA;

48140: PX461; SpCas9n-2A-EGFP (D10A nickase) and single guide RNA;

62987: PX462; SpCas9n-2A-Puro (D10A nickase) and single guide RNA; and

48137: PX165; SpCas9; all of which are incorporated herein by reference.

[0036] As used herein, the term "CRISPR" refers to Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR). CRISPR may also refer to a technique or system of sequence-specific genetic manipulation relying on the CRISPR pathway. A CRISPR recombinant expression system can be programmed to cleave a target polynucleotide using a CRISPR endonuclease and a guide RNA. A CRISPR system can be used to cause double stranded or single stranded breaks in a target polynucleotide. A CRISPR system can also be used to recruit proteins or label a target polynucleotide. In some aspects, CRISPR-mediated gene editing utilizes the pathways of non-homologous end-joining (NHEJ) or homologous recombination to perform the edits. These applications of CRISPR technology are known and widely practiced in the art. See, e.g., U.S. Pat. No. 8,697,359 and Hsu et al. (2014) Cell 156(6): 1262-1278.

[0037] As used herein, the term "delivery vehicle" refers to a composition useful for delivering a payload to a cell, tissue or subject. The delivery vehicle can deliver various payloads including biological agents and small molecule drugs. Exemplary, but non-limiting, delivery vehicles include a liposome, a nanoparticle, a bacterium, a bacteriophage, a virus-like particle (VLP), an erythrocyte ghost, and an exosome.

[0038] As used herein, the term "domain" can refer to a particular region of a larger molecule (e.g. , a particular region of a protein or polypeptide) , which can be associated with a particular function. For example, "a domain which binds to a cognate" can refer to the domain of a protein that binds one or more receptors or other protein moieties. Similarly, a corresponding coding sequence for a particular polypeptide domain can be referred to as a polynucleotide domain.

[0039] The term "encode" as it is applied to polynucleotides can refer to a polynucleotide which is said to "encode" a polypeptide if, in its native state or when manipulated by methods well known to those skilled in the art, it can be transcribed and/or translated to produce the mRNA for the polypeptide and/or a fragment thereof. In some cases, the antisense strand is the complement of such a nucleic acid, and the encoding sequence can be deduced therefrom.

[0040] The terms "equivalent" or "biological equivalent" are used interchangeably when referring to a particular molecule, biological, or cellular material and intend those having minimal homology while still maintaining desired structure or functionality.

[0041] As used herein, "expression" can refer to the process by which polynucleotides are transcribed into mRNA and/or the process by which the transcribed mRNA is subsequently being translated into peptides, polypeptides, or proteins. If the polynucleotide is derived from genomic DNA, expression can include splicing of the mRNA in a eukaryotic cell.

[0042] As used herein, the term "functional" may be used to modify any molecule, biological, or cellular material to intend that it accomplishes a particular, specified effect.

[0043] The term "gRNA" or "guide RNA" as used herein refers to the guide RNA sequences used to target specific genes for correction employing the CRISPR technique. Techniques of designing gRNAs and donor therapeutic polynucleotides for target specificity are well known in the art. See, for example, Doench, J., et al. Nature Biotechnology 2014; 32(12): 1262-7; Mohr, S. et al. (2016) FEBS Journal 283: 3232-38; and Graham, D., et al. Genome Biol. 2015; 16: 260. gRNA comprises or alternatively consists essentially of, or yet further consists of a fusion polynucleotide comprising CRISPR RNA (crRNA) and trans-activating CRISPR RNA (tracrRNA); or a polynucleotide comprising CRISPR RNA (crRNA) and trans-activating CRISPR RNA (tracrRNA). In some aspects, a gRNA is synthetic (Kelley, M. et al. J of Biotechnology 233 (2016) 74-83).

[0044] "Homology" or "identity" or "similarity" can refer to sequence similarity between two peptides or between two nucleic acid molecules. Homology can be determined by comparing a position in each sequence which can be aligned for purposes of comparison. For example, when a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position. A degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences. An "unrelated" or "non-homologous" sequence shares less than 40% identity, or alternatively less than 25% identity, with one of the sequences of the disclosure.

[0045] Homology refers to a percent (%) identity of a sequence to a reference sequence. As a practical matter, any particular sequence can be at least 50%, 60%, 70%, 80%, 85%, 90%, 92%, 95%, 96%, 97%, 98% or 99% identical to any sequence described herein. Whether such particular peptide, polypeptide or nucleic acid sequence has a particular identity/homology can be determined conventionally using known computer programs such as the Bestfit program (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics Computer Group, University Research Park, 575 Science Drive, Madison, Wis. 53711). When using Bestfit or any other sequence alignment program to determine whether a particular sequence is, for instance, 95% identical to a reference sequence, the parameters can be set such that the percentage of identity is calculated over the full length of the reference sequence and that gaps in homology of up to 5% of the total reference sequence are allowed. A number of sequences are provided herein, and it is contemplated that sequences having at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of the sequences herein find use in any of the compositions and methods described herein.

[0046] For example, in a specific embodiment the identity between a reference sequence (query sequence, i.e., a sequence of the disclosure) and a subject sequence, also referred to as a global sequence alignment, can be determined using the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. 6:237-245 (1990)). In some cases, parameters for a particular embodiment in which identity is narrowly construed, used in a FASTDB amino acid alignment, can include: Scoring Scheme=PAM (Percent Accepted Mutations) 0, k-tuple=2, Mismatch Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1, Window Size=sequence length, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of the subject sequence, whichever is shorter. According to this embodiment, if the subject sequence is shorter than the query sequence due to N- or C-terminal deletions, not because of internal deletions, a manual correction can be made to the results to take into consideration the fact that the FASTDB program does not account for N- and C-terminal truncations of the subject sequence when calculating global percent identity. For subject sequences truncated at the N- and C-termini, relative to the query sequence, the percent identity can be corrected by calculating the number of residues of the query sequence that are lateral to the N- and C-terminal of the subject sequence, which are not matched/aligned with a corresponding subject residue, as a percent of the total bases of the query sequence. A determination of whether a residue is matched/aligned can be determined by results of the FASTDB sequence alignment. This percentage can be then subtracted from the percent identity, calculated by the FASTDB program using the specified parameters, to arrive at a final percent identity score. This final percent identity score can be used for the purposes of this embodiment. In some cases, only residues to the N- and C-termini of the subject sequence, which are not matched/aligned with the query sequence, are considered for the purposes of manually adjusting the percent identity score. That is, only query residue positions outside the farthest N- and C-terminal residues of the subject sequence are considered for this manual correction. For example, a 90 residue subject sequence can be aligned with a 100 residue query sequence to determine percent identity. The deletion occurs at the N-terminus of the subject sequence and therefore, the FASTDB alignment does not show a matching/alignment of the first 10 residues at the N-terminus. The 10 unpaired residues represent 10% of the sequence (number of residues at the N- and C-termini not matched/total number of residues in the query sequence) so 10% is subtracted from the percent identity score calculated by the FASTDB program. If the remaining 90 residues were perfectly matched the final percent identity can be 90%. In another example, a 90 residue subject sequence is compared with a 100 residue query sequence. This time the deletions are internal deletions so there are no residues at the N- or C-termini of the subject sequence which are not matched/aligned with the query. In this case the percent identity calculated by FASTDB is not manually corrected. Once again, only residue positions outside the N- and C-terminal ends of the subject sequence, as displayed in the FASTDB alignment, which are not matched/aligned with the query sequence are manually corrected for.

[0047] "Hybridization" can refer to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding can occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. The complex can comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these. A hybridization reaction can constitute a step in a more extensive process, such as the initiation of a PCR reaction, or the enzymatic cleavage of a polynucleotide by a ribozyme.
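The manual percent-identity correction described in paragraph [0046] amounts to subtracting, from the FASTDB score, the percentage of query residues that overhang the N- and C-terminal ends of the subject sequence. A minimal arithmetic sketch follows. The function and parameter names are hypothetical, and the FASTDB score is taken as an input rather than computed.

```python
def corrected_percent_identity(fastdb_percent, query_length,
                               unmatched_n_term=0, unmatched_c_term=0):
    """Apply the terminal-truncation correction to a FASTDB percent identity.

    `unmatched_n_term` and `unmatched_c_term` count query residues lying
    outside the farthest N- and C-terminal residues of the subject that
    have no aligned subject residue. Internal deletions are not counted.
    """
    overhang_percent = 100.0 * (unmatched_n_term + unmatched_c_term) / query_length
    return fastdb_percent - overhang_percent


# First worked example from the text: 100-residue query, 90-residue subject
# missing its first 10 residues, remaining 90 residues perfectly matched.
n_terminal_case = corrected_percent_identity(100.0, 100, unmatched_n_term=10)

# Second worked example: deletions are internal, so nothing is subtracted
# (95.0 is an arbitrary illustrative FASTDB score).
internal_case = corrected_percent_identity(95.0, 100)
```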

[0048] Examples of stringent hybridization conditions include: incubation temperatures of about 25°C to about 37°C; hybridization buffer concentrations of about 6x SSC to about 10x SSC; formamide concentrations of about 0% to about 25%; and wash solutions from about 4x SSC to about 8x SSC. Examples of moderate hybridization conditions include: incubation temperatures of about 40°C to about 50°C; buffer concentrations of about 9x SSC to about 2x SSC; formamide concentrations of about 30% to about 50%; and wash solutions of about 5x SSC to about 2x SSC. Examples of high stringency conditions include: incubation temperatures of about 55°C to about 68°C; buffer concentrations of about 1x SSC to about 0.1x SSC; formamide concentrations of about 55% to about 75%; and wash solutions of about 1x SSC, 0.1x SSC, or deionized water. In general, hybridization incubation times are from 5 minutes to 24 hours, with 1, 2, or more washing steps, and wash incubation times are about 1, 2, or 15 minutes. SSC is 0.15 M NaCl and 15 mM citrate buffer. It is understood that equivalents of SSC using other buffer systems can be employed.
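Since 1x SSC is 0.15 M NaCl and 15 mM citrate, the composition of any of the SSC strengths listed above follows by simple scaling. A small sketch, with a hypothetical function name:

```python
def ssc_composition(fold):
    """Return NaCl (molar) and citrate (millimolar) concentrations for a
    given SSC strength, scaling from 1x SSC = 0.15 M NaCl, 15 mM citrate."""
    return {"NaCl_M": 0.15 * fold, "citrate_mM": 15.0 * fold}
```

For instance, 6x SSC corresponds to 0.9 M NaCl and 90 mM citrate, and 0.1x SSC to 0.015 M NaCl and 1.5 mM citrate.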

[0049] As used herein, the term "immune orthogonal" refers to a lack of immune cross-reactivity between two or more antigens. In some embodiments, the antigens are proteins (e.g., Cas9) . In some embodiments, the antigens are viral antigens associated with a particular viral vector (e.g. , AAV) . As is recognized in the art, antigens typically include antigenic determinants having a particular sequence of 3 dimensional structure. Moreover, an antigenic determinant can comprise a domain or subsequence of a larger polypeptide or molecular sequence. In some embodiments, antigens that are immune orthogonal do not share an amino acid sequence of greater than 5, greater than 6, greater than 7, greater than 8, greater than 9, greater than 10, greater than 11, greater than 12, greater than 13, greater than 14, greater than 15, or greater than 16 consecutive amino acids. In some embodiments, antigens that are immune orthogonal do not share any highly immunogenic peptides. In some embodiments, antigens that are immune orthogonal do not share affinity for a major histocompatibility complex (e.g. , MHC class I or class II) . Antigens that are immune orthogonal are amenable for sequential dosing to evade a host immune system.
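One of the criteria above, that immune orthogonal antigens do not share more than a given number of consecutive amino acids, can be checked by finding the longest stretch of residues common to two sequences. The Python sketch below does this with a standard longest-common-substring dynamic program. The function names and the default threshold of 5 are illustrative assumptions, and the check addresses only this one criterion, not MHC affinity or immunogenicity.

```python
def longest_shared_run(seq_a, seq_b):
    """Length of the longest stretch of consecutive residues found in both
    sequences (longest common substring, O(len(a) * len(b)))."""
    best = 0
    previous = [0] * (len(seq_b) + 1)
    for residue_a in seq_a:
        current = [0] * (len(seq_b) + 1)
        for j, residue_b in enumerate(seq_b, start=1):
            if residue_a == residue_b:
                # Extend the run that ended at the previous residue pair.
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


def shares_long_run(seq_a, seq_b, max_shared=5):
    """True if the two sequences share MORE than `max_shared` consecutive
    amino acids, i.e. they fail this particular orthogonality criterion."""
    return longest_shared_run(seq_a, seq_b) > max_shared
```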

[0050] The term "immunosilent" refers to an epitope or foreign peptide, polypeptide or protein that does not elicit an immune response from a host upon administration. In some embodiments, the peptide, polypeptide or protein does not elicit an adaptive immune response. In some embodiments, the peptide, polypeptide or protein does not elicit an innate immune response. In some embodiments, the peptide, polypeptide or protein does not elicit either an adaptive or an innate immune response. In some embodiments, an immunosilent peptide, polypeptide or protein has reduced immunogenicity.

[0051] The term "isolated" as used herein can refer to molecules or biologicals or cellular materials being substantially free from other materials. In one aspect, the term "isolated" can refer to nucleic acid, such as DNA or RNA, or protein or polypeptide (e.g., an antibody or derivative thereof), or cell or cellular organelle, or tissue or organ, separated from other DNAs or RNAs, or proteins or polypeptides, or cells or cellular organelles, or tissues or organs, respectively, that are present in the natural source. The term "isolated" also can refer to a nucleic acid or peptide that is substantially free of cellular material, viral material, or culture medium when produced by recombinant DNA techniques, or chemical precursors or other chemicals when chemically synthesized.

Moreover, an "isolated nucleic acid" is meant to include nucleic acid fragments which are not naturally occurring as fragments and may not be found in the natural state. In some cases, the term "isolated" is also used herein to refer to polypeptides which are isolated from other cellular proteins and is meant to encompass both purified and recombinant polypeptides. In some cases, the term "isolated" is also used herein to refer to cells or tissues that are isolated from other cells or tissues and is meant to encompass both cultured and engineered cells or tissues.

[0052] "Messenger RNA" or "mRNA" is a nucleic acid molecule that is transcribed from DNA and then processed to remove non-coding sections known as introns. In some cases, the resulting mRNA is exported from the nucleus (or another locus where the DNA is present) and translated into a protein. The term "pre-mRNA" can refer to the strand prior to processing to remove non-coding sections. mRNA has "U" in place of "T" in cDNA coding sequences.

[0053] The term "ortholog" is used in reference to another gene or protein and intends a homolog of said gene or protein that evolved from the same ancestral source or which is evolved artificially using molecular biology and genetic engineering. Orthologs may or may not retain the same function as the gene or protein to which they are orthologous. Non-limiting examples of Cas9 orthologs include S. aureus Cas9 ("saCas9"), S. thermophilus Cas9, L. pneumophila Cas9, N. lactamica Cas9, N. meningitidis Cas9, B. longum Cas9, A. muciniphila Cas9, and O. laneus Cas9.

[0054] The term "payload" refers to therapeutic and diagnostic agents that can be loaded into or onto a delivery vehicle. Such payloads include biological and small molecule entities. Exemplary payload agents include small molecule drugs, biological molecules, viruses, therapeutic agents, prodrugs, gene silencing agents, chemotherapeutics, diagnostic agents, and/or components of gene editing systems. Examples of biological molecules include, but are not limited to, nucleic acids (e.g., DNA, RNA, mRNA, modified mRNA, small RNAs, siRNA, miRNA, genes, and transgenes), peptides/proteins (including antibodies, enzymes, transcription factors, etc.), viruses, hormones, carbohydrates, lipids, and vitamins. Examples of gene silencing agents include siRNA, shRNAs, miRs, ribozymes, morpholinos, and esiRNAs. Examples of gene editing systems include, but are not limited to, CRISPR-Cas systems, zinc finger nucleases, and TALENs. Examples of diagnostic agents include, but are not limited to, dyes and stains, radioactive tracers, and contrast agents. Examples of anticancer agents and chemotherapeutics that can be used with or loaded into a delivery vehicle include, but are not limited to, alkylating agents such as thiotepa and CYTOXAN® cyclophosphamide; alkyl sulfonates such as busulfan, improsulfan and piposulfan; aziridines such as benzodopa, carboquone, meturedopa, and uredopa; ethylenimines and methylamelamines including altretamine, triethylenemelamine, triethylenephosphoramide, triethylenethiophosphoramide and trimethylolomelamine; acetogenins (e.g., bullatacin and bullatacinone); a camptothecin (including the synthetic analogue topotecan); bryostatin; callystatin; CC-1065 (including its adozelesin, carzelesin and bizelesin synthetic analogues); cryptophycins (particularly cryptophycin 1 and cryptophycin 8); dolastatin; duocarmycin (including the synthetic analogues, KW-2189 and CB1-TM1); eleutherobin; pancratistatin; a sarcodictyin; spongistatin; nitrogen mustards such as chlorambucil, chlornaphazine, cholophosphamide, estramustine, ifosfamide, mechlorethamine, mechlorethamine oxide hydrochloride, melphalan, novembichin, phenesterine, prednimustine, trofosfamide, uracil mustard; nitrosureas such as carmustine, chlorozotocin, fotemustine, lomustine, nimustine, and ranimustine; vinca alkaloids; epipodophyllotoxins; antibiotics such as the enediyne antibiotics (e.g., calicheamicin, especially calicheamicin gamma1I and calicheamicin omegaI1; L-asparaginase; anthracenedione substituted urea; methyl hydrazine derivatives; dynemicin, including dynemicin A; bisphosphonates, such as clodronate; an esperamicin; as well as neocarzinostatin chromophore and related chromoprotein enediyne antibiotic chromophores), aclacinomysins, actinomycin, anthramycin, azaserine, bleomycins, cactinomycin, carabicin, carminomycin, carzinophilin, chromomycinis, dactinomycin, daunorubicin, detorubicin, 6-diazo-5-oxo-L-norleucine, ADRIAMYCIN® doxorubicin (including morpholino-doxorubicin, cyanomorpholino-doxorubicin, 2-pyrrolino-doxorubicin and deoxydoxorubicin), epirubicin, esorubicin, idarubicin, marcellomycin, mitomycins such as mitomycin C, mycophenolic acid, nogalamycin, olivomycins, peplomycin, potfiromycin, puromycin, quelamycin, rodorubicin, streptonigrin, streptozocin, tubercidin, ubenimex, zinostatin, zorubicin; antimetabolites such as methotrexate and 5-fluorouracil (5-FU); folic acid analogs such as denopterin, methotrexate, pteropterin, trimetrexate; purine analogs such as fludarabine, 6-mercaptopurine, thiamiprine, thioguanine; pyrimidine analogs such as ancitabine, azacitidine, 6-azauridine, carmofur, cytarabine, dideoxyuridine, doxifluridine, enocitabine, floxuridine; androgens such as calusterone, dromostanolone propionate, epitiostanol, mepitiostane, testolactone; anti-adrenals such as aminoglutethimide, mitotane, trilostane; folic acid replenisher such as folinic acid; aceglatone; aldophosphamide glycoside; aminolevulinic acid; eniluracil; amsacrine; bestrabucil; bisantrene; edatraxate; defofamine; demecolcine; diaziquone; elfornithine; elliptinium acetate; an epothilone; etoglucid; gallium nitrate; hydroxyurea; lentinan; lonidainine; maytansinoids such as maytansine and ansamitocins; mitoguazone; mitoxantrone; mopidanmol; nitiaerine; pentostatin; phenamet; pirarubicin; losoxantione; podophyllinic acid; 2-ethylhydrazide; procarbazine; PSK® polysaccharide complex (JHS Natural Products, Eugene, Oreg.); razoxane; rhizoxin; sizofiran; spirogermanium; tenuazonic acid; triaziquone; 2,2',2''-trichlorotriethylamine; trichothecenes (especially T-2 toxin, verracurin A, roridin A and anguidine); urethan; vindesine; dacarbazine; mannomustine; mitobronitol; mitolactol; pipobroman; gacytosine; arabinoside ("Ara-C"); cyclophosphamide; thiotepa; taxoids, e.g., TAXOL® paclitaxel (Bristol-Myers Squibb Oncology, Princeton, N.J.), ABRAXANE® Cremophor-free, albumin-engineered nanoparticle formulation of paclitaxel (American Pharmaceutical Partners, Schaumberg, Ill.), and TAXOTERE® (docetaxel) (Rhone-Poulenc Rorer, Antony, France); chloranbucil; GEMZAR® (gemcitabine); 6-thioguanine; mercaptopurine; methotrexate; platinum coordination complexes such as cisplatin, oxaliplatin and carboplatin; vinblastine; platinum; etoposide (VP-16); ifosfamide; mitoxantrone; vincristine; NAVELBINE® vinorelbine; novantrone; teniposide; edatrexate; daunomycin; aminopterin; xeloda; ibandronate; irinotecan (e.g., CPT-11); topoisomerase inhibitor RFS 2000; difluoromethylornithine (DFMO); retinoids such as retinoic acid; capecitabine; leucovorin (LV); irenotecan; adrenocortical suppressant; adrenocorticosteroids; progestins; estrogens; androgens; gonadotropin-releasing hormone analogs; and pharmaceutically acceptable salts, acids or derivatives of any of the above. Also included as anticancer agents are anti-hormonal agents that act to regulate or inhibit hormone action on tumors, such as anti-estrogens and selective estrogen receptor modulators (SERMs), including, for example, tamoxifen (including NOLVADEX® tamoxifen), raloxifene, droloxifene, 4-hydroxytamoxifen, trioxifene, keoxifene, LY117018, onapristone, and FARESTON-toremifene; aromatase inhibitors that inhibit the enzyme aromatase, which regulates estrogen production in the adrenal glands, such as, for example, 4(5)-imidazoles, aminoglutethimide, MEGASE® megestrol acetate, AROMASIN® exemestane, formestanie, fadrozole, RIVISOR® vorozole, FEMARA® letrozole, and ARIMIDEX® anastrozole; and anti-androgens such as flutamide, nilutamide, bicalutamide, leuprolide, and goserelin; as well as troxacitabine (a 1,3-dioxolane nucleoside cytosine analog); antisense oligonucleotides, particularly those which inhibit expression of genes in signaling pathways implicated in aberrant cell proliferation, such as, for example, PKC-alpha, Ralf and H-Ras; ribozymes such as a VEGF-A expression inhibitor (e.g., ANGIOZYME® ribozyme) and a HER2 expression inhibitor; vaccines such as gene therapy vaccines, for example, ALLOVECTIN® vaccine, LEUVECTIN® vaccine, and VAXID® vaccine; PROLEUKIN® rIL-2; LURTOTECAN® topoisomerase 1 inhibitor; ABARELIX® rmRH; antibodies such as trastuzumab; and pharmaceutically acceptable salts, acids or derivatives of any of the above. Examples of protein payloads include mammalian proteins, such as, e.g., growth hormone (GH), including human growth hormone, bovine growth hormone, and other members of the GH supergene family; growth hormone releasing factor; parathyroid hormone; thyroid stimulating hormone; lipoproteins; alpha-1-antitrypsin; insulin A-chain; insulin B-chain; proinsulin; follicle stimulating hormone; calcitonin; luteinizing hormone; glucagon; clotting factors such as factor VIIIC, factor IX, tissue factor, and von Willebrands factor; anti-clotting factors such as Protein C; atrial natriuretic factor; lung surfactant; a plasminogen activator, such as urokinase or tissue-type plasminogen activator (t-PA); bombazine; thrombin; alpha tumor necrosis factor, beta tumor necrosis factor; enkephalinase; RANTES (regulated on activation normally T-cell expressed and secreted); human macrophage inflammatory protein (MIP-1-alpha); serum albumin such as human serum albumin; mullerian-inhibiting substance; relaxin A-chain; relaxin B-chain; prorelaxin; mouse gonadotropin-associated peptide; DNase; inhibin; activin; vascular endothelial growth factor (VEGF); receptors for hormones or growth factors; an integrin; protein A or D; rheumatoid factors; a neurotrophic factor such as brain-derived neurotrophic factor (BDNF), neurotrophin-3, -4, -5, or -6 (NT-3, NT-4, NT-5, or NT-6), or a nerve growth factor such as NGF-beta; platelet-derived growth factor (PDGF); fibroblast growth factor such as aFGF and bFGF; epidermal growth factor (EGF); transforming growth factor (TGF) such as TGF-alpha and TGF-beta, including TGF-beta1, TGF-beta2, TGF-beta3, TGF-beta4, or TGF-beta5; insulin-like growth factor-I and -II (IGF-I and IGF-II); des(1-3)-IGF-I (brain IGF-I); insulin-like growth factor binding proteins; CD proteins such as CD3, CD4, CD8, CD19 and CD20; osteoinductive factors; immunotoxins; a bone morphogenetic protein (BMP); T-cell receptors; surface membrane proteins; decay accelerating factor (DAF); a viral antigen such as, for example, a portion of the AIDS envelope; transport proteins; homing receptors; addressins; regulatory proteins; immunoadhesins; antibodies; and biologically active fragments or variants of any of the above-listed polypeptides.

[0055] The members of the GH supergene family include growth hormone, prolactin, placental lactogen, erythropoietin, thrombopoietin, interleukin-2, interleukin-3, interleukin-4, interleukin-5, interleukin-6, interleukin-7, interleukin-9, interleukin-10, interleukin-11, interleukin-12 (p35 subunit) , interleukin-13, interleukin-15, oncostatin M, ciliary neurotrophic factor, leukemia inhibitory factor, alpha interferon, beta interferon, gamma interferon, omega interferon, tau interferon, granulocyte-colony stimulating factor, granulocyte-macrophage colony stimulating factor, macrophage colony stimulating factor, cardiotrophin-1 and other proteins identified and classified as members of the family.

[0056] Other payload agents that can be incorporated in the delivery vehicle include gastrointestinal therapeutic agents such as aluminum hydroxide, calcium carbonate, magnesium carbonate, sodium carbonate and the like; non-steroidal antifertility agents; parasympathomimetic agents; psychotherapeutic agents; major tranquilizers such as chlorpromazine HCl, clozapine, mesoridazine, metiapine, reserpine, thioridazine and the like; minor tranquilizers such as chlordiazepoxide, diazepam, meprobamate, temazepam and the like; rhinological decongestants; sedative-hypnotics such as codeine, phenobarbital, sodium pentobarbital, sodium secobarbital and the like; other steroids such as testosterone and testosterone propionate; sulfonamides; sympathomimetic agents; vaccines; vitamins and nutrients such as the essential amino acids, essential fats and the like; antimalarials such as 4-aminoquinolines, 8-aminoquinolines, pyrimethamine and the like; anti-migraine agents such as mazindol, phentermine and the like; anti-Parkinson agents such as L-dopa; anti-spasmodics such as atropine, methscopolamine bromide and the like; antispasmodics and anticholinergic agents such as bile therapy, digestants, enzymes and the like; antitussives such as dextromethorphan, noscapine and the like; bronchodilators; cardiovascular agents such as anti-hypertensive compounds, Rauwolfia alkaloids, coronary vasodilators, nitroglycerin, organic nitrates, pentaerythritotetranitrate and the like; electrolyte replacements such as potassium chloride; ergot alkaloids such as ergotamine with and without caffeine, hydrogenated ergot alkaloids, dihydroergocristine methanesulfate, dihydroergocornine methanesulfonate, dihydroergokryptine methanesulfate and combinations thereof; alkaloids such as atropine sulfate, Belladonna, hyoscine hydrobromide and the like; analgesics; narcotics such as codeine, dihydrocodienone, meperidine, morphine and the like; non-narcotics such as salicylates, aspirin, acetaminophen, d-propoxyphene and the like.

[0057] Antibiotic payloads include, for example, the cephalosporins, chloramphenicol, gentamicin, kanamycin A, kanamycin B, the penicillins, ampicillin, streptomycin A, antimycin A, chloropamtheniol, metronidazole, oxytetracycline, penicillin G, the tetracyclines, and the like.

[0058] Other payload agents can include vaccines or antigenic agents. For example, payload antigens derived from microorganisms such as Neisseria gonorrhea, Mycobacterium tuberculosis, Herpes virus (hominis, types 1 and 2), Candida albicans, Candida tropicalis, Trichomonas vaginalis, Haemophilus vaginalis, Group B Streptococcus sp., Mycoplasma hominis, Hemophilus ducreyi, Granuloma inguinale, Lymphopathia venereum, Treponema pallidum, Brucella abortus, Brucella melitensis, Brucella suis, Brucella canis, Campylobacter fetus, Campylobacter fetus intestinalis, Leptospira pomona, Listeria monocytogenes, Brucella ovis, equine herpes virus 1, equine arteritis virus, IBR-IBP virus, BVD-MB virus, Chlamydia psittaci, Trichomonas foetus, Toxoplasma gondii, Escherichia coli, Actinobacillus equuli, Salmonella abortus ovis, Salmonella abortus equi, Pseudomonas aeruginosa, Corynebacterium equi, Corynebacterium pyogenes, Actinobacillus seminis, Mycoplasma bovigenitalium, Aspergillus fumigatus, Absidia ramosa, Trypanosoma equiperdum, Babesia caballi, Clostridium tetani, Clostridium botulinum and the like can be loaded. In other embodiments, the payload can comprise neutralizing antibodies that counteract the above microorganisms.

[0059] In other embodiments, the payload can comprise enzymes such as ribonuclease, neuraminidase, trypsin, glycogen phosphorylase, sperm lactic dehydrogenase, sperm hyaluronidase, adenosinetriphosphatase, alkaline phosphatase, alkaline phosphatase esterase, amino peptidase, trypsin, chymotrypsin, amylase, muramidase, acrosomal proteinase, diesterase, glutamic acid dehydrogenase, succinic acid dehydrogenase, beta-glycophosphatase, lipase, ATP-ase alpha-peptate gamma-glutamylotranspeptidase, sterol-3-beta-ol-dehydrogenase, and DPN-diaphorase.

[0060] Peptide-payload conjugates are also encompassed by the disclosure, wherein a peptide of the disclosure is linked or fused directly to a payload molecule as set forth herein such that the peptide-payload conjugate can be directly delivered (without loading into a delivery vehicle) to target a desired tissue based upon the peptide's tropism.

[0061] The term "promoter" as used herein refers to any sequence that regulates the expression of a coding sequence, such as a gene. Promoters may be constitutive, inducible, repressible, or tissue-specific, for example. A "promoter" is a control sequence that is a region of a polynucleotide sequence at which initiation and rate of transcription are controlled. It may contain genetic elements at which regulatory proteins and molecules, such as RNA polymerase and other transcription factors, may bind. Non-limiting exemplary promoters include the CMV promoter and the U6 promoter.

[0062] The terms "protein", "peptide" and "polypeptide" are used interchangeably and in their broadest sense to refer to a compound of two or more subunit amino acids, amino acid analogs or peptidomimetics. The subunits can be linked by peptide bonds. In another embodiment, the subunits can be linked by other bonds, e.g., ester, ether, etc. A protein or peptide can contain at least two amino acids, and no limitation is placed on the maximum number of amino acids which can comprise a protein's or peptide's sequence. As mentioned above, the term "amino acid" can refer to either natural and/or unnatural or synthetic amino acids, including glycine and both the D and L optical isomers, amino acid analogs and peptidomimetics. As used herein, the term "fusion protein" can refer to a protein comprised of domains from more than one naturally occurring or recombinantly produced protein, where generally each domain serves a different function. In this regard, the term "linker" can refer to a peptide fragment that is used to link these domains together, optionally to preserve the conformation of the fused protein domains and/or prevent unfavorable interactions between the fused protein domains which can compromise their respective functions.

[0063] The terms "polynucleotide" and "oligonucleotide" are used interchangeably and refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides or analogs thereof. Polynucleotides can have any three-dimensional structure and can perform any function, known or unknown. The following are non-limiting examples of polynucleotides: a gene or gene fragment (for example, a probe, primer, EST or SAGE tag), exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, RNAi, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes and primers. A polynucleotide can comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure can be imparted before or after assembly of the polynucleotide. The sequence of nucleotides can be interrupted by non-nucleotide components. A polynucleotide can be further modified after polymerization, such as by conjugation with a labeling component. The term also can refer to both double and single stranded molecules. Unless otherwise specified or required, any embodiment of this disclosure that is a polynucleotide can encompass both the double stranded form and each of two complementary single stranded forms known or predicted to make up the double stranded form.

[0064] The term "polynucleotide sequence" can be the alphabetical representation of a polynucleotide molecule. This alphabetical representation can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching.

[0065] Similarly, the term "polypeptide sequence", "peptide sequence" or "protein sequence" can be the alphabetical representation of a polypeptide molecule. This alphabetical representation can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional proteomics and homology searching.

[0066] As used herein, the term "recombinant expression system" refers to a genetic construct or constructs for the expression of certain genetic material formed by recombination.

[0067] As used herein, the term "recombinant protein" can refer to a polypeptide or peptide which is produced by recombinant DNA techniques, wherein generally, DNA encoding the polypeptide or peptide is inserted into a suitable expression vector which is in turn used to transform a host cell to produce the heterologous polypeptide or peptide.

[0068] The term "sequencing" as used herein, can comprise bisulfite-free sequencing, bisulfite sequencing, TET-assisted bisulfite (TAB) sequencing, ACE-sequencing, high-throughput sequencing, Maxam-Gilbert sequencing, massively parallel signature sequencing, Polony sequencing, 454 pyrosequencing, Sanger sequencing, Illumina sequencing, SOLiD sequencing, Ion Torrent semiconductor sequencing, DNA nanoball sequencing, Heliscope single molecule sequencing, single molecule real time (SMRT) sequencing, nanopore sequencing, shotgun sequencing, RNA sequencing, Enigma sequencing, or any combination thereof.

[0069] As used herein, the term "subject" is intended to mean any animal. In some embodiments, the subject may be a mammal; in further embodiments, the subject may be a bovine, equine, feline, murine, porcine, canine, human, or rat.

[0070] As used herein, the terms "transformation" and "transfection" are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection (e.g., using commercially available reagents such as, for example, LIPOFECTIN® (Invitrogen Corp., San Diego, CA), LIPOFECTAMINE® (Invitrogen), FUGENE® (Roche Applied Science, Basel, Switzerland), JETPEI™ (Polyplus-transfection Inc., New York, NY), EFFECTENE® (Qiagen, Valencia, CA), DREAMFECT™ (OZ Biosciences, France) and the like), or electroporation. Suitable methods for transforming or transfecting host cells can be found in Sambrook, et al. (Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989), and other laboratory manuals. Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described in Sambrook, J., Fritsch, E.F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, 2nd ed.; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y., (1989); by Silhavy, T.J., Berman, M.L. and Enquist, L.W., Experiments with Gene Fusions; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y., (1984); and by Ausubel, F.M. et al., Current Protocols in Molecular Biology, Greene Publishing and Wiley-Interscience (1987), each of which is hereby incorporated by reference in its entirety. Additional useful methods are described in manuals including Advanced Bacterial Genetics (Davis, Roth and Botstein, Cold Spring Harbor Laboratory, 1980), Experiments with Gene Fusions (Silhavy, Berman and Enquist, Cold Spring Harbor Laboratory, 1984), Experiments in Molecular Genetics (Miller, Cold Spring Harbor Laboratory, 1972), Experimental Techniques in Bacterial Genetics (Maloy, in Jones and Bartlett, 1990), and A Short Course in Bacterial Genetics (Miller, Cold Spring Harbor Laboratory 1992), each of which is hereby incorporated by reference in its entirety.

[0071] The terms "treat", "treating" and "treatment", as used herein, refer to ameliorating symptoms associated with a disease or disorder (e.g., cancer, Covid-19, etc.), including preventing or delaying the onset of the disease or disorder symptoms, and/or lessening the severity or frequency of symptoms of the disease or disorder.

[0072] As used herein, the term "vector" can refer to a nucleic acid construct designed for transfer between different hosts, including but not limited to a plasmid, a virus, a cosmid, a phage, a BAC, a YAC, etc. In some embodiments, a "viral vector" is defined as a recombinantly produced virus or viral particle that comprises a polynucleotide to be delivered into a host cell, either in vivo, ex vivo or in vitro. In some embodiments, plasmid vectors can be prepared from commercially available vectors. In other embodiments, viral vectors can be produced from baculoviruses, retroviruses, adenoviruses, AAVs, etc. according to techniques known in the art. In one embodiment, the viral vector is a lentiviral vector.

Examples of viral vectors include retroviral vectors, adenovirus vectors, adeno-associated virus vectors, alphavirus vectors and the like. Infectious tobacco mosaic virus (TMV)-based vectors can be used to manufacture proteins and have been reported to express Griffithsin in tobacco leaves (O'Keefe et al. (2009) Proc. Nat. Acad. Sci. USA 106(15): 6099-6104). Alphavirus vectors, such as Semliki Forest virus-based vectors and Sindbis virus-based vectors, have also been developed for use in gene therapy and immunotherapy. See, Schlesinger & Dubensky (1999) Curr. Opin. Biotechnol. 5: 434-439 and Ying et al. (1999) Nat. Med. 5(7): 823-827. In aspects where gene transfer is mediated by a retroviral vector, a vector construct can refer to the polynucleotide comprising the retroviral genome or part thereof, and a gene of interest. Further details as to modern methods of vectors for use in gene transfer can be found in, for example, Kotterman et al. (2015) "Viral Vectors for Gene Therapy: Translational and Clinical Outlook," Annual Review of Biomedical Engineering 17. Vectors that contain both a promoter and a cloning site into which a polynucleotide can be operatively linked are well known in the art. Such vectors are capable of transcribing RNA in vitro or in vivo and are commercially available from sources such as Agilent Technologies (Santa Clara, Calif.) and Promega Biotech (Madison, Wis.). In one aspect, the promoter is a pol III promoter.

[0073] Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as "expression vectors." In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids. In the present specification, "plasmid" and "vector" can be used interchangeably. However, the disclosure is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions. Typically, the vector or plasmid contains sequences directing transcription and translation of a relevant gene or genes, a selectable marker, and sequences allowing autonomous replication or chromosomal integration. Suitable vectors comprise a region 5' of the gene which harbors transcriptional initiation controls and a region 3' of the DNA fragment which controls transcription termination. Both control regions may be derived from genes homologous to the transformed host cell, although it is to be understood that such control regions may also be derived from genes that are not native to the species chosen as a production host.

[0074] Typically, the vector or plasmid contains sequences directing transcription and translation of a gene fragment, a selectable marker, and sequences allowing autonomous replication or chromosomal integration. Suitable vectors comprise a region 5' of the gene which harbors transcriptional initiation controls and a region 3' of the DNA fragment which controls transcription termination. Both control regions may be derived from genes homologous to the transformed host cell, although it is to be understood that such control regions may also be derived from genes that are not native to the species chosen as a production host.

[0075] Initiation control regions or promoters, which are useful to drive expression of the relevant pathway coding regions in the desired host cell, are numerous and familiar to those skilled in the art. Virtually any promoter capable of driving these genetic elements is suitable for the present invention including, but not limited to, lac, ara, tet, trp, IPL, IPR, T7, tac, and trc (useful for expression in Escherichia coli and Pseudomonas); the amy, apr, npr promoters and various phage promoters useful for expression in Bacillus subtilis and Bacillus licheniformis; nisA (useful for expression in gram positive bacteria, Eichenbaum et al. Appl. Environ. Microbiol. 64(8): 2763-2769 (1998)); and the synthetic P11 promoter (useful for expression in Lactobacillus plantarum, Rud et al., Microbiology 152: 1011-1019 (2006)). Termination control regions may also be derived from various genes native to the preferred hosts.

[0076] Adeno-associated viruses (AAVs) are common gene therapy vectors; however, their effectiveness is hindered by poor target tissue transduction and off-target delivery. Based on the hypothesis that naturally occurring receptor-ligand interactions could be repurposed to engineer tropism, the disclosure provides a method wherein all annotated protein ligands known to bind human receptors were fragmented into tiling 20-mer peptides and displayed on the surface loops of AAV5 and AAV9 capsids at two sites. The resulting capsid libraries, comprising >1 million AAV variants, were screened across 9 tissues in C57BL/6 mice. By tracking variant abundance, >250,000 variants that packaged into capsids were identified, along with >15,000 variants that efficiently transduced at least one mouse organ. Twenty-one (21) AAV variants were validated, with 74.3% of the organ tropism predictions accurately reproducing, confirming overall screen efficacy. Systematic ligand tiling enabled prediction of putative AAV-receptor interactions, which were successfully validated by targeted genetic perturbations. Comprehensive peptide tiling also enabled examination of homologous peptide activity. Interestingly, the functional peptides that were observed tended to be derived from specific domains on ligands. Notably, certain peptides also displayed consistent activity across mouse strains, capsid insertion contexts, and capsid serotypes, including novel immune orthogonal serotypes. Further analyses of displayed peptides revealed that biophysical attributes were highly predictive of AAV variant packaging, and that there was a machine learnable relationship between peptide sequence and tissue tropism. The disclosed comprehensive ligand peptide tiling and display approach can enable engineering of tropism across diverse viral, viral-like and non-viral delivery platforms, and shed light on basic receptor-ligand biology.
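
By way of illustration only, the fragmentation of a ligand sequence into tiling 20-mer peptides can be sketched as follows. This is a minimal sketch and not the procedure used to build the disclosed library: the function name `tile_peptides`, the stride of 5 residues, and the choice to add a final tile flush with the C-terminus are all assumptions, since the tiling step size is not stated in this passage.

```python
from typing import List, Tuple


def tile_peptides(sequence: str, length: int = 20, stride: int = 5) -> List[Tuple[int, str]]:
    """Split a ligand sequence into overlapping fixed-length peptide tiles.

    Returns (0-based start index, peptide) pairs. The stride is an assumed
    parameter; the 20-residue length follows the text.
    """
    if length <= 0 or stride <= 0:
        raise ValueError("length and stride must be positive")
    if len(sequence) < length:
        # Ligands shorter than one tile yield no tiles in this sketch.
        return []

    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, stride))
    # Assumption: add one extra tile so the C-terminal residues are covered.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s, sequence[s:s + length]) for s in starts]


if __name__ == "__main__":
    # Hypothetical 43-residue ligand used only to exercise the function.
    ligand = "ACDEFGHIKLMNPQRSTVWY" * 2 + "ACD"
    for start, peptide in tile_peptides(ligand):
        print(start, peptide)
```

A smaller stride gives denser coverage of each ligand at the cost of a larger library, which is one reason the step size is left as a parameter here.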

[0077] The disclosure provides a plurality of peptides useful for targeting a desired tissue. The disclosure also provides peptides that are immunosilent. The disclosure also provides vectors, delivery vehicles and peptide-payload conjugates comprising one or more of the peptides of the disclosure having a desired tropism.

[0078] Rational screening strategies have immense potential to expand the molecular tools available for clinical gene therapy applications. While AAV engineering efforts have been conducted for over a decade, advances in DNA synthesis have enabled us to create here a data-driven library of AAV variants leveraging existing functional biomolecules from nature. Using natural biomolecules as a defined source of inserted peptides has multiple benefits over random hexamers (and similar methods). First, natural biomolecules have been pre-filtered for biological functionality by millennia of evolutionary selection pressure. Second, a defined library allows for robust quantification of the fitness of each AAV variant, enabling facile stratification of AAV variants by infectivity across organs of interest. While this method was primarily applied to engineer AAVs, mining nature for functional biomolecules has applications in a wide range of protein engineering challenges, such as engineering orthogonal viral (including lentiviruses) and non-viral (including lipid nanoparticles) vectors or identifying biologic inhibitors of critical protein/protein interactions.
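
As a hedged illustration of why a defined library permits per-variant quantification, the sketch below scores each variant by the log2 ratio of its read fraction in a tissue sample to its read fraction in the input library. This is a generic enrichment calculation and is not asserted to be the fitness metric used in the disclosure; the function name `log2_enrichment`, the pseudocount, and the example counts are assumptions.

```python
import math
from typing import Dict


def log2_enrichment(
    input_counts: Dict[str, int],
    output_counts: Dict[str, int],
    pseudocount: float = 1.0,
) -> Dict[str, float]:
    """Score each variant as log2(output read fraction / input read fraction).

    A pseudocount is added to every variant in both samples so that variants
    absent from one sample do not produce a division by zero or log of zero.
    """
    variants = set(input_counts) | set(output_counts)
    in_total = sum(input_counts.values()) + pseudocount * len(variants)
    out_total = sum(output_counts.values()) + pseudocount * len(variants)

    scores = {}
    for variant in variants:
        in_frac = (input_counts.get(variant, 0) + pseudocount) / in_total
        out_frac = (output_counts.get(variant, 0) + pseudocount) / out_total
        scores[variant] = math.log2(out_frac / in_frac)
    return scores


if __name__ == "__main__":
    # Hypothetical read counts for three variants: packaged library vs. one tissue.
    library = {"variant_A": 100, "variant_B": 100, "variant_C": 100}
    tissue = {"variant_A": 300, "variant_B": 100}
    ranked = sorted(log2_enrichment(library, tissue).items(), key=lambda kv: -kv[1])
    for name, score in ranked:
        print(name, round(score, 2))
```

Because every variant in a defined library is known in advance, a variant with zero reads in a tissue is an informative negative rather than an unobserved sequence, which is what the pseudocount handles here.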

[0079] Additionally, it is believed that the rational ligand tiling approach described herein could lead to important insights into basic AAV-receptor binding and transduction more generally. Previous efforts have either utilized large-scale genetic screens or unique observations about a particular AAV variant's transduction behavior. As the AAV variants described herein display peptides derived from natural receptor-interacting ligands, it is believed that the AAV variants may maintain some of this binding capacity, which partially explains their altered transduction profiles.

[0080] The disclosure focuses on adeno-associated viruses (AAVs), which have emerged as a leading vector for gene delivery in clinical applications. While multiple AAV-mediated therapies have achieved regulatory approval, efficient directing of treatment to target tissues is challenging with systemic injection. To overcome this, high viral titers are often used in treatments, which has in turn been associated with potential for hepatotoxicity in clinical trials. Localized injections are also problematic, often requiring invasive procedures with the potential for organ damage and long recovery times. Due to these delivery challenges, some gene therapeutics have elected to pursue, where feasible, ex vivo treatment designs to overcome targeting issues, but this in turn can lead to dependency on complex lab procedures and high manufacturing costs.

[0081] There are thus numerous ongoing research efforts to improve in vivo therapeutic targeting, and AAV variants have been engineered to specifically target tissues such as the brain and muscle. This has predominantly been established using strategies of iteratively screening random peptides inserted into the AAV capsid, capsid shuffling, randomly mutagenizing the capsid sequence as a whole, or direct chemical engineering. Although mutagenizing AAV capsids via random oligomers has yielded functional capsids with novel properties, a stochastic mutational screening strategy limits the ability to predict future functional variants, and thus rational and programmable engineering of viral phenotypes remains an elusive goal.

[0082] Towards rational engineering of viral function, deep mutational libraries and associated screens of function have enabled systematic mapping of capsid mutation fitness, providing critical information which can be used to predict future variant activity. Additionally, defined libraries of pooled oligonucleotides have been used to insert gene fragments derived from proteins with known affinity to synapses into the AAV capsid, with the goal of improving retrograde axonal transport. While these methodologies have provided important insights for AAV engineering, much is still unknown regarding how AAV genotype impacts packaging and tissue transduction. Consequently, there is a critical need for systematic datasets mapping AAV genotype to clinically relevant properties such as organ specificity. Given the clinical danger of hepatotoxicity and other efficacy issues related to off-target transduction, leveraging screening technologies to yield highly specific AAV capsids has great value to the medical and scientific community.

[0083] With a goal to develop a rational and potentially programmable strategy for enabling tissue targeting, it was hypothesized that receptor-ligand interactions, which mediate the spectrum of naturally occurring cell-protein and cell-cell interactions, could be repurposed for engineering AAV tropism. However, neither the interaction interfaces nor the cell type specificity profiles of receptor-ligand interactions are fully mapped, which makes it difficult to comprehensively interrogate or engineer them. To study and address this, all annotated protein ligands known to bind human receptors were fragmented into short tiling peptides and displayed on surface loops of AAV capsids, an approach termed "AAV-PepTile".
Short peptides grafted into stabilizing molecular scaffolds can recapitulate local protein domain structure, and as protein-protein interface sites are typically 1200-2000 Å², with specific peptide hot loops ranging from 4-8 amino acids (AAs) contributing maximally to the binding energy of protein-protein interactions, it was hypothesized that 20 amino acid peptide insertions into the AAV capsid could drastically alter its binding ability and, therefore, its transduction profile in vivo. While insertion of still longer peptides could, in principle, more faithfully recapitulate local ligand structures, AAVs do not generally tolerate insertions larger than 20-25 amino acids without severely compromising capsid packaging efficiency. Based on this, experiments proceeded using 20 amino acid peptide insertions with flanking 2 amino acid linkers. Altogether, a library of >1 million AAV variants was constructed by inserting oligonucleotide pool-synthesized gene fragments coding for potential receptor-ligands and cell membrane permeable proteins into one of two surface loops on AAV5 and AAV9. Unlike random peptide libraries, ligand tiling enabled robust quantitation of tissue transduction rates for all variants screened. Furthermore, systematic examination of the activity of similar peptides generated predictions of putative receptor interactions driving AAV variant tropism. Quantifying transduction rates across nine organs, extremely specific variants targeting the brain and lung were identified, as well as muscle and heart targeting variants with broader organ transduction. The resulting data linking AAV variant genotype to packaging efficacy and tissue specificity expands the understanding of the AAV fitness landscape and provides a unique resource from which further data-driven engineering efforts can be built.

[0084] The general schematic for the AAV screens is outlined in Fig. 3a. Briefly, in the primary screen, a 20 amino acid peptide library, consisting of tiled fragments from all known receptor-interacting ligands along with other interesting protein classes and flanked by glycine-serine linkers, was displayed in the 2 surface-exposed loops of two clinically relevant AAV serotypes, AAV5 and AAV9, with their sequences and specific amino acid sites of insertion shown in Table 1.

[0085] Table 1: Showing both the DNA and amino acid sequences and insertion sites of the two wild-type AAV serotypes utilized in the screens, AAV5 and AAV9. Bolded and underlined are the flanking residues of the 'Loop 1' site, and shown in bold/ital/underline are the flanking residues of the 'Loop 2' site.

AAV5 DNA sequence (SEQ ID NO: 1):
ATGTCTTTTGTTGATCACCCTCCAGATTGGTTGGAAGAAGTT
GGTGAAGGTCTTCGCGAGTTTTTGGGCCTTGAAGCGGGCCCA
CCGAAACCAAAACCCAATCAGCAGCATCAAGATCAAGCCCG
TGGTCTTGTGCTGCCTGGTTATAACTATCTCGGACCCGGAAA
CGGTCTCGATCGAGGAGAGCCTGTCAACAGGGCAGACGAGG
TCGCGCGAGAGCACGACATCTCGTACAACGAGCAGCTTGAG
GCGGGAGACAACCCCTACCTCAAGTACAACCACGCGGACGC
CGAGTTTCAGGAGAAGCTCGCCGACGACACATCCTTCGGGG
GAAACCTCGGAAAGGCAGTCTTTCAGGCCAAGAAAAGGGTT
CTCGAACCTTTTGGCCTGGTTGAAGAGGGTGCTAAGACGGC
CCCTACCGGAAAGCGGATAGACGACCACTTTCCAAAAAGAA
AGAAGGCTCGGACCGAAGAGGACTCCAAGCCTTCCACCTCG
TCAGACGCCGAAGCTGGACCCAGCGGATCCCAGCAGCTGCA
AATCCCAGCCCAACCAGCCTCAAGTTTGGGAGCTGATACAA
TGTCTGCGGGAGGTGGCGGCCCATTGGGCGACAATAACCAA
GGTGCCGATGGAGTGGGCAATGCCTCGGGAGATTGGCATTG
CGATTCCACGTGGATGGGGGACAGAGTCGTCACCAAGTCCA
CCCGAACCTGGGTGCTGCCCAGCTACAACAACCACCAGTAC
CGAGAGATCAAAAGCGGCTCCGTCGACGGAAGCAACGCCAA
CGCCTACTTTGGATACAGCACCCCCTGGGGGTACTTTGACTT
TAACCGCTTCCACAGCCACTGGAGCCCCCGAGACTGGCAAA
GACTCATCAACAACTACTGGGGCTTCAGACCCCGGTCCCTCA
GAGTCAAAATCTTCAACATTCAAGTCAAAGAGGTCACGGTG
CAGGACTCCACCACCACCATCGCCAACAACCTCACCTCCACC
GTCCAAGTGTTTACGGACGACGACTACCAGCTGCCCTACGTC
GTCGGCAACGGGACCGAGGGATGCCTGCCGGCCTTCCCTCC
GCAGGTCTTTACGCTGCCGCAGTACGGTTACGCGACGCTGA
ACCGCGACAACACAGAAAATCCCACCGAGAGGAGCAGCTTC
TTCTGCCTAGAGTACTTTCCCAGCAAGATGCTGAGAACGGGC
AACAACTTTGAGTTTACCTACAACTTTGAGGAGGTGCCCTTC
CACTCCAGCTTCGCTCCCAGTCAGAACCTGTTCAAGCTGGCC
AACCCGCTGGTGGACCAGTACTTGTACCGCTTCGTGAGCACA
AATAACACTGGCGGAGTCCAGTTCAACAAGAACCTGGCCGG
GAGATACGCCAACACCTACAAAAACTGGTTCCCGGGGCCCA
TGGGCCGAACCCAGGGCTGGAACCTGGGCTCCGGGGTCAAC
CGCGCCAGTGTCAGCGCCTTCGCCACGACCAATAGGATGGA
GCTCGAGGGCGCGAGTTACCAGGTGCCCCCGCAGCCGAACG
GCATGACCAACAACCTCCAGGGCAGCAACACCTATGCCCTG
GAGAACACTATGATCTTCAACAGCCAGCCGGCGAACCCGGG
CACCACCGCCACGTACCTCGAGGGCAACATGCTCATCACCA
GCGAGAGCGAGACGCAGCCGGTGAACCGCGTGGCGTACAAC
GTCGGCGGGCAGATGGCCACCAACAACCAGAGCTCCACCAC
TGCCCCCGCGACCGGCACGTACAACCTCCAGGAAATCGTGC
CCGGCAGCGTGTGGATGGAGAGGGACGTGTACCTCCAAGGA
CCCATCTGGGCCAAGATCCCAGAGACGGGGGCGCACTTTCA
CCCCTCTCCGGCCATGGGCGGATTCGGACTCAAACACCCACC
GCCCATGATGCTCATCAAGAACACGCCTGTGCCCGGAAATA
TCACCAGCTTCTCGGACGTGCCCGTCAGCAGCTTCATCACCC
AGTACAGCACCGGGCAGGTCACCGTGGAGATGGAGTGGGAG
CTCAAGAAGGAAAACTCCAAGAGGTGGAACCCAGAGATCCA
GTACACAAACAACTACAACGACCCCCAGTTTGTGGACTTTGC
CCCGGACAGCACCGGGGAATACAGAACCACCAGACCTATCG
GAACCCGATACCTTACCCGACCCCTTTAA

AAV5 amino acid sequence (SEQ ID NO: 2):
MSFVDHPPDWLEEVGEGLREFLGL
EAGPPKPKPNQQHQDQARGLVLPG
YNYLGPGNGLDRGEPVNRADEVAR
EHDISYNEQLEAGDNPYLKYNHAD
AEFQEKLADDTSFGGNLGKAVFQA
KKRVLEPFGLVEEGAKTAPTGKRID
DHFPKRKKARTEEDSKPSTSSDAEA
GPSGSQQLQIPAQPASSLGADTMSA
GGGGPLGDNNQGADGVGNASGDW
HCDSTWMGDRVVTKSTRTWVLPS
YNNHQYREIKSGSVDGSNANAYFG
YSTPWGYFDFNRFHSHWSPRDWQR
LINNYWGFRPRSLRVKIFNIQVKEV
TVQDSTTTIANNLTSTVQVFTDDDY
QLPYVVGNGTEGCLPAFPPQVFTLP
QYGYATLNRDNTENPTERSSFFCLE
YFPSKMLRTGNNFEFTYNFEEVPFH
SSFAPSQNLFKLANPLVDQYLYRFV
STNNTGGVQFNKNLAGRYANTYK
NWFPGPMGRTQGWNLGSGVNRAS
VSAFATTNRMELEGASYQVPPQPN
GMTNNLQGSNTYALENTMIFNSQP
ANPGTTATYLEGNMLITSESETQPV
NRVAYNVGGQMATNNQSSTTAPA
TGTYNLQEIVPGSVWMERDVYLQG
PIWAKIPETGAHFHPSPAMGGFGLK
HPPPMMLIKNTPVPGNITSFSDVPVS
SFITQYSTGQVTVEMEWELKKENS
KRWNPEIQYTNNYNDPQFVDFAPD
STGEYRTTRPIGTRYLTRPL

[0086] These libraries were used to make AAV capsid particles which were injected systemically into mice in duplicate. After two weeks, the heart, lung, liver, small intestine, spleen, pancreas, kidney, brain, and skeletal muscle were extracted. Genomic DNA was isolated from these tissues, and the variable peptide region of interest was amplified out and subjected to next generation sequencing (NGS) to quantify AAV abundance in each tissue by measuring the log2 fold change (log2FC) value in each organ relative to the injected pool.

[0087] A candidate list of "hit peptides" in the AAV9Loop1 and AAV9Loop2 scaffolds was synthesized and subjected to deep mutagenesis to yield AAV libraries with millions of variants that were then profiled in a secondary screen. As done in the primary screen, AAV capsid particles were produced from these libraries and injected into mice, where the tropism was quantified. Additionally, to further evaluate the species-dependent nature of the engineered viral particles, the AAV capsid libraries were pooled from the primary and secondary screens and injected into rhesus macaques as an important preclinical non-human primate (NHP) model. Similar to the experiments done in mice, various organs were isolated from the NHPs 3 weeks after injection, the variable peptide region was amplified out, and the AAV abundance was quantified in the liver, lung, and 4 lobes of the brain. Moreover, to apply additional selective pressure, the variable peptide regions from the heart, lung, pancreas, intestine, brain, and muscle were amplified out and cloned back into the AAV9Loop1 and AAV9Loop2 scaffolds. This tertiary screen was then subjected to two weeks of in vivo selective pressure in mice before the tropism was quantified by NGS. For all screens, a "hit" was defined as an engineered AAV variant that exhibited a log2FC relative to the capsid > 1 in both replicates and a p-value < 0.1 in at least one organ.

[0088] Shown in Table 2 are the peptides that were present in the starting library of the primary screen and that showed prominent in vivo activity across all four scaffolds in which they were grafted. In addition to this list, there is a set of peptides that were consistently identified as hits across the primary, secondary, tertiary, and NHP screens. These screen-independent hits for both AAV9Loop1 and AAV9Loop2 are shown in Table 3.

[0089] Table 2: Table for "wild-type" peptides that were present in the primary screen library and showed in vivo activity across all 4 scaffolds in which they were grafted (i.e., AAV5Loop1, AAV5Loop2, AAV9Loop1, AAV9Loop2). Shown is the amino acid sequence.

[0090] Table 3: Table for "wild-type" peptides that were present as hits in the primary, secondary, tertiary, and NHP screens. Shown are the peptide amino acid sequences.

[0091] Shown in Table 4A-B are the top performing peptides across all of these described screens. These include the mutant versions of the peptides that were originally identified in the primary screen. The information includes the amino acid sequence of the peptide, the AAV scaffold in which that peptide was inserted, the organ where that peptide was a hit, the log2FC in that organ, and the screen in which that peptide was identified. While the peptides shown here were highlighted for their ability to transduce a particular organ, many of these also display other interesting properties such as their increased ability to form functional capsids (high titers), their significant de-targeting away from the liver, and their ability to target unique combinations of organs simultaneously. It is expected that these peptides and mutants thereof will have the ability to modulate the tropism of AAV capsids, such as AAV5, AAV9, and the 23 immune orthogonal capsids (shown in Table 5) and beyond, as well as the tropism of other delivery vehicles and peptide-payload conjugates.

[0092] Table 4A: Table for mutant peptides that were identified as the top hits in the primary, secondary, tertiary, and NHP screens. Shown are the peptide amino acid sequences, the scaffold where they were grafted, the log2FC in the organ they were a hit, the organ they were a hit in, and the screen in which that log2FC was observed. (P=pancreas; H=heart; B=Brain; L=Lung; Lv=liver; K=kidney; M=muscle; S=spleen; I=intestine) :

[0093] Table 4B: These sequences were identified by looking at peptides derived from domains that exhibit consistent organ transfectivity across many variants. Shown in the table below are the amino acid sequence, the AAV scaffold in which the peptide was grafted, the log2 fold change value and the respective organ that it transduces, as well as the screen in which the peptide was identified (P=pancreas; H=heart; B=Brain; L=Lung; Lv=liver; K=kidney; M=muscle; S=spleen; I=intestine):

[0094] Table 5: table of the identified 23 immune-orthogonal AAV serotypes with their DNA and amino acid (AA) sequences.


[0095] The disclosure shows that one of the discovered variants (AAV9.DKK1) exhibits reduced transduction when the cognate receptor for the ligand from which the peptide was derived is knocked out (Fig. 5a). Furthermore, due to the systematic tiling nature of the peptide library, it was possible to generate full-length transduction maps across the entire residue space of ligands utilized in this study (Fig. 5a, c). This not only increases confidence in the screening hits, in that homologous peptides behave similarly, but could also provide insight into critical structural domains mediating virus-receptor interactions and, more broadly, provide a platform for further expanding the known protein interactome, especially when the dataset is integrated with the cell surfaceome.

[0096] Additionally, in recent years, developing predictive models of AAV infectivity has garnered significant interest from multiple research groups. The application of machine learning to AAV engineering parallels major advances in machine learning across multiple areas of protein science such as structure prediction, enzyme activity forecasting, and antibody binding optimization. While deep learning and similar black-box methodologies have rapidly become mature technologies, applying these methodologies to AAV engineering is still severely limited by the lack of available training data. The AAV screening data is an ideal training dataset for several reasons: (1) the experimental design features a large, defined library of variants (Fig. 1), meaning that every variant has a reliable quantification of package infectivity; (2) each variant was screened across a panel of 9 major organs to map the infectivity across diverse tissue types; (3) the library was inserted across 2 AAV serotypes and 2 surface loops, providing important functional information on the capsid context in which the peptide resides; (4) a large cohort of variants was validated to demonstrate that the screening data is trustworthy (Fig. 4); and (5) to illustrate the utility of the dataset as training data, the packaging efficiency of AAV variants could be accurately predicted from the biophysical characteristics of the inserted peptides (Fig. 2e), and peptide amino acid sequence is directly predictive of tissue tropism across multiple capsids and insertion sites (Fig. 3d). As such, the screening data will have great utility for the machine-learning and computational biology community.

[0097] While the variants identified via the pooled screen have tissue transduction exceeding AAV9 in many organs (Fig. 5a, c) and several variants exhibit drastic liver de-targeting (Fig. 4d, 6c), further engineering could be performed to enhance potency and specificity. Accordingly, peptides with at least 85% identity to any one of the peptide sequences provided herein are also contemplated. This may be relevant in particular for certain peptides derived from ligands that engage receptors expressed on multiple cell types or which have promiscuous binding activity. In the validation experiments, a standard promoter (CMV) was used to drive expression of the mCherry transgene. Alternatively, tissue-specific promoters could be used to increase the specificity and magnitude of transgene expression in the organ of interest. Furthermore, the hit capsids identified here could be further engineered for increased activity. Existing hits could serve as a scaffold for further rounds of targeted mutagenesis and screening, or peptides could be inserted on both loop 1 and loop 2 of the AAV capsid to increase the valency of the displayed ligands. Additionally, recently developed direct chemical engineering or peptide display strategies and/or alternative peptide linkers could be utilized to enhance peptide-mediated transduction. Also, scRNAseq could be used to screen hit variants towards more specific cell types within the organ of interest.

[0098] While the screening strategy addresses the gene therapy hurdle of organ targeting, the issue of pre-existing AAV immunity can also limit clinical translation. Therefore, experiments computationally identified and fully characterized a set of 4 immune-orthogonal AAV capsids (Fig. 7a-d), and showed that insertion of a screen-identified re-targeting peptide can alter the in vivo tropism of one of these identified AAVs (Fig. 7e). This not only increases the number of known serotypes with interesting immunological properties to be investigated alongside the current AAV set, but also lends credence to the view that the peptide insertions identified through the screening methodology can be grafted onto diverse biological scaffolds to achieve unique targeting across broad delivery contexts.

[0099] Taken together, the disclosure presents a massive functional screen of engineered AAV variants, spanning over one million total variants derived from two capsids and multiple sites of insertional mutagenesis. Using this screening data, 21 AAV variants were validated, identifying AAV capsids with increased organ transduction across multiple organs (heart, muscle, and lung for AAV9.PDGFC, Fig. 4c), as well as highly specific AAV capsids (AAV5.APOA1, AAV9.DKK1, Fig. 5a). Improved broad targeting AAV variants have massive potential for genetic diseases such as hemophilia A, where total factor VIII expression levels are most critical. At the same time, highly specific AAV capsids such as AAV5.APOA1 (which has less than 1% of the liver infectivity of WT AAV9) would have great utility for neurodegenerative disorders, where maximizing transgene expression in the brain is essential. In addition to the novel variants identified herein, the bulk screening data itself is of high value. Given the scale, reliability, and translational relevance of the screening dataset, it can serve as a foundation for future computational engineering of designer AAV capsids.

[00100] The following examples are intended to illustrate but not limit the disclosure. While they are typical of those that might be used, other procedures known to those skilled in the art may alternatively be used.

EXAMPLES

[00101] Mice

[00102] All animal care and experimental methods were performed in accordance with the University of California Institutional Animal Care and Use Committee. 6-8 week old male C57BL/6J (JAX, #000664) and Balb/cJ (JAX, #000651) mice were purchased from the Jackson Laboratories and systemic injections were administered retro-orbitally with either AAV or PBS.

[00103] Cell lines and culture conditions

[00104] All cells were cultured in a 37°C, 5% CO2 humidified incubator. HEK293T cells were cultured in DMEM medium supplemented with 10% FBS, GlutaMAX (1x) (GIBCO), and Penicillin-Streptomycin (100 U/mL) (GIBCO).

[00105] Visualizing AAV capsid structures

[00106] To obtain structural representations of AAV capsids, AAV5 and AAV9 structure files were downloaded from the Protein Data Bank. These were visualized using the PyMOL Molecular Graphics System, Version 2.0 (Schrödinger, LLC), with the ramp_new function utilized for coloring the surface capsid representation.

[00107] Design of displayed peptide library

[00108] Each AAV library consisted of 275,298 peptides, derived from 6,465 proteins. These protein sources were mined from a variety of protein families, including all protein ligands cataloged in the Guide to Pharmacology database, toxins, nuclear localization signals (NLS), viral receptor binding domains, albumin and Fc binding domains, transmembrane domains, histones, granzymes, and predicted cell penetrating motifs. In addition to peptides coding for functional biomolecules, 444 control peptides coding for FLAG-tags with premature stop codons were included. For all human proteins, the cDNA coding for each protein was fragmented in silico to generate DNA coding for all possible 20mer peptides. For viral proteins, cell penetrating motifs, and FLAG stop codon controls, the protein sequence was back-translated to DNA using the most abundant human codon for each amino acid.

[00109] Oligonucleotide array synthesis, amplification, and cloning

[00110] Oligonucleotide libraries were synthesized by GenScript as three 91,766 element pools. Each oligonucleotide library was amplified using KAPA Hifi Hotstart Readymix, and the manufacturer recommended cycling conditions with an annealing temperature of 60 °C and an extension time of 30 seconds. The number of PCR cycles was optimized to avoid over-amplification of the peptide libraries. After amplifying each oligonucleotide pool and confirming amplicon size on an agarose gel, the amplified sub-libraries were pooled to yield the total 275,298 element peptide library.
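
By way of non-limiting illustration, the library design from which these oligonucleotide pools derive (in silico fragmentation of a coding sequence into DNA for all possible 20mer peptides, and back-translation of a protein sequence) can be sketched in Python as follows. This is a minimal sketch and not the code used in the disclosure: the function names, the one-codon step between consecutive tiles, and the particular codon chosen for each amino acid are assumptions made for illustration only.

```python
# Assumed codon table: one frequently used human codon per amino acid.
# These choices are illustrative and should be checked against a codon-usage table.
HUMAN_CODON = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def tile_cdna(cdna, peptide_len=20, step_codons=1):
    """Return the DNA fragments encoding each peptide_len-residue window.

    Windows advance by step_codons codons (assumed to be 1, i.e. every
    possible 20mer). A trailing partial codon, if any, is ignored.
    """
    n_codons = len(cdna) // 3
    codons = [cdna[3 * i:3 * i + 3] for i in range(n_codons)]
    last_start = n_codons - peptide_len
    return ["".join(codons[s:s + peptide_len])
            for s in range(0, last_start + 1, step_codons)]


def back_translate(protein):
    """Back-translate a protein sequence using the assumed codon table."""
    return "".join(HUMAN_CODON[aa] for aa in protein.upper())


if __name__ == "__main__":
    # Toy input: the first 24 residues of SEQ ID NO: 2, back-translated.
    toy_cdna = back_translate("MSFVDHPPDWLEEVGEGLREFLGL")
    tiles = tile_cdna(toy_cdna)
    print(len(tiles), "tiles of", len(tiles[0]), "nt")
```

A sequence shorter than the tile length simply yields no tiles, so short control peptides would need to be handled separately in such a scheme.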

[00111] These pools were cloned into loop 1 or loop 2 sites to yield pAAV5L1 Screen, pAAV5L2 Screen, pAAV9L1 Screen, and pAAV9L2 Screen. The AAV rep and cap were flanked by AAV inverted terminal repeat (ITR) sequences to facilitate packaging of cap genes into a recombinant AAV particle.

[00112] Recombinant AAV production

[00113] Utilizing the library plasmid pools described above (AAV5-Loop1, AAV5-Loop2, AAV9-Loop1, and AAV9-Loop2), each AAV capsid library was produced by transfecting HEK293T cells in forty 15 cm dishes with the plasmid library pool (diluted 1:100 with pUC19 filler DNA to prevent capsid cross-packaging) and an adenoviral helper plasmid (pHelper). Titers were determined via qPCR using the iTaq Universal SYBR green supermix and primers binding to the AAV ITR region. To prepare the capsid particles as templates for qPCR, 2 µL of virus was added to 50 µL of alkaline digestion buffer (25 mM NaOH, 0.2 mM EDTA) and boiled for 8 minutes. Following this, 50 µL of neutralization buffer (40 mM Tris-HCl, 0.05% Tween-20, pH 5) was added to each sample.

[00114] In vivo evaluation of AAV display libraries

[00115] Each AAV capsid library was retro-orbitally administered to mice in duplicate at a dose of 2E12 vg/mouse for the AAV9-based libraries or 1E12 vg/mouse for the AAV5-based libraries. Two weeks after injection, the heart, lung, liver, intestine, spleen, pancreas, kidneys, brain, and gastrocnemius muscle were harvested and placed in RNAlater storage solution. Total DNA was extracted from all mouse tissues using TRIzol reagent and the TNES-6U back extraction method. The resulting precipitated DNA was centrifuged for 15 minutes at 18,000 x g, and the supernatant discarded. The DNA pellet was then washed three times with 70% ethanol, and finally resuspended in 300 µL of EB after allowing the pellet to air dry.

[00116] Preparation of plasmid and capsid DNA for next generation sequencing

[00117] To sequence the plasmid libraries (AAV5/9 and loop 1/2 peptide insertions), 50 ng of plasmid was used as template for a 50 µL KAPA Hifi Hotstart Readymix PCR reaction with an annealing temperature of 60 °C and an extension time of 30 seconds. The primers were designed to amplify the peptide coding region from each sub-library. The number of cycles was optimized to avoid overamplification. The PCR reactions were purified using a QIAquick PCR Purification Kit according to the manufacturer's protocol. Following this, 50 ng of the PCR amplicon was used as template for a secondary 50 µL KAPA Hifi Hotstart Readymix PCR reaction to add Illumina compatible adapters and indices (NEBNext Cat# E7600S). The PCR reaction was performed with an annealing temperature of 60 °C and an extension time of 30 seconds. To sequence the capsid libraries, a similar protocol was performed, with a modified template amount in the step-1 PCR. To prepare the capsid particles as templates for PCR, 2 µL of virus was added to 50 µL of alkaline digestion buffer (25 mM NaOH, 0.2 mM EDTA) and boiled for 8 minutes. Following this, 50 µL of neutralization buffer (40 mM Tris-HCl, 0.05% Tween-20, pH 5) was added to each sample. 1 µL of this digested capsid mix was then used as a template for a 50 µL PCR reaction. For each sample, the number of cycles was optimized to avoid overamplification, and a secondary PCR was subsequently performed to add Illumina compatible adapters and indices. After generating Illumina compatible libraries, the plasmid and capsid samples were sequenced on a NovaSeq 6000 with an S4 flowcell generating 100 bp paired end reads.

[00118] Preparation of tissue DNA for next generation sequencing

[00119] To sequence the AAV cap genes from each tissue for the pooled screen, as with the plasmid/capsid libraries, a two-step PCR based library prep method was used. For each organ and replicate, a 300 µL PCR reaction was performed with 120 µL of genomic DNA used as a template. For each tissue, the number of cycles was optimized via an initial qPCR to avoid overamplification of the library. All other parameters, such as primers and melting temperatures, were identical to the PCRs for the plasmid libraries. Following this initial PCR, a secondary PCR was performed as above to add Illumina compatible adapters and indices. The libraries were then sequenced on a NovaSeq 6000 with an S4 flowcell generating 100 bp paired end reads.

[00120] In vivo validation of AAV variants

[00121] Either saline or the AAV-variant-mCherry, AAV9-mCherry, or AAV5-mCherry capsids were systemically administered to mice in duplicate at a dose of 5E11 vg/mouse. Three weeks after injection, the lungs were inflated with a PBS/OCT solution and the lungs, heart, liver, intestine, spleen, pancreas, kidneys, brain, and gastrocnemius muscle were harvested. Each organ was split, with one portion placed in RNAlater and the other embedded in OCT blocks and flash frozen in a dry-ice/ethanol slurry. Total RNA was then isolated from all mouse tissues using TRIzol reagent and RNA Isolation kits with on-column DNase treatment (Zymo Cat# R2072). cDNA synthesis was performed with random primers from the Protoscript cDNA synthesis kit (NEB Cat# E6560S). Transgene expression was then quantified via qPCR using the iTaq Universal SYBR green supermix and primers binding to the mCherry transcript. mCherry transgene expression was normalized to that of an internal GAPDH control, using GAPDH specific primers. For histological examination, OCT frozen blocks were cryosectioned at approximately 10 µm thickness and tissue slides were then imaged on an Olympus SlideScanner S200. Exposure times between 5 and 1000 ms were used, with identical exposure times used for all samples of a given tissue type. mCherry expression from histological sections was then quantified using the Olympus OlyVIA software to calculate the mean pixel intensity across the entire organ section.

[00122] Quantifying AAV variant abundance from NGS data

[00123] Starting with FASTQ sequencing files, the MAGeCK (94) 'count' function was used to generate count matrices describing AAV abundance in each sample (plasmids/capsids/tissues) . Following this, the count matrices were normalized (via multiplication with a constant size-factor) for each sample to account for non-identical read depth. The sequencing counts were then transformed by taking the log base 2 of the raw counts, after addition of a pseudocount. Variants with no counts across all of the experimental samples were excluded from analysis.
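
By way of non-limiting illustration, the normalization and log transformation described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the analysis code of the disclosure: the function names, the use of the mean sample depth as the common target, and the pseudocount of 1 are assumptions, and the count matrices are represented as plain dictionaries rather than MAGeCK output files.

```python
import math


def drop_unobserved(counts_by_sample):
    """Remove variants that have zero counts in every sample."""
    variants = set().union(*(c.keys() for c in counts_by_sample.values()))
    kept = sorted(v for v in variants
                  if any(c.get(v, 0) > 0 for c in counts_by_sample.values()))
    return {s: {v: c.get(v, 0) for v in kept}
            for s, c in counts_by_sample.items()}


def size_factor_normalize(counts_by_sample):
    """Multiply each sample by one constant so all samples share the same total.

    The common target depth is assumed here to be the mean of the sample totals.
    """
    totals = {s: sum(c.values()) for s, c in counts_by_sample.items()}
    target = sum(totals.values()) / len(totals)
    return {s: {v: n * target / totals[s] for v, n in c.items()}
            for s, c in counts_by_sample.items()}


def log2_with_pseudocount(counts, pseudocount=1.0):
    """Log base 2 of the counts after adding a pseudocount (assumed to be 1)."""
    return {v: math.log2(n + pseudocount) for v, n in counts.items()}


def log2_fold_change(sample, reference, pseudocount=1.0):
    """Per-variant log2 fold change of a sample relative to a reference sample."""
    s = log2_with_pseudocount(sample, pseudocount)
    r = log2_with_pseudocount(reference, pseudocount)
    return {v: s[v] - r[v] for v in s}
```

With such helpers, a tissue sample could be compared to the capsid pool by calling `log2_fold_change(norm["liver"], norm["capsid"])` on the normalized counts, where the sample names are hypothetical.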

[00124] Biophysical analysis of AAV capsids

[00125] The biophysical characteristics of the inserted peptides were calculated using the "ProteinAnalysis" module within the Biopython Python package. A variant was considered a successful packager if it had higher abundance in the capsid particles compared to the plasmid pool. Support vector machine training and visualization was accomplished via the "svm" module within the sklearn Python package (96). UMAP projection of peptide biophysical characteristics was accomplished via the "plot" functionality within the UMAP Python package. All default parameters were used for the visualization. Boxplots and hexbin plots were generated using the matplotlib and seaborn Python packages.
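
By way of non-limiting illustration, the packaging label and two example peptide features can be sketched in plain Python as follows. This is a dependency-free stand-in and not the Biopython-based analysis described above: the function names are hypothetical, the hydropathy values are the Kyte-Doolittle scale, and the charge estimate is a deliberately crude assumption that ignores histidine and terminal groups.

```python
# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide):
    """Grand average of hydropathy: mean Kyte-Doolittle value over residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def net_charge_simple(peptide):
    """Crude net charge: +1 per K/R and -1 per D/E (His and termini ignored)."""
    positive = sum(aa in "KR" for aa in peptide)
    negative = sum(aa in "DE" for aa in peptide)
    return positive - negative


def is_successful_packager(capsid_abundance, plasmid_abundance):
    """A variant packages successfully if it is more abundant in capsids than plasmid."""
    return capsid_abundance > plasmid_abundance


def feature_row(peptide, capsid_abundance, plasmid_abundance):
    """Assemble one (features, label) pair of the kind a classifier could be trained on."""
    features = [gravy(peptide), net_charge_simple(peptide)]
    label = int(is_successful_packager(capsid_abundance, plasmid_abundance))
    return features, label
```

Rows produced this way could be passed to a classifier such as a support vector machine; the choice of only two features here is purely for brevity.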

[00126] Identifying significantly enriched variants in each tissue

[00127] To identify variants which successfully transduce each tissue, for each variant a one-sample t-test was applied, comparing the abundance in the capsid particles to the abundance in the tissue. Resulting p-values were adjusted for multiple hypothesis testing via the Benjamini-Hochberg procedure. A variant was considered a significant transducer of an organ if it had an FDR adjusted p-value < 0.05 and a log2FC > 1 in both replicates. When choosing variants for validation experiments, variants were prioritized that had inserted peptides which were identified as hits in multiple capsid/loop contexts, and variants for which similar inserted peptides infecting the same organ were identified.
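
By way of non-limiting illustration, the multiple-testing adjustment and hit-calling rule described above can be sketched in Python as follows. This is a minimal sketch, not the analysis code of the disclosure: the function names are hypothetical, and the per-variant p-values are assumed to have been computed already (the t-test itself is not reproduced here).

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_hits(variants, pvalues, log2fc_rep1, log2fc_rep2,
              fdr=0.05, min_log2fc=1.0):
    """Return variants with adjusted p < fdr and log2FC > min_log2fc in both replicates."""
    adjusted = benjamini_hochberg(pvalues)
    return [v for v, q, a, b in zip(variants, adjusted, log2fc_rep1, log2fc_rep2)
            if q < fdr and a > min_log2fc and b > min_log2fc]
```

The thresholds default to the values stated in this paragraph and can be changed to apply a different hit definition.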

[00128] Visualizing tissue transduction from pooled screen

[00129] Heatmaps for visualizing AAV transduction were generated using the 'clustermap' function within the seaborn Python package. Rows and columns were ordered via the scipy 'optimal_leaf_ordering' function to minimize the euclidean distance between adjacent leaves of the dendrogram. UMAP projections visualizing AAV tissue specificity were generated by embedding the tissue level log2 fold change into two dimensions via the "plot" functionality within the UMAP Python package. All default parameters were used for generating the embedding. The variants were colored by the organ in which they had the max log 2 fold change.

[00130] Assessing accuracy of predicted AAV variant tropism

[00131] For each variant which was individually validated, the accuracy of both positive and negative predictions of tissue infectivity was assessed. For variants predicted to target a specific organ, a prediction was considered accurate if the individual validations showed greater than 50% of wild-type AAV9 infectivity in that organ. For variants predicted not to target a specific organ, a prediction was considered accurate if the individual validations showed less than 50% of wild-type AAV9 activity.
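
By way of non-limiting illustration, the accuracy rule described above can be sketched in Python as follows. This is a minimal sketch: the function name, the dictionary layout, and the variant and organ labels in the usage note are hypothetical, and validated infectivity is assumed to be expressed as a fraction of wild-type AAV9 in the same organ.

```python
def prediction_accuracy(predicted_targeting, fraction_of_aav9, threshold=0.5):
    """Fraction of (variant, organ) predictions confirmed by individual validation.

    predicted_targeting: {(variant, organ): True if predicted to target the organ}
    fraction_of_aav9:    {(variant, organ): validated infectivity / wild-type AAV9}
    """
    correct = 0
    for key, predicted in predicted_targeting.items():
        observed = fraction_of_aav9[key]
        if predicted:
            correct += observed > threshold   # positive prediction confirmed
        else:
            correct += observed < threshold   # negative prediction confirmed
    return correct / len(predicted_targeting)
```

For example, with hypothetical entries such as `{("variant_1", "heart"): True}` and a validated value of 1.8 for that pair, the prediction would count as accurate because 1.8 exceeds 0.5.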

[00132] Peptide Distance Projections

[00133] To calculate the Levenshtein distance between inserted peptides, the "levenshtein" function from the Python package "rapidfuzz" was used with default parameters (100). After building the pairwise distance matrix between all significantly enriched peptides, the matrix was projected into two dimensions via UMAP with metric="precomputed", n_neighbors=1500, and min_dist=0.1. Clusters of peptides with similar functions were then hand annotated onto the resulting plot.
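
By way of non-limiting illustration, the pairwise distance matrix described above can be sketched in plain Python as follows. This is a dependency-free stand-in for the rapidfuzz call and is considerably slower; the function names are hypothetical, and the resulting matrix is of the kind that could be passed to an embedding method accepting a precomputed metric.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (unit-cost insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def pairwise_distance_matrix(peptides):
    """Symmetric matrix of Levenshtein distances between all peptide pairs."""
    n = len(peptides)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = levenshtein(peptides[i], peptides[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

Because the computation is quadratic in the number of peptides, an optimized library would be preferable for libraries of the size described herein.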

[00134] Convolutional Neural Networks

[00135] To train convolutional neural networks (CNNs) to predict the tissue specificity of AAVs, the AA sequences of the inserted peptides were converted to a one-hot encoding via the "get_dummies" function from the pandas Python package. Among the significantly enriched variants, a variant was considered a transducer of a given organ if the log2FC relative to the capsid in both replicates was greater than 0. The data was then randomly split into training (2/3) and validation (1/3) datasets. For each variant, the one-hot encoding was reshaped to a 20x20 matrix with rows indicating residue positions, and columns indicating the presence or absence of a particular amino acid. The model architecture was instantiated via a Keras sequential model (102). In brief, a convolutional layer (Conv1D) with 32 filters, a kernel size of 3 and "relu" activation was fed into a max pooling layer (MaxPool1D) with pool size of 2. These layers were followed with another set of convolutional and max pooling layers, this time with 64 filters in the convolutional layer. These layers were followed with a dense layer with units=20. Finally, a dropout layer was added with a dropout rate of 0.5. A flattening layer and final dense layer (with sigmoid activation) were then used to output the resulting class probabilities. A separate independent model was trained for each organ. When training the models, the classes (infective versus non-infective variants) were weighted proportionally to the inverse of the number of class examples. When calling 'model.fit()' to train the CNN, a dictionary describing the class weights was passed via the 'class_weight' parameter. Model performance was evaluated via accuracy, area under the receiver operating characteristic curve (AUROC), F1-score, and Matthews Correlation Coefficient (MCC). Metrics were calculated via built-in Keras functions, and plotted via matplotlib. To assess how model accuracy changes as a function of the edit distance from the training data, the Levenshtein distances between peptides in the testing and training datasets were calculated using the 'rapidfuzz' Python package as above.
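
The input encoding and class weighting described above can be sketched as follows. This is an illustrative, standard-library-only sketch, not the pandas/Keras code used in the disclosure: the column order in `AMINO_ACIDS`, both function names, and the normalization n_total / (n_classes x n_class) are assumptions (the text states only that weights were proportional to the inverse class counts).

```python
from collections import Counter
from typing import Dict, List, Sequence

# Assumed column order: the 20 canonical residues, alphabetical by one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot_peptide(peptide: str) -> List[List[int]]:
    """Return a len(peptide) x 20 matrix: rows are positions, columns are amino acids."""
    index = {aa: k for k, aa in enumerate(AMINO_ACIDS)}
    matrix = []
    for residue in peptide:
        row = [0] * len(AMINO_ACIDS)
        row[index[residue]] = 1  # raises KeyError for non-canonical residues
        matrix.append(row)
    return matrix


def inverse_frequency_class_weights(labels: Sequence[int]) -> Dict[int, float]:
    """Weight each class proportionally to the inverse of its number of examples."""
    counts = Counter(labels)
    n_total = len(labels)
    n_classes = len(counts)
    return {cls: n_total / (n_classes * count) for cls, count in counts.items()}
```

A dictionary of this shape, keyed by class label, is what would be supplied through the 'class_weight' parameter mentioned above; a 20-residue peptide yields the 20x20 input matrix.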

[00136] Transmission electron microscopy (TEM)

[00137] To obtain transmission electron microscopy images of select AAV variants, 20 µL of the AAV solution was applied to formvar/carbon coated EM grids. After washing 3 times with water, the sample-containing grid was negative-stained for 1 minute using a solution of 2% uranyl acetate in water and blot-dried. The EM grids with each AAV variant were then imaged at 68,000X magnification using the FEI Tecnai Spirit G2 BioTWIN transmission electron microscope operated at 80 keV.

[00138] Mechanism knockout experiments

[00139] sgRNA sequences targeting LRP6 or non-targeting controls were identified using CRISPick and cloned into the lentiCRISPR v2 plasmid backbone. Lentivirus was then produced as described in (33). Briefly, HEK293T cells were seeded at ~40% confluency the day before transfection. The day of transfection, Opti-MEM reduced serum media was mixed with Lipofectamine 2000 (Thermo Fisher), 3 µg of pMD2.G plasmid, 12 µg of pCMV deltaR8.2 plasmid, and 9 µg of the respective lentiCRISPR v2 plasmid and added drop-wise onto the HEK293T cells following a 30 minute incubation period. 48 hours after transfection, the media was collected and replaced with fresh DMEM with 10% FBS. At 72 hours post-transfection, the supernatant containing viral particles was collected again, pooled, and concentrated to 1 mL using Amicon-15 centrifugal filters with a 100,000 NMWL cutoff (EMD Millipore). For lentiviral transduction, HEK293T cells were seeded in a 12-well plate at ~20% confluency the day before transduction. The day of transduction, lentivirus-containing DMEM with 8 µg/mL polybrene was added to the cells. The media was then replaced 24 hours later and then changed into puromycin (2 µg/mL) containing DMEM 28 hours post-transfection. Post selection, once the cells reached confluency, they were passaged into a 24-well plate at ~40% confluency. 24 hours later, DMEM containing either AAV9 (4 x 10^9 viral genomes) or AAV9.DKK1 (1 x 10^9 viral genomes) was added to the cells. The cells were then collected 24 hours later and flow cytometry was performed to quantify mCherry transgene expression.

[00140] Identifying immune orthogonal AAV capsids

[00141] Potential AAV orthologs were curated by first identifying sequences which exhibited similarity to the AAV2 cap gene using the National Center for Biotechnology Information (NCBI) basic local alignment search tool (BLAST). Next, sequences that were incomplete, heavily truncated, or highly homologous to one another were filtered out. Viruses within the human AAV clade and those from non-mammalian hosts were then filtered out. Finally, sequences with high similarity to previously identified human serotypes were removed, resulting in the 23 potential AAV orthologues assessed.

[00142] Immune orthogonal AAV production

[00143] To clone computationally identified AAV orthologs, the capsid sequences were codon-optimized and cloned downstream of the AAV2 rep gene using Gibson Assembly. Immune orthogonal AAV capsids were produced as described for the AAV variant validation capsids and AAV production titer was measured via qPCR with primers binding to the AAV ITR region. Any ortholog with a production titer within a power of 10 of AAV5 was considered to have successfully packaged.

[00144] Immune orthogonal AAV in vivo transduction

[00145] Immune orthogonal AAV capsids with sufficient packaging titer were then injected retro-orbitally into C57BL/6 mice at a dose of 1 x 10^12 viral genomes/mouse. Livers were harvested 3 weeks post-injection and total RNA was isolated as described above. cDNA was then generated and transgene expression was quantified via qPCR using the iTaq Universal SYBR green supermix and primers binding to the mCherry transcript. mCherry transgene expression was then normalized to GAPDH and the relative expression was compared to AAV5.
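
The text does not state how the normalization to GAPDH and the comparison to AAV5 were computed. One common convention for such qPCR data is the 2^-ddCt calculation, sketched below purely as an assumption; the function name and arguments are hypothetical, and the formula presumes approximately 100% amplification efficiency for both primer sets.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        ct_target_control: float, ct_reference_control: float) -> float:
    """Fold change of a target transcript versus a control sample (2^-ddCt).

    ct_target / ct_reference: Ct values of the transgene (e.g. mCherry) and the
    housekeeping gene (e.g. GAPDH) in the sample of interest.
    ct_target_control / ct_reference_control: the same two Ct values in the
    control sample (e.g. an AAV5-injected animal).
    """
    delta_sample = ct_target - ct_reference                    # normalize sample
    delta_control = ct_target_control - ct_reference_control  # normalize control
    return 2.0 ** -(delta_sample - delta_control)
```

A returned value of 1.0 corresponds to expression equal to the control; lower Ct in the sample relative to the control gives values above 1.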

[00146] Assessing immune cross-reactivity

[00147] The assay to assess antibody cross-reactivity of the identified immune orthogonal capsids was performed as previously described. Prior to injection, serum was collected via tail snip procedure and then the mice were injected with 1 x 10^12 viral genomes/mouse of AAV or PBS in triplicate. 3 weeks later, serum was collected from each of the mice and the antibody cross-reactivity ELISA was performed. For this, 1 x 10^9 viral genomes of AAV8, AAV MM2, AAV MG1, AAV MG2, or AAV CHI were diluted in a 1x coating buffer and incubated overnight in each well of 96-well Nunc MaxiSorp plates. Plates were washed three times for 5 mins with 1x wash buffer (Bethyl) and blocked with 1x BSA blocking buffer (Bethyl) for 2 hours at room temperature. The wells were then washed again and serum samples were added at a 1:40 dilution. Plates were incubated for 5 hours at 4°C with shaking. Wells were washed 3 times and 100 µL of HRP-labeled goat anti-mouse IgG1 (Bethyl; diluted 1:100,000 in 1% BSA) was added to each well. Secondary antibody was incubated for 1 hour at room temperature, wells were washed 3 times, and 100 µL of TMB substrate was added to each well. Optical density at 450 nm was measured using a microplate absorbance reader (BioRad iMark).

[00148] Engineering immune orthogonal AAV variants

[00149] To engineer the immune orthogonal 'AAV MG2' capsid, the AAV9 capsid amino acid sequence was aligned to the MG2 sequence using Clustal Omega (108). The PDGFC peptide coding sequence was then ligated into the 'AAV MG2' vector at the appropriate Loop 1 location using the same protocol as for cloning peptides into AAV9. The engineered 'AAV MG2.PDGFC' vector capsid was then assayed in vivo identically to the above validation experiments using AAV9.

[00150] Ethical Compliance

[00151] All experiments involving live vertebrates performed at UCSD were done in compliance with ethical regulations approved by the UCSD IACUC committee.

[00152] Statistics

[00153] Unless otherwise noted, differences in means were calculated via an unpaired t test. When necessary, p values were adjusted for multiple hypothesis testing via the Benjamini-Hochberg procedure (99). For all Figures: unless otherwise noted *p<0.05, **p<0.01, ***p<0.001, ****p<0.0001.
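
For reference, the Benjamini-Hochberg adjustment named above can be written in a few lines. The following is a minimal sketch of the standard step-up procedure, not the specific software used for the disclosure; the function name is hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending by p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that adjusted values
    # are monotone: each is the minimum of p * m / rank over equal or higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

An adjusted value below the chosen threshold (for example 0.05) then controls the false discovery rate at that level across the set of tests.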

[00154] A systematic library of AAV variants displaying fragmented proteins

[00155] To create the libraries, AAV5 and AAV9 were chosen as the starting serotypes. This was due to their established clinical utility as well as two key characteristics: one, AAV5 is more evolutionarily distant from other AAV serotypes in clinical use, and has previously been shown to be immune orthogonal to other prevalent AAV serotypes (thereby enabling their sequential redosing); and two, AAV9 has been used extensively in clinical trials and has been shown to cross the blood-brain barrier, outperforming other AAV serotypes in most tissues. To generate the library of diverse AAV variants, a DNA oligonucleotide pool of 275,298 gene fragments was generated (Fig. 1a-b). Each gene fragment coded for a 20 amino acid peptide derived from the coding sequence of ligands with known extracellular receptors, or a gene predicted to have cell-penetrating or internalizing properties (Fig. 1a-b). Protein ligands were sourced from the Guide to Pharmacology database, an expertly curated list of pharmacological targets and their associated ligands, and cell-penetrating/internalizing functionality was inferred through text mining of UniProt entries. Examples of protein classes identified as having potential internalizing function included toxins, histones, granzymes, viral receptor binding domains, and nuclear localization signal (NLS) domains. Mouse and human genomes share 80% of their protein coding genes, with 85% amino acid sequence identity between orthologs. These genomic similarities extend to the receptors for ligands in the libraries (e.g., 86% of mouse GPCR proteins have human orthologs), implying many human ligands will be similarly functional in both human and mouse contexts. After synthesizing the pool of single-stranded oligonucleotides coding for these gene fragments, it was amplified to double-stranded DNA via PCR, digested, and ligated into two distinct locations on the AAV5 and AAV9 cap genes. 
Seamless cloning was enabled by type IIS restriction enzymes cutting outside their recognition sequence. PaqCI sites engineered to yield compatible overhangs (coding for glycine-serine 2 amino acid linkers) were inserted at the ends of the peptide coding DNA library and on the AAV5/AAV9 cap plasmid DNA at sites coding for two distinct surface loops, hereafter referred to as loop 1 and loop 2. Surface loop 1 (AA 443 and 456 on AAV5 and AAV9, respectively) and loop 2 (AA 576 and 587 on AAV5 and AAV9, respectively) were chosen as peptide insertion locations due to their distance from the viral particle core, facilitating potential receptor engagement (Fig. 1b). While many previous AAV engineering efforts have focused on inserting peptides at surface loop 2, loop 1 sites have been shown to accommodate large insertions including even a full length fluorescent protein. Collectively, four libraries of variants were produced, spanning two AAV capsids, with two loop insertion sites each. In addition to protein coding gene fragments, 444 stop codon containing gene fragments were included as negative controls. This defined library synthesis methodology was adopted to enable quantitative inference of variant packaging and transduction efficiencies. The starting plasmid libraries were sequenced to establish initial relative variant abundances, and packaging efficiencies were quantified via comparison to this initial baseline.
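
The tiling of a ligand protein into 20 amino acid peptides can be illustrated as below. This is a sketch under stated assumptions: the function name is hypothetical, and the `step` between consecutive tiles is a free parameter here because the offset used to build the library is not specified in this passage.

```python
from typing import List


def tile_protein(sequence: str, peptide_length: int = 20, step: int = 1) -> List[str]:
    """Split a protein sequence into overlapping fixed-length peptides.

    `step` (assumed, not taken from the disclosure) is the offset in residues
    between the start positions of consecutive tiles.
    """
    if peptide_length <= 0 or step <= 0:
        raise ValueError("peptide_length and step must be positive")
    last_start = len(sequence) - peptide_length
    # A sequence shorter than peptide_length yields an empty list.
    return [sequence[start:start + peptide_length]
            for start in range(0, last_start + 1, step)]
```

Smaller steps give more heavily overlapping tiles, which is what allows neighbouring peptides from the same ligand region to act as internal replicates in the screen.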

[00156] Biophysical drivers of AAV capsid formation

[00157] To quantify how well different AAV cap variants package into functional capsids, recombinant AAV particles were generated with the engineered AAV5 and AAV9 cap plasmid libraries via transient triple transfection of HEK293T cells (Fig. 2a). These viral particles were treated with benzonase to degrade residual plasmid DNA, and then subjected to next generation sequencing (NGS) to quantify relative variant abundance. Packaging efficiency was quantified by ranking AAV variants by the log2 fold change (log2FC) of their relative capsid abundance compared to their count in the plasmid pool (Fig. 2b). Utilizing this method, over 250,000 AAV variants were identified which package efficiently into functional AAV capsids with a log2FC > 0. To validate this packaging metric quantified from screening data, 25 AAV capsids were produced, including 23 identified as successful packagers (log2FC > 0) and 2 identified as non-packagers (log2FC < 0). In these individual validations, the 2 non-packagers yielded >10-fold lower titer than those identified as packagers, thus providing confidence in the AAV packaging metric. Consistent with their disruption of the AAV capsid structure, there was also a depletion of non-functional stop codon control AAV variants in the capsid pool, and importantly this confirmed the lack of library cross-packaging during AAV production (Fig. 2c).

[00158] To better understand what features drove successful capsid formation, the biophysical characteristics of the inserted peptides that yielded AAV variants that packaged successfully were analyzed. It was found that peptide charge, flexibility, alpha helical content, and hydrophobicity were all significantly different in packaging AAV variants versus AAV variants unable to package (Fig. 2d). The set of successfully packaged variants had a narrower charge distribution than the variants unable to package, suggesting peptides with extreme charge densities have a negative impact on capsid formation. Successfully packaging AAV variants also had inserted peptides with higher flexibility, lower alpha-helical content, and lower hydrophobicity than the variants unable to package. The observed depletion of hydrophobic peptide displaying variants is consistent with the solvent exposed nature of the AAV surface loops.

[00159] To build an integrated model predicting if AAV variants will package based on the biophysical features of the inserted peptides, a support vector machine classifier was trained using the charge, flexibility, alpha helical content, and hydrophobicity of the peptides in the dataset (Fig. 2e). 
While all of these biophysical features were significantly different when comparing packaging versus non-packaging AAV variants, the magnitude of this difference was relatively modest for each individual feature (Fig. 2d). However, collectively these features were sufficient to train a model which could differentiate between packaging and non-packaging AAV variants (area under the receiver operating characteristic curve = 0.89). For each AAV variant, embedding the inserted peptide's charge, flexibility, alpha helix content, and hydrophobicity into two dimensions using uniform manifold approximation and projection (UMAP) enabled visualization of this class separability (Fig. 2f). The resulting embedding laid out AAV variants into distinct clusters, indicating that while each underlying biophysical feature is continuous, there are separable groups of AAV variants with similar biophysical features. AAV variants which package tended to cluster with other packaging variants in this unsupervised embedding, further supporting the predictive power of these four biophysical features. This thorough quantification and analytical framework for assessing AAV packaging, which is enabled by the diversity and length of the inserted peptide library, is of critical translational importance for identified AAV variants, as AAV production costs are directly related to their packaging capability.

[00160] High-throughput mapping of engineered AAV tissue tropism

[00161] Having produced libraries of recombinant AAV particles packaging their own cap genes, the four viral pools were next injected into C57BL/6 mice in duplicate (Fig. 3a). After two weeks, mice were sacrificed and the peptide-containing region of the AAV cap gene was amplified from DNA isolated from mouse liver, kidney, spleen, brain, lung, heart, skeletal muscle, intestine, and pancreas. The relative abundance of each AAV variant was quantified via NGS, and the log2 enrichment was calculated for each variant relative to its abundance in the capsid pool, with good correlation observed between replicates (Fig. 3b). Using this data, over 15,000 variants were identified which efficiently infected at least one mouse tissue (Fig. 3c). The spleen and liver were the most frequent tissue targets of the infectious variants, consistent with the established wild-type (WT) tropism of AAV5 and AAV9 towards the liver, as well as more recent research showing AAV5 and AAV9 readily transduce the spleen. The fewest AAV variants were identified as targeting the skeletal muscle and brain, in line with the high therapeutic AAV doses needed to achieve clinical efficacy for muscle targeting gene therapies, and the challenge of delivery across the blood-brain barrier. A substantial number of the AAV variants contained the same peptide inserted across different AAV capsids and insertion sites, giving credence to the hypothesis that tropism reprogramming is, at least partially, peptide-specific (Fig. 3c).
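
The enrichment metric used for both packaging (capsid pool versus plasmid pool) and transduction (tissue versus capsid pool) is a log2 fold change of relative abundances. A minimal sketch follows; the function name, the dictionary-based interface, and the `pseudocount` used to avoid taking the logarithm of zero are assumptions for illustration and are not described in the text.

```python
import math
from typing import Dict


def log2_fold_change(counts_after: Dict[str, int],
                     counts_before: Dict[str, int],
                     pseudocount: float = 1.0) -> Dict[str, float]:
    """log2 ratio of each variant's relative abundance after versus before selection.

    counts_before: read counts per variant in the starting pool.
    counts_after: read counts per variant in the selected pool; variants that
    are absent are treated as having zero reads.
    """
    n = len(counts_before)
    total_before = sum(counts_before.values()) + pseudocount * n
    total_after = sum(counts_after.get(v, 0) for v in counts_before) + pseudocount * n
    result = {}
    for variant, before in counts_before.items():
        freq_before = (before + pseudocount) / total_before
        freq_after = (counts_after.get(variant, 0) + pseudocount) / total_after
        result[variant] = math.log2(freq_after / freq_before)
    return result
```

Under this convention a variant with log2FC greater than 0 is over-represented after selection, matching the packaging and transduction thresholds used above.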

[00162] Towards mapping and understanding the tissue-transduction patterns mediated by the inserted peptides, the feasibility of training predictive models linking inserted peptide sequence to tissue tropism was analyzed. Inspired by contemporary work using convolutional neural networks (CNN) to predict antibody specificity, a CNN multi-label classifier was trained to predict AAV tissue tropism using one-hot encoded inserted peptide sequences as input features (Fig. 3d). To evaluate performance, the model was trained using a random selection of the significantly enriched AAV variants, and evaluated on the testing dataset. This CNN model architecture had good performance across all organs, with a minimum accuracy of 72% in the kidney (Fig. 3d). The highest F1 scores and Matthews Correlation Coefficients (MCC) were observed for the liver and spleen, likely due to the high number of liver and spleen targeting variants identified in the pooled screen (Fig. 3c). It was also observed that for the liver and spleen, the CNN had only a modest reduction in F1 scores when evaluated on peptides very different (greater than 10 edits away) from any training examples, implying the models had learned generalizable features relevant to AAV transduction. For other organs, the models were much less accurate when predicting the tissue tropism of peptides very different from the training data, likely due to the reduced number of positive training examples. Collectively, the ability to predict AAV tropism supports the conclusion that the inserted peptides are mediating retargeting of tissue-tropism, and that there is a learnable relationship between peptide sequence and tissue-tropism.
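
For readers less familiar with the two class-imbalance-aware metrics reported above, the sketch below computes the F1 score and MCC from binary labels using their standard definitions. It is illustrative only (the disclosure states that built-in Keras functions were used), and the function name and the convention of returning 0.0 when a denominator is zero are assumptions.

```python
import math
from typing import Sequence, Tuple


def f1_and_mcc(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (F1 score, Matthews correlation coefficient) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0

    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0
    return f1, mcc
```

Unlike accuracy, both metrics stay low for a model that simply predicts the majority class, which is why they are informative for organs with few positive examples.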

[00163] Engineered AAV variants with clinically relevant tissue tropism

[00164] As with the WT scaffolds from which they were derived, transduction of multiple organs is a near ubiquitous phenotype among the infectious AAV variants identified (Fig. 4a). Significant transduction of the liver and spleen was observed for the majority of infectious variants regardless of which other organs were co-transduced. This is true even for variants with insertions in surface loops known to be involved in WT capsid receptor binding. Although liver and spleen targeting were near ubiquitous, it was possible to identify variants which specifically target the liver/spleen plus one other organ, as well as variants which transduced all tissues at high levels (Fig. 4a). When variants were hierarchically clustered based on their tissue detection levels, variants derived from the same sub-library tended to cluster together, suggesting that the tissue specificity of the wild-type scaffold was at least partially a determinant of engineered variant tropism. Hierarchical clustering of the organ samples resulted in replicates clustering together, giving further confidence in the reliability of the screen results. To visualize the overall screen results, the tissue detection levels for each variant were embedded into two dimensions using UMAP, coloring the variants by the organ they most readily transduce (Fig. 4b). In this reduced dimensional space, organ specific clusters can be readily identified, with the liver and spleen targeting variants especially prominent.

[00165] To confirm the tissue tropism of the novel AAV variants that were identified via the pooled screen, 21 variants were individually produced and validated by quantifying their ability to package and deliver an mCherry transgene in vivo (Fig. 4c). All 21 variants were significantly enriched in at least one organ, and variants which were internally consistent within the screening data were prioritized for validation, with consistent AAV capsids defined as hits for which other variants with similar inserted peptides were also enriched in the same organ. Variants were initially characterized by in vivo quantification of tissue tropism at the mRNA level (via RT-qPCR quantification of mCherry transgene expression) (Fig. 4c). The tissue tropism of the variants largely recapitulated the screen predictions (Fig. 4c), with 74.3% of the tissue tropism predictions matching expectations. AAV variants were identified which specifically targeted hard to infect organs such as the muscle, lung, and brain, while simultaneously de-targeting away from the liver (Fig. 4c). Notably, protein level quantification of mCherry delivery to the liver, quantified via fluorescent microscopy, confirmed excellent concordance between mRNA and protein measures of tissue transduction (R^2 = 0.97) (Fig. 4d). Altogether, across the variants individually tested, 9/21 variants were found to exceed AAV9 infectivity in at least one organ. Variants were identified which exceed AAV9 infectivity in all organs except the liver and pancreas, which had max relative transduction of 98.8% of WT AAV9 and 82.2% of WT AAV9, respectively. Additionally, 18/21 variants had less than half the liver transduction of WT AAV9, with three variants below 5% of AAV9 liver transduction levels (Fig. 4d), indicating clinically important liver de-targeting. To increase confidence in the fidelity of the variants individually validated in C57BL/6 mice, 3 of the above 21 variants were individually validated in BALB/c mice. Notably, their relative tropism was consistent across the two mouse strains.

[00166] Mechanistic insights into AAV reprogramming

[00167] Having confirmed the efficacy of the AAV variants, experiments were performed to explore the mechanisms underlying their reprogrammed tropism, and in particular to investigate the hypothesis that peptides derived from distinct ligand domains, when displayed on AAV variants, could indeed drive their transduction patterns. First, experiments were performed to interrogate how peptides derived from specific ligand protein regions alter AAV tropism. This was motivated by the observation that tiled peptide enrichment patterns in a screen can yield insight into functional domains of the proteins from which they were derived. To do this, tissue-specific AAV variants identified in the screen were interrogated, focusing first on a lung-specific variant (termed AAV9.DKK1) that was derived via display of a peptide fragment from the DKK1 gene on the AAV9-Loop2 scaffold. Specifically, as shown in Fig. 5a (top left), significantly enhanced lung transduction was observed from peptides derived from a specific domain in the N-terminus of the protein. Upon confirming via electron microscopy that this variant is packaged into a functional capsid (Fig. 5a, top right), its transduction profile was next quantified in vivo. This confirmed that AAV9.DKK1 was indeed an efficient lung transducing variant with greater than 2-3 fold higher expression than AAV9, and with consistent de-targeting across all other organs compared to AAV9 when quantified at both the RNA and protein level (Fig. 5a, bottom left and right).

[00168] Next, experiments were conducted on the DKK1 displayed peptide to gain insight into the mechanism of transduction. Interestingly, the N-terminal region of lung enrichment (Fig. 5a, top left) was centered on an evolutionarily conserved, linear peptide motif mediating binding to the low density lipoprotein receptor-related proteins 5 and 6 (LRP5/6). As HEK293T cells robustly express LRP6, lentiviral-mediated CRISPR-Cas9 was utilized to disrupt LRP6 expression in these cells to investigate the relative transducibility of the AAV9.DKK1 variant in this system (Fig. 5b). Specifically, cells transduced with either non-targeting control (NTC) guides or LRP6 targeting guides were subsequently transduced with either AAV9.DKK1 or wild-type AAV9. Using flow cytometry to quantify AAV delivery of the mCherry transgene, a significant reduction in the infectivity of AAV9.DKK1 was observed in the LRP6 knockout population, with no corresponding reduction in the infectivity of wild-type AAV9 (Fig. 5b). The LRP6 dependent in vitro infectivity of AAV9.DKK1, combined with the known interaction between LRP6 and the N-terminus of DKK1 (see crystal structure, Fig. 5b) and the known robust expression of LRP6 in lung alveolar cells, together give strong support to the hypothesis that LRP6 mediates, at least in part, the reprogrammed tropism of AAV9.DKK1.

[00169] Building on these observations, and focusing on the brain as an exemplar, experiments were performed to examine more broadly the extent to which the inserted peptides were mediating retargeting. First, a distance matrix quantifying the similarities between all peptide hits identified as significantly enriched in at least one organ was generated (Fig. 6a). Projecting this distance matrix into two dimensions allowed for visualization of distinct clusters of homologous inserted peptides which infect the brain. It was found that families of similar inserted peptides yielded similar transduction rates (Fig. 6a). This effect was not limited to a particular capsid or insertion site, insofar as inserted peptides were often functional across all tested capsids and insertion sites (Fig. 6a). The observation that inserted peptides yielded consistent phenotypes across multiple capsids suggests that re-targeting is directly due to a peptide mediated mechanism. This result also highlights the power of peptide tiling library designs, insofar as having multiple overlapping peptides with similar sequences can function as internal controls, imparting confidence in the identification of particular AAV variant hits.

[00170] Experiments were performed to specifically interrogate two brain targeting variants (AAV5.APOA1 and AAV9.APOA1) which contained the same inserted apolipoprotein A1 (APOA1) derived peptide. Both variants were confirmed to package into functional capsids by electron microscopy (Fig. 6b). Across the entire APOA1 protein, functional APOA1-derived peptides were primarily from a tandem repeat region at the C-terminus of the protein (Fig. 6b). Highlighting the peptide-specific effect of altered AAV tropism, strong detargeting away from the liver, at less than 2% of AAV9 levels, was observed for both variants, with unperturbed brain transduction relative to wild-type AAV9 (Fig. 6c). This was especially remarkable for AAV5.APOA1, as the engineered variant was able to overcome the limited transduction of its parental AAV serotype across the blood-brain barrier. The improved specificity of AAV5.APOA1 was subsequently shown to be strain independent, as it retained both its brain targeting and liver detargeting functionality in BALB/c mice (Fig. 6d).

[00171] Engineering re-targeted immune orthogonal AAV vectors

[00172] To generate additional variants with clinically relevant features, the AAV engineering approach was expanded to include novel capsids beyond AAV5 and AAV9. In particular, the potential of engineering the tropism of highly divergent AAV orthologs was explored. This was motivated by the fact that one of the biggest challenges with AAV therapy is the associated immune responses: either preexisting immunity, which currently excludes a significant fraction of the human population from being eligible for any AAV therapy, or induced immunity upon AAV injection, which prevents subsequent AAV re-dosing. This makes it extremely difficult to create effective therapies, since it allows only a single opportunity to treat a patient, thus often mandating extremely high titers for a one-time therapy, which in turn can lead to dangerous levels of toxicity. Recently, the concept of immune orthogonality has been proposed to address this challenge. Specifically, it has been suggested that an orthologue, given sufficient sequence divergence, will not cross-react with the immune response generated by exposure to other orthologues, thereby allowing re-dosing that avoids neutralization by existing antibodies or clearance of treated cells by activated cytotoxic T cells. While that study focused on commonly used AAV orthologs, experiments were performed here to explore highly divergent natural orthologs from the full spectrum of mammalian species, including those which have not yet been thoroughly tested for use as in vivo vectors. Towards this, a combined computational and experimental pipeline was established to identify novel immune-orthogonal AAV serotypes. First, a basic local alignment search tool (BLAST) was used to identify 687 capsids with sequence homology to the AAV2 cap gene. 
This initial list was then filtered to exclude truncated genomes, redundant samples, human and non-mammalian serotypes, as well as close orthologs, resulting in 23 AAV capsid sequences for experimental investigation (Fig. 7a). Analyzing the evolutionary distance of these potential AAV orthologs, it was confirmed that they exhibit high dissimilarity from all wild-type AAV capsids (except AAV5, which had previously been shown to be immune-orthogonal), with many exhibiting less than 60% sequence similarity to AAV2 (Fig. 7b). To investigate if these computationally predicted AAV orthologs were functional, their ability to package into capsids was examined. Comparing titers to AAV5, 11 of the 23 AAV capsids were identified with packaging titers within 10-fold of AAV5 (Fig. 7c). Next, to evaluate in vivo transduction potential, these 11 AAV capsids carrying an mCherry transgene were administered individually to C57BL/6 mice. Given that most AAV capsids have at least a degree of liver transducing potential, the livers of these mice were harvested 3 weeks post-infection. Of the 11 AAV capsids analyzed, 4 had detectable levels of mCherry in the liver, with 2 of them generating 3-fold higher mCherry expression than AAV5 (Fig. 7c).

[00173] Having identified 4 functional AAV capsids, their immune-orthogonality was characterized by injecting C57BL/6 mice with each of these and collecting serum at day 0 and day 21 timepoints (Fig. 7d). AAV8 was selected for comparison in this study due to its propensity for transducing the liver and its similar serological profile to other wild-type serotypes. Performing an ELISA at the 3-week timepoint confirmed that antibodies developed against the 4 studied immune-orthogonal AAVs indeed exhibited no immune cross-reactivity with AAV8 (Fig. 7d). This result highlights the utility of combining computational and experimental methods to identify and characterize novel AAV serotypes which could be repurposed to enable gene therapy redosing.

[00174] Next, experiments were performed to assess whether the tropism of the mined AAV immune-orthogonals could be re-engineered through insertion of one of the screen identified hit peptides. Because the muscle targeting AAV variants had broader tissue-tropism than the brain and lung targeting variants, insofar as they also readily infected the heart, lung, intestine, and spleen at levels comparable to, or exceeding, AAV9, it was hypothesized that grafting the peptide from one of these variants onto an AAV ortholog would result in enhanced muscle targeting with minimal effects on parental AAV infectivity. To assess this, AAV9.PDGFC was selected as a candidate variant, and its ability to form functional capsids (Fig. 6d) and enable efficient targeting of muscular tissue (with >10 fold expression above AAV9 as quantified by RT-qPCR) (Fig. 6d) was confirmed. The skeletal muscle and cardiac muscle (heart) transduction enhancement relative to AAV9 was also confirmed by protein level visualization (Fig. 6d). The 20-mer PDGFC derived peptide was then grafted onto one of the surface loops of the immune orthogonal AAV MG2 (Fig. 7e), and notably, similar enhanced muscle transduction of the engineered variant above its parental AAV was observed (Fig. 7e). Taken together, the result highlights the fidelity of the ligand tiling display approach and further lends credence to the exciting possibility that displayed peptides identified in the initial screening approach could be utilized to re-engineer a diverse clade of AAV serotypes via peptide transfer.

[00175] It will be understood that various modifications may be made without departing from the spirit and scope of this disclosure. Accordingly, other embodiments are within the scope of the following claims.