ARTIFICIALLY ENGINEERED PROTEIN HYDROGELS TO MIMIC NUCLEOPORIN SELECTIVE GATING

Title:

ARTIFICIALLY ENGINEERED PROTEIN HYDROGELS TO MIMIC NUCLEOPORIN SELECTIVE GATING

Document Type and Number:

WIPO Patent Application WO/2015/196101

Kind Code:

Inventors:

KIM MINKYU (US)
OLSEN BRADLEY D (US)

Application Number:

PCT/US2015/036739

Publication Date:

December 23, 2015

Filing Date:

June 19, 2015

Assignee:

MASSACHUSETTS INST TECHNOLOGY (US)

International Classes:

C12Q1/68

Attorney, Agent or Firm:

STEELE, Alan, W. et al. (155 Seaport Blvd., Boston, MA, US)

Claims:

CLAIMS

We claim:

1. A polypeptide comprising a plurality of contiguous instances of a subsequence represented by PAFSFGAKPDEKKDSDTSK (SEQ ID NO: 1).

2. The polypeptide of claim 1, wherein the polypeptide comprises 16 contiguous instances of the subsequence represented by SEQ ID NO: 1.

3. The polypeptide of claim 1, wherein the polypeptide consists of 16 contiguous instances of the subsequence represented by SEQ ID NO: 1.

4. The polypeptide of claim 1, further comprising a first leucine zipper domain endblock flanking the N-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO: 1.

5. The polypeptide of claim 1, further comprising a first leucine zipper domain endblock flanking the C-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO: 1.

6. The polypeptide of claim 4 or 5, wherein the first leucine zipper domain endblock consists of a pentameric coiled-coil domain (P domain).

7. The polypeptide of claim 6, wherein the P domain consists of the peptide represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

8. The polypeptide of claim 1, further comprising a first leucine zipper domain endblock flanking the N-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO: 1; and a second leucine zipper domain endblock flanking the C-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO: 1.

9. The polypeptide of claim 8, wherein the first leucine zipper domain endblock consists of a P domain.

10. The polypeptide of claim 9, wherein the P domain consists of the peptide represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

11. The polypeptide of claim 8, wherein the second leucine zipper domain endblock consists of a P domain.

12. The polypeptide of claim 11, wherein the P domain consists of the peptide represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

13. The polypeptide of claim 8, wherein the first leucine zipper domain endblock consists of a P domain; and the second leucine zipper domain endblock consists of a P domain.

14. The polypeptide of claim 13, wherein the first leucine zipper domain endblock P domain consists of the peptide represented by SEQ ID NO:3; and the second leucine zipper domain endblock P domain consists of the peptide represented by SEQ ID NO:3.

15. The polypeptide of claim 1, comprising the sequence represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKTSPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKTSAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:4).

16. A polypeptide comprising a plurality of contiguous instances of a subsequence represented by PAFSFGAKPDEKKDDDTSK (SEQ ID NO:2).

17. The polypeptide of claim 16, wherein the polypeptide comprises 16 contiguous instances of the subsequence represented by SEQ ID NO:2.

18. The polypeptide of claim 16, wherein the polypeptide consists of 16 contiguous instances of the subsequence represented by SEQ ID NO:2.

19. The polypeptide of claim 16, further comprising a first leucine zipper domain endblock flanking the N-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO:2.

20. The polypeptide of claim 16, further comprising a first leucine zipper domain endblock flanking the C-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO:2.

21. The polypeptide of claim 19 or 20, wherein the first leucine zipper domain endblock consists of a pentameric coiled-coil domain (P domain).

22. The polypeptide of claim 21, wherein the P domain consists of the peptide represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

23. The polypeptide of claim 16, further comprising a first leucine zipper domain endblock flanking the N-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO: 1; and a second leucine zipper domain endblock flanking the C-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO: 1.

24. The polypeptide of claim 23, wherein the first leucine zipper domain endblock consists of a P domain.

25. The polypeptide of claim 24, wherein the P domain consists of the peptide represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

26. The polypeptide of claim 23, wherein the second leucine zipper domain endblock consists of a P domain.

27. The polypeptide of claim 26, wherein the P domain consists of the peptide represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

28. The polypeptide of claim 23, wherein the first leucine zipper domain endblock consists of a P domain; and the second leucine zipper domain endblock consists of a P domain.

29. The polypeptide of claim 28, wherein the first leucine zipper domain endblock P domain consists of the peptide represented by SEQ ID NO:3; and the second leucine zipper domain endblock P domain consists of the peptide represented by SEQ ID NO:3.

30. The polypeptide of claim 16, comprising the sequence represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKTSPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKTSAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:5).

31. A polypeptide comprising a core sequence represented by

PSFSFGAKSDENKAGATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKPDENKASATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKSDEKKDSDSSKPAFSFGTKSNEKKDSGSSKPAFSFGAKPDEKKNDEVSKPAFSFGAKANEKKESDESKSAFSFGSKPTGKEEGDGAKAAISFGAKPEEQKSSDTSK (SEQ ID NO:6); and a first leucine zipper domain endblock flanking the N-terminal end or the C-terminal end of the core sequence.

32. The polypeptide of claim 31, wherein the first leucine zipper domain endblock flanks the N-terminal end of the core sequence.

33. The polypeptide of claim 31, wherein the first leucine zipper domain endblock flanks the C-terminal end of the core sequence.

34. The polypeptide of any one of claims 31 to 33, wherein the first leucine zipper domain endblock consists of a pentameric coiled-coil domain (P domain).

35. The polypeptide of claim 34, wherein the P domain consists of the peptide represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

36. The polypeptide of claim 32, further comprising a second leucine zipper domain endblock flanking the C-terminal end of the core sequence.

37. The polypeptide of claim 36, wherein the first leucine zipper domain endblock P domain consists of the peptide represented by SEQ ID NO:3; and the second leucine zipper domain endblock P domain consists of the peptide represented by SEQ ID NO:3.

38. The polypeptide of claim 31, comprising the sequence represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASDNKTTNTTPSFSFGAKSDENKAGATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKPDENKASATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKSDEKKDSDSSKPAFSFGTKSNEKKDSGSSKPAFSFGAKPDEKKNDEVSKPAFSFGAKANEKKESDESKSAFSFGSKPTGKEEGDGAKAAISFGAKPEEQKSSDTSKPAFTFGTSAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:7).

39. A nucleic acid molecule encoding the polypeptide of any one of the preceding claims.

40. An expression vector comprising the nucleic acid molecule of claim 39.

41. A cell, comprising the expression vector of claim 40.

42. A hydrogel, comprising the polypeptide of any one of claims 1 to 38.

43. A filtering device, comprising the hydrogel of claim 42; and a housing or support for the hydrogel.

44. A drug delivery device, comprising a drug; and the hydrogel of claim 42.

45. The drug delivery device of claim 44, wherein the drug is dispersed within the hydrogel.

46. The drug delivery device of claim 44, wherein the drug is effectively enveloped by the hydrogel.

47. A method of separating or selectively filtering macromolecules, comprising contacting a source of macromolecules with a hydrogel of claim 42.

48. The method of claim 47, wherein the macromolecule is selected from the group consisting of RNA, mRNA, DNA, proteins, glycoproteins, carbohydrates, lipids, toxins, and any combination thereof.

Description:

ARTIFICIALLY ENGINEERED PROTEIN HYDROGELS TO MIMIC NUCLEOPORIN SELECTIVE GATING

RELATED APPLICATION

This application claims benefit of priority to U.S. Provisional Patent Application No. 62/015,012, filed June 20, 2014.

GOVERNMENT SUPPORT

This invention was made with Government support under Contract No. W911NF-13-D-0001 awarded by the Army Research Office. The Government has certain rights in the invention.

BACKGROUND OF THE INVENTION

The entry and exit of large molecules from the eukaryotic cell nucleus is tightly controlled by nuclear pore complexes (NPCs). Although small molecules can enter and exit the nucleus without regulation, macromolecules such as RNA, mRNA, ribosomal proteins, DNA polymerase, lamins, carbohydrates, signaling molecules, and lipids require association with karyopherins called importins to enter the nucleus and exportins to exit.

Nuclear pore complexes are large protein complexes that span the nuclear envelope, which is the double membrane surrounding the eukaryotic cell nucleus. The proteins that make up the nuclear pore complex are known as nucleoporins.

Nucleoporins are only required for the transport of large hydrophilic molecules above 40 kDa, as smaller molecules pass through nuclear pores via passive diffusion. For example, nucleoporins play an important role in the transport of mRNA from the nucleus to the cytoplasm after transcription. Depending on their function, certain nucleoporins are localized to a single side of the nuclear pore complex, either cytosolic or nucleoplasmic. Other nucleoporins may be found on both faces.

There are three distinct types of nucleoporins, each having a unique structure and function. These three types are structural nucleoporins, membrane nucleoporins, and FG-nucleoporins.

Structural nucleoporins form the ring portion of the NPC. They span the membrane of the nuclear envelope and are often referred to as the scaffolding of a nuclear pore.

Structural nucleoporins come together to form Y-complexes that are composed of seven nucleoporins. Each nuclear pore contains sixteen Y-complexes for a total of 112 structural nucleoporins. Membrane nucleoporins are localized to the curvature of a nuclear pore. These proteins are embedded within the nuclear membrane at the region where the inner and outer leaflets connect.

FG-nucleoporins are so named because they contain repeats of the amino acid residues phenylalanine (F) and glycine (G). FG repeats are small hydrophobic segments that break up long stretches of hydrophilic amino acids. These FG-repeat segments are found in long random-coil portions of the protein which stretch into the channel of nuclear pores and are believed to be primarily responsible for the selective exclusivity of nuclear pore complexes. These segments of FG-nucleoporins form a mass of chains which allow smaller molecules to diffuse through but exclude large hydrophilic macromolecules. These macromolecules are only able to cross a nuclear pore if they are associated with a transport molecule (karyopherin) that temporarily interacts with a nucleoporin's FG-repeat segments. FG-nucleoporins also contain a globular portion that serves as an anchor for attachment to the nuclear pore complex.

Karyopherins and their cargo are passed between FG-repeats until they diffuse down their concentration gradient and through the nuclear pore complex. The release of their cargo from karyopherins is driven by Ran, a G protein. Ran is small enough that it can diffuse through nuclear pores down its concentration gradient without interacting with nucleoporins. Ran binds to either GTP or GDP and has the ability to change a karyopherin's affinity for its cargo. Inside the nucleus, RanGTP causes an importin karyopherin to change conformation, allowing its cargo to be released. RanGTP can also bind to exportin karyopherins and pass through the nuclear pore. Once it has reached the cytosol, RanGTP can be hydrolyzed to RanGDP, allowing the exportin's cargo to be released.

Adapting artificially engineered protein polymers from consensus repeats of natural proteins is an attractive approach to mimic the unprecedented performance of natural materials. Tough silk-like polypeptides, thermoresponsive elastin-like polypeptides, and resilient and elastic resilin-like polypeptides have been synthesized to mimic the functions of natural materials. Important design principles have been developed for these artificial biopolymers that enable rational control over their thermodynamic, structural, and mechanical properties. The simplified repeat allows for a detailed understanding of sequence-structure-property relationships to be developed, and these tailor-made materials open up opportunities for applications such as drug delivery, tissue engineering, photonic films and smart responsive devices.

An additional natural material that has interesting engineering properties is the protein matrix which fills the nuclear pore complex (NPC) in the nuclear envelope and controls transport into the nucleus. It allows passage of less than 0.1% of all proteins while translocating over 1,000 molecules per pore per second. Ribbeck K et al., EMBO J 20: 1320 (2001); Yang WD et al., Proc Natl Acad Sci USA 101: 12887 (2004). The protein matrix is composed of nucleoporins, proteins containing Phe-Gly (FG) repeat sequences which contribute to specific binding of the nuclear transport receptors (NTRs) that facilitate transport of a specific subset of biological molecules into the matrix.

Individual nucleoporins can form hydrogels in vitro that recapitulate the enhanced permeability of selectively-labeled macromolecules into the gel, similar to the intact NPC, with varying degrees of passive diffusion of inert molecules. Labokha AA et al., EMBO J 32: 204 (2013); Jovanovic-Talisman T et al., Nature 457: 1023 (2009). This selectivity is rare in synthetic polymer hydrogels, making these natural materials an intriguing model for new filtration technologies. In spite of the advanced filtering function of natural nucleoporin hydrogels, until now a fundamental understanding of the sequence-structure-property relationships needed for materials engineering has been lacking, due to the complex sequence of the proteins and the inability to synthesize them recombinantly in high yields.

SUMMARY OF THE INVENTION

To adapt the function of nucleoporin hydrogels in a biosynthetic material, artificially engineered protein polymers were designed that can replicate the biological selective transport of the hydrogel in a synthetic mimic using a consensus repeat adapted from a well-investigated nucleoporin, Nsp1. Frey S et al., Science 314: 815 (2006); Ader C et al., Proc Natl Acad Sci USA 107: 6281 (2010); Frey S et al., Cell 130: 512 (2007). The polymers provide a valuable tool for material engineering and an opportunity to tune the selectivity, transport rates, and barrier function of nucleoporin-inspired materials through rational repeat sequence design.

As described in detail herein, designed peptides 1NLP and 2NLP, extracted from partial Nsp1 nucleoporin, are useful for the preparation of nucleoporin-based hydrogels characterized by selective filtering capability. As described herein, recombinant nucleoporin-like polypeptides P-1NLP-P, P-2NLP-P, and P-cNsp1-P are useful for the preparation of hydrogels characterized by selective filtering capability.

As described in detail herein, hydrogels of the invention are useful as selectively permeable barriers.

Also as described in detail herein, hydrogels of the invention are useful for sequestration of compounds, including macromolecules.

Additionally, hydrogels of the invention find use as models for the nuclear pore in assays for nuclear permeability of drugs, biomaterials, nanoparticles, and other compounds.

As described in detail herein, various nuclear transport receptors, such as importin β and NTF2, can be used as carriers which can selectively bring target molecules into P-1NLP-P, P-2NLP-P, and P-cNsp1-P hydrogels.

Also as described in detail herein, hydrogels of the invention are useful for collecting selected target molecules into the hydrogel by means of a nuclear transport receptor associated with a target molecule-specific binding tag.

An aspect of the invention is a polypeptide comprising a plurality of contiguous instances of a subsequence represented by PAFSFGAKPDEKKDSDTSK (SEQ ID NO: 1).

In certain embodiments, the polypeptide comprises 16 contiguous instances of the subsequence represented by SEQ ID NO: 1.

In certain embodiments, the polypeptide consists of 16 contiguous instances of the subsequence represented by SEQ ID NO: 1.

In certain embodiments, the polypeptide further comprises a first leucine zipper domain endblock flanking the N-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO: 1; and a second leucine zipper domain endblock flanking the C-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO: 1.

In certain embodiments, the leucine zipper domain endblock consists of a P domain.

In certain embodiments, the P domain consists of the peptide represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the polypeptide comprises the sequence represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKTSPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKTSAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:4).
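The layout of this full-length sequence can be illustrated with a short Python sketch. It is a reading aid only: the function name `assemble_triblock` and the variable names are hypothetical, and the linker residues (GAS after the first P domain, TS after each block of eight repeats) are inferred from the sequence as listed, not described separately in the text. Substituting the SEQ ID NO:2 repeat gives the corresponding P-2NLP-P layout.

```python
# Illustrative sketch: rebuild the P-1NLP-P layout from its parts.
P_DOMAIN = "APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS"  # SEQ ID NO:3
REPEAT_1NLP = "PAFSFGAKPDEKKDSDTSK"  # SEQ ID NO:1
REPEAT_2NLP = "PAFSFGAKPDEKKDDDTSK"  # SEQ ID NO:2


def assemble_triblock(repeat, p_domain=P_DOMAIN, n_repeats=16):
    """Return P domain + GAS + repeats (TS after each half) + P domain.

    The GAS and TS linkers are assumptions read off the listed sequence.
    """
    half = n_repeats // 2
    midblock = repeat * half + "TS" + repeat * (n_repeats - half) + "TS"
    return p_domain + "GAS" + midblock + p_domain


p_1nlp_p = assemble_triblock(REPEAT_1NLP)
p_2nlp_p = assemble_triblock(REPEAT_2NLP)

print(len(p_1nlp_p))                  # total number of residues
print(p_1nlp_p.count(REPEAT_1NLP))    # number of 19-residue repeats
print(p_1nlp_p.count("FG"))           # number of Phe-Gly motifs
```

Under these assumptions the two constructs have the same length and differ only at position 15 of each repeat (Ser in 1NLP, Asp in 2NLP).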

An aspect of the invention is a polypeptide comprising a plurality of contiguous instances of a subsequence represented by PAFSFGAKPDEKKDDDTSK (SEQ ID NO:2).

In certain embodiments, the polypeptide comprises 16 contiguous instances of the subsequence represented by SEQ ID NO:2.

In certain embodiments, the polypeptide consists of 16 contiguous instances of the subsequence represented by SEQ ID NO:2.

In certain embodiments, the polypeptide further comprises a first leucine zipper domain endblock flanking the N-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO:2; and a second leucine zipper domain endblock flanking the C-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO:2.

In certain embodiments, the leucine zipper domain endblock consists of a P domain.

In certain embodiments, the P domain consists of the peptide represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the polypeptide comprises the sequence represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKTSPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKTSAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:5).

An aspect of the invention is a polypeptide comprising a core sequence represented by

PSFSFGAKSDENKAGATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKPDENKASATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKSDEKKDSDSSKPAFSFGTKSNEKKDSGSSKPAFSFGAKPDEKKNDEVSKPAFSFGAKANEKKESDESKSAFSFGSKPTGKEEGDGAKAAISFGAKPEEQKSSDTSK (SEQ ID NO:6); and a first leucine zipper domain endblock flanking the N-terminal end or the C-terminal end of the core sequence.

In certain embodiments, the first leucine zipper domain endblock flanks the N-terminal end of the core sequence.

In certain embodiments, the polypeptide further comprises a second zipper domain endblock flanking the C-terminal end of the core sequence.

In certain embodiments, the leucine zipper domain endblock consists of a P domain.

In certain embodiments, the P domain consists of the peptide represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the polypeptide comprises the sequence represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASDNKTTNTTPSFSFGAKSDENKAGATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKPDENKASATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKSDEKKDSDSSKPAFSFGTKSNEKKDSGSSKPAFSFGAKPDEKKNDEVSKPAFSFGAKANEKKESDESKSAFSFGSKPTGKEEGDGAKAAISFGAKPEEQKSSDTSKPAFTFGTSAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:7).

An aspect of the invention is a nucleic acid molecule encoding a polypeptide of the invention.

An aspect of the invention is an expression vector comprising a nucleic acid molecule of the invention.

An aspect of the invention is a cell, comprising an expression vector of the invention.

An aspect of the invention is a hydrogel, comprising a polypeptide of the invention.

An aspect of the invention is a filtering device, comprising a hydrogel of the invention; and a housing or support for the hydrogel.

An aspect of the invention is a drug delivery device, comprising a drug; and a hydrogel of the invention.

An aspect of the invention is a method of separating or selectively filtering macromolecules, comprising contacting a source of macromolecules with a hydrogel of the invention.

In certain embodiments, the macromolecule is selected from the group consisting of RNA, mRNA, DNA, proteins, glycoproteins, carbohydrates, lipids, toxins, and any combination thereof.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1A depicts design of synthetic protein polymers, P-cNsp1-P and P-NLPs-P, which gel by association of pentameric (P) coiled-coil endblock domains (coils). Filled circles represent Phe-Gly (FG) sequences.

Figure 1B depicts 3D gel network formed from assembly of designed artificially engineered proteins. Black dotted circles highlight Phe-mediated interactions within synthetic hydrogels.

Figure 2A depicts a series of three western blots SDS-PAGE of indicated lyophilized protein samples.

Figure 2B is a graph depicting yields of designed proteins flanked by P domains.

Figure 3A is a panel of three western blots depicting expression levels of 1NLP, 2NLP, and cNsp1*, where cNsp1* was obtained as a product from P-intein-cNsp1.

Figure 3B is a bar graph depicting expression levels of cNsp1, 1NLP, 2NLP, and cNsp1*, where cNsp1* was obtained as a product from P-intein-cNsp1.

Figure 4A depicts a schematic of capillary transport assay set-up. Blue (darker) and green (lighter) circles represent importin β and IBB-MBP-EGFP, respectively.

Figure 4B depicts a time course transport measurement of 20 w/v% P-cNsp1-P hydrogel with importin β.

Figure 4C is a graph depicting 20 w/v% P-cNsp1-P hydrogels in the presence (solid line) or absence (dotted line) of importin β.

Figure 4D is a graph depicting 20 w/v% P-1NLP-P hydrogels in the presence (solid line) or absence (dotted line) of importin β. Scale bar, 900 μm.

Figure 4E is a graph depicting 20 w/v% P-2NLP-P hydrogels in the presence (solid line) or absence (dotted line) of importin β. Scale bar, 900 μm.

Figure 4F is a graph depicting absorption of cargo-importin β complexes by P-cNsp1-P, P-1NLP-P and P-2NLP-P hydrogels in one hour. * denotes p < 0.05.

Figure 4G is a graph depicting fluorescence intensity measurements on P-2NLP-P hydrogels (20 w/v%) with 10% 1,6-hexanediol. Scale bar, 900 μm.

Figure 4H is a graph depicting selective permeability test performed on P-1NLP-P biosynthetic hydrogels (20 w/v%) with the addition of 5 μM MBP-mCherry, a model inert molecule, into 5 μM IBB-MBP-EGFP/importin β cargo complex mixtures. Scale bar, 900 μm.

Figure 4I is a graph depicting selective permeability test performed on P-2NLP-P biosynthetic hydrogels (20 w/v%) with the addition of 5 μM MBP-mCherry, a model inert molecule, into 5 μM IBB-MBP-EGFP/importin β cargo complex mixtures. Scale bar, 900 μm.

Figure 5A is a graph depicting selective permeability of P-2NLP-P gel in 20 w/v%.

Figure 5B is a graph depicting selective permeability of P-2NLP-P gel in 10 w/v%.

Figure 6 is a panel of four western blots depicting protein expression levels of Nsp1, P-cNsp1-P, P-1NLP-P, and P-2NLP-P. L, protein ladder; E, elution fraction.

Figure 7A is a graph depicting frequency sweep, linear oscillatory shear rheology of 20 w/v% P-cNsp1-P hydrogels in the absence (blue curves) or presence (red curves) of 10% 1,6-hexanediol. The gel modulus and the crossover frequency in the absence of hexanediol are 9.3 kPa and 0.08 rad/s.

Figure 7B is a graph depicting frequency sweep, linear oscillatory shear rheology of 20 w/v% P-1NLP-P hydrogels in the absence (blue curves) or presence (red curves) of 10% 1,6-hexanediol. The gel modulus and the crossover frequency in the absence of hexanediol are 10.7 kPa and 0.02 rad/s.

Figure 7C is a graph depicting frequency sweep, linear oscillatory shear rheology of 20 w/v% P-2NLP-P hydrogels in the absence (blue curves) or presence (red curves) of 10% 1,6-hexanediol. The gel modulus and the crossover frequency in the absence of hexanediol are 7.5 kPa and 0.04 rad/s.

Figure 7D is a graph depicting frequency sweep, linear oscillatory shear rheology of 20 w/v% P-C30-P gel, which lacks FG repeats in its midblock. C is a peptide having amino acid sequence AGAGAGPEG (SEQ ID NO: 8).

Figure 7E is a graph depicting Raman spectra of 20 w/v% cNsp1 midblocks, measured in buffer containing 50 mM Tris/HCl (pH 7.5) and 200 mM NaCl (blue curve) and with the addition of 10% hexanediol (red curve). The shaded boxes highlight Raman bands of 486, 685 and 710 cm^-1 that decrease in intensity with the addition of hexanediol.

Figure 7F is a graph depicting Raman spectra of 20 w/v% 1NLP midblocks, measured in buffer containing 50 mM Tris/HCl (pH 7.5) and 200 mM NaCl (blue curve) and with the addition of 10% hexanediol (red curve). The shaded boxes highlight Raman bands of 486, 685 and 710 cm^-1 that decrease in intensity with the addition of hexanediol.

Figure 7G is a graph depicting Raman spectra of 20 w/v% 2NLP midblocks, measured in buffer containing 50 mM Tris/HCl (pH 7.5) and 200 mM NaCl (blue curve) and with the addition of 10% hexanediol (red curve). The shaded boxes highlight Raman bands of 486, 685 and 710 cm^-1 that decrease in intensity with the addition of hexanediol.

Figure 8 is a series of four graphs depicting strain sweep oscillatory shear rheology of indicated 20 w/v% hydrogels at 100 rad/s and 25°C. C is a peptide having amino acid sequence AGAGAGPEG (SEQ ID NO:8).

Figure 9A is a western blot depicting P-C30-P. C is a peptide having amino acid sequence AGAGAGPEG (SEQ ID NO:8).

Figure 9B is a graph depicting strain sweep oscillatory shear rheology of indicated P-C30-P hydrogels prepared by hydrating lyophilized samples (20 w/v%) with buffer containing 50 mM Tris/HCl (pH 7.5) and 200 mM NaCl in the presence or absence of 10% 1,6-hexanediol.

Figure 10 is a graph depicting Raman spectra of cNsp1 (20 w/v%) with selected band assignments; buffer contained 50 mM Tris/HCl (pH 7.5) and 200 mM NaCl. Δ = deformation; σ = stretching.

Figure 11 is a graph depicting Raman spectra of cNsp1 (upper spectrum) and 1NLP (lower spectrum) lyophilized samples with selected band assignments.

Figure 12 is a series of three graphs depicting diffusion of various sizes of FITC-dextran through the indicated hydrogels. Gel pore radii are estimated between 2.3 nm and 4.5 nm.

Figure 13A is a graph depicting permeability profile of the P-cNsp1-P hydrogel for various sizes of FITC-dextran (as per Figure 12). The fluorescence intensity profile on the gel at 1 hour is shown. Areas under the solid curves (> 0 μm) were calculated and compared to the passive diffusion by inert molecules (dashed curve).

Figure 13B is a graph depicting permeability profile of the P-1NLP-P hydrogel for various sizes of FITC-dextran (as per Figure 12). The fluorescence intensity profile on the gel at 1 hour is shown. Areas under the solid curves (> 0 μm) were calculated and compared to the passive diffusion by inert molecules (dashed curve).

Figure 13C is a graph depicting permeability profile of the P-2NLP-P hydrogel for various sizes of FITC-dextran (as per Figure 12). The fluorescence intensity profile on the gel at 1 hour is shown. Areas under the solid curves (> 0 μm) were calculated and compared to the passive diffusion by inert molecules (dashed curve).

Figure 14A is a graph depicting indicated gel and buffer interface changes by gel swelling during time lapse measurement of the capillary assay for IBB-MBP-GFP.

Figure 14B is a graph depicting indicated gel and buffer interface changes by gel swelling during time lapse measurement of the capillary assay for IBB-MBP-GFP + importin β.

Figure 15 is a graph depicting circular dichroism (CD) analysis of cNsp1 and indicated NLPs.

Figure 16 is a graph depicting permeability profiles of P-cNsp1-P hydrogel (20 w/v%) with 10% 1,6-hexanediol, for IBB-MBP-GFP with or without importin β.

Figure 17A is a graph depicting time lapse measurements of fluorescence profiles measured to investigate the selective permeability of P-2NLP-P gel (20 w/v%) in the mixture of IBB-MBP-EGFP, MBP-mCherry, and importin β. Solid curves represent green fluorescence intensity observed in the gel.

Figure 17B is a graph depicting movement of the gel/buffer boundary over the course of the selective permeability tests of P-2NLP-P gel (20 w/v%) in the mixture of IBB-MBP-EGFP, MBP-mCherry, and importin β. For the control experiment, only MBP-mCherry was tested.

Figure 18 is a schematic depicting a method for selective capture of a target molecule by a hydrogel of the invention.

Figure 19A is a photographic image depicting NTF2-GFP in P-2NLP-P.

Figure 19B is a photographic image depicting NTF2-GFP in P-1NLP-P.

Figure 19C is a photographic image depicting NTF2-GFP in P-cNsp1-P.

Figure 19D is a photographic image depicting 40 kg/mol FITC-dextran in P-2NLP-P.

Figure 19E is a photographic image depicting 40 kg/mol FITC-dextran in P-1NLP-P.

Figure 19F is a photographic image depicting 40 kg/mol FITC-dextran in P-cNsp1-P.

DETAILED DESCRIPTION OF THE INVENTION

Recent in vitro results indicate that the recombinant Nsp1^(2-601) can be divided into an N-terminal sequence Nsp1^(2-277) and a C-terminal sequence Nsp1^(274-601). Ader C et al., Proc Natl Acad Sci USA 107: 6281 (2010). In Nsp1, the C-terminal sequence contributes to selective transport of NTR-cargo complexes and less non-specific binding of inert molecules, core functions for selective transport. However, the C-terminal sequence alone forms a liquid that cannot restrict the passage of inert molecules. The N-terminal sequence is critical for gelation, suggesting that network formation is required for a fully functional selective transport system.

To prepare synthetic gels, the N-terminal sequence of Nsp1, which gels slowly over a period of hours, was replaced with well-investigated pentameric (P) coiled-coil domains flanking the C-terminal sequence (cNsp1). Petka WA et al., Science 281: 389 (1998); Shen W et al., Nat Mater 5: 153 (2006); Olsen BD et al., Macromolecules 43: 9094 (2010). This triblock protein construct, P-cNsp1-P, gels in minutes, and the transient interactions of the P domains allow network relaxation that is thought to be critical to transport. Ribbeck K et al., EMBO J 20: 1320 (2001).

Analysis of the cNsp1 consensus sequence (Nsp1^(282-585)) allows reduction of the protein to a polymer of short repeating segments. Nsp1^(282-585) is composed of 16 repeats of a 19-amino acid Phe-Gly (FG)-containing sequence, with a high consensus at each position except position 15, where equal numbers of Asp (D) and Ser (S) are observed. To elucidate the 16 consecutive 19-amino acid segments, the 304-amino acid sequence of Nsp1^(282-585) (SEQ ID NO: 9) can be written thus:

PSFSFGAKSDENKAGATSK

PAFSFGAKPEEKKDDNSSK

PAFSFGAKSNEDKQDGTAK

PAFSFGAKPAEKNNNETSK

PAFSFGAKSDEKKDGDASK

PAFSFGAKPDENKASATSK

PAFSFGAKPEEKKDDNSSK

PAFSFGAKSNEDKQDGTAK

PAFSFGAKPAEKNNNETSK

PAFSFGAKSDEKKDGDASK

PAFSFGAKSDEKKDSDSSK

PAFSFGTKSNEKKDSGSSK

PAFSFGAKPDEKKNDEVSK

PAFSFGAKANEKKESDESK

SAFSFGSKPTGKEEGDGAK

AAISFGAKPEEQKSSDTSK

where the repeating FG sequences and the D and S residues at position 15 are shown in bold.
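The position-by-position consensus described above can be checked with a short tally. The following is a minimal sketch, not the inventors' analysis: the function names are hypothetical, and the thirteenth repeat is written with Asn (N) at position 14, as encoded by the DNA sequence given for SEQ ID NO:6.

```python
from collections import Counter

# The 16 consecutive 19-residue segments of the cNsp1 sequence.
REPEATS = [
    "PSFSFGAKSDENKAGATSK", "PAFSFGAKPEEKKDDNSSK", "PAFSFGAKSNEDKQDGTAK",
    "PAFSFGAKPAEKNNNETSK", "PAFSFGAKSDEKKDGDASK", "PAFSFGAKPDENKASATSK",
    "PAFSFGAKPEEKKDDNSSK", "PAFSFGAKSNEDKQDGTAK", "PAFSFGAKPAEKNNNETSK",
    "PAFSFGAKSDEKKDGDASK", "PAFSFGAKSDEKKDSDSSK", "PAFSFGTKSNEKKDSGSSK",
    "PAFSFGAKPDEKKNDEVSK",  # N at position 14 taken from the encoding DNA
    "PAFSFGAKANEKKESDESK", "SAFSFGSKPTGKEEGDGAK", "AAISFGAKPEEQKSSDTSK",
]


def position_counts(repeats, position):
    """Count residues at a 1-based position across all repeats."""
    return Counter(repeat[position - 1] for repeat in repeats)


def top_residues(repeats, position):
    """Return the most frequent residue(s) at a position, sorted; ties give several."""
    counts = position_counts(repeats, position)
    best = max(counts.values())
    return sorted(residue for residue, n in counts.items() if n == best)


# Consensus as a list of 19 entries; a tie is written like "D/S".
consensus = ["/".join(top_residues(REPEATS, p)) for p in range(1, 20)]
```

With these inputs, every position has a single most frequent residue except position 15, where Asp and Ser occur five times each.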

To capture the highest frequency of occurrence in all positions of cNsp1, two separate repeat units were designed: one where position 15 was Asp (D) (PAFSFGAKPDEKKDDDTSK; SEQ ID NO:2), and another where position 15 was Ser (S) (PAFSFGAKPDEKKDSDTSK; SEQ ID NO:1). These sequences were cloned to form an artificial protein polymer of 16 such units, producing two nucleoporin-like polypeptides (NLPs) denoted 1NLP and 2NLP, respectively. Both NLPs were genetically fused with P domain endblocks (P-1NLP-P and P-2NLP-P, Figure 1A) to construct polymers that form gels due to coiled-coil physical association (Figure 1B). Since these simplified NLP polymers can mimic the properties of natural cNsp1, the polymers represent a valuable tool for material engineering and an opportunity to tune the selectivity, transport rates and barrier function of nucleoporin-inspired materials through rational repeat sequence design.
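The repeat design can be illustrated by substituting position 15 of the consensus and concatenating 16 copies. This is a minimal sketch under the assumption that the polymer is a plain 16-fold repeat with no intervening residues; the template string and function names are hypothetical.

```python
# Consensus repeat with a placeholder at position 15.
CONSENSUS_TEMPLATE = "PAFSFGAKPDEKKD{x}DTSK"


def repeat_unit(residue_15):
    """Return the 19-residue repeat with the given residue at position 15."""
    return CONSENSUS_TEMPLATE.format(x=residue_15)


def polymer(unit, copies=16):
    """Concatenate contiguous copies of a repeat unit."""
    return unit * copies


unit_1nlp = repeat_unit("D")   # Asp at position 15
unit_2nlp = repeat_unit("S")   # Ser at position 15

nlp1 = polymer(unit_1nlp)
nlp2 = polymer(unit_2nlp)

# 1-based positions at which the two repeat units differ.
differing = [i + 1 for i, (a, b) in enumerate(zip(unit_1nlp, unit_2nlp)) if a != b]
```

Changing the residue passed to `repeat_unit` is one way to explore other single-position variants of the repeat.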

Compounds of the Invention

2NLP

An aspect of the invention is a polypeptide comprising a plurality of contiguous instances of a subsequence represented by PAFSFGAKPDEKKDSDTSK (SEQ ID NO:1).

In certain embodiments, the polypeptide consists of a plurality of contiguous instances of a subsequence represented by SEQ ID NO:1.

In certain embodiments, the polypeptide comprises 16 contiguous instances of the subsequence represented by SEQ ID NO: 1.

In certain embodiments, the polypeptide consists of 16 contiguous instances of the subsequence represented by SEQ ID NO: 1.

In certain embodiments, the polypeptide further comprises a first leucine zipper domain endblock flanking the N-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO:1.

In certain embodiments, the polypeptide further comprises a first leucine zipper domain endblock flanking the C-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO:1.

In certain embodiments, the first leucine zipper domain endblock comprises a pentameric coiled-coil domain (P domain).

In certain embodiments, the first leucine zipper domain endblock consists of a pentameric coiled-coil domain (P domain).

In certain embodiments, the P domain comprises the peptide represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the P domain consists of the peptide represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the first leucine zipper domain endblock comprises a P domain.

In certain embodiments, the first leucine zipper domain endblock consists of a P domain.

In certain embodiments, the first leucine zipper domain endblock P domain comprises the peptide represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the first leucine zipper domain endblock P domain consists of the peptide represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the second leucine zipper domain endblock comprises a P domain.

In certain embodiments, the second leucine zipper domain endblock consists of a P domain.

In certain embodiments, the second leucine zipper domain endblock P domain comprises the peptide represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the second leucine zipper domain endblock P domain consists of the peptide represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the first leucine zipper domain endblock comprises a P domain; and the second leucine zipper domain endblock comprises a P domain.

In certain embodiments, the first leucine zipper domain endblock comprises a P domain; and the second leucine zipper domain endblock consists of a P domain.

In certain embodiments, the first leucine zipper domain endblock consists of a P domain; and the second leucine zipper domain endblock comprises a P domain.

In certain embodiments, the first leucine zipper domain endblock consists of a P domain; and the second leucine zipper domain endblock consists of a P domain.

In certain embodiments, the first leucine zipper domain endblock P domain comprises the peptide represented by SEQ ID NO:3; and the second leucine zipper domain endblock P domain comprises the peptide represented by SEQ ID NO:3.

In certain embodiments, the first leucine zipper domain endblock P domain consists of the peptide represented by SEQ ID NO:3; and the second leucine zipper domain endblock P domain comprises the peptide represented by SEQ ID NO:3.

In certain embodiments, the polypeptide comprises the sequence represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKTSPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKTSAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:4).

In certain embodiments, the polypeptide consists of the sequence represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKTSPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKTSAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:4).
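The layout of the P-2NLP-P sequence listed for SEQ ID NO:4 can be reproduced from its building blocks. The sketch below is an illustration, not a cloning protocol: the short linker segments (GAS and TS) are simply read off the listed sequence, and the function and parameter names are hypothetical.

```python
P_DOMAIN = "APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS"   # SEQ ID NO:3
REPEAT_2NLP = "PAFSFGAKPDEKKDSDTSK"                        # 2NLP repeat unit


def assemble_triblock(endblock, repeat, n_linker="GAS", mid_linker="TS",
                      c_linker="TS", copies_per_half=8):
    """Join endblock, linkers and two runs of repeats into one sequence.

    The linker strings are assumptions taken from the listed sequence.
    """
    half = repeat * copies_per_half
    return endblock + n_linker + half + mid_linker + half + c_linker + endblock


p_2nlp_p = assemble_triblock(P_DOMAIN, REPEAT_2NLP)
```

Passing the Asp-containing repeat unit instead yields the corresponding P-1NLP-P layout.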

1NLP

An aspect of the invention is a polypeptide comprising a plurality of contiguous instances of a subsequence represented by PAFSFGAKPDEKKDDDTSK (SEQ ID NO:2).

In certain embodiments, the polypeptide consists of a plurality of contiguous instances of a subsequence represented by SEQ ID NO:2.

In certain embodiments, the polypeptide comprises 16 contiguous instances of the subsequence represented by SEQ ID NO:2.

In certain embodiments, the polypeptide consists of 16 contiguous instances of the subsequence represented by SEQ ID NO:2.

In certain embodiments, the polypeptide further comprises a first leucine zipper domain endblock flanking the C-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO:2.

In certain embodiments, the first leucine zipper domain endblock comprises a pentameric coiled-coil domain (P domain).

In certain embodiments, the first leucine zipper domain endblock consists of a pentameric coiled-coil domain (P domain).

In certain embodiments, the P domain comprises the peptide represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the P domain consists of the peptide represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the polypeptide further comprises a first leucine zipper domain endblock flanking the N-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO:2; and a second leucine zipper domain endblock flanking the C-terminal end of the plurality of contiguous instances of the subsequence represented by SEQ ID NO:2.

In certain embodiments, the first leucine zipper domain endblock comprises a P domain.

In certain embodiments, the first leucine zipper domain endblock consists of a P domain.

In certain embodiments, the first leucine zipper domain endblock P domain comprises the peptide represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the first leucine zipper domain endblock P domain consists of the peptide represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the second leucine zipper domain endblock comprises a P domain.

In certain embodiments, the second leucine zipper domain endblock consists of a P domain.

In certain embodiments, the second leucine zipper domain endblock P domain comprises the peptide represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the second leucine zipper domain endblock P domain consists of the peptide represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the first leucine zipper domain endblock comprises a P domain; and the second leucine zipper domain endblock comprises a P domain.

In certain embodiments, the first leucine zipper domain endblock comprises a P domain; and the second leucine zipper domain endblock consists of a P domain.

In certain embodiments, the first leucine zipper domain endblock consists of a P domain; and the second leucine zipper domain endblock comprises a P domain.

In certain embodiments, the first leucine zipper domain endblock consists of a P domain; and the second leucine zipper domain endblock consists of a P domain.

In certain embodiments, the first leucine zipper domain endblock P domain comprises the peptide represented by SEQ ID NO:3; and the second leucine zipper domain endblock P domain comprises the peptide represented by SEQ ID NO:3.

In certain embodiments, the polypeptide comprises the sequence represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKTSPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKTSAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:5).

In certain embodiments, the polypeptide consists of the sequence represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKTSPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKTSAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:5).

cNsp1

An aspect of the invention is a polypeptide comprising a core sequence represented by PSFSFGAKSDENKAGATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKPDENKASATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKSDEKKDSDSSKPAFSFGTKSNEKKDSGSSKPAFSFGAKPDEKKNDEVSKPAFSFGAKANEKKESDESKSAFSFGSKPTGKEEGDGAKAAISFGAKPEEQKSSDTSK (SEQ ID NO:6); and a first leucine zipper domain endblock flanking the N-terminal end or the C-terminal end of the core sequence.

In certain embodiments, the polypeptide consists of a core sequence represented by PSFSFGAKSDENKAGATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKPDENKASATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKSDEKKDSDSSKPAFSFGTKSNEKKDSGSSKPAFSFGAKPDEKKNDEVSKPAFSFGAKANEKKESDESKSAFSFGSKPTGKEEGDGAKAAISFGAKPEEQKSSDTSK (SEQ ID NO:6); and a first leucine zipper domain endblock flanking the N-terminal end or the C-terminal end of the core sequence.

In certain embodiments, the first leucine zipper domain endblock flanks the N-terminal end of the core sequence.

In certain embodiments, the first leucine zipper domain endblock flanks the C-terminal end of the core sequence.

In certain embodiments, the first leucine zipper domain endblock comprises a pentameric coiled-coil domain (P domain).

In certain embodiments, the first leucine zipper domain endblock consists of a pentameric coiled-coil domain (P domain).

In certain embodiments, the first leucine zipper domain endblock P domain comprises the peptide represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the first leucine zipper domain endblock P domain consists of the peptide represented by

APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:3).

In certain embodiments, the polypeptide comprises the core domain, the first leucine zipper domain endblock flanking the N-terminal end of the core sequence, and a second leucine zipper domain endblock flanking the C-terminal end of the core sequence.

In certain embodiments, the polypeptide consists of the core domain, the first leucine zipper domain endblock flanking the N-terminal end of the core sequence, and a second leucine zipper domain endblock flanking the C-terminal end of the core sequence.

In certain embodiments, the polypeptide comprises the sequence represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASDNKTTNTTPSFSFGAKSDENKAGATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKPDENKASATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKSDEKKDSDSSKPAFSFGTKSNEKKDSGSSKPAFSFGAKPDEKKNDEVSKPAFSFGAKANEKKESDESKSAFSFGSKPTGKEEGDGAKAAISFGAKPEEQKSSDTSKPAFTFGTSAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:7).

In certain embodiments, the polypeptide consists of the sequence represented by APQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASDNKTTNTTPSFSFGAKSDENKAGATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKPDENKASATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKSDEKKDSDSSKPAFSFGTKSNEKKDSGSSKPAFSFGAKPDEKKNDEVSKPAFSFGAKANEKKESDESKSAFSFGSKPTGKEEGDGAKAAISFGAKPEEQKSSDTSKPAFTFGTSAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDAS (SEQ ID NO:7).

An aspect of the invention is a nucleic acid molecule encoding a polypeptide of the invention.

In one embodiment, the nucleic acid molecule encodes a polypeptide represented by SEQ ID NO:4. For example, in one embodiment, a DNA sequence encoding SEQ ID NO:4 is:

gcgccgcagatgctgcgcgaactgcaggaaaccaacgcggcgctgcaggatgtgcgc gaactgctgcgccagcaggtgaaag aaattacctttctgaaaaacaccgtgatggaaagcgatgcgagcggcgcgagcccggcgt ttagctttggcgcgaaaccggatga aaaaaaagatagcgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaa aaaagatagcgataccagcaaacc ggcgtttagctttggcgcgaaaccggatgaaaaaaaagatagcgataccagcaaaccggc gtttagctttggcgcgaaaccggat gaaaaaaaagatagcgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaa aaaaaagatagcgataccagcaaa ccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatagcgataccagcaaaccg gcgtttagctttggcgcgaaaccgg atgaaaaaaaagatagcgataccagcaaaccggcgtttagctttggcgcgaaaccggatg aaaaaaaagatagcgataccagcaa aaccagcccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatagcgataccag caaaccggcgtttagctttggcgcg aaaccggatgaaaaaaaagatagcgataccagcaaaccggcgtttagctttggcgcgaaa ccggatgaaaaaaaagatagcgat accagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatagcgatacc agcaaaccggcgtttagctttggcg cgaaaccggatgaaaaaaaagatagcgataccagcaaaccggcgtttagctttggcgcga aaccggatgaaaaaaaagatagcg ataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatagcgata ccagcaaaccggcgtttagctttggc gcgaaaccggatgaaaaaaaagatagcgataccagcaaaaccagcgcgccgcagatgctg cgcgaactgcaggaaaccaacg cggcgctgcaggatgtgcgcgaactgctgcgccagcaggtgaaagaaattacctttctga aaaacaccgtgatggaaagcgatgc gagc (SEQ ID NO: 18).
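The correspondence between a listed DNA sequence and the polypeptide it encodes can be checked by translating it codon by codon. The following is a minimal sketch using the standard genetic code; the function name is hypothetical, whitespace left by line wrapping is removed before translation, and the example input is the 57-nucleotide stretch that recurs in the listing above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string (5' to 3', coding strand) into one-letter residues."""
    dna = "".join(dna.split()).lower()   # drop spaces and line breaks
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# The recurring 57-nucleotide unit, written codon by codon for readability.
repeat_dna = "ccg gcg ttt agc ttt ggc gcg aaa ccg gat gaa aaa aaa gat agc gat acc agc aaa"
repeat_protein = translate(repeat_dna)
```

A character outside a, c, g and t raises a KeyError in this sketch, which makes transcription damage in a listing easy to spot.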

In one embodiment, the nucleic acid molecule encodes a polypeptide represented by SEQ ID NO:5. For example, in one embodiment, a DNA sequence encoding SEQ ID NO:5 is:

gcgccgcagatgctgcgcgaactgcaggaaaccaacgcggcgctgcaggatgtgcgc gaactgctgcgccagcaggtgaaag aaattacctttctgaaaaacaccgtgatggaaagcgatgcgagcggcgcgagcccggcgt ttagctttggcgcgaaaccggatga aaaaaaagatgatgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaa aaaagatgatgataccagcaaaccg gcgtttagctttggcgcgaaaccggatgaaaaaaaagatgatgataccagcaaaccggcg tttagctttggcgcgaaaccggatga aaaaaaagatgatgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaa aaaagatgatgataccagcaaaccg gcgtttagctttggcgcgaaaccggatgaaaaaaaagatgatgataccagcaaaccggcg tttagctttggcgcgaaaccggatga aaaaaaagatgatgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaa aaaagatgatgataccagcaaaacc agcccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatgatgataccagcaaa ccggcgtttagctttggcgcgaaacc ggatgaaaaaaaagatgatgataccagcaaaccggcgtttagctttggcgcgaaaccgga tgaaaaaaaagatgatgataccagc aaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatgatgataccagcaaa ccggcgtttagctttggcgcgaaacc ggatgaaaaaaaagatgatgataccagcaaaccggcgtttagctttggcgcgaaaccgga tgaaaaaaaagatgatgataccagc aaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatgatgataccagcaaa ccggcgtttagctttggcgcgaaacc ggatgaaaaaaaagatgatgataccagcaaaaccagcgcgccgcagatgctgcgcgaact gcaggaaaccaacgcggcgctgc aggatgtgcgcgaactgctgcgccagcaggtgaaagaaattacctttctgaaaaacaccg tgatggaaagcgatgcgagc (SEQ ID N0: 19).

In one embodiment, the nucleic acid molecule encodes a polypeptide represented by SEQ ID NO:6. For example, in one embodiment, a DNA sequence encoding SEQ ID NO:6 is:

ccgagctttagctttggcgcgaaaagcgatgaaaacaaagcgggcgcgaccagcaaa ccggcgtttagctttggcgcgaaaccg gaagaaaaaaaagatgataacagcagcaaaccggcgtttagctttggcgcgaaaagcaac gaagataaacaggatggcaccgc gaaaccggcgtttagctttggcgcgaaaccggcggaaaaaaacaacaacgaaaccagcaa accggcgtttagctttggcgcgaa aagcgatgaaaaaaaagatggcgatgcgagcaaaccggcgtttagctttggcgcgaaacc ggatgaaaacaaagcgagcgcga ccagcaaaccggcgtttagctttggcgcgaaaccggaagaaaaaaaagatgataacagca gcaaaccggcgtttagctttggcgc gaaaagcaacgaagataaacaggatggcaccgcgaaaccggcgtttagctttggcgcgaa accggcggaaaaaaacaacaacg aaaccagcaaaccggcgtttagctttggcgcgaaaagcgatgaaaaaaaagatggcgatg cgagcaaaccggcgtttagctttgg cgcgaaaagcgatgaaaaaaaagatagcgatagcagcaaaccggcgtttagctttggcac caaaagcaacgaaaaaaaagatag cggcagcagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaaaacgatga agtgagcaaaccggcgtttagcttt ggcgcgaaagcgaacgaaaaaaaagaaagcgatgaaagcaaaagcgcgtttagctttggc agcaaaccgaccggcaaagaag aaggcgatggcgcgaaagcggcgattagctttggcgcgaaaccggaagaacagaaaagca gcgataccagcaaa (SEQ ID NO:20).

In one embodiment, the nucleic acid molecule encodes a polypeptide represented by SEQ ID NO:7. For example, in one embodiment, a DNA sequence encoding SEQ ID NO: 7 is:

gcgccgcagatgctgcgcgaactgcaggaaaccaacgcggcgctgcaggatgtgcgc gaactgctgcgccagcaggtgaaag aaattacctttctgaaaaacaccgtgatggaaagcgatgcgagcggcgcgagcgataaca aaaccaccaacaccaccccgagctt tagctttggcgcgaaaagcgatgaaaacaaagcgggcgcgaccagcaaaccggcgtttag ctttggcgcgaaaccggaagaaa aaaaagatgataacagcagcaaaccggcgtttagctttggcgcgaaaagcaacgaagata aacaggatggcaccgcgaaaccg gcgtttagctttggcgcgaaaccggcggaaaaaaacaacaacgaaaccagcaaaccggcg tttagctttggcgcgaaaagcgat gaaaaaaaagatggcgatgcgagcaaaccggcgtttagctttggcgcgaaaccggatgaa aacaaagcgagcgcgaccagcaa accggcgtttagctttggcgcgaaaccggaagaaaaaaaagatgataacagcagcaaacc ggcgtttagctttggcgcgaaaagc aacgaagataaacaggatggcaccgcgaaaccggcgtttagctttggcgcgaaaccggcg gaaaaaaacaacaacgaaaccag caaaccggcgtttagctttggcgcgaaaagcgatgaaaaaaaagatggcgatgcgagcaa accggcgtttagctttggcgcgaaa agcgatgaaaaaaaagatagcgatagcagcaaaccggcgtttagctttggcaccaaaagc aacgaaaaaaaagatagcggcagc agcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaaaacgatgaagtgagc aaaccggcgtttagctttggcgcga aagcgaacgaaaaaaaagaaagcgatgaaagcaaaagcgcgtttagctttggcagcaaac cgaccggcaaagaagaaggcgat ggcgcgaaagcggcgattagctttggcgcgaaaccggaagaacagaaaagcagcgatacc agcaaaccggcgtttacctttggc accagcgcgccgcagatgctgcgcgaactgcaggaaaccaacgcggcgctgcaggatgtg cgcgaactgctgcgccagcagg tgaaagaaattacctttctgaaaaacaccgtgatggaaagcgatgcgagc (SEQ ID N0:21).

In one embodiment, the nucleic acid molecule encodes a polypeptide represented by SEQ ID NO: 10 (see below). For example, in one embodiment, a DNA sequence encoding SEQ ID NO: 10 is:

atggatattggcattaacagcgatccgagcaccggcgcgggcgcgtttggcaccggc cagagcacctttggctttaacaacagcg cgccgaacaacaccaacaacgcgaacagcagcattaccccggcgtttggcagcaacaaca ccggcaacaccgcgtttggcaac agcaacccgaccagcaacgtgtttggcagcaacaacagcaccaccaacacctttggcagc aacagcgcgggcaccagcctgttt ggcagcagcagcgcgcagcagaccaaaagcaacggcaccgcgggcggcaacacctttggc agcagcagcctgtttaacaaca gcaccaacagcaacaccaccaaaccggcgtttggcggcctgaactttggcggcggcaaca acaccaccccgagcagcaccggc aacgcgaacaccagcaacaacctgtttggcgcgaccgcgaacgcgaacaaaccggcgttt agctttggcgcgaccaccaacgat gataaaaaaaccgaaccggataaaccggcgtttagctttaacagcagcgtgggcaacaaa accgatgcgcaggcgccgaccacc ggctttagctttggcagccagctgggcggcaacaaaaccgtgaacgaagcggcgaaaccg agcctgagctttggcagcggcag cgcgggcgcgaacccggcgggcgcgagccagccggaaccgaccaccaacgaaccggcgaa accggcgctgagctttggca ccgcgaccagcgataacaaaaccaccaacaccaccccgagctttagctttggcgcgaaaa gcgatgaaaacaaagcgggcgcg accagcaaaccggcgtttagctttggcgcgaaaccggaagaaaaaaaagatgataacagc agcaaaccggcgtttagctttggcg cgaaaagcaacgaagataaacaggatggcaccgcgaaaccggcgtttagctttggcgcga aaccggcggaaaaaaacaacaac gaaaccagcaaaccggcgtttagctttggcgcgaaaagcgatgaaaaaaaagatggcgat gcgagcaaaccggcgtttagctttg gcgcgaaaccggatgaaaacaaagcgagcgcgaccagcaaaccggcgtttagctttggcg cgaaaccggaagaaaaaaaagat gataacagcagcaaaccggcgtttagctttggcgcgaaaagcaacgaagataaacaggat ggcaccgcgaaaccggcgtttagc tttggcgcgaaaccggcggaaaaaaacaacaacgaaaccagcaaaccggcgtttagcttt ggcgcgaaaagcgatgaaaaaaaa gatggcgatgcgagcaaaccggcgtttagctttggcgcgaaaagcgatgaaaaaaaagat agcgatagcagcaaaccggcgttta gctttggcaccaaaagcaacgaaaaaaaagatagcggcagcagcaaaccggcgtttagct ttggcgcgaaaccggatgaaaaaa aaaacgatgaagtgagcaaaccggcgtttagctttggcgcgaaagcgaacgaaaaaaaag aaagcgatgaaagcaaaagcgcg tttagctttggcagcaaaccgaccggcaaagaagaaggcgatggcgcgaaagcggcgatt agctttggcgcgaaaccggaagaa cagaaaagcagcgataccagcaaaccggcgtttacctttggcaaactggcggcggcgctg gaacatcatcatcatcatcat (SEQ ID NO:22).

In one embodiment, the nucleic acid molecule encodes a polypeptide represented by SEQ ID NO: 11 (see below). For example, in one embodiment, a DNA sequence encoding SEQ ID NO: 11 is: atggatattggcattaacagcgatccgggcagcggcagcggcgcgagcgataacaaaacc accaacaccaccccgagctttagc tttggcgcgaaaagcgatgaaaacaaagcgggcgcgaccagcaaaccggcgtttagcttt ggcgcgaaaccggaagaaaaaaa agatgataacagcagcaaaccggcgtttagctttggcgcgaaaagcaacgaagataaaca ggatggcaccgcgaaaccggcgtt tagctttggcgcgaaaccggcggaaaaaaacaacaacgaaaccagcaaaccggcgtttag ctttggcgcgaaaagcgatgaaaa aaaagatggcgatgcgagcaaaccggcgtttagctttggcgcgaaaccggatgaaaacaa agcgagcgcgaccagcaaaccgg cgtttagctttggcgcgaaaccggaagaaaaaaaagatgataacagcagcaaaccggcgt ttagctttggcgcgaaaagcaacga agataaacaggatggcaccgcgaaaccggcgtttagctttggcgcgaaaccggcggaaaa aaacaacaacgaaaccagcaaac cggcgtttagctttggcgcgaaaagcgatgaaaaaaaagatggcgatgcgagcaaaccgg cgtttagctttggcgcgaaaagcga tgaaaaaaaagatagcgatagcagcaaaccggcgtttagctttggcaccaaaagcaacga aaaaaaagatagcggcagcagcaa accggcgtttagctttggcgcgaaaccggatgaaaaaaaaaacgatgaagtgagcaaacc ggcgtttagctttggcgcgaaagcg aacgaaaaaaaagaaagcgatgaaagcaaaagcgcgtttagctttggcagcaaaccgacc ggcaaagaagaaggcgatggcgc gaaagcggcgattagctttggcgcgaaaccggaagaacagaaaagcagcgataccagcaa accggcgtttacctttggcaccag cggcagcggcaaactggcggcggcgctggaacatcatcatcatcatcat (SEQ ID NO:23).

In one embodiment, the nucleic acid molecule encodes a polypeptide represented by SEQ ID NO: 12 (see below). For example, in one embodiment, a DNA sequence encoding SEQ ID NO: 12 is:

atggatattggcattaacagcgatccggcgccgcagatgctgcgcgaactgcaggaa accaacgcggcgctgcaggatgtgcgc gaactgctgcgccagcaggtgaaagaaattacctttctgaaaaacaccgtgatggaaagc gatgcgagcggcgcgagcgcgatta gcggcgatagcctgattagcctggcgagcaccggcaaacgcgtgagcattaaagatctgc tggatgaaaaagattttgaaatttgg gcgattaacgaacagaccatgaaactggaaagcgcgaaagtgagccgcgtgttttgcacc ggcaaaaaactggtgtatattctgaa aacccgcctgggccgcaccattaaagcgaccgcgaaccatcgctttctgaccattgatgg ctggaaacgcctggatgaactgagc ctgaaagaacatattgcgctgccgcgcaaactggaaagcagcagcctgcagctgagcccg gaaattgaaaaactgagccagagc gatatttattgggatagcattgtgagcattaccgaaaccggcgtggaagaagtgtttgat ctgaccgtgccgggcccgcataactttgt ggcgaacgatattattgtgcataacgcgagcgataacaaaaccaccaacaccaccccgag ctttagctttggcgcgaaaagcgatg aaaacaaagcgggcgcgaccagcaaaccggcgtttagctttggcgcgaaaccggaagaaa aaaaagatgataacagcagcaaa ccggcgtttagctttggcgcgaaaagcaacgaagataaacaggatggcaccgcgaaaccg gcgtttagctttggcgcgaaaccg gcggaaaaaaacaacaacgaaaccagcaaaccggcgtttagctttggcgcgaaaagcgat gaaaaaaaagatggcgatgcgag caaaccggcgtttagctttggcgcgaaaccggatgaaaacaaagcgagcgcgaccagcaa accggcgtttagctttggcgcgaa accggaagaaaaaaaagatgataacagcagcaaaccggcgtttagctttggcgcgaaaag caacgaagataaacaggatggcac cgcgaaaccggcgtttagctttggcgcgaaaccggcggaaaaaaacaacaacgaaaccag caaaccggcgtttagctttggcgc gaaaagcgatgaaaaaaaagatggcgatgcgagcaaaccggcgtttagctttggcgcgaa aagcgatgaaaaaaaagatagcga tagcagcaaaccggcgtttagctttggcaccaaaagcaacgaaaaaaaagatagcggcag cagcaaaccggcgtttagctttggc gcgaaaccggatgaaaaaaaaaacgatgaagtgagcaaaccggcgtttagctttggcgcg aaagcgaacgaaaaaaaagaaag cgatgaaagcaaaagcgcgtttagctttggcagcaaaccgaccggcaaagaagaaggcga tggcgcgaaagcggcgattagctt tggcgcgaaaccggaagaacagaaaagcagcgataccagcaaaccggcgtttacctttgg caccagcggcagcggcaaactgg cggcggcgctggaacatcatcatcatcatcat (SEQ ID NO:24).

In one embodiment, the nucleic acid molecule encodes a polypeptide represented by SEQ ID NO: 13 (see below). For example, in one embodiment, a DNA sequence encoding SEQ ID NO: 13 is:

atggatattggcattaacagcgatccgggcagcggcgcgagcccggcgtttagcttt ggcgcgaaaccggatgaaaaaaaagatg atgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatgatg ataccagcaaaccggcgtttagctttg gcgcgaaaccggatgaaaaaaaagatgatgataccagcaaaccggcgtttagctttggcg cgaaaccggatgaaaaaaaagatg atgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatgatg ataccagcaaaccggcgtttagctttg gcgcgaaaccggatgaaaaaaaagatgatgataccagcaaaccggcgtttagctttggcg cgaaaccggatgaaaaaaaagatg atgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatgatg ataccagcaaaaccagcccggcgttt agctttggcgcgaaaccggatgaaaaaaaagatgatgataccagcaaaccggcgtttagc tttggcgcgaaaccggatgaaaaaa aagatgatgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaag atgatgataccagcaaaccggcgttt agctttggcgcgaaaccggatgaaaaaaaagatgatgataccagcaaaccggcgtttagc tttggcgcgaaaccggatgaaaaaa aagatgatgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaag atgatgataccagcaaaccggcgttt agctttggcgcgaaaccggatgaaaaaaaagatgatgataccagcaaaccggcgtttagc tttggcgcgaaaccggatgaaaaaa aagatgatgataccagcaaaaccagcggcagcggcaaactggcggcggcgctggaacatc atcatcatcatcat (SEQ ID NO:25).

In one embodiment, the nucleic acid molecule encodes a polypeptide represented by SEQ ID NO: 14 (see below). For example, in one embodiment, a DNA sequence encoding SEQ ID NO: 14 is:

atggatattggcattaacagcgatccgggcagcggcgcgagcccggcgtttagcttt ggcgcgaaaccggatgaaaaaaaagata gcgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatagcg ataccagcaaaccggcgtttagcttt ggcgcgaaaccggatgaaaaaaaagatagcgataccagcaaaccggcgtttagctttggc gcgaaaccggatgaaaaaaaagat agcgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatagc gataccagcaaaccggcgtttagct ttggcgcgaaaccggatgaaaaaaaagatagcgataccagcaaaccggcgtttagctttg gcgcgaaaccggatgaaaaaaaag atagcgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaagata gcgataccagcaaaaccagcccgg cgtttagctttggcgcgaaaccggatgaaaaaaaagatagcgataccagcaaaccggcgt ttagctttggcgcgaaaccggatgaa aaaaaagatagcgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaa aaagatagcgataccagcaaaccg gcgtttagctttggcgcgaaaccggatgaaaaaaaagatagcgataccagcaaaccggcg tttagctttggcgcgaaaccggatg aaaaaaaagatagcgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaa aaaaagatagcgataccagcaaac cggcgtttagctttggcgcgaaaccggatgaaaaaaaagatagcgataccagcaaaccgg cgtttagctttggcgcgaaaccgga tgaaaaaaaagatagcgataccagcaaaaccagcaaaaccagcggcagcggcaaactggc ggcggcgctggaacatcatcatc atcatcat (SEQ ID NO:26).

In one embodiment, the nucleic acid molecule encodes a polypeptide represented by SEQ ID NO: 15 (see below). For example, in one embodiment, a DNA sequence encoding SEQ ID NO: 15 is:

atggatattggcattaacagcgatccggcgccgcagatgctgcgcgaactgcaggaa accaacgcggcgctgcaggatgtgcgc gaactgctgcgccagcaggtgaaagaaattacctttctgaaaaacaccgtgatggaaagc gatgcgagcggcgcgagcgataac aaaaccaccaacaccaccccgagctttagctttggcgcgaaaagcgatgaaaacaaagcg ggcgcgaccagcaaaccggcgttt agctttggcgcgaaaccggaagaaaaaaaagatgataacagcagcaaaccggcgtttagc tttggcgcgaaaagcaacgaagat aaacaggatggcaccgcgaaaccggcgtttagctttggcgcgaaaccggcggaaaaaaac aacaacgaaaccagcaaaccgg cgtttagctttggcgcgaaaagcgatgaaaaaaaagatggcgatgcgagcaaaccggcgt ttagctttggcgcgaaaccggatga aaacaaagcgagcgcgaccagcaaaccggcgtttagctttggcgcgaaaccggaagaaaa aaaagatgataacagcagcaaac cggcgtttagctttggcgcgaaaagcaacgaagataaacaggatggcaccgcgaaaccgg cgtttagctttggcgcgaaaccgg cggaaaaaaacaacaacgaaaccagcaaaccggcgtttagctttggcgcgaaaagcgatg aaaaaaaagatggcgatgcgagc aaaccggcgtttagctttggcgcgaaaagcgatgaaaaaaaagatagcgatagcagcaaa ccggcgtttagctttggcaccaaaa gcaacgaaaaaaaagatagcggcagcagcaaaccggcgtttagctttggcgcgaaaccgg atgaaaaaaaaaacgatgaagtg agcaaaccggcgtttagctttggcgcgaaagcgaacgaaaaaaaagaaagcgatgaaagc aaaagcgcgtttagctttggcagc aaaccgaccggcaaagaagaaggcgatggcgcgaaagcggcgattagctttggcgcgaaa ccggaagaacagaaaagcagc gataccagcaaaccggcgtttacctttggcaccagcgcgccgcagatgctgcgcgaactg caggaaaccaacgcggcgctgcag gatgtgcgcgaactgctgcgccagcaggtgaaagaaattacctttctgaaaaacaccgtg atggaaagcgatgcgagcggcaaac tggcggcggcgctggaacatcatcatcatcatcat (SEQ ID NO:27).

In one embodiment, the nucleic acid molecule encodes a polypeptide represented by SEQ ID NO: 16 (see below). For example, in one embodiment, a DNA sequence encoding SEQ ID NO: 16 is:

atggatattggcattaacagcgatccggcgccgcagatgctgcgcgaactgcaggaa accaacgcggcgctgcaggatgtgcgc gaactgctgcgccagcaggtgaaagaaattacctttctgaaaaacaccgtgatggaaagc gatgcgagcggcgcgagcccggcg tttagctttggcgcgaaaccggatgaaaaaaaagatgatgataccagcaaaccggcgttt agctttggcgcgaaaccggatgaaaa aaaagatgatgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaa agatgatgataccagcaaaccggcg tttagctttggcgcgaaaccggatgaaaaaaaagatgatgataccagcaaaccggcgttt agctttggcgcgaaaccggatgaaaa aaaagatgatgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaa agatgatgataccagcaaaccggcg tttagctttggcgcgaaaccggatgaaaaaaaagatgatgataccagcaaaccggcgttt agctttggcgcgaaaccggatgaaaa aaaagatgatgataccagcaaaaccagcccggcgtttagctttggcgcgaaaccggatga aaaaaaagatgatgataccagcaaa ccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatgatgataccagcaaaccg gcgtttagctttggcgcgaaaccgg atgaaaaaaaagatgatgataccagcaaaccggcgtttagctttggcgcgaaaccggatg aaaaaaaagatgatgataccagcaa accggcgtttagctttggcgcgaaaccggatgaaaaaaaagatgatgataccagcaaacc ggcgtttagctttggcgcgaaaccg gatgaaaaaaaagatgatgataccagcaaaccggcgtttagctttggcgcgaaaccggat gaaaaaaaagatgatgataccagca aaccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatgatgataccagcaaaa ccagcgcgccgcagatgctgcgcg aactgcaggaaaccaacgcggcgctgcaggatgtgcgcgaactgctgcgccagcaggtga aagaaattacctttctgaaaaacac cgtgatggaaagcgatgcgagcggcggcaaactggcggcggcgctggaacatcatcatca tcatcat (SEQ ID NO:28).

In one embodiment, the nucleic acid molecule encodes a polypeptide represented by SEQ ID NO: 17 (see below). For example, in one embodiment, a DNA sequence encoding SEQ ID NO: 17 is:

atggatattggcattaacagcgatccggcgccgcagatgctgcgcgaactgcaggaa accaacgcggcgctgcaggatgtgcgc gaactgctgcgccagcaggtgaaagaaattacctttctgaaaaacaccgtgatggaaagc gatgcgagcggcgcgagcccggcg tttagctttggcgcgaaaccggatgaaaaaaaagatagcgataccagcaaaccggcgttt agctttggcgcgaaaccggatgaaaa aaaagatagcgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaa agatagcgataccagcaaaccggc gtttagctttggcgcgaaaccggatgaaaaaaaagatagcgataccagcaaaccggcgtt tagctttggcgcgaaaccggatgaaa aaaaagatagcgataccagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaa aagatagcgataccagcaaaccgg cgtttagctttggcgcgaaaccggatgaaaaaaaagatagcgataccagcaaaccggcgt ttagctttggcgcgaaaccggatgaa aaaaaagatagcgataccagcaaaaccagcccggcgtttagctttggcgcgaaaccggat gaaaaaaaagatagcgataccagc aaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatagcgataccagcaaa ccggcgtttagctttggcgcgaaac cggatgaaaaaaaagatagcgataccagcaaaccggcgtttagctttggcgcgaaaccgg atgaaaaaaaagatagcgatacca gcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatagcgataccagca aaccggcgtttagctttggcgcgaa accggatgaaaaaaaagatagcgataccagcaaaccggcgtttagctttggcgcgaaacc ggatgaaaaaaaagatagcgatac cagcaaaccggcgtttagctttggcgcgaaaccggatgaaaaaaaagatagcgataccag caaaaccagcgcgccgcagatgct gcgcgaactgcaggaaaccaacgcggcgctgcaggatgtgcgcgaactgctgcgccagca ggtgaaagaaattacctttctgaa aaacaccgtgatggaaagcgatgcgagcggcggcaaactggcggcggcgctggaacatca tcatcatcatcat (SEQ ID NO:29).

An aspect of the invention is an expression vector comprising a nucleic acid molecule of the invention.

As used herein, the term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a plasmid, which refers to a circular double stranded DNA loop into which additional DNA segments can be ligated. Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome.

Moreover, certain vectors, namely expression vectors, are capable of directing the expression of genes to which they are operably linked. In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids (vectors).

However, the present invention is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions.

The recombinant expression vectors of the invention comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell. This means that the recombinant expression vectors include one or more regulatory sequences, selected on the basis of the host cells to be used for expression, which is operably linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, "operably linked" is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). The term "regulatory sequence" is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Such regulatory sequences are described, for example, in Goeddel, Methods in Enzymology: Gene Expression Technology vol.185, Academic Press, San Diego, CA (1991). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in many types of host cell and those which direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or peptides, including proteins or polypeptides, encoded by nucleic acids as described herein.

The recombinant expression vectors for use in the invention can be designed for expression of a polypeptide of the invention in prokaryotic (e.g., E. coli) or eukaryotic cells (e.g., insect cells (using baculovirus expression vectors), yeast cells, or mammalian cells). Suitable host cells are discussed further in Goeddel, supra. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, in fusion expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith and Johnson, 1988, Gene 67:31-40), pMAL (New England Biolabs, Beverly, MA) and pRIT5 (Pharmacia, Piscataway, NJ) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et al, 1988, Gene 69:301-315) and pET 11d (Studier et al, p. 60-89, In Gene Expression Technology: Methods in Enzymology vol. 185, Academic Press, San Diego, CA, 1991). Target biomarker nucleic acid expression from the pTrc vector relies on host RNA polymerase transcription from a hybrid trp-lac fusion promoter. Target biomarker nucleic acid expression from the pET 11d vector relies on transcription from a T7 gn10-lac fusion promoter mediated by a co-expressed viral RNA polymerase (T7 gn1). This viral polymerase is supplied by host strains BL21(DE3) or HMS174(DE3) from a resident prophage harboring a T7 gn1 gene under the transcriptional control of the lacUV5 promoter.

One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, p. 119-128, In Gene Expression Technology: Methods in Enzymology vol. 185, Academic Press, San Diego, CA, 1990). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al, 1992, Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
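As a hedged illustration of the codon-selection strategy described above, the following Python sketch back-translates the repeat unit of SEQ ID NO: 1 using one fixed codon per amino acid. The codon table is an assumption made for illustration only: it covers just the residues present in that repeat, with codons chosen to match those visible in the DNA sequences listed above, and the function name `back_translate` is hypothetical.

```python
# Assumed one-codon-per-residue table (illustrative; covers only the
# residues that occur in the repeat unit of SEQ ID NO: 1).
PREFERRED_CODON = {
    "A": "gcg", "D": "gat", "E": "gaa", "F": "ttt", "G": "ggc",
    "K": "aaa", "P": "ccg", "S": "agc", "T": "acc",
}


def back_translate(protein: str) -> str:
    """Return a DNA string encoding `protein`, one fixed codon per residue."""
    try:
        return "".join(PREFERRED_CODON[residue] for residue in protein.upper())
    except KeyError as missing:
        raise ValueError(f"no codon assigned for residue {missing}") from None


repeat_unit = "PAFSFGAKPDEKKDSDTSK"  # SEQ ID NO: 1
repeat_dna = back_translate(repeat_unit)
print(repeat_dna)
```

A full design would need a complete 20-residue table and checks for unwanted restriction sites; neither is shown here.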

In another embodiment, the expression vector is a yeast expression vector.

Examples of vectors for expression in yeast S. cerevisiae include pYepSec1 (Baldari et al, 1987, EMBO J. 6:229-234), pMFa (Kurjan and Herskowitz, 1982, Cell 30:933-943), pJRY88 (Schultz et al, 1987, Gene 54:113-123), pYES2 (Invitrogen Corporation, San Diego, CA), and pPicZ (Invitrogen Corp, San Diego, CA).

Alternatively, the expression vector is a baculovirus expression vector. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., Sf9 cells) include the pAc series (Smith et al, 1983, Mol. Cell Biol. 3:2156-2165) and the pVL series (Lucklow and Summers, 1989, Virology 170:31-39).

In yet another embodiment, a nucleic acid of the invention is expressed in mammalian cells using a mammalian expression vector. Examples of mammalian expression vectors include pCDM8 (Seed, 1987, Nature 329:840) and pMT2PC (Kaufman et al, 1987, EMBO J. 6:187-195). When used in mammalian cells, the expression vector's control functions are often provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus, and Simian Virus 40 (SV40). For other suitable expression systems for both prokaryotic and eukaryotic cells see chapters 16 and 17 of Sambrook et al, ed., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989.

In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Tissue-specific regulatory elements are known in the art. Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al., 1987, Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton, 1988, Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore, 1989, EMBO J. 8:729-733) and immunoglobulins (Banerji et al, 1983, Cell 33:729-740; Queen and Baltimore, 1983, Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle, 1989, Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al, 1985, Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Patent No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example the murine hox promoters (Kessel and Gruss, 1990, Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman, 1989, Genes Dev. 3:537-546).

In certain embodiments, the expression vector is a prokaryotic expression vector. In certain embodiments, the expression vector is a eukaryotic expression vector.

An aspect of the invention is a cell comprising an expression vector of the invention.

In certain embodiments, the expression vector is a prokaryotic expression vector, and the cell is a prokaryotic cell, e.g., E. coli.

In certain embodiments, the expression vector is a eukaryotic expression vector, and the cell is a eukaryotic cell, e.g., a yeast cell, an insect cell, or a mammalian cell.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques. As used herein, the terms "transformation" and "transfection" are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

Suitable methods for transforming or transfecting host cells can be found in Sambrook, et al. (supra), and other laboratory manuals.

For stable transfection of mammalian cells, it is known that, depending upon the expression vector and transfection technique used, only a small fraction of cells may integrate the foreign DNA into their genome. In order to identify and select these integrants, a gene that encodes a selectable marker (e.g., for resistance to antibiotics) is generally introduced into the host cells along with the gene of interest. Preferred selectable markers include those which confer resistance to drugs, such as G418, hygromycin and methotrexate. Cells stably transfected with the introduced nucleic acid can be identified by drug selection (e.g., cells that have incorporated the selectable marker gene will survive, while the other cells die).

Compositions of the Invention

An aspect of the invention is a hydrogel, comprising a polypeptide of the invention. An aspect of the invention is a filtering device, comprising a hydrogel of the invention; and a housing or support for the hydrogel. In certain embodiments, the hydrogel is disposed within a housing. In certain embodiments, the hydrogel is disposed within and is in contact with the housing. The housing can be configured in any suitable manner for the intended use of the filtering device, e.g., as a plate, a cone, a manifold, a cartridge, an open-ended tube, a closed-end tube (e.g., a centrifuge tube), etc. The housing optionally can include a fitting, e.g., a screw- or compression-Luer lock fitting, to facilitate connection of the filtering device to a fluid path. In certain embodiments, the hydrogel is disposed upon a support. In certain embodiments, the hydrogel is disposed upon and is in contact with the support. The support can be configured in any suitable manner for the intended use of the filtering device, e.g., as a frame, a perforated plate, a mesh, a fabric, a filter, etc. In certain embodiments, the support, together with the hydrogel, can be fitted or placed into a housing which is configured and arranged to receive the support.

An aspect of the invention is a drug delivery device, comprising a drug; and a hydrogel of the invention.

In certain embodiments, the drug is a macromolecule.

In certain embodiments, the drug is a nucleic acid.

In certain embodiments, the drug is an RNA.

In certain embodiments, the drug is a DNA.

In certain embodiments, the drug is a polymer.

In certain embodiments, the drug is a polypeptide.

In certain embodiments, the drug is a protein.

In certain embodiments, the drug is a fusion protein.

In certain embodiments, the drug is an antibody or an antigen-binding fragment of an antibody.

In certain embodiments, the drug is a cytokine.

In certain embodiments, the drug is a glycoprotein.

In certain embodiments, the drug is a carbohydrate.

In certain embodiments, the drug is a lipid.

In certain embodiments, the drug is a toxin.

In certain embodiments, the drug is a steroid.

In certain embodiments, the drug is a hormone, e.g., an estrogen or a progestogen.

In certain embodiments, the drug is dispersed within the hydrogel.

In certain embodiments, the drug is effectively enveloped by the hydrogel. For example, the drug can be enclosed within a housing, wherein at least a portion of the housing is open to the environment but for the presence of the hydrogel. In certain embodiments, the drug is enveloped by the hydrogel. For example, the drug can be present as a core which is completely surrounded by the hydrogel; the core can include more than a single active agent, and there can be one or more additional layers present between the drug or core and the hydrogel. In certain embodiments, there can be one or more additional layers external to the hydrogel.

Methods of the Invention

An aspect of the invention is a method of separating or selectively filtering macromolecules, comprising contacting a source of macromolecules with a hydrogel of the invention. For example, a solution comprising a macromolecule can be contacted with a hydrogel of the invention.

As used herein, the term "macromolecule" refers to any molecule having a molecular weight greater than about 1500 Da. In certain embodiments, a macromolecule has a molecular weight greater than or equal to about 10,000 Da. In certain embodiments, a macromolecule has a molecular weight greater than or equal to about 20,000 Da. In certain embodiments, a macromolecule has a molecular weight greater than or equal to about 30,000 Da. In certain embodiments, a macromolecule has a molecular weight greater than or equal to about 40,000 Da. In certain embodiments, a macromolecule has a molecular weight greater than or equal to about 50,000 Da. In certain embodiments, a macromolecule has a molecular weight greater than or equal to about 60,000 Da. In certain embodiments, a macromolecule has a molecular weight greater than or equal to about 70,000 Da. In certain embodiments, a macromolecule has a molecular weight greater than or equal to about 80,000 Da. In certain embodiments, a macromolecule has a molecular weight greater than or equal to about 90,000 Da. In certain embodiments, a macromolecule has a molecular weight greater than or equal to about 100,000 Da.

In certain embodiments, the macromolecule is a naturally occurring macromolecule.

In certain embodiments, the macromolecule is a synthetic or semi-synthetic macromolecule. In certain embodiments, the macromolecule is present as part of a complex or conjugate with another molecule, e.g., a karyopherin.

In certain embodiments, the macromolecule is selected from the group consisting of RNA, mRNA, DNA, proteins, glycoproteins, carbohydrates, lipids, toxins, and any combination thereof.

In certain embodiments, the macromolecule is RNA.

In certain embodiments, the macromolecule is mRNA.

In certain embodiments, the macromolecule is DNA.

In certain embodiments, the macromolecule is a protein.

In certain embodiments, the macromolecule is a glycoprotein.

In certain embodiments, the macromolecule is a carbohydrate.

In certain embodiments, the macromolecule is a lipid.

In certain embodiments, the macromolecule is a toxin.

Having described the present invention in detail, the same will be more clearly understood by reference to the following examples, which are included herewith for purposes of illustration only and are not intended to be limiting of the invention.

EXAMPLES

Materials and Methods

DNA engineering: The Nsp1^30-591 gene from the Chait group was amplified by polymerase chain reaction (PCR) to prepare BamHI (B)-NheI (N)-Nsp1^30-591-SpeI (S)-HindIII (H) and B-N-cNsp1 (Nsp1^282-585)-S-H. Then, both DNA fragments were subcloned into the pET22b expression plasmid (C-terminal 6xHis tag) using BamHI and HindIII restriction sites. The pET22b vector was chosen due to the C-terminal 6xHis tag which allows isolation of full length proteins by metal affinity chromatography. B-N-1NLP-S-H, B-N-2NLP-S-H, B-P-EcoRI-intein-N-cNsp1-S-H, B-P domain (P)-N-cNsp1-S-P-H, B-P-N-1NLP-S-P-H and B-P-N-2NLP-S-P-H were also prepared. Gene sequences of B-P domain (P)-N-1NLP₁/₂-S-P-H and B-P-N-2NLP₁/₂-S-P-H were purchased (GenScript, USA) and subcloned into the pET22b expression plasmid using BamHI and HindIII restriction sites. To prepare P-1NLP-P and P-2NLP-P, N-1NLP₁/₂-S and N-2NLP₁/₂-S were subcloned into pET22b-B-P-N-1NLP₁/₂-S-P-H and pET22b-B-P-N-2NLP₁/₂-S-P-H plasmids using the SpeI restriction enzyme site. P-cNsp1-P was prepared by subcloning N-cNsp1-S into the pET22b-B-P-N-1NLP₁/₂-S-P-H plasmid using NheI and SpeI restriction sites. 1NLP and 2NLP were prepared by subcloning N-1NLP-S and N-2NLP-S from pET22b-B-P-N-1NLP-S-P-H and pET22b-B-P-N-2NLP-S-P-H plasmids into pET22b-B-P-N-1NLP₁/₂-S-P-H and pET22b-B-P-N-2NLP₁/₂-S-P-H plasmids using NheI and SpeI restriction enzyme sites. B-P-EcoRI-intein-N sequences were purchased (Integrated DNA Technologies, USA) and subcloned into the pET22b-B-N-cNsp1-S-H plasmid using BamHI and NheI restriction enzyme sites. Sequences of all plasmids were confirmed by gene sequencing (Genewiz, USA). Prepared protein sequences are as follows.

Nsp1^30-591 (SEQ ID NO: 10)

MDIGINSDPSTGAGAFGTGQSTFGFNNSAPNNTNNANSSITPAFGSNNTGNTAFGNS NPTSNVFGSNNSTTNTFGSNSAGTSLFGSSSAQQTKSNGTAGGNTFGSSSLFNNSTN SNTTKPAFGGLNFGGGNNTTPSSTGNANTSNNLFGATANANKPAFSFGATTNDDK KTEPDKPAFSFNSSVGNKTDAQAPTTGFSFGSQLGGNKTVNEAAKPSLSFGSGSAG ANPAGASQPEPTTNEPAKPALSFGTATSDNKTTNTTPSFSFGAKSDENKAGATSKPA FSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKP AFSFGAKSDEKKDGDASKPAFSFGAKPDENKASATSKPAFSFGAKPEEKKDDNSSK PAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDA SKPAFSFGAKSDEKKDSDSSKPAFSFGTKSNEK DSGSSKPAFSFGAKPDEKKNDE VSKPAFSFGAKANEKKESDESKSAFSFGSKPTGKEEGDGAKAAISFGAKPEEQKSSD TSKPAFTFGKLAAALEHHHHHH

cNsp1 (SEQ ID NO: 11)

MDIGINSDPGSGSGASDNKTTNTTPSFSFGAKSDENKAGATSKPAFSFGAKPEEKKD DNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEK DGDASKPAFSFGAKPDENKASATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNED KQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKSD EKKDSDSSKPAFSFGTKSNEKKDSGSSKPAFSFGAKPDEKKNDEVSKPAFSFGAKA NEKKESDESKSAFSFGSKPTGKEEGDGAKAAISFGAKPEEQKSSDTSKPAFTFGTSG SGKLAAALEHHHHHH

P-Intein-cNsp1 (SEQ ID NO: 12)

MDIGINSDPAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASAI SGDSLISLASTGKRVSIKDLLDEKDFEIWAINEQTMKLESAKVSRVFCTGKKLVYIL KTPvLGRTIKATANHRFLTIDGWKRLDELSLKEHIALPRKLESSSLQLSPEIEKLSQSDI YWDSIVSITETGVEEVFDLTVPGPHNFVANDIIVHNASDNKTTNTTPSFSFGAKSDE NKAGATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPA EKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKPDENKASATSKPAFSFGAKP EEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAK SDEKKDGDASKPAFSFGAKSDEKKDSDSSKPAFSFGTKSNEKKDSGSSKPAFSFGA KPDEKKNDEVSKPAFSFGAKANEKKESDESKSAFSFGSKPTGKEEGDGAKAAISFG AKPEEQKSSDTSKPAFTFGTSGSGKLAAALEHHHHHH

1NLP (SEQ ID NO: 13)

MDIGINSDPGSGASPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSF GAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAF SFGAKPDEKKDDDTSKPAFSFGAKPDEK DDDTSKPAFSFGAKPDEKKDDDTSKTS PAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTS KPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDD TSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEK DDDTSKTSGSGKLAAALEHH HHHH

2NLP (SEQ ID NO: 14)

MDIGINSDPGSGASPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSF GAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFS FGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKTSP AFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEK DSDTSK PAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTS KPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKTSKTSGSGKLAAALEHH HHHH

P-cNsp1-P (SEQ ID NO: 15)

MDIGINSDPAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASDN KTTNTTPSFSFGAKSDENKAGATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNED KQDGTAKPAFSFGAKPAEKNNNETSKPAFSFGAKSDEK DGDASKPAFSFGAKPD ENKASATSKPAFSFGAKPEEKKDDNSSKPAFSFGAKSNEDKQDGTAKPAFSFGAKP AEKNNNETSKPAFSFGAKSDEKKDGDASKPAFSFGAKSDEKKDSDSSKPAFSFGTK SNEKKDSGSSKPAFSFGAKPDEKKNDEVSKPAFSFGAKANEKKESDESKSAFSFGS KPTGKEEGDGAKAAISFGAKPEEQKSSDTSKPAFTFGTSAPQMLRELQETNAALQD VRELLRQQVKEITFLKNTVMESDASGKLAAALEHHHHHH

P-1NLP-P (SEQ ID NO: 16)

MDIGINSDPAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASPA FSFGAKPDEKKDDDTSKPAFSFGAKPDEK DDDTSKPAFSFGAKPDEKKDDDTSKP AFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTS KPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKTSPAFSFGAKPDEKKDD DTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKD DDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEKKDDDTSKPAFSFGAKPDEK DDDTSKPAFSFGAKPDEKKDDDTSKTSAPQMLRELQETNAALQDVRELLRQQVKE ITFLKNTVMESDASGGKLAAALEHHHHHH

P-2NLP-P (SEQ ID NO: 17)

MDIGINSDPAPQMLRELQETNAALQDVRELLRQQVKEITFLKNTVMESDASGASPA FSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKP AFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEK DSDTSK PAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKTSPAFSFGAKPDEKKDSDT SKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSD TSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDSDTSKPAFSFGAKPDEKKDS DTSKPAFSFGAKPDEKKDSDTSKTSAPQMLRELQETNAALQDVRELLRQQVKEITF LKNTVMESDASGGKLAAALEHHHHHH

Protein expression and purification: All prepared genes were transformed into E. coli OverExpress C41(DE3) cells (Lucigen, USA). For expression, a freshly grown bacterial colony was inoculated into 10 mL LB medium containing 100 μg/mL ampicillin and grown at 37 °C overnight. For each sample, the 10 mL overnight culture was inoculated into 1 L terrific broth (TB) medium with 100 μg/mL ampicillin and grown at 37 °C until OD₆₀₀ ~ 1. Expression was then induced overnight at room temperature with the addition of 1 mM IPTG. The cells were harvested, lysed in 50 mM Tris (pH 8), 300 mM NaCl and 8 M urea, and frozen at -80 °C. Thawed cells were sonicated, and 15 w/v% ammonium sulfate was added. Cell lysates were clarified by centrifugation (14,000 g for 30 min at 4 °C), and the proteins of interest were purified by Ni-NTA affinity chromatography under denaturing conditions with 250 mM imidazole used for protein elution. Purified samples were dialyzed against deionized (DI) water, and 20 mM Tris (pH 8) and 6 M urea were added. The proteins were further purified by ion exchange chromatography using a HiTrap Q HP column (GE Healthcare, Sweden) in an AKTA pure FPLC. Samples containing the desired product were dialyzed against DI water and lyophilized. P-intein-cNsp1 was prepared using the same expression conditions as the other proteins, but harvested cells were resuspended in 50 mM NaH₂PO₄, 300 mM NaCl, and 10 mM imidazole and stored at -80 °C. Thawed cells were lysed by sonication, and cell lysates clarified by centrifugation were stored overnight at 4 °C at pH 7 for intein self-cleavage. Sun Z et al, Protein Expr Purif : 26 (2005). After adjusting the pH to 8, samples with 6xHis tags were purified by Ni-NTA under native conditions and by FPLC as described above. P-C30-P protein was expressed and purified as reported previously (Olsen BD et al, Macromolecules 43: 9094 (2010)), with additional FPLC purification under denaturing conditions. Lyophilized samples were weighed to calculate yields of samples from 1 L culture (Figure 2B and Figure 3B). The purity of samples was determined to be greater than 95% by SDS-PAGE analysis (Figure 2A and Figure 3A).

For transport studies, pQE80-14xHis tag-TEV-IBB-MBP-EGFP, pQE80-14xHis tag-TEV-MBP-mCherry and pQE30-scImpβ-6xHis tag from the Gorlich group were transformed into SG13009 (pRep4) cells, and expression followed a previously described protocol. Frey S et al, EMBO J2 : 2554 (2009). After cleaving the His tag from IBB-MBP-EGFP and MBP-mCherry with TEV protease (Eton Bioscience, USA), all three proteins were further purified by size exclusion chromatography in buffer containing 50 mM Tris (pH 7.5) and 200 mM NaCl. The concentration of recombinant fluorescent proteins was determined from their optical absorbance.

FITC-dextran: Fluorescein isothiocyanate (FITC)-dextran with molar masses of 4, 10, 40, 70 and 150 kg/mol (catalog # 46994, FD10S, FD40, 46945 and 46946) were purchased from Sigma-Aldrich (USA). Approximate Stokes' radii of the FITC-dextran polymers were obtained from the supplier as follows: 1.4 nm (4 kg/mol), 2.3 nm (10 kg/mol), 4.5 nm (40 kg/mol), 6.0 nm (70 kg/mol) and 8.5 nm (150 kg/mol).
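As an illustrative aside, the supplier-quoted radii above can be kept in a small lookup table and summarized by an apparent power-law exponent relating Stokes radius to molar mass. The following Python sketch does this with a least-squares fit in log-log space. The fit and the function name `scaling_exponent` are assumptions added for illustration and are not part of the described method.

```python
import math

# Supplier-quoted values from the text: molar mass (kg/mol) -> Stokes radius (nm).
STOKES_RADIUS_NM = {4: 1.4, 10: 2.3, 40: 4.5, 70: 6.0, 150: 8.5}


def scaling_exponent(data):
    """Least-squares slope of ln(radius) versus ln(molar mass)."""
    xs = [math.log(mass) for mass in data]
    ys = [math.log(radius) for radius in data.values()]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


nu = scaling_exponent(STOKES_RADIUS_NM)
print(f"apparent exponent: {nu:.2f}")
```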

Hydrogel preparation: Lyophilized samples were dissolved at a concentration of 200 mg/mL in 50 mM Tris/HCl (pH 7.5) and 200 mM NaCl except where otherwise noted, mixed, and allowed to gel at either room temperature or 4 °C. It is known that natural nucleoporins can form homogeneous hydrogels at such high protein concentration, but at lower nucleoporin concentrations, the hydrogels lose their selective permeability function. For this reason, nucleoporin (Nsp1) hydrogels are typically prepared at 150-200 mg/mL concentration throughout the literature. Since the invention concerns developing artificially engineered protein hydrogels that mimic the function of natural nucleoporin hydrogels, the hydrogels were benchmarked against these previous studies and prepared at the concentration at which the selective transport property was shown. Having confirmed that the designed hydrogels at 20 w/v% can show selectivity (Figure 4 and Figure 5), the ability of the best performing gel, P-2NLP-P, to mimic this property at a lower gel concentration, 10 w/v% (Figure 5), was checked. During the hydrogel inversion test, food coloring was added to the buffer for better visualization of the test.
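The text gives gel concentrations both in mg/mL (200 mg/mL) and in w/v% (20 and 10 w/v%). The following Python sketch shows the unit conversion, taking w/v% as grams of solute per 100 mL. The helper names are hypothetical, and the volume estimate assumes, as a simplification, that the dissolved protein does not change the solution volume.

```python
def mg_per_ml_to_wv_percent(conc_mg_per_ml: float) -> float:
    """Convert mg/mL to w/v% (grams of solute per 100 mL)."""
    return conc_mg_per_ml / 10.0


def buffer_volume_ul(protein_mg: float, target_wv_percent: float) -> float:
    """Approximate buffer volume (µL) needed to reach a target w/v%.

    Simplifying assumption: the protein's own volume is ignored.
    """
    target_mg_per_ml = target_wv_percent * 10.0
    return protein_mg / target_mg_per_ml * 1000.0


print(mg_per_ml_to_wv_percent(200))
print(buffer_volume_ul(10, 20))
```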

Rheology: Oscillatory shear rheology was performed on an Anton Paar MCR-301 in Direct Strain Oscillation mode with TruGap™ control. A Peltier heating system and environmental enclosure were employed to control sample temperature. Samples were loaded into a 25 mm cone-and-plate geometry with an angle of 1° and sealed with a mineral oil barrier to prevent dehydration.

Raman Spectroscopy: A custom-built NIR confocal Raman inverted microscopy system was used for Raman measurements. 785 nm light from a continuous-wave Ti:Sapphire laser (3900S, Spectra-Physics) pumped by a frequency-doubled Nd:YAG laser (Millennia 5sJ, Spectra-Physics) was used for the excitation. A water immersion objective lens with 1.2 NA (UPLSAPO60XWIR 60X/1.20, Olympus) was used both to focus the laser onto the sample, which is on top of a quartz coverslip (043210-KJ, Alfa Aesar), and to collect the backscattered light. The Rayleigh light in the collected signal was removed by dichroic mirrors (LPD01-785RU-25x36x1.1, Semrock). Raman light was delivered to an imaging spectrograph (Holospec f/1.8i, Kaiser Optical Systems) and detected by a TE-cooled, back-illuminated, deep depleted CCD (PIXIS: 100BR eXcelon, Princeton Instruments). The laser power at the sample plane was ca. 60 mW, and the signal was integrated for 5 seconds. Nine spectra were collected from each sample and averaged. Spectrum processing (cosmic ray removal, background subtraction and normalization) was performed by MATLAB (Mathworks) scripts.
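The MATLAB scripts themselves are not reproduced in the text. As a rough illustration only, the following Python sketch averages replicate spectra channel by channel, subtracts a flat background, and normalizes to unit maximum. The flat-offset background and the function name `process_spectra` are assumptions, and cosmic ray removal is not covered.

```python
def process_spectra(spectra):
    """Average replicate spectra, subtract a flat background, scale to unit maximum."""
    n_channels = len(spectra[0])
    if any(len(spectrum) != n_channels for spectrum in spectra):
        raise ValueError("all spectra must have the same number of channels")
    # Channel-by-channel mean over the replicates (nine per sample in the text).
    averaged = [sum(channel) / len(spectra) for channel in zip(*spectra)]
    # Assumed background model: a constant offset equal to the lowest channel.
    background = min(averaged)
    subtracted = [value - background for value in averaged]
    peak = max(subtracted)
    if peak == 0:
        raise ValueError("a flat spectrum cannot be normalized")
    return [value / peak for value in subtracted]


demo = process_spectra([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
print(demo)
```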

Capillary transport measurements: 1.5 inch borosilicate capillaries with 0.9 mm inner diameters (Vitrocom 8290) were loaded by piercing pre-made hydrogels. 5 μM solutions of IBB-MBP-EGFP, MBP-mCherry, and/or importin β were injected into the capillary and sealed by a 1:1:1 mixture of vaseline, lanolin, and paraffin. Time lapses of cargo transport into the hydrogels were taken at 1 minute intervals for 1-3 hours on a Nikon Ti Eclipse inverted microscope using a Nikon CFI Plan UW 2X objective and a Hamamatsu C11440-22CU camera. All fluorescence intensity profiles were obtained by averaging the fluorescence intensity within a 100 μm slice width through the center of the gel and across the gel/buffer interface. The profiles were normalized by the bath concentration in the capillaries at the first time point. The interface between gel and buffer, assigned as the point where the intensity changes by 20% relative to the bath fluorescence intensity, is the zero point of the distance scale in each fluorescence intensity profile.
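The profile normalization and interface assignment described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the profile is ordered from the bath side toward the gel, the bath intensity is supplied by the caller, and all function names are hypothetical.

```python
def normalize_profile(intensities, bath_intensity):
    """Divide each point of a profile by the bath intensity at the first time point."""
    return [value / bath_intensity for value in intensities]


def locate_interface(normalized, threshold=0.20):
    """Index of the first point deviating from the bath level (1.0) by >= threshold."""
    for index, value in enumerate(normalized):
        if abs(value - 1.0) >= threshold:
            return index
    raise ValueError("no point deviates from the bath level by the threshold")


def distance_axis(positions_um, interface_index):
    """Shift positions so that the assigned interface sits at zero distance."""
    origin = positions_um[interface_index]
    return [position - origin for position in positions_um]


positions = [0, 100, 200, 300, 400]      # µm along the capillary (synthetic)
raw = [50.0, 49.0, 51.0, 30.0, 10.0]     # synthetic intensities, bath on the left
profile = normalize_profile(raw, bath_intensity=50.0)
interface = locate_interface(profile)
print(interface, distance_axis(positions, interface))
```

Using the absolute deviation means the same rule applies whether the cargo is depleted or enriched at the gel boundary.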

Circular dichroism spectroscopy: CD spectra were obtained on an Aviv Model 202 Circular Dichroism spectrometer. CD spectra were recorded in a quartz cell of 0.1 cm path length at 25 °C between 190 and 250 nm, using a scan rate of 12 nm/min with a wavelength step of 1 nm. cNsp1 and the NLPs were dissolved in 50 mM Tris buffer with 200 mM NaCl at pH 7.5. CD band intensities, after subtraction of the buffer signal, were converted into mean residue ellipticity (MRE).
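The text does not state which conversion formula was used. A commonly used convention is MRE = θ × MRW / (10 × l × c), with θ in mdeg, path length l in cm, concentration c in mg/mL, and mean residue weight MRW taken as the molar mass divided by the number of peptide bonds. The following Python sketch applies that convention. The function name and the example numbers are hypothetical.

```python
def mean_residue_ellipticity(theta_mdeg, molar_mass_g_per_mol, n_residues,
                             conc_mg_per_ml, path_cm=0.1):
    """Convert observed ellipticity (mdeg) to MRE in deg cm^2 dmol^-1.

    Assumed convention: MRE = theta * MRW / (10 * path * conc), where
    MRW = molar mass / (number of residues - 1).
    """
    mean_residue_weight = molar_mass_g_per_mol / (n_residues - 1)
    return theta_mdeg * mean_residue_weight / (10.0 * path_cm * conc_mg_per_ml)


# Hypothetical example: -10 mdeg, 11,000 g/mol, 101 residues, 0.1 mg/mL, 0.1 cm cell.
mre = mean_residue_ellipticity(-10.0, 11000.0, 101, 0.1)
print(mre)
```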

Example 1: Preparation and Expression of Engineered Proteins

To reduce protein loss during the washing step in denaturing Ni-NTA chromatography, the washing buffer did not contain imidazole and was prepared at pH 8, which increased the level of impurities in the eluate. After elution, the eluted proteins were run on denaturing gels (Figure 6). While Nsp1 (60 kg/mol), one of the nuclear pore proteins, was expressed at low levels and was difficult to identify on the gel, the engineered proteins (P-cNsp1-P and P-NLPs-P) were highly expressed. For better purity, anion exchange chromatography was performed, with final purified products shown in Figure 2A. Note that all proteins ran slower in SDS-PAGE (Figures 2A, 7A, and 6) than expected from their calculated and measured molar masses (Table 1), similar to previous reports for P-C₃₀-P protein. Olsen BD et al, Macromolecules 43: 9094 (2010).

Table 1. Protein molar masses measured by MALDI

Table 2. DNA plasmids

Engineered proteins with P domain blocks (P-cNsp1-P, P-1NLP-P and P-2NLP-P) were easily synthesized in much higher yield than recombinant nucleoporin Nsp1. After protein expression and chromatographic purification, the yield of high purity protein was more than 20- to 70-fold higher than that of the recombinant Nsp1 protein (Figure 2 and Figure 6).

NLPs without the coiled-coil domain were also isolated at 10 times greater yield than their parent sequence, cNsp1, after the same procedure (Figure 3). Interestingly, when cNsp1 was fused to the P domain endblocks (P-cNsp1-P), the construct was expressed at a similar yield as the NLP constructs.

Based on this observation, a single P domain together with an intein self-cleavage domain (Mathys S et al, Gene 231: 1 (1999)) was subcloned into the N-terminal cNsp1 gene (P-intein-cNsp1). After the self-cleavage of P-intein domains, cNsp1 was obtained at a similar yield as the NLPs (Figure 3). Significantly improved biosynthetic yields of these artificially engineered proteins enable detailed characterization of their material properties and engineering to control their performance.

Example 2: Preparation and Characterization of Hydrogels

Engineered proteins with P domain endblocks rapidly formed hydrogels, while NLP midblocks alone failed inversion tests, indicating that structure beyond the FG repeat is necessary to give elastic mechanical properties. Consistent with a previous study on recombinant cNsp1, the NLPs without associating coiled-coil domains did not pass hydrogel inversion tests, while the designed proteins with the P domains formed hydrogels within a few minutes, mainly limited by the time required for the lyophilized sample to swell in a buffer. The engineered proteins were found to gel in a number of buffer conditions commonly used for the recombinant Nsp1, demonstrating that P domain endblocks successfully replace the role of the N-terminal sequences of Nsp1 as a gel crosslinker.

More particularly, while cNspl is known not to form hydrogels, P-cNspl-P formed gels in all commonly used buffer conditions for Nspl at a protein concentration of 200 mg/mL. Buffers tested included (i) water; (ii) 0.1% TFA (in water), followed by neutralization with 1/4 volume of the buffer (400 mM Tris-base, 100 mM Tris/HCl pH 7.5, 1 M NaCl); (iii) 0.1% TFA (in water) and neutralized with 1/5 volume of 200 mM Tris-HCl (pH 8.5); (iv) 0.2% TFA (in water) and neutralized with 1/5 volume of 200 mM Tris-HCl (pH 9); and (v) 100 mM phosphate buffer.

Example 3: Significance of Midblock FG Repeats

To characterize the effect of midblock interactions on hydrogel mechanics, frequency-sweep linear oscillatory shear rheology of 20 w/v% hydrogels was performed in the absence or presence of 10% 1,6-hexanediol. Measurements were performed at 25 °C with a strain amplitude of 1%, within the linear viscoelastic range. Representative results are shown in Figure 8.

Rheology showed that the midblock sequence has a significant impact on the low frequency viscoelastic properties of the triblock protein gels without affecting the high frequency elastic plateau modulus. Although the midblocks cNspl, 1NLP, and 2NLP are insufficient to cause gelation without the P domain, all three proteins with P domains formed gels with a comparable modulus (on the order of 10 kPa) with the crossover between G' and G" occurring below 0.1 rad/s (Figure 7A-C). In all three artificially engineered hydrogels, the addition of 10% hexanediol to a 20 w/v% gel led to a decrease in the gel relaxation time, increasing the crossover frequency of the gel by approximately a factor of 5, while the stiffness of hydrogels (the plateau modulus G') changed very little (Figure 7A-C).
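The relationship between crossover frequency and relaxation time described above can be illustrated with a minimal sketch. It assumes a single-mode Maxwell model as a stand-in for the measured spectra, which is a simplification and not the analysis used in the source. The relaxation times (20 s and 4 s) and the function names are hypothetical; only the 10 kPa plateau modulus, the crossover below 0.1 rad/s, and the roughly five-fold shift are taken from the text.

```python
import math

def maxwell_moduli(omega, g0, tau):
    """Single-mode Maxwell model (an assumption): returns (G', G'') at omega."""
    wt = omega * tau
    return g0 * wt**2 / (1 + wt**2), g0 * wt / (1 + wt**2)

def crossover_frequency(omegas, g_storage, g_loss):
    """Frequency where G' = G'', found by interpolating log(G'/G'') in log(omega)."""
    d = [math.log(gp / gpp) for gp, gpp in zip(g_storage, g_loss)]
    for i in range(len(d) - 1):
        if d[i] == 0:
            return omegas[i]
        if d[i] * d[i + 1] < 0:  # sign change brackets the crossover
            x0, x1 = math.log(omegas[i]), math.log(omegas[i + 1])
            x = x0 - d[i] * (x1 - x0) / (d[i + 1] - d[i])
            return math.exp(x)
    return omegas[-1] if d[-1] == 0 else None

# Frequency sweep from 0.001 to 100 rad/s, four points per decade
omegas = [10 ** (-3 + 0.25 * k) for k in range(21)]
G0 = 10e3  # Pa, plateau modulus on the order of 10 kPa (from the text)

def sweep(tau):
    pairs = [maxwell_moduli(w, G0, tau) for w in omegas]
    return [p[0] for p in pairs], [p[1] for p in pairs]

# Hypothetical relaxation times: 20 s without and 4 s with hexanediol
wc_gel = crossover_frequency(omegas, *sweep(20.0))
wc_hex = crossover_frequency(omegas, *sweep(4.0))
tau_gel, tau_hex = 1.0 / wc_gel, 1.0 / wc_hex  # relaxation time ~ 1 / crossover
```

In this sketch the plateau modulus G0 is identical in both sweeps, so only the crossover moves, mirroring the observation that hexanediol shortens the relaxation time while leaving the high-frequency stiffness nearly unchanged.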

Aliphatic alcohols, such as 1,6-hexanediol, are known to weaken FG associations, leading to a loss of selective permeability in vivo and in vitro. Comparison to a control hydrogel of similar molar mass but without FG repeats in the midblock (P-C30-P) showed no effect on the crossover frequency and the high frequency plateau modulus after the addition of hexanediol (Figure 7D and Figure 9), indicating that the endblock P domains are unaffected by hexanediol. Therefore, the changes in crossover frequency in nucleoporin-mimetic gels, characteristic of changes in network relaxation rate, originate from differences in the state of the midblock domain.

It has been shown that interchain β-sheets in some nucleoporin FG repeat hydrogels contribute to crosslinking and enhance the FG hydrogel stability, and removing these crosslinks enhances permeability and reduces selectivity. Labokha AA et al., EMBO J 32: 204 (2013). It is expected that the choice of crosslinking group in the hydrogel may affect biomolecular transport and mechanical properties, as crosslinking controls the mesh size of the gel and the relaxation dynamics of the junction points. These properties can influence the transport of macromolecules interacting with the network chains.

Prominent changes in Raman bands upon the addition of hexanediol confirmed that these changes in gel mechanics are caused by disruption of FG repeats involved in molecular interactions within the midblocks, indicating that naturally observed FG interactions have been successfully adapted into the biosynthetic hydrogels. Upon the addition of hexanediol, a significant decrease was observed in the Raman band at 486 cm⁻¹ corresponding to a Phe vibrational mode for cNspl and both consensus repeat NLP midblocks. Other Phe Raman bands (band assignments in Figure 10) are similar for all midblock polymers in the presence and absence of the hexanediol (Figure 7E-G), indicating that natural cNspl and synthetic NLPs have a similar physicochemical environment for the Phe residues.

Other common changes upon the addition of hexanediol were observed in the Raman bands at 685 and 710 cm⁻¹. These bands do not appear in lyophilized cNspl or NLP (Figure 11) or in the individual amino acids included in NLPs in water solution from a previous study. Zhu G et al., Spectrochim Acta A Mol Biomol Spectrosc 78: 1187 (2011). This suggests that the bands result from the association between midblocks in water. Molecular interactions between Phe and CH₃ and Pro and Lys have been suggested in cNspl based upon Nuclear Overhauser Effect Spectroscopy NMR spectra by Ader et al., Proc Natl Acad Sci USA 107: 6281 (2010). The addition of 10% hexanediol suppressed the Raman bands attributed to Phe (486 cm⁻¹), Pro (856 and 1097 cm⁻¹), Lys (1442 cm⁻¹), and CH₃ (1452 cm⁻¹) in cNspl (Figure 7E), consistent with the NMR result, and therefore the observed Raman bands at 685 and 710 cm⁻¹ may also relate to those residues.

The similar shifts in the crossover frequency in all designed hydrogels (Figure 7A-C) and the large intensity differences in the 486, 685 and 710 cm⁻¹ bands (Figure 7E-G) upon the addition of hexanediol suggest that hydrophobic interactions, including Phe-mediated associations, exist between the midblocks, similar to the natural Nspl hydrogel. These interactions can influence the gel relaxation without contributing significantly to the plateau modulus G' (Figure 7A-C).

Example 4: Transport Selectivity of Engineered Protein Hydrogels

Engineered protein hydrogels containing cNspl, 1NLP, and 2NLP midblocks selectively enhanced transport of specific biomolecules into the hydrogels, mimicking the property of natural Nspl gels. A fluorescence assay originally established to test recombinant nucleoporin hydrogels was performed in a capillary geometry (Figure 4A) to test whether cargo-NTR complexes can permeate through the engineered biosynthetic gels with enhanced transport accumulation, while other molecules and cargo without the NTR are significantly retarded.

For the assay, importin β (95 kg/mol) was chosen as an NTR due to its well-known binding to cargo with an importin β binding (IBB) domain and to the FG repeats of nucleoporin hydrogels. To reduce the passage of cargo without the NTR and to easily quantify the transport of selected cargo, recombinant IBB - maltose binding protein (MBP) - enhanced green fluorescent protein (EGFP) fusion proteins were prepared as a model cargo protein (75 kg/mol; the Stokes' radii of MBP and GFP are reported as 2.85 nm and 2.42 nm, respectively). Based on the widely applied dextran diffusion method, it was expected that cargo diffusion into the gel would be significantly reduced, since the pore size of the gels is smaller than a non-interacting 40 kg/mol dextran probe (Stokes' radius of 4.5 nm; Figure 12). When importin β and the cargo were physically mixed and added to the capillary prepared with the engineered hydrogels, selective partitioning into the hydrogel occurred over time, while a slab diffusion profile was detected in the absence of importin β (Figure 4B-E and Figure 13). The enhanced transport into the gel occurred due to a combination of diffusion and convection caused by gel swelling with buffer and cargo complexes (Figure 14).

Because the length scale of the measurement was much larger than the molecular size and gel mesh size, the gel can be considered as a uniform, semi-infinite slab. The gels can also be treated as macroscopically homogeneous since they are optically clear (absence of inhomogeneity that would scatter on the length scale of visible light) and did not phase separate upon centrifugation. Under these conditions, the permeability coefficient is the product of the diffusivity and solubility coefficients. The discontinuous concentration profile at the interface during the capillary experiments suggested that the cargo-importin β complexes are more soluble in the gel phase than the cargo alone because of the physical association between importin β and FG repeats. This increase in solubility enhances the permeability of the cargo complexes.
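The slab picture above can be sketched numerically. The following is a minimal illustration of the textbook solution for diffusion into a semi-infinite slab with partitioning at the interface, assuming a constant reservoir concentration and neglecting the swelling-driven convection noted earlier. The diffusivity and the solubility (partition) coefficients are hypothetical values chosen for illustration; they are not measurements from the source.

```python
import math

def permeability(diffusivity, solubility):
    """Permeability coefficient as the product of diffusivity and solubility."""
    return diffusivity * solubility

def slab_profile(x, t, c0, diffusivity, solubility):
    """Concentration at depth x and time t in a semi-infinite slab.

    The gel side of the interface is held at solubility * c0, which gives the
    discontinuous profile at x = 0 when solubility differs from 1.
    """
    return solubility * c0 * math.erfc(x / (2.0 * math.sqrt(diffusivity * t)))

def slab_uptake(t, c0, diffusivity, solubility):
    """Depth-integrated amount per unit area taken up by the slab at time t."""
    return 2.0 * solubility * c0 * math.sqrt(diffusivity * t / math.pi)

C0 = 5e-3          # mol/m^3, i.e. 5 uM reservoir concentration
D = 1e-11          # m^2/s, hypothetical diffusivity in the gel
K_INERT = 1.0      # hypothetical solubility coefficient, cargo alone
K_COMPLEX = 4.0    # hypothetical solubility coefficient, cargo-importin complex
T = 3600.0         # s, one hour

interface_inert = slab_profile(0.0, T, C0, D, K_INERT)
interface_complex = slab_profile(0.0, T, C0, D, K_COMPLEX)
uptake_ratio = slab_uptake(T, C0, D, K_COMPLEX) / slab_uptake(T, C0, D, K_INERT)
```

With equal diffusivities, the uptake ratio in this model reduces to the ratio of solubility coefficients, which is one way to see how a higher solubility in the gel phase alone can raise permeability.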

Example 5: Electrostatic Effects on Transport Properties

The two simplified NLP midblocks, both consensus sequences of cNspl but differing by a single amino acid in the repeating peptide, showed quantitative differences in transport properties. When accumulated green fluorescence intensities were calculated relative to the intensities without importin β (Figure 4F and Figure 13), the P-2NLP-P gel showed almost twice the intensity of the P-1NLP-P gel. P-cNspl-P, P-1NLP-P, and P-2NLP-P hydrogels absorbed 3.9 ± 0.4 (mean ± SD, n = 3), 2.3 ± 0.1 (n = 3), and 3.8 ± 0.3 (n = 6) times more cargo-importin β complexes than inert molecules in an hour, respectively. Both NLPs have equal numbers of FG repeats, the same P domain crosslinkers for gelation, similar secondary structure as determined by circular dichroism (Figure 15 and Table 3), and similar passive diffusion profiles for inert molecules over time (dotted black curves in Figure 4D-E). Therefore, this quantitative change in permeability is believed to originate from the change in the consensus repeat sequence.

Table 3: CD analysis of cNspl and NLPs ^a

1NLP    0.00    0.02    0.19    0.12    0.18    0.47    0.98    0.141
2NLP    0.01    0.02    0.18    0.12    0.18    0.48    0.99    0.128

^a By the circular dichroism secondary structure (CDSSTR) method (Kang JW et al., Biomed Opt Express 2: 2484 (2011)), it was found that approximately 50% of the structure of all proteins was disordered and the remainder was β-strand and β-turn.

Since the single amino acid change from Asp in 1NLP to Ser in 2NLP occurred in the middle of the peptide between FG repeats that are known to bind importin β, the change from an anionic to a neutral residue (a change in formal charge from -16 to 0 for the entire midblock of 16 repeats) suggests that electrostatic effects may affect molecular transport. It is interesting to note that the hydrogel made from P-cNspl-P, where the midblock has a formal charge of +6, shows higher cargo-carrier accumulation at the gel interface than the P-2NLP-P hydrogel (max. fluorescence intensity: 7.6 ± 0.7 and 5.0 ± 0.9 for P-cNspl-P and P-2NLP-P, respectively). However, the depth-integrated accumulation in an hour is the same for both materials (Figure 4F), indicating that the P-2NLP-P gel has better permeability to the cargo-carrier than the P-cNspl-P gel.
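The formal-charge bookkeeping above can be checked with a short sketch. It counts only side-chain charges at neutral pH (Lys and Arg as +1, Asp and Glu as -1) and ignores histidine and the chain termini, which is a simplifying assumption. The repeat used is the one listed as SEQ ID NO:1 in the claims; the position of the Ser-to-Asp swap in the variant is chosen arbitrarily for illustration and is not taken from the text.

```python
# Side-chain formal charges at neutral pH (simplified; His and termini ignored)
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}

def formal_charge(sequence):
    """Sum of side-chain formal charges over a one-letter amino acid sequence."""
    return sum(CHARGE.get(residue, 0) for residue in sequence)

REPEAT = "PAFSFGAKPDEKKDSDTSK"  # SEQ ID NO:1 from the claims
N_REPEATS = 16                  # midblock of 16 contiguous repeats

neutral_midblock = REPEAT * N_REPEATS

# Hypothetical variant: one Ser per repeat replaced by Asp.
# Index 14 is an arbitrary illustrative choice, not the documented position.
variant_repeat = REPEAT[:14] + "D" + REPEAT[15:]
anionic_midblock = variant_repeat * N_REPEATS

charge_neutral = formal_charge(neutral_midblock)
charge_anionic = formal_charge(anionic_midblock)
```

Because each repeat contributes independently, one neutral-to-anionic substitution per repeat shifts the midblock charge by 16 units regardless of where in the repeat it occurs.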

Changes in the charge of the protein based on a single substitution per repeat unit between 1NLP and 2NLP affect biomolecular transport through the designed hydrogels, despite the use of high ionic strength buffers that screen charge as under physiological conditions. Recent studies of related biological hydrogels such as mucus and cartilage have similarly observed that electrostatic effects influence molecular transport at physiologically relevant ionic strength conditions. Selective binding can be added to synthetic systems by conjugating FG peptides onto polymer gels. The results presented here on NLP hydrogels indicate that not only FG sequences but also residues far from the FG repeat can play an important role in the performance of nuclear pore-mimetic synthetic hydrogels.

Example 6: Further Characterization of P-2NLP-P

After identifying P-2NLP-P as the top performing construct, additional experiments were performed to explore its performance. The addition of 10% hexanediol to disrupt FG interactions eliminated selective accumulation of cargo complexes in P-2NLP-P gels (Figure 4G), showing that FG repeat interactions are involved in enhanced selective transport in the biosynthetic hydrogels. This indicates that the engineered hydrogels have a filtering mechanism similar to the natural nucleoporin system.

Experiments using blends of model proteins with and without the IBB domain established the ability of the biosynthetic NLP gels to actively transport cargo proteins in preference to inert proteins. A model cargo protein incapable of binding importin β, MBP-mCherry (67 kg/mol), is smaller than IBB-MBP-EGFP (75 kg/mol) but showed retarded transport through the biosynthetic NLP hydrogels even in the presence of importin β (Figure 4H-I; results of a 3-hour assay in 10 w/v% and 20 w/v% P-2NLP-P gels are shown in Figure 5).

Figure 4H-I shows a selective permeability test performed on P-1NLP-P and P-2NLP-P biosynthetic hydrogels (20 w/v%) with the addition of 5 μM MBP-mCherry, a model inert molecule, to 5 μM IBB-MBP-EGFP/importin β cargo complex mixtures. Over an hour, the cargo-carrier complexes accumulated 3.0 and 5.3 times more than MBP-mCherry (without the IBB domain) in P-1NLP-P and P-2NLP-P hydrogels, respectively.

When hydrated in buffer at 20 w/v% and 10 w/v%, P-2NLP-P formed optically clear gels that did not phase separate upon centrifugation, suggesting they formed macroscopically homogeneous networks under these conditions. Figure 5 shows that over three hours, the cargo-carrier complexes accumulated in the 20 w/v% and 10 w/v% gels 10.2 and 5.2 times more than MBP-mCherry (no IBB domain), respectively, indicating that the 10 w/v% gel still showed enhanced transport of the selected biomolecules and that the total accumulation depended on the number of FG sequences in the hydrogels (i.e., there are twice as many FG sequences in the 20 w/v% gel as in the 10 w/v% gel).

The results illustrate that the designed hydrogels can mimic both the selectivity and enhanced transport of natural nucleoporin hydrogels.

Example 7: NTR-Mediated Selective Uptake of a Target Molecule

This example illustrates a generalizable method for capturing selected molecules into a hydrogel of the invention. As shown in Figure 18, a peptide tag which can associate with a target molecule of interest can be genetically fused to a nuclear transport receptor (NTR). In solution, the peptide tag fused to the NTR will recognize, i.e., associate with, its target molecule and thereby spontaneously form NTR-peptide tag-target molecule complexes. Due to the interaction between the NTR and the hydrogel, the NTR-peptide tag-target molecule complex is captured by and carried into the hydrogel. As a model system, incomplete green fluorescent protein (GFP) was used as the target molecule and a GFP tag as the binding peptide tag. It is known that complete GFP can be constructed by mixing incomplete GFP and the GFP tag, with green fluorescence as evidence of the assembly. Cabantous S et al., Nat Methods 3: 845-854 (2006); Kent KP et al., J Am Chem Soc 130: 9664-9665 (2008). Another benefit of using GFP as a model is that green fluorescence can be used to visualize selective transport of the NTR-GFP tag-incomplete GFP (i.e., NTR-GFP) complex into the hydrogel under a fluorescence microscope.

The GFP tag was genetically fused to the C-terminus of the NTR nuclear transport factor 2 (NTF2). Since NTF2 is a homodimer, each NTF2 dimer carries two GFP tags. For protein purification, a 6xHistidine tag was also inserted between NTF2 and the GFP tag (NTF2-His tag-GFP tag). After protein synthesis, more than 50 mg of chimeric NTF2-GFP tag was obtained.

A capillary transport assay validated the method and confirmed that the hydrogel system can capture selected molecules into the hydrogel. When NTF2 with the GFP tag was mixed with incomplete GFP, the solution turned from colorless to light green after overnight incubation. When the solution was added to one end of a capillary in which 20 wt% hydrogel filled the other end (Figure 19), the green fluorescence accumulated in the hydrogel over time (Figures 19A-19C). Under the same experimental conditions, fluorescein-labeled dextran (40 kg/mol, hydrodynamic radius: 4.5 nm) did not enter the hydrogel (Figures 19D-19F).

One or two incomplete GFP molecules can associate with the NTF2-GFP tag homodimer, which has two GFP tags. The molar mass of GFP is approximately 30 kg/mol (2.42 nm hydrodynamic radius). The molar mass of the NTF2-GFP tag homodimer is 36 kg/mol. Since the NTF2-GFP complex (66 kg/mol with one GFP, or 96 kg/mol with two GFP) is larger than the 40 kg/mol dextran, the results clearly show that the synthetic system can mimic the natural selective filtering function of the nuclear pore, carrying the target molecule into the gel even though the NTF2-GFP complex is larger than the uncomplexed inert reference molecule.
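The size comparison above is simple arithmetic and can be written out as a small sketch. The molar masses are the approximate values quoted in the text; the variable and function names are illustrative. Molar mass is used here only as the proxy for size that the text itself uses.

```python
# Approximate molar masses quoted in the text, in kg/mol
GFP = 30.0              # incomplete GFP, taken as ~30 kg/mol
NTF2_TAG_DIMER = 36.0   # NTF2-GFP tag homodimer
DEXTRAN = 40.0          # inert fluorescein-labeled dextran reference

def complex_mass(n_gfp):
    """Molar mass of the NTF2-GFP tag homodimer carrying n_gfp bound GFP molecules."""
    return NTF2_TAG_DIMER + n_gfp * GFP

# The homodimer has two GFP tags, so it can bind one or two GFP molecules
masses = {n: complex_mass(n) for n in (1, 2)}
larger_than_dextran = all(mass > DEXTRAN for mass in masses.values())
```

Both possible complexes exceed the dextran reference in molar mass, which is the basis for the conclusion that uptake is driven by the NTR interaction and not by small size.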

The method just described opens new avenues for a number of applications, including, for example, drug delivery, food toxicology, and defense. Researchers have developed peptide libraries for various target molecules using phage display and solid-phase peptide synthesis techniques. By simply fusing those peptides to NTR carriers, the carriers will capture the targets and bring them into the hydrogels. More diverse peptides will become available for specific target molecules with time; thereby, the method will be further generalized. As a specific example, Staphylococcal enterotoxin B (SEB) toxoid can be captured into the hydrogel using an anti-SEB tag fused to NTF2 or importin β; the hydrogel can then be removed or destroyed for environmental decontamination.

INCORPORATION BY REFERENCE

All patents and published patent applications mentioned in the description above are incorporated by reference herein in their entirety.

EQUIVALENTS

Having now fully described the present invention in some detail by way of illustration and example for purposes of clarity of understanding, it will be obvious to one of ordinary skill in the art that the same can be performed by modifying or changing the invention within a wide and equivalent range of conditions, formulations and other parameters without affecting the scope of the invention or any specific embodiment thereof, and that such modifications or changes are intended to be encompassed within the scope of the appended claims.
