ANCHORED TRANSFERRIN FUSION PROTEIN LIBRARIES - BIOREXIS PHARMACEUTICAL CORP

Title:

ANCHORED TRANSFERRIN FUSION PROTEIN LIBRARIES

Document Type and Number:

WIPO Patent Application WO/2006/138700

Abstract:

Fusion proteins comprising a transferrin moiety, a stalk moiety, and cell wall linking member and peptide libraries thereof are disclosed. The present invention includes a method of screening peptide libraries displayed in fusion proteins expressed by host cells. The fusion proteins of the present invention include transferrin fusion proteins capable of expression in yeast.

See also references of EP 1893635A4

Attorney, Agent or Firm:

TUSCAN, Michael et al. (875 15th Street Nw, Suite 80, Washington DC, US)

Claims:

We claim:

1. A fusion protein comprising:

(a) a transferrin (Tf) moiety;

(b) a stalk moiety; and

(c) a cell wall linking member.

2. The fusion protein of claim 1, wherein the transferrin moiety is fused directly to the stalk moiety.

3. The fusion protein of claim 1, wherein the fusion protein further comprises an anchor moiety.

4. The fusion protein of claim 1, wherein the transferrin moiety is a transferrin protein, a modified transferrin protein or a fragment thereof.

5. The fusion protein of claim 4, wherein the transferrin protein is a human transferrin protein.

6. The fusion protein of claim 4, wherein the transferrin moiety comprises the SEQ ID NO.: 3.

7. The fusion protein of claim 1, wherein the Tf moiety comprises the N domain of a Tf protein.

8. The fusion protein of claim 1, wherein the Tf moiety consists of the N domain of a Tf protein.

9. The fusion protein of claim 1, wherein the Tf moiety comprises a portion of the N domain of a Tf protein.

10. The fusion protein of claim 1, wherein the Tf moiety exhibits reduced glycosylation.

11. The fusion protein of claim 1, wherein the Tf moiety is modified to exhibit reduced affinity to iron.

12. The fusion protein of claim 1, wherein the Tf moiety is modified to have reduced affinity for bicarbonate.

13. The fusion protein of claim 1, wherein the Tf moiety does not bind to bicarbonate.

14. The fusion protein of claim 1, wherein the Tf moiety is modified at one or more sites from the group consisting of a glycosylation site, iron binding site, hinge site, bicarbonate site, and receptor binding site.

15. The fusion protein of claim 1, wherein the Tf moiety comprises at least one mutation that prevents glycosylation.

16. The fusion protein of claim 15, wherein the mutation is in an N-linked glycosylation site comprising the sequence N-X-S/T.

17. The fusion protein of claim 16, wherein the sequence N-X-S/T begins at an amino acid corresponding to N413 or N611 of SEQ ID NO.: 3.

18. The fusion protein of claim 17, wherein N, X, S or T has been changed to a proline.

19. The fusion protein of claim 1, wherein the transferrin moiety has been modified to exhibit no glycosylation.

20. The fusion protein of claim 1, wherein the Tf moiety is fused to a ligand or a plurality of ligands.

21. The fusion protein of claim 20, wherein the ligand is one or more of the group consisting of a single chain antibody, an antibody, antibody fragment, an antibody variable region, a random peptide, or an antibody complementarity-determining region (CDR).

22. The fusion protein of claim 20, wherein the ligand is fused to the N-terminal end of the Tf moiety.

23. The fusion protein of claim 20, wherein the ligand is fused to the C-terminal end of the Tf moiety.

24. The fusion protein of claim 20, wherein the ligand is inserted within the transferrin moiety.

25. The fusion protein of claim 20, wherein the ligand is inserted in a surface exposed loop of the transferrin moiety.

26. The fusion protein of claim 20, wherein the ligand consists of 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, 12 or more, 13 or more, 14 or more, 15 or more, or 20 or more amino acids.

27. The fusion protein of claim 20, wherein the transferrin moiety and ligand comprise a transbody.

28. The fusion protein of claim 20, wherein the ligand is capable of binding to one or more of the group consisting of peptide, antigen, receptor, antibody, toxin, metabolite, and nucleic acid.

29. The fusion protein of claim 2, wherein the transferrin moiety is fused to the N-terminal end of the stalk moiety.

30. The fusion protein of claim 2, wherein the stalk moiety is fused to the C-terminal end of the Tf moiety.

31. The fusion protein of claim 1, wherein the stalk moiety is a heavily glycosylated peptide.

32. The fusion protein of claim 1, wherein the stalk moiety comprises a mucin domain.

33. The fusion protein of claim 32, wherein the mucin domain contains one or more proteins of the group consisting of MUC1, MUC2, MUC3, MUC4, MUC5AC, MUC5B, MUC6, MUC7, MUC8, and MUC9 and variants and fragments thereof.

34. The fusion protein of claim 1, wherein the stalk moiety comprises a human MUC1 protein or fragment thereof.

35. The fusion protein of claim 34, wherein the stalk moiety is encoded by the nucleic acid of SEQ ID NO: 5.

36. The fusion protein of claim 1, wherein the stalk moiety comprises a human MUC3 protein or fragment thereof.

37. The fusion protein of claim 1, wherein the stalk moiety comprises a yeast AGA1 protein or fragment thereof.

38. The fusion protein of claim 1, wherein the stalk moiety functions to reduce steric hindrance between the transferrin moiety and a host cell or substrate.

39. The fusion protein of claim 3, wherein the stalk moiety is fused to the N-terminal end of the anchor moiety.

40. The fusion protein of claim 3, wherein the anchor moiety is fused to the C-terminal end of the stalk moiety.

41. The fusion protein of claim 3, wherein the anchor moiety is a glycosyl-phosphatidyl-inositol (GPI) or a derivative or fragment thereof.

42. The fusion protein of claim 41, wherein the GPI comprises a yeast GPI signal sequence or fragment thereof.

43. The fusion protein of claim 42, wherein the GPI signal sequence comprises SEQ ID NO: 15.

44. The fusion protein of claim 41, wherein the GPI comprises a mammalian GPI signal sequence or fragment thereof.

45. The fusion protein of claim 3, wherein the anchor moiety comprises a modified GPI signal sequence.

46. The fusion protein of claim 45, wherein the modified GPI signal sequence contains a cysteine residue at position 1 of SEQ ID NO.: 26.

47. The fusion protein of claim 45, wherein the modified GPI signal sequence is an amino acid sequence selected from the group consisting of the first two amino acids of SEQ ID NO. 34 and SEQ ID NOs.: 34, 36, and 38.

48. The fusion protein of claim 1, wherein the cell wall linking member is covalently bound to the cell wall.

49. The fusion protein of claim 1, wherein the cell wall linking member is non-covalently bound to the cell wall.

50. The fusion protein of claim 1, wherein the cell wall linking member is the stalk moiety.

51. The fusion protein of claim 1, wherein the cell wall linking member is an anchor moiety.

52. The fusion protein of claim 51, wherein the anchor moiety is a transmembrane domain.

53. The fusion protein of claim 1, wherein the cell wall linking member comprises one or more free cysteine residues capable of forming a disulfide bond with one or more proteins in the cell wall.

54. The fusion protein of claim 1, wherein the cell wall linking member comprises one or more glycans of the stalk moiety capable of cross-linking with beta-glucans of the cell wall.

55. A nucleic acid molecule encoding a fusion protein of any one of claims 1-54.

56. A host cell comprising a nucleic acid molecule of claim 55.

57. A host cell that expresses a fusion protein of any one of claims 1-54.

58. The host cell of claim 56, wherein the cell is a yeast cell.

59. The host cell of claim 58, wherein the yeast cell is Saccharomyces or Pichia.

60. A method of screening for the binding activity of a ligand, comprising exposing a library of host cells of claim 57 to an agent and detecting binding of at least one host cell to said agent.

61. The method of claim 60, wherein the library of host cells is a collection of yeast cells.

62. The method of claim 60, wherein the agent is an antigen or receptor.

63. A fusion protein comprising:

(a) an albumin moiety;

(b) a stalk moiety; and

64. The fusion protein of claim 63, wherein the albumin moiety is an albumin protein, a modified albumin protein or a fragment thereof.

65. The fusion protein of claim 64, wherein the albumin protein is a human albumin protein.

66. The fusion protein of claim 63 further comprising an anchor moiety.

67. The fusion protein of claim 63, further comprising a transmembrane domain.

Description:

ANCHORED TRANSFERRIN FUSION PROTEIN LIBRARIES

INVENTORS: Baiyang Wang and Andrew Turner

RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application 60/691,229, filed June 17, 2005. This application is related to U.S. Patent Application 10/515,429, filed November 23, 2004; U.S. Provisional Application 60/485,404, filed July 9, 2003; U.S. Patent Application 10/384,060 filed March 10, 2003; and U.S. Provisional Application 60/406,977, filed August 30, 2002, all of which are incorporated by reference in their entirety.

FIELD OF THE INVENTION

[0002] The present invention relates to fusion proteins, fusion protein libraries, and the use of fusion proteins to screen for binding activity of a ligand.

BACKGROUND OF THE INVENTION

[0003] Cell Surface Display Systems

[0004] Combinatorial library screening and selection methods have become common research tools (Phizicky et al. (1995) Microbiological Reviews 59: 94-123). One of the most widespread techniques is phage display, whereby a protein is expressed as a polypeptide fusion to a bacteriophage coat protein and subsequently screened by binding to an immobilized or soluble biotinylated ligand. Presentation of random peptides is often accomplished by constructing chimeric proteins expressed on the outer surface of filamentous bacteriophages such as M13, fd and f1. Phage display has been successfully applied to antibodies, DNA binding proteins, protease inhibitors, and enzymes. See Hoogenboom et al. (1997) Trends in Biotechnol. 15: 62-70; Ladner (1995) Trends in Biotechnol. 13: 426-430; Lowman et al. (1991) Biochemistry 30: 10832-10838; Markland et al. (1996) Biochemistry 35: 8045-8057; and Matthews et al. (1993) Nucleic Acids Res. 21: 1727-1734.

[0005] In addition to phage display, several bacterial cell surface display methods have been developed. See Georgiou et al. (1997) Nat. Biotechnol. 15: 29-34. One approach taken in bacterial cell surface display methods has been to use a fusion protein comprising a pilin protein (TraA) or a portion thereof and a heterologous polypeptide displaying the library peptide on the outer surface of a bacterial host cell capable of forming a pilus. See U.S. patent 5,516,637 which is herein incorporated by reference in its entirety.

[0006] The FLITRX™ random peptide library (Invitrogen™ Life Technologies) uses the bacterial flagellar protein, FliC, and thioredoxin, TrxA, to display a random peptide library of dodecamers on the surface of E. coli in a conformationally constrained manner. See Lu et al. (1995) BioTechnology 13: 366. These systems have been applied to antibody epitope mapping, the development and construction of live bacterial vaccine delivery systems, and the generation of whole-cell bio-adsorbants for environmental clean-up purposes and diagnostics. Peptide sequences that bind to tumor specific targets on tumor derived epithelial cells have also been identified using the FLITRX™ system. See Brown et al. (2000) Annals of Surgical Oncology, 7(10): 743.

[0007] Yeast cell surface display systems have been developed for library screening and have been successful at overcoming some of the limitations of phage and bacterial display systems. Yeast surface display systems, such as the pYD1 Yeast Display Vector Kit (Invitrogen™ Life Technologies), use the a-agglutinin receptor of S. cerevisiae to display foreign proteins on the cell surface. The a-agglutinin receptor consists of two subunits encoded by the AGA1 and AGA2 genes. The Aga1 protein (Aga1p, 725 amino acids) is secreted from the cell and becomes covalently attached to β-glucan in the extracellular matrix of the yeast cell wall. The Aga2 protein (Aga2p, 69 amino acids) binds to Aga1p through two disulfide bonds and after secretion remains attached to the cell through its contact with Aga1p. The N-terminal portion of Aga2p is required for attachment to Aga1p, while proteins and peptides can be fused to the C-terminus for presentation on the yeast cell surface. Agglutinin is a native yeast protein which normally functions as a specific adhesion contact to fuse yeast cells during mating. As such, it has evolved for protein-protein binding without excessive steric hindrance from cell wall components. See Boder et al. in "Yeast Surface Display for Directed Evolution of Protein Expression, Affinity, and Stability", Applications of Chimeric Genes and Hybrid Proteins, (Jeremy Thorner et al.), Academic Press, 2000, Vol. 328, pages 430-439; US 6,699,658; and US 6,423,538, which are herein incorporated by reference in their entireties.

[0008] One of the drawbacks of this system, however, is that, since the Aga2p-fusion protein and Aga1p are required to form a disulfide bond in order for the Aga2p protein to be tethered to the cell wall, the efficiency of display is relatively low, with only 40% to 60% of yeast cells effectively displaying the protein on the surface. See Feldhaus et al. (2003) Nat. Biotechnol. 21(2): 163-70. A need exists for a yeast display system that presents most, if not all, proteins of a library on a cell surface.

[0009] Another drawback of the Aga1p and Aga2p yeast display system is that it requires that the ligand to be screened be attached to the C-terminus of Aga2p. As a result, the system cannot be used to select peptides in which a free N-terminus is required for binding and/or for activity. Accordingly, a need exists for a flexible display system that does not require the binding of the N-terminus of the ligand to a yeast cell protein.

[0010] Transferrin Fusion Protein

[0011] Serum transferrin (Tf) is a monomeric glycoprotein with a molecular weight of 80,000 daltons that binds iron in the circulation and transports it to various tissues via the transferrin receptor (TfR) (Aisen et al. (1980) Ann. Rev. Biochem. 49: 357-393; MacGillivray et al. (1981) J. Biol. Chem. 258: 3543-3553; and U.S. Patent 5,026,651). Tf is one of the most common serum molecules, comprising up to about 5-10% of total serum proteins. Carbohydrate deficient transferrin occurs in elevated levels in the blood of alcoholic individuals and exhibits a longer half life (approximately 14-17 days) than that of glycosylated transferrin (approximately 7-10 days). See van Eijk et al. (1983) Clin. Chim. Acta 132:167-171; Stibler (1991) Clin. Chem. 37:2029-2037; Arndt (2001) Clin. Chem. 47(1):13-27; and Stibler et al. in "Carbohydrate-deficient consumption", Advances in the Biosciences, (Ed Nordmann et al.), Pergamon, 1988, Vol. 71, pages 353-357. The structure of Tf has been well characterized and the mechanisms of receptor binding, iron binding and release and carbonate ion binding have been elucidated. See U.S. Patents 5,026,651, 5,986,067 and MacGillivray et al. (1983) J. Biol. Chem. 258(6):3543-3546, all of which are herein incorporated by reference in their entirety.

[0012] Mucin is a heavily glycosylated protein which has been used to elevate a ligand domain of a fusion protein at a substantial distance from a microarray. It has been hypothesized that elevating a ligand a significant distance from a substrate increases binding of the ligand to a receptor displayed in receptor-expressing cells. See WO 01/46698 which is herein incorporated by reference in its entirety.

[0013] The inventors of the present invention have previously developed transferrin fusion protein libraries. See U.S. Patent Application 10/515,429 which is herein incorporated by reference in its entirety. The present invention provides a transferrin fusion protein that contains a stalk-like moiety, such as mucin, designed to reduce steric hindrance and increase ligand binding. The fusion protein can be expressed and displayed on the surface of a host cell, such as yeast, such that the expressed transferrin fusion protein can be used as a peptide screening platform. Further, the transferrin and ligand portion of the fusion protein can be cleaved and used as a therapeutic. This may not be possible to accomplish with existing yeast display technology since the removal of the N-terminal fused Aga2 protein would likely affect the conformation of a small ligand linked to transferrin.

SUMMARY OF THE INVENTION

[0014] As described in more detail below, the present invention includes a fusion protein with a transferrin (Tf) moiety, a stalk moiety, and a cell wall linking group. The Tf moiety contains a transferrin protein or a portion thereof and is displayed on the yeast cell surface. For example, the transferrin moiety can be a portion of the N domain, i.e. lobe, of the transferrin protein. The Tf moiety can be a modified Tf protein such that the Tf portion of the fusion protein exhibits reduced glycosylation compared to wild-type Tf. In one embodiment of the invention, the transferrin portion of the fusion protein exhibits no glycosylation. In another embodiment of the present invention, the transferrin moiety of the fusion protein is modified so that it exhibits reduced affinity to iron, bicarbonate, and/or reduced affinity to a transferrin receptor compared to wild-type transferrin. The transferrin moiety may be modified so that it is unable to bind to a transferrin receptor, to iron, or to bicarbonate. Accordingly, the present invention includes modified transferrin moieties in which the transferrin moiety is modified at one or more sites from the group consisting of a glycosylation site, iron binding site, hinge site, bicarbonate site, and receptor binding site.

[0015] The ligand of the claimed invention can be complexed or fused with the transferrin moiety in various ways. Further, a transferrin moiety may have more than one ligand associated with it. The ligand moiety may be fused to the N-terminus, to the C-terminus of the transferrin moiety, or may be located within the transferrin moiety. In one embodiment of the invention, the ligand is inserted at one or more amino acid positions of the N-lobe (N₁ or N₂) selected from the group consisting of amino acid positions Asp33, Asn55, Asn75, Asp90, Gly257, Lys280, His289, Ser298, Ser105, Glu141, Asp166, Gln184, Asp197, Lys217, Thr231 and Cys241.

[0016] In another embodiment of the invention, the ligand is located on an exposed loop of the transferrin moiety. The ligand moiety such as a random peptide can be expressed by a host cell in a vector coding for the transferrin fusion protein such that it can be in-frame with the transferrin moiety. A random peptide ligand moiety expressed with a transferrin moiety can be created by many methods known in the art including, but not limited to, error prone PCR and DNA shuffling. A ligand moiety can also be added to a transferrin fusion protein after the latter has already been translated.

[0017] The ligand can take many forms, including, but not limited to, a single chain antibody, antibody, antibody fragment, antibody variable region, random peptide, or antibody complementarity-determining region (CDR). Ligands may contain a variable or random region and an invariable region. The ligand can be a ligand of interest or one ligand in a library of ligands. The ligand may be capable of binding to a number of receptors or agents such as a peptide, antigen, receptor, antibody, toxin, metabolite, and nucleic acid.

[0018] The stalk moiety can be oriented such that its N-terminus is fused to the transferrin moiety and its C-terminus located in the cell, for instance, in the cell wall. In one embodiment, the C-terminus of the stalk moiety is fused to an anchor moiety. The stalk moiety of the present invention spans the cell wall of a yeast cell and is generally a moderately to heavily glycosylated peptide. By spanning the cell wall, the stalk moiety may act as a cell wall linking member to tether the fusion protein through the cell wall. In one embodiment of the invention, the stalk moiety spans the cell wall and is partially displayed on the cell surface. The composition of the stalk moiety may give it a rod-like conformation which reduces steric hindrance that would otherwise exist between the fusion protein, notably the ligand, and the host cell.

[0019] The stalk moiety may contain or consist of a mucin, mucin variant or fragment thereof. The mucin domain may include, for instance, MUC1, MUC2, MUC3, MUC4, MUC5AC, MUC5B, MUC6, MUC7, MUC8 and MUC9 and variants thereof. In one embodiment, the stalk moiety contains a human MUC1 domain such as the peptide corresponding to the nucleic acid sequence of SEQ ID NO: 5 or a fragment thereof. In another embodiment, the stalk moiety comprises two or more repeats of a mucin, for instance, two or more repeats of MUC1 or MUC3. In a further embodiment, the stalk moiety comprises two or more mucin proteins or variants or fragments thereof from the group consisting of MUC1, MUC2, MUC3, MUC4, MUC5AC, MUC5B, MUC6, MUC7, MUC8, and MUC9.

[0020] The stalk moiety may also contain or consist of other proteins that are moderately to heavily glycosylated, including native yeast wall proteins. For instance, in one embodiment of the invention, the stalk moiety contains or consists of Aga1, a variant of Aga1, or a fragment thereof.

[0021] The fusion proteins of the present invention include a cell wall linking member which acts to immobilize or tether the fusion protein to a host cell. The cell wall linking member can covalently or non-covalently bind the fusion protein of the invention to the yeast cell wall. In one embodiment of the invention, the stalk moiety of the fusion protein is the cell wall linking member. For instance, O-glycans from the stalk moiety can crosslink to beta glucans of the cell wall. Other cell wall linking members include, but are not limited to, peptides containing free cysteine residues. For instance, a stalk moiety or anchor moiety containing one or more unpaired cysteine residues can form a disulfide bond(s) with one or more unpaired cysteine residues of proteins in the cell wall.

[0022] The fusion protein of the invention can optionally contain an anchor moiety which also acts to immobilize or tether the transferrin fusion protein to the host cell. The anchor moiety can be a cell wall linking member or can tether the fusion protein to a yeast cell membrane.

[0023] One anchor domain capable of tethering the fusion protein of the present invention to a yeast cell membrane, among others, is a glycosyl-phosphatidyl-inositol (GPI) peptide anchor that is added through post-translational protein modification to the ω-site in the GPI signal peptide sequence, such as the signal peptide sequence provided in SEQ ID NO.: 15. In one embodiment of the invention, an anchor such as the one provided by a modified GPI signal sequence transiently tethers the fusion protein to a host cell membrane or cell wall before being cleaved. Once cleaved, the fusion protein remains tethered to the cell via the cell wall linking member as a result of glycans from the stalk moiety being crosslinked into the beta glucans of the cell wall.

[0024] In another embodiment of the invention, the anchor is a transmembrane domain. The transmembrane domain (TMD) can be the region of a single pass type I or type II membrane protein or any one of the several transmembrane regions of a multispan membrane protein.

[0025] The present invention also includes the nucleic acid molecule that encodes the claimed fusion protein. The nucleic acid can be inserted in a vector and used to transform a host cell such as yeast. Once transformed with the nucleic acid of the present invention, the host cell can express the fusion protein. Induction of expression of the fusion protein can be controlled by methods known in the art, for instance, by use of an inducible promoter. The present invention includes a library of fusion proteins expressed in a collection of host cells, for instance, a collection of yeast cells expressing the fusion protein of the invention displaying randomized peptides.

[0026] In another embodiment of the present invention, the fusion protein is used to screen for the binding activity of a ligand or agent. A library of host cells capable of expressing the claimed fusion protein can be exposed to an agent, including but not limited to, an antigen or receptor, and then screened for binding activity. Cell surface display libraries can be screened using methods known in the art, including, but not limited to, FACS and magnetic beads.

BRIEF DESCRIPTION OF THE DRAWINGS

[0027] FIGURE 1 shows a random peptide or CDR library displayed on a transferrin fusion protein and the binding of the ligand with a target.

[0028] FIGURE 2 provides the yeast YIR019C GPI anchor peptide sequence and highlights the amino acids responsible for cell membrane attachment.

[0029] FIGURE 3 provides the vector map for pREX0549.

[0030] FIGURE 4 provides the vector map for pREX0995.

[0031] FIGURE 5 provides the vector map for pREX0667.

[0032] FIGURE 6 provides the vector map for pREX1012.

[0033] FIGURE 7 provides the vector map for pREX0759.

[0034] FIGURE 8 shows the presence of Flag-tagged yeast after two rounds of MACS separation.

[0035] FIGURE 9 provides the vector map for pREX0855.

[0036] FIGURE 10 provides the vector map for pREX1087.

[0037] FIGURE 11 provides the vector map for pREX1106.

[0038] FIGURE 12 shows FACS analysis with MUC1 and AGA1.

DETAILED DESCRIPTION

[0039] General Description

[0040] The inventors of the present invention have developed a multifunctional fusion protein that can be used, for instance, as part of a cell surface display system to screen libraries, e.g., random peptide or CDR libraries. The fusion protein includes a transferrin moiety complexed with or fused to one or more ligands. The invention envisions a fusion protein containing a protein other than transferrin so long as the other protein is soluble and is capable of conferring increased serum half-life to the fused one or more ligands when cleaved from the remainder of the fusion protein. For instance, albumin or a variant or fragment thereof can be used in the place of transferrin.

[0041] The transferrin moiety of the fusion protein is fused to a stalk moiety, which is moderately to heavily glycosylated. The fusion protein contains a cell wall linking member which is capable of covalently or non-covalently binding the fusion protein to the cell wall of a yeast cell. In one embodiment of the invention, the fusion protein also contains an anchor moiety such as a transmembrane domain.

[0042] The fusion protein offers advantages over the prior art when used as a yeast display system, including providing an increased percentage of clones with cell surface displayed peptides compared to the Aga1p and Aga2p yeast display system. The fusion protein of the invention also offers the flexibility of screening ligands that require an available N-terminus for binding.

[0043] The present invention also includes therapeutic compositions comprising the fusion proteins or portions thereof, and methods of treating, preventing, or ameliorating diseases or disorders by administering the fusion proteins or portions thereof to a subject in need of such a therapeutic. A fusion protein of the invention includes at least a fragment or variant of a putative therapeutic protein as a ligand moiety. In one embodiment of the invention, the transferrin and ligand, i.e., therapeutic, portion of the fusion protein can be cleaved from the stalk moiety, i.e., yeast cell bound portion of the fusion protein and used to prepare a biopharmaceutical or vaccine.

[0044] Definitions

[0045] As used herein, the term "biological activity" refers to a function or set of activities performed by a therapeutic molecule, ligand moiety, protein or peptide in a biological context, i.e., in an organism or an in vitro facsimile thereof. Biological activities may include, but are not limited to, the functions of the therapeutic molecule portion of the claimed fusion proteins, such as, but not limited to, the induction of extracellular matrix secretion from responsive cell lines, the induction of hormone secretion, the induction of chemotaxis, the induction of mitogenesis, the induction of differentiation, or the inhibition of cell division of responsive cells. A fusion protein or peptide of the invention is considered to be biologically active if it exhibits one or more biological activities of its therapeutic protein's native counterpart.

[0046] As used herein, an "amino acid corresponding to" or an "equivalent amino acid" in a transferrin sequence is identified by alignment to maximize the identity or similarity between a first transferrin sequence and at least a second transferrin sequence. The number used to identify an equivalent amino acid in a second transferrin sequence is based on the number used to identify the corresponding amino acid in the first transferrin sequence. In certain cases, these phrases may be used to describe the amino acid residues in human transferrin compared to certain residues in rabbit serum transferrin.
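By way of illustration, the alignment-based numbering described in this definition can be sketched as follows. This minimal Python sketch assumes that a pairwise alignment has already been computed by some other tool and is supplied as two equal-length gapped strings; the function name `corresponding_position` and the toy sequences are hypothetical and do not come from the specification.

```python
def corresponding_position(aligned_ref, aligned_other, ref_pos, gap="-"):
    """Map a 1-based residue number in the reference onto the other sequence.

    Both sequences are rows of one pairwise alignment (equal length, gaps
    written as '-'). Returns the 1-based residue number in the other
    sequence, or None if the reference residue is aligned to a gap.
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have equal length")
    if ref_pos < 1:
        raise IndexError("ref_pos must be 1 or greater")
    ref_count = other_count = 0
    for r, o in zip(aligned_ref, aligned_other):
        if r != gap:
            ref_count += 1
        if o != gap:
            other_count += 1
        # Stop at the alignment column holding the requested reference residue.
        if r != gap and ref_count == ref_pos:
            return other_count if o != gap else None
    raise IndexError("ref_pos is beyond the end of the reference sequence")


# Toy alignment, NOT real transferrin sequences.
ref_row = "MKT-AYN"    # reference residues: M1 K2 T3 A4 Y5 N6
other_row = "M-TGAYN"  # other residues:     M1 T2 G3 A4 Y5 N6
equivalent = corresponding_position(ref_row, other_row, 3)
```

In this toy case, residue T3 of the reference corresponds to T2 of the second sequence, while K2 of the reference has no equivalent because it is aligned to a gap.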

[0047] As used herein, the terms "Tf moiety", "fragment of a Tf protein" or "Tf protein," or "portion of a Tf protein" refer to an amino acid sequence comprising at least about 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% of a naturally occurring Tf protein or mutant thereof.

[0048] As used herein, the term "gene" refers to any segment of DNA associated with a biological function. Thus, genes include, but are not limited to, coding sequences and/or the regulatory sequences required for their expression. Genes can also include nonexpressed DNA segments that, for example, form recognition sequences for other proteins. Genes can be obtained from a variety of sources, including cloning from a source of interest or synthesizing from known or predicted sequence information, and may include sequences designed to have desired parameters.

[0049] As used herein, a "heterologous polynucleotide" or a "heterologous nucleic acid" or a "heterologous gene" or a "heterologous sequence" or an "exogenous DNA segment" refers to a polynucleotide, nucleic acid or DNA segment that originates from a source foreign to the particular host cell, or, if from the same source, is modified from its original form. A heterologous gene in a host cell includes a gene that is endogenous to the particular host cell, but has been modified. Thus, the terms refer to a DNA segment which is foreign or heterologous to the cell, or homologous to the cell but in a position within the host cell nucleic acid in which the element is not ordinarily found. As an example, a signal sequence native to a yeast cell but attached to a human Tf sequence is heterologous.

[0050] As used herein, an "isolated" nucleic acid sequence refers to a nucleic acid sequence which is essentially free of other nucleic acid sequences, e.g., at least about 20% pure, preferably at least about 40% pure, more preferably about 60% pure, even more preferably about 80% pure, most preferably about 90% pure, and even most preferably about 95% pure, as determined by agarose gel electrophoresis. For example, an isolated nucleic acid sequence can be obtained by standard cloning procedures used in genetic engineering to relocate the nucleic acid sequence from its natural location to a different site where it will be reproduced. The cloning procedures may involve excision and isolation of a desired nucleic acid fragment comprising the nucleic acid sequence encoding the polypeptide, insertion of the fragment into a vector molecule, and incorporation of the recombinant vector into a host cell where multiple copies or clones of the nucleic acid sequence will be replicated. The nucleic acid sequence may be of genomic, cDNA, RNA, semisynthetic, synthetic origin, or any combinations thereof.

[0051] As used herein, two or more DNA coding sequences are said to be "joined" or "fused" when, as a result of in-frame fusions between the DNA coding sequences, the DNA coding sequences are translated into a fusion polypeptide. The term "fusion" in reference to a fusion protein refers to a protein comprising a ligand moiety, a stalk moiety, and an anchor moiety. A Tf fusion protein is a fusion of a transferrin moiety to a stalk moiety and contains a cell wall binding member.

[0052] "Modified transferrin" as used herein refers to a transferrin molecule that exhibits at least one modification of its amino acid sequence, compared to wild-type transferrin.

[0053] "Modified transferrin fusion protein" as used herein refers to a protein formed by the fusion of at least one molecule of modified transferrin (or a fragment or variant thereof) complexed or fused to a ligand, which is fused to a stalk moiety.

[0054] As used herein, the terms "nucleic acid" or "polynucleotide" refer to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. Unless specifically limited, the terms encompass nucleic acids containing analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al. (1991) Nucleic Acid Res. 19:5081; Ohtsuka et al. (1985) J. Biol. Chem. 260:2605-2608; Cassol et al. (1992); Rossolini et al. (1994) Mol. Cell. Probes 8:91-98). The term nucleic acid is used interchangeably with gene, cDNA, and mRNA encoded by a gene.

[0055] As used herein, a DNA segment is referred to as "operably linked" when it is placed into a functional relationship with another DNA segment. For example, DNA for a signal sequence is operably linked to DNA encoding a fusion protein of the invention if it is expressed as a preprotein that participates in the secretion of the fusion protein; a promoter or enhancer is operably linked to a coding sequence if it stimulates the transcription of the sequence. Generally, DNA sequences that are operably linked are contiguous, and in the case of a signal sequence or fusion protein both contiguous and in reading phase. However, enhancers need not be contiguous with the coding sequences whose transcription they control. Linking, in this context, is accomplished by ligation at convenient restriction sites or at adapters or linkers inserted in lieu thereof.

[0056] As used herein, the term "promoter" refers to a region of DNA involved in binding RNA polymerase to initiate transcription.

[0057] As used herein, the term "recombinant" refers to a cell, tissue or organism that has undergone transformation with recombinant DNA.

[0058] As used herein, a targeting entity, protein, polypeptide or peptide refers to a molecule that binds specifically to a particular cell type, e.g., normal cell, such as a lymphocyte, or abnormal cell, such as a cancer cell, and therefore may be used to target a Tf fusion protein or compound (drug, or cytotoxic agent) to that cell type specifically.

[0059] As used herein, "therapeutic protein" refers to proteins, polypeptides, antibodies, peptides or fragments or variants thereof, having one or more therapeutic and/or biological activities. Therapeutic proteins encompassed by the invention include but are not limited to proteins, polypeptides, peptides, antibodies, and biologics. The terms peptides, proteins, and polypeptides are used interchangeably herein. Additionally, the term "therapeutic protein" may refer to the endogenous or naturally occurring correlate of a therapeutic protein. By a polypeptide displaying a "therapeutic activity" or a protein that is "therapeutically active" is meant a polypeptide that possesses one or more known biological and/or therapeutic activities associated with a therapeutic protein such as one or more of the therapeutic proteins described herein or otherwise known in the art. As a non-limiting example, a "therapeutic protein" is a protein that is useful to treat, prevent or ameliorate a disease, condition or disorder. Such a disease, condition or disorder may be in humans or in a non-human animal, e.g., for veterinary use.

[0060] As used herein, the term "transformation" refers to the transfer of nucleic acid, i.e., a nucleotide polymer, into a cell. As used herein, the term "genetic transformation" refers to the transfer and incorporation of DNA, especially recombinant DNA, into a cell.

[0061] As used herein, the term "transformant" refers to a cell, tissue or organism that has undergone transformation.

[0062] As used herein, the term "transgene" refers to a nucleic acid that is inserted into an organism, host cell or vector in a manner that ensures its function.

[0063] As used herein, the term "transgenic" refers to cells, cell cultures, organisms, bacteria, fungi, animals, plants, and progeny of any of the preceding, which have received a foreign or modified gene and in particular a gene encoding a modified Tf fusion protein by one of the various methods of transformation, wherein the foreign or modified gene is from the same or different species than the species of the organism receiving the foreign or modified gene.

[0064] "Variants or variant" refers to a polynucleotide or nucleic acid differing from a reference nucleic acid or polypeptide, but retaining essential properties thereof. Generally, variants are overall closely similar, and, in many regions, identical to the reference nucleic acid or polypeptide. As used herein, "variant" refers to a therapeutic protein portion of a transferrin fusion protein of the invention, differing in sequence from a native therapeutic protein but retaining at least one functional and/or therapeutic property thereof as described elsewhere herein or otherwise known in the art.

[0065] As used herein, the term "vector" refers broadly to any plasmid, phagemid or virus encoding an exogenous nucleic acid. The term is also to be construed to include non-plasmid, non-phagemid and non-viral compounds which facilitate the transfer of nucleic acid into virions or cells, such as, for example, polylysine compounds and the like. The vector may be a viral vector that is suitable as a delivery vehicle for delivery of the nucleic acid, or mutant thereof, to a cell, or the vector may be a non-viral vector which is suitable for the same purpose. Examples of viral and non-viral vectors for delivery of DNA to cells and tissues are well known in the art and are described, for example, in Ma et al. (1997, Proc. Natl. Acad. Sci. U.S.A. 94:12744-12746). Examples of viral vectors include, but are not limited to, a recombinant vaccinia virus, a recombinant adenovirus, a recombinant retrovirus, a recombinant adeno-associated virus, a recombinant avian pox virus, and the like (Cranage et al., 1986, EMBO J. 5:3057-3063; International Patent Application No. WO94/17810, published August 18, 1994; International Patent Application No. WO94/23744, published October 27, 1994). Examples of non-viral vectors include, but are not limited to, liposomes, polyamine derivatives of DNA, and the like.

[0066] As used herein, the term "wild type" refers to a polynucleotide or polypeptide sequence that is naturally occurring.

[0067] As used herein, "scaffold protein", "scaffold polypeptide", or "scaffold" refers to a protein to which amino acid sequences such as random peptides, can be fused. The peptides are exogenous to the scaffold.

[0068] As used herein, "random peptide sequence" refers to an amino acid sequence composed of two or more amino acid monomers and constructed by a stochastic or random process. A random peptide can include framework or scaffolding motifs, which may comprise invariant sequences. A random peptide sequence may contain a portion of non- variant, i.e., non-random, amino acids.

[0069] As used herein "random peptide library" refers to a set of polynucleotide sequences that encodes a set of random peptides, and to the set of random peptides encoded by those polynucleotide sequences, as well as the fusion proteins containing those random peptides.

[0070] As used herein, the term "pseudorandom" refers to a set of sequences that have limited variability, so that for example, the degree of residue variability at one position is different than the degree of residue variability at another position, but any pseudorandom position is allowed some degree of residue variation, however circumscribed.

[0071] As used herein, the term "defined sequence framework" refers to a set of defined sequences that are selected on a nonrandom basis, generally on the basis of experimental data or structural data; for example, a defined sequence framework may comprise a set of amino acid sequences that are predicted to form a β-sheet structure or may comprise a leucine zipper heptad repeat motif or a zinc-finger domain, among other variations. A "defined sequence kernel" is a set of sequences which encompass a limited scope of variability. Whereas a completely random 10-mer sequence of the 20 conventional amino acids can be any of (20)¹⁰ sequences, and a pseudorandom 10-mer sequence of the 20 conventional amino acids can be any of (20)¹⁰ sequences but will exhibit a bias for certain residues at certain positions and/or overall, a defined sequence kernel is a subset of sequences which is less than the maximum number of potential sequences if each residue position was allowed to be any of the allowable 20 conventional amino acids (and/or allowable unconventional amino/imino acids). A defined sequence kernel generally comprises variant and invariant residue positions and/or comprises variant residue positions which can comprise a residue selected from a defined subset of amino acid residues, and the like, either segmentally or over the entire length of the individual selected library member sequence. Defined sequence kernels can refer to either amino acid sequences or polynucleotide sequences.
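The size comparison in the preceding paragraph can be made concrete with a short calculation. The Python sketch below is illustrative only: it counts the sequences available to a fully random 10-mer and to one hypothetical restricted library. The restricted layout (four invariant positions and six positions limited to five residues each) is an assumption chosen for the example and is not taken from the specification.

```python
from math import prod

CONVENTIONAL_AMINO_ACIDS = 20  # the 20 conventional amino acids


def library_size(allowed_per_position):
    """Count distinct sequences when position i may hold allowed_per_position[i] residues."""
    return prod(allowed_per_position)


# Fully random 10-mer: every position may be any of the 20 residues.
random_10mer = library_size([CONVENTIONAL_AMINO_ACIDS] * 10)

# Hypothetical restricted 10-mer (assumed layout, for illustration only):
# invariant positions contribute a factor of 1, variant positions a factor of 5.
restricted_10mer = library_size([1, 5, 5, 1, 5, 5, 1, 5, 5, 1])

print(random_10mer)      # 10240000000000, i.e. 20**10
print(restricted_10mer)  # 15625, i.e. 5**6
```

A pseudorandom library spans the same 20¹⁰ sequences as the fully random case; its positional bias changes how often each sequence appears, not how many sequences are possible.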

[0072] As used herein, "linker" or "spacer" refers to a molecule or group of molecules that connects two molecules, such as a DNA binding protein and a random peptide, and serves to place the two molecules in a desirable configuration, e.g., so that the random peptide can bind to a receptor with minimal steric hindrance from the DNA binding protein.

[0073] As used herein, the term "variable segment" refers to a portion of a nascent peptide which comprises a random, pseudorandom, or defined kernel sequence. A variable segment can comprise both variant and invariant residue positions, and the degree of residue variation at a variant residue position may be limited; both options are selected at the discretion of the practitioner. Typically, variable segments are about 3 to 20 amino acid residues in length, e.g., 8 to 10 amino acids in length, although variable segments may be longer and may comprise antibody portions or receptor proteins, such as an antibody fragment, a nucleic acid binding protein, a receptor protein and the like.

[0074] As used herein, the term "epitope" refers to that portion of an antigen or other macromolecule capable of forming a binding interaction that interacts with the variable region binding pocket of an antibody. Typically, such binding interaction is manifested as an intermolecular contact with one or more amino acid residues of a CDR.

[0075] As used herein, the term "receptor," "target," or "agent" refers to a molecule that has an affinity for a given ligand. Receptors can be naturally occurring or synthetic molecules. Receptors can be employed in an unaltered state or as aggregates with other species. Receptors can be attached, covalently or noncovalently, to a binding member, i.e., ligand, either directly or via a specific binding substance. Examples of receptors include, but are not limited to, antibodies, including monoclonal antibodies and antisera reactive with specific antigenic determinants (such as on viruses, cells, or other materials), cell membrane receptors, antigens, epitope containing molecules, complex carbohydrates and glycoproteins, enzymes and hormone receptors.

[0076] As used herein, the term "ligand" or "ligand moiety" refers to a molecule, such as a random peptide or variable segment sequence, that is recognized by a particular receptor or agent. As one of skill in the art will recognize, a molecule (or macromolecular complex) can be both a receptor and a ligand.

[0077] As used herein, "fused", "complexed" or "operably linked" is meant that the random peptide and the scaffold protein are linked together, in such a manner as to minimize the disruption to the stability of the scaffold structure.

[0078] As used herein, the term "single-chain antibody" refers to a polypeptide comprising a V_H domain and a V_L domain in polypeptide linkage, generally linked via a spacer peptide (e.g., [Gly-Gly-Gly-Gly-Ser]_x, SEQ ID NO.: 17) and which may comprise additional amino acid sequences at the amino- and/or carboxy-termini. For example, a single-chain antibody may comprise a tether segment for linking to the encoding polynucleotide. As an example, a scFv is a single-chain antibody. Single-chain antibodies are generally proteins consisting of one or more polypeptide segments of at least 10 contiguous amino acids substantially encoded by genes of the immunoglobulin superfamily (e.g., see The Immunoglobulin Gene Superfamily, A. F. Williams and A. N. Barclay, in Immunoglobulin Genes, T. Honjo, F. W. Alt, and T. H. Rabbitts, eds., (1989) Academic Press: San Diego, Calif., pp. 361-387, which is incorporated herein by reference), most frequently encoded by a rodent, non-human primate, avian, porcine, bovine, ovine, goat, or human heavy chain or light chain gene sequence. A functional single-chain antibody generally contains a sufficient portion of an immunoglobulin superfamily gene product so as to retain the property of binding to a specific target molecule, typically a receptor or antigen (epitope).

[0079] As used herein, the terms "complementarity-determining region" and "CDR" refer to the art-recognized term as exemplified by the Kabat and Chothia CDR definitions, also generally known as hypervariable regions or hypervariable loops. See Chothia and Lesk (1987) J. Mol. Biol. 196: 901; Chothia et al. (1989) Nature 342: 877; E. A. Kabat et al., Sequences of Proteins of Immunological Interest (National Institutes of Health, Bethesda, Md.) (1987); and Tramontano et al. (1990) J. Mol. Biol. 215: 175. Variable region domains typically comprise the amino-terminal approximately 105-115 amino acids of a naturally-occurring immunoglobulin chain, e.g., amino acids 1-110, although variable domains somewhat shorter or longer are also suitable for forming single-chain antibodies.

[0080] An immunoglobulin light or heavy chain variable region consists of a "framework" region interrupted by three hypervariable regions, also called CDRs. The extent of the framework region and of the CDRs has been precisely defined. See "Sequences of Proteins of Immunological Interest," E. Kabat et al., 4th Ed., U.S. Department of Health and Human Services, Bethesda, Md. (1987). The sequences of the framework regions of different light or heavy chains are relatively conserved within a species. As used herein, a "human framework region" is a framework region that is substantially identical (about 85% or more, usually 90-95% or more) to the framework region of a naturally occurring human immunoglobulin. The framework region of an antibody, that is the combined framework regions of the constituent light and heavy chains, serves to position and align the CDRs. The CDRs are primarily responsible for binding to an epitope of an antigen.

[0081] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are described.

[0082] Transferrin and Transferrin Modifications

[0083] The fusion proteins of the present invention include a transferrin (Tf) protein or portion thereof which is able to present a ligand such as a random peptide or CDR to a receptor or agent. The Tf moiety is fused to the N-terminus of the stalk moiety. The Tf protein or portion thereof of the fusion protein may be referred to as a Tf "portion", "region" or "moiety" of the fusion protein. As used herein, a transferrin fusion protein is a transferrin protein or moiety fused to a stalk moiety, and contains a cell wall linking member. The transferrin fusion protein of the invention optionally contains an anchor moiety.

[0084] Any transferrin may be used to make modified Tf fusion proteins of the invention. As an example, a wild-type human Tf (Tf) is a 679 amino acid protein, of approximately 75 kDa (not accounting for glycosylation), with two main lobes or domains, N (about 330 amino acids) and C (about 340 amino acids), which appear to originate from a gene duplication. See GenBank accession numbers NM001063, XM002793, M12530, XM039845, XM039847 and S95936 (www.ncbi.nlm.nih.gov), all of which are herein incorporated by reference in their entirety, as well as SEQ ID NOS: 1, 2 and 3. The two domains have diverged over time but retain a large degree of identity/similarity.

[0085] Each of the N and C domains is further divided into two subdomains: N1 and N2, and C1 and C2. The function of Tf is to transport iron to the cells of the body. This process is mediated by the Tf receptor (TfR), which is expressed on all cells, particularly actively growing cells. TfR recognizes the iron-bound form of Tf (two molecules of which are bound per receptor). Endocytosis then occurs, whereby the TfR/Tf complex is transported to the endosome, at which point the localized drop in pH results in release of the bound iron, recycling of the TfR/Tf complex to the cell surface, and release of Tf (known as apoTf in its un-iron-bound form). Receptor binding is mainly through the C domain of Tf. The two glycosylation sites in the C domain do not appear to be involved in receptor binding, as unglycosylated iron-bound Tf does bind the receptor.

[0086] Each Tf molecule can carry two iron ions (Fe³⁺). These are complexed in the space between the N1 and N2 subdomains and between the C1 and C2 subdomains, resulting in a conformational change in the molecule.

[0087] In human transferrin, the iron binding sites comprise at least amino acids Asp 63 (Asp 82 of SEQ ID NO: 2, which includes the native Tf signal sequence), Asp 392 (Asp 411 of SEQ ID NO: 2), Tyr 95 (Tyr 114 of SEQ ID NO: 2), Tyr 426 (Tyr 445 of SEQ ID NO: 2), Tyr 188 (Tyr 207 of SEQ ID NO: 2), Tyr 514 or 517 (Tyr 533 or Tyr 536 of SEQ ID NO: 2), His 249 (His 268 of SEQ ID NO: 2), and His 585 (His 604 of SEQ ID NO: 2) of SEQ ID NO: 3. The hinge regions comprise at least N domain amino acid residues 94-96, 245-247 and/or 316-318 as well as C domain amino acid residues 425-427, 581-582 and/or 652-658 of SEQ ID NO: 3. The carbonate binding sites comprise at least amino acids Thr 120 (Thr 139 of SEQ ID NO: 2), Thr 452 (Thr 471 of SEQ ID NO: 2), Arg 124 (Arg 143 of SEQ ID NO: 2), Arg 456 (Arg 475 of SEQ ID NO: 2), Ala 126 (Ala 145 of SEQ ID NO: 2), Ala 458 (Ala 477 of SEQ ID NO: 2), Gly 127 (Gly 146 of SEQ ID NO: 2), and Gly 459 (Gly 478 of SEQ ID NO: 2) of SEQ ID NO: 3.
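Every residue pair listed above differs by the same 19 positions, which is consistent with SEQ ID NO: 2 carrying the native signal sequence ahead of the numbering used for SEQ ID NO: 3. The Python sketch below, offered only as an illustration, converts between the two numberings and checks the conversion against the pairs given in the paragraph. The offset of 19 is inferred from those pairs rather than stated in the text, and the function and constant names are hypothetical.

```python
# Offset inferred from the paired numbers in the text (e.g. Asp 63 -> Asp 82);
# it is an assumption of this sketch, not a value stated in the specification.
SIGNAL_SEQUENCE_OFFSET = 19


def seq3_to_seq2(position):
    """Convert a SEQ ID NO: 3 residue number to SEQ ID NO: 2 numbering."""
    return position + SIGNAL_SEQUENCE_OFFSET


def seq2_to_seq3(position):
    """Convert a SEQ ID NO: 2 residue number to SEQ ID NO: 3 numbering."""
    if position <= SIGNAL_SEQUENCE_OFFSET:
        raise ValueError("position lies within the assumed signal sequence")
    return position - SIGNAL_SEQUENCE_OFFSET


# Residue pairs as listed in the paragraph: SEQ ID NO: 3 number -> SEQ ID NO: 2 number.
LISTED_PAIRS = {
    63: 82, 392: 411, 95: 114, 426: 445, 188: 207, 514: 533, 517: 536,
    249: 268, 585: 604, 120: 139, 452: 471, 124: 143, 456: 475,
    126: 145, 458: 477, 127: 146, 459: 478,
}

all_consistent = all(seq3_to_seq2(k) == v for k, v in LISTED_PAIRS.items())
print(all_consistent)
```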

[0088] In one embodiment of the invention, the fusion proteins include a modified human transferrin, although any animal Tf molecule may be used to produce the fusion proteins of the invention, including human Tf variants, cow, pig, sheep, dog, rabbit, rat, mouse, hamster, echidna, platypus, chicken, frog, hornworm, monkey, ape, as well as other bovine, canine and avian species. All of these Tf sequences are readily available in GenBank and other public databases. The human Tf nucleotide sequence is available (see SEQ ID NOS: 1, 2 and 3 and the accession numbers described above and available at www.ncbi.nlm.nih.gov) and can be used to make genetic fusions between Tf or a domain of Tf and the therapeutic molecule of choice. Fusions may also be made from related molecules such as lactotransferrin (lactoferrin; GenBank Acc: NM_002343) or melanotransferrin.

[0089] Lactoferrin (Lf), a natural defense iron-binding protein, has been found to possess antibacterial, antimycotic, antiviral, antineoplastic and anti-inflammatory activity. The protein is present in exocrine secretions that are commonly exposed to normal flora: milk, tears, nasal exudate, saliva, bronchial mucus, gastrointestinal fluids, cervico-vaginal mucus and seminal fluid. Additionally, Lf is a major constituent of the secondary specific granules of circulating polymorphonuclear neutrophils (PMNs). The apoprotein is released on degranulation of the PMNs in septic areas. A principal function of Lf is that of scavenging free iron in fluids and inflamed areas so as to suppress free radical-mediated damage and decrease the availability of the metal to invading microbial and neoplastic cells. In a study that examined the turnover rate of ¹²⁵I Lf in adults, it was shown that Lf is rapidly taken up by the liver and spleen, and the radioactivity persisted for several weeks in the liver and spleen (Bennett et al. (1979), Clin. Sci. (Lond.) 57: 453-460).

[0090] In one embodiment, the transferrin portion of the fusion protein of the invention includes a transferrin splice variant. In one example, a transferrin splice variant can be a splice variant of human transferrin. In one specific embodiment, the human transferrin splice variant can be that of Genbank Accession AAA61140.

[0091] In another embodiment, the transferrin portion of the fusion protein of the invention includes a lactoferrin splice variant. In one example, a human serum lactoferrin splice variant can be a novel splice variant of a neutrophil lactoferrin. In one specific embodiment, the neutrophil lactoferrin splice variant can be that of Genbank Accession AAA59479. In another specific embodiment, the neutrophil lactoferrin splice variant can comprise the following amino acid sequence EDCIALKGEADA (SEQ ID NO: 4), which includes the novel region of splice-variance.

[0092] Fusions may also be made with melanotransferrin (GenBank Acc. NM_013900, murine melanotransferrin). Melanotransferrin is a glycosylated protein found at high levels in malignant melanoma cells and was originally named human melanoma antigen p97 (Brown et al., 1982, Nature, 296: 171-173). It possesses high sequence homology with human serum transferrin, human lactoferrin, and chicken transferrin (Brown et al., 1982, Nature, 296: 171-173; Rose et al., Proc. Natl. Acad. Sci., 1986, 83: 1261-1265). However, unlike these proteins, no cellular receptor has been identified for melanotransferrin. Melanotransferrin reversibly binds iron and exists in two forms, one of which is bound to cell membranes by a glycosyl phosphatidylinositol anchor while the other form is both soluble and actively secreted (Baker et al., 1992, FEBS Lett., 298: 215-218; Alemany et al., 1993, J. Cell Sci., 104: 1155-1162; Food et al., 1994, J. Biol. Chem. 274: 7011-7017).

[0093] Modified Tf fusions may be made with any Tf protein, fragment, domain, or engineered domain. For instance, fusion proteins may be produced using the full-length Tf sequence, with or without the native Tf signal sequence. Trans-bodies may also be made using a single Tf domain, such as an individual N or C domain. Trans-bodies may also be made with a double Tf domain, such as a double N domain or a double C domain. In some embodiments, fusions of a therapeutic protein to a single C domain may be produced, wherein the C domain is altered to reduce, inhibit or prevent glycosylation, iron binding and/or Tf receptor binding. In other embodiments, the use of a single N domain is advantageous as the Tf glycosylation sites reside in the C domain and the N domain, on its own, does not bind iron or the Tf receptor. In one embodiment the Tf fusion protein has a single N domain which is expressed at a high level.

[0094] As used herein, a C terminal domain or lobe modified to function as an N-like domain is modified to exhibit glycosylation patterns or iron binding properties substantially like that of a native or wild-type N domain or lobe. In one embodiment, the C domain or lobe is modified so that it is not glycosylated and does not bind iron by substitution of the relevant C domain regions or amino acids to those present in the corresponding regions or sites of a native or wild-type N domain.

[0095] As used herein, a Tf moiety comprising "two N domains or lobes" includes a Tf molecule that is modified to replace the native C domain or lobe with a second native or wild-type N domain or lobe or a modified N domain or lobe or contains a C domain that has been modified to function substantially like a wild-type or modified N domain. See U.S. provisional application 60/406,977, which is herein incorporated by reference in its entirety.

[0096] Analysis of the two domains by overlay of the 3-dimensional structure of the two domains (Swiss PDB Viewer 3.7b2, Iterative Magic Fit) and by direct amino acid alignment (ClustalW multiple alignment) reveals that the two domains have diverged over time. Amino acid alignment shows 42% identity and 59% similarity between the two domains. However, approximately 80% of the N domain matches the C domain for structural equivalence. The C domain also has several extra disulfide bonds compared to the N domain.
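To show what percent identity and percent similarity measure, the Python sketch below scores a pair of pre-aligned sequences column by column. It is a simplified illustration and does not reproduce ClustalW scoring: the similarity groups, the treatment of gaps (gapped columns are skipped), and the toy sequences are all assumptions made for the example, and none of them is transferrin data.

```python
# Hypothetical groups of residues treated as "similar"; real tools use
# substitution matrices, so this grouping is an assumption of the sketch.
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"),
                  set("ST"), set("NQ"), set("AG")]


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent similarity) over ungapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    identical = sum(x == y for x, y in columns)
    # Identical residues also count as similar.
    similar = sum(
        x == y or any(x in group and y in group for group in SIMILAR_GROUPS)
        for x, y in columns
    )
    total = len(columns)
    return 100.0 * identical / total, 100.0 * similar / total


# Toy aligned pair, for illustration only.
toy_a = "DKVLYA-GE"
toy_b = "DRILFSTGE"
percent_identity, percent_similarity = identity_and_similarity(toy_a, toy_b)
print(percent_identity, percent_similarity)  # 50.0 87.5
```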

[0097] Alignment of molecular models for the N and C domain reveals the following structural equivalents:

The disulfide bonds for the two domains align as follows:

(Bold: aligned disulfide bonds. Italics: bridging peptide.)

[0098] In one embodiment, the transferrin portion of the fusion protein includes at least two N terminal lobes of transferrin. In further embodiments, the transferrin portion of the fusion protein includes at least two N terminal lobes of transferrin derived from human serum transferrin.

[0099] In another embodiment, the transferrin portion of the fusion protein includes, comprises, or consists of at least two N terminal lobes of transferrin having a mutation in at least one amino acid residue selected from the group consisting of Asp63, Gly65, Tyr95, Tyr188, and His249 of SEQ ID NO: 3.

[00100] In another embodiment, the transferrin portion of the modified fusion protein includes a recombinant human serum transferrin N-terminal lobe mutant having a mutation at Lys206 or His207 of SEQ ID NO: 3.

[00101] In another embodiment, the transferrin portion of the fusion protein includes, comprises, consists essentially of, or consists of at least two C terminal lobes of transferrin. In further embodiments, the transferrin portion of the fusion protein includes at least two C terminal lobes of transferrin derived from human serum transferrin.

[00102] In a further embodiment, the C terminal lobe mutant further includes a mutation of at least one of Asn413 and Asn611 of SEQ ID NO: 3 which does not allow glycosylation.

[00103] In another embodiment, the transferrin portion includes at least two C terminal lobes of transferrin having a mutation in at least one amino acid residue selected from the group consisting of Asp392, Tyr426, Tyr514, Tyr517 and His585 of SEQ ID NO: 3, wherein the mutant retains the ability to bind metal ions. In an alternate embodiment, the transferrin portion includes at least two C terminal lobes of transferrin having a mutation in at least one amino acid residue selected from the group consisting of Tyr426, Tyr514, Tyr517 and His585 of SEQ ID NO: 3, wherein the mutant has a reduced ability to bind metal ions. In another embodiment, the transferrin portion includes at least two C terminal lobes of transferrin having a mutation in at least one amino acid residue selected from the group consisting of Asp392, Tyr426, Tyr517 and His585 of SEQ ID NO: 3, wherein the mutant does not retain the ability to bind metal ions and functions substantially like an N domain.

[00104] In some embodiments, the Tf or Tf portion will be of sufficient length to increase the in vivo circulatory half-life, serum stability, in vitro solution stability or bioavailability of the ligand, i.e., therapeutic, when the Tf or Tf portion and ligand of the fusion protein are cleaved from the remainder of the fusion protein, compared to the in vivo circulatory half-life, serum stability (half-life), in vitro stability or bioavailability of the ligand in an unfused state, i.e., not fused to Tf. Such an increase in stability, in vivo circulatory half-life or bioavailability may be about a 30%, 50%, 70%, 80%, 90% or more increase over the unfused ligand moiety region. In some cases, the ligand moiety comprising modified transferrin exhibits a serum half-life of about 1 or more days, 1-2 or more days, 3-5 or more days, 5-10 or more days, 10-15 or more days, 10-20 or more days, about 12-18 days or about 14-17 days compared to the ligand in an unfused state.

[00105] When the C domain of Tf is part of the fusion protein, the two N-linked glycosylation sites, amino acid residues corresponding to N413 and N611 of SEQ ID NO: 3, may be mutated for expression in a yeast system to prevent glycosylation or hypermannosylation and extend the serum half-life of the fusion protein (to produce asialo-, or in some instances, monosialo-Tf or disialo-Tf). In addition to Tf amino acids corresponding to N413 and N611, mutations to the residues within or adjacent to the N-X-S/T glycosylation site prevent or substantially reduce glycosylation. See U.S. Patent 5,986,067 of Funk et al. It has also been reported that the N domain of Tf expressed in Pichia pastoris becomes O-linked glycosylated with a single hexose at S32, which also may be mutated or modified to prevent such glycosylation. Moreover, O-linked glycosylation may be reduced or eliminated in a yeast host cell with mutations in one or more of the PMT genes.

[00106] Accordingly, in one embodiment of the invention, the fusion protein includes a modified transferrin molecule wherein the transferrin exhibits reduced glycosylation, including but not limited to asialo-, monosialo- and disialo- forms of Tf. In another embodiment, the transferrin portion of the fusion protein includes a recombinant transferrin mutant that is mutated to prevent glycosylation. In another embodiment, the transferrin portion of the fusion protein includes a recombinant transferrin mutant that is fully glycosylated. In a further embodiment, the transferrin portion of the fusion protein includes a recombinant human serum transferrin mutant that is mutated to prevent glycosylation, wherein at least one of Asn413 and Asn611 of SEQ ID NO: 3 is mutated to an amino acid which does not allow glycosylation. In another embodiment, the transferrin portion of the fusion protein includes a recombinant human serum transferrin mutant that is mutated to prevent or substantially reduce glycosylation, wherein mutations may be made to the residues within the N-X-S/T glycosylation site. Moreover, glycosylation may be reduced or prevented by mutating the serine or threonine residue. Further, changing the X to proline is known to inhibit glycosylation.
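The N-X-S/T rule described above lends itself to a simple sequence scan. The Python sketch below is a minimal illustration: it locates candidate N-linked sites (Asn, then any residue other than Pro, then Ser or Thr) and shows how each of the three substitutions mentioned in the text removes a site from the scan. The toy peptide and the helper names are hypothetical, and a match indicates only that the motif is present, not that the site is glycosylated in a given host.

```python
import re

# Asn, any residue except Pro, then Ser or Thr. The lookahead lets
# overlapping motifs be reported as separate sites.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of the Asn in each N-X-S/T motif (X not Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


def substitute(sequence, position, residue):
    """Return a copy of sequence with the 1-based position replaced by residue."""
    return sequence[:position - 1] + residue + sequence[position:]


# Toy peptide (not a transferrin sequence): N-V-S at 3, N-P-T at 8, N-G-T at 13.
toy = "MKNVSAANPTGGNGTK"

sites = find_sequons(toy)                                 # [3, 13]; N-P-T is skipped
asn_mutated = find_sequons(substitute(toy, 3, "Q"))       # [13]
ser_mutated = find_sequons(substitute(toy, 5, "A"))       # [13]
x_to_proline = find_sequons(substitute(toy, 14, "P"))     # [3]
print(sites, asn_mutated, ser_mutated, x_to_proline)
```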

[00107] As discussed below in more detail, modified Tf fusion proteins, comprising a modified Tf, of the invention may also be engineered to not bind iron and/or not bind the Tf receptor. In other embodiments of the invention, iron binding is retained, and the iron binding ability of Tf may be used to deliver a therapeutic protein or peptide(s) to the inside of a cell and/or across the blood brain barrier (BBB). The N domain alone will not bind to TfR when loaded with iron, and the iron bound C domain will bind TfR but not with the same affinity as the whole molecule.

[00108] In another embodiment, the transferrin portion of the transferrin fusion protein includes a recombinant transferrin mutant having a mutation wherein the mutant does not retain the ability to bind metal ions. In an alternate embodiment, the transferrin portion of the transferrin fusion protein includes a recombinant transferrin mutant having a mutation wherein the mutant has a weaker binding affinity for metal ions than wild-type serum transferrin. In an alternate embodiment, the transferrin portion of the transferrin fusion protein includes a recombinant transferrin mutant having a mutation wherein the mutant has a stronger binding affinity for metal ions than wild-type serum transferrin.

[00109] In another embodiment, the transferrin portion includes a recombinant transferrin mutant having a mutation wherein the mutant does not retain the ability to bind to the transferrin receptor. In an alternate embodiment, the transferrin portion includes a recombinant transferrin mutant having a mutation wherein the mutant has a weaker binding affinity for the transferrin receptor than wild-type serum transferrin. In an alternate embodiment, the transferrin portion includes a recombinant transferrin mutant having a mutation wherein the mutant has a stronger binding affinity for the transferrin receptor than wild-type serum transferrin.

[00110] In another embodiment, the transferrin portion includes a recombinant transferrin mutant having a mutation wherein the mutant does not retain the ability to bind to carbonate ions. In an alternate embodiment, the transferrin portion includes a recombinant transferrin mutant having a mutation wherein the mutant has a weaker binding affinity for carbonate ions than wild-type serum transferrin. In an alternate embodiment, the transferrin portion includes a recombinant transferrin mutant having a mutation wherein the mutant has a stronger binding affinity for carbonate ions than wild-type serum transferrin.

[00111] In another embodiment, the transferrin portion includes a recombinant human serum transferrin mutant having a mutation in at least one amino acid residue selected from the group consisting of Asp63, Gly65, Tyr95, Tyr188, His249, Asp392, Tyr426, Tyr514, Tyr517 and His585 of SEQ ID NO: 3, wherein the mutant retains the ability to bind metal ions. In an alternate embodiment, a recombinant human serum transferrin mutant having a mutation in at least one amino acid residue selected from the group consisting of Asp63, Gly65, Tyr95, Tyr188, His249, Asp392, Tyr426, Tyr514, Tyr517 and His585 of SEQ ID NO: 3, wherein the mutant has a reduced ability to bind metal ions. In another embodiment, a recombinant human serum transferrin mutant having a mutation in at least one amino acid residue selected from the group consisting of Asp63, Gly65, Tyr95, Tyr188, His249, Asp392, Tyr426, Tyr517 and His585 of SEQ ID NO: 3, wherein the mutant does not retain the ability to bind metal ions.

[00112] In another embodiment, the transferrin portion includes a recombinant human serum transferrin mutant having a mutation at Lys206 or His207 of SEQ ID NO: 3, wherein the mutant has a stronger binding avidity for metal ions than wild-type human serum transferrin (see U.S. Patent 5,986,067, which is herein incorporated by reference in its entirety). In an alternate embodiment, the transferrin portion includes a recombinant human serum transferrin mutant having a mutation at Lys206 or His207 of SEQ ID NO: 3, wherein the mutant has a weaker binding avidity for metal ions than wild-type human serum transferrin. In a further embodiment, the transferrin portion includes a recombinant human serum transferrin mutant having a mutation at Lys206 or His207 of SEQ ID NO:3, wherein the mutant does not bind metal ions.

[00113] Any available technique may be used to produce the fusion protein of the invention, including but not limited to molecular techniques commonly available, for instance, those disclosed in Sambrook et al. Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, 1989. When carrying out nucleotide substitutions using techniques for accomplishing site-specific mutagenesis that are well known in the art, the encoded amino acid changes are preferably of a minor nature, that is, conservative amino acid substitutions, although other, non-conservative, substitutions are contemplated as well, particularly when producing a modified transferrin portion, e.g., a modified fusion protein exhibiting reduced glycosylation, reduced iron binding and the like. Specifically contemplated are amino acid substitutions, small deletions or insertions, typically of one to about 30 amino acids; insertions between transferrin domains; small amino- or carboxyl-terminal extensions, such as an amino-terminal methionine residue, or small linker peptides of less than 50, 40, 30, 20 or 10 residues between transferrin domains or linking a transferrin protein and therapeutic protein or peptide, ligand, or an antibody variable region or stalk region; or a small extension that facilitates purification, such as a poly-histidine tract, an antigenic epitope or a binding domain.

[00114] Examples of conservative amino acid substitutions are substitutions made within the same group such as within the group of basic amino acids (such as arginine, lysine, histidine), acidic amino acids (such as glutamic acid and aspartic acid), polar amino acids (such as glutamine and asparagine), hydrophobic amino acids (such as leucine, isoleucine, valine), aromatic amino acids (such as phenylalanine, tryptophan, tyrosine) and small amino acids (such as glycine, alanine, serine, threonine, methionine).
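
The grouping above lends itself to a simple lookup. The following Python sketch is offered only as an illustration and is not part of the disclosure. The names `CONSERVATIVE_GROUPS`, `group_of` and `is_conservative` are hypothetical, and the groups contain only the example residues named in the preceding paragraph, so residues not listed there (for example cysteine and proline) are treated as belonging to no group.

```python
# Groups transcribed from the paragraph above (one-letter codes).
CONSERVATIVE_GROUPS = {
    "basic": set("RKH"),
    "acidic": set("ED"),
    "polar": set("QN"),
    "hydrophobic": set("LIV"),
    "aromatic": set("FWY"),
    "small": set("GASTM"),
}


def group_of(residue):
    """Return the name of the group containing the residue, or None."""
    for name, members in CONSERVATIVE_GROUPS.items():
        if residue.upper() in members:
            return name
    return None


def is_conservative(original, replacement):
    """True when both residues fall in the same listed group."""
    group = group_of(original)
    return group is not None and group == group_of(replacement)


print(is_conservative("K", "R"))  # basic to basic: True
print(is_conservative("N", "L"))  # polar to hydrophobic: False
```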

[00115] Non-conservative substitutions encompass substitutions of amino acids in one group by amino acids in another group. For example, a non-conservative substitution would include the substitution of a polar amino acid for a hydrophobic amino acid. For a general description of nucleotide substitution, see, e.g., Ford et al. (1991), Prot. Exp. Pur. 2: 95-107. Non-conservative substitutions, deletions and insertions are particularly useful to produce Tf fusion proteins, preferably trans-bodies, of the invention that exhibit no or reduced binding of iron and/or no or reduced binding of the fusion protein to the Tf receptor.

[00116] In the polypeptide and proteins of the invention, the following system is followed for designating amino acids in accordance with the following conventional list:

[00117] TABLE OF AMINO ACIDS

[00118] Iron binding and/or receptor binding may be reduced or disrupted by mutation, including deletion, substitution or insertion into, amino acid residues corresponding to one or more of Tf N domain residues Asp63, Tyr95, Tyr188, His249 and/or C domain residues Asp392, Tyr426, Tyr514 and/or His585 of SEQ ID NO: 3. Iron binding may also be affected by mutation to amino acids Lys206, His207 or Arg632 of SEQ ID NO: 3. Carbonate binding may be reduced or disrupted by mutation, including deletion, substitution or insertion into, amino acid residues corresponding to one or more of Tf N domain residues Thr120, Arg124, Ala126, Gly127 and/or C domain residues Thr452, Arg456, Ala458 and/or Gly459 of SEQ ID NO: 3. A reduction or disruption of carbonate binding may adversely affect iron and/or receptor binding.

[00119] Binding to the Tf receptor may be reduced or disrupted by mutation, including deletion, substitution or insertion into, amino acid residues corresponding to one or more of Tf N domain residues described above for iron binding.

[00120] As discussed above, glycosylation may be reduced or prevented by mutation, including deletion, substitution or insertion into, amino acid residues corresponding to one or more of Tf C domain residues within the N-X-S/T sites corresponding to C domain residues N413 and/or N611. See U.S. Patent No. 5,986,067. For instance, the N413 and/or N611 may be mutated to Glu residues as may be the adjacent amino acids.
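
A residue-specific substitution of this kind can be sketched in Python as follows. This is an illustration under stated assumptions rather than a disclosed procedure: the helper name `substitute` is hypothetical, residue numbering is taken to be 1-based, and the toy sequence stands in for SEQ ID NO: 3, which is not reproduced here. The helper checks the wild-type residue before replacing it, so that a numbering error raises an exception instead of silently producing the wrong mutant.

```python
def substitute(seq, position, expected, replacement):
    """Replace the residue at a 1-based position after checking the wild-type residue."""
    index = position - 1
    if not 0 <= index < len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    if seq[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {seq[index]}"
        )
    return seq[:index] + replacement + seq[index + 1:]


# Hypothetical toy sequence with Asn at positions 3 and 8; not SEQ ID NO: 3.
toy = "AANKSGTNVT"
mutant = substitute(substitute(toy, 3, "N", "E"), 8, "N", "E")
print(mutant)  # prints AAEKSGTEVT
```

With the actual Tf sequence, the analogous calls would target positions 413 and 611 with Asn as the expected residue and Glu as the replacement.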

[00121] In instances where the Tf fusion proteins of the invention are not modified to prevent glycosylation, iron binding, carbonate binding and/or receptor binding, glycosylation, iron and/or carbonate ions may be stripped from or cleaved off of the fusion protein. For instance, available deglycosylases may be used to cleave glycosylation residues from the fusion protein, in particular the sugar residues attached to the Tf portion, yeast deficient in glycosylation enzymes may be used to prevent glycosylation and/or recombinant cells may be grown in the presence of an agent that prevents glycosylation, e.g., tunicamycin.

[00122] The carbohydrates on the fusion protein may also be reduced or completely removed enzymatically by treating the fusion protein with deglycosylases. Deglycosylases are well known in the art. Examples of deglycosylases include, but are not limited to, galactosidase, PNGase A, PNGase F, glucosidase, mannosidase, fucosidase, and Endo H deglycosylase.

[00123] Additional mutations may be made to Tf to alter the three dimensional structure of Tf, such as modifications to the hinge region to prevent the conformational change needed for iron binding and Tf receptor recognition. For instance, mutations may be made in or around N domain amino acid residues 94-96, 245-247 and/or 316-318 as well as C domain amino acid residues 425-427, 581-582 and/or 652-658. In addition, mutations may be made in or around the flanking regions of these sites to alter Tf structure and function.

[00124] In one aspect of the invention, the fusion protein can function as a carrier protein to extend the half life or bioavailability of the ligand as well as, in some instances, delivering the ligand inside cells, and retains the ability to cross the blood brain barrier. In an alternate embodiment, the fusion protein includes a modified transferrin molecule wherein the transferrin does not retain the ability to cross the blood brain barrier.

[00125] In another embodiment, the fusion protein includes a modified transferrin molecule wherein the transferrin molecule retains the ability to bind to the transferrin receptor and transport the antibody variable region inside cells. In an alternate embodiment, the fusion protein includes a modified transferrin molecule wherein the transferrin molecule does not retain the ability to bind to the transferrin receptor and transport the antibody variable region inside cells.

[00126] In further embodiments, the fusion protein includes a modified transferrin molecule wherein the transferrin molecule retains the ability to bind to the transferrin receptor and transport the antibody variable region inside cells, but does not retain the ability to cross the blood brain barrier. In an alternate embodiment, the fusion protein includes a modified transferrin molecule wherein the transferrin molecule retains the ability to cross the blood brain barrier, but does not retain the ability to bind to the transferrin receptor and transport the antibody variable region inside cells.

[00127] Transferrin Fusion Proteins

[00128] The fusion proteins of the invention may contain one or more copies of the ligand, antibody variable region or random peptide attached to the N-terminus and/or the C-terminus of the Tf protein. In one embodiment, the ligand moiety is attached to the N-terminus of the Tf protein. In some embodiments, the ligand, variable region or peptide is attached to both the N- and C-terminus of the Tf protein and the fusion protein may contain one or more equivalents of these regions on either or both ends of Tf.

[00129] In other embodiments, the one or more ligands are inserted into the transferrin peptide, for instance at known domains of the Tf protein such as into one or more of the loops of Tf. See Ali et al. (1999) J. Biol. Chem. 274(34):24066-24073.

[00130] In one embodiment of the invention, the ligand is inserted in the N lobe of transferrin. For instance, the invention also includes one or more insertions made at or around other positions in the N1 and N2 domains of the N-lobe, as shown in the table below.

N1          N2

Asp33       Ser105
Asn55       Glu141
Asn75       Asp166
Asp90       Gln184
Gly257      Asp197
Lys280      Lys217
His289      Thr231
Ser298      Cys241

[00131] Generally, the transferrin fusion protein of the invention may have one modified transferrin-derived region and one antibody variable region. Multiple regions of each protein, however, may be used to make a transferrin fusion protein of the invention. Similarly, more than one antibody variable region may be used to make a transferrin fusion protein of the invention, thereby producing a multi-functional modified Tf fusion protein.

[00132] In one embodiment, the fusion protein of the invention contains an antibody variable region or portion thereof fused to a transferrin molecule or portion thereof. In another embodiment, the fusion protein of the invention contains an antibody variable region fused to the N terminus of a transferrin molecule. In an alternate embodiment, the fusion protein of the invention contains an antibody variable region fused to the C terminus of a transferrin molecule. In a further embodiment, the fusion protein of the invention contains a transferrin molecule fused to the N terminus of an antibody variable region. In an alternate embodiment, the fusion protein of the invention contains a transferrin molecule fused to the C terminus of an antibody variable region.

[00133] The present invention also provides a fusion protein containing an antibody variable region or portion thereof fused to a modified transferrin molecule or portion thereof.

[00134] In other embodiments, the fusion protein of the invention contains an antibody variable region fused to both the N-terminus and the C-terminus of modified transferrin. In another embodiment, the antibody variable regions fused at the N- and C-termini bind the same antigens. Also, the antibody variable regions that bind the same antigen may be derived from different antibodies, and thus, bind different epitopes on the same target. In an alternate embodiment, the antibody variable regions fused at the N- and C-termini bind different antigens. In another alternate embodiment, the antibody variable regions fused to the N- and C-termini bind different antigens which may be useful for activating two different cells for the treatment or prevention of disease, disorder, or condition. In another embodiment, the antibody variable regions fused at the N- and C-termini bind different antigens which may be useful for bridging two different antigens for the treatment or prevention of diseases or disorders which are known in the art to commonly occur in patients simultaneously.

[00135] Additionally, a transferrin fusion protein of the invention may also be produced by inserting the antibody variable region of interest, e.g., a single chain antibody that binds a therapeutic protein or a fragment or variant thereof, into an internal region of the modified transferrin. Internal regions of modified transferrin include, but are not limited to, the loop regions, the iron binding sites, the hinge regions, the bicarbonate binding sites or the receptor binding domain.

[00136] Within the protein sequence of the modified transferrin molecule a number of loops or turns exist, which are stabilized by disulfide bonds. These loops are useful for the insertion, or internal fusion, of therapeutically active peptides, preferably antibody variable regions, particularly those requiring a secondary structure to be functional, or therapeutic proteins, preferably antibody variable region, to generate a modified transferrin molecule with specific biological activity.

[00137] When ligands such as antibody variable regions, preferably CDRs, are inserted into or replace at least one loop of a Tf molecule, insertions may be made within any of the surface exposed loop regions, in addition to other areas of Tf. For instance, insertions may be made within the loops comprising Tf amino acids 32-33, 74-75, 256-257, 279-280 and 288-289. See Ali et al., supra. As previously described, insertions may also be made within other regions of Tf such as the sites for iron and bicarbonate binding, hinge regions, and the receptor binding domain as described in more detail below. The loops in the Tf protein sequence that are amenable to modification/replacement for the insertion of proteins or peptides may also be used for the development of a screenable library of random peptide inserts. Any procedures may be used to produce nucleic acid inserts for the generation of peptide libraries, including available phage and bacterial display systems, prior to cloning into a Tf domain and/or fusion to the ends of Tf.
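
At the sequence level, inserting a peptide into a loop amounts to splicing it between two adjacent residues. The Python sketch below illustrates this for a small set of inserts and is not a disclosed protocol. The function names `insert_between` and `build_insert_library`, the scaffold and the insert peptides are all hypothetical, and residue numbering is assumed to be 1-based.

```python
def insert_between(seq, left_position, peptide):
    """Insert a peptide between residues left_position and left_position + 1 (1-based)."""
    if not 1 <= left_position < len(seq):
        raise ValueError("left_position must have a residue on each side")
    return seq[:left_position] + peptide + seq[left_position:]


def build_insert_library(seq, left_position, peptides):
    """Return one variant sequence per peptide, all inserted at the same site."""
    return [insert_between(seq, left_position, p) for p in peptides]


# Hypothetical scaffold and inserts. A real construct would use the Tf sequence
# and a loop site such as the one between residues 32 and 33.
scaffold = "ACDEFGHIKL"
library = build_insert_library(scaffold, 4, ["WWW", "YY"])
print(library)  # prints ['ACDEWWWFGHIKL', 'ACDEYYFGHIKL']
```

In practice the library would be constructed at the nucleic acid level, as the paragraph above describes; the protein-level view is used here only to make the numbering of the insertion site explicit.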

[00138] The N-terminus of Tf is free and points away from the body of the fusion protein. Fusion of a ligand or ligands to the N-terminus of transferrin is one embodiment of the invention. Such fusions may include a linker region, such as but not limited to a poly-glycine stretch or a PEAPTD linker (SEQ ID NO.: 18) to separate the ligand from Tf.

[00139] The C-terminus of Tf appears to be buried or partially buried and secured by a disulfide bond 6 amino acids from the C-terminus. In human Tf, the C-terminal amino acid is a proline which, depending on the way that it is orientated, will either point a fusion protein away from or into the body of the molecule. A linker or spacer moiety at the C-terminus may be used in some embodiments of the invention. There is also a proline near the N-terminus. In one aspect of the invention, the proline at the N- and/or the C-termini may be modified or substituted with another amino acid. In another aspect of the invention, the C-terminal disulfide bond may be eliminated to untether the C-terminus.

[00140] Stalk Moiety

[00141] The stalk moiety of the invention is fused at its N-terminus to a transferrin moiety or ligand and may optionally be fused with an anchor moiety at its C-terminus. When expressed in a yeast cell, the C-terminus of the stalk moiety is located within the cell, for instance, within the cell wall. In one embodiment of the invention, the stalk moiety acts as a cell wall linking member to covalently or non-covalently bind the fusion protein to the cell wall of a yeast cell.

[00142] The stalk moiety of the present invention has a rod-like or brush-like conformation. This type of conformation is typical of a moderately to heavily glycosylated peptide. The stalk moiety of the invention contains N-glycans or O-glycans. See U.S. 6,114,147 which is herein incorporated by reference in its entirety. The presence of O-glycans is preferred over N-glycans because O-glycans allow the stalk moiety to take on more of an extended, rod-like conformation as compared to N-glycans. The stalk moiety may also contain moderate to heavy glycosylation of serine and threonine glycosylation sites.

[00143] The stalk moiety of the fusion protein of the invention contains a moderate to high percentage of serine or threonine residues. For instance, the invention includes a stalk moiety with at least about 5% or more serine and/or threonine residues, at least about 10% or more serine and/or threonine residues, at least about 20% or more serine and/or threonine residues, at least about 30% or more serine and/or threonine residues, at least about 40% or more serine and/or threonine residues, at least about 50% or more serine and/or threonine residues, at least about 60% or more serine and/or threonine residues, at least about 70% or more serine and/or threonine residues, at least about 80% or more serine and/or threonine residues, or at least about 90% or more serine and/or threonine residues. In one embodiment of the invention, the stalk moiety contains about 20-30% serine and/or threonine residues, about 20-40% serine and/or threonine residues, about 30-40% serine and/or threonine residues, about 20-50% serine and/or threonine residues, about 30-50% serine and/or threonine residues, about 20-60% serine and/or threonine residues or about 30-60% serine and/or threonine residues.
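
The serine and/or threonine content of a candidate stalk sequence can be computed directly from its amino acid sequence. The following Python sketch is illustrative only. The function name `ser_thr_percent` and the 20-residue fragment are hypothetical and are not taken from any sequence of the disclosure, and the percentage is assumed to be by residue count rather than by weight.

```python
def ser_thr_percent(seq):
    """Percentage of residues in seq that are serine (S) or threonine (T)."""
    if not seq:
        raise ValueError("sequence is empty")
    seq = seq.upper()
    count = sum(1 for residue in seq if residue in "ST")
    return 100.0 * count / len(seq)


# Hypothetical 20-residue fragment with 6 Ser/Thr residues.
fragment = "PASTAPGSTAPPAHGVTSAP"
print(round(ser_thr_percent(fragment), 1))  # prints 30.0
```

A value of 30.0 would place this hypothetical fragment within the about 20-40% and about 30-60% ranges recited above.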

[00144] The stalk moiety may contain at least about 5% or more N- or O-glycans by weight, at least about 10% or more N- or O-glycans by weight, at least about 20% or more N- or O-glycans by weight, at least about 30% or more N- or O-glycans by weight, at least about 40% or more N- or O-glycans by weight, at least about 50% or more N- or O-glycans by weight, at least about 60% or more N- or O-glycans by weight, at least about 70% or more N- or O-glycans by weight, at least about 80% or more N- or O-glycans by weight, or at least about 90% or more N- or O-glycans by weight. In one embodiment of the invention, the stalk moiety contains about 20-30% O-glycans by weight, about 20-40% O-glycans by weight, about 30-40% O-glycans by weight, about 20-50% O-glycans by weight, about 30-50% O-glycans by weight, about 20-60% O-glycans by weight or about 30-60% O-glycans. In another embodiment, the presence of glycans, in particular O-glycans, allows the stalk moiety to crosslink with beta glucans present in proteins of the cell wall. As such, the stalk moiety of the invention is capable of functioning as a cell wall linking member.

[00145] The stalk moiety can comprise a mucin protein or portion of a mucin protein, i.e. a member of the MUC-type proteins. MUC-type mucins are a family of structurally related molecules that are heavily glycosylated and are expressed in epithelia of the respiratory, gastrointestinal, and reproductive tracts, e.g., MUC1 (GenBank Accession No. AF125525), MUC2 (GenBank Accession No. L21998), MUC3 (GenBank Accession No. AF113616), MUC4 (GenBank Accession No. AJ000281), MUC5AC (GenBank Accession No. U83139), MUC5B (GenBank Accession No. AJ001402), MUC6 (GenBank Accession No. U97698), MUC7 (GenBank Accession No. L13283), MUC8 (GenBank Accession No. U14383), MUC9 (GenBank Accession No. AW271430). In one embodiment of the invention, the stalk moiety contains hMUC1 or a portion of the hMUC1 protein, for instance, SEQ ID NO.: 71 encoded by the nucleic acid of SEQ ID NO.: 70 as well as the polypeptide encoded by the nucleic acid of SEQ ID NO: 5. In another embodiment of the invention, the stalk moiety contains hMUC3 or a portion of the hMUC3 protein. For instance, the invention includes the hMUC3 stalk of SEQ ID NO.: 69 which is encoded by the nucleic acid of SEQ ID NO.: 68. The fusion protein of the invention also includes stalks comprising variants such as analogs and derivatives of mucin proteins and portions thereof.

[00146] The stalk moiety of the present invention can also be derived from glycosylated proteins other than mucin, including, but not limited to, AGA1 (for instance, SEQ ID NO.: 73, encoded by the nucleic acid sequence of SEQ ID NO.: 72), MAdCAM-1, GlyCAM-1, CD34; consensus repeats from E-selectin, P-selectin, or L-selectin; or viral glycoprotein spikes (such as influenza, herpes simplex, human immunodeficiency, or tobacco mosaic virus) and variants and fragments thereof. See WO 01/46698, Girard et al. (1995) Immunity 2:113-123, and Van Kinken et al. (1998) Anal. Biochem. 265:103-116, all of which are herein incorporated by reference in their entireties. The invention includes repeats of two or more glycosylated proteins or fragments thereof as well as combinations of two or more types of glycosylated proteins.

[00147] In another embodiment of the invention, the stalk is engineered to contain one or more free cysteine residues. The one or more free cysteine residues are capable of forming disulfide bonds with free cysteine residues of proteins in the cell wall of a yeast cell. The formation of one or more disulfide bonds within the cell wall represents another method that can be used to engineer a stalk moiety capable of functioning as a cell wall binding member.

[00148] The stalk moiety of the present invention must be of sufficient length to span the entire cell wall of a yeast cell. Preferably, the N-terminus of the stalk moiety is situated on the outside of the cell wall, most preferably, extended in a rod-like configuration away from the yeast cell to reduce steric hindrance between the transferrin moiety and ligand and the host yeast cell. The stalk moiety should be at least about 25 amino acids, at least about 50 amino acids, at least about 75 amino acids, at least about 100 amino acids, at least about 125 amino acids, at least about 150 amino acids, at least about 175 amino acids, at least about 200 amino acids, at least about 225 amino acids, at least about 250 amino acids, at least about 275 amino acids, at least about 300 amino acids, at least about 325 amino acids, at least about 350 amino acids, at least about 375 amino acids, at least about 400 amino acids, at least about 425 amino acids, at least about 450 amino acids, at least about 475 amino acids, at least about 500 amino acids, at least about 525 amino acids, at least about 550 amino acids, at least about 575 amino acids, at least about 600 amino acids, at least about 625 amino acids, or at least about 650 amino acids in length. In one embodiment, the stalk moiety is about 500 amino acids in length. In another embodiment, the stalk moiety is about 300 to 600 amino acids in length.

[00149] Anchor Moiety

[00150] The optional anchor moiety of the fusion protein of the present invention is a portion of the fusion protein that physically tethers the fusion protein to a host cell surface or substrate surface. For instance, an anchor moiety can tether or immobilize the fusion protein to a yeast cell membrane or a yeast cell wall. When the anchor tethers the fusion protein to a yeast cell wall it is a cell wall linking member.

[00151] The anchor moiety can transiently tether a fusion protein to a yeast cell wall or cell membrane. In one embodiment of the invention, the anchor moiety transiently tethers a fusion protein to a yeast cell wall or cell membrane which provides an opportunity for the stalk moiety to become covalently or non-covalently bound to the cell wall. For instance, the transient tethering of an anchor in a yeast cell may allow O-glycans from a stalk moiety to crosslink with beta glucans of the cell wall.

[00152] In one embodiment of the present invention, the anchor moiety sticks into cell membranes or walls of microorganisms, preferably lower eukaryotes, e.g., yeasts and molds. The moiety may have a long C terminus which anchors it in the cell membrane or cell wall with amino acids such as proline (Kok (1990) FEMS Microbiology Reviews 87: 15-42).

[00153] An anchor moiety can be anchored to a cell by use of a glycosyl phosphatidylinositol (GPI) anchor. See Conzelmann et al. EMBO 9: 653-661 and Lipke and Ovalle (1998) J. Bacteriol. 180: 3735-3740. A GPI signal sequence peptide, such as the GPI signal peptides disclosed herein, signals for attachment of GPI to the C terminus of the fusion protein. The GPI signal itself has three domains: the region containing the GPI attachment site (the ω site) plus the first and second amino acids downstream of the ω site, a spacer of 5 to 10 amino acids, and a hydrophobic stretch of 10 to 15 amino acids. A protein containing the GPI signal is cleaved at the ω site, and the resulting carboxy terminus of the protein is covalently bound to the GPI moiety. This reaction occurs in the endoplasmic reticulum. Being associated with membranes by means of the GPI moiety, GPI-attached proteins are then transported to the cell surface and remain on the plasma membrane as GPI-anchored proteins if the proteins contain basic residues (R and/or K) in the short ω-minus region. GPI-associated proteins with V, I, or L at the ω-4/-5 site and Y or N at the ω-2 site are incorporated in the cell membrane. See Hamada et al. (1999) J. Bacteriol. 181: 3886-3889; Nuoffer et al. (1993) J. Biol. Chem. 268: 10558-10563; De Nobel et al. (1994) Trends Cell Biol. 4: 42-45; Hamada et al. (1998) Mol. Gen. Genet. 258: 53-59; and Van Der Vaart et al. (1998) Biotechnol. Genet. Eng. Rev. 15: 387-411.

[00154] In one embodiment of the invention, yeast GPI YIR019C is used to provide the anchor moiety of the transferrin fusion protein. Figure 2 provides a diagram of the GPI YIR019C. The ω site in the amino acid sequence (SEQ ID NO: 15) is glycine and is illustrated as having a space on either side of it. The spaces are indicative of spacer regions on either side of the ω site. The I and Y amino acids in bold-faced print are the ω-5/-4 and ω-2 sites, respectively.

[00155] Several Saccharomyces anchor moieties are known in the art and can be used to construct the fusion proteins of the present invention. Other examples of yeast GPI signal proteins include, but are not limited to, YDR534C, YNL327W, YOR214C, YDR134C, YPL130, YOR009W, YER150W, YDR077W, YOR383C, YJR151C, YJR004, YJL078C, YLR110C, and YNL300W. Further, GPI signal proteins can be used from other organisms such as the GPI of EPA1 of Candida glabrata, Hwp1p of Candida albicans, or VSG of Trypanosoma brucei.

[00156] In one embodiment of the invention, the anchor moiety is a mammalian moiety or derivative or fragment thereof. In another embodiment of the invention, a GPI signal peptide is a mammalian GPI signal protein. For instance, the present invention includes derivatives of human MDP GPI signal protein such as those disclosed in Table 1 (see Example 5).

[00157] The invention also includes a fusion protein comprising an anchor moiety with one or more unbound cysteine residues. The cysteine residues can act to tether the fusion protein to the cell by forming disulfide bonds with cysteine residues of proteins in the cell wall.

[00158] The invention includes fusion proteins comprising a transmembrane domain (TMD) as an anchor moiety. In one embodiment of the invention, the TMD is a region of a single pass type I or type II membrane protein. For instance, the invention includes, but is not limited to, residues 70-98 of FUS1.

[00159] In another embodiment of the invention, the TMD comprises one or more of the several transmembrane regions of a multispan membrane protein. In one embodiment of the invention, the TMD is a hydrophobic region of a multispan membrane protein comprising about 10 to 60 amino acids, about 15 to 60 amino acids, about 20 to 60 amino acids, about 30 to 60 amino acids or about 25 to 50 amino acids. For instance, the invention includes, but is not limited to, one or more TMDs from STE6 of Saccharomyces from the group consisting of residues 25-30, 73-100, 171-198, 249-277, 714-742, 761-789, 838-858, 864-884, 940-967 and 979-1000 (Saccharomyces Genome Database annotation).

[00160] In another embodiment, the anchor moiety is used to tether the transferrin fusion protein to a solid substrate such as a microarray. The anchor moiety is preferably a short epitope tag (i.e. a sequence recognized by an antibody, typically a monoclonal antibody) such as polyhistidine, SEAP, or M1 and M2 flag. See Bush et al. (1991) J. Biol. Chem. 266: 13811-13814, Berger et al. (1988) Gene 66: 1-10, U.S. Patent 5,011,912, U.S. Patent 4,851,341, U.S. Patent 4,703,004, and U.S. Patent 4,782,137, all of which are incorporated by reference in their entirety. In one embodiment, the stalk domain is tethered to a substrate by an anti-stalk sequence antibody such as an anti-mucin antibody.

[00161] Albumin

[00162] The invention also includes a fusion protein which employs a protein or protein fragment other than transferrin to "present" a ligand to a target. Suitable proteins are ones which are soluble and at least about 50 amino acids in length or longer. In one embodiment of the invention, the protein or protein fragment contains a secondary structure similar to that of transferrin.

[00163] It is preferable that the protein or fragment thereof be capable of increasing the half-life of the ligand when cleaved from the stalk portion of the fusion protein and used as a therapeutic. For instance, the present invention envisions the use of a fusion protein containing an albumin moiety, a stalk moiety and a cell wall linking member. The albumin moiety is capable of conferring increased serum half-life to the ligand, i.e., therapeutic, when the albumin and ligand portion of the fusion protein is cleaved from the remainder of the fusion protein and administered to a patient in need of the ligand as a therapeutic.

[00164] A fusion protein containing an albumin moiety may contain an albumin protein, an albumin variant or a fragment thereof. In one embodiment, the albumin protein comprises the amino acid sequence of SEQ ID NO.: 61 which is encoded by the nucleic acid sequence of SEQ ID NO.: 66. The invention includes modifications of albumin that are known in the art.

[00165] Nucleic Acids

[00166] Nucleic acid molecules are also provided by the present invention. These encode a modified Tf fusion protein comprising a transferrin protein or a portion of a transferrin protein covalently linked or joined to a ligand moiety. The fusion protein may further comprise a linker region, for instance a linker less than about 50, 40, 30, 20, or 10 amino acid residues. The linker can be covalently linked to and between the transferrin protein or portion thereof and the ligand portion. Nucleic acid molecules of the invention may be purified or not.

[00167] Host cells and vectors for replicating the nucleic acid molecules and for expressing the encoded fusion proteins are also provided. Any vectors or host cells may be used, whether prokaryotic or eukaryotic, but eukaryotic expression systems, in particular yeast expression systems, may be preferred. Many vectors and host cells are known in the art for such purposes. It is well within the skill of the art to select an appropriate set for the desired application.

[00168] DNA sequences encoding transferrin, portions of transferrin and therapeutic proteins of interest may be cloned from a variety of genomic or cDNA libraries known in the art. The techniques for isolating such DNA sequences using probe-based methods are conventional and are well known to those skilled in the art. Probes for isolating such DNA sequences may be based on published DNA or protein sequences (see, for example, Baldwin, G.S. (1993) Comparison of Transferrin Sequences from Different Species. Comp. Biochem. Physiol. 106B/1:203-218 and all references cited therein, which are hereby incorporated by reference in their entirety). Alternatively, the polymerase chain reaction (PCR) method disclosed by Mullis et al. (U.S. Pat. No. 4,683,195) and Mullis (U.S. Pat. No. 4,683,202), incorporated herein by reference, may be used. The choice of library and selection of probes for the isolation of such DNA sequences is within the level of ordinary skill in the art.

[00169] As known in the art, "similarity" between two polynucleotides or polypeptides is determined by comparing the nucleotide or amino acid sequence and its conserved nucleotide or amino acid substitutes of one polynucleotide or polypeptide to the sequence of a second polynucleotide or polypeptide. Also known in the art is "identity", which means the degree of sequence relatedness between two polypeptide or two polynucleotide sequences as determined by the identity of the match between two strings of such sequences. Both identity and similarity can be readily calculated (Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991).

[00170] While there exist a number of methods to measure identity and similarity between two polynucleotide or polypeptide sequences, the terms "identity" and "similarity" are well known to skilled artisans (Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991; and Carillo, H., and Lipman, D., SIAM J. Applied Math., 48: 1073 (1988)). Methods commonly employed to determine identity or similarity between two sequences include, but are not limited to, those disclosed in Guide to Huge Computers, Martin J. Bishop, ed., Academic Press, San Diego, 1994, and Carillo, H., and Lipman, D., SIAM J. Applied Math. 48:1073 (1988).

[00171] Preferred methods to determine identity are designed to give the largest match between the two sequences tested. Methods to determine identity and similarity are codified in computer programs. Preferred computer program methods to determine identity and similarity between two sequences include, but are not limited to, GCG program package (Devereux et al., Nucl. Acid Res. 12(1):387 (1984)), BLASTP, BLASTN, FASTA (Altschul et al., J. Mol. Biol. 215:403 (1990)). The degree of similarity or identity referred to above is determined as the degree of identity between the two sequences, often indicating a derivation of the first sequence from the second. The degree of identity between two nucleic acid sequences may be determined by means of computer programs known in the art such as GAP provided in the GCG program package (Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)). For purposes of determining the degree of identity between two nucleic acid sequences for the present invention, GAP is used with the following settings: GAP creation penalty of 5.0 and GAP extension penalty of 0.3.
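By way of illustration only, the effect of the GAP settings given above can be sketched in Python. The sketch scores an alignment that has already been produced; it does not perform the Needleman and Wunsch alignment itself. The function name alignment_stats, the match and mismatch values, the convention that a gap of length n costs the creation penalty plus n times the extension penalty, and the choice to compute identity over ungapped columns are all assumptions made for the example and are not taken from the GCG package.

```python
def alignment_stats(aln_a, aln_b, gap_open=5.0, gap_extend=0.3,
                    match=1.0, mismatch=0.0):
    """Return (identity, score) for two pre-aligned sequences.

    aln_a and aln_b are equal-length strings in which '-' marks a gap.
    gap_open and gap_extend default to the penalties stated in the text;
    match and mismatch are illustrative values only.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")

    identical = 0      # columns where both residues match
    compared = 0       # columns with a residue in both sequences
    score = 0.0
    open_side = None   # which sequence currently carries an open gap

    for x, y in zip(aln_a, aln_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot be a gap in both sequences")
        if x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            if open_side != side:
                score -= gap_open      # charged once per gap
            score -= gap_extend        # charged for every gap position
            open_side = side
        else:
            open_side = None
            compared += 1
            if x == y:
                identical += 1
                score += match
            else:
                score += mismatch

    identity = identical / compared if compared else 0.0
    return identity, score


identity, score = alignment_stats("ACGT--ACGT", "ACGTTTACGA")
print(f"identity = {identity:.3f}, score = {score:.1f}")
```

In this example the single two-residue gap is charged 5.0 + 2 x 0.3 = 5.6, so one long gap is penalized far less than two separate short gaps would be.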

[00172] Codon Optimization

[00173] The degeneracy of the genetic code permits variations of the nucleotide sequence of a transferrin protein and/or therapeutic protein of interest, while still producing a polypeptide having the identical amino acid sequence as the polypeptide encoded by the native DNA sequence. The procedure, known as "codon optimization" (described in U.S. Patent 5,547,871, which is incorporated herein by reference in its entirety), provides one with a means of designing such an altered DNA sequence. The design of codon optimized genes should take into account a variety of factors, including the frequency of codon usage in an organism, nearest neighbor frequencies, RNA stability, the potential for secondary structure formation, the route of synthesis and the intended future DNA manipulations of that gene. In particular, available methods may be used to replace the codons encoding a given fusion protein with those most readily recognized by yeast when yeast expression systems are used.

[00174] The degeneracy of the genetic code permits the same amino acid sequence to be encoded and translated in many different ways. For example, leucine, serine and arginine are each encoded by six different codons, while valine, proline, threonine, alanine and glycine are each encoded by four different codons. However, the frequency of use of such synonymous codons varies from genome to genome among eukaryotes and prokaryotes. For example, synonymous codon-choice patterns among mammals are very similar, while evolutionarily distant organisms such as yeast (S. cerevisiae), bacteria (such as E. coli) and insects (such as D. melanogaster) reveal a clearly different pattern of genomic codon use frequencies (Grantham, R., et al., Nucl. Acid Res., 8, 49-62 (1980); Grantham, R., et al., Nucl. Acid Res., 9, 43-74 (1981); Maruyama, T., et al., Nucl. Acid Res., 14, 151-197 (1986); Aota, S., et al., Nucl. Acid Res., 16, 315-402 (1988); Wada, K., et al., Nucl. Acid Res., 19 Supp., 1981-1985 (1991); Kurland, C. G., FEBS Lett., 285, 165-169 (1991)). These differences in codon-choice patterns appear to contribute to the overall expression levels of individual genes by modulating peptide elongation rates (Kurland, C. G., FEBS Lett., 285, 165-169 (1991); Pedersen, S., EMBO J., 3, 2895-2898 (1984); Sorensen, M. A., J. Mol. Biol., 207, 365-377 (1989); Randall, L. L., et al., Eur. J. Biochem., 107, 375-379 (1980); Curran, J. F., and Yarus, M., J. Mol. Biol., 209, 65-77 (1989); Varenne, S., et al., J. Mol. Biol., 180, 549-576 (1984); Garel, J.-P., J. Theor. Biol., 43, 211-225 (1974); Ikemura, T., J. Mol. Biol., 146, 1-21 (1981); Ikemura, T., J. Mol. Biol., 151, 389-409 (1981)).
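The codon counts recited above can be checked against the standard genetic code. The following Python sketch is offered only as an illustration; the variable names are chosen for the example, and the table is the standard nuclear code written in the conventional T, C, A, G base order.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons for each amino acid (and for stop).
DEGENERACY = Counter(CODON_TABLE.values())

for aa in "LSRVPTAG":
    print(aa, DEGENERACY[aa])
```

Leucine (L), serine (S) and arginine (R) each report six codons, and valine (V), proline (P), threonine (T), alanine (A) and glycine (G) each report four, in agreement with the paragraph above.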

[00175] Codon usage frequencies for a synthetic gene should reflect the codon usages of nuclear genes derived from the exact (or as closely related as possible) genome of the cell/organism that is intended to be used for recombinant protein expression, particularly that of yeast species. As discussed above, in one embodiment the human Tf sequence is codon optimized for yeast expression, before or after modification as herein described, as may be the therapeutic protein nucleotide sequence(s).
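One simple way to apply a host codon usage table is to back-translate the amino acid sequence using the most frequently used codon for each residue. The Python sketch below assumes this "most frequent codon" strategy, which addresses only the first of the design factors listed in this section and is not asserted to be the method of U.S. Patent 5,547,871. The function names and the usage values in the example are hypothetical; a real table for the chosen host would have to be supplied by the user.

```python
from itertools import product

# Standard genetic code, bases in T, C, A, G order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def preferred_codons(usage):
    """Map each amino acid to its most frequent codon in `usage`.

    `usage` maps codon -> relative frequency in the host; codons that are
    missing count as 0.0. Ties keep the first codon in table order.
    """
    best = {}
    for codon, aa in CODON_TABLE.items():
        freq = usage.get(codon, 0.0)
        if aa not in best or freq > best[aa][1]:
            best[aa] = (codon, freq)
    return {aa: codon for aa, (codon, _) in best.items()}


def back_translate(protein, usage):
    """Encode `protein` (one-letter codes) with the host-preferred codons."""
    table = preferred_codons(usage)
    return "".join(table[aa] for aa in protein)


def translate(dna):
    """Translate an in-frame DNA string; used to confirm the protein is unchanged."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical usage values, for illustration only (not measured data).
example_usage = {"TTG": 0.3, "CTG": 0.1, "AAA": 0.6, "AAG": 0.4}
dna = back_translate("MKL", example_usage)
print(dna, translate(dna))
```

Translating the redesigned sequence back to protein, as in the last line, is a convenient check that the optimization has altered only the nucleotide sequence.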

[00176] Vectors

[00177] Expression units for use in the present invention will generally comprise the following elements, operably linked in a 5' to 3' orientation: a transcriptional promoter, a secretory signal sequence, a DNA sequence encoding a modified Tf fusion protein comprising transferrin protein or a portion of a transferrin protein joined to a DNA sequence encoding a therapeutic protein or peptide of interest and a transcriptional terminator. As discussed above, any arrangement of the therapeutic protein or peptide fused to or within the Tf portion may be used in the vectors of the invention. The selection of suitable promoters, signal sequences and terminators will be determined by the selected host cell and will be evident to one skilled in the art and are discussed more specifically below.

[00178] Suitable yeast vectors for use in the present invention are described in U.S. Patent 6,291,212 and include YRp7 (Struhl et al., Proc. Natl. Acad. Sci. USA 76: 1035-1039, 1978), YEp13 (Broach et al., Gene 8: 121-133, 1979), pJDB249 and pJDB219 (Beggs, Nature 275:104-108, 1978), pPPC0005, pScCHSA, pScNHSA, pC4 and derivatives thereof. Useful yeast plasmid vectors also include pRS403-406, pRS413-416 and the Pichia vectors available from Stratagene Cloning Systems, La Jolla, CA 92037, USA. Plasmids pRS403, pRS404, pRS405 and pRS406 are Yeast Integrating plasmids (YIps) and incorporate the yeast selectable markers HIS3, TRP1, LEU2 and URA3. Plasmids pRS413-416 are Yeast Centromere plasmids (YCps).

[00179] Such vectors will generally include a selectable marker, which may be one of any number of genes that exhibit a dominant phenotype for which a phenotypic assay exists to enable transformants to be selected. Preferred selectable markers are those that complement host cell auxotrophy, provide antibiotic resistance or enable a cell to utilize specific carbon sources, and include LEU2 (Broach et al., ibid.), URA3 (Botstein et al., Gene 8: 17, 1979), HIS3 (Struhl et al., ibid.) or POT1 (Kawasaki and Bell, EP 171,142). Other suitable selectable markers include the CAT gene, which confers chloramphenicol resistance on yeast cells. Preferred promoters for use in yeast include promoters from yeast glycolytic genes (Hitzeman et al., J. Biol. Chem. 255: 12073-12080, 1980; Alber and Kawasaki, J. Mol. Appl. Genet. 1: 419-434, 1982; Kawasaki, U.S. Pat. No. 4,599,311) or alcohol dehydrogenase genes (Young et al., in Genetic Engineering of Microorganisms for Chemicals, Hollaender et al. (eds.), p. 355, Plenum, N.Y., 1982; Ammerer, Meth. Enzymol. 101: 192-201, 1983). In this regard, promoters that can be used are the TPI1 promoter (Kawasaki, U.S. Pat. No. 4,599,311) and the ADH2-4^c (see U.S. Patent 6,291,212) promoter (Russell et al., Nature 304: 652-654, 1983). The expression units may also include a transcriptional terminator. One transcriptional terminator is the TPI1 terminator (Alber and Kawasaki, ibid.).

[00180] In addition to yeast, modified fusion proteins of the present invention can be expressed in filamentous fungi, for example, strains of the fungi Aspergillus. Examples of useful promoters include those derived from Aspergillus nidulans glycolytic genes, such as the adh3 promoter (McKnight et al., EMBO J. 4: 2093-2099, 1985) and the tpiA promoter. An example of a suitable terminator is the adh3 terminator (McKnight et al., ibid.). The expression units utilizing such components may be cloned into vectors that are capable of insertion into the chromosomal DNA of Aspergillus, for example.

[00181] Mammalian expression vectors for use in carrying out the present invention will include a promoter capable of directing the transcription of the modified Tf fusion protein. Preferred promoters include viral promoters and cellular promoters. Preferred viral promoters include the major late promoter from adenovirus 2 (Kaufman and Sharp, Mol. Cell. Biol. 2: 1304-1319, 1982) and the SV40 promoter (Subramani et al., Mol. Cell. Biol. 1: 854-864, 1981). Preferred cellular promoters include the mouse metallothionein 1 promoter (Palmiter et al., Science 222: 809-814, 1983) and a mouse V6 (see U.S. Patent 6,291,212) promoter (Grant et al., Nuc. Acids Res. 15: 5496, 1987). One such promoter is a mouse V_H (see U.S. Patent 6,291,212) promoter (Loh et al., ibid.). Such expression vectors may also contain a set of RNA splice sites located downstream from the promoter and upstream from the DNA sequence encoding the transferrin fusion protein. Preferred RNA splice sites may be obtained from adenovirus and/or immunoglobulin genes.

[00182] Also contained in the expression vectors is a polyadenylation signal located downstream of the coding sequence of interest. Polyadenylation signals include the early or late polyadenylation signals from SV40 (Kaufman and Sharp, ibid.), the polyadenylation signal from the adenovirus 5 E1B region and the human growth hormone gene terminator (DeNoto et al., Nucl. Acid Res. 9: 3719-3730, 1981). One such polyadenylation signal is the V_H (see U.S. Patent 6,291,212) gene terminator (Loh et al., ibid.). The expression vectors may include a noncoding viral leader sequence, such as the adenovirus 2 tripartite leader, located between the promoter and the RNA splice sites. Preferred vectors may also include enhancer sequences, such as the SV40 enhancer and the mouse (see U.S. Patent 6,291,212) enhancer (Gillies, Cell 33: 717-728, 1983). Expression vectors may also include sequences encoding the adenovirus VA RNAs.

[00183] Transformation

[00184] Techniques for transforming fungi are well known in the literature, and have been described, for instance, by Beggs (ibid.), Hinnen et al. (Proc. Natl. Acad. Sci. USA 75: 1929-1933, 1978), Yelton et al. (Proc. Natl. Acad. Sci. USA 81: 1740-1747, 1984), and Russell (Nature 301: 167-169, 1983). The genotype of the host cell will generally contain a genetic defect that is complemented by the selectable marker present on the expression vector. Choice of a particular host and selectable marker is well within the level of ordinary skill in the art.

[00185] Cloned DNA sequences comprising modified Tf fusion proteins of the invention may be introduced into cultured mammalian cells by, for example, calcium phosphate-mediated transfection (Wigler et al., Cell 14: 725, 1978; Corsaro and Pearson, Somatic Cell Genetics 7: 603, 1981; Graham and Van der Eb, Virology 52: 456, 1973). Other techniques for introducing cloned DNA sequences into mammalian cells, such as electroporation (Neumann et al., EMBO J. 1: 841-845, 1982), or lipofection may also be used. In order to identify cells that have integrated the cloned DNA, a selectable marker is generally introduced into the cells along with the gene or cDNA of interest. Preferred selectable markers for use in cultured mammalian cells include genes that confer resistance to drugs, such as neomycin, hygromycin, and methotrexate. The selectable marker may be an amplifiable selectable marker. One amplifiable selectable marker is the DHFR gene. One amplifiable marker is the DHFR^r (see U.S. Patent 6,291,212) cDNA (Simonsen and Levinson, Proc. Natl. Acad. Sci. USA 80: 2495-2499, 1983). Selectable markers are reviewed by Thilly (Mammalian Cell Technology, Butterworth Publishers, Stoneham, Mass.) and the choice of selectable markers is well within the level of ordinary skill in the art.

[00186] Host Cells

[00187] The present invention also includes a cell, preferably a yeast cell, transformed to express a modified transferrin fusion protein of the invention. In addition to the transformed host cells themselves, the present invention also includes a culture of those cells, preferably a monoclonal (clonally homogeneous) culture, or a culture derived from a monoclonal culture, in a nutrient medium. If the polypeptide is secreted, the medium will contain the polypeptide, with the cells, or without the cells if they have been filtered or centrifuged away.

[00188] Host cells for use in practicing the present invention include eukaryotic cells, and in some cases prokaryotic cells, capable of being transformed or transfected with exogenous DNA and grown in culture, such as cultured mammalian, insect, fungal, plant and bacterial cells.

[00189] Fungal cells, including species of yeast (e.g., Saccharomyces spp., Schizosaccharomyces spp., Pichia spp.) may be used as host cells within the present invention. Exemplary genera of yeast contemplated to be useful in the practice of the present invention as hosts for expressing the transferrin fusion protein of the invention are Pichia (including species formerly classified as Hansenula), Saccharomyces, Kluyveromyces, Aspergillus, Candida, Torulopsis, Torulaspora, Schizosaccharomyces, Citeromyces, Pachysolen, Zygosaccharomyces, Debaryomyces, Trichoderma, Cephalosporium, Humicola, Mucor, Neurospora, Yarrowia, Metschnikowia, Rhodosporidium, Leucosporidium, Botryoascus, Sporidiobolus, Endomycopsis, and the like. Examples of Saccharomyces spp. are S. cerevisiae, S. italicus and S. rouxii. Examples of Kluyveromyces spp. are K. lactis and K. marxianus. A suitable species is T. delbrueckii. Examples of Pichia (Hansenula) spp. are P. angusta (formerly H. polymorpha), P. anomala (formerly H. anomala) and P. pastoris.

[00190] Particularly useful host cells to produce the Tf fusion proteins of the invention are the methylotrophic Pichia pastoris (Steinlein et al. (1995) Protein Express. Purif. 6:619-624). Pichia pastoris has been developed to be an outstanding host for the production of foreign proteins since its alcohol oxidase promoter was isolated and cloned; its transformation was first reported in 1985. P. pastoris can utilize methanol as a carbon source in the absence of glucose. The P. pastoris expression system can use the methanol-induced alcohol oxidase (AOX1) promoter, which controls the gene that codes for the expression of alcohol oxidase, the enzyme which catalyzes the first step in the metabolism of methanol. This promoter has been characterized and incorporated into a series of P. pastoris expression vectors. Since the proteins produced in P. pastoris are typically folded correctly and secreted into the medium, the fermentation of genetically engineered P. pastoris provides an excellent alternative to E. coli expression systems. A number of proteins have been produced using this system, including tetanus toxin fragment, Bordetella pertussis pertactin, human serum albumin and lysozyme.

[00191] The transformation of F. oxysporum may, for instance, be carried out as described by Malardier et al (1989) Gene 78:147-156.

[00192] Strains of the yeast Saccharomyces cerevisiae are another preferred host. In one embodiment, a yeast cell, or more specifically, a Saccharomyces cerevisiae host cell that contains a genetic deficiency in a gene required for asparagine-linked glycosylation of glycoproteins is used. S. cerevisiae host cells having such defects may be prepared using standard techniques of mutation and selection, although many available yeast strains have been modified to prevent or reduce glycosylation or hypermannosylation. Ballou et al. (J. Biol. Chem. 255: 5986-5991, 1980) have described the isolation of mannoprotein biosynthesis mutants that are defective in genes which affect asparagine-linked glycosylation. Gentzsch and Tanner (Glycobiology 7:481-486, 1997) have described a family of at least six genes (PMT1-6) encoding enzymes responsible for the first step in O-glycosylation of proteins in yeast. Mutants defective in one or more of these genes show reduced O-linked glycosylation and/or altered specificity of O-glycosylation.

[00193] To optimize production of the heterologous proteins, it may be preferred that the host strain carries a mutation, such as the S. cerevisiae pep4 mutation (Jones, Genetics 85: 23-33, 1977), which results in reduced proteolytic activity. Host strains containing mutations in other protease encoding regions are particularly useful to produce large quantities of the Tf fusion proteins of the invention.

[00194] Host cells containing DNA constructs of the present invention are grown in an appropriate growth medium. As used herein, the term "appropriate growth medium" means a medium containing nutrients required for the growth of cells. Nutrients required for cell growth may include a carbon source, a nitrogen source, essential amino acids, vitamins, minerals and growth factors. The growth medium will generally select for cells containing the DNA construct by, for example, drug selection or deficiency in an essential nutrient which is complemented by the selectable marker on the DNA construct or co-transfected with the DNA construct. Yeast cells, for example, are preferably grown in a chemically defined medium, comprising a carbon source, e.g. sucrose, a non-amino acid nitrogen source, inorganic salts, vitamins and essential amino acid supplements. The pH of the medium is preferably maintained at a pH greater than 2 and less than 8, preferably at pH 5.5 to 6.5. Methods for maintaining a stable pH include buffering and constant pH control, preferably through the addition of sodium hydroxide. Preferred buffering agents include succinic acid and Bis-Tris (Sigma Chemical Co., St. Louis, Mo.). Yeast cells having a defect in a gene required for asparagine-linked glycosylation are preferably grown in a medium containing an osmotic stabilizer. One such osmotic stabilizer is sorbitol supplemented into the medium at a concentration between 0.1 M and 1.5 M, preferably at 0.5 M or 1.0 M.

[00195] Cultured mammalian cells are generally grown in commercially available serum-containing or serum-free media. Selection of a medium appropriate for the particular cell line used is within the level of ordinary skill in the art. Transfected mammalian cells are allowed to grow for a period of time, typically 1-2 days, to begin expressing the DNA sequence(s) of interest. Drug selection is then applied to select for growth of cells that are expressing the selectable marker in a stable fashion. For cells that have been transfected with an amplifiable selectable marker, the drug concentration may be increased in a stepwise manner to select for increased copy number of the cloned sequences, thereby increasing expression levels.

[00196] Baculovirus/insect cell expression systems may also be used to produce the modified Tf fusion proteins of the invention. The BacPAK™ Baculovirus Expression System (BD Biosciences (Clontech)) expresses recombinant proteins at high levels in insect host cells. The target gene is inserted into a transfer vector, which is cotransfected into insect host cells with the linearized BacPAK6 viral DNA. The BacPAK6 DNA is missing an essential portion of the baculovirus genome. When the DNA recombines with the vector, the essential element is restored and the target gene is transferred to the baculovirus genome. Following recombination, a few viral plaques are picked and purified, and the recombinant phenotype is verified. The newly isolated recombinant virus can then be amplified and used to infect insect cell cultures to produce large amounts of the desired protein.

[00197] Secretory Signal Sequences

[00198] The terms "secretory signal sequence" or "signal sequence" or "secretion leader sequence" are used interchangeably and are described, for example, in U.S. Pat. 6,291,212 and U.S. Pat. 5,547,871, both of which are herein incorporated by reference in their entirety. Secretory signal sequences or signal sequences or secretion leader sequences encode secretory peptides. A secretory peptide is an amino acid sequence that acts to direct the secretion of a mature polypeptide or protein from a cell. Secretory peptides are generally characterized by a core of hydrophobic amino acids and are typically (but not exclusively) found at the amino termini of newly synthesized proteins. Very often the secretory peptide is cleaved from the mature protein during secretion. Secretory peptides may contain processing sites that allow cleavage of the signal peptide from the mature protein as it passes through the secretory pathway. Processing sites may be encoded within the signal peptide or may be added to the signal peptide by, for example, in vitro mutagenesis.

[00199] Secretory peptides may be used to direct the secretion of modified Tf fusion proteins of the invention. One such secretory peptide that may be used in combination with other secretory peptides is the third domain of the yeast Barrier protein. Secretory signal sequences or signal sequences or secretion leader sequences are required for a complex series of post-translational processing steps which result in secretion of a protein. If an intact signal sequence is present, the protein being expressed enters the lumen of the rough endoplasmic reticulum and is then transported through the Golgi apparatus to secretory vesicles and is finally transported out of the cell. Generally, the signal sequence immediately follows the initiation codon and encodes a signal peptide at the amino-terminal end of the protein to be secreted. In most cases, the signal sequence is cleaved off by a specific protease, called a signal peptidase. Preferred signal sequences improve the processing and export efficiency of recombinant protein expression using viral, mammalian or yeast expression vectors. In some cases, the native Tf signal sequence may be used to express and secrete fusion proteins of the invention.

[00200] Linkers

[00201] The Tf moiety and the ligand of the modified transferrin fusion proteins of the invention can be fused directly or using a linker peptide of various lengths to provide greater physical separation and allow more spatial mobility between the fused proteins and thus maximize the accessibility of the antibody variable region, for instance, for binding to its cognate receptor. The linker peptide may consist of amino acids that are flexible or more rigid. In one embodiment, the invention includes a substantially non-helical linker such as (PEAPTD)_n (SEQ ID NO.: 18). In another embodiment, the fusion protein of the invention contains a linker with a poly-glycine stretch. The linker can be less than about 50, 40, 30, 20, or 10 amino acid residues. The linker can be covalently linked to and between the transferrin protein or portion thereof and the antibody variable region.
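As a minimal sketch of the (PEAPTD)_n linker and the length limits recited above, the following Python example builds a repeat linker and rejects one that reaches the chosen limit. The function names, the strict "less than" reading of the limit, the placement of the linker between an N-terminal Tf portion and a C-terminal ligand, and the placeholder sequences are assumptions made for illustration; other arrangements are contemplated in this disclosure.

```python
def repeat_linker(unit="PEAPTD", n=1, max_len=50):
    """Return `unit` repeated `n` times, e.g. (PEAPTD)_n.

    Raises ValueError if the linker would not be shorter than `max_len`
    residues (the text recites limits of about 50, 40, 30, 20 or 10).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    linker = unit * n
    if len(linker) >= max_len:
        raise ValueError(
            f"linker of {len(linker)} residues is not shorter than {max_len}"
        )
    return linker


def join_with_linker(tf_portion, ligand, linker):
    """Concatenate Tf portion, linker and ligand (orientation is an assumption)."""
    return tf_portion + linker + ligand


linker = repeat_linker(n=2)
# "TFPLACEHOLDER" and "LIGAND" are placeholder strings, not real sequences.
print(join_with_linker("TFPLACEHOLDER", "LIGAND", linker))
```

With the six-residue PEAPTD unit, n may be at most 8 under a 50-residue limit, since eight repeats give 48 residues and nine give 54.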

[00202] Linkers may also be used to join antibody variable regions within a ligand or ligands. Suitable linkers for joining the antibody variable regions are those that allow the antibody variable regions to fold into a three dimensional structure that maintains the binding specificity of a whole antibody.

[00203] Screening Methods

[00204] The number of possible target molecules for which ligands may be identified by screening fusion protein libraries of the present invention is virtually unlimited. For example, the target molecule, i.e. receptor or agent, may be an antibody (or a binding portion thereof) or antigen. The antigen to which the antibody binds may be known and perhaps even sequenced, in which case the invention may be used to map epitopes of the antigen. If the antigen is unknown, such as with certain autoimmune diseases, for example, sera, fluids, tissue, or cells from patients with the disease can be used in the present screening method to identify peptides, and consequently the antigen, that elicits the autoimmune response. Once a peptide has been identified, that peptide can serve as, or provide the basis for, the development of a vaccine, a therapeutic agent, a diagnostic reagent, etc. See WO 01/46698 for a list of target molecules on which the ligands may be screened, which is herein incorporated by reference in its entirety for all purposes.

[00205] Screening may be performed by using one of the methods well known to the practitioner in the art, such as by biopanning, FACS or MACS. In one embodiment of the invention, screening is performed for receptor activation. The target can be either purified and in solution or surface bound or cell associated. The target may be labeled, for instance, with biotin or by other methods known in the art.

[00206] Polypeptides and peptides having the desired property can be isolated and identified by sequencing of the corresponding nucleic acid sequence or by amino acid sequencing or mass spectrometry. Subsequent optimization may be performed by repeating the replacement of sub-sequences by different sequences, preferably by random sequences, and the screening step one or more times.

[00207] Once a peptide library is constructed, host cells are transformed with the library vectors. The successful transformants are typically selected by growth in a selective medium or under selective conditions, e.g., an appropriate growth medium or others depending on the vector used. This selection may be done on solid or in liquid growth medium. For growth of bacterial cells on solid medium, the cells are grown at a high density (about 10 to 10 transformants per m²) on a large surface of, for example, L-agar containing the selective antibiotic to form essentially a confluent lawn. For growth in liquid culture, cells may be grown in L-broth (with antibiotic selection) through about 10 or more doublings. Growth in liquid culture may be more convenient because of the size of the libraries, while growth on solid media likely provides less chance of bias during the amplification process.

[00208] If a transferrin fusion protein peptide library is to be screened by yeast cell surface display, yeast cells will be transformed with the expression vector coding for the transferrin fusion protein. A full range of mutagenesis methods is consistent with yeast surface display library construction such as error-prone polymerase chain reaction and DNA shuffling. See Boder et al. (2000) Methods in Enzymology 328: 430-444. Alternatively, the transferrin moiety of the expressed fusion proteins can serve as a scaffold for random peptide sequences or CDRs.

[00209] Several approaches are known in the art for identifying desirable peptides once a yeast cell transferrin fusion protein peptide library has been created. For example, peptides can be distinguished by equilibrated binding with low concentrations of fluorescently labeled target, i.e. receptor or agent, in cases of fairly low affinity concentrations (K_d > nM, or no affinity if the library is being screened to isolate a novel binding specificity). For applications designed to evolve tight-binding proteins, excessively large volumes of dilute target solutions may be necessary to maintain molar ligand excess, complicating handling of samples. In such cases, improvements in binding affinity may be approximated by changes in dissociation kinetics. Kinetic competition for a stoichiometrically limiting target can be used to identify improved clones within the population (Hawkins et al. (1992) J. Mol. Biol. 226: 889); however, this approach eliminates the quantitative predictability of the screening approach and is not recommended in general. See Boder et al. (2000) Methods in Enzymology 328: 430-444.
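The considerations in the preceding paragraph can be made concrete with a short calculation. The Python sketch below assumes simple monovalent 1:1 binding with the labeled target in excess, so that the fraction of displayed peptide bound at equilibrium is [target] / ([target] + K_d) and dissociation follows first-order kinetics. The function names, the ten-fold excess and the number of displayed copies per cell are hypothetical values chosen for illustration and are not taken from the cited references.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def fraction_bound(target_conc, kd):
    """Equilibrium fraction of displayed peptide bound (1:1 binding, target in excess)."""
    return target_conc / (target_conc + kd)


def fraction_remaining(k_off, seconds):
    """Fraction of pre-bound target still bound after first-order dissociation."""
    return math.exp(-k_off * seconds)


def min_volume_liters(n_cells, copies_per_cell, target_conc, fold_excess=10.0):
    """Volume of target solution needed to keep a molar excess over displayed peptide."""
    displayed_moles = n_cells * copies_per_cell / AVOGADRO
    return fold_excess * displayed_moles / target_conc


# At a target concentration equal to K_d, half of the displayed peptide is bound.
print(fraction_bound(1e-9, 1e-9))

# Hypothetical case: 1e7 cells, 5e4 copies per cell, 10 pM target, ten-fold excess.
print(min_volume_liters(1e7, 5e4, 1e-11))
```

Under these assumed numbers close to a liter of target solution would be needed, which illustrates why screening of tight binders may instead rely on differences in dissociation rate.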

[00210] Targets can be biotinylated or fluorescently labeled, or alternatively, a ligand of interest, i.e. a peptide displayed on transferrin, can be labeled. Preferably, the targets are labeled. Labeled targets, e.g. biotinylated targets, can be incubated with a transferrin fusion protein peptide library. The library may have at least about 10⁴ members (i.e. displayed peptides), at least about 10⁵ members, at least about 10⁶ members, at least about 10⁷ members, at least about 10⁸ members, at least about 10⁹ members, at least about 10¹⁰ members, at least about 10¹¹ members, at least about 10¹² members, at least about 10¹³ members, at least about 10¹⁴ members, at least about 10¹⁵ members, or at least about 10¹⁶ members.
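To put the recited library sizes in context, the number of distinct sequences available to a fully random peptide can be compared with the number of library members. The Python sketch below assumes that each position may be any of the 20 standard amino acids; the function names are chosen for the example.

```python
def theoretical_diversity(length, alphabet=20):
    """Number of distinct peptides of the given length over `alphabet` residues."""
    return alphabet ** length


def min_length_to_cover(library_size, alphabet=20):
    """Shortest random peptide whose sequence space is at least `library_size`."""
    length = 0
    while alphabet ** length < library_size:
        length += 1
    return length


for exponent in (4, 9, 16):
    size = 10 ** exponent
    print(f"10^{exponent} members: peptide length >= {min_length_to_cover(size)}")
```

For example, a random 7-residue peptide has 20^7 (about 1.3 x 10^9) possible sequences, so a library of about 10^9 members is of the same order as the sequence space of a 7-mer, while 13 random positions are needed before the sequence space exceeds 10^16.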

[00211] After incubation, cells can be labeled with a second label such as secondary antibodies, streptavidin-labeled molecules, or other methods known in the art. The secondary antibody can be an anti-biotin antibody. Streptavidin-labeled molecules include, but are not limited to, streptavidin-phycoerythrin or streptavidin microbeads.

[00212] Flow cytometry can be used to analyze cell populations as known in the art. When this is done, only the displaying fraction of the population is analyzed. See Boder et al. (2000) Methods in Enzymology 328: 430-444 and Kondo et al. (2004) Appl. Microbiol. Biotechnol. 64: 28-40, both of which are herein incorporated by reference in their entirety.

[00213] Alternatively, if a second label consisting of labeled beads is used, i.e. anti-biotin or streptavidin labeled beads, the mixture of ligands and target molecules can be sorted using a magnetic sorting protocol as described in Yeung et al. (2002) Biotechnol. Prog. 18: 212-220, which is herein incorporated by reference in its entirety. A MACS® MicroBeads kit can be used with this screening protocol (Miltenyi Biotec GmbH). Magnetic sorting can be used in conjunction with FACS.

[00214] In one embodiment of the present invention, it is desirable to characterize a single ligand of interest expressed in a yeast cell. The expressed protein may be screened in a variety of ways. If the protein has a function it may be directly assayed. For example, single chain antibodies expressed on the yeast surface are fully functional and may be screened based on binding to an antigen. If the protein does not have a detectable function that can be easily assayed, expression of the ligand may be monitored using an antibody. Because a yeast cell is much larger than phage, one can use flow cytometry to monitor the phenotype of the protein on a single yeast cell.

[00215] In another embodiment of the present invention, binding of the ligand moiety with a receptor or agent is performed by a means known in the art, other than cell surface display, such as by ELISA, competition binding assays when the target's native binding partner is known, sandwich assays, radioreceptor assays using a radioactive ligand whose binding is blocked by the peptide library, etc. In these methods, host cells transformed with the Tf fusion protein peptide library are lysed. The Tf fusion protein peptides are anchored to the assay substrate via an appropriate anchor moiety such as, but not limited to, an anti-MUC1 antibody. The screening process involves reacting the Tf peptide library with the target of interest to establish a baseline binding level against which the binding activities of subsequent peptide libraries are compared. The nature of the assay is not critical so long as it is sufficiently sensitive to detect small quantities of peptide binding to or competing for binding to the target. The assay conditions may be varied to take into account optimal binding conditions for different binding substances of interest or other biological activities. Thus, the pH, temperature, salt concentration, volume and duration of binding, etc. may all be varied to achieve binding of peptide to target under conditions which resemble those of the environment of interest.

[00216] Once it is determined that the Tf peptide library possesses a peptide or peptides which bind to the target of interest, the methods of the invention can be used to identify the sequence of the peptide(s) in the mixture. Cells displaying peptides that bind the target can be isolated from the general population of the library by MACS or FACS screening. The screening process is repeated 2 to 3 times on the initial isolates to deplete any nonspecific binders. A final round of screening by FACS sorting to isolate based on binding affinity is then performed. Plasmid DNA is recovered from isolated cells, and the DNA for the region of the insert is sequenced to determine the protein sequence. Common motifs between the isolates can then be determined.
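The final step above, determining common motifs between isolates, can be sketched in code. This is a minimal illustration only, assuming the insert sequences have already been translated to one-letter peptide strings; the function name, the window length, the threshold, and the example peptides are hypothetical.

```python
from collections import Counter


def shared_kmers(peptides, k=3, min_fraction=0.5):
    """Return k-residue windows present in at least min_fraction of the isolates.

    Each window is counted once per peptide, so a repeat within a single
    isolate does not inflate its count.
    """
    counts = Counter()
    for pep in peptides:
        windows = {pep[i:i + k] for i in range(len(pep) - k + 1)}
        counts.update(windows)
    threshold = min_fraction * len(peptides)
    return sorted(kmer for kmer, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    # Hypothetical isolate sequences; three of the four share one tripeptide.
    isolates = ["ACRGDWK", "TTRGDSA", "RGDLLMP", "QQWERTY"]
    print(shared_kmers(isolates, k=3, min_fraction=0.75))
```

Exact window matching is the simplest possible criterion; a real analysis would more likely use a multiple sequence alignment so that conservative substitutions within a motif are also recognized.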

[00217] Therapeutic Ligand Molecules

[00218] The ligands of the invention can be putative or known therapeutic molecules. As used herein, a therapeutic molecule is typically a protein or peptide capable of exerting a beneficial biological effect in vitro or in vivo and includes proteins or peptides that exert a beneficial effect in relation to normal homeostasis, physiology or a disease state. Therapeutic molecules do not include fusion partners commonly used as markers or protein purification aids, such as galactosidases (see for example, U.S. Patent 5,986,067 and Aldred et al. (1984) Biochem. Biophys. Res. Commun. 122: 960-965). For instance, a beneficial effect as related to a disease state includes any effect that is advantageous to the treated subject, including disease prevention, disease stabilization, the lessening or alleviation of disease symptoms or a modulation, alleviation or cure of the underlying defect to produce an effect beneficial to the treated subject.

[00219] A therapeutic ligand may be fused directly to a transferrin moiety or indirectly via a linker moiety as previously described. In one embodiment, it may be desirable to cleave the fusion protein to separate the transferrin and ligand portion of the fusion protein from the remainder of the fusion protein. In another embodiment, it may be desirable to cleave the ligand from the remainder of the fusion protein.

[00220] The ligand moiety of the fusion protein of the invention may contain at least a fragment or variant of a therapeutic protein, and/or at least a fragment or variant of an antibody. In a further embodiment, the fusion proteins can contain peptide fragments or peptide variants of proteins or antibodies wherein the variant or fragment retains at least one biological or therapeutic activity. The fusion proteins can contain therapeutic proteins that can be peptide fragments or peptide variants at least about 3, at least about 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 20, at least 25, at least 30, at least 35, or at least about 40, at least about 50, at least about 55, at least about 60 or at least about 70 or more amino acids in length fused to the N and/or C termini, inserted within, or inserted into a loop of a modified transferrin.

[00221] In another embodiment, the ligand moiety of the fusion protein of the present invention contains a therapeutic protein portion that can be fragments of a therapeutic protein that include the full length protein as well as polypeptides having one or more residues deleted from the amino terminus of the amino acid sequence.

[00222] In another embodiment, the ligand moiety of the fusion protein of the present invention contains a therapeutic protein portion that can be fragments of a therapeutic protein that include the full length protein as well as polypeptides having one or more residues deleted from the carboxy terminus of the amino acid sequence.

[00223] In another embodiment, the ligand moiety of the fusion protein of the present invention contains a therapeutic protein portion that can have one or more amino acids deleted from both the amino and the carboxy termini.

[00224] In another embodiment, the fusion protein contains a therapeutic protein portion, i.e. ligand moiety, that is at least about 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identical to a reference therapeutic protein set forth herein, or fragments thereof. In further embodiments, the transferrin fusion molecules contain a therapeutic protein portion that is at least about 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identical to reference polypeptides having the amino acid sequence of N- and C-terminal deletions as described above.

[00225] In another embodiment, the fusion protein contains a therapeutic protein portion that is at least about 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identical to, for example, the native or wild-type amino acid sequence of a therapeutic protein. Fragments of these polypeptides are also provided.

[00226] The therapeutic proteins corresponding to a therapeutic protein portion of a modified transferrin fusion protein of the invention, such as cell surface and secretory proteins, can be modified by the attachment of one or more oligosaccharide groups. The modification, referred to as glycosylation, can significantly affect the physical properties of proteins and can be important in protein stability, secretion, and localization. Glycosylation occurs at specific locations along the polypeptide backbone. There are usually two major types of glycosylation: glycosylation characterized by O-linked oligosaccharides, which are attached to serine or threonine residues; and glycosylation characterized by N-linked oligosaccharides, which are attached to asparagine residues in an Asn-X-Ser/Thr sequence, where X can be any amino acid except proline. Variables such as protein structure and cell type influence the number and nature of the carbohydrate units within the chains at different glycosylation sites. Glycosylation isomers are also common at the same site within a given cell type. For example, several types of human interferon are glycosylated.
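The Asn-X-Ser/Thr rule stated above lends itself to a simple sequence scan. The following is a minimal sketch, assuming the protein is supplied as a string of one-letter amino acid codes; the function name and the example sequence are hypothetical, and a matching sequon indicates only a potential N-linked site, not that the site is actually glycosylated in a given host.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr. The lookahead lets
# overlapping sequons (e.g. N-N-S-T) each be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_glyc_sites(seq: str) -> list:
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical sequence: N-G-T and N-L-T qualify, N-P-S does not.
    print(find_n_glyc_sites("MNGTANPSQNLT"))
```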

[00227] Therapeutic proteins corresponding to a therapeutic protein portion of a fusion protein of the invention, as well as analogs and variants thereof, may be modified so that glycosylation at one or more sites is altered as a result of manipulation(s) of their nucleic acid sequence, by the host cell in which they are expressed, or due to other conditions of their expression. For example, glycosylation isomers may be produced by abolishing or introducing glycosylation sites, e.g., by substitution or deletion of amino acid residues, such as substitution of glutamine for asparagine, or unglycosylated recombinant proteins may be produced by expressing the proteins in host cells that will not glycosylate them, e.g. in glycosylation-deficient yeast. These approaches are known in the art.
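The substitution of glutamine for asparagine mentioned above can be sketched at the protein-sequence level. This is an illustration only, assuming one-letter amino acid codes and the Asn-X-Ser/Thr rule with X not proline; the function name and example sequence are hypothetical, and in practice the change would be made in the encoding nucleic acid.

```python
import re


def abolish_n_glyc_sites(seq: str) -> str:
    """Replace Asn with Gln at every Asn-X-Ser/Thr sequon (X is not Pro).

    Sequons are located on the original sequence before any residue is
    changed; asparagines outside a sequon are left untouched.
    """
    seq = seq.upper()
    residues = list(seq)
    for match in re.finditer(r"(?=N[^P][ST])", seq):
        residues[match.start()] = "Q"
    return "".join(residues)


if __name__ == "__main__":
    # Hypothetical sequence with two sequons and one Asn-Pro-Ser non-site.
    print(abolish_n_glyc_sites("MNGTANPSQNLT"))
```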

[00228] Therapeutic proteins and their nucleic acid sequences are well known in the art and available in public databases such as Chemical Abstracts Services Databases (e.g., the CAS Registry), GenBank, and GenSeq. The Accession Numbers and sequences referred to below are herein incorporated by reference in their entirety.

[00229] The present invention is further directed to fusion proteins comprising fragments of the therapeutic proteins herein described. Even if deletion of one or more amino acids from the N-terminus of a protein results in modification or loss of one or more biological functions of the therapeutic protein portion, other therapeutic activities and/or functional activities (e.g., biological activities, ability to multimerize, ability to bind a ligand) may still be retained. For example, the ability of polypeptides with N-terminal deletions to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptides generally will be retained with less than the majority of the residues of the complete polypeptide removed from the N-terminus. Whether a particular polypeptide lacking N-terminal residues of a complete polypeptide retains such immunologic activities can be assayed by routine methods described herein and otherwise known in the art. It is not unlikely that a mutant with a large number of deleted N-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.

[00230] Also as mentioned above, even if deletion of one or more amino acids from the N-terminus or C-terminus of a therapeutic protein results in modification or loss of one or more biological functions of the protein, other functional activities, e.g., biological activities, ability to multimerize, ability to bind a ligand, and/or therapeutic activities may still be retained. For example, the ability of polypeptides with C-terminal deletions to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptide generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the C-terminus. Whether a particular polypeptide lacking the N-terminal and/or C-terminal residues of a reference polypeptide retains therapeutic activity can readily be determined by routine methods described herein and/or otherwise known in the art.

[00231] Peptide fragments of the therapeutic proteins can be fragments comprising, or alternatively, consisting of, an amino acid sequence that displays a therapeutic activity and/or functional activity, e.g., biological activity, of the polypeptide sequence of the therapeutic protein of which the amino acid sequence is a fragment.

[00232] Other polypeptide fragments are biologically active fragments. Biologically active fragments are those exhibiting activity similar, but not necessarily identical, to an activity of a therapeutic protein used in the present invention. The biological activity of the fragments may include an improved desired activity, or a decreased undesirable activity.

[00233] Generally, variants of proteins are overall very similar, and, in many regions, identical to the amino acid sequence of the therapeutic protein corresponding to a therapeutic protein portion of a transferrin fusion protein of the invention. Nucleic acids encoding these variants are also encompassed by the invention.

[00234] Further therapeutic polypeptides that may be used in the invention are polypeptides encoded by polynucleotides which hybridize to the complement of a nucleic acid molecule encoding an amino acid sequence of a therapeutic protein under stringent hybridization conditions which are known to those of skill in the art. See, for example, Ausubel, F.M. et al., eds., 1989, Current Protocols in Molecular Biology, Greene Publishing Associates, Inc., and John Wiley & Sons Inc., New York. Polynucleotides encoding these polypeptides are also encompassed by the invention.

[00235] By a polypeptide having an amino acid sequence at least, for example, 95% "identical" to a query amino acid sequence of the present invention, it is intended that the amino acid sequence of the subject polypeptide is identical to the query sequence except that the subject polypeptide sequence may include up to five amino acid alterations per each 100 amino acids of the query amino acid sequence. In other words, to obtain a polypeptide having an amino acid sequence at least 95% identical to a query amino acid sequence, up to 5% of the amino acid residues in the subject sequence may be inserted, deleted, or substituted with another amino acid. These alterations of the reference sequence may occur at the amino- or carboxy-terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the reference sequence, or in one or more contiguous groups within the reference sequence.
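The arithmetic behind this definition can be made concrete. The following is a minimal sketch, assuming the two sequences have already been globally aligned with "-" marking gaps and that every mismatched or gapped column counts as one alteration; the function names and example sequences are hypothetical, and this simplified count is not the scoring performed by any particular alignment program.

```python
def max_alterations(query_length: int, percent_identity: int) -> int:
    """Largest whole number of alterations allowed at the given identity level."""
    return query_length * (100 - percent_identity) // 100


def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percent identity relative to the query length for a pre-aligned pair.

    Substitutions, insertions and deletions each count as one alteration
    per alignment column.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    query_len = sum(1 for residue in aligned_query if residue != "-")
    if query_len == 0:
        raise ValueError("query sequence is empty")
    alterations = sum(1 for q, s in zip(aligned_query, aligned_subject) if q != s)
    return 100.0 * (query_len - alterations) / query_len


if __name__ == "__main__":
    print(max_alterations(100, 95))
    # Hypothetical 20-residue query with a single substitution in the subject.
    print(percent_identity("ACDEFGHIKLMNPQRSTVWY", "ACDEFGHIKLMNPQRSTVWF"))
```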

[00236] As a practical matter, whether any particular polypeptide is at least about 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identical to, for instance, the amino acid sequence of a fusion protein of the invention or a fragment thereof (such as the therapeutic protein portion of the fusion protein or portion thereof), can be determined conventionally using known computer programs. One method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, uses the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci 245- (1990)).

[00237] The polynucleotide variants of the invention may contain alterations in the coding regions, non-coding regions, or both. Polynucleotide variants containing alterations which produce silent substitutions, additions, or deletions, but do not alter the properties or activities of the encoded polypeptide may be used to produce modified ligand moieties. Nucleotide variants produced by silent substitutions due to the degeneracy of the genetic code can be utilized. Moreover, polypeptide variants in which less than about 50, less than 40, less than 30, less than 20, less than 10, or 5-50, 5-25, 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any combination can also be utilized. Polynucleotide variants can be produced for a variety of reasons, e.g., to optimize codon expression for a particular host (change codons in the human mRNA to those preferred by a host, such as yeast or E. coli, as described above).
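Whether a nucleotide variant is silent can be checked by translating both coding sequences and comparing the results. The following is a minimal sketch using the standard genetic code, assuming in-frame DNA coding sequences; the function names and example sequences are hypothetical, and no host codon-preference data are included.

```python
from itertools import product

# Standard genetic code with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_silent_variant(reference_cds: str, variant_cds: str) -> bool:
    """True if the two sequences differ yet encode the same polypeptide."""
    return (reference_cds.upper() != variant_cds.upper()
            and translate(reference_cds) == translate(variant_cds))


if __name__ == "__main__":
    # Hypothetical two-codon example: both sequences encode Ala-Glu.
    print(translate("GCTGAA"), translate("GCCGAG"))
    print(is_silent_variant("GCTGAA", "GCCGAG"))
```

A codon-optimization step would build on the same table by choosing, for each amino acid, the synonymous codon preferred by the intended host; the preference data would have to come from a host-specific codon usage table.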

[00238] In other embodiments, the therapeutic protein moiety, i.e., ligand moiety, has conservative substitutions compared to the wild-type sequence. By "conservative substitutions" is intended swaps within groups such as replacement of the aliphatic or hydrophobic amino acids Ala, Val, Leu and Ile; replacement of the hydroxyl residues Ser and Thr; replacement of the acidic residues Asp and Glu; replacement of the amide residues Asn and Gln; replacement of the basic residues Lys, Arg, and His; replacement of the aromatic residues Phe, Tyr, and Trp; and replacement of the small-sized amino acids Ala, Ser, Thr, Met, and Gly. Guidance concerning how to make phenotypically silent amino acid substitutions is provided, for example, in Bowie et al., "Deciphering the Message in Protein Sequences: Tolerance to Amino Acid Substitutions," Science 247:1306-1310 (1990). In specific embodiments, the polypeptides of the invention comprise, or alternatively, consist of, fragments or variants of the amino acid sequence of a therapeutic protein described herein and/or serum transferrin, and/or modified transferrin protein of the invention, wherein the fragments or variants have 1-5, 5-10, 5-25, 5-50, 10-50 or 50-150 amino acid residue additions, substitutions, and/or deletions when compared to the reference amino acid sequence. In further embodiments, the amino acid substitutions are conservative. Nucleic acids encoding these polypeptides are also encompassed by the invention.
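The substitution groups listed above can be encoded directly to classify the differences between a variant and its wild-type sequence. The following is a minimal sketch, assuming equal-length sequences in one-letter code with substitutions only; the function names and example sequences are hypothetical.

```python
# Groups as listed in the paragraph above, in one-letter amino acid code.
CONSERVATIVE_GROUPS = [
    set("AVLI"),   # aliphatic or hydrophobic: Ala, Val, Leu, Ile
    set("ST"),     # hydroxyl: Ser, Thr
    set("DE"),     # acidic: Asp, Glu
    set("NQ"),     # amide: Asn, Gln
    set("KRH"),    # basic: Lys, Arg, His
    set("FYW"),    # aromatic: Phe, Tyr, Trp
    set("ASTMG"),  # small-sized: Ala, Ser, Thr, Met, Gly
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if two different residues share at least one of the groups."""
    a, b = original.upper(), replacement.upper()
    return a != b and any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(wild_type: str, variant: str) -> list:
    """List (position, wild-type residue, variant residue, conservative?) tuples."""
    if len(wild_type) != len(variant):
        raise ValueError("sequences must have equal length (substitutions only)")
    return [
        (i + 1, w, v, is_conservative(w, v))
        for i, (w, v) in enumerate(zip(wild_type.upper(), variant.upper()))
        if w != v
    ]


if __name__ == "__main__":
    # Hypothetical four-residue example with one conservative change.
    print(classify_substitutions("ALDK", "AIDE"))
```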

[00239] The modified fusion proteins of the present invention can be composed of amino acids joined to each other by peptide bonds or modified peptide bonds and may contain amino acids other than the 20 gene-encoded amino acids. The polypeptides may be modified by either natural processes, such as post-translational processing, or by chemical modification techniques which are well known in the art. Such modifications are well described in basic texts and in more detailed monographs, as well as in a voluminous research literature.

[00240] Modifications can occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxy termini. It will be appreciated that the same type of modification may be present in the same or varying degrees at several sites in a given polypeptide. Also, a given polypeptide may contain many types of modifications. Polypeptides may be branched, for example, as a result of ubiquitination, and they may be cyclic, with or without branching. Cyclic, branched, and branched cyclic polypeptides may result from post-translational natural processes or may be made by synthetic methods. Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent cross-links, formation of cysteine, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristylation, oxidation, pegylation, proteolytic processing, phosphorylation, prenylation, racemization, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination. (See, for instance, PROTEINS - STRUCTURE AND MOLECULAR PROPERTIES, 2nd Ed., T. E. Creighton, W. H. Freeman and Company, New York (1993); POST-TRANSLATIONAL COVALENT MODIFICATION OF PROTEINS, B. C. Johnson, Ed., Academic Press, New York, pgs. 1-12 (1983); Seifter et al. (1990) Meth. Enzymol. 182:626-646; Rattan et al., Ann. N.Y. Acad. Sci. 663:48-62.)

[00241] Therapeutic molecules that may be used as ligand moieties include, but are not limited to, hormones, matrix proteins, immunosuppressants, bronchodilators, cardiovascular agents, enzymes, CNS agents, neurotransmitters, receptor proteins or peptides, growth hormones, growth factors, antiviral peptides, fusogenic inhibitor peptides, cytokines, lymphokines, monokines, interleukins, colony stimulating factors, differentiation factors, angiogenic factors, receptor ligands, cancer-associated proteins, antineoplastics, viral peptides, antibiotic peptides, blood proteins, antagonist proteins, transcription factors, anti-angiogenic factors, antagonist proteins or peptides, receptor antagonists, antibodies, single chain antibodies and cell adhesion molecules. Different therapeutic molecules may be combined into a single fusion protein to produce a bi- or multi-functional therapeutic molecule. Different molecules may also be used in combination to produce a fusion protein with a therapeutic entity and a targeting entity. Therapeutic molecules can be fused directly to the stalk moiety of the present invention or, alternatively, fused to or inserted into a presenter moiety, such as a Tf moiety or albumin moiety.

[00242] Cytokines are soluble proteins released by cells of the immune system, which act nonenzymatically through specific receptors to regulate immune responses. Cytokines resemble hormones in that they act at low concentrations bound with high affinity to a specific receptor. The term "cytokine" is used herein to describe naturally occurring or recombinant proteins, analogs thereof, and fragments thereof which elicit a specific biological response in a cell which has a receptor for that cytokine. Cytokines preferably include interleukins such as interleukin-2 (IL-2) (GenBank Acc. No. S77834), IL-3 (GenBank Acc. No. M14743), IL-4 (GenBank Acc. No. M23442), IL-5 (GenBank Acc. No. J03478), IL-6 (GenBank Acc. No. M14584), IL-7 (GenBank Acc. No. NM_000880), IL-10 (GenBank Acc. No. NM_000572), IL-12 (GenBank Acc. No. AF180562 and GenBank Acc. No. AF180563), IL-13 (GenBank Acc. No. U10307), IL-14 (GenBank Acc. No. XM_170924), IL-15 (GenBank Acc. No. X91233), IL-16 (GenBank Acc. No. NM_004513), IL-17 (GenBank Acc. No. NM_002190) and IL-18 (GenBank Acc. No. NM_001562), hematopoietic factors such as granulocyte-macrophage colony stimulating factor (GM-CSF) (GenBank Acc. No. X03021), granulocyte colony stimulating factor (G-CSF) (GenBank Acc. No. X03656), platelet activating factor (GenBank Acc. No. NM_000437) and erythropoietin (GenBank Acc. No. X02158), tumor necrosis factors (TNF) such as TNFα (GenBank Acc. No. X02910), lymphokines such as lymphotoxin-α (GenBank Acc. No. X02911), lymphotoxin-β (GenBank Acc. No. L11016), leukoregulin, macrophage migration inhibitory factor (GenBank Acc. No. M25639), and neuroleukin (GenBank Acc. No. K03515), regulators of metabolic processes such as leptin (GenBank Acc. No. U43415), interferons such as interferon α (IFNα) (GenBank Acc. No. M54886), IFNβ (GenBank Acc. No. V00534), IFNγ (GenBank Acc. No. J00219), IFNω (GenBank Acc. No. NM_002177), thrombospondin 1 (THBS1) (GenBank Acc. No. NM_003246), THBS2 (GenBank Acc. No. L12350), THBS3 (GenBank Acc. No. L38969), THBS4 (GenBank Acc. No. NM_003248), and chemokines. Preferably, the modified transferrin-cytokine fusion protein of the present invention displays cytokine biological activity.

[00243] The term "hormone" is used herein to describe any one of a number of biologically active substances that are produced by certain cells or tissues and that cause specific biological changes or activities to occur in another cell or tissue located elsewhere in the body. Hormones preferably include proinsulin (GenBank Acc. No. V00565), insulin (GenBank Acc. No. NM_000207), growth hormone 1 (GenBank Acc. No. V00520), growth hormone 2 (GenBank Acc. No. F006060), growth hormone release factor (GenBank Acc. No. NM_021081), insulin-like growth factor I (GenBank Acc. No. M27544), insulin-like growth factor II (GenBank Acc. No. NM_000612), insulin-like growth factor binding protein 1 (IGFBP-1) (GenBank Acc. No. M59316), IGFBP-2 (GenBank Acc. No. X16302), IGFBP-3 (GenBank Acc. No. NM_000598), IGFBP-4 (GenBank Acc. No. Y12508), IGFBP-5 (GenBank Acc. No. M65062), IGFBP-6 (GenBank Acc. No. NM_002178), IGFBP-7 (GenBank Acc. No. NM_001553), chorionic gonadotropin β chain (GenBank Acc. No. NM_033142), chorionic gonadotropin α chain (GenBank Acc. No. NM_000735), luteinizing hormone β (GenBank Acc. No. X00264), follicle-stimulating hormone β (GenBank Acc. No. NM_000510), thyroid-stimulating hormone β (GenBank Acc. No. NM_000549), prolactin (GenBank Acc. No. NM_000948), proopiomelanocortin (GenBank Acc. No. V01510), corticotropin (ACTH), β-lipotropin, α-melanocyte stimulating hormone (α-MSH), γ-lipotropin, β-MSH, β-endorphin, and corticotropin-like intermediate lobe peptide (CLIP).

[00244] The term "hormone" also includes Glucagon-Like Peptide-1 (GLP-1), which is a gastrointestinal hormone that regulates insulin secretion belonging to the so-called enteroinsular axis, as well as exendin (e.g., exendin-4 and variants thereof), which is a GLP-1 receptor agonist.

[00245] The term "growth factor" is used herein to describe any protein or peptide that binds to a receptor to stimulate cell proliferation. Growth factors preferably include platelet-derived growth factor-α (PDGF-α) (GenBank Acc. No. X03795), PDGF-β (GenBank Acc. No. X02811), steroid hormones, epidermal growth factor (EGF) (GenBank Acc. No. NM_001963), fibroblast growth factors such as fibroblast growth factor 1 (FGF1) (GenBank Acc. No. NM_000800), FGF2 (GenBank Acc. No. NM_002006), FGF3 (GenBank Acc. No. NM_005247), FGF4 (GenBank Acc. No. NM_002007), FGF5 (GenBank Acc. No. M37825), FGF6 (GenBank Acc. No. X57075), FGF7 (GenBank Acc. No. NM_002009), FGF8 (GenBank Acc. No. AH006649), FGF9 (GenBank Acc. No. NM_002010), FGF10 (GenBank Acc. No. AB002097), FGF11 (GenBank Acc. No. NM_004112), FGF12 (GenBank Acc. No. NM_021032), FGF13 (GenBank Acc. No. NM_004114), FGF14 (GenBank Acc. No. NM_004115), FGF16 (GenBank Acc. No. AB009391), FGF17 (GenBank Acc. No. NM_003867), FGF18 (GenBank Acc. No. AF075292), FGF19 (GenBank Acc. No. NM_005117), FGF20 (GenBank Acc. No. NM_019851), FGF21 (GenBank Acc. No. NM_019113), FGF22 (GenBank Acc. No. NM_020637), and FGF23 (GenBank Acc. No. NM_020638), angiogenin (GenBank Acc. No. M11567), brain-derived neurotrophic factor (GenBank Acc. No. M61176), ciliary neurotrophic growth factor (GenBank Acc. No. X60542), transforming growth factor-α (TGF-α) (GenBank Acc. No. X70340), TGF-β (GenBank Acc. No. X02812), nerve growth factor-α (NGF-α) (GenBank Acc. No. NM_010915), NGF-β (GenBank Acc. No. X52599), tissue inhibitor of metalloproteinase 1 (TIMP1) (GenBank Acc. No. NM_003254), TIMP2 (GenBank Acc. No. NM_003255), TIMP3 (GenBank Acc. No. U02571), TIMP4 (GenBank Acc. No. U76456) and macrophage stimulating 1 (GenBank Acc. No. L11924).

[00246] The term "matrix protein" is used herein to describe proteins or peptides that are normally found in the extracellular matrix. These proteins may be functionally important for strength, filtration, or adhesion. Matrix proteins preferably include collagens such as collagen I (GenBank Acc. No. Z74615), collagen II (GenBank Acc. No. X16711), collagen III (GenBank Acc. No. X14420), collagen IV (GenBank Acc. No. NM_001845), collagen V (GenBank Acc. No. NM_000393), collagen VI (GenBank Acc. No. NM_058175), collagen VII (GenBank Acc. No. L02870), collagen VIII (GenBank Acc. No. NM_001850), collagen IX (GenBank Acc. No. X54412), collagen X (GenBank Acc. No. X60382), collagen XI (GenBank Acc. No. J04177), and collagen XII (GenBank Acc. No. U73778), laminin proteins such as LAMA2 (GenBank Acc. No. NM_000426), LAMA3 (GenBank Acc. No. L34155), LAMA4 (GenBank Acc. No. NM_002290), LAMB1 (GenBank Acc. No. NM_002291), LAMB3 (GenBank Acc. No. L25541), LAMC1 (GenBank Acc. No. NM_002293), nidogen (GenBank Acc. No. NM_002508), α-tectorin (GenBank Acc. No. NM_005422), β-tectorin (GenBank Acc. No. NM_058222), and fibronectin (GenBank Acc. No. X02761).

[00247] The term "blood proteins" is used herein to describe proteins traditionally defined as those sourced from plasma, many of which are now commonly produced by recombinant means, and includes, but is not limited to, native serum proteins, derivatives, fragments and mutants or variants thereof, blood clotting factors, derivatives, mutants, variants and fragments (including factors VII, VIII, IX, X), protease inhibitors (antithrombin 3, alpha-1 antitrypsin), urokinase-type plasminogen activator, immunoglobulins, von Willebrand factor and von Willebrand mutants, fibronectin, fibrinogen, thrombin and hemoglobin.

[00248] The term "enzyme" is used herein to describe any protein or proteinaceous substance which catalyzes a specific reaction without itself being permanently altered or destroyed. Enzymes preferably include coagulation factors such as F2 (GenBank Acc. No. XM_170688), F7 (GenBank Acc. No. XM_027508), F8 (GenBank Acc. No. XM_013124), F9 (GenBank Acc. No. NM_000133), F10 (GenBank Acc. No. AF503510) and others, matrix metalloproteinases such as matrix metalloproteinase 1 (MMP1) (GenBank Acc. No. NM_002421), MMP2 (GenBank Acc. No. NM_004530), MMP3 (GenBank Acc. No. NM_002422), MMP7 (GenBank Acc. No. NM_002423), MMP8 (GenBank Acc. No. NM_002424), MMP9 (GenBank Acc. No. NM_004994), MMP10 (GenBank Acc. No. NM_002425), MMP12 (GenBank Acc. No. NM_002426), MMP13 (GenBank Acc. No. X75308), MMP20 (GenBank Acc. No. NM_004771), adenosine deaminase (GenBank Acc. No. NM_000022), mitogen activated protein kinases such as MAPK3 (GenBank Acc. No. XM_055766), MAP2K2 (GenBank Acc. No. NM_030662), MAP2K1 (GenBank Acc. No. NM_002755), MAP2K4 (GenBank Acc. No. NM_003010), MAP2K7 (AF013588), and MAPK12 (NM_002969), kinases such as JNKK1 (GenBank Acc. No. U17743), JNKK2 (GenBank Acc. No. AF014401), JAK1 (M64174), JAK2 (NM_004972), and JAK3 (NM_000215), and phosphatases such as PPM1A (GenBank Acc. No. NM_021003) and PPM1D (GenBank Acc. No. NM_003620).

[00249] The term "transcription factors" is used herein to describe any protein or peptide involved in the transcription of protein-coding genes. Transcription factors may include Sp1, Sp2 (GenBank Acc. No. NM_003110), Sp3 (GenBank Acc. No. AY070137), Sp4 (GenBank Acc. No. NM_003112), NFYB (GenBank Acc. No. NM_006166), Hap2 (GenBank Acc. No. M59079), GATA-1 (GenBank Acc. No. NM_002049), GATA-2 (GenBank Acc. No. NM_002050), GATA-3 (GenBank Acc. No. X55122), GATA-4 (GenBank Acc. No. L34357), GATA-5, GATA-6 (GenBank Acc. No. NM_005257), FOG2 (NM_012082), Eryf1 (GenBank Acc. No. X17254), TRPS1 (GenBank Acc. No. NM_014112), NF-E2 (GenBank Acc. No. NM_006163), NF-E3, NF-E4, TFCP2 (GenBank Acc. No. NM_005653), Oct-1 (GenBank Acc. No. X13403), homeobox proteins such as HOXB2 (GenBank Acc. No. NM_002145), HOX2H (GenBank Acc. No. X16665), hairless homolog (GenBank Acc. No. NM_005144), mothers against decapentaplegic proteins such as MADH1 (GenBank Acc. No. NM_005900), MADH2 (GenBank Acc. No. NM_005901), MADH3 (GenBank Acc. No. NM_005902), MADH4 (GenBank Acc. No. NM_005359), MADH5 (GenBank Acc. No. AF009678), MADH6 (GenBank Acc. No. NM_005585), MADH7 (GenBank Acc. No. NM_005904), MADH9 (GenBank Acc. No. NM_005905), and signal transducer and activator of transcription proteins such as STAT1 (GenBank Acc. No. XM_010893), STAT2 (GenBank Acc. No. NM_005419), STAT3 (GenBank Acc. No. AJ012463), STAT4 (GenBank Acc. No. NM_003151), STAT5 (GenBank Acc. No. L41142), and STAT6 (GenBank Acc. No. NM_003153).

[00250] In yet another embodiment of the invention, the therapeutic molecule is a non-human or non-mammalian protein. For example, HIV gp120, HIV Tat, surface proteins of other viruses such as hepatitis, herpes, influenza, adenovirus and RSV, other HIV components, parasitic surface proteins such as malarial antigens, and bacterial surface proteins may be used. These non-human proteins may be used, for example, as antigens, or because they have useful activities. For example, the therapeutic molecule may be streptokinase, staphylokinase, asparaginase, urokinase, or other proteins with useful enzymatic activities.

[00251] In an alternative embodiment of the invention, the therapeutic molecule is a ligand-binding protein with biological activity. Such ligand-binding proteins may, for example, (1) block receptor-ligand interactions at the cell surface; or (2) neutralize the biological activity of a molecule in the fluid phase of the blood, thereby preventing it from reaching its cellular target. In some embodiments, the modified transferrin fusion proteins include a modified transferrin molecule fused to a ligand-binding domain of a receptor selected from the group consisting of, but not limited to, a low density lipoprotein (LDL) receptor, an acetylated LDL receptor, a tumor necrosis factor α receptor, a transforming growth factor β receptor, a cytokine receptor, an immunoglobulin Fc receptor, a hormone receptor, a glucose receptor, a glycolipid receptor, and a glycosaminoglycan receptor. In other embodiments, ligand-binding proteins include CD2 (M14362), CD3G (NM_000073), CD3D (NM_000732), CD3E (NM_000733), CD3Z (J04132), CD28 (NM_006139), CD4 (GenBank Acc. No. NM_000616), CD1A (GenBank Acc. No. M28825), CD1B (GenBank Acc. No. NM_001764), CD1C (GenBank Acc. No. NM_001765), CD1D (GenBank Acc. No. NM_001766), CD80 (GenBank Acc. No. NM_005191), GNB3 (GenBank Acc. No. AF501884), CTLA-4 (GenBank Acc. No. NM_005214), intercellular adhesion molecules such as ICAM-1 (NM_000201), ICAM-2 (NM_000873), and ICAM-3 (NM_002162), tumor necrosis factor receptors such as TNFRSF1A (GenBank Acc. No. X55313), TNFRSF1B (GenBank Acc. No. NM_001066), TNFRSF9 (GenBank Acc. No. NM_001561), TNFRSF10B (GenBank Acc. No. NM_003842), TNFRSF11B (GenBank Acc. No. NM_002546), and TNFRSF13B (GenBank Acc. No. NM_006573), and interleukin receptors such as IL2RA (GenBank Acc. No. NM_000417), IL2RG (GenBank Acc. No. NM_000206), IL4R (GenBank Acc. No. AF421855), IL7R (GenBank Acc. No. NM_002185), IL9R (GenBank Acc. No. XM_015989), and IL13R (GenBank Acc. No. X95302). Preferably, the ligand-binding protein fusion of the present invention displays the biological activity of the ligand-binding protein.

[00252] The term "cancer-associated proteins" is used herein to describe proteins or polypeptides whose expression is associated with cancer or the maintenance of controlled cell growth, such as proteins encoded by tumor suppressor genes or oncogenes. Cancer-associated proteins may include p16 (GenBank Acc. No. AH005371), p53 (GenBank Acc. No. NM_000546), p63 (GenBank Acc. No. NM_003722), p73 (GenBank Acc. No. NM_005427), BRCA1 (GenBank Acc. No. U14680), BRCA2 (GenBank Acc. No. NM_000059), CTBP interacting protein (GenBank Acc. No. U72066), DMBT1 (GenBank Acc. No. NM_004406), HRAS (GenBank Acc. No. NM_005343), NCYM (GenBank Acc. No. NM_006316), FGR (GenBank Acc. No. NM_005248), myb (GenBank Acc. No. AF104863), raf1 (GenBank Acc. No. NM_002880), erbB2 (GenBank Acc. No. NM_004448), VAV (GenBank Acc. No. X16316), c-fos (GenBank Acc. No. V01512), c-fes (GenBank Acc. No. X52192), c-jun (GenBank Acc. No. NM_002228), MAS1 (GenBank Acc. No. M13150), pim-1 (GenBank Acc. No. M16750), TIF1 (GenBank Acc. No. NM_003852), c-fms (GenBank Acc. No. X03663), EGFR (GenBank Acc. No. NM_005228), erbA (GenBank Acc. No. X04707), c-src tyrosine kinase (GenBank Acc. No. XM_044659), c-abl (GenBank Acc. No. M14752), N-ras (GenBank Acc. No. X02751), K-ras (GenBank Acc. No. M54968), jun-B (GenBank Acc. No. M29039), c-myc (GenBank Acc. No. AH001511), RB1 (GenBank Acc. No. M28419), DCC (GenBank Acc. No. X76132), APC (GenBank Acc. No. NM_000038), NF1 (GenBank Acc. No. M89914), NF2 (GenBank Acc. No. Y18000), and bcl-2 (GenBank Acc. No. M13994).

[00253] "Fusogenic inhibitor peptides" is used herein to describe peptides that show antiviral activity, anti-membrane fusion capability, and/or an ability to modulate intracellular processes, for instance, those involving coiled-coil peptide structures. Antiviral activity includes, but is not limited to, the inhibition of HIV-1, HIV-2, RSV, SIV, EBV, measles virus, influenza virus, or CMV transmission to uninfected cells. Additionally, the antifusogenic capability, antiviral activity or intracellular modulatory activity of the peptides merely requires the presence of the peptides and specifically does not require the stimulation of a host immune response directed against such peptides. Antifusogenic refers to a peptide's ability to inhibit or reduce the level of membrane fusion events between two or more moieties relative to the level of membrane fusion which occurs between said moieties in the absence of the peptide. The moieties may be, for example, cell membranes or viral structures, such as viral envelopes or pili. The term "antiviral peptide", as used herein, refers to the peptide's ability to inhibit viral infection of cells or some viral activity required for productive viral infection and/or viral pathogenesis, via, for example, cell-cell fusion or free virus infection. Such infection may involve membrane fusion, as occurs in the case of enveloped viruses, or some other fusion event involving a viral structure and a cellular structure. Fusogenic inhibitor peptides and antiviral peptides often have amino acid sequences that are derived from more than one viral protein (e.g., an HIV-1, HIV-2, RSV, and SIV-derived polypeptide).

[00254] Examples of fusogenic inhibitor peptides and antiviral peptides can be found in WO 94/2820, WO 96/19495, WO 96/40191, WO 01/64013 and US patents 6,333,395, 6,258,782, 6,228,983, 6,133,418, 6,093,794, 6,068,973, 6,060,065, 6,054,265, 6,020,459, 6,017,536, 6,013,263, 5,464,933, 5,346,989, 5,603,933, 5,656,480, 5,759,517, 6,245,737; 6,326,004, and 6,348,568; all of which are herein incorporated by reference.

[00255] Examples of other types of peptides include fragments of therapeutic proteins as described herein, in particular, fragments of human proteins that retain at least one activity of the parent molecule. Peptides that may be used to produce ligand moieties of the invention also include mimetic peptides and peptides that exhibit a biological activity of a therapeutic protein but differ in sequence or three-dimensional structure from a full-length therapeutic protein. As a non-limiting example, peptides include erythropoietin mimetic peptides disclosed by Johnson et al. (2000) Nephrol. Dial. Transplant. 15(9):1274-7, Kuai et al. (2000) J. Pept. Res. 56(2):59-62, Barbone et al. (1999) Nephrol. Dial. Transplant. 14 Supp 2:80-4, Middleton et al. (1999) J. Biol. Chem. 274(20):14163-9, Johnson et al. (1998) Biochemistry 37(11):3699-710, Johnson et al. (1997) Chem. Biol. 12:939-50, Wrighton et al. (1997) Nat. Biotechnol. 15(12):1261-5, Livnah et al. (1996) Science 273:464-71, and Wrighton et al. (1996) Science 273:458-64.

[00256] Therapeutic molecules also include allergenic proteins and digested fragments thereof. These include pollen allergens from ragweed, rye, June grass, orchard grass, sweet vernal grass, red top grass, timothy grass, yellow dock, wheat, corn, sagebrush, blue grass, California annual grass, pigweed, Bermuda grass, Russian thistle, mountain cedar, oak, box elder, sycamore, maple, elm, etc., dust mites, bee venom, food allergens, animal dander, and other insect venoms.

[00257] Other therapeutic molecules include microbial vaccines which include viral, bacterial and protozoal vaccines and their various components such as surface antigens. These include vaccines which contain glycoproteins, proteins or peptides derived from these proteins. Such vaccines are prepared from Staphylococcus aureus, Streptococcus pyogenes, Streptococcus pneumoniae, Neisseria meningitidis, Neisseria gonorrhoeae, Salmonella spp., Shigella spp., Escherichia coli, Klebsiella spp., Proteus spp., Vibrio cholerae, Campylobacter pylori, Pseudomonas aeruginosa, Haemophilus influenzae, Bordetella pertussis, Mycobacterium tuberculosis, Legionella pneumophila, Treponema pallidum, chlamydia, tetanus toxoid, diphtheria toxoid, influenza viruses, adenoviruses, paramyxoviruses (mumps, measles), rubella viruses, polio viruses, hepatitis viruses, herpes viruses, rabies virus, HIV-1, HIV-2, RSV and papilloma viruses.

[00258] Preferred fusion molecules may contain anti-HIV viral peptides, anti-RSV peptides, human growth hormone, α and/or β interferons, erythropoietin (EPO), EPO-like peptides, granulocyte-colony stimulating factor (GCSF), granulocyte-macrophage colony-stimulating factor (GMCSF), insulin, insulin-like growth factor (IGF), thrombopoietin, peptides corresponding to the CDR of an antibody, Islet Neogenesis Associated Protein (INGAP), calcitonin, angiostatin, endostatin, interleukin-2, growth hormone releasing factor, human parathyroid hormone, anti-tumor necrosis factor (TNF) peptides, interleukin-1 (IL-1) receptor and/or single chain antibodies.

[00259] Fusion proteins of the invention may also be prepared to include peptides or polypeptides derived from peptide libraries to screen for molecules with novel functions. Such peptide libraries may include those commercially or publicly available, e.g., from American Peptide Co. Inc., Cell Sciences Inc., Invitrogen Corporation, Phoenix Pharmaceuticals Inc., or United States Biological, as well as those produced by available technologies, e.g., bacteriophage and bacterial display libraries made using standard procedures.

[00260] In yet other embodiments of the invention, fusion proteins may be prepared by using therapeutic protein moieties known in the art and exemplified by the peptides and proteins currently approved by the Food and Drug Administration (www.fda.gov/cber/efoi/approve.htm) as well as PCT Patent Publication Nos. WO 01/79258, WO 01/77137, WO 01/79442, WO 01/79443, WO 01/79444 and WO 01/79480, all of which are herein incorporated by reference in their entirety.

[00261] Table 1 from PCT International Publication No. WO 03/020746, which is herein incorporated by reference, provides a non-exhaustive list of therapeutic proteins that correspond to a therapeutic protein portion, i.e. ligand moiety, of a fusion protein of the invention. The "Therapeutic Protein X" column discloses therapeutic protein molecules followed by parentheses containing scientific and brand names that comprise or alternatively consist of that therapeutic protein molecule or a fragment or variant thereof. "Therapeutic protein X" as used herein may refer either to an individual therapeutic protein molecule (as defined by the amino acid sequence obtainable from the CAS and Genbank accession numbers), or to the entire group of therapeutic proteins associated with a given therapeutic protein molecule disclosed in this column. The "Exemplary Identifier" column provides Chemical Abstracts Services (CAS) Registry Numbers (published by the American Chemical Society) and/or Genbank Accession Numbers (e.g., Locus ID, NP_XXXXX (Reference Sequence Protein), and XP_XXXXX (Model Protein) identifiers available through the National Center for Biotechnology Information (NCBI) webpage (www.ncbi.nlm.nih.gov)) that correspond to entries in the CAS Registry or Genbank database which contain an amino acid sequence of the protein molecule or of a fragment or variant of the therapeutic protein molecule. In addition, GenSeq Accession numbers and/or journal publication citations are given to identify the exemplary amino acid sequence for some polypeptides.

[00262] The summary pages associated with each of these CAS, Genbank and GenSeq Accession Numbers, as well as the cited journal publications, are available (e.g., by PubMed ID number (PMID)) and are herein incorporated by reference in their entirety. The "PCT/Patent Reference" column provides U.S. Patent numbers or PCT International Publication Numbers corresponding to patents and/or published patent applications that describe the therapeutic protein molecule, all of which are herein incorporated by reference in their entirety. The "Biological Activity" column describes biological activities associated with the therapeutic protein molecule. The "Exemplary Activity Assay" column provides references that describe assays which may be used to test the therapeutic and/or biological activity of a therapeutic protein or a transferrin fusion protein of the invention comprising a therapeutic protein X portion. These references are also herein incorporated by reference in their entirety. The "Preferred Indication Y" column describes diseases, disorders, and/or conditions that may be treated, prevented, diagnosed, or ameliorated by therapeutic protein X or a transferrin fusion protein of the invention comprising a therapeutic protein X portion. The present invention includes the therapeutic proteins provided in WO 03/020746, which is herein incorporated by reference in its entirety.

[00263] Examples

[00264] Example 1 - Preparation of GPI Anchor, hMUC1, and mTf Expression Cassette

[00265] The pREX0549 vector containing a mTf expression cassette (SEQ ID NO: 16) was digested with SalI and HindIII. Figure 3 provides a vector map for pREX0549. Primers P0922 and P0923 (SEQ ID NO: 7 and SEQ ID NO: 8) were annealed together and ligated into pREX0549 at the SalI/HindIII digestion site. The linker formed by P0922 and P0923 contained SpeI, HindIII, and XbaI restriction sites and was designed to accept a nucleic acid molecule coding for a GPI anchor and MUC1 stalk. The resulting vector, pREX0628, contained the mTf expression cassette with the P0922/P0923 linker.

[00266] pREX0628 was digested with HindIII and XbaI. Primers P0924 and P0925 (SEQ ID NO: 9 and SEQ ID NO: 10) were annealed to form the GPI anchor YIR019c. YIR019c was ligated into the digested pREX0628 to create vector pREX0634.

[00267] hMUC1 cDNA was RT-PCR amplified from a human breast tumor total RNA library (Clontech) using primers P0958 and P0959 (SEQ ID NO: 11 and SEQ ID NO: 12). The resulting cDNA was amplified with primers P1019 and P1020 (SEQ ID NO: 13 and SEQ ID NO: 14) to create SpeI and HindIII restriction sites. The resulting hMUC1 with SpeI and HindIII sites is provided in SEQ ID NO: 6.

[00268] pREX0634 was digested with SpeI and HindIII and the hMUC1 with SpeI and HindIII sites was ligated into the vector. The resulting vector, pREX0663, was used as the display expression cassette (mTf-MUC1-GPI).

[00269] pREX0663 was used to create high and low copy number yeast expression vectors. To create a high copy number yeast expression vector, the 4.1 kb display expression cassette was removed from pREX0663 by digesting the vector with NotI. The expression cassette was then ligated into a NotI digested and dephosphorylated pSAC35 vector, resulting in vector pREX0667 (Yeast Display Vector I).

[00270] A low copy number yeast expression vector was created by digesting pREX0663 with NotI and ligating the expression cassette into a NotI digested and dephosphorylated pREX0699, resulting in vector pREX0721 (Yeast Display Vector II).

[00271] The yeast expression vectors described above can be used to transform yeast cells and bacterial cells as known in the art. The vector can be expressed in yeast as is known in the art. Further, a collection of expressed transferrin fusion proteins capable of displaying a library of ligand moieties such as random peptides or CDRs can be created and used to screen for binding agents as known in the art.

[00272] Example 2 - 15-mer Random Library Construction

[00273] For selection of transferrin variants with novel binding characteristics, a random 15-mer library was constructed in the 289-290 amino acid position of transferrin through a PCR knitting procedure known in the art (see Martin and Smith (2006) Biochem J. 396(2): 287-95). A 15-mer library was designed even though only about 7 amino acids are usually needed to form a binding epitope, and a library of ~10⁹ clones covers only a small fraction of the designed library (3.3 x 10¹⁹ possible sequences). However, with a library size of 10⁹, a 15-mer library covers 6.4 times more 7-mers than a 7-mer library of the same size.
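The library-size figures in the preceding paragraph can be checked with a few lines of arithmetic. The following Python sketch assumes that all 20 amino acids are allowed at each of the 15 positions, which is an idealization. It reproduces only the theoretical diversity and the sampled fraction; it does not attempt to reproduce the 6.4-fold comparison, whose derivation is not given in the text.

```python
# Theoretical diversity of a random 15-mer peptide library versus the
# number of clones actually sampled (values taken from the paragraph above).
AMINO_ACIDS = 20          # assumption: all 20 residues allowed at every position
PEPTIDE_LENGTH = 15
LIBRARY_SIZE = 10**9      # approximate number of clones in the library

theoretical_diversity = AMINO_ACIDS ** PEPTIDE_LENGTH
fraction_sampled = LIBRARY_SIZE / theoretical_diversity

# Each 15-mer contains 15 - 7 + 1 overlapping 7-residue windows.
WINDOW = 7
windows_per_clone = PEPTIDE_LENGTH - WINDOW + 1

print(f"theoretical 15-mers: {theoretical_diversity:.2e}")
print(f"fraction sampled:    {fraction_sampled:.2e}")
print(f"7-mer windows/clone: {windows_per_clone}")
```

The number of overlapping windows per clone is the reason a longer random insert presents more short epitopes per clone than a 7-mer insert does.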

[00274] After obtaining a DNA fragment containing the BamHI/BspEI sequence of transferrin using P1174/P1227, two PCR reactions (each with a single primer, P1172 or P1173) were performed to obtain single strand DNAs. The ssDNAs were isolated and annealed to form a knitting 15-mer library. This operation ensured that the library maintained the original complexity of the synthetic oligonucleotide. The double strand knitting 15-mer library was further amplified using P1174/P1227 to obtain a sufficient quantity of DNA. The PCR product was purified, digested with BamHI/BspEI and cloned into proper plasmid vectors, e.g., pREX0995 (Figure 4) or pREX0667.

P1172 (SEQ ID NO.: 19)

289-290 15mer random peptide lib insertion knitting forward back fragment

C CAA CTA TTC AGC TCT CCT 567 567 567 567 567 567 567 567 567 567 567 567 567

567 567 CAT GGG AAG GAC CTG CTG TTT AAG

In order to introduce randomness at each position in the DNA sequence, a mixture of nucleotides (A, G, T and C) was incorporated at each position at a predetermined ratio according to LaBean and Kauffman (1993) Protein Sci. 2: 1249-54. The mixtures indicated below (denoted 5, 6 and 7 in the sequence above) minimize stop codon frequency and match the amino acid composition of natural proteins.

5: 13% T, 32% G, 20% C, 35% A

6: 24% T, 24% G, 22% C, 30% A

7: 37% T, 26% G, 37% C

P1173 (SEQ ID NO.: 20)

289-290 15mer random peptide lib knitting back primer for front fragment

AGGAGAGCTGAATAGTTGG

P1174 (SEQ ID NO.: 21)

289-290 15mer random peptide lib knitting forward primer for front fragment

CTGGATGCAGGTTTGGTGTATG

P1227 (SEQ ID NO.: 22)

289-290 15mer random peptide lib knitting back primer for back fragment

TCATGATCTTGGCGATGCAGTC
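The effect of the nucleotide mixtures 5, 6 and 7 used in primer P1172 on stop codon frequency can be estimated directly from the listed percentages. The Python sketch below assumes that each nucleotide is incorporated independently at the stated ratio and that mixtures 5, 6 and 7 occupy the first, second and third codon positions, as the repeated "567" notation suggests. It is an illustration of the arithmetic, not a description of how the mixtures were derived.

```python
# Per-position nucleotide frequencies for the doped codon "567",
# copied from the three mixtures listed for primer P1172.
MIX = {
    5: {"T": 0.13, "G": 0.32, "C": 0.20, "A": 0.35},  # first codon position
    6: {"T": 0.24, "G": 0.24, "C": 0.22, "A": 0.30},  # second codon position
    7: {"T": 0.37, "G": 0.26, "C": 0.37, "A": 0.00},  # third position, no A
}
STOP_CODONS = ("TAA", "TAG", "TGA")
RANDOM_CODONS = 15  # length of the random insert, in codons


def codon_probability(codon):
    """Probability of drawing this codon from the 5-6-7 mixture."""
    first, second, third = codon
    return MIX[5][first] * MIX[6][second] * MIX[7][third]


# TAA and TGA cannot occur because mixture 7 contains no A.
stop_per_codon = sum(codon_probability(c) for c in STOP_CODONS)
stop_free_insert = (1.0 - stop_per_codon) ** RANDOM_CODONS

print(f"stop probability per codon: {stop_per_codon:.4f}")
print(f"stop-free 15-codon inserts: {stop_free_insert:.1%}")
```

Under these assumptions only TAG can arise, at roughly 1% per codon, so most 15-codon inserts remain free of stop codons.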

[00275] Example 3 - Selection of Yeast Cells Displaying Flag

[00276] A yeast display system was established whereby the N-lobe of transferrin was displayed on the surface of yeast by fusion to a stalk region, huMUC1, and a GPI signal sequence. To demonstrate the utility of this system in binder selection, a Flag-tag sequence, DYKDDDDK (SEQ ID NO.: 23), or a random 15-mer peptide library was inserted at amino acid position 289 of the transferrin N-lobe. Yeast displaying the Flag-tagged transferrin N-lobe, pREX1012 (Figure 6), were then spiked into a pool of yeast displaying the transferrin N-lobe with random 15-mer peptides. From this mixed population only yeast displaying the Flag-tagged transferrin N-lobe were recovered by selection with an anti-Flag antibody.

[00277] To insert the Flag tag sequence into amino acid position 289 of transferrin, oligos incorporating the Flag tag sequence were synthesized and PCR knitted into the pREX0667 vector to generate pREX0759 (Figure 7). The BamHI/BspEI fragment of pREX0759 containing the Flag tag sequence was then used to replace the same restriction fragment of pREX0995 (Figure 4). The resulting plasmid, pREX1012, expresses the Flag-tagged transferrin N-lobe-MUC1-GPI fusion protein.

[00278] The 15-mer library was also cloned between the BamHI/BspEI sites of pREX0995. The preparation of the 15-mer library is described in Example 2. The ligation sample was transformed into E. coli DH5α, and the entire transformation mixture was plated onto 2 LB/Amp (50 μg/mL) agar plates. All colonies were collected and plasmid DNA was extracted using a Qiagen plasmid prep protocol with several miniprep columns.

[00279] Plasmid DNA for both pREX1012 and the 15-mer library was transformed into the Saccharomyces cerevisiae strain DS1101 cir°. A single colony of pREX1012 was inoculated into Buffered Minimal Medium with Sucrose (BMM/S) and cultured overnight. All colonies of the 15-mer library were collected and inoculated into BMM/S. The cell counts of the two overnight cultures were determined by haemocytometer and the following cell mixtures prepared:

(A) 10³ pREX1012 yeast cells mixed with 10⁹ 15-mer library yeast cells (10:10⁷)

(B) 10³ pREX1012 yeast cells mixed with 10⁸ 15-mer library yeast cells (100:10⁷)
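The ratios given in parentheses for mixtures (A) and (B) follow from the cell counts when both are normalized to 10⁷ library cells. A minimal Python check, using only the numbers stated above (the variable names are illustrative):

```python
from fractions import Fraction

# (Flag-displaying cells, library cells) for each mixture, from the text above.
MIXTURES = {"A": (10**3, 10**9), "B": (10**3, 10**8)}

flag_per_1e7 = {}
for name, (flag_cells, library_cells) in MIXTURES.items():
    ratio = Fraction(flag_cells, library_cells)
    # Normalize to the number of Flag cells per 10^7 library cells.
    flag_per_1e7[name] = ratio * 10**7
    print(f"{name}: {flag_per_1e7[name]} Flag cells per 10^7 library cells")
```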

[00280] The cell mixtures were incubated in 1 ml cell block solution (1x PBS/0.05% Tween-20, 1% BSA) on ice for 30 minutes. After centrifugation (30 seconds at 13000 rpm), the cell pellets were suspended in 1 ml wash solution (1x PBS, 0.5% BSA, 2 mM EDTA) with biotinylated anti-Flag antibody (Sigma Aldrich, 1:25 dilution) and incubated on ice for 30 minutes. The cells were washed twice with 1 ml wash solution and suspended in 800 μl (A) or 160 μl (B) of block solution. To the cell suspensions 200 μl (A) or 40 μl (B) of streptavidin MACS microbeads (Miltenyi Biotec) were added and incubated on ice for 30 minutes. Labeled cells were separated from unlabeled cells using an MS column according to the manufacturer's instructions (Miltenyi Biotec). The labeled cells were collected and plated onto BMM/S agarose plates and incubated at 30 °C until small colonies appeared.

[00281] A second round of selection was performed by collecting all the colonies from each plate and growing them overnight at 30 °C in 5 ml BMM/S. From these cultures, cells equivalent to 1.5 OD₆₀₀ were subjected to a further round of MACS separation as described above. The cells from this second round of screening were cultured overnight at 30 °C in 5 ml BMM/S. Yeast cell cultures before and after each selection were analyzed by FACS using anti-Flag monoclonal antibody (Sigma Aldrich) and APC-labeled goat anti-mouse detection antibody in a Bioanalyzer from Agilent Technologies. The FACS analysis was performed according to the manufacturer's instructions. The presence of Flag-tagged yeast became apparent after two rounds of MACS separation (Figure 8).

[00282] Example 4 - Aga1 Stalk Display

[00283] The DNA sequence for the core region of the yeast gene AGA1 (residues XXX-XXX) was obtained through PCR of S288c yeast genomic DNA using the following primers:

(a) CAGATCTAGAACAACCGCTATCAGCTCATTATCC (SEQ ID NO.: 24)

(b) CAGAAAGCTTAGTAGTGGAAACTTCTGTAGTG (SEQ ID NO.: 25)

[00284] A PCR product of ~1.5 kb was isolated and digested with XbaI/HindIII. The fragment was ligated into SpeI/HindIII digested pREX0855 (Figure 9) and transformed into E. coli DH5α. All resulting colonies were collected and plasmid DNA was isolated from the cells. The expression cassette was recovered by NotI digestion and ligated into pSAC35 to give the yeast expression vector.
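The cloning step above relies on two facts that can be checked from the primer sequences: primer (a) carries an XbaI site and primer (b) a HindIII site, and an XbaI-cut end can be ligated to a SpeI-cut end because both enzymes leave the same 5'-CTAG overhang. The Python sketch below uses the standard recognition sequences for the three enzymes; the helper names are illustrative and not part of the source.

```python
# Recognition sequences (5'->3'). All three enzymes cut after the first base,
# leaving a 4-nt 5' overhang equal to positions 2-5 of the site.
SITES = {"XbaI": "TCTAGA", "HindIII": "AAGCTT", "SpeI": "ACTAGT"}

PRIMERS = {
    "a": "CAGATCTAGAACAACCGCTATCAGCTCATTATCC",  # SEQ ID NO.: 24
    "b": "CAGAAAGCTTAGTAGTGGAAACTTCTGTAGTG",    # SEQ ID NO.: 25
}


def overhang(site):
    """Return the 4-nt 5' overhang left after cutting between bases 1 and 2."""
    return site[1:5]


# Which of the three sites occurs in each primer.
sites_in_primer = {
    name: [enzyme for enzyme, site in SITES.items() if site in seq]
    for name, seq in PRIMERS.items()
}
# XbaI and SpeI ends are ligatable if their overhangs are identical.
compatible = overhang(SITES["XbaI"]) == overhang(SITES["SpeI"])

print(sites_in_primer)
print("XbaI/SpeI ends compatible:", compatible)
```

Note that an XbaI end joined to a SpeI end recreates neither site, so the junction cannot be recut by either enzyme.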

[00285] Transformation into yeast and FACS with anti-Flag antibody were performed as previously described. Yeast colonies showed a high level of N-lobe display, approximately 10-fold higher than the comparable MUC1 stalk based construct (data not shown).

[00286] A single yeast colony was isolated and plasmid DNA was extracted from these yeast cells. The expression cassette was recovered after NotI digestion of the extracted plasmid DNA and ligated into NotI digested pREX0855 to give the plasmid pREX1087 (Figure 10), a pUC-based vector containing the Flag-N-lobe-Aga1-GPI expression cassette. A region of the DNA sequence corresponding to Aga1 was sequenced to confirm its identity. This expression cassette was also transferred back into pSAC35 to give pREX1106 (Figure 11).

[00287] Example 5 - Selection of Mammalian GPI Variants that Function in Yeast Cells

[00288] Mammalian GPI signals play roles that their yeast counterparts do not play, such as intracellular trafficking, transmission of transmembrane signals and clathrin-independent endocytosis (Biochem J. 1993, 294: 305-324). Yeast cells have not only a cell membrane but also a cell wall, which is absent from mammalian cells, and many yeast GPI signals have unique sequences that target proteins to the yeast cell wall (J Bacteriol, 1999, 181:3886-3889). The GPI of human placental alkaline phosphatase has been shown not to function at all in yeast cells (Mol Microbiol 1999, 34:247-256). As a means to obtain novel sequences that can attach expressed recombinant proteins to a yeast cell wall, a yeast display vector was constructed based on pREX0855 (Figure 9) but using the huMDP GPI sequence (DQLGGSCRTHYGYSSGASSLHRHWGLLLASLAPLVLCLSLL). This sequence was modified to incorporate four completely random codons (X) as well as several (underlined) rational modifications, XQXGGSXXTIGGYSGAASSLQRTIGLLLASLAPLVLASLL (SEQ ID NO.: 26), wherein X is any amino acid.

[00289] A yeast library expressing the fusion protein Flag tag-N-lobe-MUC1 stalk-GPI, in which the GPI sequence was modified as described above, was transformed into the Saccharomyces cerevisiae strain DS1101 cir°. Yeast cells with the fusion protein attached to the cell wall were isolated through MACS using a biotinylated anti-Flag antibody.

[00290] Two oligos, P2035 and P2036 (see below), were annealed and extended using Taq polymerase. The resulting DNA fragment was purified, digested with HindIII/XbaI and ligated into HindIII/XbaI digested pREX0855 (see below and Figure 9).

Primers

P2035 (SEQ ID NO.: 27)

CTACAAGCTTNNKCAANNKGGTGGTTCTNNKNNKACTATTGGTGGTTATTCTGGT

GCTGCTTCTTCCTTGCAGAGAACTATTG

P2036 (SEQ ID NO.: 28)

GATGTCTAGATTATTATAACAAAGAAGCTAAAACCAATGGAGCTAAAGAAGCCA

ATAACAAACCAATAGTTCTCTGCAAGGAAG

Annealed and extended product, shown as top strand, bottom strand and translation. P2035 supplies the top strand from the HindIII site through ...gaactattg; P2036 supplies the bottom strand from the XbaI site back to the region complementary to cttccttgcag...

HindIII
aagcttnnkc aannkggtgg ttctnnknnk actattggtg gttattctgg tgctgcttct
ttcgaannmg ttnnmccacc aagannmnnm tgataaccac caataagacc acgacgaaga
k l x q x g g s x x t i g g y s g a a s

tccttgcaga gaactattgg tttgttattg gcttctttag ctccattggt tttagcttct
aggaacgtct cttgataacc aaacaataac cgaagaaatc gaggtaacca aaatcgaaga
s l q r t i g l l l a s l a p l v l a s

XbaI
ttgttataat aatctaga (SEQ ID NO.: 29)
aacaatatta ttagatct
l l - - s r (SEQ ID NO.: 30)
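Annealing and extension of P2035 and P2036 depends on the 3' ends of the two oligos being complementary. The Python sketch below finds that overlap by comparing the 3' end of P2035 with the reverse complement of P2036. The fourth randomized codon of P2035 is taken as NNK, as in the duplex sequence shown above; the function names are illustrative.

```python
# Complement table for A/C/G/T plus the IUPAC codes used in these oligos
# (N = any base, K = G or T, M = A or C).
COMPLEMENT = str.maketrans("ACGTNKM", "TGCANMK")

P2035 = (
    "CTACAAGCTTNNKCAANNKGGTGGTTCTNNKNNKACTATTGGTGGTTATTCTGGT"
    "GCTGCTTCTTCCTTGCAGAGAACTATTG"
)
P2036 = (
    "GATGTCTAGATTATTATAACAAAGAAGCTAAAACCAATGGAGCTAAAGAAGCCA"
    "ATAACAAACCAATAGTTCTCTGCAAGGAAG"
)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_overlap(forward, reverse_primer):
    """Longest 3' suffix of `forward` that pairs with the 3' end of `reverse_primer`."""
    target = reverse_complement(reverse_primer)
    for length in range(min(len(forward), len(target)), 0, -1):
        if forward.endswith(target[:length]):
            return target[:length]
    return ""


overlap = longest_overlap(P2035, P2036)
print(len(overlap), overlap)
```

The overlapping region contains no degenerate positions, so annealing is not affected by the randomized codons.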

[00291] The ligation mixture was transformed into E. coli DH5α to obtain approx. 5 x 10⁵ colonies. All colonies were collected and plasmid DNA isolated. The plasmid DNA was digested with NotI to recover the expression cassette, which was cloned into pSAC35 to create the yeast expression library. This library was transformed into DS1101 cir° cells by electroporation. An overnight culture of the library was subjected to MACS using a biotinylated anti-Flag antibody. The isolated cells were immediately purified again through MACS with the same antibody. The resulting cells were plated onto BMM/S plates and 24 colonies were characterized by FACS and DNA sequencing analysis (see the Flag spike selection in Example 3).

[00292] Of the 22 clones that gave readable sequence, only 7 had full-length GPI anchors (Table 1). These showed varying levels of display, and the best of them displayed better than the pREX1003 vector expressing the same fusion protein with a yeast GPI anchor.

[00293] Table 1

[00294] Unexpectedly, 50% of the clones were truncated by a stop codon in one of the randomized codons, effectively deleting the GPI anchor signal. Of these 12 clones, two were determined to have display levels significantly better than pREX1003 and were found to contain a cysteine residue just prior to the stop codon (CQIGGS* (SEQ ID NO.: 34) and CQ* where * = stop codon) (Figure 12). In all likelihood these constructs were crosslinked into the cell wall via disulphide bonding to a free cysteine residue in a cell wall protein.
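For comparison with the observed truncation rate, the fraction of unselected clones expected to carry a stop codon can be estimated from the NNK design of the four randomized codons in P2035. The Python sketch below assumes equimolar incorporation at each N and K position and no selection bias, neither of which is stated in the source, and uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = [a + b + c for a in "ACGT" for b in "ACGT" for c in "GT"]
stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]
encoded = {CODON_TABLE[c] for c in nnk_codons} - {"*"}

RANDOMIZED_CODONS = 4  # number of NNK codons in P2035
p_stop = len(stop_codons) / len(nnk_codons)
p_truncated = 1.0 - (1.0 - p_stop) ** RANDOMIZED_CODONS

print(f"NNK codons: {len(nnk_codons)}, stops: {stop_codons}")
print(f"amino acids encoded: {len(encoded)}")
print(f"expected clones with >=1 stop: {p_truncated:.1%}")
```

Under these assumptions roughly 12% of unselected clones would be truncated, which is well below the 50% observed after selection.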

[00295] Although the present invention has been described in detail with reference to examples above, it is understood that various modifications can be made without departing from the spirit of the invention. Accordingly, the invention is limited only by the following claims. All cited patents, patent applications and publications referred to in this application are herein incorporated by reference in their entirety.
