COLL, Jose, L. (Biological Sciences-OE-304, 11200 Sw 8th Street Miami, Florida, 33199, US)
ALMAGRO, Juan, C. (14170 Sw 93rd Lane, Miami, Florida, 33186, US)
HERRERA, Rene, J. (Biological Sciences-OE-304, 11200 Sw 8th Street Miami, Florida, 33199, US)
WHAT IS CLAIMED IS:
1. A chimeric Spidofibroin (SpF) protein comprising one or more repetitive domains, or units, from a spider dragline silk protein, said SpF protein further comprising one or more insect amino acid sequences involved in transport and/or processing of expressed silk protein in the insect.
2. A polynucleotide encoding the SpF protein of claim 1.
3. The polynucleotide of claim 2 further comprising transcriptional regulatory elements derived from a gene sequence endogenous to an insect in which the SpF protein is to be expressed.
4. An expression construct comprising the polynucleotide of claim 2 or 3.
5. The expression construct of claim 4 further comprising a polynucleotide sequence encoding one or more inhibitory polynucleotides selected from the group consisting of an antisense polynucleotide, a ribozyme, an RNA polynucleotide which can form a triple helix, and an RNAi, said inhibitory polynucleotide specifically targeting an RNA molecule encoding a silk protein endogenous to an insect in which an SpF protein is expressed.
6. A method of producing a transgenic insect capable of expressing an SpF polypeptide and producing a silk product comprising a significant proportion of the SpF protein comprising the step of transforming insect larvae with the expression construct of claim 5.
TRANSGENIC PRODUCTION OF SPIDER SILK
The work described herein was funded by the United States Air Force (Grant Award No. FA9550-04-1-0034). Accordingly, the United States government may have certain rights to the subject matter disclosed and claimed.
FIELD OF THE INVENTION
In general, the invention relates to materials and methods for transgenic production of spider silk in an insect.
BACKGROUND
Spiders produce different silks (Gosline et al., (1999) J Exp Biol. 23:3295; Craig and Riekel (2002) Comp Biochem Physiol & Biochem Mol Biol. 133:493). Of particular interest is the silk that forms the dragline. This fiber has an unusual combination of strength and extensibility superior to currently known high-performance materials. Although the exceptional mechanical properties of dragline silk have been known for decades, mass production of the fiber from natural sources has not been feasible, in part because spiders cannot be domesticated due to their territorial and predatory nature.
Since the work of Xu and Lewis (1990) Proc Natl Acad Sci USA. 87:7120, who characterized the first cDNA of a spider dragline silk (Nephila clavipes), cDNAs from other N. clavipes fibroins as well as fibroins from other spiders such as A. diadematus (Hinman and Lewis (1992) J Biol Chem. 267:19320; Guerette et al., (1996) Science. 272:112-5; Craig and Riekel, 2002, supra) have been characterized. Overall, the translation products of these cDNAs and their direct amino acid sequencing show a similar pattern (Xu and Lewis, 1990, supra; Gosline et al., 1999, supra; Craig and Riekel, 2002, supra).
For example, A. diadematus fibroin-3 (ADF-3) (GenBank Accession No. U47855) is composed of 15 repetitive domains, each 31 to 44 amino acids long (Figure 1). Each domain contains two different amino acid sequence patterns. The first is a segment of 26 to 36 amino acids, rich in glycine (G), tyrosine (Y), proline (P) and glutamine (Q), that forms the consensus pattern 'GGYGPGS(GQQGP)n' (SEQ ID NO: 1), where n is 3 to 6. This segment is the so-called amorphous domain. The second is a short poly-alanine (poly-A) block of about 8 amino acids, with the consensus pattern 'ASAAAAAA' (SEQ ID NO: 2), the so-called crystalline domain (Gosline et al., 1999, supra). Alternating crystalline and amorphous domains make up a high-level unit, and the repetitive nature of these domains defines the three-dimensional structure of the fiber and the mechanical characteristics of the silk (Gosline et al., 1999, supra; Craig and Riekel, 2002, supra). The organization and structure of the dragline silk, however, are not known.
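By way of illustration only, the two-part repeat described above can be sketched in Python. The snippet expands the amorphous consensus GGYGPGS(GQQGP)n for n from 3 to 6, appends the poly-A block ASAAAAAA, and splits a unit back into its two segments. The function names are hypothetical and the joined unit is an idealized consensus, not a native ADF-3 repeat.

```python
import re

AMORPHOUS_PREFIX = "GGYGPGS"  # start of the amorphous consensus
AMORPHOUS_REPEAT = "GQQGP"    # repeated n times, with n from 3 to 6
CRYSTALLINE = "ASAAAAAA"      # poly-A consensus block

# Idealized unit: amorphous segment followed directly by the poly-A block.
UNIT_RE = re.compile(r"^(GGYGPGS(?:GQQGP){3,6})(ASAAAAAA)$")


def consensus_unit(n):
    """Return the idealized repeat unit for a given number of GQQGP copies."""
    if not 3 <= n <= 6:
        raise ValueError("n must be between 3 and 6")
    return AMORPHOUS_PREFIX + AMORPHOUS_REPEAT * n + CRYSTALLINE


def split_unit(unit):
    """Split an idealized unit into (amorphous, crystalline) segments."""
    match = UNIT_RE.match(unit)
    if match is None:
        raise ValueError("sequence does not follow the idealized consensus")
    return match.group(1), match.group(2)


if __name__ == "__main__":
    for n in range(3, 7):
        amorphous, crystalline = split_unit(consensus_unit(n))
        print(n, len(amorphous), len(crystalline), amorphous + crystalline)
```

Native repeats deviate from this pattern (substitutions, extra motifs before the poly-A block), so a strict regular expression of this kind is useful mainly for checking synthetic consensus units.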
At the C-terminal region of ADF-3, there are approximately 100 amino acids (Figure 1) that do not follow the crystalline and amorphous repetitive amino acid patterns. This region contains valine (V), leucine (L), threonine (T) and other amino acids that may form secondary structures similar to globular proteins (Beckwitt et al., (1998) Insect Biochem Mol Biol. 28:121). Thus, this domain may have a function other than giving structure to the silk fiber (Beckwitt et al., 1998, supra). The C-terminal domain of ADF-3 is also under different evolutionary constraints than its repetitive motifs (Beckwitt et al., 1998, supra). In the C-terminal domain of ADF-3, there are a few single-base silent substitutions (Beckwitt and Arcidiacono (1994) J Biol Chem. 269:6661; Beckwitt et al., 1998, supra). In contrast, the repetitive domain has indels and amino acid replacements. The evolutionary conservation of the C-terminal domain relative to the repetitive domains is consistent with the suggestion that the former plays a different function than the latter. This conservation also suggests an important function, possibly related to the transport and/or assembly of ADF-3 into the silk fiber. How or if this domain is inserted into the silk fiber is an open question.
While recombinant expression of spider silk genes in microorganisms has been used as an alternative means of synthesizing spider silk (Wong Po Foo and Kaplan (2002) Adv Drug Deliv Rev. 54:1131), no material has been obtained from these sources that has been able to compete with natural fiber (Kubik (2002) Angew Chem Int Ed Engl. 41:2721). Lazaris et al., (2002) Science. 295:472, expressed soluble recombinant ADF-3 in mammalian cells and spun fibers from a concentrated aqueous solution of ADF-3. The fibers were 10 to 40 micrometers in diameter and exhibited toughness and modulus values comparable to those of native dragline silks (Lazaris et al., 2002, supra). Nonetheless, the process of spinning fibers from concentrated protein solutions is very inefficient and thus, industrial production of spider silk fiber based on this procedure is far from being feasible.
The silk moth, Bombyx mori, presumably originated by domestication from the wild ancestral species Bombyx mandarina approximately five millennia ago and has since been a species of high economic value. It is estimated that more than 3000 strains of B. mori exist, including different geographical, inbred, ancestral (old) strains as well as mutant strains that carry numerous genotypic/phenotypic differences, some directly related to the quality and yield of the silk (Nagaraju (2000) Current Science 78:746).
Fib-H (GenBank Accession No. AF226688) is by far the main component (75%) of B. mori's silk by weight. Fib-H is responsible for the physicochemical and mechanical properties of the moth silk (Craig and Riekel, 2002, supra). Fib-H is a 391 kDa protein, linked by a disulfide bridge to another protein of 25 kDa, called Fibroin Light chain (Fib-L) (Zhou et al., (2000) Nucleic Acids Res. 28:2413; Inoue et al., (2000) J Biol Chem. 275:40517). Fib-H and Fib-L are associated with a third protein called fibrohexamerin (Fhx) of about 30 kDa (Inoue et al., 2000, supra), and together the three proteins form a high molecular weight complex called the elementary unit of fibroin. Efficient secretion of the vast amount of Fib-H from posterior silk gland cells into the lumen and the maintenance of solubility of Fib-H during lumen transport have been proposed to be facilitated by the formation of the elementary unit (Inoue et al., 2000, supra). The structural evolution of silk stored within the lumen of the silk gland of the silk moth (Silk I) to the highly crystalline spun fiber (Silk II) is not well understood.
Fib-H (the partial sequence of which is set out in Figure 2) is a repetitive protein like ADF-3; however, the repetitive domains of ADF-3 and Fib-H differ at the amino acid sequence level and in length. Fib-H has 12 crystalline domains of about 413 amino acids on average, instead of the polyalanine domains of about 8 amino acids found in ADF-3. In addition, the crystalline domains of Fib-H are mostly composed of glycine-alanine or glycine-tyrosine repeats. Poly-(glycine-alanine) repeats form compact β-sheet structures when compared to poly-alanine repeats. Moreover, the crystalline domains of Fib-H are separated by amorphous domains of 42 to 43 amino acids, which differ in amino acid composition with respect to the equivalent domains of ADF-3. All of these dissimilarities imply a different three-dimensional structure and hence distinct mechanical characteristics of the A. diadematus and B. mori silks (Gosline et al., 1999, supra; Craig and Riekel, 2002, supra).
Fib-H has two non-repetitive domains (Zhou et al., 2000, supra). The first 151 amino acids of the N-terminal region of Fib-H cannot be aligned with its repetitive domain. This region is known as the header and constitutes an independent globular unit (Zhou et al., (2001) Proteins. 44:119). The header of Fib-H has equivalent domains in the moths Galleria mellonella and Antheraea pernyi, thus suggesting a role in the transport and processing of Fib-H (Zhou et al., 2001, supra). The last 50 amino acids of the C-terminal domain are also non-repetitive, and it is through Cys 2241 (using amino acid numbering as set out in GenBank Accession No. AF226688) in the C-terminal domain that disulfide linkage between Fib-H and Fib-L occurs. Fib-H and Fib-L association is indispensable for the formation of the elementary unit of fibroin and is thus key for transport and assembly of Fib-H (Takei et al., (1987) J Cell Biol. 105:175; Mori et al., (1995) J Mol Biol. 251:217; Inoue et al., 2000, supra).
B. mori has been successfully used as a host to produce diverse heterologous proteins. For example, stable germline transgenic silk moths that produce green fluorescent protein (Tamura et al., (2000) Nat Biotechnol. 18:81) and human mini-collagen (Tomita et al., (2003) Nat Biotechnol. 21:52) have been reported. Such research efforts have generated a wealth of information in terms of vectors, transformation techniques and screening procedures for transgenic silk moths. To date, however, no one has been able to produce spider silk using a transgenic moth.
Thus there exists a need in the art to provide efficient production methods for spider dragline silk. Development of such methods, and materials, including expression constructs, for use therein provides economical means to produce commercially valuable fiber which has numerous applications.
SUMMARY OF THE INVENTION
In one aspect, chimeric Spidofibroin (SpF) proteins comprising one or more repetitive domains, or units, from a spider dragline silk protein are provided, wherein the SpF protein includes amino acid sequences from an insect which are involved in transport and/or processing of expressed silk protein in the insect. In another aspect, polynucleotides encoding an SpF protein are provided wherein the polynucleotides include one or more transcriptional regulatory elements derived from a gene sequence endogenous to an insect in which the SpF protein is to be expressed.
In another aspect, expression vectors comprising a polynucleotide encoding an SpF protein and transcriptional regulatory polynucleotide sequences are provided. In one embodiment, expression vectors further comprise inhibitory polynucleotide sequences which encode, for example, one or more antisense polynucleotides, one or more ribozymes, one or more RNA polynucleotides which can form a triple helix, and/or one or more RNAi molecules, the target of which is an RNA molecule encoding a silk protein endogenous to the insect in which the SpF is to be expressed. Inclusion of a polynucleotide encoding one or more of these inhibitory molecules provides a silk product with an increased proportion of SpF protein relative to an endogenous silk protein in the insect in which SpF is expressed.
Also provided are methods for producing a transgenic insect capable of expressing an SpF polypeptide and producing a silk product comprising a significant proportion of SpF protein.
DETAILED DESCRIPTION OF THE INVENTION
The references cited herein throughout, including GenBank files, to the extent that they provide exemplary procedural, sequence, or other details supplementary to those set forth herein, are all specifically incorporated herein by reference.
In one aspect, chimeric proteins designated Spidofibroins (SpFs) are provided comprising one or more repetitive domains from a spider dragline protein sequence in combination with insect amino acid sequences which are involved in transport and/or processing as well as disulfide linking between individual SpFs.
The strategy of combining domains in proteins to create new functions has been well documented as occurring in nature (Bashton and Chothia, (2002) J Mol Biol 315:927). Proteins are often assemblages of functionally and evolutionarily independent domains (Henikoff et al., (1997) Science 278:609). This modular architecture has conferred great flexibility for creating new specificities, altered recognition properties, and modified functions starting from a limited set of structurally different domains. In fact, it has been proposed that the tremendous diversity of spider silks is a consequence of an evolutionary process of modification and rearrangement of a few amino acid motifs.
Examples of man-made chimeric proteins have also been described. For instance, humanization of murine antibodies is a common practice (Tan et al., (2002) J. Immunol. 169:1119; Rojas et al., (2002) J. Biotechnol. 94:287). These chimeric proteins are a combination of murine and human antibody sequences. Chimeric antibodies have the specificity-determining regions of murine antibodies, thus retaining their antigen specificity, while the remaining domains are of human origin, thus resulting in proteins that are well tolerated when used in human therapy. In addition, specific chimeric antibodies can be obtained by combining a synthetic human light chain with a diverse set of murine heavy chains (Rojas et al., 2002, supra).
SpF proteins provided herein comprise one or more spider dragline protein repetitive domains (or units) in combination with insect amino acid sequences that enhance silk production, provide silk fiber stability and/or facilitate protein transport. Because spider dragline proteins from a number of species, as well as spider dragline proteins from the same species, are known to be similar in the basic pattern of repetitive domains, the worker of skill in the art will understand that repetitive domains from any of a number of spider dragline proteins can be utilized to produce SpF proteins in the materials and methods exemplified herein using ADF-3 amino acid sequences. Likewise, because the amino acid sequences for non-repetitive domains from various silk-producing insects are known in the art, e.g., the numerous strains of silkworm, the worker of ordinary skill will understand that SpF proteins comprising insect N- and C-terminal domains from any silk-producing insect or silk moth species or strain are also contemplated. In one aspect, therefore, the insect N- and C-terminal regions used to produce the desired SpF protein will be based on selection of the particular insect in which the SpF protein is to be expressed. In yet another aspect, one or both of the insect N- and C-terminal domains may be derived from a species or strain distinct from the insect in which the SpF is to be expressed. By way of example, an SpF is provided which comprises one or more repetitive domains of A. diadematus fibroin-3 (ADF-3) with the N- and C-terminal domains of the B. mori Fibroin-H (Fib-H). In one aspect, the SpF comprises a single repetitive unit from the ADF-3 amino acid sequence, wherein the repetitive unit is selected from the group consisting of:
GSGQQGPGQQGPGQQGPGQQGPYGPGASAAAAAA (SEQ ID NO: 3);
GGYGPGSGQQGPSQQGPGQQGPGGQGPYGPGASAAAAAA (SEQ ID NO: 4);
GGYGPGSGQQGPGGQGPYGPGSSAAAAAA (SEQ ID NO: 5);
GGNGPGSGQQGAGQQGPGQQGPGASAAAAAA (SEQ ID NO: 6);
GGYGPGSGQQGPGQQGPGGQGPYGPGASAAAAAA (SEQ ID NO: 7);
GGYGPGSGQGPGQQGPGGQGPYGPGASAAAAAA (SEQ ID NO: 8);
GGYGPGSGQQGPGQQGPGQQGPGGQGPYGPGASAAAAAA (SEQ ID NO: 9);
GGYGPGYGQQGPGQQGPGGQGPYGPGASAASAAS (SEQ ID NO: 10);
GGYGPGSGQQGPGQQGPGGQGPYGPGASAAAAAA (SEQ ID NO: 11);
GGYGPGSGQQGPGQQGPGQQGPGQQGPGGQGPYGPGASAAAAAA (SEQ ID NO: 12);
GGYGPGSGQQGPGQQGPGQQGPGQQGPGQQGPGQQGPGQQGPGQQGPGQQGPGGQGAYGPGASAAAGAA (SEQ ID NO: 13);
GGYGPGSGQQGPGQQGPGQQGPGQQGPGQQGPGQQGPGQQGPGQQGPYGPGASAAAAAA (SEQ ID NO: 14), and
GGYGPGSGQQGPGQQGPGQQGPGGQGPYGPGAASAA (SEQ ID NO: 15).
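For illustration, the repeat units listed above can be compared by length, residue counts, and number of GQQGP motifs. The following Python sketch uses only three of the listed units (SEQ ID NOs: 5, 7 and 15) as a subset; the helper name `summarize_unit` is hypothetical.

```python
from collections import Counter

# Subset of the ADF-3 repeat units listed above, keyed by SEQ ID NO.
UNITS = {
    5: "GGYGPGSGQQGPGGQGPYGPGSSAAAAAA",
    7: "GGYGPGSGQQGPGQQGPGGQGPYGPGASAAAAAA",
    15: "GGYGPGSGQQGPGQQGPGQQGPGGQGPYGPGAASAA",
}


def summarize_unit(sequence):
    """Return length, per-residue counts, and GQQGP motif count for one unit."""
    counts = Counter(sequence)
    return {
        "length": len(sequence),
        "counts": dict(counts),
        # Fraction of the unit made up of each residue type.
        "fractions": {aa: n / len(sequence) for aa, n in counts.items()},
        # str.count finds non-overlapping occurrences of the motif.
        "gqqgp_motifs": sequence.count("GQQGP"),
    }


if __name__ == "__main__":
    for seq_id, sequence in sorted(UNITS.items()):
        summary = summarize_unit(sequence)
        print(
            f"SEQ ID NO: {seq_id}: length={summary['length']}, "
            f"G={summary['counts'].get('G', 0)}, "
            f"GQQGP motifs={summary['gqqgp_motifs']}"
        )
```

A summary of this kind gives a quick view of how the units vary in length and glycine content before choosing which to combine.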
Alternatively, the SpF protein comprises a single copy of a consensus repetitive unit sequence deduced from the alignment set out in Figure 1, wherein the consensus sequence is
GGYGPGS(GQQGP)nASAAAAAA, wherein "n" is 3 to 6 (SEQ ID NO: 16).
In another aspect, the SpF protein comprises multiple repetitive units from the ADF-3 amino acid sequence selected from the group consisting of
GSGQQGPGQQGPGQQGPGQQGPYGPGASAAAAAA (SEQ ID NO: 3);
GGYGPGSGQQGPSQQGPGQQGPGGQGPYGPGASAAAAAA (SEQ ID NO: 4);
GGYGPGSGQQGPGGQGPYGPGSSAAAAAA (SEQ ID NO: 5);
GGNGPGSGQQGAGQQGPGQQGPGASAAAAAA (SEQ ID NO: 6);
GGYGPGSGQQGPGQQGPGGQGPYGPGASAAAAAA (SEQ ID NO: 7);
GGYGPGSGQGPGQQGPGGQGPYGPGASAAAAAA (SEQ ID NO: 8);
GGYGPGSGQQGPGQQGPGQQGPGGQGPYGPGASAAAAAA (SEQ ID NO: 9);
GGYGPGYGQQGPGQQGPGGQGPYGPGASAASAAS (SEQ ID NO: 10);
GGYGPGSGQQGPGQQGPGGQGPYGPGASAAAAAA (SEQ ID NO: 11);
GGYGPGSGQQGPGQQGPGQQGPGQQGPGGQGPYGPGASAAAAAA (SEQ ID NO: 12);
GGYGPGSGQQGPGQQGPGQQGPGQQGPGQQGPGQQGPGQQGPGQQGPGQQGPGGQGAYGPGASAAAGAA (SEQ ID NO: 13);
GGYGPGSGQQGPGQQGPGQQGPGQQGPGQQGPGQQGPGQQGPGQQGPYGPGASAAAAAA (SEQ ID NO: 14),
GGYGPGSGQQGPGQQGPGQQGPGGQGPYGPGAASAA (SEQ ID NO: 15), and
GGYGPGS(GQQGP)nASAAAAAA, wherein "n" is 3 to 6 (SEQ ID NO: 16).
SpF proteins comprising multiple ADF-3 repetitive domains include those wherein each repetitive unit has the same sequence, those wherein each repetitive domain has a different sequence, and combinations thereof. In one aspect, the number of repetitive domains is 2 to 100, 2 to 90, 2 to 80, 2 to 70, 2 to 60, 2 to 50, 2 to 40, 2 to 30, 2 to 25, 2 to 24, 2 to 23, 2 to 22, 2 to 21, 2 to 20, 2 to 19, 2 to 18, 2 to 17, 2 to 16, 2 to 15, 2 to 14, 2 to 13, 2 to 12, 2 to 11, 2 to 10, 2 to 9, 2 to 8, 2 to 7, 2 to 6, 2 to 5, 2 to 4, 2 to 3, or 2.
SpF proteins are provided which further comprise an N-terminal amino acid sequence derived from B. mori comprising the sequence
MRVKTFVILCCALQYVAYTNAMNDFDEDYFGSDVTVQSSNTTDEIIRDASGAVIEEQITTKKMQRE^IKNHGILGKNfEKMIKTFVITTDSDGNESIVEEDVLMKTLSDGTVAQSYVAADAGAYSQSGPYVSNSGYSTHQGYTSDFSTSAAV (SEQ ID NO: 17)
and a C-terminal amino acid sequence derived from B. mori comprising the sequence
YGQGAGSAASSVSSASSRSYDYSRRNVRKNCGIPRRQLVVKFRALPCVNC (SEQ ID NO: 18).
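The domain arrangement described above (insect N-terminal sequence, one or more spider repeat units, insect C-terminal sequence) can be sketched as a simple concatenation. In the following Python sketch the function name `assemble_spf` is hypothetical, the N-terminal argument is supplied by the caller, and the example uses only the first 21 residues of SEQ ID NO: 17 as a truncated stand-in. The check on residue letters is an added safeguard, not part of the disclosure.

```python
# C-terminal sequence of SEQ ID NO: 18, written without line-wrap spaces.
C_TERMINAL = "YGQGAGSAASSVSSASSRSYDYSRRNVRKNCGIPRRQLVVKFRALPCVNC"

STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def check_residues(name, sequence):
    """Reject characters outside the 20 standard one-letter amino acid codes."""
    bad = sorted(set(sequence) - STANDARD_RESIDUES)
    if bad:
        raise ValueError(f"{name} contains non-standard characters: {bad}")


def assemble_spf(n_terminal, repeat_units, c_terminal=C_TERMINAL):
    """Concatenate N-terminal domain, repeat units, and C-terminal domain."""
    if not repeat_units:
        raise ValueError("at least one repeat unit is required")
    check_residues("N-terminal domain", n_terminal)
    check_residues("C-terminal domain", c_terminal)
    for index, unit in enumerate(repeat_units, start=1):
        check_residues(f"repeat unit {index}", unit)
    return n_terminal + "".join(repeat_units) + c_terminal


if __name__ == "__main__":
    # Truncated stand-in for the N-terminal domain (illustration only).
    n_term_fragment = "MRVKTFVILCCALQYVAYTNA"
    # Idealized consensus unit with three GQQGP copies.
    unit = "GGYGPGS" + "GQQGP" * 3 + "ASAAAAAA"
    spf = assemble_spf(n_term_fragment, [unit] * 15)
    print(len(spf), spf[:40] + "...")
```

Passing a list of units allows identical repeats, different repeats, or a mixture of both.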
With respect to the amino acid sequences derived from B. mori, the worker of ordinary skill in the art will understand that various mutations are contemplated which, when introduced into the sequences set out in SEQ ID NOs: 17 and 18, modulate transport, processing, and/or disulfide linking to the extent that SpF production is improved. For example, modifications in the amino acid sequence derived from the B. mori C-terminal sequence can be effected which eliminate the potential for disulfide bond formation (i.e., a point mutation in the SpF-encoding polynucleotide to delete or alter the codon for Cys 2241). Alternatively, modifications to the sequence in the C-terminal region can be effected which increase the number of potential disulfide bonds, including, e.g., one or more point mutations in the SpF-encoding polynucleotide which insert one or more codons for cysteine, substitute one or more naturally-occurring codons for an equal number of cysteine codons, and/or combinations of changes of these types.
The invention further provides polynucleotides encoding any of the SpF proteins contemplated by the invention. In one aspect, SpF-encoding polynucleotides are derived from naturally-occurring sequences. In another aspect, the polynucleotides are synthesized using methods well known in the art. In yet another aspect, the SpF-encoding polynucleotides are a combination of naturally-occurring sequences and synthesized sequences. Obtaining a polynucleotide sequence from a naturally-occurring sequence is routine in the art. For example, the polymerase chain reaction (PCR) or reverse transcriptase PCR is employed wherein specific primers are synthesized and a target sequence amplified with cycles of annealing, elongation and denaturation. PCR is facilitated when, as here, the nucleotide sequence of a target polynucleotide is known. As another alternative, SpF-encoding sequences are constructed using overlapping oligonucleotides in a recursive "one-pot" polymerase chain reaction as described by Caimiro, et al., (1995) Biochemistry 34:6640.
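As a rough illustration of the overlapping-oligonucleotide approach mentioned above, the following Python sketch tiles a coding strand into oligonucleotides of alternating strand sense whose ends overlap. It is a simplified assumption of how such a set might be laid out, not the cited protocol: the oligonucleotide length, the overlap, the function names and the example template are all hypothetical, and no melting-temperature or secondary-structure checks are made.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_DNA_COMPLEMENT)[::-1]


def design_overlapping_oligos(top_strand, oligo_len=40, overlap=20):
    """Tile the top strand with oligos that alternate between strands.

    Even-numbered oligos are sense; odd-numbered oligos are antisense, so
    each oligo can anneal to its neighbours through the shared overlap.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        fragment = top_strand[start:start + oligo_len]
        is_sense = len(oligos) % 2 == 0
        oligos.append(fragment if is_sense else reverse_complement(fragment))
        if start + oligo_len >= len(top_strand):
            break
        start += step
    return oligos


def reassemble(oligos, overlap):
    """Rebuild the top strand from the oligo set (sanity check only)."""
    top = ""
    for index, oligo in enumerate(oligos):
        sense = oligo if index % 2 == 0 else reverse_complement(oligo)
        top += sense if index == 0 else sense[overlap:]
    return top


if __name__ == "__main__":
    # Hypothetical template; not a sequence from the disclosure.
    template = "GGTGGTTATGGTCCAGGTTCT" * 5
    oligos = design_overlapping_oligos(template)
    assert reassemble(oligos, overlap=20) == template
    for index, oligo in enumerate(oligos, start=1):
        print(index, len(oligo), oligo)
```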
An isolated polynucleotide sequence permits the worker of ordinary skill to thereafter incorporate changes in the nucleotide sequence. For example, it is understood in the art that different species often utilize codons for amino acids encoded by more than one codon in different proportions. Those codons which are more prevalent in polynucleotides of a given species are generally known as "preferred codons" for a particular amino acid for that species. Accordingly, regardless of the method for obtaining an SpF-encoding polynucleotide, modifications to the sequence are contemplated wherein, for example, existing codons in the sequence are modified to reflect a set of preferred codons for the insect species or strain in which the SpF protein is to be expressed. Methods whereby changes are introduced in a polynucleotide sequence are well known in the art, including, for example, site-directed mutagenesis.
In another aspect, a polynucleotide encoding an SpF protein is operatively linked to one or more insect regulatory polynucleotide elements. Such regulatory elements are in general identical or homologous to regulatory elements found in the insect in which the SpF protein is to be expressed. In one aspect, a regulatory element is derived from a region 5' to a gene in the host insect, a region 3' to a gene in the host insect, or a region within the genomic coding region (i.e., an internal regulatory element).
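The preferred-codon substitution described above can be illustrated with a short Python sketch. It derives the most frequent codon for each amino acid from a reference coding sequence of the host and then recodes a second sequence without changing the encoded protein. The function names and the toy sequences are hypothetical; real codon preferences would be taken from host coding sequences, which are not reproduced here.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def codons(cds):
    """Yield successive complete codons from a coding sequence."""
    for i in range(0, len(cds) - len(cds) % 3, 3):
        yield cds[i:i + 3]


def translate(cds):
    """Translate a coding sequence (A, C, G, T only) into one-letter codes."""
    return "".join(CODON_TABLE[codon] for codon in codons(cds))


def preferred_codons(reference_cds):
    """Return the most frequent codon per amino acid in the reference."""
    usage = defaultdict(Counter)
    for codon in codons(reference_cds):
        usage[CODON_TABLE[codon]][codon] += 1
    return {aa: counts.most_common(1)[0][0] for aa, counts in usage.items()}


def recode(cds, preferred):
    """Swap each codon for the preferred synonymous codon, when one is known."""
    return "".join(
        preferred.get(CODON_TABLE[codon], codon) for codon in codons(cds)
    )


if __name__ == "__main__":
    host_reference = "GGTGGTGGAGCTGCTGCA"  # hypothetical host coding sequence
    spider_cds = "GGAGCAGGG"               # hypothetical input, encodes G-A-G
    preferred = preferred_codons(host_reference)
    recoded = recode(spider_cds, preferred)
    assert translate(recoded) == translate(spider_cds)
    print(preferred, recoded)
```

Amino acids absent from the reference keep their original codons, so the encoded protein is unchanged by construction.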
For example and without limitation, SpF-encoding polynucleotides are contemplated which include all or part of the 5' and/or 3' regulatory sequences from Fib-H as set out in GenBank Accession No. AF226688 (the sequence of which is incorporated by reference herein). Additionally, SpF-encoding polynucleotides including all or part of an internal regulatory sequence from the single intron of Fib-H, e.g., base pairs (bp) 62480 to 63451 in GenBank Accession No. AF226688, are contemplated. By way of example, polynucleotides including the Fib-H intron include, e.g., those containing the full intron sequence, those including the first 450 bp, and those including 100 bp of the intron.
The complete sequence of the Fib-H gene (fib-H) has recently been fully characterized (Zhou et al., 2000, supra). The gene is 16,790 bp in length and consists of one intron and two exons. The promoter is located at the immediate 5' flanking sequence and comprises 150 bp (Hui et al., (1990) J. Mol Biol. 213:395-8). This region contains a cluster of homeodomain binding sites that interact with silk gland factors and induce tissue-specific expression (Hui et al., 1990, supra). The first exon of fib-H is 67 bp long and includes a 25 bp untranslated domain and a 42 bp coding region of 14 amino acids. The second exon is 15,750 bp long and encodes 98% of the amino acid sequence of Fib-H. The 3' flanking region has the proper termination signals (Mori et al., 1995, supra). The single intron is about 970 bp long. The single intron of fib-H is also typical of spider dragline fibroins (Craig and Riekel, 2002, supra). In fib-H, the intron contains a truncated sequence of the repetitive element Bm1 (Adams et al., (1986) J Mol Biol. 187:465) and multiple octamer-like AT-rich elements. These elements are located in the 5' half of the intron and bind several fibroin-modulator-binding proteins (FMBPs). Such interactions may control the expression of fib-H (Takiya et al., (1997) Biochem J. 321:645).
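The gene-structure figures given above can be cross-checked with simple arithmetic. The following Python sketch assumes that the intron coordinates 62480 to 63451 are inclusive, 1-based positions; the constant and function names are hypothetical.

```python
# Figures quoted in the description of fib-H.
EXON1_BP = 67
EXON1_UNTRANSLATED_BP = 25
EXON1_CODING_BP = 42
EXON1_CODING_RESIDUES = 14
EXON2_BP = 15750
INTRON_START, INTRON_END = 62480, 63451  # assumed inclusive coordinates
REPORTED_GENE_BP = 16790


def inclusive_length(start, end):
    """Length of a span whose first and last positions are both counted."""
    return end - start + 1


intron_bp = inclusive_length(INTRON_START, INTRON_END)
gene_bp = EXON1_BP + intron_bp + EXON2_BP

# Exon 1 is the untranslated domain plus the coding region.
assert EXON1_UNTRANSLATED_BP + EXON1_CODING_BP == EXON1_BP
# 42 coding bp correspond to 14 codons.
assert EXON1_CODING_BP // 3 == EXON1_CODING_RESIDUES
assert EXON1_CODING_BP % 3 == 0

if __name__ == "__main__":
    print(f"intron length: {intron_bp} bp")
    print(f"exon 1 + intron + exon 2: {gene_bp} bp")
    print(f"difference from reported gene length: {REPORTED_GENE_BP - gene_bp} bp")
```

Under the inclusive-coordinate assumption the intron spans 972 bp, consistent with "about 970 bp", and the three parts sum to within 1 bp of the stated gene length.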
The polynucleotides so constructed are then subcloned into an expression vector which can be used to create a transgenic insect capable of expressing the encoded chimeric protein. Subcloning techniques are well known and routine in the art. In one embodiment, a baculovirus expression system is utilized for subcloning and the resulting construct is used to transfect silk moths. The piggyBac (pBac) system has been demonstrated to yield stable transgenic silk moths expressing heterologous proteins such as human collagen (Tomita et al., 2003, supra). Since the SpF has all the needed elements for expression, transport and assembly of the protein into the B. mori silk fiber, SpF will compete with endogenous moth Fib-H for its assemblage into the silk fiber.
In view of the fact that the insect host possesses the ability to express endogenous silk proteins, it is desirable to reduce endogenous silk expression so that silk produced by the insect will have a higher proportion of SpF protein and therefore maintain characteristics more closely related to naturally-occurring spider silk. In order to increase the proportion of SpF protein in the silk, the SpF-encoding construct used to transform the insect host also includes one or more polynucleotide sequences which encode an inhibitor of endogenous insect silk protein expression. As used herein, "specifically modulate(s)" or "specifically hybridize(s)" is intended to mean an inhibitor of the invention recognizes only a polynucleotide encoding the desired target.
In one aspect, the SpF-encoding construct also encodes one or more anti-sense polynucleotides which specifically hybridize to polynucleotides encoding an endogenous insect silk protein. By way of example and without limitation, antisense technology is discussed with reference to Fib-H. Polynucleotides encoding full length and fragment anti-sense are contemplated as part of the SpF construct. The worker of ordinary skill will appreciate that fragment anti-sense molecules include those which specifically hybridize to Fib-H RNA as well as those which hybridize to RNA encoding variants of the Fib-H family of proteins. In one aspect, the expressed antisense binds to the Fib-H target nucleotide sequence in the cell and prevents translation of the target sequence. In one embodiment, the invention provides methods using antisense oligonucleotides which negatively regulate Fib-H expression via hybridization to messenger RNA (mRNA) encoding Fib-H. Antisense oligonucleotides at least 5 to about 50 nucleotides in length, including all lengths (measured in number of nucleotides) in between, which specifically hybridize to mRNA encoding Fib-H and inhibit Fib-H protein expression, are contemplated. It is understood in the art that, while antisense oligonucleotides that are perfectly complementary to a region in the target polynucleotide possess the highest degree of specific inhibition, antisense oligonucleotides that are not perfectly complementary, i.e., those which include a limited number of mismatches with respect to a region in the target polynucleotide, also retain high degrees of hybridization specificity and therefore also can inhibit expression of the target mRNA. Accordingly, the invention contemplates SpF constructs encoding antisense oligonucleotides that are perfectly complementary to a target region in a polynucleotide encoding Fib-H, as well as antisense oligonucleotides that are not perfectly complementary (i.e., include mismatches) to a target region in the target polynucleotide to the extent that the mismatches do not preclude specific hybridization to the target region in the target polynucleotide. Preparation and use of antisense compounds is described, for example, in U.S. Patent No. 6,277,981, the entire disclosure of which is incorporated herein by reference (see also, Gibson (Ed.), (1997) Antisense and Ribozyme Methodology).
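The relationship between a target region, a perfectly complementary antisense oligonucleotide, and an oligonucleotide carrying mismatches can be sketched as follows. This Python example is illustrative only: the function names are hypothetical, the mRNA string is a made-up toy sequence rather than Fib-H mRNA, and the upper bound of "about 50" nucleotides is treated as a hard limit of 50 for simplicity.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")

MIN_LENGTH = 5   # lower bound stated in the description
MAX_LENGTH = 50  # "about 50" treated here as a hard limit (assumption)


def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return seq.translate(_RNA_COMPLEMENT)[::-1]


def antisense_oligo(mrna, start, length):
    """Perfectly complementary antisense for mrna[start:start + length]."""
    if not MIN_LENGTH <= length <= MAX_LENGTH:
        raise ValueError("length must be between 5 and 50 nucleotides")
    target = mrna[start:start + length]
    if start < 0 or len(target) != length:
        raise ValueError("target region falls outside the mRNA")
    return reverse_complement_rna(target)


def count_mismatches(oligo, mrna, start):
    """Count positions where the oligo differs from the perfect antisense."""
    perfect = antisense_oligo(mrna, start, len(oligo))
    return sum(a != b for a, b in zip(oligo, perfect))


if __name__ == "__main__":
    toy_mrna = "AUGGCUAGCUUAGGCAUCGA"  # hypothetical sequence
    perfect = antisense_oligo(toy_mrna, start=0, length=10)
    variant = perfect[:-1] + ("A" if perfect[-1] != "A" else "C")
    print(perfect, count_mismatches(perfect, toy_mrna, 0))
    print(variant, count_mismatches(variant, toy_mrna, 0))
```

Whether a given number of mismatches still permits specific hybridization is an experimental question that this count does not answer.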
In another aspect, SpF-encoding constructs also encode one or more ribozymes which target polynucleotides encoding endogenous insect silk proteins and thereby specifically modulate expression of the endogenous protein. For a review of ribozyme technology, see Gibson and Shillitoe, (1997) Mol. Biotech. 7:125-137. Ribozymes are utilized to inhibit translation of, for example, Fib-H mRNA in a sequence specific manner through (i) the hybridization of a complementary RNA to a target mRNA and (ii) cleavage of the hybridized mRNA through nuclease activity inherent to the complementary strand. Polynucleotides encoding ribozymes are identified by empirical methods or are specifically designed based on accessible sites on the target mRNA (Bramlage, et al., (1998) Trends in Biotech 16:434-438). Delivery of ribozymes is accomplished as part of the SpF-encoding construct.
Ribozymes can specifically modulate expression of an endogenous insect silk protein when designed to be complementary to regions unique to a polynucleotide encoding the target protein. Similarly, ribozymes can be designed to modulate expression of all or some of a family of proteins (e.g., a specific Fib-H-encoding polynucleotide and variants thereof). Ribozymes of this type are designed to recognize polynucleotide sequences conserved in all or some of the polynucleotides which encode the family of proteins.
Ribozyme inhibitors that include a nucleotide region which specifically hybridizes to a target polynucleotide and an enzymatic moiety that digests the target polynucleotide are also contemplated. Specificity of ribozyme inhibition is related to the length of the antisense region and the degree of complementarity of the antisense region to the target region in the target polynucleotide. Ribozyme inhibitors are contemplated comprising antisense regions from 5 to about 50 nucleotides in length, including all nucleotide lengths in between, that are perfectly complementary, as well as antisense regions that include mismatches to the extent that the mismatches do not preclude specific hybridization to the target region in, for example, a Fib-H-encoding polynucleotide. Because ribozymes are enzymatic, a single molecule is able to direct digestion of multiple target molecules, thereby offering the advantage of being effective at lower concentrations than non-enzymatic antisense oligonucleotides. Preparation and use of ribozyme technology is described in U.S. Patent Nos. 6,696,250, 6,410,224, and 5,225,347, the entire disclosures of which are incorporated herein by reference.
In yet another aspect, the SpF-encoding construct also includes polynucleotide sequences encoding one or more oligonucleotides which are capable of triple helix formation which specifically modulates expression of a target protein. For a review, see Lavrovsky, et al., (1997) Biochem. Mol. Med. 62:11-22. Triple helix formation is accomplished using sequence specific oligonucleotides which hybridize to double stranded DNA in the major groove as defined in the Watson-Crick model. Hybridization of a sequence specific oligonucleotide can thereafter modulate activity of DNA-binding proteins, including, for example, transcription factors and polymerases. Target sequences for triple helix hybridization include promoter and enhancer regions that regulate transcription of, for example, insect Fib-H.
In yet another aspect, interference RNA (RNAi) techniques are employed to sequester, through specific hybridization, polynucleotides encoding endogenous insect silk proteins and render the sequences untranslatable. This technique is described in detail in the examples below.
Additional aspects and details of the invention will be apparent from the following examples, which are intended to be illustrative rather than limiting.
EXAMPLE 1
DESIGN OF SpF
To determine the role of the intron in the expression and regulation of the SpF, three variants of the SpF are created. The first variant, called SpF-1, has the complete intron of the native fib-H. The second variant, called SpF-2, is equivalent to the cDNA and includes no fib-H intron sequences. The third variant, called SpF-3, has only the 5' half of the fib-H intron plus the branch site sequence (at the end of the intron).
These three SpF variants are used to determine how different repetitive motifs, i.e., different length, amino acid composition, or consensus motifs of N. clavipes fibroin, correlate with the mechanical characteristics of the silk fiber, and to assess the relevance of modifications in the N- and C-terminal domains to silk production.
In one aspect, the repetitive motifs of ADF-3 are fully synthesized using 15 cycles of digestion-ligation-sequencing to yield 15 identical repeats of the ADF-3 consensus repetitive motif. By using synthetic repetitive motifs of ADF-3, the natural variability in length and sequence of the repetitive motifs is avoided. Another advantage of synthesizing the repetitive motifs is that the preferred codons of B. mori (Zhou et al., 2000, supra) are incorporated into the ADF-3-encoding polynucleotide.
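The design idea in the preceding paragraph, back-translating a repeat unit with host-preferred codons and concatenating identical copies, can be sketched in a few lines of Python. This is a minimal illustration only. The codon table below is a placeholder (each codon does encode the amino acid shown, but the choices are not taken from Zhou et al.), and the peptide GSGQQGPGQQGPG, which Example 3 identifies as part of the amorphous domain of ADF-3, is used merely as a stand-in for the consensus repetitive motif.

```python
# Placeholder codon choices: valid codons for each residue, but NOT the
# B. mori preferences reported by Zhou et al. (2000). Substitute real values.
ASSUMED_PREFERRED_CODONS = {"G": "GGT", "S": "TCT", "Q": "CAA", "P": "CCT"}


def back_translate(peptide, codon_table):
    """Return a DNA sequence encoding `peptide`, one codon per residue."""
    return "".join(codon_table[residue] for residue in peptide)


def build_repeat_array(unit_dna, n_repeats):
    """Concatenate `n_repeats` identical copies of a repeat-unit sequence."""
    return unit_dna * n_repeats


# Stand-in repeat unit (peptide named in Example 3), not the full consensus motif.
unit_peptide = "GSGQQGPGQQGPG"
unit_dna = back_translate(unit_peptide, ASSUMED_PREFERRED_CODONS)
repeat_array = build_repeat_array(unit_dna, 15)

print(len(unit_dna), "bp per unit;", len(repeat_array), "bp for 15 repeats")
```

Because every copy is generated from the same unit, the array has none of the natural length or sequence variability noted above; only the codon table needs to change to retarget a different host.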
In another aspect, one or more ADF-3 repetitive domains are amplified using PCR. Primers were designed complementary to the polynucleotide sequences encoding each end of the repeated domain.
All Fib-H-encoding sequences are PCR amplified from genomic DNA using specific primers. The primers are designed based on the fib-H sequence in GenBank Accession No. AF226688 (Zhou et al., 2000, supra), corresponding to bases in this sequence at 62091 to 62110 and 62479 to 62460 for the first pair, 63451 to 63474 and 63861 to 63842 for the second pair, and 79048 to 79069 and 79588 to 79567 for the third pair. Genomic DNA is extracted from a B. mori strain by standard protocols (Sambrook and Russell (2001) Molecular cloning: a Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY) and the same strain is used in the transgenesis experiments (described below). Characterization of fib-H has been carried out with DNA extracted from the strain p50 (Zhou et al., 2000, supra).
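As a quick arithmetic check on the primer coordinates listed above, the following Python sketch computes the length of each primer and of each expected amplicon. It assumes only that the coordinates are 1-based and inclusive and that each reverse primer is written from its 5' end (the higher coordinate) to its 3' end; the dictionary and function names are illustrative.

```python
# Primer coordinates from the text: (forward start, forward end),
# (reverse 5' end, reverse 3' end), all on GenBank AF226688.
PRIMER_PAIRS = {
    "pair 1": ((62091, 62110), (62479, 62460)),
    "pair 2": ((63451, 63474), (63861, 63842)),
    "pair 3": ((79048, 79069), (79588, 79567)),
}


def primer_length(coords):
    """Length of a primer given inclusive start and end coordinates."""
    start, end = coords
    return abs(end - start) + 1


def amplicon_length(forward, reverse):
    """Amplicon spans from the forward 5' end to the reverse 5' end, inclusive."""
    return reverse[0] - forward[0] + 1


for name, (fwd, rev) in PRIMER_PAIRS.items():
    print(name, primer_length(fwd), primer_length(rev), amplicon_length(fwd, rev))
```

Under these assumptions the three amplicons come out at 389, 411, and 541 bp, and the second value agrees with the 411 bp stated for the second fragment of the N-terminal domain.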
The 5' flanking region containing the fib-H promoter (nucleotides 62091 to 62479 of fib-H) was PCR-amplified together with the first fragment of the N-terminal domain. This fragment encodes the first 14 amino acids of Fib-H. The second fragment of the N-terminal domain is 137 amino acids long (411 bp) and is located from position 63451 to 63861. The C-terminal domain is 50 amino acids long (150 bp) and is encoded in the fib-H gene at 79048 to 79598. This fragment was amplified together with the 3' flanking sequence of fib-H. The N. clavipes 765 nucleotide sequence encoding the C-terminal region was obtained by RT-PCR using a 5' primer corresponding to nucleotides 67 to 87 as set out in the sequence of Xu and Lewis, 1990, and a 3' primer corresponding to nucleotides 2131 to 2154 in the same sequence. The PCR products were purified using agarose gel and cloned into the pCR-II® TOPO vector. White colonies from blue/white screening were sequenced as described below.
The fib-H intron is 970 bp long, located at position 62480 to 63450 of fib-H (Zhou et al., 2000, supra).
PCR products were agarose gel purified using the Wizard PCR Purification Kit (Promega). The purified DNA was ligated into the pCR-II® TOPO linearized vector provided by the TOPO TA Cloning® Kit (Invitrogen). One Shot® TOP10 chemically competent cells are transformed according to Invitrogen's specifications, and plated on LB plates containing 100 μg/ml of ampicillin. Blue/white screening is employed for the identification of recombinant colonies. Individual white colonies are inoculated in 5 ml of LB (Sambrook and Russell, 2001, supra) containing ampicillin at a 100 μg/ml concentration and grown overnight at 37°C. The overnight cultures are centrifuged and the plasmids are purified from the pellet using Wizard 373 DNA Purification Kit (Promega). The isolated plasmids are linearized with Hind III. The pGEM®-3Zf(+) vector (Promega) is digested under the same conditions and used as a standard at serial concentrations for the mass estimation of the enzyme-restricted recombinant plasmids in 1X TAE, 1% agarose gel electrophoresis.
Several plasmids (typically 10) having an insert of the proper size are sequenced using the ABI Prism Big Dye Terminator 1.1 Cycle Sequencing Kit. The 10 μl reactions contained 1.0 μl of terminator ready reaction premix from the kit, 1.5 μl of 5X buffer, 1.0 μl of M13 forward or reverse primer (1.78 pmol/μl, Promega), and approximately 100 ng of purified plasmid DNA. The reactions are run in a 9600 Perkin-Elmer thermocycler for 30 cycles, as per the ABI Prism protocol specifications. Sequences are generated using the ABI Prism 3100 Genetic Analyzer from Applied Biosystems.
The pCR-II TOPO plasmids containing the correct SpF gene fragment sequences as described above were used as template DNA for PCR amplification and gene construction. In efforts to ligate these fragments into a continuous sequence, two types of PCR were performed. The first reaction amplified a fragment using linker primers that add flanking sequences to the resulting amplicons. The second PCR used two sequential amplicons to yield a continuous fragment which was subsequently agarose gel purified. In this manner the four fragments were assembled to yield a continuous recombinant gene.
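The two-step assembly just described can be pictured as joining neighbouring amplicons through the flanking sequence that the linker primers add to both. The Python sketch below models only that joining logic on toy sequences; the linker, the fragment sequences, and the function name are hypothetical and are not taken from the constructs described here.

```python
def join_by_overlap(left, right, min_overlap=8):
    """Fuse two fragments that share an end-to-start overlap.

    Looks for the longest suffix of `left` that equals a prefix of `right`
    and returns the fused sequence with the overlap present only once.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("fragments do not share a sufficient overlap")


# Hypothetical 10-nt linker added by the first-round linker primers.
LINKER = "GATTACAGGC"
fragment_a = "ATGCCGTTAACC" + LINKER   # toy upstream amplicon
fragment_b = LINKER + "TTGGCACTGAAG"   # toy downstream amplicon

joined = join_by_overlap(fragment_a, fragment_b)
print(joined, len(joined))
```

Applying the same function to each successive pair would chain four fragments into one continuous sequence, which is the outcome the text describes.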
SpF-1 is obtained by cloning the fib-H intron as a Pst I/SnaB I fragment. The size of the SpF-1 gene is 3,128 bp. SpF-2 (no fib-H intron) is obtained as described for SpF-1, except that the 3' end of the first portion of the N-terminal domain and the 5' end of the second fragment of the N-terminal domain are ligated as blunt ends instead of using the restriction sites Pst I and SnaB I. To this end, primers that exactly match the nucleotide sequence encoding the 3' end of the first fragment of the N-terminal domain and the 5' end of the second fragment of the N-terminal domain are designed. SpF-2 is 2,158 bp long. SpF-3 is obtained by PCR-amplification of the 5' half of the fib-H intron (the first 450 bp of the intron) and blunt end ligated to the portion of the intron that contains the branch site sequence. SpF-3 is 2,708 bp long.
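The three construct sizes can be cross-checked against the intron figures given in the text. The short Python sketch below does the subtraction; the length it derives for the branch-site portion of the intron is an inference from the stated sizes, not a value given in the disclosure, and the variable names are illustrative.

```python
# Construct sizes (bp) as stated in the text; the first variant carries the
# complete intron, the second none, the third a truncated intron.
SPF_SIZES_BP = {"SpF-1": 3128, "SpF-2": 2158, "SpF-3": 2708}
INTRON_5PRIME_HALF_BP = 450  # "the first 450 bp of the intron"

# The intronless variant is the baseline, so differences are intron sequence.
full_intron_bp = SPF_SIZES_BP["SpF-1"] - SPF_SIZES_BP["SpF-2"]
spf3_intron_bp = SPF_SIZES_BP["SpF-3"] - SPF_SIZES_BP["SpF-2"]

# Inferred, not stated: what remains after removing the 5' half.
implied_branch_site_portion_bp = spf3_intron_bp - INTRON_5PRIME_HALF_BP

print(full_intron_bp, spf3_intron_bp, implied_branch_site_portion_bp)
```

The first difference reproduces the 970 bp intron length given earlier, and the remainder suggests that the branch-site portion retained in the third variant is about 100 bp.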
The ligation products are electroporated into XL1-Blue electrocompetent cells (Stratagene) by standard protocols (Sambrook and Russell, 2001, supra). The electro-transformed cells are plated on LB plates containing 100 μg/ml of ampicillin and 20 μg/ml of tetracycline. A number of white colonies are screened by PCR to check the presence of inserts of the expected size. Positive clones are grown in 5 ml of LB containing 100 μg/ml of ampicillin and 20 μg/ml of tetracycline. The plasmids are purified and sequenced as described herein.
EXAMPLE 2
SUBCLONING THE SPF VARIANTS IN pBAC VECTORS AND OBTAINING TRANSGENIC SILK MOTHS
The pBac system is used in the transgenesis experiments. This system is based on the P-element paradigm that separates the two functional components of the transposon, i.e., the inverted terminal repeats and the transposase gene, into two separate plasmids that are co-injected into early syncytial embryos (Spradling and Rubin (1982) Science 218:341). The transposase gene is cloned into a plasmid without the terminal repeats and supplies the transposase enzymatic activity. A gene expression cassette flanked by the transposon inverted terminal repeats is cloned into a second plasmid and provides the mobile sequence of DNA. Co-injection of vector and helper into early syncytial embryos permits the mobilization of the vector sequence from the plasmid into the host genome under the action of the transposase from the helper plasmid. The vector insertion in the host genome is stable over time because the helper plasmid is not propagated through subsequent cell divisions.
SpF-1, SpF-2 and SpF-3 are PCR-amplified from the pGEM plasmids containing the constructions. These variants are subcloned in the pBac vector using the multiple cloning site (MCS). This derivative of pBac (pB[P9DsRedMCS]) includes the gene for the red fluorescent protein (DsRed). Red fluorescence is observed using DsRed filters, thus allowing a rapid screening of transgenic moths in eggs as well as in the fat body and cardiac cells of early larvae.
B. mori is used as the host in the transgenesis experiments. For example, strain p50 is polyvoltine and produces 5-6 generations per year. This strain has been selected as the model for the recently started B. mori genome project, and was used to characterize the fib-H gene (Zhou et al., 2000, supra). B. mori strains are obtained from any of a number of commercial sources and/or international or national repositories.
Transgenesis is conducted as follows. In brief, the pBac vectors bearing the SpF variants are mixed with the helper plasmid pHA3PIG (Tamura et al., 2000, supra) and injected into syncytial embryos that are 1.5 to 3 hr old. Moths are mated within the same family and the resultant G1 broods are screened for DsRed fluorescence. Typical efficiency of transgenesis for G1 broods obtained in this manner is 18 to 28% (Tomita et al., 2003, supra).
Integration of SpF variants into the B. mori genome is evaluated by Southern blot hybridization analysis following standard protocols (Sambrook and Russell, 2001, supra). Genomic DNA extracted from DsRed-positive G1 moths is digested with Hind III. The digestion products are run in a 0.6% agarose gel and transferred onto a positively charged nylon membrane (Amersham Biosciences) under vacuum. The membrane is then hybridized at 65°C with a [α-32P]UTP-labeled specific probe, which is designed based on the nucleotide sequence of the synthetic repetitive motif of SpF. The [α-32P]UTP-labeled hybridization products are visualized by autoradiography.
EXAMPLE 3
MEASURE THE EXPRESSION OF SPF AND ASCERTAIN THE MECHANICAL CHARACTERISTICS OF THE SILK FIBERS
ELISA is used to measure the expression of the SpF variants in cocoons. To this end, a polyclonal antibody against SpF is raised in rabbits by using as immunogen the synthetic peptide 'GSGQQGPGQQGPG' conjugated to bovine serum albumin (BSA). The peptide sequence is part of the amorphous domain of ADF-3. Peptide synthesis, conjugation, immunization, bleeding and serum preparation are carried out by Alpha Diagnostic International, Inc.
Proteins are extracted from the cocoons and used to coat ELISA plates as described in Inoue et al. (2000), supra. The rabbit anti-SpF polyclonal antibodies are added to the coated wells and the free antibodies are washed away with PBST. The SpF-antibody complexes are revealed by adding horseradish peroxidase-conjugated goat anti-rabbit antibody (Stratagene). The color reaction is developed by adding H2O2 and ortho-phenylenediamine (OPD) in citrate buffer (0.1 M; pH 5). The reaction is stopped with 6N HCl and the plates are read at 492 nm.
The mechanical properties of the obtained fibers are tested using an Instron model 55R1122 Universal Testing Apparatus. The apparatus allows tensile tests to failure and load cycle tests to be conducted over a broad range of extension rates. In addition, tests are conducted under conditions of temperature and humidity control.
EXAMPLE 4
CONDUCT EXPERIMENTS TO INCREASE THE PROPORTION OF SPF WITH RESPECT TO FIB-H
The SpF constructions have all of the regulatory elements needed to compete with expression of Fib-H. Thus, fibers spun by transgenic moths are a hybrid of SpF and Fib-H. In order to increase the proportion of SpF in the silk, the amount of endogenous B. mori Fib-H is reduced. Interference RNA (RNAi) is employed to target the GX-I nucleotide sequence of the repetitive motifs of Fib-H and sequester endogenous fib-H mRNA in order to render it untranslatable. The repetitive nature of this domain provides a greater possibility for the RNAi to complex.
Vector derivatives are designed to produce in vivo dsRNA as an extended hairpin-loop RNA. This technique was first demonstrated in C. elegans and subsequently successfully used in D. melanogaster. It was shown to be as effective in silencing genes as direct injection of dsRNA but had the advantage that, by being expressed in vivo, it could be maintained undiluted throughout development, allowing study of even late-acting gene function (Tavernarakis et al., (2000) Nat Genet. 24:180; Kennerdell and Carthew (2000) Nat Biotechnol. 18:896). Following the strategy of Kennerdell and Carthew (2000), the hairpin RNA(s) is expressed from a transgene exhibiting dyad symmetry cloned into the densovirus-derived vector. The construct is made by generating two PCR products of the same exon-rich genomic or cDNA region with primers that generate two unique restriction sites at opposing ends. Using the restriction site located at the inversion point, both PCR products are digested with a single enzyme and ligated to form the inverted repeat (IR). Subsequently, the restriction sites at each end of the IR dimer are digested and used for insertion into the plasmid of choice between the desired promoter and the 3' UTR of SV40. The resulting RNAi expression cassette is inserted within a JcDNV vector or a piggyBac vector derivative.
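The inverted-repeat construction described above can be illustrated with a short Python sketch: a fragment is joined, through a restriction site at the inversion point, to its own reverse complement, and the result is checked for dyad symmetry. The target fragment is an arbitrary toy sequence, and the use of an EcoRI site (GAATTC, a palindromic site) at the junction is an assumption made for illustration; the disclosure does not name the enzymes used.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_inverted_repeat(fragment, junction_site):
    """Join a fragment to its reverse complement through a junction site."""
    return fragment + junction_site + reverse_complement(fragment)


def has_dyad_symmetry(seq):
    """True when a sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)


target_fragment = "GGTGCTGGTGCTGGTTCT"  # toy sequence, not the real target region
junction_site = "GAATTC"                # assumed EcoRI site at the inversion point

inverted_repeat = build_inverted_repeat(target_fragment, junction_site)
print(inverted_repeat, has_dyad_symmetry(inverted_repeat))
```

A transcript of such a sequence can fold back on itself, with the two arms pairing to form the double-stranded stem of the hairpin; a palindromic junction keeps the whole insert symmetric about its centre.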
The foregoing describes and exemplifies the invention but is not intended to limit the invention defined by the claims which follow. All of the compositions and/or methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the materials and methods of this invention have been described in terms of specific embodiments, it will be apparent to those of skill in the art that variations may be applied to the materials and/or methods and in the steps or in the sequence of steps of the methods described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those of ordinary skill in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.
