Title:
ENZYMATIC INVERSE POLYMERASE CHAIN REACTION LIBRARY MUTAGENESIS
Document Type and Number:
WIPO Patent Application WO/1993/012257
Kind Code:
A1
Abstract:
This invention discloses a method for generating a recombinant library by introducing one or more changes within a predetermined region of double-stranded nucleic acid, comprising providing a first primer population and a second primer population, each of the populations having a variable base composition at known positions along the primers, the primers incorporating a class IIS restriction enzyme recognition sequence, being capable of directing change in the nucleic acid sequence and being substantially complementary to the double-stranded nucleic acid to permit hybridization thereto. The method additionally comprises hybridizing the first and second primer populations to opposite strands of the double-stranded nucleic acid to form a first pair of primer-templates oriented in opposite directions, performing enzymatic inverse polymerase chain reaction to generate at least one linear copy of the double stranded nucleic acid incorporating the change directed by the primers, cutting the double-stranded nucleic acid copy with a class IIS restriction enzyme to form a restricted linear nucleic acid molecule containing the change, joining termini of the restricted linear nucleic acid molecule to produce double-stranded circular nucleic acid and introducing the nucleic acid into compatible host cells. A method is additionally provided for generating a recombinant library using wobble-base mutagenesis.

Inventors:
STEMMER, Willem P. C.
Application Number:
PCT/US1992/010647
Publication Date:
June 24, 1993
Filing Date:
December 10, 1992
Assignee:
HYBRITECH INCORPORATED.
International Classes:
C12N15/10; C12Q1/68; (IPC1-7): C12N15/10; C12Q1/68
Foreign References:
EP0414134A1
US4959312A
Other References:
NUCLEIC ACIDS RESEARCH. vol. 17, no. 16, 1989, ARLINGTON, VIRGINIA US pages 6545 - 6551 A. HEMSLEY ET AL. See abstract and " discussion "
NUCLEIC ACIDS RESEARCH. vol. 18, no. 6, 25 March 1990, ARLINGTON, VIRGINIA US page 1656 TOMIC M. ET AL.
NATURE. vol. 344, 19 April 1990, LONDON GB pages 793 - 794 JONES D.H. ET AL.
GENE. vol. 84, 1989, AMSTERDAM NL pages 143 - 151 J.D. HERMES ET AL.
BIOTECHNIQUES vol. 13, no. 2, August 1992, EATON PUBL. CO.,NATICK,MA,USA pages 216 - 220 W.P.C. STEMMER ET AL.
Claims:
We Claim:
1. A method for generating a recombinant mutagenesis library by introducing one or more changes within a predetermined region of double-stranded nucleic acid, comprising:
(a) providing a first primer population and a second primer population, each said population having variable base composition at known positions along said primers, said primers incorporating a class IIS restriction enzyme recognition sequence, being capable of directing change in said nucleic acid sequence and being substantially complementary to said double-stranded nucleic acid to allow hybridization thereto;
(b) hybridizing said first and second primer populations to opposite strands of said double-stranded nucleic acid to form a first pair of primer-templates oriented in opposite directions;
(c) performing enzymatic inverse polymerase chain reaction to generate at least one linear copy of said double-stranded nucleic acid incorporating said change directed by said primers;
(d) cutting the double-stranded nucleic acid copy of step (c) with a class IIS restriction enzyme to form a restricted linear nucleic acid molecule containing said change; and
(e) introducing nucleic acid generated from step (c) or (d) into compatible host cells.
2. The method of Claim 1, additionally comprising the step of joining termini of said restricted linear nucleic acid molecule of step (d) to produce double-stranded circular nucleic acid.
3. The method of Claim 1, wherein said restricted linear nucleic acid molecule produced in step (d) contains only said change in said nucleic acid sequence.
4. The method of Claim 1, wherein at least steps (b) and (c) are repeated one or more times.
5. The method of Claim 1, wherein said double stranded nucleic acid is circular DNA.
6. The method of Claim 1, wherein step (d) further comprises treating said restricted nucleic acid molecule with a polymerase under conditions which create blunt ends.
7. The method of Claim 1, wherein said host cells are bacteria.
8. The method of Claim 1, wherein said double-stranded nucleic acid encodes polypeptide.
9. The method of Claim 8, additionally comprising the step of expressing said polypeptide encoded by the nucleic acid of step (e).
10. The method of Claim 1, wherein said cells are eukaryotic.
11. The method of Claim 8, wherein said change is located within a polypeptide encoding region of the double-stranded nucleic acid.
12. The method of Claim 8, wherein said change is located within a regulatory region of said double-stranded nucleic acid.
13. The method of Claim 12, wherein said change is located within a promoter region of said double-stranded nucleic acid.
14. The method of Claim 8, wherein said change is located within the enhancer region of said double-stranded nucleic acid.
15. The method of Claim 1, wherein said double-stranded nucleic acid comprises a viral vector.
16. The method of Claim 15, wherein said compatible host cells comprise a helper virus packaging cell line that directs the packaging of viral particles containing said viral vector.
17. The method of Claim 16, comprising the step of collecting said viral particles.
18. The method of Claim 17, additionally comprising the step of infecting susceptible cells with said viral particles.
19. A recombinant library created by the method of Claim 1.
20. A method for improving polypeptide expression from a double-stranded nucleic acid sequence encoding polypeptide comprising:
(a) measuring polypeptide expression from said double-stranded nucleic acid in a compatible host cell;
(b) providing a first primer population and a second primer population, each said population having variable base composition at known positions along said primers, said primers incorporating a class IIS restriction enzyme recognition sequence, being capable of directing change in said nucleic acid sequence and being substantially complementary to said double-stranded nucleic acid to allow hybridization thereto;
(c) hybridizing said first and second primer populations to opposite strands of said double-stranded nucleic acid to form a first pair of primer-templates oriented in opposite directions;
(d) performing enzymatic inverse polymerase chain reaction to generate at least one linear copy of said double-stranded nucleic acid incorporating said change directed by said primers;
(e) cutting said double-stranded nucleic acid copy of step (d) with a class IIS restriction enzyme to form a restricted linear nucleic acid molecule containing said change;
(f) introducing said nucleic acid generated from step (d) or (e) into said host cells;
(g) measuring polypeptide expression from said modified nucleic acid of step (f) in said cells; and
(h) identifying cells with expression levels greater than the expression levels measured in step (a).
21. The method of Claim 20, additionally comprising the step of joining termini of said restricted linear nucleic acid of step (e) to produce modified double-stranded circular nucleic acid.
22. The method of Claim 20, additionally comprising the step of obtaining modified template from said identified cells.
23. The method of Claim 22, comprising the step of identifying the modified nucleic acid sequence.
24. The method of Claim 22, comprising transferring the modified sequence into another nucleic acid sequence.
25. The method of Claim 21, wherein said primers direct changes in a promoter sequence.
26. The method of Claim 21, wherein said primers direct changes in a polypeptide sequence.
27. The method of Claim 21, wherein said compatible cells are bacteria.
28. The method of Claim 21, wherein said cells are eukaryotes.
29. The method of Claim 21, wherein said primers direct changes in a ribosome binding sequence.
30. A method for generating a recombinant library using wobble-base mutagenesis comprising:
(a) providing a first primer population and a second primer population, said primers being substantially complementary to a region of double-stranded nucleic acid encoding polypeptide to allow hybridization thereto, said primers having a variable base composition in the third position of at least one nucleotide codon corresponding to said double-stranded nucleic acid and a class IIS restriction enzyme recognition sequence;
(b) hybridizing said first and second primer populations to opposite strands of said double-stranded nucleic acid to form a first pair of primer-templates oriented in opposite directions;
(c) performing enzymatic inverse polymerase chain reaction to generate at least one linear copy of said double-stranded nucleic acid incorporating said change directed by said primers;
(d) cutting said double-stranded linear nucleic acid of step (c) with a class IIS restriction enzyme to form a restricted linear nucleic acid molecule containing said change; and
(e) introducing nucleic acid generated from step (c) or (d) into compatible host cells.
31. The method of Claim 30, additionally comprising joining termini of said restricted linear nucleic acid of step (d) to produce double-stranded circular nucleic acid.
32. The method of Claim 30, wherein said variable base codons do not alter the corresponding amino acid sequence of said polypeptide.
33. The method of Claim 30, wherein said primers direct alterations in the leader sequence of said polypeptide.
34. The method of Claim 30, wherein said host cells are bacteria.
35. The method of Claim 33, wherein said leader sequence is the bacterial OmpA protein leader sequence or a fragment thereof.
36. The method of Claim 33, wherein said leader sequence is linked to polynucleotide encoding light and heavy chain antibody fragments.
37. An optimized OmpA protein leader sequence: 5' ATGAAAAAAACTGCAATTGCGATTGCTGTTGCTCTTGCTGGTTTCGCGACGGTAGCA AGGCC 3', or an expression promoting fragment thereof.
38. An optimized OmpA protein leader sequence: 5' ATGAAAAAAACCGCGATCGCCATTGCTGTGGCGCTTGCCGGCTTTGCTACGGTGGCG AGG 3', or an expression promoting fragment thereof.
Description:
ENZYMATIC INVERSE POLYMERASE CHAIN REACTION LIBRARY MUTAGENESIS

BACKGROUND OF THE INVENTION

Recombinant DNA techniques have revolutionized molecular biology and genetics by permitting the isolation and characterization of specific DNA fragments. Of major impact has been the exponential amplification of small amounts of DNA by a technique known as the polymerase chain reaction (PCR). The sensitivity, speed and versatility of PCR make this technique amenable to a wide variety of applications such as medical diagnostics, human genetics, forensic science and other disciplines of the biological sciences.

PCR is based on the enzymatic amplification of a DNA sequence that is flanked by two oligonucleotide primers which hybridize to opposite strands of the target sequence. The primers are oriented in opposite directions with their 3' ends pointing towards each other. Repeated cycles of heat denaturation of the template, annealing of the primers to their complementary sequences and extension of the annealed primers with a DNA polymerase result in the amplification of the segment defined by the 5' ends of the PCR primers. Since the extension product of each primer can serve as a template for the other primer, each cycle results in the exponential accumulation of the specific target fragment, up to several million fold in a few hours. The method can be used with a complex template such as genomic DNA and can amplify a single-copy gene contained therein. It is also capable of amplifying a single molecule of target DNA in a complex mixture of RNAs or DNAs and can, under some conditions, produce fragments up to ten kb long. The PCR technology is the subject matter of United States Patent Nos. 4,683,195, 4,800,159, 4,754,065, and 4,683,202, all of which are incorporated herein by reference.

In addition to the use of PCR for amplifying target sequences, this method has also been used to generate site-specific mutations in known sequences. Mutations are created by introducing mismatches into the oligonucleotide primers used in the PCR amplification. The oligonucleotides, with their mutant sequences, are then incorporated at both ends of the linear PCR product. In addition to their mutated sequences, the primers often contain restriction enzyme recognition sequences which are used for subcloning the mutated linear DNAs into vectors in place of the wild-type sequences. Although this procedure is relatively simple to perform, its applications are limited because appropriate restriction sequences are not always conveniently located for replacing the wild-type sequence with the mutant sequence. Restriction sequences can be incorporated into the wild-type sequences for subcloning. However, such extraneous sequences can cause detrimental effects to the function of the gene or resulting gene product. Moreover, PCR products typically contain heterogeneous termini resulting from the addition of extra nucleotides and/or incomplete extension of the primer-templates. Such termini are extremely difficult to ligate and therefore result in a low subcloning efficiency.

Several modifications of the PCR-based site-directed mutagenesis strategies have been developed to circumvent such limitations, but they too have undesirable features. The most prominent undesirable feature exhibited by these alternative methods is a low frequency of correct mutations. For example, inverse PCR (IPCR) is a method which amplifies a circular plasmid rather than a linear molecule, Hemsley et al., Nuc. Acids Res. 17:6545-6551 (1989), which is incorporated herein by reference. In this technique, two primers which are located back to back on opposing DNA strands of a plasmid drive the PCR reaction. The resultant PCR product, a linear DNA molecule identical in length to the starting plasmid, contains any mutations which were designed into the primers. The product is then enzymatically prepared for ligation by blunting and phosphorylating the termini. Enzymatic treatment of the termini is a necessary step for ligation due to heterogeneous termini associated with PCR products. These treatments are likely to be incomplete and cause unwanted mutations as well as result in a low ligation and transformation efficiency due to the additional required steps.

Recombinant circle PCR (RCPCR), Jones and Howard, BioTechniques 8:178-183 (1990), and recombination PCR (RPCR), Jones and Howard, BioTechniques 10:62-65 (1991), on the other hand, are two methods similar to IPCR which do not require any enzymatic treatment. In RCPCR, two separate PCR reactions, requiring a total of four primers, are needed to generate the mutated product. The separate amplification reactions are primed at different locations on the same template to generate products that, when combined, denatured and cross-annealed, form double-stranded DNA with complementary single-strand ends. The complementary ends anneal to form DNA circles suitable for transformation into E. coli.

RPCR is a technique that uses PCR primers having a twelve-base exact match at their 5' ends, resulting in a PCR product with homologous double-stranded termini. Transformation of the linear product into recombination-positive (recA-positive) cells produces a circular plasmid through in vivo recombination. Although this method reduces the number of steps and primers used compared to RCPCR, the transformation and recombination of linear molecules is an inefficient process resulting in a correspondingly low mutation frequency.

A modification of site-directed mutagenesis, random mutagenesis, permits the incorporation of random mutations into a polynucleotide. Mutant libraries are normally constructed by the mutagenesis of a small, defined area of a plasmid containing the gene or control region of interest. Methods for generating mutant libraries typically use synthetic oligonucleotides with random or biased mixtures of bases in one or more positions along the oligonucleotide. A variety of methods have been used to introduce these mutagenic oligonucleotides into the expression vector. Typically, the oligonucleotides are hybridized to a substantially complementary strand of DNA and a polymerase is used to extend the length of the oligonucleotide into a polynucleotide whose length is dependent both on the length of the template and on the conditions of enzymatic extension. This procedure permits the construction of large libraries of mutants having mutations in one or more regions of the polynucleotide or protein sequence as compared with the template. From these libraries, the transfectants or transformants can be screened for the desired characteristic. However, both random mutagenesis employing PCR, and random mutagenesis in general, are restricted in design by the choice of restriction endonucleases traditionally employed for these procedures. Often random mutagenesis has a relatively low efficiency such that a significant number of individual mutations are lost during primer extension and introduction of the polynucleotide into the host. Further, mistakes or unintended mutations are often incorporated into the sequences, resulting in an additional decrease in the efficiency. Selected mutations may therefore be under- or overrepresented in the library.

Thus, a need exists for a PCR-based mutagenesis method which allows the rapid and efficient alteration of nucleotide sequences to create libraries that are sufficiently diverse. The present invention satisfies this need and provides related advantages as well.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a schematic diagram outlining the steps of EIPCR.

Figure 2 shows the design of EIPCR primers. Line A shows a region of the PCR template (SEQ ID NO: 1) and two mutations to be made by EIPCR (indicated by small arrows). Line B shows how the primers (SEQ ID NO: 2; SEQ ID NO: 3) relate to the mutated product (line C) (SEQ ID NO: 4). This is not an actual reaction intermediate, but is a cartoon to draw when designing the primers. The primers are indicated in grey. The Bsa I recognition sequence (SEQ ID NO: 5) is underlined. Four or more bases are added 5' to the enzyme recognition sequence of each primer to ensure efficient substrate recognition by the enzyme. Line C shows the sequence of the mutated product. The grey boxes show the parts of the primers that have been incorporated into the final product. The overhangs of the two DNA ends are indicated, but the recognition sequences have been cut off and are not part of the final product.

Figure 3 is a list of class IIS restriction enzymes and the nucleotide sequence of their recognition sequences (SEQ ID NOS: 5 through 20).

Figure 4 is a schematic diagram showing the use of EIPCR technology for generating single chain antibodies. Line A shows the template region (SEQ ID NO: 21) to be mutagenized to create a linker between heavy and light chain encoding sequences. Line B shows the EIPCR primer design (SEQ ID NO: 22; SEQ ID NO: 23) and line C shows the nucleotide (SEQ ID NO: 24) and amino acid (SEQ ID NO: 25) sequence of an identified active single chain antibody sequence.

Figure 5 is a schematic of the 1.8 kb expression vector pMCHAFvl for CHA255 Fv fragment expression. The expression cassette is located between Hind III and Eco RI restriction endonuclease sequences in pUC19.

Figure 6 is a schematic of EIPCR primer design. Line A shows the area of the wild-type leader sequence that was replaced by a library of leader sequences. Line B shows the design of the mutagenic primers relative to the template (SEQ ID NO: 26 and SEQ ID NO: 27). Line C shows the sequence of the identified, positive single chain Fv linker conferring increased protein expression that was obtained from the random library (SEQ ID NO: 28).

Figure 7 is a schematic illustrating EIPCR promoter library mutagenesis. Figure 7A is the template sequence. The underlined regions in Figure 7B indicate the regions of variability in the library.

SUMMARY OF THE INVENTION

The invention is directed to a method for generating a recombinant mutagenesis library by introducing one or more changes within a predetermined region of double-stranded nucleic acid, comprising providing a first primer population and a second primer population, each population having variable base composition at known positions along the primers, the primers incorporating a class IIS restriction enzyme recognition sequence, being capable of directing change in the nucleic acid sequence and being substantially complementary to the double-stranded nucleic acid to allow hybridization thereto. The method also comprises hybridizing the first and second primer populations to opposite strands of the double-stranded nucleic acid to form a first pair of primer-templates oriented in opposite directions, performing enzymatic inverse polymerase chain reaction to generate at least one linear copy of the double-stranded nucleic acid incorporating the change directed by the primers, cutting the double-stranded nucleic acid copy with a class IIS restriction enzyme to form a restricted linear nucleic acid molecule containing the change and introducing nucleic acid generated therefrom into compatible host cells.
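As an illustration of what a primer population with variable base composition at known positions amounts to, the following minimal Python sketch expands a degenerate primer written in standard IUPAC ambiguity codes into its individual member sequences. The function names and the example primer `GCNATH` are hypothetical and chosen only for illustration; they are not taken from the specification.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_primer_population(degenerate_primer):
    """Return every concrete primer encoded by an IUPAC-degenerate primer."""
    choices = [IUPAC[base] for base in degenerate_primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

def population_size(degenerate_primer):
    """Count the members of the population without enumerating them."""
    size = 1
    for base in degenerate_primer.upper():
        size *= len(IUPAC[base])
    return size

# Hypothetical example: two variable positions (N and H) in a six-base stretch.
population = expand_primer_population("GCNATH")
print(len(population), population[:3])
```

The count grows multiplicatively with each variable position, so `population_size` is the safer call for long primers, where full enumeration would be impractical.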

In a preferred embodiment, the method additionally comprises the step of joining termini of the restricted linear nucleic acid molecule to produce double-stranded circular nucleic acid. The method preferably produces restricted linear nucleic acid molecules containing only the directed change in the nucleic acid sequence. Preferably the double-stranded nucleic acid is circular DNA. The method can be performed on either eukaryotic or prokaryotic cells.

In a preferred embodiment of the invention, the double-stranded nucleic acid encodes polypeptide. The change in the nucleic acid can be introduced into the amino acid coding region of the polypeptide or into a regulatory region of the polypeptide. Thus changes may be introduced into promoter and enhancer regions of the double-stranded nucleic acid. The polypeptide encoded by the double-stranded nucleic acid is preferably expressed from the host cells.

In another preferred embodiment of the invention, the double-stranded nucleic acid comprises a viral vector and compatible host cells comprise a helper virus packaging cell line that directs the packaging of viral particles containing the viral vector. The viral particles are preferably collected and the method additionally comprises the step of infecting susceptible cells with the viral particles.

In yet another preferred embodiment of the invention, a method is provided for improving polypeptide expression from a double-stranded nucleic acid sequence encoding polypeptide comprising: measuring polypeptide expression from the double-stranded nucleic acid in a compatible host cell, providing a first primer population and a second primer population, each of the populations having a variable base composition at known positions along the primers, the primers incorporating a class IIS restriction enzyme recognition sequence, being capable of directing change in the nucleic acid sequence and being substantially complementary to the double-stranded nucleic acid to allow hybridization thereto. The method additionally comprises hybridizing the first and second primer populations to opposite strands of the double-stranded nucleic acid to form a first pair of primer-templates oriented in opposite directions, performing enzymatic inverse polymerase chain reaction to generate at least one linear copy of the double-stranded nucleic acid incorporating the change directed by the primers, cutting the double-stranded nucleic acid copy with a class IIS restriction enzyme to form a restricted linear nucleic acid molecule containing the change, introducing the nucleic acid from the cutting step or the PCR step into host cells and measuring polypeptide expression from the modified nucleic acid in the cells, and identifying cells with expression levels greater than the expression levels measured in cells containing unmodified double-stranded nucleic acid.

The method preferably additionally comprises the step of joining termini of the restricted linear nucleic acid molecule to produce modified double-stranded circular nucleic acid, and the method also preferably comprises the step of obtaining modified template from selected cells. Preferably the modified nucleic acid sequence is identified and transferred into another nucleic acid sequence. The primers can direct changes in a regulatory sequence, including promoters, or the primers can direct changes in a polypeptide sequence. In a preferred embodiment the primers direct changes in a ribosome binding sequence.

In yet another preferred embodiment of this invention, a method is provided for generating a recombinant library using wobble-base mutagenesis comprising: providing a first primer population and a second primer population, said primers being substantially complementary to a region of double-stranded nucleic acid encoding polypeptide to allow hybridization thereto, the primers having a variable base composition in the third position of at least one nucleotide codon corresponding to the double-stranded nucleic acid and a class IIS restriction enzyme recognition sequence. The method additionally comprises hybridizing the first and second primer populations to opposite strands of the double-stranded nucleic acid to form a first pair of primer-templates oriented in opposite directions, performing enzymatic inverse polymerase chain reaction to generate at least one linear copy of the double-stranded nucleic acid incorporating the change directed by the primers, cutting the double-stranded linear nucleic acid with a class IIS restriction enzyme to form a restricted linear nucleic acid molecule containing the change and introducing nucleic acid generated therefrom into compatible host cells. The variable base codons preferably do not alter the corresponding amino acid sequence of the polypeptide.

In a preferred embodiment the primers direct alterations in the leader sequence of the polypeptide. The leader sequence is preferably the bacterial OmpA protein leader sequence or a fragment thereof and the leader sequence is preferably linked to polynucleotide encoding light and heavy chain antibody fragments.
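The wobble-base idea described above, varying only the third codon position in ways that leave the encoded amino acid unchanged, can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not the procedure used in the specification: it assumes the standard genetic code, the helper names (`wobble_codon`, `wobble_primer_region`) are hypothetical, and the 18-base input is simply the first six codons of the leader sequence recited in Claim 37.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC ambiguity codes, keyed by the set of bases each code stands for.
IUPAC_CODE = {
    frozenset(members): code
    for code, members in {
        "A": "A", "C": "C", "G": "G", "T": "T",
        "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
        "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    }.items()
}

def wobble_codon(codon):
    """Degenerate the third base over all choices that keep the amino acid."""
    amino_acid = CODON_TABLE[codon]
    silent = frozenset(b for b in BASES if CODON_TABLE[codon[:2] + b] == amino_acid)
    return codon[:2] + IUPAC_CODE[silent]

def wobble_primer_region(coding_seq):
    """Return the in-frame coding sequence with silent third-position variability."""
    seq = coding_seq.upper()
    if len(seq) % 3:
        raise ValueError("coding sequence must be a whole number of codons")
    return "".join(wobble_codon(seq[i:i + 3]) for i in range(0, len(seq), 3))

# First six codons of the leader in Claim 37 (Met Lys Lys Thr Ala Ile).
print(wobble_primer_region("ATGAAAAAAACTGCAATT"))
```

The sketch varies the third position only, so it deliberately ignores synonymous codons that differ at the first position (for example, the two leucine codon families).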

DETAILED DESCRIPTION OF THE INVENTION

The invention provides a novel method for rapid and efficient site-directed mutagenesis of double-stranded linear or circular DNA. The method, termed Enzymatic Inverse Polymerase Chain Reaction (EIPCR), greatly improves the utility of previous PCR techniques, enabling rapid screening or selection of putative mutants to identify clones containing changes of interest.

In one embodiment, oligonucleotide primers containing the desired sequence changes are used to direct PCR synthesis of a double-stranded circular DNA template (Figure 1). The primers are designed so that they additionally contain a class IIS restriction enzyme recognition sequence and a sequence complementary to the template for primer hybridization. The primers are hybridized to opposite strands of the circular template and direct the amplification of each strand to form linear molecules containing the desired mutations. The ends of the linear molecules are filled in with Klenow polymerase or T4 DNA polymerase and restricted with the appropriate class IIS restriction enzyme to produce compatible overhangs for circularization and ligation.

EIPCR uses class IIS restriction enzyme recognition sequences in the mutated or non-mutated PCR primers. This type of recognition sequence is used because the cleavage site is separated from the recognition sequence and therefore does not introduce extraneous sequences into the final product. Restriction of the PCR products with a class IIS enzyme removes the recognition sequence and produces homogeneous termini for subsequent ligation. Class IIS recognition sequences therefore circumvent problems associated with ligating heterogeneous PCR termini since such termini will be cleaved off by the class IIS restriction enzyme. If the primers are designed with complementary cleavage sites, the resulting termini will have complementary overhangs which can be used for circularization of the linear molecules. Such complementary overhangs increase the efficiency of intramolecular ligation compared to blunt ends and result in a high percentage of correctly mutated clones. Thus, EIPCR allows efficient mutagenesis and production of homogeneous termini of any DNA template without incorporating extraneous sequences. EIPCR also allows mutagenesis at any location within a circular template independent of convenient restriction sequences.
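The way a class IIS digest removes the recognition sequences and leaves matching overhangs can be checked in silico. The Python sketch below is a simplified model under stated assumptions: it works on the top strand only, uses Bsa I (recognition sequence GGTCTC, cutting 1 nucleotide past the site on one strand and 5 on the other, giving a 4-nucleotide 5' overhang), and assumes exactly one inward-facing site at each end of the linear product. The function name and the toy sequence are hypothetical.

```python
BSAI_SITE = "GGTCTC"   # Bsa I recognition sequence, top strand 5'->3'
TOP_CUT_OFFSET = 1     # Bsa I cuts the site-bearing strand 1 nt past the site
OVERHANG_LEN = 4       # and the opposite strand 5 nt past it: 4-nt 5' overhang

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def eipcr_circularize(linear_product):
    """Digest both ends with Bsa I and join them; return the circular top strand.

    The returned string is the circle opened at the ligation junction, so the
    shared 4-nt overhang appears once, at the start.
    """
    p = linear_product.upper()
    left = p.find(BSAI_SITE)             # site in the forward primer
    right = p.rfind(revcomp(BSAI_SITE))  # site in the reverse primer
    if left < 0 or right < 0:
        raise ValueError("product must carry a Bsa I site at each end")
    keep_start = left + len(BSAI_SITE) + TOP_CUT_OFFSET
    keep_end = right - TOP_CUT_OFFSET - OVERHANG_LEN
    if keep_end - keep_start < OVERHANG_LEN:
        raise ValueError("Bsa I sites overlap or are too close together")
    left_overhang = p[keep_start:keep_start + OVERHANG_LEN]
    right_overhang = p[keep_end:keep_end + OVERHANG_LEN]
    if left_overhang != right_overhang:
        raise ValueError("overhangs are not complementary; ends cannot be joined")
    return p[keep_start:keep_end]

# Toy product: 4-nt tail, site, 1-nt spacer, shared GGCA overhang, insert, mirror end.
toy_product = "TTTT" + "GGTCTC" + "A" + "GGCA" + "CCATTGACGT" + "GGCA" + "T" + "GAGACC" + "AAAA"
circular = eipcr_circularize(toy_product)
print(circular)
```

In the toy product, the primer tails, both recognition sequences and the spacer bases are all absent from the joined molecule, which is the property the text attributes to class IIS digestion.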

As used herein, the term "predetermined change" refers to a specific desired change within a known nucleic acid sequence. Such desired changes are commonly referred to in the art as site-directed mutagenesis and include, for example, additions, substitutions and deletions of base pairs. A specific example of a base pair change is the conversion of the first A/T bp in the sequence AGCA to a G/C bp to yield the sequence GGCA. It is understood that when referring to a base pair, only one strand of a double-stranded sequence or one nucleotide of a base pair need be used to designate the referenced base pair change since one skilled in the art will know the corresponding complementary sequence or nucleotide.

As used herein, the term "class IIS restriction enzyme recognition sequence" refers to the recognition sequence of class IIS restriction enzymes. Class IIS enzymes cleave double-stranded DNA at precise distances from their recognition sequence. The recognition sequence is generally about four to six nucleotides in length and directs cleavage of the DNA downstream from the recognition sequence. The distance between the recognition sequence and the cleavage site as well as the resulting termini generated in the restricted product vary depending on the particular enzyme used. For example, the cleavage site can be anywhere from one to many nucleotides downstream from the 3'-most nucleotide of the recognition sequence and can result in either blunt cuts or 5' and 3' staggered cuts of variable length. Such staggered cuts produce termini having single-stranded overhangs. Therefore, "complementary cleavage sites" as used herein refers to complementary nucleic acid sequences at such single-stranded overhangs. Class IIS restriction enzyme recognition sequences suitable for use in the invention can be, for example, Alw I, Bsa I, Bbs I, Bbu I, BsmA I, Bsr I, Bsm I, BspM I, Ear I, Esp3 I, Fok I, Hga I, Hph I, Mbo II, P I, SfaN I, and Mnl I. It is understood that the recognition sequence of any enzyme that utilizes this separation between the recognition sequence and the cleavage site is included within this definition.

As used herein, the term "substantially complementary" refers to a nucleotide sequence capable of specifically hybridizing to a complementary sequence under conditions known to one skilled in the art. For example, specific hybridization of short complementary sequences will occur rapidly under stringent conditions if there are no mismatches between the two sequences. If mismatches exist, specific hybridization can still occur if a lower stringency is used. Specificity of hybridization is also dependent on sequence length. For example, a longer sequence can have a greater number of mismatches with its complement than a shorter sequence without losing hybridization specificity. Such parameters are well known and one skilled in the art will know, or can determine, what sequences are substantially complementary to allow specific hybridization.

As used herein, the term "a primer capable of directing" when used in reference to nucleic acid sequence changes refers to a primer having a mismatched base pair or base pairs within its sequence compared to the template sequence. Such mismatches correspond to the mutant sequences to be incorporated into the template and can include, for example, additional base pairs, deleted base pairs or substituted base pairs. It is understood that either one or both primers used for the PCR synthesis can have such mismatches so long as together they incorporate the desired mutations into the wild-type sequence.

Thus, the invention provides methods of introducing at least one predetermined change in a nucleic acid sequence of a double-stranded DNA. Such methods include: (a) providing a first primer and a second primer capable of directing said predetermined change in said nucleic acid sequence, said first and second primers comprising a nucleic acid sequence substantially complementary to said double-stranded DNA so as to allow hybridization, a class IIS restriction enzyme recognition sequence and cleavage sites; (b) hybridizing said first and second primers to opposite strands of said double-stranded DNA to form a first pair of primer-templates oriented in opposite directions; (c) extending said first pair of primer-templates to create double-stranded molecules; (d) hybridizing said first and second primers at least once to said double-stranded molecules to form a second pair of primer-templates; (e) extending said second pair of primer-templates to produce double-stranded linear molecules terminating with class IIS restriction enzyme recognition sequences; and (f) restricting said double-stranded linear molecules with a class IIS restriction enzyme to form restricted linear molecules containing said change in said nucleic acid sequence.

Enzymatic Inverse Polymerase Chain Reaction (EIPCR) is a PCR-based method for performing site-directed mutagenesis. Mutations are introduced into a DNA by first hybridizing primers which contain the desired mutations to the DNA, referred to herein as mutant primers. The resulting primer-templates are enzymatically extended with a polymerase to yield an intermediate product. Repriming of the intermediate and polymerase extension will yield the final mutant product. Cohesive termini can be subsequently generated for circularization of the linear products by intramolecular ligation.

The invention is described with particular reference to introducing a predetermined change into a circular template and recircularizing the product to generate mutant copies of the starting template. However, one skilled in the art can use the teachings and methods described herein to similarly generate mutations in linear templates. The primers designed for use on linear templates are similar to those used for circular templates. Appropriate modifications of primers for use on linear templates are known to one skilled in the art and will be determined by the intended use of the final mutant product. For example, when generating circular products, either from a linear or circular starting template, it is beneficial to use primers containing complementary cleavage sites downstream from the class IIS recognition sequence. Such complementary sites greatly increase the efficiency of intramolecular ligation. With linear molecules, on the other hand, while it is beneficial in some cases for the primers to contain class IIS recognition sequences which produce single-stranded overhangs at their cleavage sites, such cleavage sites need not be complementary. For example, if the product is a linear molecule for subcloning into a vector, cleavage sites which are not complementary can be used for directional cloning of the product. Additionally, a blunt cleavage site can be used to eliminate sequence requirements for subcloning. Thus, depending on the desired product, the cleavage sites within the primers can be complementary or non-complementary.

EIPCR primers are synthesized having three basic sequence components. These sequences are used for generating mutations and for enabling efficient formation of circular products without introducing unwanted sequences or requiring the use of template restriction sequences. The first sequence component of the primers is the region which directs the predetermined changes. This region contains the desired mutations which are to be introduced into the template. The length and sequence of this region will depend on the number and locations of incorporated mutations. For example, if multiple and adjacent mutations are desired, then the primer will not contain any nucleotides within this region identical to the wild-type sequence. However, if the mutations are not located at adjacent positions, then the nucleotides in between such mutations will be identical to the wild-type sequence and capable of hybridizing to the appropriate complementary strand. Thus, the region can be from one to many nucleotides in length so long as it contains the desired mismatches with the wild-type sequence.

It is only necessary for one of the primers to contain the desired mutations, but a larger number of bases can be mutagenized and a higher efficiency of correct mutations can be obtained if both primers contain the desired mutations in each complementary strand. A strategy for designing EIPCR primers is outlined in Figure 2. This strategy shows an example of a pair of primers which can be used for mutagenesis at two nonadjacent locations. One skilled in the art can use this strategy and the teachings described herein to design and use primers that incorporate essentially any desired mutation into a double-stranded DNA. The template containing the wild-type sequence is shown in Figure 2A (SEQ ID NO: 1). Also shown are the desired nucleotide substitutions (arrows). The actual primers are depicted in Figure 2B as the shaded sequence (SEQ ID NO: 2; SEQ ID NO: 3). The region of each primer containing the desired substitutions is complementary and corresponds to the opposite strand at the same location within the template (Figure 2C) (SEQ ID NO: 4). For primers A (SEQ ID NO: 2) and B (SEQ ID NO: 3) in Figure 2B, the mutant region would consist of the sequence GTTCC and its complement, respectively.

The second sequence component of EIPCR primers is the region containing the class IIS restriction enzyme recognition sequence. The location of the recognition sequence is 5' to the mutant region and thus is incorporated at the termini of any extension products. Since recognition sequences are located at the ends of linear extension products, they can also contain additional 5' sequences to facilitate recognition and cleavage by a class IIS enzyme. For example, the primers in Figure 2B (SEQ ID NO: 2; SEQ ID NO: 3) contain four additional nucleotides 5' to the Bsa I recognition sequence (SEQ ID NO: 5).

Other sequences included within the recognition sequence component of EIPCR primers are the nucleotides between the recognition sequence and the cleavage site. The number of nucleotides will correspond to the distance between these two sites and therefore will vary for different enzymes. For example, the primers of Figure 2 contain a Bsa I recognition sequence (SEQ ID NO: 5) which is cleaved by Bsa I on opposite strands one and five nucleotides, respectively, 3' to the recognition sequence, leaving a four nucleotide single-strand overhang. Generally, such overhang sequences within the primers are completely complementary to each other but can include limited mutations. Primers are synthesized with filler nucleotides placed 5' to the first cleavage site. The number of filler nucleotides corresponds to the distance between the particular class IIS recognition sequence used and its cleavage site. The sequence of such spacer nucleotides can, for example, correspond to wild-type or non-wild-type sequences or to predetermined mutations. For generating just a few point mutations, it is beneficial to match these nucleotides to the wild-type sequence to increase the hybridization stability of the adjacent mutant primer region.

Types of restriction enzyme recognition sequences to be used in the invention are those recognized by class IIS enzymes. These enzymes recognize the DNA through a sequence-specific interaction and cleave it at a discrete distance downstream from the recognition sequence. The ability to cleave such sequences downstream provides a useful means to remove heterogeneous ends and to produce complementary termini for circularization while at the same time removing the recognition sequence from the final product. Specific examples of class IIS recognition sequences have been listed previously and are also listed in Figure 3 along with their nucleotide sequences and cleavage sites (SEQ ID NOS: 5 through 20). Although recognition sequences having complementary cleavage sites associated with them are preferred, those which have blunt-ended cleavage sites can also be used in the invention.

The third sequence component of EIPCR primers is the region to be hybridized to the template DNA. This region must be sufficient in length and sequence to allow specific hybridization to the template. The hybridized portion of the primers must also form a stable primer-template which can be used as a substrate for polymerase extension. It is typically found 3' to the mutant primer region and its sequence is determined with respect to the location of the desired mutations. For example, for the primers shown in Figure 2 (SEQ ID NO: 2; SEQ ID NO: 3), the hybridization region is twenty nucleotides in length and found 3' to the mutant region. However, the hybridization region can also be 5' to the mutant region. For this orientation, the mutant region must form a stable primer-template which can be used as a substrate for polymerase extension. Longer or shorter hybridization sequences can be used in this region so long as they are appropriately located with respect to the mutant region and also specifically hybridize to the template molecule. One skilled in the art knows or can readily determine the specificity of such hybridization regions for use in EIPCR primers.

Thus, the invention also provides a synthetic primer for introducing at least one predetermined change in a nucleic acid sequence of a double-stranded circular DNA. The primer includes: (a) a class IIS restriction enzyme recognition sequence; (b) said predetermined change in said nucleic acid sequence; and (c) a nucleic acid sequence substantially complementary to said double-stranded DNA. The preferred orientation of the above regions (a) through (c) is in a 5' to 3' direction.
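The three primer components described above can be sketched as a simple concatenation in the preferred 5' to 3' order. The following Python sketch is illustrative only: the function name `build_eipcr_primer`, the four-base 5' extension, the one-base spacer and the 20-nucleotide hybridization region are hypothetical placeholders, not the sequences of Figure 2. Only the Bsa I recognition sequence (GGTCTC) and the mutant region GTTCC are taken from the text.

```python
# Sketch: assemble an EIPCR primer from its components in 5'->3' order.
# All sequences except BSA1_SITE and the mutant region GTTCC are hypothetical.

BSA1_SITE = "GGTCTC"  # Bsa I recognition sequence named in the text


def build_eipcr_primer(pad, recognition_site, spacer, mutant_region,
                       hybridization_region):
    """Concatenate primer components in the preferred 5'->3' orientation.

    pad                  -- extra 5' bases that help the enzyme bind near an end
    recognition_site     -- class IIS recognition sequence
    spacer               -- filler bases between recognition and cleavage site
    mutant_region        -- bases carrying the desired changes
    hybridization_region -- bases matching the template exactly
    """
    parts = [pad, recognition_site, spacer, mutant_region,
             hybridization_region]
    for part in parts:
        if set(part) - set("ACGTN"):
            raise ValueError(f"unexpected base in {part!r}")
    return "".join(parts)


primer_a = build_eipcr_primer(
    pad="ACGT",                                    # hypothetical 4-base extension
    recognition_site=BSA1_SITE,
    spacer="A",                                    # Bsa I cuts 1 nt past its site
    mutant_region="GTTCC",                         # mutant region from the text
    hybridization_region="ATGCATGCATGCATGCATGC",   # hypothetical 20-nt match
)
print(primer_a, len(primer_a))
```

In an actual design the overhang produced by the enzyme may overlap the mutant region, as discussed for the complementary cleavage sites; this sketch keeps the components separate for clarity.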

The above described primers can be, for example, hybridized to a double-stranded circular or linear DNA molecule which has first been denatured. Denaturation can be performed, for example, using heat or an alkaline solution. Other methods known to one skilled in the art can also be used.

Hybridization of the primers occurs on opposite strands of the circular template and in a location where the single-stranded overhangs of each primer's complementary cleavage site can be joined together by restriction and ligation. Preferably, such joining should occur so that the wild-type sequence is reformed except for the incorporation of the desired mutations. One way to ensure proper sequence reconstruction is to design the primers such that their complementary cleavage sites overlap and are either identical to the template sequence or contain some or all of the desired mutations. Such primers, once hybridized to a double-stranded circular DNA, form primer-templates and can be extended with a polymerase. The first extension reactions of circular templates result in the synthesis of double-stranded circular products which can be concatenated. Depending on the extent of polymerization, the concatemers can be either partially or completely double-stranded. It is necessary for polymerization to proceed sufficiently far to allow subsequent primer hybridization for a second extension reaction. Small circular DNAs result in a greater number of completely double-stranded products and also require shorter extension times compared to much larger circles. Small circular DNAs of less than 1.0 kb are known in the art. Such vectors are beneficial to use in the invention since they can accommodate large inserts (3 to 5 kb) and still be comparable in size to most standard cloning vectors. The plasmid pVX is a specific example of a 902 bp vector, Seed, B., Nuc. Acids Res. 11:2427-2445 (1983), which is incorporated herein by reference. Such vectors can be further modified by the addition of, for example, promoters, terminators and the like to achieve the desired end. Complete extension of a circular DNA of about 5.0 kb can be achieved using the conditions described herein; however, alternative conditions used by those skilled in the art to achieve complete extension of larger circular DNAs can also be used to practice the invention. For linear templates, on the other hand, the first extension reaction produces a double-stranded linear molecule known in the art as the long product.

After one extension reaction, the double-stranded products, whether they exist as circular or linear molecules, have incorporated at one of their ends the EIPCR primer with its associated class IIS restriction enzyme recognition sequence and the desired mutations. These double-stranded molecules can be used for a second cycle of hybridization and extension to produce double-stranded linear molecules which terminate at both ends with EIPCR primers. Further cycles will result in the exponential amplification of template sequence located between each primer on the circular DNA. Thus, the location of the hybridized primers defines the termini of template sequences to be amplified.

Polymerases which can be used for the extension reaction include all of the known DNA polymerases. However, if multiple cycles of hybridization and extension are to be performed, such as required for PCR amplification, then preferably a thermostable polymerase is used. Thermostable polymerases include, for example, Taq polymerase, Vent polymerase and Pfu polymerase. Vent and Pfu polymerases advantageously exhibit a higher fidelity than Taq due to their 3' to 5' proofreading capability.

Following synthesis of the linear molecules, the products are restricted with the appropriate class IIS restriction enzyme to remove the class IIS recognition sequence and heterogeneous termini and to create cohesive termini used for circularization. The resulting termini correspond to the single-strand overhangs produced after restriction of each primer's complementary cleavage site. To facilitate proper recognition and cleavage, the linear products can be pretreated with a polymerase, such as Klenow, under conditions which create blunt ends. This procedure will fill in any uncompleted product ends produced during amplification and allows efficient restriction of essentially all of the products. After restriction, the cohesive termini can be joined to recircularize the linear molecule. Covalently closed circles can subsequently be formed in vitro with a ligase. Alternatively, in vivo ligation can be accomplished by introducing the circularized products into a compatible host by transformation or electroporation, for example.

Transformation or electroporation of the circularized products can additionally be used for the propagation and manipulation of mutant products. Such techniques and their uses are known to one skilled in the art and are described, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, Cold Spring Harbor, NY (1989), or in Ausubel et al., Current Protocols in Molecular Biology, John Wiley and Sons, New York, NY (1989), both of which are incorporated herein by reference. Propagation and manipulation procedures do not have to be performed at the end of all EIPCR reactions. The need will determine whether such procedures are necessary. For example, transformation and DNA preparation can be eliminated if two consecutive EIPCR reactions are to be performed where the product of the first reaction is used as the template for the second reaction. All that is necessary is that the first reaction products are circularized and ligated prior to hybridization with the second reaction primers. Additionally, primers for EIPCR can be used without purification. EIPCR is not as sensitive as other methods to the presence of primers of incomplete length because the non-uniform DNA ends are removed by restriction at the class IIS recognition sequence.

The invention further provides methods of producing at least two changes located at one or more positions within a nucleic acid sequence of a double-stranded circular DNA. The methods include: (a) providing a first population of primers and a second population of primers capable of directing said changes in said nucleic acid sequence, said first and second populations of primers comprising a nucleic acid sequence substantially complementary to said double-stranded DNA so as to allow hybridization, a class IIS restriction enzyme recognition sequence, and cleavage sites; (b) hybridizing said first and second populations of primers to opposite strands of said double-stranded DNA to form a first pair of primer-template populations oriented in opposite directions; (c) extending said first pair of primer-template populations to create a population of double-stranded molecules; (d) hybridizing said first and second populations of primers at least once to said population of double-stranded molecules to form a second pair of primer-template populations; (e) extending said second pair of primer-template populations to produce a population of double-stranded linear molecules terminating with class IIS restriction enzyme recognition sequences; and (f) restricting said population of double-stranded linear molecules with a class IIS restriction enzyme to form a population of restricted linear molecules containing said changes within said nucleic acid sequence.

Also provided is a population of synthetic primers for producing at least two changes located at one or more positions within a nucleic acid sequence of a double-stranded circular DNA comprising: (a) a class IIS restriction enzyme recognition sequence; (b) said changes within said nucleic acid sequence; and (c) a nucleic acid sequence substantially complementary to said double-stranded circular DNA.

The method for producing at least two changes located at one or more positions is similar to that described above for site-directed mutagenesis except that the primers can have more than one nucleotide at a desired position. For example, if it is desirable to produce mutations incorporating from two to four different mutant nucleotides at a particular position, then a population of primers should be synthesized such that all mutant nucleotides are represented within the entire population. Each individual primer within the population will contain only a single mutant nucleotide. The proportion of primers containing identical mutant nucleotides will determine the expected frequency of that mutation being correctly incorporated into the final product. For example, if only two mutant nucleotides are desired and each one is equally represented within the primer population, then 50% of the products should contain one of the mutations and 50% should contain the other mutation. If more than two mutations are desired at a particular position or at more than one position, then primer populations should be synthesized which contain individual primers having each of the desired mutations. Primer populations can also be synthesized which direct single mutations at one position and multiple mutations at another position by incorporating one or more mutant nucleotides at the appropriate position.
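The relationship between primer proportions and expected product frequencies can be illustrated with a short calculation. The sketch below is a simplified model, assuming that each variable position is incorporated independently and in proportion to its share of the primer population; the function name `expected_frequencies` and the example mixes are hypothetical.

```python
from itertools import product


def expected_frequencies(position_mixes):
    """Return the expected frequency of each combination of mutant bases.

    position_mixes -- one dict per variable position, mapping a base to its
                      relative amount in the primer population.
    Assumes positions are independent (an idealization).
    """
    normalized = []
    for mix in position_mixes:
        total = sum(mix.values())
        normalized.append({base: amount / total for base, amount in mix.items()})

    frequencies = {}
    for combo in product(*(mix.items() for mix in normalized)):
        bases = "".join(base for base, _ in combo)
        freq = 1.0
        for _, share in combo:
            freq *= share
        frequencies[bases] = freq
    return frequencies


# Two mutant nucleotides, equally represented at a single position.
one_position = expected_frequencies([{"A": 1, "G": 1}])
print(one_position)

# A two-base position combined with a fully mixed second position.
two_positions = expected_frequencies(
    [{"A": 1, "G": 1}, {"A": 1, "C": 1, "G": 1, "T": 1}]
)
print(len(two_positions), two_positions["AC"])
```

The first case reproduces the 50%/50% split given in the text; the second shows how frequencies multiply when more than one position is varied.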

The design and use of such primers is identical to that previously described for introducing at least one predetermined change into a double-stranded circular DNA. The only difference is that instead of hybridizing a first primer and a second primer to form a pair of primer-templates, hybridization is with a first population of primers and a second population of primers to form a pair of primer-template populations. Each primer-template within the population can include, for example, one of the desired mutant sequences to be incorporated into the resultant products. Amplification of the primer-template population will produce a population of linear products containing all desired mutations. The products can be restricted, circularized and screened for individual mutant clones. Screening can be performed, for example, by sequencing or by expression of polypeptide. Selection can be performed by linking polypeptide expression with the expression of a suitable marker such as an antibiotic resistance gene, luciferase, or the like. Only colonies containing the gene are selected. Following selection, positive colonies can then be screened for a particular characteristic. Expression screening or selection offers the advantage of screening or selecting a large number of clones in a relatively short period of time. These assays permit the identification of clones of interest. Examples of screening and selection assays are well known to those with skill in the art. Each assay is designed and modified for that particular application. Examples of these assays are found in the examples below.

The methods and primers described herein can be used to create essentially any desired change in a nucleic acid sequence. Templates can be linear or circular and result in products containing only the desired changes since class IIS recognition sequences allow the removal of extraneous and unwanted sequences. Product termini which are homogeneous in nature are also produced using the class IIS recognition sequences. Use of circular templates allows the incorporation of mutations at any desired location along the template with subsequent recircularization of the mutant products. Thus, additions, deletions and substitutions of single base pairs, multiple base pairs, gene segments and whole genes can rapidly and efficiently be produced using EIPCR. A specific use of EIPCR would be in the mutagenesis of antibodies or antibody domains. Mutagenesis of antibody complementarity determining regions (CDRs), for example, can be performed using EIPCR for the rapid generation of antibodies exhibiting altered binding specificities. Likewise, EIPCR can also be used for producing chimeric and/or humanized antibodies having desired immunogenic properties.

The efficiency of incorporating correct mutations into the product using EIPCR can be, for example, greater than about 90%, preferably about 95 to 99%, more preferably about 100%. This efficiency is routinely obtained when using about 0.5 to 2.0 ng of template in a 25 cycle PCR reaction. However, it should be understood that the efficiency directly correlates with the number of amplification cycles and inversely with the amount of template used. For example, the more amplification cycles which are performed, the greater the amount of mutant product present and therefore a larger fraction of mutant sequences will be present within the total sequence population. Conversely, if a large amount of template is used, more amplification cycles are required, compared to using a smaller amount of template, to achieve the same fraction of mutant sequences within the total sequence population. One skilled in the art knows such parameters and can adjust the number of cycles and amount of template required to achieve the required efficiency.

The following examples are intended to illustrate but not limit the invention.

EXAMPLE I

This example shows the use of EIPCR for site-directed mutagenesis of two bases located on a 2.6 kb pUC-based plasmid (designated p186).

The design of the primers and their relationship to the template and to the final mutant sequence is shown in Figure 2. The 3' end of each primer is an exact match of 20 bases. The 5' ends of the primers comprise the enzyme recognition site and the enzyme cut site, which was designed to form complementary overhangs. Four additional bases were added 5' to the enzyme recognition sequence to facilitate recognition and digestion of the PCR product by the enzyme. The complementary mutations were designed into each of the primers. Bsa I was the enzyme used to make the overhangs (Figure 3).

PCR reactions were performed in 100 µl volumes containing 0.2-1.0 µM of each unpurified primer, 0.5 ng uncut p186 template plasmid DNA, 1x Vent buffer, 200 µM of each dNTP, 2. units Vent polymerase (New England Biolabs, Beverly, MA). Thermal cycling was performed on a Perkin-Elmer-Cetus PCR machine (Emeryville, CA) with the following parameters: 94°C/3 minutes for 1 cycle; 94°C/1 minute, 50°C/1 minute, 72°C/3-4 minutes for 3 cycles; 94°C/1 minute, 55°C/1 minute, 72°C/3-4 minutes, with autoextension at 4-6 sec/cycle for 2 cycles; followed by one 10 minute cycle at 72°C.

To blunt the ends of the PCR product, the entire reaction mix was supplemented with 8 µl of 10 mM dNTP mixture (2.5 mM each) and 20 units of Klenow fragment (Gibco-BRL, Gaithersburg, MD) and incubated at 37°C for 30 minutes. The reaction was then extracted with an equal volume of phenol/chloroform (1:1), ethanol-precipitated, and the pellet was washed and dried. The blunt end product was then restriction digested with Bsa I (New England Biolabs, Beverly, MA) as recommended by the manufacturer. The digested DNA was extracted with an equal volume of phenol/chloroform, ethanol-precipitated as described above, and ligated with 20 units T4 DNA ligase (Gibco-BRL) for one hour at room temperature. Gel purification of the digested DNA before ligation was not necessary.

After ligation, the DNA was transformed into competent DH10B cells as recommended by the manufacturer (Gibco-BRL). Approximately 400 colonies were obtained from a transformation using 10 ng of DNA into 30 µl of frozen competent cells. The transformation efficiency was 4x10^4 cfu/µg of DNA. Seven colonies were randomly picked and plasmid DNA was prepared for restriction digests. No differences in restriction pattern were seen. The mutated areas of the plasmids of these seven colonies were sequenced. Double-stranded dideoxy sequencing was performed on a Dupont Genesis 2000 automated sequencer using the Dupont Genesis 2000 sequencing kit. The sequences of all seven plasmids contained the desired mutation.
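The transformation efficiency quoted above follows directly from the colony count and the amount of DNA used. A minimal sketch of the arithmetic, with a hypothetical helper name `transformation_efficiency`:

```python
def transformation_efficiency(colonies, dna_ng):
    """Return colony-forming units per microgram of DNA.

    colonies -- number of colonies obtained
    dna_ng   -- amount of DNA transformed, in nanograms
    """
    if dna_ng <= 0:
        raise ValueError("dna_ng must be positive")
    # 1 ug = 1000 ng, so scale the per-ng yield up by 1000.
    return colonies * 1000.0 / dna_ng


# Figures from this example: about 400 colonies from 10 ng of DNA.
example_1 = transformation_efficiency(colonies=400, dna_ng=10)
print(f"{example_1:.1e} cfu/ug")
```

With 400 colonies from 10 ng, the result is 40 colonies per nanogram, or 4x10^4 cfu/µg.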

EXAMPLE II

This example shows the use of EIPCR for constructing large libraries of protein mutants.

The binding site of an antibody, called the Fv fragment, normally consists of a heavy chain and a light chain, each about 110 amino acids long. Using molecular modelling tools, several groups have constructed single chain Fv fragments (scFv) in which the C-terminus of one chain is connected by a 10-15 amino acid linker to the N-terminus of the other chain (Huston, Bird, Glockshuber). The single chain construct was shown to be much more stable than the two chain Fv.

To eliminate the need for molecular modelling, EIPCR was used to make a large library of different linkers and screen for a scFv clone that is not only active but also expressed at a high level. An antibody was chosen that binds a radioactive Indium chelate, Reardan et al., Nature 316:265-267 (1985), which is incorporated herein by reference. A 3.5 kb pUC-derived plasmid was constructed in which both Fv chains are attached to ompA leader peptides and driven by a Lac promoter (Figure 4). This plasmid was used as the template for EIPCR in which the DNA between the C-terminus of the first chain and the N-terminus of the mature second chain was replaced by a random mixture of bases, encoding a library of random linkers. The design of the primers is shown in Figure 4B in the shaded region, where N represents an equal proportion of all four nucleotides at the position within the primer population.
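To give a sense of scale for such a random-linker design, the number of distinct sequences encoded by a primer with mixed positions can be counted directly. The sketch below is illustrative: the helper names `diversity` and `expand` and the short example sequences are hypothetical, and the 30-position case simply assumes a 10 amino acid linker fully randomized at every base, which is not necessarily the design of Figure 4B.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes; N is an equal mix of all four bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "N": "ACGT",
}


def diversity(primer):
    """Count the distinct sequences represented by a degenerate primer."""
    count = 1
    for code in primer:
        count *= len(IUPAC[code])
    return count


def expand(primer):
    """List every concrete sequence (practical only for short primers)."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in primer))]


print(diversity("ACNNGT"))   # two mixed positions
print(expand("AN"))          # one mixed position, written out
print(diversity("N" * 30))   # hypothetical fully random 30-base linker
```

Because the count grows as a power of four, a library of clones can sample only a small part of a long random linker, which is why screening for active, well-expressed clones is used rather than exhaustive coverage.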

Synthesis of the two primer populations used to construct the library was performed on a Milligen/Biosearch 8700 DNA synthesizer. The mixed base positions were synthesized using a 1:1:1:1 mixture of each of the four bases in the reservoir. The oligonucleotides were made trityl-on and were purified with Nensorb Prep nucleic acid purification columns (NEN-Dupont, Boston, MA) as described by the manufacturer.

PCR reactions were performed in 100 µl volumes containing 0.5 µM of each unpurified primer, 0.5 ng pUCHAFv1 template plasmid DNA, 1x Taq buffer, 200 µM of each dNTP, 1 µl Taq polymerase (Perkin-Elmer-Cetus). Thermal cycling was performed on a Perkin-Elmer-Cetus PCR machine with the following parameters: 94°C/3 minutes for 1 cycle; 94°C/1 minute, 50°C/1 minute, 72°C/2 minutes for 3 cycles; 94°C/1 minute, 55°C/1 minute, 72°C/2 minutes, with autoextension at 4 sec/cycle for 25 cycles; followed by one 10 minute cycle at 72°C.

The product of the 100 µl PCR was extracted with an equal volume of phenol/chloroform (1:1), ethanol-precipitated, and the pellet was resuspended in 20 µl KKL buffer (50 mM Tris-HCl pH 7.6, 10 mM MgCl2, 5 mM DTT; suitable for Klenow, Kinase and Ligase) containing 200 µM dNTPs, 1 mM ATP, 10 units DNA Polymerase Klenow fragment and 10 units T4 DNA Kinase and incubated at 37°C for 30 minutes. Then 10 units T4 DNA ligase were added, and the reaction was continued for 2 hours at room temperature. The enzymes were then inactivated by heating at 65°C for 10 minutes. The polymerized DNA was then digested with Bbs I (NEB), which cuts off the ends of the PCR fragment inside the oligos. It was found that Bbs I digestion was inefficient with only four bp 5' to the recognition sequence. To create a longer 5' extension and improve efficiency, the DNA was ligated before digestion. Alternatively, primers could have been synthesized with longer 5' extensions. The digested DNA was then extracted with phenol/chloroform, ethanol precipitated, and resuspended in 20 µl 1x NEB ligation buffer containing 1 mM ATP and 10 units T4 DNA ligase, and the reaction was incubated for 2 hours at room temperature.

One microliter amounts of the ligation reaction were electroporated into 20 µl of DH10B Electromax cells (Gibco-BRL, Gaithersburg, MD) to produce a library of scFv constructs. The Gibco-BRL electroporator and voltage booster were used as recommended by the manufacturer. Cells were plated at 3,000 cfu/plate on plates containing 0.05 mM IPTG to induce Fv expression.

For screening, the labelled chelate was prepared by incubating 10 µl of 0.075 mM Eotube chelate with 50 µCi of buffered 111-Indium chloride in a metal-free tube. Colony lifts of the petri plates containing the protein library were prepared using BA83 nitrocellulose filters (Schleicher and Schuell, Keene, NH). The filters were blocked by incubation in Blotto (7% non-fat milk in PBS) for 10 minutes, washed with PBS, followed by incubation in Blotto containing 10 µCi of 111-Indium chloride per filter for 1 hour at room temperature. The filters were then washed repeatedly with PBS for a total of 15 minutes, dried and exposed to Kodak X-Omat AR autoradiography film for several hours.

The quality of the protein library was determined by DNA sequencing of the linker of several unscreened clones. Sequencing was performed as described in Example I. The composition of the mixed site residues was 19% G, 31% A, 25% T, 25% C (n=119).

The size of the library was determined by plating. In a typical electroporation, 30,000 cfu's were obtained from electroporation of 1 µl of ligation mixture into 20 µl of cells. The ligation contained 0.1 µg of DNA in 20 µl. The library size was about 3x10^5 recombinants and the electroporation efficiency was 6x10^6 cfu/µg. Approximately 30,000 clones were screened, and about 60 colonies gave a range of signals on the primary screen (0.2%). Those with the strongest signal were colony purified and the DNA sequence of the linker was determined. The sequence of one linker from an identified scFv clone is shown in Figure 4C.

LIBRARY MUTAGENESIS

Library mutagenesis using a heterogeneous primer population permits incorporation of a large number of mutations into a population of host cells to generate a recombinant library. The resulting mutations are typically introduced into a polynucleotide suitable for cell delivery. Such a polynucleotide can additionally be adapted for expression. These polynucleotides may contain changes in either the regulatory region of the polynucleotide or in a translatable region. The directed mutations in the polynucleotide sequence may alter levels of protein expression, alter a functional characteristic of a protein, or confer a particular cell phenotype. The incorporation of a large number of mutations into a host population is termed library mutagenesis. In general, libraries can be prepared and screened for changes in any measurable cell property. Similarly, the transformed or transfected cells containing the altered nucleic acid sequences can be screened or selected for a desired polynucleotide sequence independent of polypeptide expression.

There are several different methods for performing library mutagenesis that are available to those of skill in the art. A number of these methods use PCR to produce a library of mutant constructs. However, none of the existing methods for making mutant libraries are based on inverse PCR.

Enzymatic Inverse PCR (EIPCR) amplifies the entire plasmid, a portion of the plasmid or a linear sequence of a polynucleotide. These methods differ from other mutagenesis methods in the use of class IIS restriction sequences in the 5' end of both primers. Digestion with class IIS restriction enzymes, such as BsaI (GGTCTCN'NNNN), which have their recognition sequence 5' to, and separated from, their cleavage site allows the removal of the entire recognition sequence prior to ligation. This preferably leaves the linear PCR product with compatible overhangs at each end. Intramolecular ligation of the PCR product yields a full-length circular plasmid.
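The geometry of a class IIS cut can be sketched in a few lines of Python. This is only an illustration: it assumes the commonly cited BsaI cut positions (one base past GGTCTC on the top strand and five bases past it on the bottom strand, giving a four-base 5' overhang), and the primer sequences and function names are hypothetical, not the primers of this disclosure.

```python
BSAI_SITE = "GGTCTC"  # BsaI recognition sequence (top strand, 5'->3')


def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper()[::-1].translate(str.maketrans("ACGT", "TGCA"))


def bsai_cut(top_strand: str):
    """Cut at the first BsaI site; return (removed, overhang, retained).

    Assumption: the top strand is cut 1 nt past the site and the bottom
    strand 5 nt past it, so the retained end carries a 4-nt 5' overhang.
    """
    seq = top_strand.upper()
    site = seq.find(BSAI_SITE)
    if site < 0:
        raise ValueError("no BsaI recognition sequence found")
    top_cut = site + len(BSAI_SITE) + 1
    if top_cut + 4 > len(seq):
        raise ValueError("sequence too short to hold the 4-nt overhang")
    removed = seq[:top_cut]              # filler + recognition site, discarded
    overhang = seq[top_cut:top_cut + 4]  # single-stranded after cleavage
    retained = seq[top_cut:]             # overhang + rest of the product end
    return removed, overhang, retained


def ends_compatible(primer_a: str, primer_b: str) -> bool:
    """True if the overhangs left by the two primers can anneal to each other."""
    return bsai_cut(primer_a)[1] == revcomp(bsai_cut(primer_b)[1])


# Hypothetical primer 5' regions: filler, GGTCTC, one spacer base, overhang.
primer_a = "CCAAGGTCTCATTGCAGCTAGCTAGG"
primer_b = "GGTTGGTCTCTGCAACCGGTTAACC"
print(bsai_cut(primer_a))
print(ends_compatible(primer_a, primer_b))
```

Because the recognition sequence lies entirely in the removed fragment, nothing of the site remains in the retained end, which is the property the text relies on for scarless recircularization.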

An important advantage of EIPCR library mutagenesis is that any plasmid or DNA fragment can be used to create a library of mutations. The only limitation is the efficiency of the PCR process. The generation of a complementary strand is limited by the length of the template and by the elongation rate of the polymerase. It is likely that advances in PCR technology, in particular enzyme efficiency, will permit long DNA fragments to be used in this invention. The library mutagenesis methods disclosed herein are rapid and efficient and permit one of skill in the art to generate several libraries in a day. For example, once primers are prepared, libraries such as those prepared in Example III can be generated in 6 to 10 hours.

In EIPCR library mutagenesis, the entire plasmid is amplified using mutagenic primers. The simple design of EIPCR results in a high efficiency of ligation of mutant plasmids, thus generating a high level of diversity in the library. The higher the level of genetic diversity in a recombinant library, the more likely the library will contain a mutant of interest readily identifiable by methods known to one of skill in the art. Another important benefit of EIPCR over other methods for library mutagenesis is that, as in EIPCR site-directed mutagenesis, mutations can be made in any area of the sequence independent of available restriction sequences. Restriction endonuclease recognition sites are not incorporated into the final construct. The usefulness of EIPCR for library mutagenesis is described in Example III and illustrated in Figure 5.

A method for performing library mutagenesis to generate a recombinant library by introducing changes within a predetermined region of linear or, preferably, circular double-stranded DNA is contemplated herein. The method comprises (a) providing a first primer population and a second primer population, each having at least one variable base at known complementary positions along the primers capable of directing a change in the nucleic acid sequence, the first and second primer populations being substantially complementary to the double-stranded nucleic acid to allow hybridization thereto and having a class IIS restriction enzyme recognition sequence and cleavage sites, (b) hybridizing the first and second primer populations to opposite strands of the double-stranded nucleic acid to form a first pair of primer-templates oriented in opposite directions, (c) performing enzymatic PCR as hereinbefore described, (d) cutting the double-stranded linear molecules with a class IIS restriction enzyme to form restricted linear polynucleotide sequences containing the change in said nucleic acid sequence, thereby removing restriction endonuclease recognition sites, (e) optionally joining termini of the restricted linear molecules of step (d) to produce a double-stranded circular polynucleotide sequence, and (f) introducing the polynucleotide sequence obtained from step (d) or (e) into compatible host cells.

The term "primer population" is used to describe the pool of primers that have identical base compositions except at certain predetermined locations along the sequence that contain a variable composition. The primers for EIPCR library mutagenesis are otherwise designed similarly to those primers used for EIPCR site-directed mutagenesis. Primer pairs for EIPCR mutagenesis are designed to hybridize to the top and bottom strands of a double-stranded template and to extend in opposite directions. The primers are chosen to be substantially complementary to that region of the nucleic acid template to be mutagenized. These primers may be overlapping on the template, contiguous, or non-overlapping. The primer pairs are substantially complementary to the template to facilitate hybridization during the PCR process. Preferably, the primer contains at least a 15 base region at the 3' end of the primer that is complementary to the template. Other regions of complementarity may be interspersed throughout the length of the primer. The primer additionally contains a class IIS restriction endonuclease recognition sequence and a region containing noncomplementary bases that confers the desired variable mutation. The variable region can be of any length, the only restriction on length being the ability of the primers to hybridize to the template and direct synthesis of a substantially complementary strand of DNA. Further, the variable region or regions may be interspersed between complementary regions along the primer strand. Filler base regions can additionally be added to the primer at the 5' end of the primer, before the class IIS recognition sequence, and between the class IIS recognition sequence and the class IIS cleavage site. Any final primer length is contemplated within the scope of the invention. Primer length is limited only by the efficiency of the oligonucleotide synthesizer. Primers may be prepared by methods known to those of skill in the art. Those with skill in the art will be readily able to determine if a given primer adequately hybridizes to a given template and is thus suitable for amplification using EIPCR.

The extent of primer variability desirable for library mutagenesis is determined during primer synthesis. A mixture of nucleotides, or polynucleotides such as amino acid encoding trimers, is introduced at one or more positions along the primer oligonucleotide. The addition of trinucleotide fragments during synthesis provides direct control over amino acid mixtures. The nucleotide mixture is formulated to contain a predetermined percentage of each of the four bases. These percentages may vary from 0% to less than 100% for any one base and from 0 to 100% for each of the 64 amino acid encoding trimers. The frequency of a given sequence is determined by the desired probability that a particular base or trimer will be present at a particular position along the primer. Thus, for example, if the library is to contain variable mutations at position 6 of the primer oligonucleotide corresponding to a 75% average likelihood that position 6 is guanosine and a 25% average likelihood that position 6 will be adenosine, then the elongating primer will be exposed to a mixture of 3/4 guanosine and 1/4 adenosine at position 6. These mixtures can also be prepared in proportions such that for a region of 10 bases it is likely that on average only one of the 10 bases in any primer is different from the template sequence. This provides a primer pool that theoretically represents every possible permutation in each nucleotide position over a 10 base pair sequence. A review of primer preparation and design in random mutagenesis can be found in Oligonucleotides and Analogues: A Practical Approach (F. Eckstein, Ed., Oxford University Press, 1991) and Hermes et al., Gene 84:143-151, 1989, which are hereby incorporated by reference.

As illustrated in Figure 6, the primer pairs contain a complementary region at the class IIS restriction endonuclease cleavage site. In EIPCR library mutagenesis, this overlapping region preferably does not contain a mutation. This ensures that recircularization of the template can occur following PCR amplification. In the examples that follow, class IIS restriction endonuclease BsaI is used to generate a four base overhang at each end of the nucleotide sequence. Figure provides an exemplary list of other class IIS restriction endonucleases contemplated within the scope of this invention.
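The base-mixture arithmetic described above can be checked with a short Python sketch. It assumes that each doped position is drawn independently, so the number of changes per primer follows a binomial distribution; the ten-base template and all function names are hypothetical and chosen only for illustration.

```python
import random
from math import comb


def doped_mixture(template_base: str, template_fraction: float) -> dict:
    """Mixture for one position: template base plus equal shares of the other three."""
    others = [b for b in "ACGT" if b != template_base]
    share = (1.0 - template_fraction) / len(others)
    return {template_base: template_fraction, **{b: share for b in others}}


def mutation_count_probability(n_positions: int, p_mutant: float, k: int) -> float:
    """Binomial probability that exactly k of n doped positions differ from the template."""
    return comb(n_positions, k) * p_mutant**k * (1.0 - p_mutant) ** (n_positions - k)


def sample_primer(mixtures, rng: random.Random) -> str:
    """Draw one primer sequence from a list of per-position base mixtures."""
    return "".join(
        rng.choices(list(mix), weights=list(mix.values()))[0] for mix in mixtures
    )


# A position exposed to 3/4 G and 1/4 A, as in the example in the text.
position_6 = {"G": 0.75, "A": 0.25}

# Hypothetical ten-base region doped so each position keeps the template base
# 90% of the time, i.e. one change per primer on average (10 x 0.1 = 1).
template = "ATGAAAAAGA"
region = [doped_mixture(base, 0.9) for base in template]
for k in range(4):
    print(k, round(mutation_count_probability(10, 0.1, k), 4))
print(sample_primer(region, random.Random(0)))
```

Under these assumptions a sizeable share of the primers carries no change at all, which is worth keeping in mind when deciding how many clones to screen.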

Library mutagenesis can be used to alter any region within a nucleic acid sequence. These mutagenesis procedures are particularly useful for generating a library of mutations within the mature region of a protein sequence, within a leader sequence, or within sequences that do not encode protein. Sequences that do not encode protein may influence or regulate protein expression. These include, but are not limited to, non-coding regions on the DNA, for example, enhancer sequences, promoter regions, sites for DNA binding proteins such as repressors, Z-DNA formation, matrix associated regions, telomeres, origins of replication and recombination signals. In addition to those non-coding regions on the DNA that are transcribed, non-coding regions of RNA additionally contemplated include, but are not limited to, snRNP's, spliceosomes, ribosome binding sites, regions of secondary structure, terminators, stability sites and cap sites. It is additionally contemplated within the scope of this invention that EIPCR library mutagenesis can be used to generate recombinant libraries containing altered sequences corresponding to tRNA or rRNA. Mutations in regulatory regions of a nucleic acid sequence can affect the level of protein expression, while in-frame substitution mutations within the nucleic acid sequence encoding protein can affect protein function. It is therefore contemplated that the procedures described herein will be useful for generating recombinant libraries having mutations in any of these aforementioned regions of the nucleic acid.

EIPCR library mutagenesis can be used to alter the functional characteristics of a particular protein. A protein sequence engineered into an expression construct can be used as a nucleic acid template for EIPCR library mutagenesis. Like other forms of library mutagenesis, this procedure can be used, for example, to mutagenize a binding region on a polypeptide, thereby generating an expression library that can be screened or selected for altered binding characteristics. EIPCR mutagenesis can also be employed to mutate a region of a polypeptide sequence that influences intra-molecular binding. For example, a polypeptide region that links two protein domains involved in ligand binding can be mutated, using the methods disclosed herein, to optimize the interactions between the protein domains.

One type of mutagenesis contemplated within the scope of this invention is wobble base library mutagenesis using EIPCR. Wobble base mutagenesis incorporates mutations within the primer population in positions that correspond to the third position of a nucleotide codon. Most mutations in the third position of a codon do not alter the amino acid sequence of the resulting polypeptide. Accurate tRNA-mRNA pairing is required at the first two positions within the codon during translation. The third position can tolerate pairing with more than one tRNA and this degeneracy is termed a "wobble". Thus the same amino acid sequence can be derived from several different nucleotide sequences.

Alterations in the nucleotide sequence that do not affect the protein sequence may alter the level of protein synthesis or expression within a given host. In particular, alterations in the nucleic acid sequence of the leader portion of a polypeptide can influence levels of protein synthesis from one protein to another or from one host to another. An example of two primers designed to confer alterations in the OmpA leader sequence that result in increased levels of antibody Fv fragment expression from E. coli is found in Figure 6. Once a leader sequence is optimized for the expression of one particular polypeptide, using EIPCR library mutagenesis within a given host, it is further contemplated that this leader sequence can then be linked to other gene sequences encoding polypeptide to optimize expression of other polypeptides. Similarly, it is also contemplated within the scope of this invention that other regulatory regions can be optimized using EIPCR library mutagenesis and that these optimized regions can be engineered into other expression constructs for maximal expression of other polypeptides in vitro or in vivo.

The invention is preferably designed to incorporate one or more random changes within predetermined regions of a circular template, such as a vector. Vector choice is determined first by the choice of host cell used to create the desired library. It is well known to those of skill in the art that vectors are commercially available for protein expression in prokaryotic and eukaryotic systems. Expression vectors are available for bacteria, yeast and mammalian systems. In addition, viral vectors for both eukaryotic and prokaryotic cells are also contemplated within the scope of this invention.

Expression vectors are required when the translation products from the mutated nucleic acid sequence are to be assayed. An analysis of random mutations in nucleic acid may not require the use of an expression vector when mutations can be screened using polynucleotide probes or the like. Those with skill in the art will be able to choose an appropriate commercially available vector, create their own vector, or recreate the exemplary vector described in Example V below.

It is additionally contemplated within the scope of this invention that EIPCR library mutagenesis could be performed on one region of nucleic acid within a construct, and a second (and/or subsequent) mutagenesis procedure be performed on another region of a construct or on a separate nucleic acid construct. Following amplification, these sequences can then be combined to produce a construct with two or more regions of random mutagenesis.

A general description of the hybridization of aliquots of the first and second primer pools to the nucleic acid template as well as a general description of EIPCR are disclosed in the detailed description of site-directed mutagenesis beginning on page 16. The term "inverse" in enzymatic inverse polymerase chain reaction is used to describe the primer pair orientation during the PCR process such that at the initiation of elongation the 3' ends of the primers are directed away from one another. The mechanics of hybridization and nucleic acid sequence amplification in library mutagenesis are similar to, if not identical to, those employed in EIPCR site-directed mutagenesis and will not be repeated here. Thus, the term "performing EIPCR" as a step in the production of a library of mutations following the hybridizing step of the primers to the template, comprises 1) extending the first pair of primer-templates to create double-stranded molecules; 2) denaturing the primer-templates; 3) hybridizing the first and second primers at least once to the double-stranded molecules to form a second pair of primer-templates; 4) extending the second pair of primer-templates following hybridization to produce double-stranded linear molecules terminating with class IIS restriction enzyme recognition sequences; and 5) repeating steps 1-3 as needed.
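The outward-facing primer orientation can be made concrete with a toy in-silico model. The Python sketch below is an assumption-laden illustration rather than a model of the actual reaction: it handles only contiguous or non-overlapping primers, ignores mismatches, and uses a hypothetical 20-base "plasmid" with six-base annealing regions (real primers would carry at least 15 complementary bases).

```python
def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper()[::-1].translate(str.maketrans("ACGT", "TGCA"))


def inverse_pcr_product(plasmid, fwd_anneal, rev_anneal, fwd_tail="", rev_tail=""):
    """Top strand of the linear product made by outward-facing primers on a circle.

    plasmid    : top strand of the circular template, 5'->3'
    fwd_anneal : top-strand sequence matched by the 3' part of the forward primer
    rev_anneal : top-strand sequence whose reverse complement is the 3' part
                 of the reverse primer
    *_tail     : non-annealing 5' extensions (filler, class IIS site, etc.)
    """
    plasmid = plasmid.upper()
    # Searching a doubled copy lets the annealing site span the string origin.
    start = (plasmid + plasmid).find(fwd_anneal.upper())
    if start < 0:
        raise ValueError("forward primer does not anneal")
    # Rotate the circle so that it begins where the forward primer anneals.
    rotated = plasmid[start:] + plasmid[:start]
    hit = rotated.rfind(rev_anneal.upper())
    if hit < 0:
        raise ValueError("reverse primer does not anneal")
    end = hit + len(rev_anneal)
    return fwd_tail.upper() + rotated[:end] + revcomp(rev_tail)


# Hypothetical circular template with two contiguous annealing sites.
plasmid = "AAACCCGGGTTTACGTACGT"
product = inverse_pcr_product(
    plasmid, fwd_anneal="GGGTTT", rev_anneal="AAACCC",
    fwd_tail="TTGGTCTCA", rev_tail="TTGGTCTCA",
)
print(product, len(product), len(plasmid))
```

With contiguous primers the product spans the whole circle once, so its length is the template length plus the two 5' extensions.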

Once mutated linear template has been generated in sufficient quantity, the appropriate class IIS restriction enzyme is used to cleave the nucleic acid to create termini compatible for ligation. Ligation of the linear molecules is performed under conditions that favor recircularization of the plasmid. These conditions are well known to those with skill in the art and exemplary conditions are described in Example III.

The nucleic acid is next introduced into the desired host cells. The nucleic acid can be introduced into the host cells by any means known to those of skill in the art. These methods include, but are not limited to, methods to prepare competent bacterial cells including CaCl2 treatment, and methods to transfect eukaryotic cells including CaPO4 precipitation, liposome mediated transfection, viral infection, or electroporation. The method for introducing nucleic acid into the host cell will, in part, be determined by the host cell type. Descriptions of each of the transformation and transfection procedures are found in recombinant methodology handbooks including those of Sambrook et al. or Ausubel et al. (supra).

Following transfection, transformation or infection, the cells are expanded and screened for the desired cell function. There are a variety of screening assays that are available to the investigator. Assay design should reflect the desired goal of mutagenesis. For example, the assay disclosed in Example III below is designed to detect increased levels of expression of a particular antibody fragment in E. coli. Assays can also be designed to detect increases in the binding constants (Ka) of an antibody or receptor to its antigen or ligand. Other assays can be designed to detect changes in the level of protein expression or changes in the functional activity of a protein. For example, in a eukaryotic system, the increased ability of a protein to promote growth or stimulate a particular cellular function can be measured by removing cell supernatants from mutated cells or their progeny, adding this supernatant to susceptible cells, and assaying for growth promoting activity. Those with skill in the art will be able to select an appropriate screening or selection assay for a particular library to identify a particular clone of interest.

In a second example, EIPCR library mutagenesis can be used to alter the expression of one polypeptide in relation to a second polypeptide. Thus in Example III below, random mutagenesis is used to increase the level of Fv heavy chain expression, thereby equalizing levels of heavy and light chain Fv fragment expression.

In general, once a particular mutation is identified as conferring a desired property to a protein sequence, the cells are selected and expanded. The nucleic acid containing the desired mutation is isolated and sequenced. Identified sequences from mutations in regulatory regions of a nucleic acid sequence can then be genetically transposed to other expression systems. Thus, a contemplated method within the scope of this invention is one that identifies an optimized nucleic acid sequence derived from EIPCR library mutagenesis to promote an increase in the level of protein expression as compared with wildtype sequence.

Examples of random EIPCR library mutagenesis are provided below. These examples are intended to illustrate but not limit the invention.

EXAMPLE III

This example illustrates a preferred embodiment of EIPCR library mutagenesis, wobble base mutagenesis. In wobble base mutagenesis, mutations are introduced into the nucleic acid sequence without altering the amino acid sequence of the target protein. In this example, the leader or signal sequence of a protein is variably mutated in the third base position of at least one codon to generate a recombinant library that can be screened for colonies with increased levels of eukaryotic protein expression as compared with non-mutated controls. The expression level of foreign proteins in E. coli is determined by a large number of factors, and expression level optimization is normally a slow and tedious process. For secreted proteins, like the exemplary antibody Fv fragments used here, optimization of expression is complicated by the difficulties associated with secreting a eukaryotic protein in a prokaryotic system. Without the optimized modifications generated by EIPCR library mutagenesis, described below, secretion and expression of eukaryotic proteins in prokaryotic systems is very low.

In this particular example, Fv fragment expression of an anti-metal-chelate antibody (CHA255) was optimized in E. coli. The Fv fragment was expressed in active form in the periplasm of E. coli. Both the heavy and light chains of the Fv fragment, each with its own leader peptide, were placed under the control of a Lac promoter on a 1.8 kb plasmid. The CHA255 antibody binds a chelated radioactive metal (111Indium or 90Yttrium chelate complex) to provide a simple screening assay to permit detection of functional antibody fragments. For optimization of expression or mutagenesis of other proteins and antibodies, other screening systems may be useful.

Expression Vectors

Any expression vector that can be amplified together with its insert is contemplated within the scope of this invention. However, we have chosen to exemplify a relatively small plasmid (< 7 kb) that is readily amplified by PCR. pMCHAFvl, the 1.8 kb expression vector used for EIPCR mutagenesis and Fv expression, is shown in Figure 5. The nucleic acid sequence encoding the light chain of the Fv fragment is 5' to the nucleic acid sequence encoding the heavy chain of the Fv fragment. Each chain has its own OmpA signal peptide, and both chains are driven by a single Lac promoter. The OmpA signal sequence and Lac promoter sequence are provided in references from Movva et al. and Reznikoff et al., respectively, which are hereby incorporated by reference (Movva et al., J. Biol. Chem. 255:27-29, 1980, J. Mol. Biol. 143:317-328 (1980) and Reznikoff et al. (1980) "The Lac Promoter", in The Operon, Miller et al., Eds., Cold Spring Harbor Press, NY). The antibody genes for CHA255 are the same as those used in Example I above. The codons of the light and heavy chain are those obtained from the original mouse antibody sequence. Similarly, the OmpA leader sequence is the native sequence obtained from the OmpA protein nucleic acid sequence as described in Example

pMCHAFvl was constructed from pMINI3 (Figure 5). pMINI3 is a 1.0 kb expression vector which contains a synthetic Lac promoter, supF (derived from tRNA-tyr, Huang et al., supra) as the selectable marker, and a rop- ColE1 origin, obtained from pUC (Pharmacia, Piscataway, N.J.). The supF vectors are designed to be used with commercially available chemically or electro-competent E. coli MC1061/P3 cells (Invitrogen Inc., San Diego, CA). These cells contain amber mutations in both the ampicillin and tetracycline drug resistance genes, located on a P3 incompatibility group plasmid. Thus the P3 plasmid can co-exist with ColE1 incompatibility group plasmids such as pUC. The P3 plasmid is too large to interfere with pUC plasmid purification. Transformants are selected on plates with 25 ug/ml ampicillin and 7.5 ug/ml tetracycline.

Oligonucleotide Synthesis for Wobble Mutagenesis

The two oligonucleotides used to construct the library are shown schematically in Figure 6B. The oligonucleotides are designed to hybridize to opposite DNA strands of the pMCHAFvl template adjacent to the OmpA leader sequence. The resulting DNA and mRNA derived from this pool of mutated oligonucleotides is a library of sequences, all encoding the same OmpA protein sequence. The X in Figure 6B corresponds to the variable positions within the primer population. The sequences are provided as SEQ ID NO: 26 and SEQ ID NO: 27. Here the N corresponds to the X in Figure 6B. Primer oligonucleotides also contain R and Y base designations. The R indicates the incorporation of a purine and the Y indicates the incorporation of a pyrimidine. The limitation of purines or pyrimidines in the third position of the codon ensures that the amino acid sequence is not modified by the incorporation of random nucleotides. Constant regions within the primer are coded by the appropriate base designation. The primers (moving 5' to 3') contain, as indicated, filler sequence, a BsaI class IIS restriction endonuclease recognition site, filler sequence, a BsaI cleavage site that forms the cohesive termini for circularization, a region comprising random base positions in the third position of the nucleotide codon, and a complementary region to anchor the primer to the template during hybridization. Oligonucleotide synthesis was performed on a Milligen/Biosearch 8700 DNA synthesizer (Milligen, Burlington, MA). The mixed base positions were synthesized using a fresh 1:1:1:1 molar mixture of each of the four bases in the U reservoir. The oligonucleotides were made trityl-on and were purified with Nensorb Prep nucleic acid purification columns (NEN-Dupont, Boston, MA) as described by the manufacturer.
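The choice of N, R or Y at a wobble position follows directly from the genetic code, as the Python sketch below illustrates. It assumes the standard genetic code, and the fifteen-base coding sequence is an illustrative input only, not SEQ ID NO: 26 or 27. Isoleucine codons additionally give the IUPAC code H (A, C or T), which is included for completeness.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC codes for the third-base sets that occur in the standard code.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("ACT"): "H", frozenset("ACGT"): "N",
}


def wobble_codon(codon: str):
    """Return (degenerate codon, number of synonymous third-base choices)."""
    codon = codon.upper()
    amino_acid = CODON_TABLE[codon]
    allowed = frozenset(b for b in BASES if CODON_TABLE[codon[:2] + b] == amino_acid)
    return codon[:2] + IUPAC[allowed], len(allowed)


def wobble_degenerate(coding_seq: str):
    """Degenerate third positions only; return (sequence, number of variants)."""
    if len(coding_seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    pieces, complexity = [], 1
    for i in range(0, len(coding_seq), 3):
        degenerate, choices = wobble_codon(coding_seq[i:i + 3])
        pieces.append(degenerate)
        complexity *= choices
    return "".join(pieces), complexity


# Illustrative input encoding Met-Lys-Lys-Thr-Ala.
print(wobble_degenerate("ATGAAAAAGACAGCT"))
```

The second value returned is the number of distinct nucleotide sequences that still encode the same peptide, which gives a quick upper bound on the complexity of a wobble library over the chosen region.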

Amplification conditions and generation of modified template

PCR was performed in a 100 μl volume. Each reaction contained 0.5 μM of each purified primer, 0.5 ng pMCHAFvl template plasmid DNA, 1x Taq buffer, 200 μM of each dNTP and 1 μl Taq polymerase (Perkin-Elmer-Cetus). The thermo-cycling parameters were: 94°C/3 min for 1 cycle; 94°C/1 min, 50°C/ min, 72°C/2 min for 3 cycles; 94°C/1 min, 55°C/1 min, 72°C/ min, with autoextension at 5 sec/cycle for 10 cycles; 94°C/ min, 55°C/1 min, 72°C/3 min, 1 with autoextension at sec/cycle for 12 cycles; followed by one 10 min cycle at 72°C. In a PCR reaction, the primers direct the amplification of a linear DNA sequence of equal length to the template plasmid with additional 11-14 bp extensions at each end of the DNA that include the class IIS restriction sequence.

PCR Product Manipulations

The DNA obtained from 2-4 100 μl PCR reactions was flushed by addition of dNTPs to 200 μM, 50 units DNA Polymerase Klenow fragment and 30 units T4 DNA Kinase and incubated at 37°C for 30 minutes. After phenol/chloroform extraction and precipitation, the DNA was digested with BsaI (New England Biolabs, Beverly, MA). The digested DNA was gel purified, ethanol precipitated, and ligated at low concentration and without polyethylene glycol to favor intramolecular interactions, thus favoring circularization of the nucleic acid as opposed to concatamer formation. The ligation was ethanol precipitated using ammonium acetate, washed twice with 80% ethanol, vacuum dried and resuspended in 20 ul 0.1 x TE (Sambrook et al., supra) for electroporation. After digesting the 12-14 bp overhang with BsaI, the resulting cohesive termini were ligated intramolecularly, and the ligation was electroporated into E. coli for expression analysis.

Electroporation

One microliter amounts of the ligation reaction were electroporated into 20 ul of MC1061/P3 cells (Invitrogen, San Diego, CA) using the Invitrogen electroporator. Cells were plated on 23 x 23 cm plates as described above.

Cell Growth Conditions

For routine cell growth that does not require foreign protein expression, the cells were grown in M9CA media (Merri et al., Proc. Natl. Acad. Sci. (USA) 74:4335-4339, 1979), which is hereby incorporated by reference.

For colony lift screening assays, the cells were plated on 23 x 23 cm plates with CS agar (48 g/l yeast extract, 24 g/l tryptone, 3 g/l NaH2PO4, 3 g/l Na2HPO4, 15 g/l agar) with 0.5 ug/ml isopropylthiogalactoside (IPTG) (Boehringer Mannheim, Indianapolis, IN) for induction of protein expression.

For expression level determination, clones were grown in CS broth with 0.2 mM IPTG in baffled shaker flasks at 250 rpm for 30 hours at 30°C, with a boost of 0.2 volumes of 240 g/l yeast extract and 120 g/l tryptone after 18 hours. The Fv expressing constructs were grown at 30°C. CS broth permits the use of higher levels of IPTG before over-expression of the foreign protein causes bacterial death. Thus, with CS broth most of the Fv protein can be found in the media rather than in the bacterial periplasm.

Size Determination of the Random Library

The molar ratios of fresh bases were reflected accurately in the oligonucleotide pool as determined by the methods of Hermes et al. (Proc. Natl. Acad. Sci. (USA) 87:696-700, 1990), which is hereby incorporated by reference. The ratio of bases in the mixed sites within the PCR product was verified by DNA sequencing a representative sampling of individual clones. The composition of the mixed site residues in the PCR product was 19% G, 31% A, 25% T, 25% C (n=119).

The theoretical maximum complexity of the library is 8 x 10^9 different sequences. The actual size of the library was determined by plating. In a typical electroporation, 5 x 10^5 colony forming units (cfu) were obtained from electroporation of 1 μl of ligation mixture into 20 μl of cells. The ligation contained 0.5 μg of DNA in 20 μl. The library size is thus about 1 x 10^7 and the efficiency was 2 x 10^7 cfu/ug. For this particular example, the screening assay was found to be more limiting than library size.
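The library size and efficiency figures follow from simple proportions, sketched here in Python. The function name is hypothetical, and the plated count is taken as 5 x 10^5 cfu, the value consistent with the stated efficiency of 2 x 10^7 cfu/ug and library size of about 1 x 10^7.

```python
def electroporation_stats(cfu, volume_plated_ul, ligation_dna_ug, ligation_volume_ul):
    """Return (efficiency in cfu per ug DNA, projected library size).

    Assumes the DNA is evenly distributed in the ligation mixture and that
    every microliter of it would transform as well as the plated aliquot.
    """
    dna_plated_ug = ligation_dna_ug * volume_plated_ul / ligation_volume_ul
    efficiency = cfu / dna_plated_ug
    library_size = cfu * ligation_volume_ul / volume_plated_ul
    return efficiency, library_size


# 1 ul of a 20 ul ligation containing 0.5 ug DNA, giving 5 x 10^5 cfu.
efficiency, library_size = electroporation_stats(5e5, 1, 0.5, 20)
print(f"{efficiency:.1e} cfu/ug, {library_size:.1e} recombinants")
```

The projected library size is an upper bound on the number of independent recombinants; it says nothing about how many of them are distinct sequences.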

Colony Screening Assay

Colony lifts of 23 cm x 23 cm plates with 0.3-1 x 1 colonies were prepared using BA83 nitrocellulose filters (Schleicher and Schuell, Keene, NH). The filters were blocked by incubation in 3% non-fat milk in 25 mM Tris-HCl pH 7.5 for 10 minutes, washed with 25 mM Tris, followed by incubation in 25 mM Tris containing 50 uCi of chelated 111Indium or 90Yttrium per filter for 1 hour at room temperature. The filters were then washed with 25 mM Tris for a total of 15 minutes, dried and exposed to Kodak X-omat AR autoradiography film for several hours.

Approximately 5 x 10^5 clones were screened, and a wide range of signals was obtained on the primary screen. Bacterial colonies that corresponded to strong filter signals were purified by replating. These were again assayed for activity. Two colonies with very strong signals were colony purified and reassayed. The expression level of these two clones was about ten times that of the wildtype. Assay design for the expression of other antibody fragments in E. coli is outlined by Skerra et al. (Anal. Biochem. 196:151-155, 1991), which is hereby incorporated by reference.

Elimination of the effect of unintended mutations

With any mutagenesis procedure there is a risk of introducing mutations in areas other than the target. To demonstrate that the observed increase in protein expression was the result of the nucleic acid sequence identified from the selected clone, a 130 bp fragment containing the mutated area was cloned back into wildtype pMCHAFvl DNA. This construct expressed more protein than the wildtype sequence, proving that the 10-fold increase in the level of protein expression as compared with wildtype controls is the result of the mutated sequence.

DNA sequencing

The sequence of the 130 bp fragment containing the mutation that conferred increased protein expression was determined by double-stranded dideoxy sequencing on a Dupont Genesis 2000 automated sequencer using the Dupont Genesis 2000 sequencing kit. The DNA sequence of the 130 bp fragment differed from the wildtype sequence only at the targeted wobble bases, confirming that the amino acid sequence was not altered by the mutagenesis procedure. No mutations outside of the targeted wobble bases were observed. The optimized sequences obtained by this method are provided in Figure 6 and listed as SEQ ID NO: 28 and SEQ ID NO: 29. These sequences can then be further defined to more specifically determine the expression promoting regions contained therein. Therefore SEQ ID NO: 28 and SEQ ID NO: 29 or fragments thereof can be used in subsequent expression systems to promote the expression of the same or different protein.

Fv expression level quantitation

The expression level of Fv fragments was determined by assaying cell free supernatants. Wildtype and purified mutant colonies were grown under expression conditions in CS broth as described above. Dilutions of antibody containing samples were incubated with radiolabelled metal-chelate. After incubation for one hour, the free, unbound metal chelate was separated from the antibody-bound metal chelate by centrifugation through a Millipore ultrafree filter (molecular weight cut-off of 10,000 MW, Millipore, Bedford, MA). Samples of the filtrate and the pre-filtration mixture were counted for radioactivity, yielding a "fraction bound". A standard curve of "fraction bound" versus known amounts of antibody was constructed. The amount of Fv in an unknown sample was determined from the standard curve. The results of the assay indicated that the mutants reproducibly expressed 10 times more active Fv fragment than the original construct.
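Reading an unknown off a standard curve of this kind can be sketched as follows in Python. Everything numerical here is invented for illustration: the standards, the counts and the units are hypothetical, and simple linear interpolation between neighbouring standards is an assumption, since the source does not state how the curve was fitted.

```python
def fraction_bound(prefiltration_counts: float, filtrate_counts: float) -> float:
    """Fraction of label retained by the filter, i.e. antibody-bound.

    Assumes equal volumes of filtrate and pre-filtration mixture were counted
    and that free chelate passes the filter while bound chelate does not.
    """
    return (prefiltration_counts - filtrate_counts) / prefiltration_counts


def amount_from_standard_curve(fraction: float, standards) -> float:
    """Linear interpolation on (amount, fraction_bound) standards.

    The standards must have strictly increasing fraction_bound values.
    """
    points = sorted(standards, key=lambda point: point[1])
    for (a0, f0), (a1, f1) in zip(points, points[1:]):
        if f0 <= fraction <= f1:
            return a0 + (a1 - a0) * (fraction - f0) / (f1 - f0)
    raise ValueError("fraction bound lies outside the standard curve")


# Hypothetical standards: (amount of antibody in arbitrary units, fraction bound).
standards = [(0.0, 0.0), (1.0, 0.2), (2.0, 0.4), (4.0, 0.7)]
sample_fraction = fraction_bound(prefiltration_counts=1000, filtrate_counts=700)
print(sample_fraction, amount_from_standard_curve(sample_fraction, standards))
```

Samples whose fraction bound falls outside the range of the standards raise an error rather than being extrapolated, which mirrors the practice of diluting samples until they fall on the curve.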

The protein sequence of the antibody fragments in th example is not altered by wobble base mutagenesis. Therefo any difference in signal strength in the screening assay due to differences in expression levels. However, t expression level may be affected by the mutation in sever ways. The mRNA stability could be improved by the mutatio Similarly, initiation and translation from the ribosome may improved. Further, protein expression is strongly influenc by the sequence of the first few codons following the " A initiation codon (Bucheler et al., Gene 98:271-276, 199 Therefore, wobble base mutagenesis can potentially influen polypeptide expression in a number of ways depending on whe the mutagenic primers bind to nucleic acid and which rand mutations are conferred upon the sequence.

EXAMPLE IV

In another preferred embodiment of this invention, EIPCR is used to create a promoter library for gene expression in E. coli.

In this particular example of the preparation of a promoter library, Fv fragment expression of the anti-metal chelate antibody (CHA255) is optimized using a population of primers with variable sequences in the promoter region (Figure 7).

Expression Vectors

In this example, the plasmid used is pCCHAVll, a 2.4 kb plasmid containing the Lac promoter followed by an OmpA leader sequence linked to the antibody light chain fragment sequence and followed by an optimized OmpA sequence linked to the antibody heavy chain fragment. Both antibody chain sequences are driven by a single Lac promoter. This optimized OmpA sequence (SEQ ID NO: 26) is derived from Example III. Plasmid pCCHAVll is LacI negative and chloramphenicol resistance gene positive, with a Rop- ColE1 origin. In this example, a second copy of the Lac promoter region is placed in front of the antibody heavy chain fragment sequence. The nucleic acid sequence is provided in Figure 7A (SEQ ID NO: 30) and the inserted promoter sequence is provided in Figure 7B and as SEQ ID NO: 33. The inserted region includes the Lac promoter library region followed by the wildtype Lac operator followed by the ribosome binding site. The sequence including the ribosome binding site is provided in SEQ ID NO: 34.

Oligonucleotide Synthesis

The primers used to create the recombinant promoter library are provided as SEQ ID NO: 31 and SEQ ID NO: 32. SEQ ID NO: 31 directed mutations to the ribosome binding site while SEQ ID NO: 32 directed changes to the Lac promoter region. In Figure 7B the ribosome binding site and the -10 and -35 regions of the Lac promoter are underlined, and the sequences are provided as SEQ ID NO: 34 and SEQ ID NO: 33, respectively. The bold underlining in Figure 7B corresponds to the primer regions in SEQ ID NO: 31 and SEQ ID NO: 32 that are underlined. The underlined portions are those positions along the primer that contain variability. The expected frequency of variability at each nucleotide position is derived from a mixture of 75% template nucleotide and 8.3% of each of the remaining three nucleotides. For example, in SEQ ID NO: 31, the first underlined position is a cytosine. The expected bias of the primer population at this position is 75% C, 8.3% G, 8.3% T, 8.3% A. Libraries were created using primer populations based on SEQ ID NO: 31 and SEQ ID NO: 32. Other libraries were created using one biased primer population while the other member of the primer pair contained no variability. As an example, a recombinant library was created using SEQ ID NO: 31 to prepare a variable first primer pool, while the second primer corresponded exactly with SEQ ID NO: 32 and therefore contained no variability. The library generated from these primers contains mutated sequences at the ribosome binding site and a constant Lac promoter sequence. The oligonucleotides comprise a BsaI restriction endonuclease recognition site, a region of variability reflected in the underlined portion of SEQ ID NO: 31 and SEQ ID NO: 32, and a region complementary to the template.
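The 75%/8.3% bias at each variable position can be modelled as follows. This Python sketch is illustrative only: the function names and the example template are hypothetical, and each variable position is assumed to be doped independently of the others.

```python
import random

BASES = "ACGT"
TEMPLATE_FRACTION = 0.75  # share of the template nucleotide at a doped position


def base_distribution(template_base):
    """Expected nucleotide frequencies at one variable position."""
    other = (1.0 - TEMPLATE_FRACTION) / 3  # about 8.3% for each other base
    return {b: (TEMPLATE_FRACTION if b == template_base else other) for b in BASES}


def sample_primer(template, variable_positions, rng):
    """Draw one primer from the biased population."""
    primer = list(template)
    for i in variable_positions:
        dist = base_distribution(template[i])
        primer[i] = rng.choices(BASES, weights=[dist[b] for b in BASES])[0]
    return "".join(primer)


def expected_mutations(n_variable):
    """Mean number of non-template bases per primer."""
    return (1.0 - TEMPLATE_FRACTION) * n_variable


def fraction_unmutated(n_variable):
    """Share of primers carrying the template base at every variable position."""
    return TEMPLATE_FRACTION ** n_variable


# Hypothetical 8-base template with three doped positions.
example = sample_primer("ACGTACGT", [2, 3, 4], random.Random(0))
```

Under these assumptions, a primer with 12 variable positions carries three substitutions on average, so most library members differ from the template at only a few positions.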

PCR Amplification and Product Manipulation

Sequences were amplified using the conditions outlined in Example III. Following amplification, the nucleic acid was cleaved with BsaI and ligated. Nucleic acid was electroporated into E. coli.
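The BsaI cleavage and ligation step relies on the two primers leaving mutually compatible overhangs. The Python sketch below checks this for the 5' ends of SEQ ID NO: 31 and SEQ ID NO: 32 from the sequence listing. The cut geometry used here (top strand cut one base after the GGTCTC site, leaving a four-base 5' overhang) is stated from general knowledge of BsaI rather than from this example, and the helper names are assumptions.

```python
BSAI_SITE = "GGTCTC"
CUT_OFFSET = 1    # assumed: top strand is cut 1 nt past the recognition site
OVERHANG_LEN = 4  # assumed: cleavage leaves a 4-nt 5' overhang

# First 20 bases of the two promoter library primers (sequence listing).
SEQ_31_5PRIME = "AACTATTGGTCTCAGTGGAA"
SEQ_32_5PRIME = "ATCATTAGGTCTCACCACAC"


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def bsai_overhang(primer):
    """Return the overhang left on the primer strand after BsaI cleavage."""
    start = primer.find(BSAI_SITE)
    if start < 0:
        raise ValueError("no BsaI recognition site in primer")
    cut = start + len(BSAI_SITE) + CUT_OFFSET
    overhang = primer[cut:cut + OVERHANG_LEN]
    if len(overhang) < OVERHANG_LEN:
        raise ValueError("primer too short to contain the overhang")
    return overhang


def ends_are_compatible(primer_a, primer_b):
    """True if the two cleaved ends can anneal and be ligated."""
    return bsai_overhang(primer_a) == reverse_complement(bsai_overhang(primer_b))
```

With these assumptions the two primers yield the overhangs GTGG and CCAC, which are reverse complements of each other, so the cleaved linear product can recircularize on ligation.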

Colony Screening Assay and Identification of Positive Clones

The screening assay is described in Example III. Colonies with increased levels of hapten binding are identified and colony purified. These colonies are expanded and analyzed for the presence of unintended mutations. Optimized promoter sequences are identified by sequencing the expression plasmids from positive colonies.

EXAMPLE V

In yet another preferred embodiment of this invention, EIPCR is employed to create a eukaryotic mutagenesis library.

Similar to EIPCR in E. coli, any region of a eukaryotic vector can be modified. Eukaryotic expression vectors may be modified in regulatory regions or within translated regions of a particular gene. In this example, a retroviral expression vector pLN is used to generate a library of mutations within the ribosome binding site of the Neomycin resistance gene. The ribosome binding site, also known as a Kozak sequence (Kozak, M., Nuc. Acids Res. 12(2):857-72, 1984, which is hereby incorporated by reference), is a highly conserved region in eukaryotic cells comprising the consensus sequence CCACCATG(G).

Expression Vector

The retroviral expression plasmid pLN was obtained from A.D. Miller and is described in a publication by Miller et al. (BioTechniques 7(9):980-990, 1989, which is hereby incorporated by reference). The vector contains two Moloney Murine Leukemia Virus (MoMuLV) long terminal repeats (LTR). Between the LTR regions is the Neomycin resistance gene (Neo^r). The Neo^r ribosome binding site is targeted for library mutagenesis to confer increased resistance to G418 in the eukaryotic cell line NIH 3T3 (ATCC). The plasmid has a final size of 6 kb.

Oligonucleotide Synthesis

Oligonucleotides are prepared that are similar in design to those described for Example I above. The primers are designed to flank the Neo^r ribosome binding site and are substantially complementary to both strands of DNA. A short (4-10 bp) variable region is designed to overlap the ribosome binding site. Thus, the oligonucleotides contain a class IIS recognition site, the variable region, and a twenty base complementary region that anchors the oligonucleotides to the pLN plasmid.
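The primer layout described above, and the library size implied by a fully degenerate variable region, can be sketched as follows. Everything in this Python example beyond the three named elements is an assumption: the 5' clamp and one-base spacer mirror the layout of the BsaI primers in the sequence listing, BsaI is chosen only as an example class IIS enzyme, and the anchor sequence is a placeholder rather than actual pLN sequence.

```python
def build_primer(clamp, recognition_site, spacer, n_variable, anchor):
    """Assemble a primer: clamp, class IIS site, spacer, variable N's, anchor."""
    if not 4 <= n_variable <= 10:
        raise ValueError("variable region is expected to be 4-10 bases")
    if len(anchor) != 20:
        raise ValueError("anchor is expected to be twenty bases")
    return clamp + recognition_site + spacer + "N" * n_variable + anchor


def library_complexity(n_variable):
    """Number of distinct sequences if every variable base is fully random."""
    return 4 ** n_variable


# Hypothetical inputs for illustration only.
EXAMPLE_PRIMER = build_primer(
    clamp="ATTA",                    # assumed 5' clamp
    recognition_site="GGTCTC",       # BsaI, used here as an example enzyme
    spacer="A",                      # assumed spacer before the variable region
    n_variable=6,
    anchor="ACGTACGTACGTACGTACGT",   # placeholder, not pLN sequence
)
```

A 4-base variable region gives 256 possible sequences and a 10-base region gives 1,048,576, which bounds the number of transformants needed to sample the library.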

Amplification Conditions

Reaction tubes are prepared for PCR in a final 100 µl reaction volume. Reaction conditions are optimized from initial reaction conditions as outlined in Example II. Following PCR, the DNA is purified, cleaved with the desired class IIS restriction endonuclease, recircularized and ligated.

Isolation of Packaged Vector

Ligated product from the PCR reaction is electroporated into the helper virus packaging cell line PE501, obtained from A.D. Miller and described by Miller et al., supra. Mutated pLN is transiently packaged into retroviral particles using PE501. Cell supernatant containing viral particles is harvested from the packaging cell line and titered on virus susceptible NIH 3T3 cells (ATCC).

Selection and identification of Mutated Sequences

Colonies expressing mutations are selected with elevated levels of G418, preferably between 0.75 and 2.5 mg/ml. These colonies are expanded, lysed, and, if desired, the DNA is purified. The optimized promoter region is retrieved from the selected cells by PCR. This new Kozak sequence can then be reintroduced into pLN to verify that the new sequence confers elevated G418 resistance. The region is sequenced to identify the selected nucleic acid sequence. The results from this work permit the identification of sequences conferring increased G418 resistance and facilitate the identification of Kozak sequence requirements and the isolation of improved sequences that can be transferred to other constructs to improve the expression of other protein sequences.

It is additionally contemplated that this technology could be applied to any gene in combination with a selectable marker such as Neo^r. Therefore, any gene or portion of a gene can be mutated and initially selected by its resistance to Neomycin. Subsequent selection will be required to distinguish the optimized mutation. Neomycin resistance is just one of a variety of selection systems useful for EIPCR library mutagenesis applications. For example, as a selection procedure, transfected cells can be screened by a Fluorescence Activated Cell Sorter (FACS) and positive colonies expanded from these cells for further analysis.

Thus, EIPCR library mutagenesis is a reliable and efficient method for obtaining optimized nucleic acid sequences. EIPCR reactions have an efficiency of 95% or better in reactions designed to measure the efficiency of mutagenesis. EIPCR library mutagenesis is generally applicable for de novo design or redesign of protein or nucleic acid sequences.

Although the invention has been described with reference to the above examples, it should be understood that various modifications can be made by those skilled in the art without departing from the invention. Accordingly, the invention is limited only by the following claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: STEMMER, WILLEM

(ii) TITLE OF INVENTION: ENZYMATIC INVERSE POLYMERASE CHAIN REACTION

(iii) NUMBER OF SEQUENCES: 32

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: KNOBBE, MARTENS, OLSON & BEAR

(B) STREET: 620 NEWPORT CENTER DRIVE, SIXTEENTH FLOOR

(C) CITY: NEWPORT BEACH

(D) STATE: CALIFORNIA

(E) COUNTRY: UNITED STATES

(F) ZIP: 92660

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: ISRAELSEN, NED A.

(B) REGISTRATION NUMBER: 29,655

(C) REFERENCE/DOCKET NUMBER: HYBRIT.001CP1

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 619-235-8550

(B) TELEFAX: 619-235-0189

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 54 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: AAATCTGGAG CCGGTGAGCG TGGGTCTCGC GGTATCATTG CAGCACTGGG GCCA 54

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: ATTAGGTCTC GGTTCCCGCG GTATCATTGC AGCACT 36

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: AATTGGTCTC GGAACCACGC TCACCGGCTC CAGAT 35

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 54 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

AAATCTGGAG CCGGTGAGCG TGGTTCCCGC GGTATCATTG CAGCACTGGG GCCA 54

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: GGTCTCNNNN N 11

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: GAAGACNNNN NN 12

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: CTCTTCNNNN 10

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: GAATGCN 7

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

ACCTGCNNNN NNNN 14

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: GGATCNNNNN 10

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

GCAGCNNNNN NNNNNNN 17

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

GTCTCNNNNN 10

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

ACTGGN 6

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

GGATGNNNNN NNNNNNNN 18

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

GACGCNNNNN NNNNN 15

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: GGTGANNNNN NNN 13

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: GAAGANNNNN NNN 13

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

GAGTCNNNNN 10

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: GCATCNNNNN NNNN 14

(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: CCTCNNNNNN N 11

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 90 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

AGGAACCAAA CTGACTGTCC TAGGATAGAA GGAGATATAT CATGAAAAAG ACAGCTGGCG 60

CAGGCCGAGG TGACCCTGGT GGAGTCTGGG 90

(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 58 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

ATTAGAAGAC TACTCCNNNN NNNNNNNNNN NNNNNNNGAG GTGACCCTGG TGGAGTCT 58

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 58 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: AATTGAAGAC ATGGAGNNNN NNNNNNNNNN NNNNNNTCCT AGGACAGTCA GTTTGGT 58

(2) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 94 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..94

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:

A GGA ACC AAA CTG ACT GTC CTA GGA CGG AAA TCG GGG CGG TCT ACC 46
Gly Thr Lys Leu Thr Val Leu Gly Arg Lys Ser Gly Arg Ser Thr
1 5 10 15

TCC CCT CTC CCA ATA AAA TTA GGG GAG GTG ACC CTG GTG GAG TCT GGG 94
Ser Pro Leu Pro Ile Lys Leu Gly Glu Val Thr Leu Val Glu Ser Gly
20 25 30

(2) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

Gly Thr Lys Leu Thr Val Leu Gly Arg Lys Ser Gly Arg Ser Thr Ser
1 5 10 15

Pro Leu Pro Ile Lys Leu Gly Glu Val Thr Leu Val Glu Ser Gly
20 25 30

(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 72 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

TCTATAGGTC TCTTTGCNGT NGCNCTNGCN GGNTTYGCNA CNGTNGCNCA RGCNGAGGTG 60

ACCCTGGTGG AG 72

(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

TATTAAGGTC TCAGCAATNG CRATNGCNGT YTTYTTCATG ATATATCTCC TTCTAT 56

(2) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 63 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

ATG AAA AAA ACC GCG ATC GCC ATT GCT GTG GCG CTT GCC 39

MET LYS LYS THR ALA ILE ALA ILE ALA VAL ALA LEU ALA 1 5 10

GGC TTT GCT ACG GTG GCG CAG GCA 63

GLY PHE ALA THR VAL ALA GLN ALA 15 20

(2) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 63 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:

ATG AAA AAA ACT GCA ATT GCG ATT GCT GTT GCT CTT GCT 39

MET LYS LYS THR ALA ILE ALA ILE ALA VAL ALA LEU ALA 1 5 10

GGT TTC GCG ACG GTA GCA CAG GCC 63

GLY PHE ALA THR VAL ALA GLN ALA 15 20

(2) INFORMATION FOR SEQ ID NO:30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:

AAGGAGATAT ATC 13

(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 85 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

AACTATTGGT CTCAGTGGAA TTGTGAGCGG ATAACAATTT CACACAGGAA ACAGCTATGA 60

AAAAAACCGC GATCGCCATT GCTGT 85

(2) INFORMATION FOR SEQ ID NO:32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 109 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

ATCATTAGGT CTCACCACAC AACATACGAG CCGGAAGCAT AAAGTGTAAA GCCTGGGGTG 60

AAAAAAAAAG GCTCCAAAAG GAGCCTTTCT ATCCTAGGAC AGTCAGTTT 109

(2) INFORMATION FOR SEQ ID NO:33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: CACCCCAGGC TTTACACTTT ATGCTTCCGG CTCGTATGTT G 41

(2) INFORMATION FOR SEQ ID NO:34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:

CAGGAAACAG CT 12