Title:
METHOD FOR INTRODUCING DIVERSITY INTO AND ASSEMBLY OF POLYNUCLEOTIDE SEQUENCES
Document Type and Number:
WIPO Patent Application WO/2010/063711
Kind Code:
A1
Abstract:
The invention provides a method for introducing a unique DNA sequence or a mixture of different DNA sequences within regions of selectable size and location of a target DNA vector by generating and using compatible cohesive ends. Specifically the invention provides a methodology for introducing a population of variable DNA sequences within regions of selectable size and location (e.g. segments containing complementarity determining regions in the genes coding for Immunoglobulins) by generating and using compatible cohesive ends.

Inventors:
MONACI PAOLO (IT)
BENDTSEN CLAUS (IT)
LAHM ARMIN (IT)
Application Number:
PCT/EP2009/066135
Publication Date:
June 10, 2010
Filing Date:
December 01, 2009
Assignee:
ANGELETTI P IST RICHERCHE BIO (IT)
MONACI PAOLO (IT)
BENDTSEN CLAUS (IT)
LAHM ARMIN (IT)
International Classes:
C12N15/10; C07K16/00
Domestic Patent References:
WO2008095927A1 2008-08-14
Foreign References:
DE10337407A1 2005-03-10
Other References:
BERGER S L ET AL: "PHOENIX MUTAGENESIS: ONE-STEP REASSEMBLY OF MULTIPLY CLEAVED PLASMIDS WITH MIXTURES OF MUTANT AND WILD-TYPE FRAGMENTS", ANALYTICAL BIOCHEMISTRY, ACADEMIC PRESS INC, NEW YORK LNKD- DOI:10.1006/ABIO.1993.1540, vol. 214, 1 January 1993 (1993-01-01), pages 571 - 579, XP002043107, ISSN: 0003-2697
MARSHALL JACQUELINE J T ET AL: "Restriction endonucleases that bridge and excise two recognition sites from DNA.", JOURNAL OF MOLECULAR BIOLOGY 23 MAR 2007 LNKD- PUBMED:17266985, vol. 367, no. 2, 23 March 2007 (2007-03-23), pages 419 - 431, XP002580023, ISSN: 0022-2836
Attorney, Agent or Firm:
BUCHAN, Gavin (Hoddesdon, Hertfordshire EN11 9BU, GB)
Claims:
WHAT IS CLAIMED IS:

1. A method of introducing diversity at a desired point into one or more target polynucleotide sequences and assembling a library of double-stranded DNA sequences, wherein said method comprises the steps of: a) providing one or more double-stranded polynucleotide sequences each of which is engineered to contain at least one recognition motif for a type II restriction enzyme that is not native to that polynucleotide sequence in a region common to all the polynucleotide sequences targeted for diversification, b) digesting the polynucleotide sequences of step (a) with a type II restriction enzyme that recognizes the recognition motif and, if necessary, one or more further type II restriction enzymes in the same region of the polynucleotides such that a portion of the polynucleotide sequences can be excised to give polynucleotide sequences with overhanging single stranded nucleotide ends of two to seven nucleotides, c) preparing a collection of dsDNA variant fragments having at their 5' and 3' ends overhanging single stranded nucleotide ends that are complementary to the ones generated in step (b), and d) ligating the dsDNA variant fragments of step (c) into the restriction enzyme digested polynucleotide sequences of step (b) to give a diverse library of ds DNA sequences.

2. A method according to claim 1 wherein at least one of the recognition motifs is a recognition motif for a type IIB restriction enzyme that is not native to that polynucleotide sequence.

3. A method according to claim 1 wherein at least one of the recognition motifs is a recognition motif for a type IIS restriction enzyme that is not native to that polynucleotide sequence.

4. A method according to any one of claims 1 to 3 wherein the ds polynucleotide encodes a polypeptide.

5. A method according to claim 4 wherein the polypeptide is an antibody heavy or light chain variable domain.

6. The method according to claim 5 wherein the diversity introduced corresponds to a complementarity determining region (CDR) of the polynucleotide sequences encoding an antibody light or heavy chain variable domain.

7. The method according to claim 5 wherein the complementarity determining region (CDR) is a CDR3 of the polynucleotide sequences encoding an antibody light or heavy chain variable domain.

8. The method of any previous claim wherein the dsDNA variant fragments to be inserted in the target polynucleotide sequences are generated by digestion of ds DNA fragments comprising within their 5' and 3' segment type II restriction enzymes recognition sites.

9. The method of any previous claim wherein the dsDNA variant fragments to be inserted in the target polynucleotide sequences are generated by digestion of ds DNA fragments comprising within their variable regions one or two IIB restriction enzymes recognition sites that generate, on digestion with the appropriate type IIB restriction enzymes, overhanging single stranded nucleotide ends which are complementary to the overhanging ends generated by digestion of the target polynucleotide sequences.

10. The method of claims 1 to 7 wherein the dsDNA variant fragments to be inserted in the target polynucleotide sequences are generated by starting from a collection of single-stranded oligonucleotides containing the desired variability; a reverse primer is annealed to the 3' end of each of the collection of the single-stranded oligonucleotides followed by double-strand elongation with an appropriate DNA polymerase to give the ds DNA variant fragments.

11. The method of claims 1 to 7 wherein the dsDNA variant fragments to be inserted in the target polynucleotide sequences are generated by annealing two complementary collections of oligonucleotides where sequences within each collection of oligonucleotides contains a 5' segment remaining single-stranded after annealing and the 5' segment being complementary to the overhanging ends generated in the target polynucleotide sequences on digestion.

12. The method of claims 1 to 7 wherein the collection of dsDNA variants is prepared using single-stranded oligonucleotides generated by Dimer-based codon synthesis, a reverse primer being annealed to the 3' end of the single-stranded oligonucleotides followed by double-strand elongation using an appropriate DNA polymerase.

13. The method of claims 1 to 7 wherein the collection of dsDNA variants is prepared starting with single-stranded oligonucleotides; a reverse primer is annealed to the 3' end of the single-stranded oligonucleotides followed by double-strand elongation using an appropriate DNA polymerase.

14. The method of claims 1 to 7 wherein the dsDNA variant fragments with two overhanging ends following digestion or annealing are inserted into a library of double-stranded polynucleotide sequences with appropriate complementary overhanging ends generated by PCR amplification of a library of double-stranded polynucleotide sequences with appropriate PCR primers.

15. The method of any previous claim wherein the cleavage by the restriction enzymes results in overhangs of up to 5 bases on the sense strand and/or up to 6 bases on the reverse strand.

16. The method of any previous claim wherein the cleavage by the restriction enzymes results in overhangs of up to 4 bases on the sense strand and/or up to 4 bases on the reverse strand.

Description:
METHOD FOR INTRODUCING DIVERSITY INTO AND ASSEMBLY OF POLYNUCLEOTIDE SEQUENCES

BACKGROUND OF THE INVENTION

Generating diversity

Methods to create diversity in specific regions of a nucleic acid sequence encoding a target protein can be grouped in two classes, depending on whether the process is performed in vivo or in vitro. A first class includes bacterial or other microorganism strains with an inherently increased frequency of mutation. Though their use is straightforward, the rate of mutagenesis is rather low. In addition, mutations are introduced in the bacterial chromosome as well as in the entire plasmid DNA containing the target sequence. As a consequence, target sequence mutants affecting viability of the bacteria are eliminated during the process.

A second class, by far the most popular, comprises enzymatic or synthetic processes to generate sequence diversity. Several PCR-based protocols have been reported. Error-prone polymerase chain reaction (PCR) uses low-fidelity polymerization conditions to introduce point mutations randomly within a nucleic acid sequence. This method can be used to mutagenize one or more nucleic acid sequence fragments, whose sequence can be unknown. However, the mutagenesis rate is too low to allow the introduction of the focused changes that are required for continuous sequence evolution. On the other hand, increasing the number of PCR cycles results in the accumulation of harmful or neutral mutations. In particular, for therapeutic proteins an undesired side effect of this approach can be the generation of point mutations that increase the immunogenicity of the protein. Several variant protocols have been reported where variation in buffer composition alters the mutagenesis rate. Similarly, incorporating nucleotide derivatives can affect the specificity and the efficiency of the process.

DNA shuffling, also known as "sexual PCR", is an alternative method to generate diversity. It is based on in vitro homologous recombination of pools of selected mutant genes by random fragmentation and PCR reassembly. Iterative homologous recombination makes DNA shuffling a powerful protocol, for example for selecting protein mutants with increased enzymatic activity.

An alternative to enzyme-mediated random mutagenesis acting on the DNA sequence is in vitro synthesis of oligonucleotides which, in principle, allows the amino acid composition and relative frequency at each position in the target sequence to be defined. A number of protocols have been reported where only a subset of desired codon triplets is used, eliminating redundancy and the occurrence of stop codons (Virnekas et al. 1994; Kayushin et al. 1996; Neuner et al. 1998). Whilst allowing better control of the diversity introduced, the respective protocols can become relatively complex if the target regions to be modified are not contiguous.

Introducing diversity at specific sites

Several different strategies have been adopted to assemble and clone the diversity generated through the above methods. The simplest approach is direct cloning of a mutagenized sequence into the nucleic acid sequence coding for the target protein, exploiting unique restriction sites compatible with the amino acid sequence of the target protein. This implies that the required restriction site(s) are already present or are engineered into the DNA sequence encoding the target protein without altering its amino acid sequence. When this requirement is fulfilled, this approach is straightforward. In most cases two different restriction enzymes are used which generate incompatible restriction ends. This reduces the frequency of intra-molecular ligation events of the vector in the absence of the mutagenized insert. In addition it also prevents multiple DNA fragments (vector and/or insert) from polymerizing and ligating together.

An effective example of this approach is the modular cloning strategy developed and implemented in WO97/08320 to introduce sequence diversity into the complementarity determining regions (CDR) of the HuCAL Gold™ synthetic phage-displayed antibody libraries. Unique restriction sites were engineered in the region adjacent to each of the six CDRs present in the seven light and seven heavy chain variable domain consensus sequences included in the HuCAL Gold™ library. These restriction sites are located in the region flanking the target antibody variable chain and were chosen to preserve the amino acid sequence of the polypeptide.

Other methods described in the literature to generate diversity using specific restriction enzymes include:

i) ONCL-DYAX WO 01/79481-A2 and Nucleic Acids Research 2005 33(9): e81. A method is described where specificity of cloning is achieved by generating local double-stranded (ds) DNA regions on a single-stranded (ss) DNA. Initially a ss oligonucleotide is synthesized which: (a) hybridizes to a specific complementary sequence in the ssDNA template (where cleavage is desired); and (b) includes a sequence that together with its complement in the ssDNA template forms a restriction endonuclease recognition site. Preferably, the restriction endonuclease recognition site is that of a type IIS restriction endonuclease whose cleavage site is located at a known distance from its recognition site. Upon incubation, the restriction enzyme binds to the dsDNA region and cleaves at the desired location within the dsDNA generated by the ssDNA template and ssDNA oligonucleotide.

ii) WO2004/013328 (COSMIX Molecular Biologicals) describes the recombination of heterologous polynucleotides comprising variant groups and/or regions wherein recombination is performed within one group of polynucleotides and/or between defined regions of polynucleotides by using different restriction sites in different groups and/or regions of polynucleotides. The overall result is therefore the generation, for each heavy chain subclass, of new heavy/light chain combinations, where only the heavy chain CDR3 region has been exchanged. The method therefore allows the complexity of a pre-existing library of antibody Fab fragments to be considerably enlarged by generating new variants where the heavy chain CDR3 region of a particular heavy/light chain combination has been substituted by heavy chain CDR3 regions present in heavy/light chain combinations of the same heavy chain subclass.

iii) WO2001/000816 (Complete Genomics) describes the generation of dsDNA fragments with unique single stranded regions and then mediating the binding between a first and second nucleic acid molecule. In this method, restriction nucleases are used that form non-identical overhangs, e.g. type IIP or IIS restriction endonucleases. This method of mediation has particular applications for effectively identifying and selecting a first nucleic acid molecule fragment and then mediating its binding to a second nucleic acid molecule where this was not previously possible. The intended use is aimed at inserting, without too much explicit control, an insert sequence into a vector without, however, removing the type II restriction site. The method requires the use of adapter sequences and will in general introduce additional segments in addition to the primary insert sequence. This is not useful for generating diversity in a specific region of a polypeptide as the reading frame of the polypeptide will be enlarged and modified and in general also disrupted.

Ligation-PCR is an alternative approach. In this method partially overlapping DNA fragments which include the desired sequence diversity to be introduced at one or more locations in the target peptide are assembled and amplified by PCR. Several amplification rounds are required to efficiently combine the different diversity regions. During this process the imperfect pairing between regions containing nearly complementary diversity creates a number of undesired mutations, including deletions and inversions, which frequently generate out-of-frame mutations. These can be effectively removed by fusing a beta-lactamase coding sequence downstream of the diversity cassette.

Another PCR-based approach uses an inverse-PCR reaction of the entire plasmid in which the 5'-region of one or both PCR primers contains a diversified region. The ends of these amplification products are then trimmed by restriction enzymes to generate compatible ends, intra-molecularly ligated and amplified by transformation into bacteria. However, the efficiency of this approach decreases with the length of the plasmid to be amplified. Furthermore, mutations are expected to occur throughout the plasmid, though they will be readily detected only when leading to frame shifts or stop codons.

These examples clearly identify the need for a novel strategy enabling direct cloning of a pool of DNA fragments anywhere into the DNA sequence of a target protein. The same approach should be applicable to any DNA sequence while preserving the amino acid sequence of the target protein outside the region of interest.

TECHNICAL PROBLEM

Often the goal of research is the identification of a protein derivative with altered properties compared to the wild type (wt) sequence (e.g. binding to a target receptor molecule). In many cases these amino acid changes need to be confined to one or more well defined segments of the target protein. For example, in the case of antibodies, changes are directed to the CDR loops. This task is usually accomplished by generating in vitro a large collection of variants of the target protein DNA sequence. In order to preserve the correct reading frame coded by the DNA sequence, the wt sequence segment has to be replaced precisely by the corresponding mutated sequence segment, the "diversity cassette". In addition, the process has to be capable of efficiently generating large numbers of mutant proteins. Cloning by means of restriction sites is by far the most efficient and reliable approach. However, in many cases, it is difficult or even impossible to identify existing or to introduce new unique restriction sites adjacent to the target segment without also altering the protein sequence outside the region of interest.

The technical problem solved by the present invention is the introduction of potentially very high sequence diversity into a defined segment of a DNA sequence which does not contain useful restriction sites. In addition for a protein-coding DNA sequence, cloning should not alter the amino acid sequence of the protein outside the region of interest.

SUMMARY OF THE INVENTION

We describe a solution to the above problem by using the compatible cohesive ends (CoCohEn) approach described below. CoCohEn is a generally applicable method based on the cloning of a single sequence or the cloning of a diversity cassette into a defined region of a target double-stranded (ds) DNA sequence (Figure 1) by means of compatible cohesive ends. The cohesive ends are, at least in the target vector, generated using restriction enzymes that recognize a defined sequence of the dsDNA and then cleave at a defined distance outside their recognition site. Engineering a vector as illustrated in Figure 2 allows the use of certain restriction enzymes (the same or two different enzymes) to generate the desired compatible cohesive ends in the target region of the dsDNA. The CoCohEn process only transiently introduces an additional sequence segment (the acceptor cassette) within the target region. Indeed, the additional segment is subsequently eliminated upon introduction of the diversity segment. The restriction enzymes used may be, but are not limited to, type IIS or type IIB restriction enzymes that cleave outside their recognition site.

The additional segment containing the restriction sites is introduced at the desired region of the target sequence (or target sequences in the case of libraries of sequences). This is done in order to generate, upon digestion with the corresponding enzyme, compatible cohesive ends which unambiguously anneal to the compatible (complementary) cohesive ends in the region flanking the diversity cassette. Design of the acceptor cassette does require that the restriction sites at the 5'- and 3'-ends are correctly positioned and orientated and that these sites are not present within the rest of the vector. The sequence composition for the rest of the acceptor cassette can be chosen as one sees fit, for example by introducing a unique unrelated restriction site for counter-selection by digestion of the undigested vector. Unlike the method described in WO2001/000816, CoCohEn requires no adapter and introduces no material in addition to what is desired, thereby reducing the risk of mutations.

Thus, the present invention provides a process for introducing a unique DNA sequence, or a mixture of different DNA sequences, in a modular fashion at specific site(s) in a vector by generating and using compatible cohesive ends (Figures 2 and 3). More specifically, the invention provides a protocol, CoCohEn, for introducing a population of variable DNA sequences at target sites (e.g. complementarity determining regions) in polynucleotides coding, for example, for the immunoglobulin variable domains. The disclosed CoCohEn method can be used to introduce a dsDNA fragment with a unique sequence, or a mixture of dsDNA fragments with different sequence and/or length, into an expression vector or, more generally, into any dsDNA segment. In particular, the CoCohEn protocol can be used to create antibody libraries that are suitable either for antibody lead discovery or for optimization (maturation or humanization) of existing antibodies.

As shown herein, the CoCohEn protocol can be used to replace specific regions of a polynucleotide sequence, for example an antibody CDR that is targeted for diversification. More specifically, the protocol can be used to diversify any of the three CDR regions present in an antibody heavy or light chain polypeptide by introducing one or more cassettes encoding diversified polynucleotide sequences into specific locations of a recombinant vector that has been engineered appropriately to accept diversity inserts with compatible cohesive ends.

The disclosed method provides an efficient and simple cloning procedure that is generally applicable to the cloning of a sequence, or a group of sequences, into a vector at one or more desired target position(s). Generally speaking, the disclosed method uses restriction enzymes which cut at a defined distance from their recognition site, with the distance being restriction enzyme specific and independent both of the nucleotide sequence present at the site of restriction and of the sequence present between the restriction enzyme recognition site and the site of the cut. The nucleotide sequence of the resulting overhangs will thus depend on the sequence flanking the recognition site and is appropriately chosen in order to vary from one site to another, thereby generating site specific cohesive ends with distinct single stranded overhangs. In general, an acceptor cassette containing restriction enzyme (RE) recognition motifs is inserted into the vector in order to place the cohesive ends which are generated upon digestion adjacent to the region where diversity needs to be introduced. For example, if diversity needs to be introduced in a region of a protein comprising residues K to K+3, the acceptor cassette will in general be positioned in the vector in such a way that cohesive ends are generated around residue K-1 or K-2 and around K+4 or K+5 (Figure 4). Start and end of the cohesive ends do not need to coincide with a specific nucleotide of the residue's codon triplet, since appropriate design of the diversity cassette ensures that the correct peptide sequence and reading frame are restored after insertion of the diversity cassette.

A diversity cassette, containing a collection of dsDNA fragments harboring diversity in a defined region flanked both at their 5'- and 3'-ends by single-stranded overhangs (cohesive ends) complementary to those present in the acceptor cassette described above, is generated and ligated into the digested vector.
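The positional logic described above can be illustrated with a short in-silico sketch. It assumes the commonly documented Esp3I cut pattern CGTCTC(1/5), that is, one nucleotide past the motif on the recognition strand and five on the opposite strand, leaving a four-nucleotide 5' overhang; this should be checked against the supplier's data before use. The vector sequence, the `TypeIISEnzyme` class and the `find_overhangs` helper are hypothetical names introduced only for illustration and are not part of the disclosed protocol.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class TypeIISEnzyme:
    name: str
    site: str           # recognition motif, written 5'->3'
    top_offset: int     # nt between the motif and the cut on the same strand
    bottom_offset: int  # nt between the motif and the cut on the opposite strand


# Assumption: Esp3I cuts CGTCTC(1/5); verify against the enzyme data sheet.
ESP3I = TypeIISEnzyme("Esp3I", "CGTCTC", 1, 5)


def find_overhangs(top: str, enzyme: TypeIISEnzyme):
    """Return (orientation, start, overhang) for every site on a linear top strand.

    `start` and `overhang` use top-strand coordinates, so the overhang is the
    single-stranded stretch as read 5'->3' on the top strand.
    """
    hits = []
    n = len(enzyme.site)
    # Forward sites cut downstream (to the right) of the motif.
    pos = top.find(enzyme.site)
    while pos != -1:
        a, b = pos + n + enzyme.top_offset, pos + n + enzyme.bottom_offset
        if b <= len(top):
            hits.append(("forward", a, top[a:b]))
        pos = top.find(enzyme.site, pos + 1)
    # Reverse sites (motif on the bottom strand) cut upstream (to the left).
    rc_site = revcomp(enzyme.site)
    pos = top.find(rc_site)
    while pos != -1:
        a, b = pos - enzyme.bottom_offset, pos - enzyme.top_offset
        if a >= 0:
            hits.append(("reverse", a, top[a:b]))
        pos = top.find(rc_site, pos + 1)
    return sorted(hits, key=lambda h: h[1])


# Hypothetical acceptor cassette: two outward-facing Esp3I sites around a stuffer.
vector = ("AAACCC" + "TTGC" + "A" + "GAGACG" + "GGTACC"
          + "CGTCTC" + "T" + "GGAT" + "CCCAAA")
for orientation, start, overhang in find_overhangs(vector, ESP3I):
    print(orientation, start, overhang)
```

In this toy vector the two overhangs (TTGC and GGAT) come from the flanking sequence rather than from the recognition motif, so they can be made different at each site, and both motifs leave with the excised stuffer.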

Methods to generate these compatible cohesive ends in the diversity cassette include but are not limited to introducing type II RE recognition motifs at both the 5'- and the 3'-end of the diversity cassette. When using such motifs, these are localized and oriented to generate, after digestion of the dsDNA, single-stranded overhangs complementary to those generated by the digestion of the acceptor cassette as previously described. The type II RE recognition motifs present within the 5' and 3' flanking region might be recognized by the same type II RE or represent distinct motifs recognized by two different type II REs.

In summary the invention is a method of introducing diversity at a desired point into one or more target polynucleotide sequences and assembling a library of double-stranded DNA sequences, wherein said method comprises the steps of: a) providing one or more double-stranded polynucleotide sequences each of which is engineered to contain at least one recognition motif for a type II restriction enzyme that is not native to that polynucleotide sequence in a region common to all the polynucleotide sequences targeted for diversification, b) digesting the polynucleotide sequences of step (a) with a type II restriction enzyme that recognizes the recognition motif and, if necessary, one or more further type II restriction enzymes in the same region of the polynucleotides such that a portion of the polynucleotide sequences can be excised to give polynucleotide sequences with overhanging single stranded nucleotide ends of two to seven nucleotides, c) preparing a collection of dsDNA variant fragments having at their 5' and 3' ends overhanging single stranded nucleotide ends that are complementary to the ones generated in step (b), and d) ligating the dsDNA variant fragments of step (c) into the restriction enzyme digested polynucleotide sequences of step (b) to give a diverse library of ds DNA sequences.
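Steps (c) and (d) can be sketched in silico as follows. This is a minimal illustration under stated assumptions, not the disclosed laboratory procedure: the vector is taken as already digested into two arms with four-nucleotide 5' overhangs, every sequence is a made-up placeholder, and `ligate_library` is a hypothetical helper name.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def ligate_library(left_arm, left_overhang, right_overhang, right_arm, cassettes):
    """Join every compatible cassette to the digested vector arms.

    left_arm, right_arm: double-stranded parts of the vector (top strand).
    left_overhang, right_overhang: the single-stranded stretches at the two
        vector ends, written in top-strand coordinates.
    cassettes: (top_5prime_overhang, duplex_core, bottom_5prime_overhang)
        tuples, each element written 5'->3' on its own strand.
    Returns the top-strand sequences of the ligation products.
    """
    library = []
    for top_oh, core, bottom_oh in cassettes:
        pairs_left = top_oh == left_overhang
        pairs_right = bottom_oh == revcomp(right_overhang)
        if pairs_left and pairs_right:
            library.append(left_arm + left_overhang + core + right_overhang + right_arm)
    return library


# Hypothetical diversity: two codon choices at each of two positions.
cores = [a + b for a, b in product(["GCT", "TAT"], ["AAA", "GAT"])]
cassettes = [("TTGC", core, revcomp("GGAT")) for core in cores]
cassettes.append(("AAAA", "GCTAAA", revcomp("GGAT")))  # wrong left end, rejected

library = ligate_library("AAACCC", "TTGC", "GGAT", "CCCAAA", cassettes)
print(len(library), "products, e.g.", library[0])
```

Because a cassette is accepted only when both of its ends pair with the vector, fragments with a mismatched overhang drop out, and every accepted product has the same length and therefore the same reading frame outside the diversified core.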

DESCRIPTION OF THE DRAWINGS

Figure 1: Application of the CoCohEn method allows inserting diversity in the region of dsDNA between sites X and Y. The method has the advantage that the location of sites X and Y can be chosen as desired.

Figure 2 A): Generation of vector applicable to CoCohEn cloning. A cassette containing type IIS restriction enzyme recognition sites is inserted into the region of the polynucleotide sequence where diversity needs to be inserted. Recognition sites are placed within the cloning cassette in such a way that the digestion with the type IIS restriction enzymes generates single-strand overhangs flanking the region where diversity needs to be introduced. One or two different type IIS restriction enzymes can be used. A suitable ds diversity cassette containing complementary single-strand overhangs is then ligated into the digested vector.

Figure 2 B): CoCohEn cloning using a cloning cassette containing a type IIB restriction enzyme recognition site. The recognition site is placed in such a way that single-strand overhangs flanking the region where diversity needs to be introduced are generated on digestion. A suitable ds diversity cassette containing complementary single-strand overhangs is then ligated into the digested vector.

Figure 3 A): Generation of a ds diversity cassette for CoCohEn cloning. Flanking the diversity segment are two regions containing type IIS RE recognition sites positioned at the right distance and in the right orientation. One or two different type IIS restriction enzymes can be used. On digestion single-strand overhangs complementary to those present in the digested target vector are generated.

Figure 3 B): Same as Figure 3 A) but using instead a type IIB RE site appropriately placed and oriented in the flanking regions. Suitable cassettes can also be prepared using a combination of type IIS and type IIB restriction enzymes. Also, if compatible with the composition of the desired diversity, the restriction sites can be located within the diversity region.

Figure 3 C): Generation of diversity cassettes using a suitable combination of restriction enzyme sites compatible with the generation of single-strand overhangs complementary to those present in the digested target vector. Also, cassettes might be generated using a combination of approaches A, B and C.

Figure 3 D): Generation of diversity inserts by annealing of complementary single-stranded oligonucleotides. Following annealing single-strand overhangs complementary to those present in the digested target vector are present.

Figure 4: Example illustrating how diversity is generated in a region covering residues K to K+3 of a peptide sequence, including insertion of two additional residues. In the target polynucleotide (vector) the cohesive ends are positioned around residues K-1, K-2 and K+4, K+5. On insertion of the appropriately designed diversity cassette the correct peptide sequence is restored. The precise positioning of the cohesive ends (acceptor cassette) in the target polynucleotide vector can be chosen as appropriate because the design of the diversity cassette can be adjusted accordingly. Also, coincidence of the start and end of the cohesive ends with the start or the end of a residue codon triplet is not required.

Figure 5: Variable domain of the light chain of the vector 3.1 5B. BpiI and KpnI sites are indicated.

Figure 6: Forward (Primer 3.1_Lfor) and reverse (Primer3.1_Lrev) primer used to generate the acceptor cassette. BpiI and KpnI sites are indicated.

Figure 7: dsDNA VLes construct. BpiI and KpnI sites are indicated, as is the Esp3I site.

Figure 8: Acceptor cassette at the light chain CDR3 region of plasmid p3.1_Les obtained by ligation of the BpiI and KpnI digested dsDNA VLes construct into the BpiI and KpnI digested 3.1 5B vector. Esp3I sites of the acceptor cassette are indicated.

Figure 9: Composition of the degenerate oligonucleotides CDR3-L Oligo-S, Oligo-S2 and the reverse primer CDR3-L-S12. The Esp3I sites present in the oligonucleotide sequences are underlined.

Figure 10: PCR product dsDNA Vlib obtained by elongating oligonucleotides CDR3-L-S1 and CDR3-L-S2 with the reverse primer CDR3-L-S12. The Esp3I sites present in the oligonucleotide sequences are underlined.

Figure 11: Cloning of the dsDNA Vlib Insert insert into the Esp3I digested p3.1_Les vector. The acceptor cassette in p3.1_Les is replaced by the insert dsDNA Vlib Insert. Amino acid diversity present at the desired site in the resulting ligation product p3.1_Vlib is indicated.

Figure 12: Observed and expected amino acid residue frequency in p3.1_Vlib.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides: A method of introducing diversity at a desired point into one or more target polynucleotide sequences and assembling a library of double-stranded DNA sequences, wherein said method comprises the steps of: a) providing one or more double-stranded polynucleotide sequences each of which is engineered to contain at least one recognition motif for a type II restriction enzyme that is not native to that polynucleotide sequence in a region common to all the polynucleotide sequences targeted for diversification, b) digesting the polynucleotide sequences of step (a) with a type II restriction enzyme that recognizes the recognition motif and, if necessary, one or more further type II restriction enzymes in the same region of the polynucleotides such that a portion of the polynucleotide sequences can be excised to give polynucleotide sequences with overhanging single stranded nucleotide ends of two to seven nucleotides, c) preparing a collection of dsDNA variant fragments having at their 5' and 3' ends overhanging single stranded nucleotide ends that are complementary to the ones generated in step (b), and d) ligating the dsDNA variant fragments of step (c) into the restriction enzyme digested polynucleotide sequences of step (b) to give a diverse library of ds DNA sequences.

Recognition motifs for any type II restriction enzymes might be used provided that, on digestion with the appropriate type II restriction enzymes, overhanging single-stranded ends are generated which are complementary to the cohesive ends generated in the target polynucleotide sequences on digestion of the acceptor cassette. However, the use of at least one type IIB or type IIS restriction enzyme is preferred in the methods of the present invention.

If a type IIS restriction enzyme is used in the methods of the present invention, then preferably there are two recognition motifs for restriction enzymes in the polynucleotide sequence, at least one of which is not native to that polynucleotide sequence, and preferably there are at least two recognition motifs for type IIS restriction enzymes that are not native to that polynucleotide sequence. However, if a type IIB restriction enzyme is used in the methods of the present invention, then only one restriction enzyme is necessary, together with a recognition motif for that restriction enzyme. Suitably the ds polynucleotide encodes a polypeptide, for example a polypeptide which is an antibody variable heavy or light chain.

The dsDNA variant fragments to be inserted in the polynucleotide sequence may be generated by digestion of dsDNA fragments comprising within their 5' and 3' segments type II restriction enzyme recognition sites. Thus, the dsDNA variant fragments to be inserted in the polynucleotide sequence may be generated by digestion of dsDNA fragments comprising within their variable regions one or two type IIB restriction enzyme recognition sites that generate, on digestion with the appropriate type IIB restriction enzymes, cohesive ends which are complementary to the overhanging ends generated by digestion of the polynucleotide sequence. The dsDNA variant fragments to be inserted in the polynucleotide sequence may also be generated by starting from a collection of single-stranded oligonucleotides containing the desired variability; a reverse primer is annealed to the 3' end of each of the collection of single-stranded oligonucleotides, followed by double-strand elongation with an appropriate DNA polymerase to give the dsDNA variant fragments. Oligonucleotides of the collection contain a 5' segment complementary to the respective cohesive end generated at the 5' end of the acceptor cassette. The reverse primer contains a 5' segment complementary to the respective cohesive end generated at the 3' end of the acceptor cassette. After annealing the reverse primer to the collection of single-stranded oligonucleotides, the 5' segments of the oligonucleotides and of the reverse primer, which represent the complement of the respective cohesive ends of the acceptor cassette, are protected by addition of appropriate short single-stranded RNA molecules before initiation of the polymerase reaction. After completion of the polymerase reaction, the RNA components of the resulting double strand are digested by RNase, thus generating dsDNA variant fragments with cohesive ends complementary to the cohesive ends generated in the target polynucleotide sequences on digestion of the acceptor cassette.

Alternatively, the dsDNA variant fragments to be inserted in the polynucleotide sequence may be generated by annealing two complementary collections of oligonucleotides, where the sequences within each collection of oligonucleotides contain a 5' segment that remains single-stranded after annealing, the 5' segment being complementary to the cohesive ends generated in the target polynucleotide sequences on digestion.

The dsDNA variant fragments with two cohesive ends following digestion or annealing are inserted into a library of double-stranded polynucleotide sequences with appropriate complementary cohesive ends generated by PCR amplification of a library of double-stranded polynucleotide sequences with appropriate PCR primers.

The collection of dsDNA variants may also be prepared using single-stranded oligonucleotides generated by Dimer-based codon synthesis, a reverse primer being annealed to the 3' end of the single-stranded oligonucleotides so prepared followed by double-strand elongation using an appropriate DNA polymerase.

The collection of dsDNA variants may be prepared starting with single-stranded oligonucleotides; a reverse primer is annealed to the 3' end of the single-stranded oligonucleotides followed by double-strand elongation using an appropriate DNA polymerase.

The method described above wherein digestion by the restriction enzymes results in overhangs of 2 to 5 bases on the sense strand and/or 2 to 6 bases on the reverse strand, suitably overhangs of 2 to 4 bases on the sense strand and/or 2 to 4 bases on the reverse strand, for example overhangs of 4 bases.

In a particular embodiment, the polynucleotide sequence encodes an antibody variable chain, and the region targeted for diversification is one of the three complementarity determining regions (i.e., CDR1, CDR2 or CDR3). More specifically, in a particular embodiment the disclosed method is used to introduce diversity into the region of a polynucleotide sequence encoding the heavy or light chain CDR3 region. A skilled artisan will readily appreciate that the disclosed methodology can be used to introduce diversity into any one of the three CDR regions present in a parental antibody variable heavy (VH) or variable light (VL) chain. In many cases, diversity needs to be introduced at various sites in a protein (e.g. the CDR regions of an antibody). Using the CoCohEn approach, we inserted the diversity cassettes by cloning one cassette per round of cloning. The reason for following this strategy is the much lower efficiency of the ligation step when three or more different fragments are present during ligation as compared to the case involving only two fragments. Attempts to first ligate two insert segments and then ligate the product into the vector rely on the fidelity of the ligation between the two insert fragments and suffer from reduced efficiency as the size of the ligated fragment increases. In addition, it is important to remember that the complexity generated during each cloning step must be conserved, since the material produced by a cloning step becomes the template into which diversity needs to be introduced during the subsequent cloning step. For example, after having inserted a cassette of diversity C in a first step, the population thus generated has to be submitted to the next cloning step with a number of clones N > 10C to ensure the presence of the majority of the clones generated during the first step. As a consequence, multiple diversity cassettes are usually introduced in order of increasing complexity.
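The N > 10C rule and the ordering by increasing complexity can be illustrated with a short calculation. The following is a minimal sketch only: the cassette names and diversities are hypothetical, and treating the complexity carried forward as the product of the diversities inserted so far is an assumption of this sketch (an upper bound), not a statement from the description.

```python
def min_clones(diversity, factor=10):
    """Smallest whole number of clones N that satisfies N > factor * diversity."""
    return factor * diversity + 1


def cloning_plan(cassettes):
    """Order cassettes by increasing diversity and list, for each round,
    (cassette name, cumulative complexity, clones to carry forward).

    Assumption of this sketch: the cumulative complexity after a round is the
    product of the diversities of all cassettes inserted so far.
    """
    plan = []
    cumulative = 1
    for name, diversity in sorted(cassettes.items(), key=lambda kv: kv[1]):
        cumulative *= diversity
        plan.append((name, cumulative, min_clones(cumulative)))
    return plan


if __name__ == "__main__":
    # Hypothetical diversities, for illustration only.
    example = {"CDR3": 10**4, "CDR1": 10**2, "CDR2": 10**3}
    for name, complexity, clones in cloning_plan(example):
        print(f"{name}: complexity {complexity}, carry forward at least {clones} clones")
```

Inserting the least diverse cassette first keeps the number of clones that must be carried through the early rounds as small as possible.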

As used herein the term "complementary" refers to specific base recognition via, for example, base-base complementarity. However, complementary as referred to herein also includes any Watson-Crick base-pairing of nucleoside analogs, e.g. deoxyinosine, which are capable of specific hybridization to the C base, and other analogs which result in such specific hybridization, e.g. PNA, DNA and their analogs. Complementarity of one single-stranded region to another is considered to be sufficient when, under the conditions used, specific binding is achieved. Thus in the case of long single-stranded regions some lack of base-base specificity, e.g. presence of a mismatch, may be tolerated (e.g. if one base in a series of 10 bases is not complementary). Single-stranded regions containing such small mismatches which do not affect the ultimate binding and ligation of the complementary single-stranded region are considered to be complementary for the purposes of this invention. The complementary single-stranded regions may retain portions, after binding, which remain single-stranded, e.g. when single-strand overhangs of different length are used. In these cases, prior to ligation, missing bases may be filled in, e.g. using Klenow fragment or other appropriate techniques as necessary.
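The definition of complementarity above, including the tolerance for a small number of mismatches, can be expressed as a simple check. The following is an illustrative sketch only: the function names and the `max_mismatches` parameter are hypothetical, only standard A/C/G/T bases are handled (no nucleoside analogs), and overhangs of different length, which the text also allows, are not modelled.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(overhang_a, overhang_b):
    """Number of non-pairing positions when two equal-length single-stranded
    overhangs, both written 5'->3', are annealed antiparallel."""
    if len(overhang_a) != len(overhang_b):
        raise ValueError("this sketch only handles overhangs of equal length")
    target = reverse_complement(overhang_b)
    return sum(a != b for a, b in zip(overhang_a.upper(), target))


def is_complementary(overhang_a, overhang_b, max_mismatches=0):
    """True if the overhangs pair with at most max_mismatches mismatches."""
    return count_mismatches(overhang_a, overhang_b) <= max_mismatches


if __name__ == "__main__":
    print(is_complementary("AATG", "CATT"))                    # perfect pairing
    print(is_complementary("AATG", "CATA"))                    # one mismatch, rejected
    print(is_complementary("AATG", "CATA", max_mismatches=1))  # one mismatch, tolerated
```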

TERMS

Ab: Antibody: an immunoglobulin. The term also covers any protein having a binding domain which is homologous to an immunoglobulin. A few examples of antibodies within this definition are the Fab, F(ab')2, scFv, Fv, dAb and Fd fragments.

Antisense strand: The lower strand of ds DNA as usually written. In the antisense strand, 3'-TAC-5' would correspond to a Met codon in the sense strand.

Acceptor cassette: A region of a polynucleotide sequence targeted for diversification that has been engineered to comprise at least two restriction enzyme recognition motifs. The restriction enzyme recognition motifs are placed such that single-stranded overhangs are generated upon digestion at the boundaries of the acceptor cassette. The latter is eliminated upon digestion.

CDR: Complementarity determining region.

CoCohEn: Compatible Cohesive Ends.

Complementary: Refers to specific base recognition via, for example, base-base complementarity.

DNA: Deoxyribonucleic acid.

ds: double-stranded

Forward primer: A "forward" primer is complementary to a part of the antisense strand and primes for synthesis of a new sense-strand molecule.

Forward strand: The upper strand of ds DNA as usually written. In the forward strand, 5'-ATG-3' codes for Met.

Library: A collection of ds DNA sequences.

PCR: Polymerase chain reaction.

PNA: Peptide nucleic Acid.

Primer: A primer is an oligonucleotide complementary to a part of the antisense or sense strand and primes for synthesis of a new sense-or anti-sense strand molecule.

Reverse primer: A "reverse" primer is complementary to a part of the sense strand and primes for synthesis of a new antisense-strand molecule.

Reverse strand: The lower strand of ds DNA as usually written. In the reverse strand, 3'-TAC-5' would correspond to a Met codon in the sense strand.

RE: Restriction enzyme.

Overhang: A segment of single-stranded DNA generated by digesting double-stranded DNA with restriction enzymes.

Sense strand: The upper strand of ds DNA as usually written. In the sense strand, 5'-ATG-3' codes for Met.

ss: single-stranded.

VH: A variable domain of an immunoglobulin heavy chain.

VL: A variable domain of an immunoglobulin light chain.

wt: wild type.

Type IIS and IIB restriction enzymes

Restriction enzymes (REs) are endonucleases which recognize a specific, generally consecutive, sequence of bases in a double-stranded (ds) DNA molecule, known as the recognition motif. REs are traditionally classified into three types on the basis of their subunit composition, cleavage position, sequence specificity and cofactor requirements:

Type I restriction enzymes bind to the recognition site and then cut at a varying distance from the recognition sequence.

Type II restriction enzymes bind at a recognition site and then cleave the DNA molecule at a site within the recognition site or at a defined distance outside but close to it.

Type III enzymes cleave outside of their recognition sequences and require two such sequences to be present in opposite orientations within the same DNA molecule in order to accomplish cleavage. They rarely give complete digests.

Cloning methods have used restriction enzymes to generate nucleic acid fragments suitable for insertion into vectors and to prepare vectors to accept a nucleic acid sequence. Typically, restriction enzymes are used to cleave vectors to produce complementary terminal sequences relative to the terminal sequences present in the nucleic acid sequences which are being introduced into the vector. Most commonly, restriction enzymes cut palindromic sequences which are characterized by generating identical overhangs. The subsequent ligation of nucleic acid fragments that have been cut with the same restriction endonucleases as the vector results in the formation of a new recombinant nucleic acid sequence.

However, cloning strategies which rely on the use of one endonuclease that produces identical overhangs are relatively inefficient, owing to the formation of ligation products containing multiple fragments and to the inability to control the orientation of the segment to be inserted. In these cases, additional steps are required to separate and recover the desired ligation product.
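The orientation problem described above follows from the fact that a palindromic overhang is its own reverse complement. The following minimal sketch illustrates this under simplifying assumptions: each cohesive end is represented only by the 5'->3' sequence of its single-stranded overhang, all overhangs are of the same polarity, and the non-palindromic overhang sequences in the example are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def anneals(overhang_a, overhang_b):
    """True if two overhangs (both written 5'->3') are perfectly complementary."""
    return overhang_a.upper() == reverse_complement(overhang_b)


def allowed_orientations(vector_ends, insert_ends):
    """Return the orientations in which an insert can ligate into a vector.

    vector_ends and insert_ends are (left, right) tuples of overhang sequences.
    Flipping the insert swaps which of its overhangs faces which vector end.
    """
    v_left, v_right = vector_ends
    i_left, i_right = insert_ends
    orientations = []
    if anneals(v_left, i_left) and anneals(v_right, i_right):
        orientations.append("forward")
    if anneals(v_left, i_right) and anneals(v_right, i_left):
        orientations.append("reverse")
    return orientations


if __name__ == "__main__":
    # EcoRI (G^AATTC) leaves the palindromic 5' overhang AATT on every end.
    print(allowed_orientations(("AATT", "AATT"), ("AATT", "AATT")))
    # Two distinct, non-palindromic overhangs (hypothetical sequences).
    print(allowed_orientations(("GCAA", "CTGA"), ("TTGC", "TCAG")))
```

With identical palindromic ends both orientations are possible, whereas two distinct non-palindromic overhangs admit only one.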

The property of type II restriction enzymes to bind a single, specific recognition sequence and cleave the DNA molecule at a specific site has promoted their extensive use in recombinant DNA technology. Type II REs can be grouped into various subclasses according to the following properties: recognition of symmetric (e.g. EcoRI, GAATTC) or asymmetric (e.g. BbvCI, CCTCAGC) DNA sequences; binding to contiguous sequences in which the two half-sites of the recognized sequence are adjacent (e.g. EcoRI, GAATTC) or to discontinuous sequences in which the half-sites are separated (e.g. BglI, GCCNNNNNGGC); cleavage of DNA within or outside their recognition sequence (e.g. HindIII or FokI, respectively); the type of cleavage (blunt, 3'- or 5'-protruding); and the distance of the cleavage site from the recognition site. Type IIP REs, which include most of the type II REs, recognize palindromic DNA sequences and cleave both DNA strands within the target at symmetrical positions. Type IIA REs recognize asymmetric DNA sequences and cleave within or outside of the recognition sequence. Type IIS REs are those type IIA REs that bind an asymmetric recognition sequence and cleave at least one strand of the DNA duplex outside of the recognition sequence. Type IIB REs cleave on both sides of the recognition sequence.

Type IIS endonucleases exhibit no specificity with respect to the sequence that is cut and they can therefore generate single-strand overhangs with any base composition. Depending on the enzyme, cleavage with IIS enzymes results in overhangs of various lengths, e.g. from -5 to +6 bases in length (negative numbers indicate overhangs on the sense strand, positive numbers overhangs on the reverse strand). Preferably for performing the method of the invention, enzymes are chosen which generate overhangs of 3-6 bases, e.g. 4 bases. Suitable restriction enzymes for use in the invention include, but are not limited to, enzymes which produce 4 base overhangs at the 3' end of the recognition site: BstXI; 5 base overhang at the 3' end: AloI, BaeI, BplI, Bsp24I; 6 base overhang at the 3' end: CjeI, CjePI, HaeIV; 4 base overhang at the 5' end: AceIII, Acc36I, Alw26I, AlwXI, Bbr7I, BbsI, BbvI, BbvII, Bbv16II, Bli736I, BpiI, BpuAI, BsaI, Bsc91I, BseKI, BseXI, BsmAI, BsmBI, BsmFI, Bso31I, Bsp423I, BspBS31I, BspIS4I, BspLU11III, BspMI, BspST5I, BspTS514I, Bst12I, Bst71I, BstBS32I, BstGZ53I, BstTS5I, BstOZ616I, BstPZ418I, Eco31I, EcoA4I, EcoO44I, Esp3I, FokI, PhaI, SfaNI, Sth132I, StsI; and 5 base overhang at the 5' end: HgaI.

Type IIB endonucleases exhibit no specificity with respect to the sequence that is cut and they can therefore generate single-strand overhangs with any base composition. Depending on the enzyme, cleavage with IIB enzymes results in overhangs of various lengths, e.g. from -5 to +6 bases in length (negative numbers indicate overhangs on the sense strand, positive numbers overhangs on the reverse strand). Preferably for performing the method of the invention, enzymes are chosen which generate overhangs of 3-6 bases, e.g. 4 bases. Suitable restriction enzymes for use in the invention include, but are not limited to, FalI, BcgI, BplI, AlfI, BdaI, BaeI, BsaXI, PpiI.
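Because a type IIS enzyme cuts outside its recognition motif, the overhang sequence is set by the flanking bases rather than by the motif itself. The following minimal sketch illustrates this for Esp3I, which recognizes CGTCTC and cuts 1 base downstream on the top strand and 5 bases downstream on the bottom strand, leaving a 4-base 5' overhang. The function names and the example sequence are hypothetical and do not reproduce any construct of the Example.

```python
SITE = "CGTCTC"     # Esp3I recognition motif as read on the top strand
SITE_RC = "GAGACG"  # the same motif located on the bottom strand
OVERHANG_LEN = 4


def find_all(seq, motif):
    """Yield every start index of motif in seq, including overlapping hits."""
    start = seq.find(motif)
    while start != -1:
        yield start
        start = seq.find(motif, start + 1)


def esp3i_overhangs(top_strand):
    """Return (orientation, start index, overhang) for each Esp3I site.

    The overhang is given in top-strand coordinates and written 5'->3'.
    Sites too close to an end of a linear sequence are skipped.
    """
    seq = top_strand.upper()
    hits = []
    for i in find_all(seq, SITE):
        start = i + len(SITE) + 1           # skip one base after the motif
        if start + OVERHANG_LEN <= len(seq):
            hits.append(("forward", start, seq[start:start + OVERHANG_LEN]))
    for i in find_all(seq, SITE_RC):
        start = i - 1 - OVERHANG_LEN        # overhang ends one base before the motif
        if start >= 0:
            hits.append(("reverse", start, seq[start:start + OVERHANG_LEN]))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Hypothetical acceptor cassette: two outward-facing sites around a stuffer.
    cassette = "AAAA" + "GCTT" + "A" + "GAGACG" + "TTTTTT" + "CGTCTC" + "A" + "CAGG" + "AAAA"
    for orientation, start, overhang in esp3i_overhangs(cassette):
        print(orientation, start, overhang)
```

In this arrangement both recognition motifs lie within the excised segment, so the remaining ends carry only the freely chosen 4-base overhangs.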

EXAMPLE

The following example serves to illustrate the introduction of diversity into polynucleotide sequences as claimed in the present invention. It should not, however, be construed as the only embodiment that is considered as the invention.

INTRODUCING DIVERSITY AT SPECIFIC SITE BY THE COMPLEMENTARY COHESIVE ENDS PROTOCOL (CoCohEn)

We evaluated the efficiency of the CoCohEn protocol by cloning a variable sequence in the CDR3-L region of an antibody fragment. The p3.1_5B vector selected from the PDL1 mini library (obtained from P. Luo, Abmaxis) expresses the 3.1_5B Fab. The VL sequence from the p3.1_5B vector is reported in Figure 5.

A PCR elongation reaction mix was assembled containing 10 μg of primers 3.1_Lfor and 3.1_Lrev (Table I) and 2 U Phusion DNA polymerase in 1x Phusion buffer (NEB) in a final volume of 50 μL. The reaction was thermal-cycled according to the following profile: 95°C for 30", 72°C for 1'. The dsDNA VLes construct thus obtained contains two Esp3I sites arranged as reported in Figure 7. The dsDNA VLes was digested with BpiI and KpnI restriction enzymes, purified using the Qiaquick PCR kit and ligated overnight at 16°C to the p3.1_5B vector, which had been digested with the same enzymes and 5'-dephosphorylated with Calf Intestinal Phosphatase (Boehringer). The ligation reaction was transformed into Top 10 electro-competent cells (Invitrogen), plated on 100 μg/mL Amp, 2% glucose, 2xTY (2xTYAG) and grown overnight at 30°C. The resulting plasmid is referred to as p3.1_Les (Figure 8).

Design of degenerated oligonucleotides.

CDR3-L.S1 and CDR3-L.S2 oligonucleotides, together with primer CDR3-L.S12, were designed and synthesized according to the sequence reported in Figure 9. The oligos CDR3-L.S1 and CDR3-L.S2 were synthesized incorporating an equimolar ratio of two or more dNTPs at 7 positions, which creates 128 variants (2 x 2^6). CDR3-L.S1 and CDR3-L.S2 oligonucleotides and primer CDR3-L.S12 were synthesized 5'-biotinylated.

Processing of degenerated oligonucleotides.

The CDR3-L.S12 primer was annealed to its complementary sequence in oligonucleotides CDR3-L.S1 and CDR3-L.S2 and elongated. A reaction mix was assembled containing 600 pmoles of each oligo, 0.25 mM dNTPs, 1x PCR buffer and 2 U Phusion DNA polymerase, and incubated at 95°C for 30", 43°C for 1', 72°C for 30". The PCR product is referred to as dsDNA_Vlib (Figure 10).

Introducing sequence diversity for CDR3-L in 3.1_Les.

The dsDNA_Vlib product was digested with Esp3I and loaded onto a Streptavidin-agarose column. The flow-through was analyzed on a 2.5% agarose gel to confirm that digestion was effectively accomplished. Digestion removed the flanking regions and generated dsDNA_Vlib_Insert with 4-base 5'-overhanging ends at both extremes (Figure 11). At the same time the vector 3.1_Les was digested with Esp3I restriction enzyme, creating two 4-base 5'-overhanging ends compatible with those present in dsDNA_Vlib_Insert. Vector and insert were ligated overnight, transformed into electro-competent Top 10 bacterial cells, plated onto 2xTYAG plates and grown overnight at 30°C. The cloning strategy adopted is schematically illustrated in Figure 11.

Analysis of the clones

Ninety-five clones derived from the ligation product p3.1_Vlib were sequenced with the following results:

• 3 sequences retained the original Esp3I cloning sites

• the remaining clones displayed the expected inserted variable sequence

Figure 12 summarizes the data and compares the observed against the expected frequency of residues at each position.

In some cases the observed frequency of a variant differs appreciably from the expected value. Similar experiments in which dsDNA_Vlib was cloned blunt into a different plasmid yielded similar results. These data indicate that the incorporation of the nucleotides in the synthesis process is responsible for these variations.
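The comparison of observed against expected frequencies can be sketched as follows. This is an illustrative outline only: the degenerate design and the clone sequences are hypothetical and do not reproduce the oligonucleotides of Figure 9 or the data of Figure 12, the comparison is made at the nucleotide level, and the expected values assume equimolar incorporation at each degenerate position.

```python
from collections import Counter
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_variants(design):
    """Number of distinct sequences encoded by a degenerate design."""
    return prod(len(IUPAC[code]) for code in design.upper())


def expected_frequencies(design):
    """Per-position base frequencies, assuming equimolar incorporation."""
    return [
        {base: 1 / len(IUPAC[code]) for base in IUPAC[code]}
        for code in design.upper()
    ]


def observed_frequencies(clones):
    """Per-position base frequencies in a list of equal-length clone sequences."""
    length = len(clones[0])
    if any(len(clone) != length for clone in clones):
        raise ValueError("all clone sequences must have the same length")
    total = len(clones)
    return [
        {base: n / total for base, n in Counter(clone[pos] for clone in clones).items()}
        for pos in range(length)
    ]


if __name__ == "__main__":
    # Seven two-fold degenerate positions give 2**7 = 2 x 2**6 = 128 variants.
    print(count_variants("RRRRRRR"))
    design = "RA"                     # hypothetical design
    clones = ["AA", "GA", "GA", "GA"]  # hypothetical sequencing results
    for pos, (exp, obs) in enumerate(zip(expected_frequencies(design),
                                         observed_frequencies(clones))):
        print(pos, exp, obs)
```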




 