Title:
ENANTIOMERIC SCREENING PROCESS, AND COMPOSITIONS THEREFOR
Document Type and Number:
WIPO Patent Application WO/1997/035194
Kind Code:
A2
Abstract:
The present invention makes available a powerful directed approach for identifying enantioselective compounds which bind to biological targets. As a general overview, the present invention relates, in one aspect, to a method for identifying compounds which interact with a target molecule, by (i) contacting a screening molecule with a variegated compound library, wherein the screening molecule comprises the target molecule, or the enantiomer thereof if the target molecule is chiral; (ii) selecting, from the library, compounds which have a desired interaction with the target molecule; and (iii) testing the ability of the enantiomer of a compound selected in step (ii) to interact with the target molecule.

Inventors:
FORSTER ANTHONY C (US)
Application Number:
PCT/US1997/004176
Publication Date:
September 25, 1997
Filing Date:
March 21, 1997
Assignee:
HARVARD COLLEGE (US)
FORSTER ANTHONY C (US)
International Classes:
C07K1/04; C07K14/00; C12N15/10; C40B30/04; C40B40/02; G01N33/68; C07B61/00; (IPC1-7): G01N33/53; C07K1/00; C07K2/00; G01N33/543; G01N33/68
Domestic Patent References:
WO1996034879A1 1996-11-07
WO1994016332A1 1994-07-21
WO1995027072A1 1995-10-12
WO1992009690A2 1992-06-11
Claims:
I claim:
1. A method for identifying compounds which interact with a target molecule, comprising the steps of: (i) contacting a screening molecule with a variegated compound library, wherein said screening molecule comprises said target molecule, or the enantiomer thereof if said target molecule is chiral; (ii) selecting, from said library, compounds which have a desired interaction with said target molecule; and (iii) testing the ability of the enantiomer of a compound selected in step (ii) to interact with the target molecule.
2. The method of claim 1, wherein said target molecule is chiral, and said screening molecule comprises the enantiomer of said target molecule.
3. The method of claim 2, wherein said target molecule comprises an L-enantiomer polypeptide.
4. The method of claim 2, wherein said target molecule comprises a D-enantiomer nucleic acid.
5. The method of claim 2, wherein said target molecule comprises a D-enantiomer carbohydrate.
6. The method of claim 2, wherein said target molecule comprises a naturally occurring stereoisomer of a non-polymeric molecule selected from the group consisting of a steroid, an inositol, a lipid soluble vitamin, a terpene, an acetogenin, a chiral neurotransmitter, and a transition state analog.
7. The method of claim 1, wherein said target molecule is achiral.
8. The method of claim 1, wherein said variegated compound library is selected from the group consisting of a peptide library, a nucleic acid library and a carbohydrate library.
9. The method of claim 1, wherein said variegated compound library is a small organic molecule library.
10. The method of claim 1, wherein said variegated compound library is a natural product extract library.
11. A method for identifying ligands for a target protein, comprising the steps of (i) combining a D-enantiomer of a target protein (a D-target protein), and a variegated compound library; (ii) selecting one or more compounds from the library which have a desired binding interaction with the D-target protein; and (iii) testing the ability of the enantiomer of a compound selected in step (ii) to interact with the target protein.
12. The method of claim 11, wherein said variegated compound library is selected from the group consisting of a nucleic acid library and a carbohydrate library.
13. The method of claim 11, wherein said variegated compound library is a small organic molecule library.
14. The method of claim 11, wherein said variegated compound library is a natural product extract library.
15. The method of claim 11, wherein said variegated compound library is a peptide library.
16. The method of claim 15, wherein the peptide library is a phage display library.
17. The method of claim 15, wherein the peptide library is a bacterial cell-surface display library or a spore display library.
18. The method of claim 15, wherein the peptide library is a collection of synthetic peptides.
19. The method of claim 1, wherein the compounds of the library are free in solution.
20. The method of claim 18, wherein the compounds of the library are linked to an insoluble support.
21. The method of claim 15, wherein each compound of the library is associated with an index providing the molecular identity of the compound associated therewith.
22. The method of claim 11, wherein the target protein is a receptor.
23. The method of claim 11, wherein the target protein is an enzyme.
24. The method of claim 11, wherein the target protein is a DNA-binding protein.
25. The method of claim 11, wherein the target protein is a subunit of a protein complex.
26. The method of claim 11, wherein the target protein includes a motif selected from the group consisting of SH2 domains; SH3 domains; ankyrin-like repeats; WD40 motifs; Kunitz-type inhibitor domains; growth factor-like domains such as EGF-like domains; Kringle domains; fibronectin finger-like domains; heparin binding domains; death domains; TRAF domains; pleckstrin homology (PH) domains; ITAMs; kinase domains; phosphatase domains; phospholipase domains; guanine nucleotide exchange factor (GEF) domains; hydrolase domains; leucine zippers, zinc fingers and helix-loop-helix motifs.
27. The method of claim 11, wherein the compounds are selected from the library by a differential binding means comprising affinity separation of compounds which specifically bind the D-target protein from compounds which do not specifically bind.
28. The method of claim 27, wherein the differential binding means comprises panning the compound library with a D-target protein immobilized on an insoluble surface.
29. The method of claim 11, wherein the compounds are selected from the library based on an ability to modulate protein-protein or protein-DNA binding by the target protein.
30. The method of claim 11, wherein the compounds are selected from the library based on an ability to modulate an enzymatic activity of the target protein.
31. A method for identifying D-enantiomer peptide ligands for a target protein, comprising: (a) transforming suitable host cells with a library of replicable phage vectors encoding a library of phage particles displaying a fusion coat protein, the fusion coat protein comprising a phage coat protein portion and a test peptide portion, the test peptide portion being encoded by a variegated gene library; (b) culturing the transformed host cells such that the phage particles are formed and the fusion coat proteins are expressed; and (c) selecting any of the phage particles having a peptide portion which binds to a D-enantiomer of a selected target protein, wherein the D-enantiomer corresponding to the peptide portion is a ligand for the L-enantiomer of the target protein, and the molecular identity of the peptide portion is provided by the phage vector.
32. The method of claim 31, wherein the target protein is a receptor.
33. The method of claim 31, wherein the target protein is an enzyme.
34. The method of claim 31, wherein the target protein is a DNA-binding protein.
35. The method of claim 31, wherein the target protein is a subunit of a protein complex.
36. The method of claim 31, wherein the target protein includes a motif selected from the group consisting of SH2 domains; SH3 domains; ankyrin-like repeats; WD40 motifs; Kunitz-type inhibitor domains; growth factor-like domains such as EGF-like domains; Kringle domains; fibronectin finger-like domains; heparin binding domains; death domains; TRAF domains; pleckstrin homology (PH) domains; ITAMs; kinase domains; phosphatase domains; phospholipase domains; guanine nucleotide exchange factor (GEF) domains; hydrolase domains; leucine zippers, zinc fingers and helix-loop-helix motifs.
37. The method of claim 31, wherein the phage particles are selected by a differential binding means comprising contacting the phage particles with the D-target protein and separating phage particles which specifically bind the D-target protein from phage particles which do not specifically bind the D-target protein.
38. The method of claim 37, wherein the differential binding means comprises an affinity chromatographic means in which the D-target protein is provided as a component of an insoluble matrix.
39. The method of claim 38, wherein the insoluble matrix comprises the D-target protein attached to a polymeric support.
40. The method of claim 31, wherein the differential binding means comprises precipitating the phage particles with a multivalent form of the immunorecessive epitope, and subsequently removing nonspecifically bound phage particles from the precipitate.
41. The method of claim 40, wherein the phage particle is selected from a group consisting of M13, f1, fd, If1, Ike, Xf, Pf1, Pf3, λ, T4, T7, P2, P4, φX174, MS2 and f2.
42. The method of claim 40, wherein the phage particle is a filamentous bacteriophage specific for Escherichia coli and the phage coat protein is coat protein III.
43. The method of claim 42, wherein the filamentous bacteriophage is selected from a group consisting of M13, fd, and f1.
44. The method of claim 31, wherein the transformed host cells are cultured with a helper phage suitable for inducing formation of the phage particles.
45. The method of claim 31, wherein the differential binding means comprises precipitating the phage particles with a multivalent form of the D-target protein, and subsequently removing nonspecifically bound phage particles from the precipitate.
46. A method for generating D-enantiomer peptide ligands for a target protein, comprising: (a) generating a library of replicable phage vectors encoding a library of phage particles displaying a fusion coat protein, each of the phage vectors comprising a chimeric coat protein gene encoding the fusion coat protein, the chimeric gene including (i) a first peptide gene encoding a test peptide, and (ii) a second gene encoding at least a portion of a phage coat protein, such that the library of phage vectors encodes a plurality of different test peptide sequences; (b) transforming suitable host cells with the library of replicable phage vectors; (c) culturing the transformed host cells such that the phage particles are formed and the fusion coat proteins are expressed; (d) selecting any of the phage particles having a test peptide which binds to a D-enantiomer of a selected target protein; (e) determining the molecular identity of a test peptide which binds the D-target protein by sequencing the corresponding phage vector; and (f) synthesizing a D-enantiomer of the test peptide, wherein the D-enantiomer of the peptide is a ligand for the L-enantiomer of the target protein.
47. A kit for identifying D-enantiomer peptide ligands for a target protein, comprising i) a D-enantiomer of a target protein (a D-target protein), and ii) a variegated population of test peptides.
48. The kit of claim 47, wherein the variegated population of test peptides includes at least 10^3 different peptide sequences.
49. The kit of claim 47, wherein at least one of the D-target protein and the variegated population of test peptides includes a detectable label.
50. The kit of claim 47, wherein the D-target protein has a molecular weight less than 25,000 daltons.
51. The kit of claim 47, wherein the test peptides are from 2-25 amino acid residues in length.
52. The kit of claim 47, wherein the variegated population of test peptides comprises a phage display library.
53. The kit of claim 47, wherein the variegated population of test peptides comprises a collection of synthetic peptides.
54. A D-enantiomer of a peptide identified by the method of claim 1.
Description:
Enantiomeric Screening Process, and Compositions Therefor

Background of the Invention

The recognition and binding of ligands regulates almost all biological processes, such as immune recognition, cell signaling and communication, transcription and translation, intracellular signaling, and catalysis, i.e., enzyme reactions. There is a long-standing interest in the art in identifying molecules which can agonize or antagonize the activity of ligands such as hormones, growth factors, and neurotransmitters; which induce B-cell (antibody-mediated) or T-cell (cell-mediated) immunity; which can catalyze chemical reactions; or which can regulate gene expression at the level of transcription or translation.

The traditional approach to drug discovery relies heavily on a mixture of serendipity and hard work. Screening natural products from animal and plant tissues, or the products of fermentation broths, or the random screening of archived synthetic molecules have been the most productive avenues for the identification of new lead compounds.

However, recent trends in the search for novel pharmacological agents have focused on the preparation of combinatorial libraries as potential sources of new leads for drug discovery. At the heart of this new field of "combinatorial chemistry" is a collection of differing molecules which can be prepared either synthetically or biosynthetically and screened for biological activity in a variety of formats. Through the use of such techniques as, e.g., encoding, spatially addressing and/or deconvolution, combinatorial libraries can be synthesized by batch processes and, importantly, the molecular identity of individual members of the library can be ascertained in a drug screening format. A variety of combinatorial approaches have been described in the art. For instance, the most widely used techniques for screening large peptide libraries typically comprise cloning the peptide-encoding gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates relatively easy isolation of the vector encoding the gene whose product was detected. Such illustrative assays are amenable to high-throughput analysis, which is necessary to screen large numbers of potential peptide agonists or antagonists.

In one screening assay, the candidate peptides are displayed on the surface of a cell or viral particle, and the ability of particular cells or viral particles to bind a target molecule, such as a receptor protein, via this gene product is detected in a "panning assay". For instance, the gene library can be cloned into the gene for a surface membrane protein of a bacterial cell, and the resulting fusion protein detected by panning (Ladner et al., WO 88/06630; Fuchs et al. (1991) Bio/Technology 9:1370-1371; and Goward et al. (1992) TIBS 18:136-140).

In an alternate embodiment, the peptide library is expressed as fusion proteins on the surface of a viral particle. For instance, in the filamentous phage system, foreign peptide sequences can be expressed on the surface of infectious phage, thereby conferring two significant benefits. First, since these phage can be applied to affinity matrices at very high concentrations, a large number of phage can be screened at one time. Second, since each infectious phage displays the combinatorial gene product on its surface, if a particular phage is recovered from an affinity matrix in low yield, the phage can be amplified by another round of infection. The group of almost identical E. coli filamentous phages M13, fd, and f1 is most often used in phage display libraries, as either of the phage gIII or gVIII coat proteins can be used to generate fusion proteins without disrupting the ultimate packaging of the viral particle (Ladner et al. PCT publication WO 90/02909; Garrard et al., PCT publication WO 92/09690; Marks et al. (1992) J. Biol. Chem. 267:16007-16010; Griffiths et al. (1993) EMBO J 12:725-734; Clackson et al. (1991) Nature 352:624-628; and Barbas et al. (1992) PNAS 89:4457-4461).

The prior art techniques, however, suffer from a number of disadvantages. First, natural product libraries and combinatorial libraries, which generally consist predominantly of chiral compounds, generally lack the enantiomers of most or all of the chiral compounds. For example, with L-peptide libraries, while the 20 naturally occurring amino acids provide a wide range of steric, electronic and functional groups, the chirality of the Cα carbon effectively limits the three dimensional shape space which is accessible by the prior art display technology.

Moreover, as therapeutic agents, molecules with "natural" stereochemistries (e.g. D-DNA, D-RNA, and L-peptides) are often less preferable than their "unnatural" enantiomers (L-DNA, L-RNA, D-peptides) because natural enantiomers can be limited in use by poor pharmacokinetic profiles due to in vivo processing. For example, L-peptides can be rapidly degraded by proteases after administration to an animal, thus requiring a higher effective dose. Furthermore, pharmaceutical peptides can elicit strong immunogenic responses in patients, further contributing to their rapid clearance. One approach to preventing the degradation of the therapeutic peptide has been to generate non-hydrolyzable peptide analogs such as retro-inverso analogs (cf. Sisto et al. U.S. Patent 4,522,752); retro-enantio analogs (cf. Goissis et al. (1976) J Med Chem 19:1287-90); trans-olefin derivatives (cf. Shue et al. (1987) Tetrahedron Letters 28:3225); and phosphonate derivatives (cf. Loots et al., in Peptides: Chemistry and Biology, Escom Science Publishers, Leiden, 1988, p. 118). However, in most instances the backbone of the peptide is altered in order to render the peptidomimetic resistant to proteolysis. In doing so, the resulting peptidomimetic can suffer from decreased bioactivity through loss of certain binding contacts between the natural peptide backbone and target receptor, as well as changes in the steric space relative to the peptide due to alteration in dihedral angles and the like. Moreover, D-DNA and D-RNA can be rapidly degraded by nucleases after administration to an animal, thus requiring a high effective dose. In contrast, D-peptides and L-DNA and L-RNA have increased resistance to peptidases and nucleases, respectively (Zawadzke and Berg (1992) J Am Chem Soc 114:4002; Holy and Sorm (1969) Collect Czech Chem Commun 34:3383; Tazawa et al. (1970) Biochemistry 9:34-99; Ashley (1992) J Am Chem Soc 114:9732).

Summary of the Invention

The present invention makes available a powerful directed approach for identifying enantioselective compounds which bind to biological targets. As a general overview, the present invention relates, in one aspect, to a method for identifying compounds which interact with a target molecule, by (i) contacting a screening molecule with a variegated compound library, wherein the screening molecule comprises the target molecule, or the enantiomer thereof if the target molecule is chiral; (ii) selecting, from the library, compounds which have a desired interaction with the target molecule; and (iii) testing the ability of the enantiomer of a compound selected in step (ii) to interact with the target molecule. In preferred embodiments, the target molecule is chiral, and the screening molecule comprises the enantiomer of the target molecule. Exemplary target molecules include polypeptides, nucleic acids, carbohydrates and other chiral polymers, as well as stereoisomers of non-polymeric molecules, such as steroids, inositols, lipid soluble vitamins, terpenes, acetogenins, chiral neurotransmitters, or transition state analogs. In yet other embodiments, the target molecule is achiral.

A variety of formats are also available for the variegated compound library, such as peptide libraries, nucleic acid libraries and carbohydrate libraries. Other variegated compound libraries contemplated by the present invention include small organic molecule libraries and natural product extract libraries, such as those isolated from animals, plants, fungi and/or microbes.

In one embodiment, the method is directed to identifying ligands for a target protein. For instance, the method can be carried out according to the steps of (i) combining a D-enantiomer of a target protein (a D-target protein), and a variegated compound library; (ii) selecting one or more compounds from the library which have a desired binding interaction with the D-target protein; and (iii) testing the ability of the enantiomer of a compound selected in step (ii) to interact with the target protein. As above, a variety of variegated compound libraries can be used. In preferred embodiments, the variegated compound library is a peptide library. Such libraries can be provided as peptide phage display libraries, or as bacterial cell-surface display or spore display libraries. In yet other embodiments, the library is a collection of synthetic peptides, and may, for example, be present free in solution or linked to an insoluble support. In preferred embodiments, each peptide of the peptide library is associated with an index providing the molecular identity of the peptide associated therewith.

The protein target can be, to illustrate, a receptor, an enzyme, a DNA-binding protein or a protein complex, or a portion thereof which retains a screenable activity. For instance, the present invention contemplates screening targets which correspond to (e.g. include) such domain structures as: SH2 domains; SH3 domains; ankyrin-like repeats; WD40 motifs; Kunitz-type inhibitor domains; growth factor-like domains such as EGF-like domains; Kringle domains; fibronectin finger-like domains; heparin-binding domains; death domains; TRAF domains; pleckstrin homology (PH) domains; ITAMs; catalytic domains such as kinase domains; phosphatase domains; phospholipase domains; guanine nucleotide exchange factor (GEF) domains; and hydrolase domains (such as protease domains); or DNA binding domains such as leucine zippers, zinc fingers and helix-loop-helix motifs. In preferred embodiments, the D-target protein is soluble. In other preferred embodiments, the D-target protein has a molecular weight of less than 50kd, more preferably less than 40kd, and even more preferably less than 25kd. Preferred D-target proteins include less than 500 amino acid residues, more preferably less than 250 amino acid residues, and even more preferably less than 200 amino acid residues. In still other embodiments, the D-target protein is provided in a lipid bilayer with at least a portion of the D-target protein accessible to an aqueous phase for interaction with the compound library.

In certain embodiments, the peptides are selected from the peptide library by a differential binding means comprising affinity separation of peptides which specifically bind the D-target protein from peptides which do not specifically bind. For example, the differential binding means can include panning the peptide library with a D-target protein immobilized on an insoluble surface. In other embodiments the peptides are selected from the display library based on an ability to modulate protein-protein or protein-DNA binding by the target protein, or based on an ability to modulate an enzymatic activity of the target protein.

In a preferred embodiment of the above method, D-enantiomer peptide ligands can be identified for a target protein, by: (a) transforming suitable host cells with a library of replicable phage vectors encoding a library of phage particles displaying a fusion coat protein, the fusion coat protein comprising a phage coat protein portion and a test peptide portion, the test peptide portion being encoded by a variegated gene library; (b) culturing the transformed host cells such that the phage particles are formed and the fusion coat proteins are expressed; and (c) selecting any of the phage particles having a peptide portion which binds to a D-enantiomer of a selected target protein. In the subject assay, the D-enantiomer corresponding to the peptide portion is a ligand for the L-enantiomer of the target protein, and the molecular identity of the peptide portion is provided by the phage vector.

The phage particles can be selected by a differential binding means comprising contacting the phage particles with the D-target protein and separating phage particles which specifically bind the D-target protein from phage particles which do not specifically bind the D-target protein. For instance, the differential binding means can include an affinity chromatographic means in which the D-target protein is provided as a component of an insoluble matrix, such as a polymeric support. In other embodiments, the differential binding means comprises precipitating the phage particles with a multivalent form of the D-target protein, and subsequently removing non-specifically bound phage particles from the precipitate.

Preferred phage particles are selected from a group consisting of M13, f1, fd, If1, Ike, Xf, Pf1, Pf3, λ, T4, T7, P2, P4, φX174, MS2 and f2. Especially preferred are filamentous bacteriophage specific for Escherichia coli, such as a filamentous bacteriophage selected from a group consisting of M13, fd, and f1.

Another aspect of the invention relates to kits for identifying D-enantiomer peptide ligands for a target protein, comprising (i) a D-enantiomer of a target protein (a D-target protein), and (ii) a variegated population of test peptides. In preferred embodiments, the variegated population of test peptides includes at least 10^3 different peptide sequences, though more preferably at least 10^6, 10^7, or 10^8. Generally at least one of the D-target protein and the variegated population of test peptides includes a detectable label.

In preferred embodiments, the D-target protein has a molecular weight less than 25,000 daltons, more preferably less than 20kd, 15kd and, in some embodiments, 10kd. Preferably, the test peptides are from 2-25 amino acid residues in length, and may be provided as part of a phage display library or as a collection of synthetic peptides.
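To put the library sizes and peptide lengths mentioned above in perspective, the following is a minimal arithmetic sketch. It assumes a fully randomized library in which each position may be any of the 20 naturally occurring amino acids; the function names are hypothetical and chosen for illustration only, and actual libraries may sample far fewer sequences than the theoretical maximum.

```python
# Illustrative arithmetic only. Assumes every position of the peptide is
# fully randomized over the 20 naturally occurring amino acids.
NATURAL_AMINO_ACIDS = 20


def theoretical_diversity(length: int, alphabet_size: int = NATURAL_AMINO_ACIDS) -> int:
    """Number of distinct sequences of the given length."""
    if length < 1:
        raise ValueError("peptide length must be at least 1")
    return alphabet_size ** length


def min_length_for(target_size: int, alphabet_size: int = NATURAL_AMINO_ACIDS) -> int:
    """Shortest peptide length whose theoretical diversity reaches target_size."""
    length = 1
    while alphabet_size ** length < target_size:
        length += 1
    return length


if __name__ == "__main__":
    # Lengths spanning the 2-25 residue range, including the 6-mer and 15-mer cases.
    for n in (2, 6, 15, 25):
        print(f"{n}-mer: {theoretical_diversity(n):.3e} possible sequences")
    # Shortest fully randomized peptide that can supply each library size.
    for threshold in (10**3, 10**6, 10**7, 10**8):
        print(f"at least {threshold:.0e} sequences: length {min_length_for(threshold)}")
```

Under these assumptions a fully randomized 6-mer already exceeds 10^7 theoretical sequences, whereas a 15-mer library can only ever be sampled sparsely.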

Yet another aspect of the invention relates to compounds, such as D-enantiomers of peptides identified by the subject method, and their uses.

Description of the Drawings

Figure 1 is a schematic overview of a method for identifying enantiomer ligands for a biological target. By the laws of stereochemistry, A has the same affinity for D as B has for C.

Figures 2A and 2B illustrate degenerate gene sequences for 15-mer and 6-mer peptide libraries, respectively.

Figure 3 shows different 6-mer sequences selected in an assay using D-IL-8.

Figure 4 shows different 15-mer sequences selected in an assay using D-IL-8.

Figure 5 shows two exemplary consensus sequences after panning with D-IL-8.

Figures 6 and 7 show selected peptide sequences obtained by panning, while Figure 8 shows the consensus sequences which are evident from the isolated peptides.

Detailed Description of the Invention

The present invention makes available a powerful directed approach for identifying enantioselective compounds which bind to biological targets. The invention is depicted in a preferred form in Figure 1. The chiral target (A) may be any chiral molecule that could potentially be synthesized or purified in its enantiomeric (mirror image) form (B). Target A may be a known molecule, or it may be a fragment or derivative of a known molecule. It may be large (e.g. DNA, RNA, protein, carbohydrate, lipid or combinations and/or modifications thereof) or it may be small (e.g. metabolites, hormones, transition state analogs, and drugs). Examples of large molecules that have been prepared as enantiomers are DNA, RNA, and protein (Garbesi et al. (1993) Nucl Acids Res 21:4159; Visser et al. (1986) Recl Trav Chim Pays Bas 105:528; Zawadzke and Berg supra). Examples of small molecules that have been prepared as enantiomers are glucose, norepinephrine and cocaine (Lehninger (1975) Biochemistry (Worth Publishers, NY):252; Iversen et al. (1971) Br J Pharmac 43:845; Lewin et al. (1987) J Heterocyclic Chem 24:19). The library of chiral molecules (Fig 1) can be a natural product library, a library of known compounds, or a combinatorial library. Libraries that are biologically amplifiable are particularly important because these are the most diverse libraries that can be screened (currently as many as 10^13 molecules can be concurrently screened [Bartel and Szostak (1993) Science 261:1411]), and biologically amplifiable libraries cannot be created in enantiomeric form because biological amplification requires "natural" enantiomers.

Once a chiral ligand (C) has been identified from a library, the synthesis of its enantiomeric form (D) is required (Figure 1). This synthesis is straightforward in the case of polymers such as DNA, RNA or peptide ligands, because it simply requires substitution of enantiomeric monomers in the chemical polymerization reactions (Garbesi et al. (1993) Nucl Acids Res 21:4159; Visser et al. (1986) Recl Trav Chim Pays Bas 105:528; Zawadzke and Berg supra). The synthesis of non-polymeric enantiomers may use one or a combination of enantioselective synthesis strategies and purification techniques (Iversen et al. (1971) Br J Pharmac 43:845; Lewin et al. (1987) J Heterocyclic Chem 24:19).

A related strategy to that of Figure 1 is also envisioned for achiral targets. In this case, target structures A and B of Figure 1 would be identical. Achiral target (B) in Figure 1 would be screened with a chiral library to yield a high affinity chiral ligand (C). Either this ligand C or its enantiomer D could be used as a ligand for the achiral target B. Ligand D may have preferable pharmacokinetic properties to ligand C (see above). Examples of such achiral targets would be dopamine and serotonin (Aldrich Catalog (1994-95) p. 812; p. 1248).

In one aspect of the invention, the subject method is directed to the generation of D-peptide enantiomers. In particular, the subject method permits the selection and amplification of D-peptide ligands, the mirror images (enantiomers) of natural L-peptides, by a process which first exploits the advantages of library generation with L-amino acids.

The subject method generally comprises the steps of generating a variegated library of test compounds, and selecting from the library those compounds which have a desired interaction with a target protein. To illustrate, the peptide library can be provided with L-amino acids, while the target protein is generated from D-enantiomers of amino acids so as to provide a mirror image of the naturally occurring target protein (e.g., the L-enantiomer). As described herein, the peptides of the library can be presented by a display package, free in solution, or immobilized on an insoluble support, and screened for binding to the enantiomeric D-protein target. Thus, affinity selection techniques and the like can be utilized to isolate from the population of test peptides those peptides which have a desired binding specificity for the D-target protein enantiomer. In a final step, the D-enantiomer of the L-peptide ligand is synthesized (Figure 1) for use as a specific ligand of the original chiral L-protein target. By the laws of stereochemistry, this D-peptide ligand will have the same affinity for the L-protein target as the L-peptide ligand of the library had for the D-protein target. This D-peptide can then be evaluated for inherent biological activity or other utility.
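The final step described above, deriving the D-peptide from a selected L-peptide, can be sketched as a simple sequence transformation. The sketch below is illustrative only: the lowercase-for-D notation, the function name and the example sequence are assumptions of this example and are not taken from the specification.

```python
# Illustrative sketch only. Writing D-residues as lowercase one-letter codes
# is an assumption of this example, not a convention set out in the text.
L_RESIDUES = frozenset("ACDEFGHIKLMNPQRSTVWY")  # one-letter codes, L-configuration
ACHIRAL = frozenset("G")  # glycine has no stereocenter and is its own mirror image


def mirror_image(l_peptide: str) -> str:
    """Return the D-enantiomer of an L-peptide written in one-letter codes.

    Residue order is preserved (this is not a retro-inverso analog); only the
    configuration of each chiral residue is inverted. For residues with a
    second stereocenter (isoleucine, threonine) the true enantiomer inverts
    both centers, which this notation does not spell out.
    """
    d_residues = []
    for residue in l_peptide:
        if residue not in L_RESIDUES:
            raise ValueError(f"not an L-amino acid one-letter code: {residue!r}")
        d_residues.append(residue if residue in ACHIRAL else residue.lower())
    return "".join(d_residues)


if __name__ == "__main__":
    selected = "ACGKW"  # hypothetical L-peptide recovered by panning against a D-target
    print(selected, "->", mirror_image(selected))
```

The sequence itself is unchanged by the transformation; what changes is the stereochemistry of the monomers used in chemical synthesis.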

The subject method, by providing a privileged platform, provides access to three dimensional shape space in a manner which was not readily available by prior art methods. Because both the peptide library and protein target display side chains in a unique, specific directional sense, the present method is a structurally selective approach in addition to scoring for interaction of functional groups.

The subject method overcomes certain of the weaknesses of drug design described above, and provides a powerful selection and amplification method that allows the production of ligands with the same diversity as peptides, but with the greatly improved pharmacokinetic profiles needed for drug activity. For example, D-peptides are very resistant to degradation by proteases and mammalian serum (Wade et al. (1990) PNAS 87:4761) because of the specificity of biological enzymes for natural enantiomers. Furthermore, D-peptides and D-proteins do not elicit strong immune responses (Dintzis et al. (1993) Proteins Struct. Funct. Genet. 16:306). The much lower immunogenicity of D-peptides than L-peptides is probably due to a combination of reasons such as: (i) D-peptides are resistant to degradation by proteases into the smaller fragments necessary for presentation by major histocompatibility complex (MHC) class I and II proteins; (ii) D-peptides may not bind productively to the L-peptide-binding groove of MHC class I and II molecules because the groove is chiral and therefore probably specific for L-peptides (Parham P (1992) Nature 360:300); and (iii) D-peptides do not have the repetitive features found in strongly immunogenic non-protein antigens, such as carbohydrates.

Moreover, in the context of the chemically synthesized peptide libraries, described infra, the synthesis of an L-peptide library rather than a D-peptide library can be economically more feasible.

The present method can be used to identify new leads, as well as refine existing pharmacophoric peptides. The D-enantiomer ligands accessible by the subject method include agonists (e.g., mimetics) of existing ligands, as well as antagonists, and the protein targets include receptors, enzymes, DNA binding proteins, signal transduction proteins, etc.

Definition of Terms

For convenience, certain terms employed in the specification, examples, and appended claims are collected here.

Amino acid residues in peptides are abbreviated as follows: Phenylalanine is Phe or F; Leucine is Leu or L; Isoleucine is Ile or I; Methionine is Met or M; Valine is Val or V; Serine is Ser or S; Proline is Pro or P; Threonine is Thr or T; Alanine is Ala or A; Tyrosine is Tyr or Y; Histidine is His or H; Glutamine is Gln or Q; Asparagine is Asn or N; Lysine is Lys or K; Aspartic Acid is Asp or D; Glutamic Acid is Glu or E; Cysteine is Cys or C; Tryptophan is Trp or W; Arginine is Arg or R; and Glycine is Gly or G.
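For readers who handle peptide sequences computationally, the abbreviations above can be held in a simple lookup table. The following Python sketch is offered only as a convenience; the names `THREE_TO_ONE` and `to_one_letter` are hypothetical and are not part of the specification.

```python
# Three-letter to one-letter abbreviations, in the order listed above
THREE_TO_ONE = {
    "Phe": "F", "Leu": "L", "Ile": "I", "Met": "M", "Val": "V",
    "Ser": "S", "Pro": "P", "Thr": "T", "Ala": "A", "Tyr": "Y",
    "His": "H", "Gln": "Q", "Asn": "N", "Lys": "K", "Asp": "D",
    "Glu": "E", "Cys": "C", "Trp": "W", "Arg": "R", "Gly": "G",
}


def to_one_letter(residues):
    """Convert a list of three-letter codes into a one-letter sequence string."""
    return "".join(THREE_TO_ONE[r] for r in residues)


example = to_one_letter(["Met", "Ser", "Phe"])
```

An unknown abbreviation raises a `KeyError`, which is usually preferable to silently producing a wrong sequence.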

The term "effective amount" refers to an amount sufficient to induce a statistically significant result. The term "ligand" refers to a molecule that is recognized by a particular protein, e.g., a receptor. Any agent bound by or reacting with a protein is called a "ligand," so the term encompasses the substrate of an enzyme and the reactants of a catalyzed reaction. The term "ligand" does not imply any particular molecular size or other structural or compositional feature other than that the substance in question is capable of binding or otherwise interacting with a protein. A "ligand" may serve either as the natural ligand to which the protein binds or as a functional analogue that may act as an agonist or antagonist.

The term "substrate" refers to a substrate of an enzyme which is catalytically acted on and chemically converted by the enzyme to product(s). The term "receptor" refers to a molecule that has an affinity for a given ligand.

Receptors may be naturally-occurring or synthetic molecules. The D-amino acid enantiomer of a receptor can be employed in the present invention in its isolated state or as an aggregate with other species or in some modified form.

The term "peptide" refers to an oligomer in which the monomers are amino acids (usually alpha-amino acids) joined together through amide bonds. Peptides are two or more amino acid monomers long, but more often are between 5 and 10 amino acid monomers long and can be even longer, e.g., up to 20 amino acids or more, although peptides longer than 20 amino acids are more likely to be called "polypeptides." The term "protein" is well known in the art and usually refers to a very large polypeptide, or set of associated homologous or heterologous polypeptides, that has some biological function. For purposes of the present invention the terms "peptide," "polypeptide," and "protein" are largely interchangeable as all three types can be used to generate the display library and so are collectively referred to as peptides.

The term "random peptide library" refers to a set of random or semi-random peptides, as well as sets of fusion proteins containing those random peptides (as applicable).

The term "carbohydrate" embraces a wide variety of chemical compounds having compositions similar to the general formula (CH2O)n, such as monosaccharides, disaccharides, oligosaccharides, and polysaccharides. Oligosaccharides are chains composed of saccharide units, which are alternatively known as sugars. These saccharide units can be arranged in any order and the linkage between two saccharide units can occur in any of approximately ten different ways. As a result, there are a number of different possible stereoisomeric oligosaccharide chains of the same or similar molecular formula.

The term "reporter group" or "tag" refers to an atom, compound, or biological molecule or complex that can be readily detected when attached to other molecules and exploited in chemical separation processes. A reporter group can be, for example, a fluorescent or radioactive atom or a compound containing one or more such atoms.

The term "solid support" refers to a material having a rigid or semi-rigid surface. Such materials will preferably take the form of small beads, pellets, disks, chips, dishes, multi-well plates, wafers or the like, although other forms may be used. In some embodiments, at least one surface of the substrate will be substantially flat. The term "surface" refers to any generally two-dimensional structure on a solid substrate and may have steps, ridges, kinks, terraces, and the like without ceasing to be a surface.

The term "synthetic" refers to production by in vitro chemical or enzymatic synthesis.


The phrases "individually selective manner" and "individually selective binding", with respect to binding of a test peptide with a target protein, refer to the binding of a peptide to a certain protein target, which binding is specific for, and dependent on, the molecular identity of the protein target. The language "display package" describes a biological or chemical particle (such as a bead or the like) which has one or more peptides provided on its surface in order that such peptides may interact, if at all, with a target protein. In preferred embodiments, the display package further includes encoded information providing the particle with the identity of the associated peptide. A variegated mixture of display packages encoding at least a portion of the test peptide library is also referred to as a "peptide display library".

The language "replicable genetic display package" describes a biological display package which has genetic information providing the package with the ability to replicate. The package can display a fusion protein including a peptide derived from the variegated peptide library. The test peptide portion of the fusion protein is presented by the display package in a context which permits the peptide to bind to a target protein that is contacted with the display package. The display package will generally be derived from a system that allows the sampling of very large variegated peptide libraries, as well as easy isolation of the recombinant genes from purified display packages. The display package can be, for example, derived from vegetative bacterial cells, bacterial spores, and bacterial viruses (especially DNA viruses).

The language "differential binding means", as well as "affinity selection" and "affinity enrichment", refer to the separation of members of the peptide display library based on the differing abilities of test peptides on each of the display packages of the library to bind to the target protein. The differential binding of a target protein by peptides of the display can be used in the affinity separation of peptides which specifically interact with the target protein from peptides which do not. Examples of affinity selection means include affinity chromatography, precipitation, fluorescence activated cell sorting, and plaque lifts. As described below, the affinity chromatography includes panning techniques using, e.g. immobilized target proteins.

The language "fusion protein" and "chimeric protein" are art-recognized terms which are used interchangeably herein, and include contiguous polypeptides comprising a first polypeptide covalently linked via an amide bond to one or more amino acid sequences which define polypeptide domains that are foreign to and not substantially homologous with any domain of the first polypeptide. One polypeptide from which the fusion protein is constructed comprises a recombinant test peptide derived from a variegated gene library. A second polypeptide portion of the fusion protein is typically derived from an outer surface protein or display anchor protein which directs the "display package" (as hereafter defined) to associate the test peptide with its outer surface. As described below, where the display package is a phage, this anchor protein can be derived from a surface protein native to the genetic package, such as a viral coat protein. Where the fusion protein comprises a viral coat protein and a test peptide it will be referred to as a "peptide fusion coat protein". The fusion protein may further comprise a signal sequence, which is a short length of amino acid sequence at the amino terminal end of the fusion protein, that directs at least a portion of the fusion protein to be secreted from the cytosol of a cell and localized on the extracellular side of the cell membrane.

Gene constructs encoding fusion proteins are likewise referred to as "chimeric genes" or "fusion genes".

The term "vector" refers to a DNA molecule, capable of replication in a host cell, into which a gene can be inserted to construct a recombinant DNA molecule.

The terms "phage vector" and "phagemid" are art-recognized and generally refer to a vector derived by modification of a phage genome, containing an origin of replication for a bacteriophage, and preferably, though optionally, an origin for a bacterial plasmid. The use of phage vectors rather than the phage genome itself provides greater flexibility to vary the ratio of chimeric test peptide/coat protein to wild-type coat protein.

The language "helper phage" describes a phage which is used to infect cells containing a defective phage genome or phage vector and which functions to complement the defect. The defect can be one which results from removal or inactivation of phage genomic sequence required for production of phage particles.

Examples of helper phage are M13K07 and M13K07 gene III no. 3.

The term "chiral" refers to molecules which have the property of non-superimposability of the mirror image partner, while the term "achiral" refers to molecules which are superimposable on their mirror image partner.

A "D-enantiomer" or "D-peptide enantiomer" refers to a peptide comprised of D-amino acid residues, as opposed to L-amino acids.

The terms "D-amino acid" and "L-amino acid" each denote an absolute configuration by convention relative to the possible stereoisomers of glyceraldehyde. Thus, all stereoisomers that are stereochemically related to L-glyceraldehyde are designated L-, and those related to D-glyceraldehyde are designated D-, regardless of the direction of the rotation of the plane of polarized light by the given isomer. In the case of threonine and isoleucine, there are two stereochemical centers, i.e., the Cα and the Cβ atoms. The D-threonine and D-isoleucine employed herein preferably have stereochemistries at both chiral sites which are opposite (enantiomeric) to the stereochemistry of the L-enantiomers of those amino acids, i.e., they are complete mirror images. Glycine is the only commonly occurring achiral amino acid. Accordingly, when a peptide is designated herein as a D- or L-enantiomer, it is meant that essentially all of the chiral amino acid residues comprising such peptide have the indicated chirality. The presence of achiral amino acid residues such as glycine does not affect the designation of its chirality. All chiral amino acids in proteins described in nature, e.g., "naturally occurring" proteins, are L-amino acids.
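The designation rule just stated (achiral residues such as glycine are ignored, and the remaining residues determine the label) can be expressed as a small Python function. This is a simplified sketch: the function name `designate` is hypothetical, and the phrase "essentially all" is approximated here by requiring that all chiral residues agree, which is stricter than the text requires.

```python
def designate(residues):
    """Classify a peptide given (one_letter_code, configuration) pairs.

    Glycine ("G") is achiral and is ignored. Returns "D", "L", "mixed",
    or "achiral" when no chiral residue is present.
    """
    # Collect the distinct configurations found among the chiral residues
    configs = {config for code, config in residues if code != "G"}
    if not configs:
        return "achiral"
    if len(configs) == 1:
        return next(iter(configs))
    return "mixed"


label = designate([("A", "D"), ("G", ""), ("W", "D")])
```

A peptide that mixes D- and L-residues is reported as "mixed" rather than being forced into either designation.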

The term "stereoisomers" refers to compounds which have identical chemical constitution, but differ with regard to the arrangement of the atoms or groups in space. In particular, "enantiomers" refers to two stereoisomers of a compound which are non-superimposable mirror images of one another. "Diastereomers", on the other hand, refers to stereoisomers with two or more centers of dissymmetry and whose molecules are not mirror images of one another. With respect to the nomenclature of a chiral center, the terms "D" and "L" configuration are as defined by the IUPAC Recommendations. The terms diastereomer, racemate, and enantiomer are used in their normal context to describe the stereochemistry of peptide preparations.

Similarly, the terms "enantiomerically enriched" and "non-racemic", as used interchangeably herein with reference to preparations of a molecule, refer to preparations of a chiral compound which substantially lack one of the enantiomers. For instance, with reference to polypeptides, enantiomerically enriched refers to a preparation in which the D-enantiomer sidechains are enriched, compared to a control mixture of the protein made with naturally occurring amino acid precursors, e.g., L-amino acids. Unless otherwise specified, such terms refer to peptide compositions in which the ratio of (D) to (L) enantiomers is greater than 1:1 by weight. For instance, an enantiomerically-enriched preparation of a D-target protein means a preparation having greater than 50% by weight of the peptide derived from D-amino acid enantiomers relative to the L-enantiomers, more preferably at least 75% by weight, and even more preferably at least 85% by weight. Of course the enrichment can be much greater than 85%, providing a "substantially enantiomerically enriched" or "substantially non-racemic" preparation, which refers to preparations of a peptide which have greater than 90% of the D-enantiomer relative to the L-enantiomer, and even more preferably greater than 95%. The term "substantially free of the L-enantiomer" will be understood to have similar purity ranges.

In preferred embodiments, (D)-enantiomeric targets are substantially free of (L)-protein. "Substantially free" as used herein, means the same ranges as "substantially enantiomerically enriched" above.
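The purity ranges defined above can be summarized as a small classification function. The following Python sketch is illustrative only: the function name `enrichment_class` is hypothetical, and the input is assumed to be the weight fraction of the D-enantiomer (0.0 to 1.0) in a D-target preparation.

```python
def enrichment_class(d_weight_fraction):
    """Map the D-enantiomer weight fraction onto the terms defined above."""
    if not 0.0 <= d_weight_fraction <= 1.0:
        raise ValueError("weight fraction must lie between 0 and 1")
    # Greater than 90% D: "substantially enantiomerically enriched"
    if d_weight_fraction > 0.90:
        return "substantially enantiomerically enriched"
    # Greater than 50% D (ratio above 1:1): "enantiomerically enriched"
    if d_weight_fraction > 0.50:
        return "enantiomerically enriched"
    return "not enriched"


classification = enrichment_class(0.96)
```

The intermediate preferences (at least 75% and at least 85% by weight) fall within the "enantiomerically enriched" class and could be reported as finer grades if desired.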

I. Screening Target

The enantioselective synthesis of chiral molecules is now state of the art. Through combinations of enantioselective synthesis and purification techniques, many chiral molecules can be synthesized as an enantiomerically enriched preparation. Accordingly, the screening target can be, in one embodiment, any chiral molecule of biological significance. For instance, the target can be a polymer of repeating chiral subunits, such as a polypeptide, a nucleic acid, a carbohydrate, or a cell wall component. The target may alternatively be a non-polymeric molecule containing one or more chiral centers, such as a steroid, opioid, macrolide, or other macrocyclic compound.

The enantioselective synthesis of a wide variety of compounds has become increasingly feasible in recent years. Methods have been reported for the enantioselective synthesis of, inter alia, epoxides (see, e.g., Johnson, R.A.; Sharpless, K.B. In Catalytic Asymmetric Synthesis. Ojima, I., Ed.; VCH: New York, 1993; Chapter 4.1; Jacobsen, E.N. Ibid. Chapter 4.2), diols (e.g., by the method of Sharpless, J. Org. Chem. (1992) 57:2768), and alcohols (e.g., by reduction of ketones, E.J. Corey et al., J. Am. Chem. Soc. (1987) 109:5551). Other reactions useful for generating optically enriched products include hydrogenation of olefins (e.g., M. Kitamura et al., J. Org. Chem. (1988) 53:708); Diels-Alder reactions (e.g., K. Narasaka et al., J. Am. Chem. Soc. (1989) 111:5340); aldol reactions and alkylation of enolates (see, e.g., D.A. Evans et al., J. Am. Chem. Soc. (1981) 103:2127; D.A. Evans et al., J. Am. Chem. Soc. (1982) 104:1737); carbonyl additions (e.g., R. Noyori, Angew. Chem. Int. Ed. Engl. (1991) 30:49); and ring-opening of meso-epoxides (e.g., Martinez, L.E.; Leighton, J.L.; Carsten, D.H.; Jacobsen, E.N. J. Am. Chem. Soc. 1995, 117, 5897-5898). The use of enzymes to produce optically enriched products is also increasing (e.g., M.P. Schneider, ed., "Enzymes as Catalysts in Organic Synthesis", D. Reidel, Dordrecht (1986)).

Separation of enantiomers can be accomplished in several ways known in the art. For example, a racemic mixture of two enantiomers can be separated by chromatography, using a chiral stationary phase (see, e.g., "Chiral Liquid Chromatography", W.J. Lough, Ed. Chapman and Hall, New York (1989)). Enantiomers can also be separated by classical resolution techniques. For example, formation of diastereomeric salts and fractional crystallization can be used to separate enantiomers. For the separation of enantiomers of carboxylic acids, the diastereomeric salts can be formed by addition of enantiomerically pure chiral bases such as brucine, quinine, ephedrine, strychnine, and the like. Alternatively, diastereomeric esters can be formed with enantiomerically pure chiral alcohols such as menthol, followed by separation of the diastereomeric esters and hydrolysis to yield the free, enantiomerically enriched carboxylic acid. For separation of the optical isomers of amino compounds, addition of chiral carboxylic or sulfonic acids, such as camphorsulfonic acid, tartaric acid, mandelic acid, or lactic acid can result in formation of the diastereomeric salts.

In still other embodiments, the target molecule can be an achiral molecule. The enantiomers of chiral ligands identified for such achiral targets may have pharmacokinetic properties preferable to those of the initially identified ligand. If a chiral ligand already exists for an achiral target, an enantiomeric ligand can be prepared without having to perform the screening step.

A. Polypeptide Targets

In one embodiment, the screening target is a polypeptide derived from D-amino acids. According to the present invention, the D-polypeptide target can range in size from a short peptide, e.g. dipeptide, to a full length protein.

Total chemical synthesis has become an important approach to the construction of native proteins for the study of the structural basis of biochemical activity. Advances in synthetic approaches over the past decade have resulted in improvements in conventional techniques, and the development of new modes of protein construction, each of which allows large synthetic proteins and proteins of extraordinary structure to be routinely assembled (cf. Wallace et al. (1995) Curr Opin Biotechnol 6:403-410; Muir et al. (1993) Curr Opin Biotechnol 4:420-427; Sakakibara et al. (1995) Tanpakushitsu Kakusan Koso 40:304-316; and Kent (1988) Annu Rev Biochem 57:957-989). Because the basic synthetic chemistry which underlies the synthesis of L-polypeptide isomers is the same as that for the synthesis of D-polypeptides, the technical solutions which have advanced the synthesis of large proteins in the art also provide a vast repertoire of protein molecules which are accessible for use in the subject method.

A D-polypeptide target can be prepared by any means available to one skilled in the art. The precise method employed for synthesizing the polypeptide is not considered essential to the subject method, and is therefore not to be considered as limiting, particularly as technology develops new ways to synthesize and assemble polypeptides de novo. While not wishing to be bound to any particular technique, it is noted that, in general, three approaches to total chemical synthesis of proteins have been most prominent in the art: de novo chemical synthesis by stepwise assembly from constituent amino acids, convergent coupling of peptide segments, and the template-assembled synthetic protein (TASP) concept. Each of these exemplary techniques can be employed to generate the D-enantiomer of a target protein.

The synthetic protocols originally developed for the construction of small peptides can be extended to the assembly of small to medium sized proteins, e.g., upwards of about 200-300 amino acids in length. The stepwise solid phase peptide synthesis (SPPS) method is generally described in the following references: Merrifield (1963) J Am Chem Soc 85:2149; Barany et al. in Peptides. (1980) Gross and Meienhofer, Eds., Academic Press, New York, 3:285; Kent (1988) Annu Rev Biochem 57:957-989. By the SPPS method, a polypeptide of a desired length and sequence can be produced through the stepwise addition of D-amino acids (optionally protected) to a growing peptide chain which is covalently bound to an insoluble support.
D-amino acids suitable for polypeptide synthesis are commercially available from, for example, Peptide Institute (Osaka, Japan), Peptides International (Louisville, KY), Bachem Bioscience (Philadelphia, PA), and Bachem California (Torrance, CA). Appropriate protective groups usable in such synthesis are described in the above-referenced texts, as well as by McOmie in Protective Groups in Organic Chemistry. Plenum Press, New York, 1973.

SPPS has been used to synthesize a variety of proteins possessing biological activity, including ribonuclease A (Hirschmann et al. (1969) J Am Chem Soc 91:502; Gutte and Merrifield (1971) J Biol Chem 246:1922), the C-terminal domain of the HIV-1 gag p24 protein (Mascagni et al. (1990) Tetrahedron Lett 31:4637-4640), interleukin-3 (Clark-Lewis et al. (1986) Science 231:134-139), insulin-like growth factor (Li et al. (1985) PNAS 80:2216-2220), epidermal growth factor (Heath et al. (1986) PNAS 83:6367-6371), interleukin-8 (Clark-Lewis (1991) Biochemistry 30:3128-3134), HIV-1 protease (Nutt et al. (1988) PNAS 85:7129-7133), and monellin (Konmura et al. (1991) J. Biol. Chem. 55:539-545). Currently preferred SPPS techniques are based on improvements to the original t-butyloxycarbonyl (BOC) methodologies of Merrifield (see citations above) or utilize the base labile 9-fluorenylmethoxycarbonyl (Fmoc) chemistry (Carpino et al. (1970) J. Am. Chem. Soc. 92:5740-5749; and Ramage et al. (1987) Tetrahedron Lett. 30:2149-2153). In the context of the present invention, either technique is exemplary of a stepwise synthesis scheme which may be used to generate a target protein from D-amino acid precursors.

An alternative to the stepwise construction of long peptide chains is to individually synthesize segments of the target protein and then chemically couple the segments to form the full length protein. This convergent synthesis procedure has the obvious advantage that the smaller fragments (e.g., <100 amino acids) are easier to synthesize and purify. In one approach, in order to obtain unambiguous reactions, the side chain functionalities of the peptide fragments are protected and the peptide fragments joined using the same chemistry to create a peptide bond between the two fragments. However, more preferred techniques have been developed for the total chemical synthesis of proteins. For instance, regioselective condensation of unprotected polypeptide fragments has been achieved by chemoselective ligation (Schnölzer et al. (1992) Science 256:221-225; Dawson et al. (1993) J Am Chem Soc 115:7263-7266; and Dawson et al. (1994) Science 266:776). In this approach, highly selective chemistries are used to link together unprotected shorter fragments which have been synthesized from D-amino acids, e.g., by SPPS. This method involves the joining of unprotected peptide segments in an unambiguous manner by the chemoselective reaction of unique, mutually reactive functionalities, one on each segment. In this way a stable bond is formed in a predictable and controlled fashion even in the presence of the range of functional groups found in proteins. Chemical ligation is not limited to formation of a peptide bond at the ligation site. As further described below, a variety of ligation chemistries can be used to implement the strategy. Exemplary chemoselective ligation reactions include thioester-forming ligations (Schnölzer et al., supra) and oxime-forming ligations (Rose (1994) J Am Chem Soc 116:30-34). Other chemistries of potential utility in the formation of a D-amino acid target protein by chemoselective ligation include hydrazone formation (Gaertner et al. (1992) Bioconjugate Chem 3:262-268), metal chelation (Ghadiri et al. (1992) J Am Chem Soc 114:825-831), disulfide formation (O'Shea et al. (1989) Science 245:646-648; and Futaki et al. (1994) Tetrahedron Lett 35:1267-1270), thioether formation (Muir et al. (1994) Biochemistry 33:7701-7708), and thiazolidine formation (Liu et al. (1994) PNAS 91:6584-6588).


In an exemplary embodiment, a preferred linkage is a thiol ester linkage. This linkage can be accomplished by first attaching a facile leaving group to a first peptide segment and by attaching a carbonylthiol functionality to a second peptide segment. The two segments are covalently joined by nucleophilic substitution involving the sulfur nucleophile attacking the leaving group. For example, as illustrated below, a sulfur nucleophile at the terminus of one peptide segment is used to attack an alkyl bromide at the terminus of the other segment. The high chemical selectivity of this SN2-type reaction can, under proper conditions, allow the ligation of the two peptides with all other functionalities unprotected (see, for example, Kent et al. PCT Publication WO93/20098).

In the illustrated example, the two halves of the target protein (e.g., interleukin-8) are chemically synthesized by the SPPS method using D-amino acid precursors, the peptides cleaved, deprotected with hydrogen fluoride, and purified by high performance liquid chromatography (Schnölzer et al., supra). In the case of the C-terminal fragment, the amino terminus of that fragment is functionalized with bromoacetic anhydride. To synthesize the N-terminal fragment, 4-[α-(Boc-Gly-S)benzyl]phenoxyacetamidomethyl-resin is used as the resin support which releases the carbonylthiol. The ligation reaction is carried out by mixing the amino- and carboxyl-terminal halves under normal ligation reaction conditions optimized for peptide solubility (e.g., 6 M guanidine hydrochloride, 0.1 M sodium phosphate buffer at pH 4.3).

Met1...Ser57-NHCH2COSH + BrCH2CO-Phe59...Ser99-OH

→ Met1...Ser57-NHCH2CO-S-CH2CO-Phe59...Ser99-OH (IL-8 PR)

Moreover, two or more mutually exclusive ligation chemistries can be used to ligate three or more peptide segments in a specific manner. For example, the thioester- and oxime-forming chemistries can be used in a modular approach to the synthesis of a protein through several peptide fragments. Such methods have permitted the synthesis of large proteins, and are particularly amenable to the synthesis of proteins which possess numerous discrete domains, as for example many of the proteins involved in clot formation and dissolution (fibrinolysis). In an exemplary embodiment, Canne et al. (1995) J Am Chem Soc 117:2998-3007 have described a modular ligation strategy for generating covalently linked transcription factor dimers. The cMyc-Max heterodimer and Max homodimer described in Canne et al. are easily provided as the D-amino acid enantiomers by synthesizing the individual peptide fragments with D-amino acids and carrying out the convergent coupling according to those authors.


An alternative approach to the preparation of covalent peptide arrays of predetermined secondary and tertiary structure is the "template-assembled synthetic protein" (TASP) concept (Mutter et al. in Peptides: Chemistry and Biology. Proceedings of the 10th American Peptide Symposium. Marshall, G.R., Ed.; Escom: Leiden, The Netherlands, 1988, pp. 349-353). A template molecule is used to covalently anchor arrays of secondary structure elements. For example, by this process peptide units are synthesized separately and covalently coupled to a multifunctional carrier, such as a core peptide, using chemical coupling reagents. The distinctive feature of the TASP approach is the nonlinear topology used; the molecule is made up of an array of branched polypeptides rather than the folded linear polypeptide of natural proteins (Mutter et al. (1989) Angew Chem (Intl Ed) 28:535-554). Mutter and co-workers (see Mutter (1988) TIBS 13:260-265; Mutter et al. (1988) Makromol. Chem. Rapid Commun 9:437-443; Mutter et al. (1988) Tetrahedron 44:771-785; Mutter et al. (1988) Helvetica Chimica Acta 71:835-846; and Mutter et al. (1989) Proteins: Structure, Function, and Genetics 5:13-21) have described the TASP concept, in which component amphiphilic peptides, particularly those preferring α-helical and β-sheet structures, are assembled by covalent bonds to a carrier or template molecule which is said to direct "the peptide chains into protein-like packing arrangements." The resultant molecule has a branched structure in which a number of peptides extend from the template. Oligopeptides, in particular, are employed as template molecules.

In a preferred embodiment, the TASP molecules are synthesized by a chemoselective ligation approach. For example, reaction of a readily prepared synthetic helical peptide-αCOSH (see above) with a synthetic (BrAc)n template molecule can be used to generate multi-helical assemblies (see, for example, Dawson et al. (1993) J Am Chem Soc 115:7263-7266). Another approach, suitable for the preparation of multimeric proteins (e.g., MUC-1, which contains a multimer of a 20-amino-acid sequence), is to first synthesize the monomeric peptide (e.g., the 20-amino-acid sequence) with protecting groups on all non-terminal amino and carboxyl groups, and then polymerize the monomeric peptide with a carbodiimide reagent to allow the formation of native peptide bonds. Oligosaccharide residues could be attached chemically before polymerization, if desired (see below).

Several criteria should be considered in choosing a polypeptide target suitable for synthesis in the mirror image form:

(i) Its amino acid sequence must be known to enable its chemical synthesis.

(ii) The target should be associated with a biologically significant process, such as a disease process.

(iii) Extracellular targets are preferred because these are most accessible to hydrophilic D-peptides. However, intracellular targets may also be considered because they are accessible to peptides that (a) directly cross the membrane due to their intrinsic hydrophobicity, (b) cross the membrane via endocytosis or pinocytosis, or (c) are delivered across the membrane by transposing agents such as liposomes. Transmembrane targets are also suitable because potential solubility problems may be overcome when they are synthesized in fragments (see the example of the IL-8 receptor below).

(iv) Small polypeptides or fragments of large polypeptides are preferred as targets because these are easier to chemically synthesize.

(v) It is better that the fragments retain biological activity to guarantee their suitability as targets. However, this is not critical because, for example, peptide fragments of proteins that lack biological activity are still useful for the isolation of monoclonal antibodies that have activities against the full-length protein.

(vi) It is preferable that the target fragment does not contain modifications, such as glycosylations, that are necessary for activity because these are more difficult to synthesize. However, methods for the conjugation of oligosaccharides to peptides are available (see below), if necessary.

In choosing a polypeptide screening target, factors which can be considered include solubility, peptide chain length, requirement of post-translational modifications or addition of co-factors, and/or the monomeric or oligomeric nature of the protein(s) upon which the target is based. In general, it will be desirable that the polypeptide target be soluble. Many of the cytosolic and extracellular proteins which are contemplated as candidates for development of D-polypeptide targets can be chemically synthesized in their full-length form as soluble molecules. However, for ease of chemical synthesis, and in some instances to overcome solubility problems which may otherwise arise in the synthesis of certain full-length proteins, the screening polypeptide can be derived from a smaller portion of the protein of interest.

For instance, the target can be generated as the D-enantiomer of a domain, or other portion of the protein, which retains a biological activity against which a compound library is to be screened. For example, domains and/or motifs are well known that, even when isolated from the full-length protein, retain such activities as ligand binding or catalytic activity. Accordingly, the present invention contemplates screening targets which correspond to (e.g. include) such domain structures as: SH2 domains; SH3 domains; ankyrin-like repeats; WD40 motifs; Kunitz-type inhibitor domains; growth factor-like domains such as EGF-like domains; Kringle domains; fibronectin finger-like domains; heparin-binding domains; death domains; TRAF domains; pleckstrin homology (PH) domains; ITAMs; catalytic domains such as kinase domains; phosphatase domains; phospholipase domains; guanine nucleotide exchange factor (GEF) domains; and hydrolase domains (such as protease domains); or DNA binding domains such as leucine zippers, zinc fingers and helix-loop-helix motifs.

Where the protein of interest is a transmembrane protein, the screening target can be derived from a soluble extracellular or cytoplasmic domain. For instance, the screening target can be the D-enantiomer of a soluble ligand binding domain of a cell surface receptor. To illustrate, the screening target can correspond to the extracellular domain of a guanylyl cyclase, a cytokine receptor, a tyrosine kinase receptor, or a serine/threonine kinase receptor. In other embodiments, the screening target can correspond to a soluble portion of a G-protein coupled receptor (GCR) which retains ligand binding activity. For example, as described above, certain of the extracellular loops between the transmembrane portions of the GCRs have been shown to retain ligand binding activity even when provided free in solution. In still other embodiments, the screening target can be reconstituted in a lipid bilayer, such as a liposome or other vesicle (see, for example, Kalva Kolanu et al. (1990) Biotechniques 11:248; and the Huang U.S. Patents 4,957,735 and 4,708,933) and the lipid-protein combination used as the screening target.

The selection of amino acid length is, on the one hand, related to the solubility issue, and on the other hand, related to the issue of chemical synthesis. It may be desirable, despite good solubility characteristics of the full-length protein, to nevertheless restrict the size of the screening target in order to reduce expense and increase yield. At present, preferred screening targets are in the range of 50-500 amino acids in length, preferably less than 400 residues, and even more preferably less than 200. Of course, as set out above, the screening target can be a short peptide sequence, e.g. less than 50 residues, including peptides as small as dipeptides.

Where modification of the screening target is desired to correspond to the post-translational modification of the natural protein, standard chemical coupling with non-peptide and peptide functionalities can be provided. For example, the target can be glycosylated, prenylated, phosphorylated, etc. to resemble actual post-translational modification. However, where the added moiety is itself chiral, the opposite enantiomer of that naturally occurring with the protein can be provided with the screening target.

Merely for purposes of illustration, the following protein targets are described for use in the subject method.

In one embodiment, the target protein is the gastrin-releasing peptide (GRP). The lung cancer mortality rate is almost 90% at two years, with about 25% of the deaths due to small-cell lung cancer (SCLC). An excellent therapeutic target for SCLC is gastrin-releasing peptide, a 27-amino-acid-long bombesin-like growth factor (Marx et al. (1984) PNAS 81:5699; and Cuttitta et al. (1985) Nature 316:823). Human SCLC cell lines secrete GRP which, in turn, can stimulate the growth of these cell lines in vitro (Cuttitta et al., supra). Furthermore, a murine MAb to GRP that blocks the binding of GRP to its receptor on SCLC cells inhibits the growth of SCLC in culture and in nude mice (20). Clinical trials with this MAb have begun (Avis et al. (1991) J. Natl. Cancer Inst. 83:1470).

The subject method can accordingly be used to develop D-amino acid peptides which, by interfering with the binding of GRP to its cognate receptor, may be used as anti-proliferative agents to inhibit GRP-stimulated cell growth.

Another potential class of therapeutic targets is the receptors of the neu receptor family. In women in the U.S.A., breast cancer is the most common cancer and is second only to lung cancer in the number of cancer deaths. A prime breast cancer target is neu/erbB-2/HER-2, a 185 kD transmembrane phosphoglycoprotein tyrosine kinase (Shih et al. (1981) Nature 290:261). Amplification or overexpression of the neu oncogene occurs in about 30% of breast and ovarian adenocarcinomas, a finding that correlates with a poor response to primary therapy (Slamon et al. (1987) Science 235:177; and Hayes et al. (1993) Annals of Oncology 4:807). Transfection of NIH 3T3 cells with the neu oncogene results in transformation (Shih et al., supra), and introduction of an activated neu oncogene into mice results in the transformation of the entire mammary epithelium (Muller et al. (1988) Cell 54:105) or the stochastic appearance of mammary tumors (Bouchard et al. (1989) Cell 57:931).

Evidence is accumulating that breast cancer may be inhibited by molecules that bind to neu or a ligand of neu, such as members of the heregulin family (Holmes et al. (1992) Nature 256:1205). MAbs and their radiolabelled conjugates that bind to the extracellular domain of neu retard the growth of breast cancer cells in culture and in nude mice without the selection of neu-negative cell clones (DeSantes et al. (1992) Cancer Res. 52:1916; and Drebin et al. (1986) PNAS 83:9129). Such MAb conjugates, which are now in clinical trials (Lippman et al. (1993) Science 259:631), may alter the neu signal transduction pathway and affect tumor growth in several different ways. They may (i) overstimulate neu, thereby causing differentiation (Bacus et al. (1992) Cancer Res. 52:2580), (ii) prevent homo- or hetero-dimerization of neu, thereby inactivating neu (Caraway et al. (1994) Cell 78:5), (iii) cause cellular internalization and down-regulation of neu (Tagliabue et al. (1991) Int. J. Cancer 47:93), (iv) deliver conjugated cytotoxic radionuclides or toxins to the cell surface or cytoplasm, or (v) prevent binding of a ligand to neu. The neu receptor shares 39-50% identity with the extracellular domain of the epidermal growth factor receptor (EGFR) that contains a 202-amino-acid fragment sufficient for high affinity binding to epidermal growth factor and transforming growth factor alpha (Kohda et al. (1993) J. Biol. Chem. 268:1976). Furthermore, an eight-amino-acid fragment of neu (termed neuRL2) from the putative ligand-binding domain inhibits the phosphorylation of neu and the growth of breast cancer cells in a dose-dependent manner, but does not inhibit cells lacking neu. It was concluded that the neuRL2 peptide probably bound to, and inactivated, a neu ligand (Neri et al. First SPORE Investigator's Meeting, Rockville, Maryland. Abstract #2, July 18-20, 1993 (submitted for publication in May, 1995)). Thus, this eight-amino-acid fragment of neu is a good target for anti-neu drug design.

The D-peptides which can be derived by the present invention can be useful for inhibiting the biological function of neu by, for example, competitively disrupting the binding of neu with its ligand or other protein, or preventing allosteric activation of an enzymatic activity associated with neu. Alternatively, they may be useful as agonists causing overstimulation or down-regulation of neu.

Yet another potential target is Interleukin-8 (IL-8). IL-8 is a chemoattractant and activator of neutrophils, and has been implicated in a wide range of acute and chronic inflammatory diseases (Murphy (1994) Annu. Rev. Immunol. 12:593-633). Human IL-8 is a 72-amino-acid-long polypeptide produced by monocytes, fibroblasts, keratinocytes and endothelial cells upon induction by factors such as tumor necrosis factor, interleukin-1, and lipopolysaccharides (Murphy, supra). Certain analogs of IL-8 act as IL-8 antagonists in vitro by inhibiting neutrophil activation (chemotaxis, exocytosis and respiratory burst), suggesting that anti-IL-8 agents may have therapeutic potential for inflammatory diseases (Moser et al. (1993) J. Biol. Chem. 268:7125-7128).

The monomeric IL-8 peptide forms dimers in vitro with a Kd of 20 μM (Paolini et al. (1994) J Immunology 153:2704; and Burrows et al. (1994) Biochemistry 33:12741-12745), so it is possible that the monomer and/or the dimer are active in vivo. Mutants that cannot dimerize are active in functional assays in vitro (Rajarathnam et al. (1994) Science 264:90). Interestingly, NMR and X-ray determination of the three-dimensional structure of the IL-8 dimer (Clore et al. (1990) Biochemistry 29:1689-1696; and Baldwin et al. (1991) PNAS 88:502) revealed that it resembled the peptide-binding groove of MHC class I and II proteins (Bjorkman et al. (1987) Nature 329:506), so it is conceivable that IL-8 dimers may be able to bind to in vitro-selected peptide sequences in a manner similar to the MHC molecules. Therefore, the IL-8 dimer, in addition to the monomer, is an attractive target.
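To put the reported dissociation constant in perspective, the following sketch computes how a given total IL-8 concentration would partition between monomer and dimer under a simple two-state monomer-dimer equilibrium. Only the 20 μM Kd comes from the text; the function names and the example concentrations are illustrative assumptions, and the model ignores any higher-order association or binding to other molecules.

```python
import math


def monomer_dimer_split(total_uM: float, kd_uM: float = 20.0):
    """Return (free monomer, dimer) concentrations in uM for 2 M <-> D.

    Assumes Kd = [M]^2 / [D] and the mass balance total = [M] + 2 [D].
    The default Kd of 20 uM is the value quoted in the text.
    """
    if total_uM < 0 or kd_uM <= 0:
        raise ValueError("total must be >= 0 and Kd must be > 0")
    # Substituting [D] = [M]^2 / Kd into the mass balance gives
    # 2 [M]^2 + Kd [M] - Kd * total = 0; take the positive root.
    monomer = (-kd_uM + math.sqrt(kd_uM ** 2 + 8.0 * kd_uM * total_uM)) / 4.0
    dimer = monomer ** 2 / kd_uM
    return monomer, dimer


def fraction_in_dimer(total_uM: float, kd_uM: float = 20.0) -> float:
    """Fraction of all polypeptide chains that sit in dimers."""
    if total_uM == 0:
        return 0.0
    _, dimer = monomer_dimer_split(total_uM, kd_uM)
    return 2.0 * dimer / total_uM


if __name__ == "__main__":
    for total in (0.001, 1.0, 20.0, 200.0):  # hypothetical totals in uM
        print(total, round(fraction_in_dimer(total), 4))
```

Under this model, half of the chains are in dimers when the total concentration equals the Kd, while at nanomolar concentrations the monomer dominates.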

In the case of the IL-8 pathway, an alternative target to IL-8 is a functional fragment of the IL-8 receptor. Fragments of the human and rabbit IL-8 Type 1 receptor of 39 and 44 amino acids, respectively, are functional in IL-8 binding assays (Gayle et al. (1993) J Biol Chem 268:7283-7289). Thus, members of the largest receptor family, the seven transmembrane receptors, are potential targets, because these fragments could be readily synthesized as D-peptides for use as targets.

Small ligand-binding fragments of other receptors have also been determined using an ingenious method. For example, small regions of the human thyrotropin receptor capable of hormone binding were determined by synthesizing overlapping peptides of the proposed hormone-binding domain and assaying them for binding activity (Atassi et al. (1991) PNAS 88:3613-3617). Such an approach could be applied to any receptor.
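The overlapping-peptide strategy described above amounts to sliding a fixed-length window along the domain sequence. The sketch below shows one way such a peptide set might be enumerated; the window length, the step size, the function name and the placeholder sequence are all illustrative assumptions and are not taken from the cited study.

```python
def overlapping_peptides(sequence: str, length: int, step: int):
    """List (1-based start position, peptide) windows covering a sequence.

    Consecutive windows overlap by (length - step) residues. A final window
    is added if needed so that the C-terminal residues are always covered.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if length > len(sequence):
        raise ValueError("window is longer than the sequence")
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the C-terminus is included
    return [(s + 1, sequence[s:s + length]) for s in starts]


if __name__ == "__main__":
    # Placeholder sequence (the 20 standard residues), not a real receptor.
    domain = "ACDEFGHIKLMNPQRSTVWY"
    for start, peptide in overlapping_peptides(domain, length=8, step=4):
        print(start, peptide)
```

Each peptide in the resulting list would then be synthesized and assayed individually; a shorter step gives finer mapping at the cost of more syntheses.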

The cellular proto-oncogene c-myc is involved in cell proliferation and transformation but is also implicated in the induction of programmed cell death (apoptosis). The c-Myc protein is a transcriptional activator with a carboxyl-terminal basic region/helix-loop-helix (HLH)/leucine zipper (LZ) domain. It forms heterodimers with the HLH/LZ protein Max and transactivates gene expression after binding DNA E-box elements. The protein Max is the obligatory partner of c-Myc for many of its biological functions analyzed to date. For instance, Myc must heterodimerize with Max to bind DNA and perform its oncogenic activity.

According to the present invention, the subject method can be used to derive D-amino acid peptides which can inhibit formation of complexes between Myc and other proteins such as Max, and/or which can inhibit the binding of a Myc complex to a myc-responsive element in a gene. The total synthesis of Myc-Max and Max-Max dimers is described by Canne et al. (1995) J Am Chem Soc 117:2998-3007.

Yet another target which can be derived for use in the subject method is fibronectin, a glycoprotein involved in cell adhesion, tissue organization and wound healing. The total synthesis of fibronectin modules is described by, for example, Williams et al. (1994) J Am Chem Soc 116:10797-10798.

It has been previously shown that the expression of human immunodeficiency virus type 1 (HIV-1) major gag protein, p24, is persistent on the surface of HIV-1-infected cells (Nishino et al. (1992) Vaccine 10:677-683). The total synthesis of a C-terminal 100 amino acid fragment of p24 is described by Mascagni et al. (1990) Tetrahedron Lett 31:4637-4640, and that portion of the p24 protein, as a D-enantiomer, can be used to generate a screening target.

The total synthesis of TGFα has been described by Woo et al. (1989) Protein Eng 3:29-37, and accordingly provides a possible target molecule. Likewise, the HIV protease has been synthesized by total chemical synthetic means (Kent et al. PCT Publication WO93/20098) and provides a unique target for developing inhibitors of the catalytic activity as well as inhibitors of protein-protein interactions involving the protease.

B. Nucleic Acid Targets

In another embodiment, the screening target can be an L-nucleic acid such as single or double-stranded L-DNA or single or double-stranded L-RNA, where the natural D-ribose is replaced by its enantiomer L-ribose. The synthesis of L-DNA and L-RNA is described in the art. See, for example, Visser et al. (1986) Recl Trav Chim Pays Bas 105:528; Morvan et al. (1990) Biochem Biophys Res Comm 172:537-543; Anderson et al. (1984) Nucleosides and Nucleotides 3:499-512; Asseline et al. (1991) Nuc Acid Res 19:4067-4074; Garbesi et al. (1993) J Am Chem Soc 21:4159-4165; Ashley (1992) J Am Chem Soc 114:9731-9736; and Damha et al. (1994) Biochemistry 33:7877-7885.

To illustrate, the pyrimidine L-2'-deoxynucleosides dC and dU can be prepared readily from commercially available, naturally occurring L-arabinose (Anderson supra). Also, L-dU can be readily converted to L-dA and L-dG by transglycosylation procedures such as those detailed by Holy (1972) Collect Czech Chem Commun 37:4072-4087, and Spadari et al. (1992) J Med Chem 35:4214-4220; and L-thymidine can be synthesized from L-dU according to the procedure of Reese et al. (1983) J Chem Soc Chem Commun 877-879. Protected forms of the L-nucleosides, such as the phosphoramidites of the nucleosides, are suitable for direct use on an automated gene synthesizer and can be used to synthesize L-oligonucleotides by standard protocols (cf. Damha et al. supra). L-RNA can be assembled in similar fashion using the silyl phosphoramidite method (Ogilvie et al. (1988) PNAS 85:5764-5768; and Damha et al. (1993) in Protocols for Oligonucleotides and Analogs: Synthesis and Properties (Agrawal, S. Ed.) pp. 81-114, The Humana Press, Inc., Totowa, NJ).

In one embodiment, the L-oligonucleotide targets can be used to screen L-peptide or other libraries for compounds which selectively bind a particular nucleic acid sequence or structure (such as a hairpin). In the instance of peptides identified by the subject method, D-amino acid enantiomers of the peptides will accordingly bind the natural D-enantiomer of the nucleic acid. Such compounds can be useful as, for example, specific inhibitors of transcriptional and/or translational events, or as agents which alter transcript stability. The nucleic acid target can be, to illustrate, the L-enantiomer of a transcriptional regulatory sequence; a structural sequence, e.g., which involves histone contact or other chromatin regulatory processes; a methylation site; a viral RNA or DNA; an rRNA; a tRNA; or a sequence involved in mRNA stability, ribosome contact and/or critical 2° structure.

In an exemplary embodiment, the nucleic acid target is the L-enantiomer of a single-stranded RNA with the HIV TAR sequence. The TAR sequence, through the formation of different 2° structures in the presence or absence of various factors (e.g. TAT), is a cis-acting element which regulates transcription of certain HIV genes. Accordingly, molecules which alter the structure or equilibrium of structures of the TAR sequence may be useful to disrupt the fidelity of the HIV life cycle (Lepidof et al. (1995) J Virol 69:5422-5430; and U.S. Patents 5,474,935 and 5,278,042). The subject method can be used to identify molecules which disrupt the TAR structure, or the ability of cellular or viral proteins to bind the TAR sequence. Thus, for example, D-peptides which may be useful in the treatment of HIV can be identified by the subject method.

Likewise, the TAR sequence can be used to screen nucleic acid libraries in order to identify L-nucleic acids as potential antisense constructs (described infra); such nucleic acids may ultimately have therapeutic value.

In another exemplary embodiment, the nucleic acid target is the L-enantiomer of the double-stranded DNA estrogen response element (ERE). D-peptides which bind the ERE may prevent DNA binding of the estrogen receptor, thereby acting as anti-estrogens (REF).


C. Carbohydrate Targets

In yet another embodiment, the target can be a carbohydrate. Carbohydrates can be synthesized from L-sugars, many of which are readily available, in order to provide the opposite enantiomer to a carbohydrate of interest. For example, step-wise synthesis of L-carbohydrates can be carried out using traditional organic synthesis (see, for example, Malikes et al. (1990) Chemiker-Zeitung 114:371-375; PCT publication WO 95/03315; and U.S. patents 5,476,924 and 5,470,843), and may be automated (see U.S. patent 5,288,037). Carbohydrate targets can be polymeric, e.g. a polysaccharide, or constitute a portion of a non-polymeric molecule, such as an inositol.

In some circumstances, the carbohydrate target will constitute a glycoconjugate, such as may be produced by condensation with a polyamide such as a polypeptide. See, for example, U.S. patents 5,470,749 and 5,324,663 (peptide conjugates) and 5,309,096 (lipid conjugates). As desired, the molecule(s) to which the carbohydrate is conjugated, if chiral, may be provided in enantiomerically pure form as well. Accordingly, an L-polysaccharide and D-peptide can be coupled to form the opposite enantiomeric target for a given glycoprotein.

In an exemplary embodiment, the carbohydrate target can be an L-polysaccharide enantiomer of a carbohydrate found on a cell surface receptor, or a viral protein, which is implicated in viral infectivity. Ligands identified by the subject method may, accordingly, be useful to prevent viral infectivity.

D. Other Targets

Other exemplary chiral targets include, in addition to the polymers described above, non-peptide hormones and other small molecule signal transducers, metabolic products, and transition state analogs.

To illustrate, the target molecule can be an enantiomer of a prostaglandin or thromboxane. For example, enantioselective synthesis of thromboxanes is described in U.S. Patent 4,256,646. Agents identified by the subject method which bind to prostaglandins or thromboxanes may be useful for modulating platelet aggregation.

In another embodiment, the target can be the opposite enantiomer of a naturally occurring inositol. Inositol phosphates are critical components in many signal transduction pathways. Inositol binding molecules identified by the subject method may be useful therefore as modulators of various signal processes, particularly in altering a cell's response to extracellular signals transduced by cell surface receptors.

In yet another embodiment, the target molecule is the opposite enantiomer of a bioactive steroid, lipid-soluble vitamin, terpene or acetogenin. Steroids are also critical components of intracellular and extracellular signaling. Ligands which bind, for example, progesterone, aldosterone, cortisol, testosterone, and/or estradiol can have broad therapeutic application including modulation of inflammation and fertility. Other potentially therapeutic compounds may be identified in the present method using, for example, enantiomerically-enriched preparations of any of vitamin K, vitamin E, vitamin D, santonin, cedrol, gibberellic acid, citrinin, terramycin and the like as target molecules.

In another embodiment, the target is an enantiomer of a chiral drug. To illustrate, the enantioselective synthesis of cocaine has been previously described (Lewin et al. (1987) J Heterocyclic Chem 24:19). Using an enantiomerically enriched preparation of (+) cocaine, peptides (or other molecules from a library) can be identified which bind to the cocaine target. The opposite enantiomer of the identified ligand, which should bind the bioactive (-) cocaine, can be further tested for its ability to antagonize the delivery of cocaine to its receptor.
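The symmetry argument used throughout this section (a ligand selected against the mirror-image target yields, on inversion, a ligand for the natural target) can be written down as simple bookkeeping. The sketch below is only an illustration: the class, the handedness labels and the example names are hypothetical, and the code records the inference rather than modeling any chemistry.

```python
from dataclasses import dataclass

# Each handedness label maps to its mirror-image label.
FLIP = {"L": "D", "D": "L", "(+)": "(-)", "(-)": "(+)"}


@dataclass(frozen=True)
class Molecule:
    name: str
    handedness: str  # one of the keys of FLIP

    def enantiomer(self) -> "Molecule":
        """Return the mirror-image form of this molecule."""
        return Molecule(self.name, FLIP[self.handedness])


def infer_mirror_pair(hit: Molecule, screened_target: Molecule):
    """Given an observed (hit, target) interaction, return the mirror-image
    pair that is expected to interact in the same way by symmetry."""
    return hit.enantiomer(), screened_target.enantiomer()


if __name__ == "__main__":
    # Hypothetical hit from an L-peptide library screened against (+) cocaine.
    ligand, target = infer_mirror_pair(
        Molecule("peptide-1", "L"), Molecule("cocaine", "(+)")
    )
    print(ligand, target)
```

In the cocaine example, the inferred pair is the D-form of the selected peptide and the bioactive (-) cocaine; the inference still has to be confirmed experimentally, as the text describes.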

In still another embodiment, the target can be a transition state analog of a desired chemical reaction. Molecules identified by the subject method can be tested for the ability to enhance, and even catalyze, a chemical reaction. Returning to the cocaine example merely for illustrative purposes, it is noted that transition state analogs of cocaine have been synthesized from (-) ecgonine, and these analogs have successfully been employed to generate catalytic antibodies (see, for example, Landry et al. (1993) Science 259:1899-1901). According to the present invention, the enantiomer of the Landry transition state analog, e.g. derived from (+) ecgonine, can be used to screen compound libraries. The enantiomers of such molecules can then be tested for catalytic activity in hydrolyzing the benzoyl ester of cocaine to give ecgonine methyl ester. Such catalytic agents may be useful in the treatment of cocaine addiction.

II. Variegated Compound Libraries

The target, be it chiral or achiral, can be screened against any of a variety of test compounds. The compound library can range from natural extracts to random chemical collections to combinatorial libraries. In preferred embodiments, the molecular identity of a test compound which interacts with the target can be readily ascertained. For instance, the test compound may be present at such a concentration that it is directly sequenceable, such as by mass spectroscopy, or is immobilized at a known spatial address, or is associated with a label which provides the ability to amplify the test agent and/or itself provides the information as to the molecular identity of the test agent. Exemplary combinatorial libraries include peptide libraries, nucleic acid libraries, carbohydrate libraries, and libraries of small organic molecules. Other libraries amenable to the subject method, e.g. which contain many chiral compounds, will be apparent to those skilled in the art in light of the present invention.

A. Variegated Peptide Display

The variegated peptide libraries of the subject method can be generated by any of a number of methods, and, though not limited thereby, preferably exploit recent trends in the preparation of chemical libraries. The library can be prepared, for example, by either synthetic or biosynthetic approaches, and screened for activity against the D-enantiomer target in a variety of assay formats. As used herein, "variegated" refers to the fact that a population of peptides is characterized by having peptide sequences which differ from one member of the library to the next. For example, in a given peptide library of n amino acids in length, the total number of different peptide sequences in the library is given by the product $V_1 \times V_2 \times \cdots \times V_{n-1} \times V_n$, where each $V_n$ represents the number of different amino acid residues occurring at position n of the peptide. In a preferred embodiment of the present invention, the peptide display collectively produces a peptide library including at least 96 to 10^7 different peptides, so that diverse peptides may be simultaneously assayed for the ability to interact with the target protein. Peptide libraries are systems which simultaneously display, in a form which permits interaction with a target protein, a highly diverse and numerous collection of peptides. These peptides may be presented in solution (Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner USSN 5,223,409), spores (Ladner USSN '409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; and Ladner USSN '409).
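As a quick check on the product of per-position residue counts described above, the following sketch computes the number of distinct sequences in a library. The function name and the example counts are illustrative assumptions, not values from the specification.

```python
from math import prod


def library_size(residues_per_position):
    """Number of distinct peptide sequences in a variegated library.

    `residues_per_position` lists, for each position of the peptide, how many
    different amino acid residues are allowed there (1 = invariant position).
    """
    counts = list(residues_per_position)
    if not counts or any(c < 1 for c in counts):
        raise ValueError("every position needs at least one allowed residue")
    return prod(counts)


if __name__ == "__main__":
    # Hypothetical hexapeptide with all 20 common residues at every position.
    print(library_size([20] * 6))
    # Hypothetical tetrapeptide with one fixed and one restricted position.
    print(library_size([20, 1, 20, 4]))
```

A fully randomized hexapeptide over 20 residues already gives 6.4 x 10^7 sequences, which shows how quickly the product grows with peptide length.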

In one embodiment, the peptide library is derived to express a combinatorial library of peptides which are not based on any known sequence, nor derived from cDNA. That is, the sequences of the library are largely random. It will be evident that the peptides of the library may range in size from dipeptides to large proteins.

In another embodiment, the peptide library is derived to express a combinatorial library of peptides which are based at least in part on a known polypeptide sequence or a portion thereof (not a cDNA library). That is, the sequences of the library are semi-random, being derived by combinatorial mutagenesis of a known sequence(s). See, for example, Ladner et al. PCT publication WO 90/02809; Garrard et al., PCT publication WO 92/09690; Marks et al. (1992) J. Biol. Chem. 267:16007-16010; Griffiths et al. (1993) EMBO J 12:725-734; Clackson et al. (1991) Nature 352:624-628; and Barbas et al. (1992) PNAS 89:4457-4461. Accordingly, polypeptide(s) which are known ligands for a target protein can be mutagenized by standard techniques to derive a variegated library of polypeptide sequences which can further be screened for agonists and/or antagonists.

In still another embodiment, the combinatorial polypeptides are produced from a cDNA library.

Depending on size, the combinatorial peptides of the library can be generated as is, or can be incorporated into larger fusion proteins. The fusion protein can provide, for example, stability against degradation or denaturation, as well as a secretion signal if secreted. In an exemplary embodiment, the polypeptide library is provided as part of thioredoxin fusion proteins (see, for example, U.S. Patents 5,270,181 and 5,292,646; and PCT publication WO 94/02502). The combinatorial peptide can be attached on the terminus of the thioredoxin protein, or, for short peptide libraries, inserted into the so-called active loop.

In preferred embodiments, the combinatorial polypeptides are in the range of 3-100 amino acids in length, more preferably at least 5-50, and even more preferably at least 10, 13, 15, 20 or 25 amino acid residues in length. Preferably, the polypeptides of the library are of uniform length. It will be understood that the length of the combinatorial peptide does not reflect any extraneous sequences which may be present in order to facilitate expression, such as signal sequences or invariant portions of a fusion protein.

i) Biosynthetic Peptide Libraries

The harnessing of biological systems for the generation of peptide diversity is now a well established technique which can be exploited to generate the peptide libraries of the subject method. The source of diversity is the combinatorial chemical synthesis of mixtures of oligonucleotides. Oligonucleotide synthesis is a well-characterized chemistry that allows tight control of the composition of the mixtures created. Degenerate DNA sequences produced are subsequently placed into an appropriate genetic context for expression as peptides. There are two principal ways in which to prepare the required degenerate mixture. In one method, the DNAs are synthesized a base at a time. When variation is desired at a base position dictated by the genetic code, a suitable mixture of nucleotides is reacted with the nascent DNA, rather than the pure nucleotide reagent of conventional polynucleotide synthesis. The second method provides more exact control over the amino acid variation. First, trinucleotide reagents are prepared, each trinucleotide being a codon of one (and only one) of the amino acids to be featured in the peptide library. When a particular variable residue is to be synthesized, a mixture is made of the appropriate trinucleotides and reacted with the nascent DNA. Once the necessary "degenerate" DNA is complete, it must be joined with the DNA sequences necessary to assure the expression of the peptide, as discussed in more detail below, and the complete DNA construct must be introduced into the cell.
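For the base-at-a-time method, the consequence of a given nucleotide mixture can be worked out by expanding the degenerate codon and translating each member with the standard genetic code. The sketch below does this using IUPAC ambiguity codes; the NNK and NNN schemes in the usage example are common illustrations chosen here as assumptions and are not named in the specification, and the function names are hypothetical.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(degenerate_codon: str):
    """List every concrete codon encoded by a three-letter degenerate codon."""
    if len(degenerate_codon) != 3:
        raise ValueError("a codon has exactly three positions")
    choices = (IUPAC[base] for base in degenerate_codon.upper())
    return ["".join(p) for p in product(*choices)]


def summarize(degenerate_codon: str):
    """Count codons, encoded amino acids and stop codons for a mixture."""
    codons = expand(degenerate_codon)
    encoded = [CODON_TABLE[c] for c in codons]
    return {
        "codons": len(codons),
        "amino_acids": sorted(set(encoded) - {"*"}),
        "stops": encoded.count("*"),
    }


if __name__ == "__main__":
    for scheme in ("NNN", "NNK"):
        info = summarize(scheme)
        print(scheme, info["codons"], len(info["amino_acids"]), info["stops"])
```

Because mixtures at individual base positions cannot avoid codon redundancy or stop codons entirely, the amino acids are not equally represented; this is the limitation that the trinucleotide method, with one codon per amino acid, is described as addressing.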

Whatever the method may be for generating diversity at the codon level, chemical synthesis of a degenerate gene sequence can be carried out in an automatic DNA synthesizer, and the synthetic genes can then be ligated into an appropriate gene for expression. The purpose of a degenerate set of genes is to provide, in one mixture, all of the sequences encoding the desired set of potential test peptide sequences. The synthesis of degenerate oligonucleotides is well known in the art (see for example, Narang, SA (1983) Tetrahedron 39:3; Itakura et al. (1981) Recombinant DNA, Proc 3rd Cleveland Sympos. Macromolecules, ed. AG Walton, Amsterdam: Elsevier pp. 273-289; Itakura et al. (1984) Annu. Rev. Biochem. 53:323; Itakura et al. (1984) Science 198:1056; Ike et al. (1983) Nucleic Acid Res. 11:477). Such techniques have been employed in the directed evolution of other proteins (see, for example, Scott et al. (1990) Science 249:386-390; Roberts et al. (1992) PNAS 89:2429-2433; Devlin et al. (1990) Science 249:404-406; Cwirla et al. (1990) PNAS 87:6378-6382; as well as U.S. Patents Nos. 5,223,409, 5,198,346, and 5,096,815).

Because the number of different peptides one can create by this combinatorial approach can be huge, and because the expectation is that peptides with the appropriate structural characteristics to serve as ligands for a given target protein will be rare in the total population of the library, the need for methods capable of conveniently screening large numbers of clones is apparent. Several strategies for selecting peptide ligands from the library have been described in the art and are applicable to certain embodiments of the present method.
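To give a sense of scale for the screening problem described above, the sketch below estimates how likely it is that a particular sequence is present among the clones actually sampled. It assumes that every variant is equally likely and that clones are drawn independently, which real libraries only approximate; the function names and the example numbers are illustrative assumptions.

```python
import math


def prob_variant_present(num_variants: int, num_clones: int) -> float:
    """Probability that one given variant occurs at least once.

    Uses P = 1 - (1 - 1/V)^N for V equally likely variants and N clones.
    """
    if num_variants < 1 or num_clones < 0:
        raise ValueError("need at least one variant and a non-negative clone count")
    if num_variants == 1:
        return 1.0 if num_clones > 0 else 0.0
    # log1p/expm1 keep precision when 1/V is very small.
    return -math.expm1(num_clones * math.log1p(-1.0 / num_variants))


def clones_for_coverage(num_variants: int, probability: float) -> int:
    """Smallest clone count giving at least `probability` of seeing a variant."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    if num_variants < 1:
        raise ValueError("need at least one variant")
    if num_variants == 1:
        return 1
    return math.ceil(math.log1p(-probability) / math.log1p(-1.0 / num_variants))


if __name__ == "__main__":
    variants = 10 ** 7  # hypothetical library diversity
    print(prob_variant_present(variants, variants))
    print(clones_for_coverage(variants, 0.95))
```

Under these assumptions, sampling as many clones as there are variants gives only about a 63% chance that any given sequence is represented, and roughly three-fold oversampling is needed to reach 95%.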

In one embodiment, a variegated peptide library can be expressed by a population of display packages to form a peptide display library. With respect to the display package on which the variegated peptide library is manifest, it will be appreciated from the discussion provided herein that the display package will often preferably be able to be (i) genetically altered to encode a test peptide, (ii) maintained and amplified in culture, (iii) manipulated to display the peptide in a manner permitting the peptide to interact with a target protein during an affinity separation step, and (iv) affinity separated while retaining the peptide-encoding gene such that the sequence of the peptide can be obtained. In preferred embodiments, the display remains viable after affinity separation.

Ideally, the display package comprises a system that allows the sampling of very large variegated peptide display libraries, rapid sorting after each affinity separation round, and easy isolation of the peptide-encoding gene from purified display packages. The most attractive candidates for this type of screening are prokaryotic organisms and viruses, as they can be amplified quickly, they are relatively easy to manipulate, and a large number of clones can be created. Preferred display packages include, for example, vegetative bacterial cells, bacterial spores, and most preferably, bacterial viruses (especially DNA viruses). However, the present invention also contemplates the use of eukaryotic cells, including yeast and their spores, as potential display packages.

In addition to commercially available kits for generating phage display libraries (e.g. the Pharmacia Recombinant Phage Peptide System, catalog no. 27-9400-01; and the Stratagene SurfZAP™ phage display kit, catalog no. 240612), examples of methods and reagents particularly amenable for use in generating the variegated peptide display library of the present method can be found in, for example, the Ladner et al. U.S. Patent No. 5,223,409; the Kang et al. International Publication No. WO 92/18619; the Dower et al. International Publication No. WO 91/17271; the Winter et al. International Publication No. WO 92/20791; the Markland et al. International Publication No. WO 92/15679; the Breitling et al. International Publication No. WO 93/01288; the McCafferty et al. International Publication No. WO 92/01047; the Garrard et al. International Publication No. WO 92/09690; the Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982.

When the display is based on a bacterial cell, or a phage which is assembled periplasmically, the display means of the package will comprise at least two components. The first component is a secretion signal which directs the recombinant peptide to be localized on the extracellular side of the cell membrane (of the host cell when the display package is a phage). This secretion signal is characteristically cleaved off by a signal peptidase to yield a processed, "mature" peptide. The second component is a display anchor protein which directs the display package to associate the peptide with its outer surface. As described below, this anchor protein can be derived from a surface or coat protein native to the genetic package.

When the display package is a bacterial spore, or a phage whose protein coating is assembled intracellularly, a secretion signal directing the peptide to the inner membrane of the host cell is unnecessary. In these cases, the means for arraying the variegated peptide library comprises a derivative of a spore or phage coat protein amenable for use as a fusion protein.

In the instance wherein the display package is a phage, the cloning site for the test peptide sequences in the phagemid should be placed so that it does not substantially interfere with normal phage function. One such locus is the intergenic region as described by Zinder and Boeke (1982) Gene 19:1-10. In an illustrative embodiment comprising an M13 phage display library, the test peptide sequence is preferably expressed at an equal or higher level than the VH/cpIII product (described below) to maintain a sufficiently high VL concentration in the periplasm and provide efficient assembly (association) of VL with VH chains. For instance, a phagemid can be constructed to encode, as separate genes, both a VH/coat fusion protein and a VL chain. Under the appropriate induction, both chains are expressed and allowed to assemble in the periplasmic space of the host cell, the assembled peptide being linked to the phage particle by virtue of the VH chain being a portion of a coat protein fusion construct.

The number of possible peptides for a given library may, in certain instances, exceed 10^12. Sampling as many combinations as possible depends, in part, on the ability to recover large numbers of transformants. For phage with plasmid-like forms (such as filamentous phage), electrotransformation provides an efficiency comparable to that of phage transfection with in vitro packaging, in addition to a very high capacity for DNA input. This allows large amounts of vector DNA to be used to obtain very large numbers of transformants. The method described by Dower et al. (1988) Nucleic Acids Res., 16:6127-6145, for example, may be used to transform fd-tet derived recombinants at the rate of about 10^7 transformants/µg of ligated vector into E. coli (such as strain MC1061), and libraries may be constructed in fd-tet B1 of up to about 3 x 10^8 members or more. Increasing DNA input and making modifications to the cloning protocol within the ability of the skilled artisan may produce increases of greater than about 10-fold in the recovery of transformants, providing libraries of up to 10^10 or more recombinants.
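The relationship between transformation efficiency, DNA input, and library coverage can be sketched as follows. This is a rough estimate only: it assumes transformants scale linearly with DNA input and that each clone is an independent, uniformly random draw from the possible sequences (a Poisson approximation). The function names, the 30 µg input, and the hexapeptide example are illustrative assumptions, not values from the specification beyond the quoted efficiency of about 10^7 transformants/µg.

```python
import math


def expected_transformants(efficiency_per_ug: float, dna_ug: float) -> float:
    """Transformants expected, assuming yield scales linearly with DNA input."""
    return efficiency_per_ug * dna_ug


def fraction_represented(library_size: float, diversity: float) -> float:
    """Expected fraction of possible sequences present at least once.

    Poisson approximation: each clone is an independent, uniform draw
    from `diversity` equally likely sequences (an idealization).
    """
    return 1.0 - math.exp(-library_size / diversity)


if __name__ == "__main__":
    clones = expected_transformants(1e7, 30)   # hypothetical 30 ug of vector
    hexapeptides = 20 ** 6                     # fully random 6-mers
    print(f"clones: {clones:.1e}")
    print(f"hexapeptide coverage: {fraction_represented(clones, hexapeptides):.3f}")
```

Under these assumptions, a library several-fold larger than the theoretical diversity is needed before nearly every sequence is represented, which is one reason high transformation yields matter.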

As will be apparent to those skilled in the art, in embodiments wherein high affinity peptides are sought, an important criterion for the present selection method can be that it is able to discriminate between peptides of different affinity for a particular target, and preferentially enrich for the peptides of highest affinity. Applying the well known principles of affinity and valence, it is understood that manipulating the display package to be rendered effectively monovalent can allow affinity enrichment to be carried out for generally higher binding affinities (i.e. binding constants in the range of 10^6 to 10^10 M^-1) as compared to the broader range of affinities isolable using a multivalent display package. To generate the monovalent display, the natural (i.e. wild-type) form of the surface or coat protein used to anchor the peptide to the display can be added at a high enough level that it almost entirely eliminates inclusion of the peptide fusion protein in the display package. Thus, a vast majority of the display packages can be generated to include no more than one copy of the peptide fusion protein (see, for example, Garrard et al. (1991) Bio/Technology 9:1373-1377). In a preferred embodiment of a monovalent display library, the library of display packages will comprise no more than 5 to 10% polyvalent displays, more preferably no more than 2% polyvalent displays, and most preferably no more than 1% polyvalent display packages in the population. The source of the wild-type anchor protein can be, for example, provided by a copy of the wild-type gene present on the same construct as the peptide fusion protein, or provided by a separate construct altogether.

a) Phage As Display Packages

Bacteriophage are attractive prokaryotic-related organisms for use in the subject method.
Bacteriophage are excellent candidates for providing a display system of the variegated peptide library as there is little or no enzymatic activity associated with intact mature phage, and because their genes are inactive outside a bacterial host, rendering the mature phage particles metabolically inert. In general, the phage surface is a relatively simple structure. Phage can be grown easily in large numbers, they are amenable to the practical handling involved in many potential mass screening programs, and they carry genetic information for their own synthesis within a small, simple package. As the peptide gene is inserted into the phage genome, choosing the appropriate phage to be employed in the subject method will generally depend most on whether (i) the genome of the phage allows introduction of the peptide-encoding gene either by tolerating additional genetic material or by having replaceable genetic material; (ii) the virion is capable of packaging the genome after accepting the insertion or substitution of genetic material; and (iii) the display of the peptide on the phage surface does not disrupt virion structure sufficiently to interfere with phage propagation. One concern presented with the use of phage is that the morphogenetic pathway of the phage determines the environment in which the peptide will have opportunity to fold. Periplasmically assembled phage are preferred where the displayed test peptide contains essential disulfides. However, in certain embodiments in which the display package forms intracellularly (e.g., where λ phage are used), it has been demonstrated that the peptide may assume proper folding after the phage is released from the cell.

Another concern related to the use of phage, but also pertinent to the use of bacterial cells and spores, is that multiple infections could generate hybrid displays that carry the gene for one particular peptide yet have one or more different test peptides on their surfaces. Therefore, it can be preferable, though optional, to minimize this possibility by infecting cells with phage under conditions resulting in a low multiplicity of infection. However, there may be circumstances in which a high multiplicity of infection would be desirable, such as to increase homologous recombination events between gene constructs encoding the peptide display in order to further expand the repertoire of the peptide display library.
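The effect of the multiplicity of infection (MOI) on hybrid displays can be estimated with a simple Poisson model. The sketch below assumes that the number of phage entering each cell is Poisson-distributed with mean equal to the MOI, which is a common idealization rather than anything stated in the specification; the function name and example MOI values are hypothetical.

```python
import math


def fraction_multiply_infected(moi: float) -> float:
    """Among infected cells, the fraction that received two or more phage.

    Assumes phage per cell follows a Poisson distribution with mean `moi`.
    """
    if moi <= 0:
        raise ValueError("moi must be positive")
    p_zero = math.exp(-moi)          # cells receiving no phage
    p_one = moi * p_zero             # cells receiving exactly one phage
    infected = 1.0 - p_zero
    return (infected - p_one) / infected


if __name__ == "__main__":
    for moi in (0.01, 0.1, 1.0):     # illustrative values only
        print(f"MOI {moi}: {fraction_multiply_infected(moi):.3f} multiply infected")
```

Under this model, lowering the MOI reduces the share of infected cells that could produce hybrid particles, at the cost of leaving more cells uninfected.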

For a given bacteriophage, the preferred display means is a protein that is present on the phage surface (e.g. a coat protein). Filamentous phage can be described by a helical lattice; isometric phage, by an icosahedral lattice. Each monomer of each major coat protein sits on a lattice point and makes defined interactions with each of its neighbors. Proteins that fit into the lattice by making some, but not all, of the normal lattice contacts are likely to destabilize the virion by aborting formation of the virion as well as by leaving gaps in the virion so that the nucleic acid is not protected. Thus in bacteriophage, unlike the cases of bacteria and spores, it is generally important to retain in the peptide fusion proteins those residues of the coat protein that interact with other proteins in the virion. For example, when using the M13 cpVIII protein, the entire mature protein will generally be retained with the peptide fragment being added to the N-terminus of cpVIII, while on the other hand it can suffice to retain only the last 100 carboxy terminal residues (or even fewer) of the M13 cpIII coat protein in the peptide fusion protein.

Under the appropriate induction, the peptide library is expressed and allowed to assemble in the bacterial cytoplasm, such as when the λ phage is employed. The induction of the protein(s) may be delayed until some replication of the phage genome, synthesis of some of the phage structural proteins, and assembly of some phage particles has occurred. The assembled protein chains then interact with the phage particles via the binding of the anchor protein on the outer surface of the phage particle. The cells are lysed and the phage bearing the library-encoded test peptides (that correspond to the specific library sequences carried in the DNA of that phage) are released and isolated from the bacterial debris.

To enrich for and isolate phage which contain cloned library sequences that encode a desired protein, and thus to ultimately isolate the nucleic acid sequences themselves, phage harvested from the bacterial debris are, for example, affinity purified.

As described below, when a peptide which specifically binds a particular target protein is desired, the target protein can be used to retrieve phage displaying the desired peptide.

The phage so obtained may then be amplified by infecting into host cells. Additional rounds of affinity enrichment followed by amplification may be employed until the desired level of enrichment is reached.
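The number of enrichment rounds needed can be reasoned about with a simple odds model. The sketch below assumes that each round multiplies the ratio of binding to non-binding phage by a constant factor and that amplification preserves that ratio; both are idealizations, and the starting fraction and per-round factor shown are hypothetical values, not figures from the specification.

```python
def fraction_after_rounds(initial_fraction: float,
                          enrichment_per_round: float,
                          rounds: int) -> float:
    """Fraction of binders after repeated selection and amplification.

    Each round is assumed to multiply the binder:non-binder odds by
    `enrichment_per_round`; amplification is assumed to be unbiased.
    """
    odds = initial_fraction / (1.0 - initial_fraction)
    odds *= enrichment_per_round ** rounds
    return odds / (1.0 + odds)


def rounds_needed(initial_fraction: float,
                  enrichment_per_round: float,
                  target_fraction: float,
                  max_rounds: int = 20) -> int:
    """Smallest number of rounds reaching the target fraction of binders."""
    for r in range(max_rounds + 1):
        if fraction_after_rounds(initial_fraction, enrichment_per_round, r) >= target_fraction:
            return r
    raise ValueError("target not reached within max_rounds")


if __name__ == "__main__":
    # Hypothetical: one binder per million clones, 1000-fold enrichment per round.
    print(rounds_needed(1e-6, 1000.0, 0.9))
```

In practice the per-round factor varies with washing stringency and background binding, so such an estimate is only a planning aid.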

The enriched peptide-phage can also be screened with additional detection techniques such as expression plaque (or colony) lift (see, e.g., Young and Davis, Science (1983) 222:778-782) whereby a labeled target protein is used as a probe. The phage obtained from the screening protocol are infected into cells, propagated, and the phage DNA isolated and sequenced, and/or recloned into a vector intended for gene expression in prokaryotes or eukaryotes to obtain larger amounts of the particular peptide selected.

In yet another embodiment, the peptide is also transported to an extra-cytoplasmic compartment of the host cell, such as the bacterial periplasm, but as a fusion protein with a viral coat protein. In this embodiment the desired protein (or one of its polypeptide chains if it is a multichain peptide) is expressed fused to a viral coat protein which is processed and transported to the cell inner membrane. Other chains, if present, are expressed with a secretion leader and thus are also transported to the periplasm or other intracellular but extra-cytoplasmic location. The chains present in the extra-cytoplasmic compartment then assemble into a complete test peptide. The assembled molecules become incorporated into the phage by virtue of their attachment to the phage coat protein as the phage extrude through the host membrane and the coat proteins assemble around the phage DNA. The phage bearing the test peptide may then be screened by affinity enrichment as described below.

1) Filamentous Phage

Filamentous bacteriophages, which include M13, f1, fd, If1, Ike, Xf, Pf1, and Pf3, are a group of related viruses that infect bacteria. They are termed filamentous because they are long, thin particles comprised of an elongated capsule that envelops the deoxyribonucleic acid (DNA) that forms the bacteriophage genome. The F pili filamentous bacteriophage (Ff phage) infect only gram-negative bacteria by specifically adsorbing to the tip of F pili, and include fd, f1 and M13.

Compared to other bacteriophage, filamentous phage in general are attractive for generating the peptide libraries of the subject method, and M13 in particular is especially attractive because: (i) the 3-D structure of the virion is known; (ii) the processing of the coat protein is well understood; (iii) the genome is expandable; (iv) the genome is small; (v) the sequence of the genome is known; (vi) the virion is physically resistant to shear, heat, cold, urea, guanidinium chloride, low pH, and high salt; (vii) the phage is a sequencing vector so that sequencing is especially easy; (viii) antibiotic-resistance genes have been cloned into the genome with predictable results (Hines et al. (1980) Gene 11:207-218); (ix) it is easily cultured and stored, with no unusual or expensive media requirements for the infected cells; (x) it has a high burst size, each infected cell yielding 100 to 1000 M13 progeny after infection; and (xi) it is easily harvested and concentrated (Salivar et al. (1964) Virology 24:359-371). The entire life cycle of the filamentous phage M13, a common cloning and sequencing vector, is well understood. The genetic structure of M13 is well known, including the complete sequence (Schaller et al. in The Single-Stranded DNA Phages, eds. Denhardt et al. (NY: CSHL Press, 1978)), the identity and function of the ten genes, and the order of transcription and location of the promoters, as well as the physical structure of the virion (Smith et al. (1985) Science 228:1315-1317; Rasched et al. (1986) Microbiol Rev 50:401-427; Kuhn et al. (1987) Science 238:1413-1415; Zimmerman et al. (1982) J Biol Chem 257:6529-6536; and Banner et al. (1981) Nature 289:814-816). Because the genome is small (6423 bp), cassette mutagenesis is practical on RF M13 (Current Protocols in Molecular Biology, eds. Ausubel et al. (NY: John Wiley & Sons, 1991)), as is single-stranded oligonucleotide directed mutagenesis (Fritz et al. in DNA Cloning, ed. by Glover (Oxford, UK: IRL Press, 1985)). M13 is a plasmid and transformation system in itself, and an ideal sequencing vector. M13 can be grown on Rec- strains of E. coli. The M13 genome is expandable (Messing et al. in The Single-Stranded DNA Phages, eds. Denhardt et al. (NY: CSHL Press, 1978) pages 449-453; and Fritz et al., supra) and M13 does not lyse cells. Extra genes can be inserted into M13 and will be maintained in the viral genome in a stable manner.

The mature capsule of Ff phage is comprised of a coat of five phage-encoded gene products: cpVIII, the major coat protein product of gene VIII that forms the bulk of the capsule; and four minor coat proteins, cpIII and cpVI at one end of the capsule and cpVII and cpIX at the other end of the capsule. The length of the capsule is formed by 2500 to 3000 copies of cpVIII in an ordered helix array that forms the characteristic filament structure. The gene III-encoded protein (cpIII) is typically present in 4 to 6 copies at one end of the capsule and serves as the receptor for binding of the phage to its bacterial host in the initial phase of infection. For detailed reviews of Ff phage structure, see Rasched et al., Microbiol. Rev., 50:401-427 (1986); and Model et al., in The Bacteriophages, Volume 2, R. Calendar, Ed., Plenum Press, pp. 375-456 (1988).

The phage particle assembly involves extrusion of the viral genome through the host cell's membrane. Prior to extrusion, the major coat protein cpVIII and the minor coat protein cpIII are synthesized and transported to the host cell's membrane. Both cpVIII and cpIII are anchored in the host cell membrane prior to their incorporation into the mature particle. In addition, the viral genome is produced and coated with cpV protein. During the extrusion process, cpV-coated genomic DNA is stripped of the cpV coat and simultaneously recoated with the mature coat proteins. Both cpIII and cpVIII proteins include two domains that provide signals for assembly of the mature phage particle. The first domain is a secretion signal that directs the newly synthesized protein to the host cell membrane. The secretion signal is located at the amino terminus of the polypeptide and targets the polypeptide at least to the cell membrane. The second domain is a membrane anchor domain that provides signals for association with the host cell membrane and for association with the phage particle during assembly. This second signal for both cpVIII and cpIII comprises at least a hydrophobic region for spanning the membrane.

The 50 amino acid mature gene VIII coat protein (cpVIII) is synthesized as a 73 amino acid precoat (Ito et al. (1979) PNAS 76:1199-1203). The cpVIII protein has been extensively studied as a model membrane protein because it can integrate into lipid bilayers such as the cell membrane in an asymmetric orientation with the acidic amino terminus toward the outside and the basic carboxy terminus toward the inside of the membrane. The first 23 amino acids constitute a typical signal sequence which causes the nascent polypeptide to be inserted into the inner cell membrane. An E. coli signal peptidase (SP-I) recognizes amino acids 18, 21, and 23, and, to a lesser extent, residue 22, and cuts between residues 23 and 24 of the precoat (Kuhn et al. (1985) J. Biol. Chem. 260:15914-15918; and Kuhn et al. (1985) J. Biol. Chem. 260:15907-15913). After removal of the signal sequence, the amino terminus of the mature coat is located on the periplasmic side of the inner membrane; the carboxy terminus is on the cytoplasmic side. About 3000 copies of the mature coat protein associate side-by-side in the inner membrane.
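The residue arithmetic described above (a 73-residue precoat cut between residues 23 and 24 to leave a 50-residue mature coat) can be illustrated with a short sketch. The sequence used is a placeholder string, not the actual cpVIII sequence, and the function name is hypothetical.

```python
def process_precoat(precoat: str, signal_length: int = 23) -> tuple[str, str]:
    """Split a precoat into (signal sequence, mature protein).

    Models cleavage between residue `signal_length` and the next residue,
    using 1-based residue numbering as in the text.
    """
    if len(precoat) <= signal_length:
        raise ValueError("precoat must be longer than the signal sequence")
    return precoat[:signal_length], precoat[signal_length:]


if __name__ == "__main__":
    # Placeholder only: 's' marks signal residues, 'm' marks mature residues.
    placeholder_precoat = "s" * 23 + "m" * 50
    signal, mature = process_precoat(placeholder_precoat)
    print(len(signal), len(mature))
```

A test peptide fused to the amino terminus of the mature coat would therefore sit immediately after the cleavage site in the precursor.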

The sequence of gene VIII is known, and the amino acid sequence can be encoded on a synthetic gene. Mature gene VIII protein makes up the sheath around the circular ssDNA. The gene VIII protein can be a suitable anchor protein because its location and orientation in the virion are known (Banner et al. (1981) Nature 289:814-816). Preferably, the test peptide is attached to the amino terminus of the mature M13 coat protein to generate the phage display library. As set out above, manipulation of the concentration of both the wild-type cpVIII and test peptide/cpVIII fusion in an infected cell can be utilized to decrease the avidity of the display and thereby enhance the detection of high affinity antibodies directed to the target epitope.

Another vehicle for displaying the test peptide library is by expressing it as a domain of a chimeric gene containing part or all of gene III. This gene encodes one of the minor coat proteins of M13. When monovalent displays are required, expressing the test peptide as a fusion protein with cpIII can be a preferred embodiment, as the ratio of wild-type cpIII to chimeric cpIII during formation of the phage particles can be readily controlled. In particular, the single-stranded circular phage DNA associates with about five copies of the gene III protein and is then extruded through the patch of membrane-associated coat protein in such a way that the DNA is encased in a helical sheath of protein (Webster et al. in The Single-Stranded DNA Phages, eds. Dressler et al. (NY: CSHL Press, 1978)).
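How the ratio of wild-type to chimeric cpIII affects valency can be sketched with a binomial model. The sketch assumes about five cpIII positions per particle (as stated above) and that each position is filled independently by a fusion protein with a fixed probability; that independence is an assumption made here for illustration, and the function names and the 2% fusion fraction are hypothetical.

```python
from math import comb


def copy_distribution(n_sites: int, fusion_fraction: float) -> list[float]:
    """Probability of 0..n_sites fusion copies per particle (binomial model)."""
    p = fusion_fraction
    return [comb(n_sites, k) * p ** k * (1.0 - p) ** (n_sites - k)
            for k in range(n_sites + 1)]


def polyvalent_fraction(n_sites: int, fusion_fraction: float) -> float:
    """Fraction of all particles carrying two or more fusion copies."""
    return sum(copy_distribution(n_sites, fusion_fraction)[2:])


if __name__ == "__main__":
    # Hypothetical: fusion protein is 2% of the cpIII pool, five sites per phage.
    dist = copy_distribution(5, 0.02)
    print(f"no display: {dist[0]:.3f}, monovalent: {dist[1]:.3f}, "
          f"polyvalent: {polyvalent_fraction(5, 0.02):.4f}")
```

Under these assumptions a small fusion fraction keeps polyvalent particles rare, although most particles then display no test peptide at all.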

Manipulation of the sequence of cpIII has demonstrated that the C-terminal 23 amino acid residue stretch of hydrophobic amino acids normally responsible for a membrane anchor function can be altered in a variety of ways and retain the capacity to associate with membranes. Ff phage-based expression vectors were first described in which the cpIII amino acid residue sequence was modified by insertion of polypeptide "epitopes" (Parmley et al., Gene (1988) 73:305-318; and Cwirla et al., PNAS (1990) 87:6378-6382) or an amino acid residue sequence defining a larger polypeptide domain (McCafferty et al., Nature (1990) 348:552-554). It has been demonstrated that insertions into gene III can result in the production of novel protein domains on the virion outer surface (Smith (1985) Science 228:1315-1317; and de la Cruz et al. (1988) J. Biol. Chem. 263:4318-4322). The test peptide-encoding gene may be fused to gene III at the site used by Smith and by de la Cruz et al., e.g., at a codon corresponding to another domain boundary or to a surface loop of the protein, or to the amino terminus of the mature protein.

Similar constructions could be made with other filamentous phage. Pf3 is a well known filamentous phage that infects Pseudomonas aeruginosa cells that harbor an IncP-1 plasmid. The entire genome has been sequenced (Luiten et al. (1985) J. Virol. 56:268-276) and the genetic signals involved in replication and assembly are known (Luiten et al. (1987) DNA 6:129-137). The major coat protein of Pf3 is unusual in having no signal peptide to direct its secretion. The sequence has charged residues ASP-7, ARG-37, LYS-40, and PHE-44, which is consistent with the amino terminus being exposed. Thus, to cause a test peptide to appear on the surface of Pf3, a tripartite gene can be constructed which comprises a signal sequence known to cause secretion in P. aeruginosa, fused in-frame to a gene fragment encoding the test peptide sequence, which is fused in-frame to DNA encoding the mature Pf3 coat protein. Optionally, DNA encoding a flexible linker of one to 10 amino acids is introduced between the test peptide fragment and the Pf3 coat-protein gene. This tripartite gene is introduced into Pf3. Once the signal sequence is cleaved off, the test peptide is in the periplasm and the mature coat protein acts as an anchor and phage-assembly signal.
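The in-frame requirement for such a tripartite gene can be checked mechanically. The following sketch joins a signal sequence, a test peptide fragment, an optional linker, and a coat-protein segment, and verifies that every part is a whole number of codons and that no stop codon appears before the end. All sequences shown are short made-up placeholders, not real Pf3 or P. aeruginosa sequences, and the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def assemble_tripartite(signal: str, peptide: str, coat: str, linker: str = "") -> str:
    """Join the parts in-frame and return the fused coding sequence.

    Raises ValueError if a part is not a whole number of codons, if the
    linker encodes more than 10 residues, or if a stop codon occurs
    before the final codon.
    """
    parts = {"signal": signal, "peptide": peptide, "linker": linker, "coat": coat}
    for name, seq in parts.items():
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} is not a whole number of codons")
    if len(linker) // 3 > 10:
        raise ValueError("linker encodes more than 10 amino acids")
    gene = (signal + peptide + linker + coat).upper()
    codons = [gene[i:i + 3] for i in range(0, len(gene), 3)]
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("premature stop codon in fusion")
    return gene


if __name__ == "__main__":
    # Placeholder fragments for illustration only.
    fusion = assemble_tripartite("ATGAAACTG", "GCTTGCGAT", "ATGCAGTAA", linker="GGTGGC")
    print(fusion, len(fusion) // 3, "codons")
```

A check of this kind catches frameshifts introduced at either junction before any construct is built.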

2) Bacteriophage φX174

The bacteriophage φX174 is a very small icosahedral virus which has been thoroughly studied by genetics, biochemistry, and electron microscopy (see The Single-Stranded DNA Phages, eds. Denhardt et al. (NY: CSHL Press, 1978)). Three gene products of φX174 are present on the outside of the mature virion: F (capsid), G (major spike protein, 60 copies per virion), and H (minor spike protein, 12 copies per virion). The G protein comprises 175 amino acids, while H comprises 328 amino acids. The F protein interacts with the single-stranded DNA of the virus. The proteins F, G, and H are translated from a single mRNA in the viral infected cells. As the virus is so tightly constrained because several of its genes overlap, φX174 is not typically used as a cloning vector due to the fact that it can accept very little additional DNA. However, mutations in the viral G gene (encoding the G protein) can be rescued by a copy of the wild-type G gene carried on a plasmid that is expressed in the same host cell (Chambers et al. (1982) Nuc Acid Res 10:6465-6473). In one embodiment, one or more stop codons are introduced into the G gene so that no G protein is produced from the viral genome. Nucleic acid encoding the variegated peptide library can then be fused with the nucleic acid sequence of the H gene. An amount of the viral G gene equal to the size of the test peptide gene fragment is eliminated from the φX174 genome, such that the size of the genome is ultimately unchanged. Thus, in host cells also transformed with a second plasmid expressing the wild-type G protein, the production of viral particles from the mutant virus is rescued by the exogenous G protein source. Where it is desirable that only one test peptide be displayed per φX174 particle (e.g., monovalent), the second plasmid can further include one or more copies of the wild-type H protein gene so that a mix of H and test peptide/H proteins will be predominated by the wild-type H upon incorporation into phage particles.

3) Large DNA Phage

Phage such as λ or T4 have much larger genomes than do M13 or φX174, and have more complicated 3-D capsid structures than M13 or φX174, with more coat proteins to choose from. In embodiments of the invention whereby the peptide library is processed and assembled into a functional form and associates with the bacteriophage particles within the cytoplasm of the host cell, bacteriophage λ and derivatives thereof are examples of suitable vectors. The intracellular morphogenesis of phage λ can potentially prevent protein domains that ordinarily contain disulfide bonds from folding correctly. However, variegated libraries expressing a population of functional antibodies, including both heavy and light chain variable regions, have been generated in λ phage, indicating that disulfide bonds can be formed in the test peptide library (Huse et al. (1989) Science 246:1275-1281; Mullinax et al. (1990) PNAS 87:8095-8099; and Pearson et al. (1991) PNAS 88:2432-2436). Such strategies take advantage of the rapid construction and efficient transformation abilities of λ phage. When used for expression of peptide sequences, library DNA sequences may be readily inserted into a λ vector. For instance, variegated peptide libraries have been constructed by modification of λ ZAP II (Short et al. (1988) Nuc Acid Res 16:7583) comprising inserting the peptide-encoding nucleic acid into the multiple cloning site of a λ ZAP II vector (Huse et al., supra).

b) Bacterial Cells as Display Packages

Recombinant peptides are able to cross bacterial membranes after the addition of bacterial leader sequences to the peptides (Better et al. (1988) Science 240:1041-1043; and Skerra et al. (1988) Science 240:1038-1041). In addition, recombinant peptides have been fused to outer membrane proteins for surface presentation. Accordingly, one strategy for displaying test peptides on bacterial cells comprises generating a fusion protein by adding the test peptide to cell surface exposed portions of an integral outer membrane protein (Fuchs et al. (1991) Bio/Technology 9:1370-1372). In selecting a bacterial cell to serve as the display package, any well-characterized bacterial strain will typically be suitable, provided the bacteria can be grown in culture, can be engineered to display the peptide library on their surface, and are compatible with the particular affinity selection process practiced in the subject method. Among bacterial cells, the preferred display systems include Salmonella typhimurium, Bacillus subtilis, Pseudomonas aeruginosa, Vibrio cholerae, Klebsiella pneumoniae, Neisseria gonorrhoeae, Neisseria meningitidis, Bacteroides nodosus, Moraxella bovis, and especially Escherichia coli. Many bacterial cell surface proteins useful in the present invention have been characterized, and works on the localization of these proteins and the methods of determining their structure include Benz et al. (1988) Ann Rev Microbiol 42:359-393; Balduyck et al. (1985) Biol Chem Hoppe-Seyler 366:9-14; Ehrmann et al. (1990) PNAS 87:7574-7578; Heijne et al. (1990) Protein Engineering 4:109-112; Ladner et al. U.S. Patent No. 5,223,409; Ladner et al. WO88/06630; Fuchs et al. (1991) Bio/Technology 9:1370-1372; and Goward et al. (1992) TIBS 18:136-140.

To further illustrate, the LamB protein of E. coli is a well understood surface protein that can be used to generate a variegated library of test peptides (see, for example, Ronco et al. (1990) Biochimie 72:183-189; van der Weit et al. (1990) Vaccine 8:269-277; Charbit et al. (1988) Gene 70:181-189; and Ladner U.S. Patent No. 5,223,409). LamB of E. coli is a porin for maltose and maltodextrin transport, and serves as the receptor for adsorption of bacteriophages λ and K10. LamB is transported to the outer membrane if a functional N-terminal signal sequence is present (Benson et al. (1984) PNAS 81:3830-3834). As with other cell surface proteins, LamB is synthesized with a typical signal sequence which is subsequently removed. Thus, the variegated peptide-encoding gene library can be cloned into the LamB gene such that the resulting library of fusion proteins comprises a portion of LamB sufficient to anchor the protein to the cell membrane with the test peptide portion oriented on the extracellular side of the membrane. Secretion of the extracellular portion of the fusion protein can be facilitated by inclusion of the LamB signal sequence, or other suitable signal sequence, as the N-terminus of the protein.

The E. coli LamB has also been expressed in functional form in S. typhimurium (Harkki et al. (1987) Mol Gen Genet 209:607-611), V. cholerae (Harkki et al. (1986) Microb Pathol 1:283-288), and K. pneumoniae (Wehmeier et al. (1989) Mol Gen Genet 215:529-536), so that one could display a population of test peptides in any of these species as a fusion to E. coli LamB. Moreover, K. pneumoniae expresses a maltoporin similar to LamB which could also be used. In P. aeruginosa, the D1 protein (a homologue of LamB) can be used (Trias et al. (1988) Biochim Biophys Acta 938:493-496). Similarly, other bacterial surface proteins, such as PAL, OmpA, OmpC, OmpF, PhoE, pilin, BtuB, FepA, FhuA, IutA, FecA and FhuE, may be used in place of LamB as a portion of the display means in a bacterial cell.

c) Bacterial Spores as Display Packages

Bacterial spores also have desirable properties as display package candidates in the subject method. For example, spores are much more resistant than vegetative bacterial cells or phage to chemical and physical agents, and hence permit the use of a great variety of affinity selection conditions. Also, Bacillus spores neither actively metabolize nor alter the proteins on their surface. However, spores have the disadvantage that the molecular mechanisms that trigger sporulation are less well worked out than is the formation of M13 or the export of protein to the outer membrane of E. coli, though such a limitation does not seriously detract from their use in the present invention.

Bacteria of the genus Bacillus form endospores that are extremely resistant to damage by heat, radiation, desiccation, and toxic chemicals (reviewed by Losick et al. (1986) Ann Rev Genet 20:625-669). This phenomenon is attributed to extensive intermolecular cross-linking of the coat proteins. In certain embodiments of the subject method, such as those which include relatively harsh affinity separation steps, such spores can be the preferred display package. Endospores from the genus Bacillus are more stable than are, for example, exospores from Streptomyces. Moreover, Bacillus subtilis forms spores in 4 to 6 hours, whereas Streptomyces species may require days or weeks to sporulate. In addition, genetic knowledge and manipulation is much more developed for B. subtilis than for other spore-forming bacteria.

Viable spores that differ only slightly from wild-type are produced in B. subtilis even if any one of four coat proteins is missing (Donovan et al. (1987) J Mol Biol 196:1-10). Moreover, plasmid DNA is commonly included in spores, and plasmid encoded proteins have been observed on the surface of Bacillus spores (Debro et al. (1986) J Bacteriol 165:258-268). Thus, it can be possible during sporulation to express a gene encoding a chimeric coat protein comprising a test peptide of the variegated gene library, without interfering materially with spore formation.

To illustrate, several polypeptide components of B. subtilis spore coat (Donovan et al. (1987) J Mol Biol 196:1-10) have been characterized. The sequences of two complete coat proteins and amino-terminal fragments of two others have been determined. Fusion of the test peptide sequence to cotC or cotD fragments is likely to cause the test peptide to appear on the spore surface. The genes of each of these spore coat proteins are preferred as neither cotC nor cotD is post-translationally modified (see Ladner et al. U.S. Patent No. 5,223,409).

ii) Synthetic Peptide Libraries

In contrast to the recombinant methods, in vitro chemical synthesis provides a method for generating libraries of compounds, without the use of living organisms, that can be screened for ability to bind to a target protein. Although in vitro methods have been used for quite some time in the pharmaceutical industry to identify potential drugs, recently developed methods have focused on rapidly and efficiently generating and screening large numbers of compounds and are particularly amenable to generating peptide libraries for use in the subject method. The various approaches to simultaneous preparation and analysis of large numbers of synthetic peptides (herein "multiple peptide synthesis" or "MPS") each rely on the fundamental concept of synthesis on a solid support introduced by Merrifield in 1963 (Merrifield, R.B. (1963) J Am Chem Soc 85:2149-2154; and references cited in section I above). Generally, these techniques are not dependent on the protecting group or activation chemistry employed, although most workers today avoid Merrifield's original tBoc/Bzl strategy in favor of the milder Fmoc/tBu chemistry and efficient hydroxybenzotriazole-based coupling agents. Many types of solid matrices have been successfully used in MPS, and yields of individual peptides synthesized vary widely with the technique adopted (e.g., nanomoles to millimoles).


A) Multipin Synthesis

One form that the peptide library of the subject method can take is the multipin library format. Briefly, Geysen and co-workers (Geysen et al. (1984) PNAS 81:3998-4002) introduced a method for generating peptides by parallel synthesis on polyacrylic acid-grafted polyethylene pins arrayed in the microtitre plate format. In the original experiments, about 50 nmol of a single peptide sequence was covalently linked to the spherical head of each pin, and interactions of each peptide with receptor or antibody could be determined in a direct binding assay. The Geysen technique can be used to synthesize and screen thousands of peptides per week, and the tethered peptides may be reused in many assays. In subsequent work, the level of peptide loading on individual pins has been increased to as much as 2 μmol/pin by grafting greater amounts of functionalized acrylate derivatives to detachable pin heads, and the size of the peptide library has been increased (Valerio et al. (1993) Int J Pept Protein Res 42:1-9). Appropriate linker moieties have also been appended to the pins so that the peptides may be cleaved from the supports after synthesis for assessment of purity and evaluation in competition binding or functional bioassays (Bray et al. (1990) Tetrahedron Lett 31:5811-5814; Valerio et al. (1991) Anal Biochem 197:168-177; Bray et al. (1991) Tetrahedron Lett 32:6163-6166).

More recent applications of the multipin method of MPS have taken advantage of the cleavable linker strategy to prepare soluble peptides (Maeji et al. (1990) J Immunol Methods 134:23-33; Gammon et al. (1991) J Exp Med 173:609-617; Mutch et al. (1991) Pept Res 4:132-137).

B) Divide-Couple-Recombine

In yet another embodiment, a variegated library of peptides can be provided on a set of beads utilizing the strategy of divide-couple-recombine (see, e.g., Houghten (1985) PNAS 82:5131-5135; and U.S. Patents 4,631,211; 5,440,016; 5,480,971). Briefly, as the name implies, at each synthesis step where degeneracy is introduced into the library, the beads are divided into as many separate groups as there are different amino acid residues to be added at that position, the different residues are coupled in separate reactions, and the beads are recombined into one pool for the next step.

In one embodiment, the divide-couple-recombine strategy can be carried out using the so-called "tea bag" MPS method first developed by Houghten, in which peptide synthesis occurs on resin that is sealed inside porous polypropylene bags (Houghten et al. (1986) PNAS 82:5131-5135). Amino acids are coupled to the resins by placing the bags in solutions of the appropriate individual activated monomers, while all common steps such as resin washing and α-amino group deprotection are performed simultaneously in one reaction vessel. At the end of the synthesis, each bag contains a single peptide sequence, and the peptides may be liberated from the resins using a multiple cleavage apparatus (Houghten et al. (1986) Int J Pept Protein Res 27:673-678). This technique offers advantages of considerable synthetic flexibility and has been partially automated (Beck-Sickinger et al. (1991) Pept Res 4:88-94). Moreover, soluble peptides of greater than 15 amino acids in length can be produced in sufficient quantities (≥ 500 μmol) for purification and complete characterization if desired. Multiple peptide synthesis using the tea-bag approach is useful for the production of a peptide library, albeit of limited size, for screening in the present method, as is illustrated by its use in a range of molecular recognition problems including antibody epitope analysis (Houghten et al. (1986) PNAS 82:5131-5135), peptide hormone structure-function studies (Beck-Sickinger et al. (1990) Int J Pept Protein Res 36:522-530; Beck-Sickinger et al. (1990) Eur J Biochem 194:449-456), and protein conformational mapping (Zimmerman et al. (1991) Eur J Biochem 200:519-528).

An exemplary synthesis of a set of mixed peptides having equimolar amounts of the twenty natural amino acid residues is as follows. Aliquots of five grams (4.65 mmol) of p-methylbenzhydrylamine hydrochloride resin (MBHA) are placed into twenty porous polypropylene bags. These bags are placed into a common container and washed with 1.0 liter of CH2Cl2 three times (three minutes each time), then again washed three times (three minutes each time) with 1.0 liter of 5 percent DIEA/CH2Cl2 (DIEA = diisopropylethylamine; CH2Cl2 = DCM). The bags are then rinsed with DCM and placed into separate reaction vessels each containing 50 ml (0.56 M) of the respective t-BOC-amino acid/DCM. N,N-Diisopropylcarbodiimide (DIPCDI; 25 ml; 1.12 M) is added to each container as a coupling agent. Twenty amino acid derivatives are separately coupled to the resin in 50/50 (v/v) DMF/DCM. After one hour of vigorous shaking, Gisen's picric acid test (Gisen (1972) Anal. Chem. Acta 58:248-249) is performed to determine the completeness of the coupling reaction. On confirming completeness of reaction, all of the resin packets are then washed with 1.5 liters of DMF and washed two more times with 1.5 liters of CH2Cl2. After rinsing, the resins are removed from their separate packets and admixed together to form a pool in a common bag. The resulting resin mixture is then dried and weighed, divided again into 20 equal portions (aliquots), and placed into 20 further polypropylene bags (enclosed).


In a common reaction vessel the following steps are carried out: (1) deprotection is carried out on the enclosed aliquots for thirty minutes with 1.5 liters of 55 percent TFA/DCM; and (2) neutralization is carried out with three washes of 1.5 liters each of 5 percent DIEA/DCM. Each bag is placed in a separate solution of activated t-BOC-amino acid derivative and the coupling reaction carried out to completion as before. All coupling reactions are monitored using the above quantitative picric acid assay.

Next, the bags are opened and the resulting t-BOC-protected dipeptide resins are mixed together to form a pool, aliquots are made from the pool, the aliquots are enclosed, deprotected and further reactions are carried out. This process can be repeated any number of times yielding at each step an equimolar representation of the desired number of amino acid residues in the peptide chain. The principal process steps are conveniently referred to as a divide-couple-recombine synthesis.
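The divide-couple-recombine cycle described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the three-residue alphabet, the bead count, and the function name are hypothetical and chosen to keep the example small, and each string simply records the order in which residues were coupled to a bead.

```python
import random
from collections import Counter

# Hypothetical three-letter alphabet; the exemplary synthesis in the text uses twenty residues.
RESIDUES = ["A", "G", "K"]


def divide_couple_recombine(n_beads, n_cycles, residues=RESIDUES, seed=0):
    """Simulate split-and-pool synthesis and return one coupling history per bead."""
    rng = random.Random(seed)
    beads = ["" for _ in range(n_beads)]
    for _ in range(n_cycles):
        # Divide: shuffle the pool and deal the beads into one group per residue.
        rng.shuffle(beads)
        groups = [beads[i::len(residues)] for i in range(len(residues))]
        # Couple: every bead in a group receives the same single residue.
        coupled = [[seq + res for seq in group] for group, res in zip(groups, residues)]
        # Recombine: pool all groups again before the next cycle.
        beads = [seq for group in coupled for seq in group]
    return beads


if __name__ == "__main__":
    pool = divide_couple_recombine(n_beads=9000, n_cycles=3)
    counts = Counter(pool)
    print(len(counts), "distinct sequences out of", len(RESIDUES) ** 3, "possible")
    print("beads per sequence (min, max):", min(counts.values()), max(counts.values()))
```

Because each bead sees exactly one residue per cycle, every bead ends up carrying a single sequence, while the pool as a whole approaches an equimolar representation of all combinations.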

After a desired number of such couplings and mixtures are carried out, the polypropylene bags are kept separate to provide the twenty sets having the amino-terminal residue as the single, predetermined residue, with, for example, positions 2-4 being occupied by equimolar amounts of the twenty residues. To prepare sets having the single, predetermined amino acid residue at other than the amino-terminus, the contents of the bags are not mixed after adding a residue at the desired, predetermined position. Rather, the contents of each of the twenty bags are separated into 20 aliquots, deprotected and then separately reacted with the twenty amino acid derivatives. The contents of each set of twenty bags thus produced are thereafter mixed and treated as before-described until the desired oligopeptide length is achieved.

C) Multiple Peptide Synthesis through Coupling of Amino Acid Mixtures

Simultaneous coupling of mixtures of activated amino acids to a single resin support has been used as a multiple peptide synthesis strategy on several occasions (Geysen et al. (1986) Mol Immunol 23:709-715; Tjoeng et al. (1990) Int J Pept Protein Res 35:141-146; Rutter et al. (1991) U.S. Patent No. 5,010,175; Birkett et al. (1991) Anal Biochem 196:137-143; Petithory et al. (1991) PNAS 88:11510-11514) and can have applications in the subject method. For example, four to seven analogs of the magainin 2 and angiotensinogen peptides were successfully synthesized and resolved in one HPLC purification after coupling a mixture of amino acids at a single position in each sequence (Tjoeng et al. (1990) Int J Pept Protein Res 35:141-146). This approach has also been used to prepare degenerate peptide mixtures for defining the substrate specificity of endoproteolytic enzymes (Birkett et al. (1991) Anal Biochem 196:137-143; Petithory et al. (1991) PNAS 88:11510-11514). In these experiments a series of amino acids was substituted at a single position within the substrate sequence. After proteolysis, Edman degradation was used to quantitate the yield of each amino acid component in the hydrolysis product and hence to evaluate the relative kcat/Km values for each substrate in the mixture.

However, it is noted that the operational simplicity of synthesizing many peptides by coupling monomer mixtures is offset by the difficulty in controlling the composition of the products. The product distribution reflects the individual rate constants for the competing coupling reactions, with activated derivatives of sterically hindered residues such as valine or isoleucine adding at a significantly slower rate than glycine or alanine, for example. The nature of the resin-bound component of the acylation reaction also influences the addition rate, and the relative rate constants for the formation of 400 dipeptides from the 20 genetically coded amino acids have been determined by Rutter and Santi (Rutter et al. (1991) U.S. Patent No. 5,010,175). These reaction rates can be used to guide the selection of appropriate relative concentrations of amino acids in the mixture to favor more closely equimolar coupling yields.
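How relative rate constants might guide the choice of concentrations can be illustrated with a minimal sketch. It assumes a simplified competition model in which the activated amino acids are in large excess, so that the share of each product is proportional to its rate constant times its concentration; under that assumption, concentrations proportional to 1/k give equimolar products. The rate constants below are invented for illustration and are not the values reported by Rutter and Santi.

```python
def equimolar_concentrations(rate_constants, total=1.0):
    """Return concentrations proportional to 1/k, scaled so that they sum to `total`."""
    inverse = {aa: 1.0 / k for aa, k in rate_constants.items()}
    scale = total / sum(inverse.values())
    return {aa: value * scale for aa, value in inverse.items()}


def product_fractions(rate_constants, concentrations):
    """Share of each coupled residue, assuming the share is proportional to k * c."""
    rates = {aa: rate_constants[aa] * concentrations[aa] for aa in rate_constants}
    total_rate = sum(rates.values())
    return {aa: rate / total_rate for aa, rate in rates.items()}


if __name__ == "__main__":
    # Hypothetical relative rate constants (hindered residues couple more slowly).
    k_rel = {"Gly": 1.0, "Ala": 0.8, "Val": 0.2, "Ile": 0.25}
    equal_mix = {aa: 0.25 for aa in k_rel}
    adjusted_mix = equimolar_concentrations(k_rel)
    print("equal concentrations   :", product_fractions(k_rel, equal_mix))
    print("adjusted concentrations:", product_fractions(k_rel, adjusted_mix))
```

In this toy model the slow-coupling residues are simply supplied at proportionally higher concentration; a real mixture would also have to account for the resin-bound residue, as the text notes.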

D) Multiple Peptide Synthesis on Nontraditional Solid Supports

The search for innovative methods of multiple peptide synthesis has led to the investigation of alternative polymeric supports to the polystyrene-divinylbenzene matrix originally popularized by Merrifield. Cellulose, either in the form of paper disks (Blankemeyer-Menge et al. (1988) Tetrahedron Lett 29:5871-5874; Frank et al. (1988) Tetrahedron 44:6031-6040; Eichler et al. (1989) Collect Czech Chem Commun 54:1746-1752; Frank, R. (1993) Bioorg Med Chem Lett 3:425-430) or cotton fragments (Eichler et al. (1991) Pept Res 4:296-307; Schmidt et al. (1993) Bioorg Med Chem Lett 3:441-446) has been successfully functionalized for peptide synthesis. Typical loadings attained with cellulose paper range from 1 to 3 μmol/cm2, and HPLC analysis of material cleaved from these supports indicates a reasonable quality for the synthesized peptides. Alternatively, peptides may be synthesized on cellulose sheets via non-cleavable linkers and then used in ELISA-based binding studies (Frank, R. (1992) Tetrahedron 48:9217-9232). The porous, polar nature of this support may help suppress unwanted nonspecific protein binding effects. By controlling the volume of activated amino acids and other reagents spotted on the paper, the number of peptides synthesized at discrete locations on the support can be readily varied. In one convenient configuration spots are made in an 8 x 12 microtiter plate format. Frank has used this technique to map the dominant epitopes of an antiserum raised against a human cytomegalovirus protein, following the overlapping peptide screening (Pepscan) strategy of Geysen (Frank, R. (1992) Tetrahedron 48:9217-9232). Other membrane-like supports that may be used for multiple solid-phase synthesis include polystyrene-grafted polyethylene films (Berg et al. (1989) J Am Chem Soc 111:8024-8026).

E) Combinatorial Libraries by Light-Directed, Spatially Addressable Parallel Chemical Synthesis

A scheme of combinatorial synthesis in which the identity of a compound is given by its location on a synthesis substrate is termed a spatially-addressable synthesis. In one embodiment, the combinatorial process is carried out by controlling the addition of a chemical reagent to specific locations on a solid support (Dower et al. (1991) Annu Rep Med Chem 26:271-280; Fodor, S P A. (1991) Science 251:767; Pirrung et al. (1992) U.S. Patent No. 5,143,854; Jacobs et al. (1994) Trends Biotechnol 12:19-26). The technique combines two well-developed technologies: solid-phase peptide synthesis chemistry and photolithography. The high coupling yields of Merrifield chemistry allow efficient peptide synthesis, and the spatial resolution of photolithography affords miniaturization. The merging of these two technologies is done through the use of photolabile amino protecting groups in the Merrifield synthetic procedure.

The key points of this technology are illustrated in Gallop et al. (1994) J Med Chem 37:1233-1251. A synthesis substrate is prepared for amino acid coupling through the covalent attachment of photolabile nitroveratryloxycarbonyl (NVOC) protected amino linkers. Light is used to selectively activate a specified region of the synthesis support for coupling. Removal of the photolabile protecting groups by light (deprotection) results in activation of selected areas. After activation, the first of a set of amino acids, each bearing a photolabile protecting group on the amino terminus, is exposed to the entire surface. Amino acid coupling only occurs in regions that were addressed by light in the preceding step. The solution of amino acid is removed, and the substrate is again illuminated through a second mask, activating a different region for reaction with a second protected building block. The pattern of masks and the sequence of reactants define the products and their locations. Since this process utilizes photolithography techniques, the number of compounds that can be synthesized is limited only by the number of synthesis sites that can be addressed with appropriate resolution. The position of each compound is precisely known; hence, its interactions with other molecules can be directly assessed. The target protein can be labeled with a fluorescent reporter group to facilitate the identification of specific interactions with individual members of the matrix.

In a light-directed chemical synthesis, the products depend on the pattern of illumination and on the order of addition of reactants. By varying the lithographic patterns, many different sets of test peptides can be synthesized in the same number of steps; this leads to the generation of many different masking strategies.
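The dependence of the products on the mask pattern and the order of reactants can be made concrete with a short sketch. It models each synthesis site as a string and each step as an illuminated set of sites plus one building block. The particular masking scheme shown (illuminating, in cycle c, the sites whose index has bit c set) is an assumption chosen for illustration, as are the function names and the single-letter building blocks.

```python
def light_directed_synthesis(n_sites, steps):
    """Apply (mask, building_block) steps; a mask is the set of illuminated site indices."""
    sites = ["" for _ in range(n_sites)]
    for mask, block in steps:
        for index in mask:
            if not 0 <= index < n_sites:
                raise ValueError(f"site {index} is outside the substrate")
            # Coupling occurs only where the preceding illumination deprotected the site.
            sites[index] += block
    return sites


def binary_masks(n_cycles):
    """Illustrative masking scheme: cycle c illuminates sites whose index has bit c set."""
    n_sites = 2 ** n_cycles
    return [{i for i in range(n_sites) if (i >> c) & 1} for c in range(n_cycles)]


if __name__ == "__main__":
    blocks = ["A", "B", "C"]  # hypothetical building blocks
    steps = list(zip(binary_masks(len(blocks)), blocks))
    for position, product in enumerate(light_directed_synthesis(8, steps)):
        print(position, repr(product))
```

With this scheme, three coupling cycles populate eight addressable sites with eight different products, and the product at any site can be read off from its position alone.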

F) Encoded Combinatorial Libraries

In yet another embodiment, the subject method utilizes a peptide library provided with an encoded tagging system. A recent improvement in the identification of active compounds from combinatorial libraries employs chemical indexing systems using tags that uniquely encode the reaction steps a given bead has undergone and, by inference, the structure it carries. Conceptually, this approach mimics phage display libraries above, where activity derives from expressed peptides, but the structures of the active peptides are deduced from the corresponding genomic DNA sequence. The first encoding of synthetic combinatorial libraries employed DNA as the code. Two forms of encoding have been reported, encoding with sequenceable bio-oligomers (e.g., oligonucleotides and peptides), and binary encoding with non-sequenceable tags.

1) Tagging with sequenceable bio-oligomers

The principle of using oligonucleotides to encode combinatorial synthetic libraries was described in 1992 (Brenner et al. (1992) PNAS 89:5381-5383), and an example of such a library appeared the following year (Needles et al. (1993) PNAS 90:10700-10704). A combinatorial library of nominally 7^7 (= 823,543) peptides composed of all combinations of Arg, Gln, Phe, Lys, Val, D-Val and Thr (three-letter amino acid code), each of which was encoded by a specific dinucleotide (TA, TC, CT, AT, TT, CA and AC, respectively), was prepared by a series of alternating rounds of peptide and oligonucleotide synthesis on solid support. In this work, the amine linking functionality on the bead was specifically differentiated toward peptide or oligonucleotide synthesis by simultaneously preincubating the beads with reagents that generate protected OH groups for oligonucleotide synthesis and protected NH2 groups for peptide synthesis (here, in a ratio of 1:20). When complete, the tags each consisted of 69-mers, 14 units of which carried the code. The bead-bound library was incubated with a fluorescently labeled antibody, and beads containing bound antibody that fluoresced strongly were harvested by fluorescence-activated cell sorting (FACS). The DNA tags were amplified by PCR and sequenced, and the predicted peptides were synthesized. Following such techniques, the peptide libraries can be derived for use in the subject method and screened using the D-enantiomer of the target protein.
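The dinucleotide code quoted above lends itself to a short worked example. The sketch below uses the residue-to-dinucleotide assignments given in the text; the function names, and the assumption that the 14-nucleotide coding region is read in the same order in which the residues were coupled, are illustrative only.

```python
# Residue-to-dinucleotide assignments as listed in the text.
CODE = {
    "Arg": "TA", "Gln": "TC", "Phe": "CT", "Lys": "AT",
    "Val": "TT", "D-Val": "CA", "Thr": "AC",
}
DECODE = {dinucleotide: residue for residue, dinucleotide in CODE.items()}


def encode_peptide(residues):
    """Concatenate the dinucleotide for each residue, in the order given."""
    return "".join(CODE[residue] for residue in residues)


def decode_tag(coding_region):
    """Split a coding region into dinucleotides and look up each residue."""
    if len(coding_region) % 2 != 0:
        raise ValueError("coding region must contain a whole number of dinucleotides")
    pairs = [coding_region[i:i + 2] for i in range(0, len(coding_region), 2)]
    return [DECODE[pair] for pair in pairs]


if __name__ == "__main__":
    peptide = ["Arg", "Gln", "Phe", "Lys", "Val", "D-Val", "Thr"]
    tag = encode_peptide(peptide)
    print(tag, len(tag))          # seven residues give a 14-nucleotide coding region
    print(decode_tag(tag))
    print(len(CODE) ** 7)         # nominal library size, 7^7
```

Seven positions with seven choices each give the nominal library size of 7^7, and seven dinucleotides account for the 14 coding units mentioned in the text.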

It is noted that an alternative approach useful for generating nucleotide-encoded synthetic peptide libraries employs a branched linker containing selectively protected OH and NH2 groups (Nielsen et al. (1993) J Am Chem Soc 115:9812-9813; and Nielsen et al. (1994) Methods Compan Methods Enzymol 6:361-371). This approach requires that equimolar quantities of test peptide and tag co-exist, which may complicate the assessment of biological activity, especially with nucleic acid based targets.

The use of oligonucleotide tags permits exquisitely sensitive tag analysis. Even so, the method requires careful choice of orthogonal sets of protecting groups required for alternating co-synthesis of the tag and the library member. Furthermore, the chemical lability of the tag, particularly the phosphate and sugar anomeric linkages, may limit the choice of reagents and conditions that can be employed for the synthesis of non-oligomeric libraries. In preferred embodiments, the libraries employ linkers permitting selective detachment of the test peptide library member for bioassay, in part (as described infra) because assays employing beads limit the choice of targets, and in part because the tags are potentially susceptible to biodegradation.

Peptides themselves have been employed as tagging molecules for combinatorial libraries. Two exemplary approaches are described in the art, both of which employ branched linkers to solid phase upon which coding and ligand strands are alternately elaborated. In the first approach (Kerr JM et al. (1993) J Am Chem Soc 115:2529-2531), orthogonality in synthesis is achieved by employing acid-labile protection for the coding strand and base-labile protection for the ligand strand.

In an alternative approach (Nikolaiev et al. (1993) Pept Res 6:161-170), branched linkers are employed so that the coding unit and the test peptide are both attached to the same functional group on the resin. In one embodiment, a linker can be placed between the branch point and the bead so that cleavage releases a molecule containing both code and ligand (Pátek et al. (1991) Tetrahedron Lett 32:3891-3894). In another embodiment, the linker can be placed so that the test peptide can be selectively separated from the bead, leaving the code behind. This last construct is particularly valuable because it permits screening of the test peptide without potential interference, or biodegradation, of the coding groups. Examples in the art of independent cleavage and sequencing of peptide library members and their corresponding tags have confirmed that the tags can accurately predict the peptide structure.

It is noted that peptide tags are more resistant to decomposition during ligand synthesis than are oligonucleotide tags, but they must be employed in molar ratios nearly equal to those of the ligand on typical 130 μm beads in order to be successfully sequenced. As with oligonucleotide encoding, the use of peptides as tags requires complex protection/deprotection chemistries.

2) Non-sequenceable tagging: binary encoding

An alternative form of encoding the test peptide library employs a set of non-sequenceable electrophoric tagging molecules that are used as a binary code (Ohlmeyer et al. (1993) PNAS 90:10922-10926). Exemplary tags are haloaromatic alkyl ethers that are detectable as their trimethylsilyl ethers at less than femtomolar levels by electron capture gas chromatography (ECGC). Variations in the length of the alkyl chain, as well as the nature and position of the aromatic halide substituents, permit the synthesis of at least 40 such tags, which in principle can encode 2^40 (i.e., upwards of 10^12) different molecules. In the original report (Ohlmeyer et al., supra) the tags were bound to about 1% of the available amine groups of a peptide library via a photocleavable o-nitrobenzyl linker. This approach is convenient when preparing combinatorial libraries of peptides or other amine-containing molecules. A more versatile system has, however, been developed that permits encoding of essentially any combinatorial library. Here, the ligand is attached to the solid support via the photocleavable linker and the tag is attached through a catechol ether linker via carbene insertion into the bead matrix (Nestler et al. (1994) J Org Chem 59:4723-4724). This orthogonal attachment strategy permits the selective detachment of library members for bioassay in solution and subsequent decoding by ECGC after oxidative detachment of the tag sets.
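The idea of reading a set of detected tags as a binary code can be sketched as follows. The bit layout used here (a fixed block of tags per synthesis step, with the building block's number written in binary across that block) is a hypothetical layout chosen for illustration; the tag indices stand in for chemically distinct tags resolved by ECGC.

```python
def encode_bead(history, bits_per_step):
    """Return the set of tag indices attached to a bead with the given reaction history.

    `history` lists the building-block number used at each step, counted from 1 so
    that every step leaves at least one tag on the bead.
    """
    tags = set()
    for step, block in enumerate(history):
        if not 1 <= block < 2 ** bits_per_step:
            raise ValueError(f"building block {block} does not fit in {bits_per_step} bits")
        for bit in range(bits_per_step):
            if (block >> bit) & 1:
                tags.add(step * bits_per_step + bit)
    return tags


def decode_bead(tags, n_steps, bits_per_step):
    """Rebuild the reaction history from the set of tags detected on a bead."""
    history = []
    for step in range(n_steps):
        block = sum(
            1 << bit
            for bit in range(bits_per_step)
            if step * bits_per_step + bit in tags
        )
        history.append(block)
    return history


if __name__ == "__main__":
    tags = encode_bead([5, 1, 7], bits_per_step=3)
    print(sorted(tags))
    print(decode_bead(tags, n_steps=3, bits_per_step=3))
    print(2 ** 40)  # number of distinct presence/absence patterns for 40 tags
```

Each tag is either present or absent, so 40 tags give 2^40 = 1,099,511,627,776 possible patterns, consistent with the figure of upwards of 10^12 quoted above.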

Binary encoding with electrophoric tags has been particularly useful in defining selective interactions of substrates with synthetic receptors (Borchardt et al. (1994) J Am Chem Soc 116:373-374), and model systems for understanding the binding and catalysis of biomolecules. Even using detailed molecular modeling, the identification of the selectivity preferences for synthetic receptors has required the manual synthesis of dozens of potential substrates. The use of encoded libraries makes it possible to rapidly examine all the members of a potential binding set. The use of binary-encoded libraries has made the determination of binding selectivities so facile that structural selectivity has been reported for four novel synthetic macrobicyclic and tricyclic receptors in a single communication (Wennemers et al. (1995) J Org Chem 60:1108-1109; and Yoon et al. (1994) Tetrahedron Lett 35:8557-8560) using the encoded library mentioned above. Similar facility in defining specificity of interaction would be expected for many other biomolecules.


Although the several amide-linked libraries in the art employ binary encoding with the electrophoric tags attached to amine groups, attaching these tags directly to the bead matrix provides far greater versatility in the structures that can be prepared in encoded combinatorial libraries. Attached in this way, the tags and their linker are nearly as unreactive as the bead matrix itself. Two binary-encoded combinatorial libraries have been reported where the electrophoric tags are attached directly to the solid phase (Ohlmeyer et al. (1995) PNAS 92:6027-6031) and provide guidance for generating the subject peptide library. Both libraries were constructed using an orthogonal attachment strategy in which the library member was linked to the solid support by a photolabile linker and the tags were attached through a linker cleavable only by vigorous oxidation. Because the library members can be repetitively partially photoeluted from the solid support, library members can be utilized in multiple assays. Successive photoelution also permits a very high throughput iterative screening strategy: first, multiple beads are placed in 96-well microtiter plates; second, ligands are partially detached and transferred to assay plates; third, a bioassay identifies the active wells; fourth, the corresponding beads are rearrayed singly into new microtiter plates; fifth, single active compounds are identified; and sixth, the structures are decoded.

The above approach was employed in screening for carbonic anhydrase (CA) binding and identified compounds which exhibited nanomolar affinities for CA. Unlike sequenceable tagging, a large number of structures can be rapidly decoded from binary-encoded libraries (a single ECGC apparatus can decode 50 structures per day). Thus, binary-encoded libraries can be used for the rapid analysis of structure-activity relationships and optimization of both potency and selectivity of an active series. The synthesis and screening of large unbiased binary encoded peptide libraries for lead identification, followed by preparation and analysis of smaller focused libraries for lead optimization, offers a particularly powerful approach to drug discovery using the subject method.

iii) Nucleic Acid Libraries

In another embodiment, the library is comprised of a variegated pool of nucleic acids, e.g. single or double-stranded DNA or RNA. A variety of techniques are known in the art for generating screenable nucleic acid libraries which may be exploited in the present invention. In particular, many of the techniques described above for synthetic peptide libraries can be used to generate nucleic acid libraries of a variety of formats. For example, divide-couple-recombine techniques can be used in conjunction with standard nucleic acid synthesis techniques to generate bead immobilized nucleic acid libraries.

In another embodiment, solution libraries of nucleic acids can be generated which rely on PCR techniques to amplify for sequencing those nucleic acid molecules which selectively bind the screening target. By such techniques, libraries approaching 10^15 different nucleotide sequences have been generated in solution (see, for example, Bartel and Szostak (1993) Science 261:1411-1418; Bock et al. (1992) Nature 355:564; Ellington et al. (1992) Nature 355:850-852; and Oliphant et al. (1989) Mol Cell Biol 9:2944-2949). According to one embodiment of the subject method, the SELEX (systematic evolution of ligands by exponential enrichment) procedure is employed with the enantiomeric screening target. See, for example, Tuerk et al. (1990) Science 249:505-510 for a review of SELEX. Briefly, in the first step of these experiments, a pool of variant nucleic acid sequences is created, e.g. as a random or semi-random library. In general, an invariant 3' and (optionally) 5' primer sequence are provided for use with PCR anchors or for permitting subcloning. The nucleic acid library is applied to the screening target, and nucleic acids which selectively bind (or otherwise act on) the target are isolated from the pool; the isolates are amplified by PCR and subcloned into, for example, phagemids. The phagemids are then transfected into bacterial cells, and individual isolates can be obtained and the sequence of the nucleic acid cloned from the screening pool can be determined.

When RNA is the test ligand, the RNA library can be directly synthesized by standard organic chemistry, or can be provided by in vitro transcription as described by Tuerk et al., supra. Likewise, RNA isolated by binding to the screening target can be reverse transcribed and the resulting cDNA subcloned and sequenced as above.
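Why repeated rounds of selection and amplification can recover rare binders from very large pools can be illustrated with a toy enrichment model. It assumes that each round retains binders with one fixed probability and non-binders with a much smaller one, and that PCR amplification does not change their ratio. The starting fraction and retention probabilities are hypothetical values chosen only to show the arithmetic.

```python
def selection_round(binder_fraction, p_binder_kept, p_background_kept):
    """Fraction of binders in the pool after one round of selection and amplification."""
    kept_binders = binder_fraction * p_binder_kept
    kept_others = (1.0 - binder_fraction) * p_background_kept
    return kept_binders / (kept_binders + kept_others)


def enrich(initial_fraction, p_binder_kept, p_background_kept, rounds):
    """Return the binder fraction before selection and after each successive round."""
    fractions = [initial_fraction]
    for _ in range(rounds):
        fractions.append(
            selection_round(fractions[-1], p_binder_kept, p_background_kept)
        )
    return fractions


if __name__ == "__main__":
    # Hypothetical numbers: one binder per 10^9 sequences, 50% of binders and
    # 0.1% of non-binders retained in each round.
    for round_number, fraction in enumerate(enrich(1e-9, 0.5, 0.001, rounds=5)):
        print(round_number, f"{fraction:.3e}")
```

In this model the odds of a sequence being a binder are multiplied by the ratio of the two retention probabilities in every round, so the enrichment is exponential in the number of rounds until binders dominate the pool.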

iv) Carbohydrate Libraries

In still another embodiment, the library can be provided as a mixture of carbohydrates. As described above, most sugars which make up carbohydrates are D-sugars and are readily available and/or easily synthesized. Moreover, the synthesis of polysaccharides has been rapidly advanced in recent years, both by direct chemical synthesis and by in vitro enzyme-mediated synthesis of polysaccharide.

The generation of carbohydrate diversity, particularly in the form of polysaccharide libraries, is state-of-the-art. See, for example, Kanie et al. (1995) Angew Chem Int Ed Engl 34:2720-2722; Malik et al. (1990) Chemiker-Zeitung 114:371-375; and PCT publication WO 95/03315. Moreover, the advances made in polysaccharide synthesis, along with the rapidly expanding combinatorial chemistry field, plainly point the direction to a variety of other readily accessible techniques for generating variegated carbohydrate libraries for use in the present screening technology. That is, the methodology has reached the stage where complex oligosaccharides and other glycoconjugate arrays are readily provided.

In general, the carbohydrate library, like the peptide and nucleic acid libraries above, can be generated in any of a number of formats. For example, the carbohydrates can be provided free in solution or immobilized to insoluble supports. The members of the library can be derived with separate tags encoding their identity, or other deconvolution and/or detection methods can be used to identify those carbohydrate ligands isolated from a library by the subject method.

In an exemplary embodiment, the oligosaccharide is provided on a solid support and synthesized by step-wise reactions. For example, a first sugar can be attached to the support at its reducing end.

Thus, at each elongation step, the glycosyl acceptor is linked to the solid phase and coupling occurs with a solution-based donor. As the next cycle is contemplated, a unique acceptor hydroxyl must be exposed in the solid phase. This requires the donor used in the previous glycosidation to be furnished with a uniquely deprotectable blocking group (-OP) at the site of elongation (Danishefsky et al. (1993) Science 260:1307-1309). Such synthesis strategies can be carried out, for example, in a divide-couple-recombine format as described above to yield a variegated library of beads where each individual bead is homogenous with respect to the oligosaccharide type it supports.

In another exemplary embodiment, the carbohydrate library can be provided free in solution and deconvolution techniques employed to identify those members of the solution library which bind to the target. For example, glycals can be used in the solution synthesis of oligosaccharide ensembles (Suzuki et al. (1990) J Am Chem Soc 112:8895; and Danishefsky et al. (1992) J Am Chem Soc 114:8329). Briefly, the glycals offer the advantage of ease in differential protection. To illustrate one exemplary synthesis, a glycal is activated by conversion to an epoxide, such as by treatment with 3,3-dimethyldioxirane (Danishefsky et al. (1993) Science 260:1307). Opening of the epoxide by a glycosyl acceptor gives a glycoside linkage. Moreover, each glycosidation gives rise to a unique, free hydroxyl group at C2 of the previous donor, which is of potentially great value for the synthesis of branched oligosaccharides. Similar synthesis techniques can of course be carried out on solid phase as well.

In still other embodiments, the carbohydrate library, because it consists of the D-sugars, can be isolated from natural sources, e.g., from cells in culture. The carbohydrate library can also be provided in the form of glycopeptides. To illustrate, Roberge et al. Science 269:202-204, describes a convergent synthesis approach to generating glycopeptides utilizing the epoxide-activated glycals described above. According to this method, combinatorial libraries in which the carbohydrate moiety is varied, or in which both the carbohydrate and peptide moieties are varied, are contemplated.

Other combinatorial libraries of glycopeptides are known in the art as, for example, described in PCT publications WO 95/18971, WO 95/21850, and WO 95/03315.

v) Small Molecule Libraries

Recent trends in the search for novel pharmacological agents have focused on the preparation of chemical libraries. Peptide, nucleic acid, and saccharide libraries are described above. However, the field of combinatorial chemistry has also provided large numbers of non-polymeric, small organic molecule libraries which can be employed in the subject method.

Exemplary combinatorial libraries include benzodiazepines, peptoids, biaryls and hydantoins. In general, the same techniques described above for the various formats of chemically synthesized peptide libraries are also used to generate and (optionally) encode synthetic non-peptide libraries.

III. Selecting Compounds from the Library

As with the diversity contemplated for the screening target and form in which the compound library is provided, the subject method is envisaged with a variety of detection methods for isolating and identifying compounds which interact with the screening target. In most embodiments, the screening programs which test libraries of compounds will be derived for high throughput analysis in order to maximize the number of compounds surveyed in a given period of time. However, as a general rule, the screening portion of the subject method involves contacting the screening target with the compound library and isolating those compounds from the library which interact with the screening target. Such interaction may be detected, for example, based on directly detecting the binding of the compounds to the screening target, or inferred through the modulation of interactions involving the screening target with other molecules, such as protein-protein or protein-DNA interaction involving the screening target or modulation of an enzymatic/catalytic activity of the screening target. The efficacy of the test compounds can be assessed by generating dose response curves from data obtained using various concentrations of the test compound. Moreover, a control assay can also be performed to provide a baseline for comparison.
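The dose response assessment mentioned above can be illustrated with a minimal sketch. It assumes a simple Hill-type response model and estimates the half-maximal concentration by log-linear interpolation; the function names, the model, and the simulated data points are illustrative assumptions and are not taken from the present description.

```python
import math

def hill_response(conc, ec50, hill=1.0, bottom=0.0, top=1.0):
    """Hill-type response at a given concentration (assumed model)."""
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** hill)

def interpolate_ec50(concs, responses):
    """Estimate the half-maximal concentration by log-linear interpolation.

    Assumes concentrations are ascending and responses are increasing.
    """
    half = (min(responses) + max(responses)) / 2.0
    points = list(zip(concs, responses))
    for (c0, r0), (c1, r1) in zip(points, points[1:]):
        if r0 <= half <= r1 and r1 > r0:
            frac = (half - r0) / (r1 - r0)
            log_c = math.log10(c0) + frac * (math.log10(c1) - math.log10(c0))
            return 10.0 ** log_c
    raise ValueError("half-maximal response is not bracketed by the data")

# Simulated (hypothetical) data: concentrations in arbitrary units, true EC50 = 1.0
concs = [0.01, 0.1, 1.0, 10.0, 100.0]
responses = [hill_response(c, ec50=1.0) for c in concs]
estimated_ec50 = interpolate_ec50(concs, responses)

if __name__ == "__main__":
    print(f"estimated EC50 = {estimated_ec50:.3g}")
```

In practice a baseline from the control assay would be subtracted before fitting, and a nonlinear fit would normally replace the interpolation shown here.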

Complex formation between a test compound and a screening target may be directly detected by a variety of techniques. The complexes can be scored using, for example, detectably labeled compounds or screening targets, such as radiolabeled, fluorescently labeled, or enzymatically labeled polypeptides, by immunoassay, or by chromatographic detection.

In one embodiment, the variegated compound library is subjected to affinity enrichment in order to select for compounds which bind a preselected screening target. The term "affinity separation" or "affinity enrichment" includes, but is not limited to (1) affinity chromatography utilizing immobilized screening targets, (2) precipitation using screening targets, (3) fluorescence activated cell sorting where the compound library is so amenable, (4) agglutination, and (5) plaque lifts. In each embodiment, the library of compounds is ultimately separated based on the ability of a particular compound to bind a screening target of interest. See, for example, the Ladner et al. U.S. Patent No. 5,223,409; the Kang et al. International Publication No. WO 92/18619; the Dower et al. International Publication No. WO 91/17271; the Winter et al. International Publication WO 92/20791; the Markland et al. International Publication No. WO 92/15679; the Breitling et al. International Publication WO 93/01288; the McCafferty et al. International Publication No. WO 92/01047; the Garrard et al. International Publication No. WO 92/09690; and the Ladner et al. International Publication No. WO 90/02809.

With respect to affinity chromatography, it will be generally understood by those skilled in the art that a great number of chromatography techniques can be adapted for use in the present invention, ranging from column chromatography to batch elution, and including ELISA and reverse biopanning techniques. Typically the screening target is immobilized on an insoluble carrier, such as sepharose or polyacrylamide beads, or, alternatively, the wells of a microtitre plate.

The population of compounds is applied to the affinity matrix under conditions compatible with the binding of compounds in the library to the immobilized screening target. The population is then fractionated by washing with a solute that does not greatly affect specific binding of compounds to the screening target, but which substantially disrupts any non-specific binding of components of the library to the screening target or matrix. A certain degree of control can be exerted over the binding characteristics of the compounds recovered from the library by adjusting the conditions of the binding incubation and subsequent washing. The temperature, pH, ionic strength, divalent cation concentration, and the volume and duration of the washing can select for compounds within a particular range of affinity and specificity. Selection based on slow dissociation rate, which is usually predictive of high affinity, is a very practical route. This may be done either by continued incubation in the presence of a saturating amount of free screening target, or by increasing the volume, number, and length of the washes. In each case, the rebinding of dissociated compounds from the applied library is prevented, and with increasing time, compounds of higher and higher affinity are recovered. Moreover, additional modifications of the binding and washing procedures may be applied to find compounds with special characteristics. The affinities of some compounds may be dependent on ionic strength or cation concentration. Specific examples are peptides which depend on Ca2+ or other ions for binding activity and which release from the screening target in the presence of a chelating agent such as EGTA (see Hopp et al. (1988) Biotechnology 6:1204-1210). Such peptides may be identified in the compound library by a double screening technique, isolating first those that bind the screening target in the presence of Ca2+, and subsequently identifying those in this group that fail to bind in the presence of EGTA.
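The selection for slow dissociation rate described above can be sketched numerically. The example assumes simple first-order dissociation with rebinding fully prevented, so that the fraction of a bound compound surviving a wash of duration t is exp(-k_off * t). The three off-rates are hypothetical values chosen only to show the trend; they are not data from the present description.

```python
import math

def fraction_retained(k_off_per_s, wash_seconds):
    """Fraction of initially bound compound still bound after washing.

    Assumes first-order dissociation and no rebinding.
    """
    return math.exp(-k_off_per_s * wash_seconds)

# Hypothetical dissociation rate constants (per second), for illustration only
HYPOTHETICAL_OFF_RATES = {
    "fast_off": 1e-2,
    "medium_off": 1e-3,
    "slow_off": 1e-4,
}

def retained_after(wash_seconds, off_rates=HYPOTHETICAL_OFF_RATES):
    """Retained fraction for each compound after a wash of the given length."""
    return {name: fraction_retained(k, wash_seconds)
            for name, k in off_rates.items()}

if __name__ == "__main__":
    for minutes in (1, 10, 60):
        retained = retained_after(minutes * 60)
        summary = ", ".join(f"{name}={value:.3f}" for name, value in retained.items())
        print(f"{minutes:>2} min wash: {summary}")
```

Under these assumptions, lengthening the wash depletes the fast-dissociating compounds much more strongly than the slow-dissociating ones, which is the enrichment effect the passage relies on.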

After "washing" to remove non-specifically bound members of the compound library, when desired, specifically bound compounds can be eluted by either specific desorption (using excess screening target) or non-specific desorption (using pH, polarity reducing agents, or chaotropic agents). In preferred embodiments using biological display packages, the elution protocol does not kill the organism used as the display package such that the enriched population of display packages can be further amplified by reproduction. The list of potential eluants includes salts (such as those in which one of the counter ions is Na+, NH4+, Rb+, SO4^2-, H2PO4-, citrate, K+, Li+, Cs+, HSO4-, CO3^2-, Ca2+, Sr2+, Cl-, PO4^3-, HCO3-, Mg2+, Ba2+, Br-, HPO4^2-, or acetate), acid, heat, and, when available, soluble forms of the target antigen (or analogs thereof). Because bacteria continue to metabolize during the affinity separation step and are generally more susceptible to damage by harsh conditions, the choice of buffer components (especially eluants) can be more restricted when the display package is a bacterium rather than a phage or spore. Neutral solutes, such as ethanol, acetone, ether, or urea, are examples of other agents useful for eluting the bound display packages.

In preferred embodiments of biological peptide displays or certain nucleic acid libraries, affinity enriched packages or nucleic acids are iteratively amplified and subjected to further rounds of affinity separation until enrichment of the desired binding activity is detected. In certain embodiments, the specifically bound biological display packages, especially bacterial cells, need not be eluted per se, but rather, the matrix bound display packages can be used directly to inoculate a suitable growth media for amplification. Where the display package is a phage particle, the fusion protein generated with the coat protein can interfere substantially with the subsequent amplification of eluted phage particles, particularly in embodiments wherein the cpIII protein is used as the display anchor. Even though present in only one of the 5-6 tail fibers, some peptide constructs, because of their size and/or sequence, may cause severe defects in the infectivity of their carrier phage. This causes a loss of phage from the population during reinfection and amplification following each cycle of panning. In one embodiment, the peptide can be derived on the surface of the display package so as to be susceptible to proteolytic cleavage which severs the covalent linkage of at least the antigen binding sites of the displayed peptide from the remaining package. For instance, where the cpIII coat protein of M13 is employed, such a strategy can be used to obtain infectious phage by treatment with an enzyme which cleaves between the peptide portion and cpIII portion of a tail fiber fusion protein (e.g., such as the use of an enterokinase cleavage recognition sequence).

To further minimize problems associated with defective infectivity, DNA prepared from the eluted phage can be transformed into host cells by electroporation or well known chemical means. The cells are cultivated for a period of time sufficient for marker expression, and selection is applied as typically done for DNA transformation.

The colonies are amplified, and phage harvested for a subsequent round(s) of panning.

After isolation of biological display packages which encode peptides having a desired binding specificity for the screening target, the nucleic acid encoding the peptide for each of the purified display packages can be recloned in a suitable eukaryotic or prokaryotic expression vector and transfected into an appropriate host for production of large amounts of protein.

On the other hand, where chemically synthesized libraries are used in the form of display packages, the isolated peptides are identified either directly from the display, e.g., by direct microsequencing, or the display packages are appropriately decoded, e.g., by elucidating the identity of an associated tag/index. Deconvolution techniques are also known in the art.

It will be apparent that, in addition to utilizing binding as the separation criteria, compound libraries can be fractionated based on other activities of the target molecule, such as modulation of catalytic activity.

Exemplification

The invention now being generally described, it will be more readily understood by reference to the following examples which are included merely for purposes of illustration of certain aspects and embodiments of the present invention, and are not intended to limit the invention.

A. Chemical synthesis of D-peptide drug targets

The chemical methods for D-peptide synthesis are identical to those of L-peptide synthesis, except that D-amino acid monomeric precursors are used instead of L-amino acid monomeric precursors.

i) B-D-GRP

The D-enantiomer of the 27-amino-acid human GRP (D-GRP) was synthesized and purified commercially by Peptide Technologies Corporation, MD, using FMOC chemistry, with a biotin and a spacer of two glycines incorporated onto its amino terminus during the synthesis by standard methods. The structure is depicted using the standard single letter amino acid abbreviations:

Biotinyl-GGVPLPAGGGTVLTKMYPRGNHWAVGHLM-NH2 (all D-amino acids)

The amino terminus was chosen for biotinylation because it is unnecessary for bombesin receptor binding (Cuttitta et al. (1985) Nature 316:823).

After purification by HPLC, HPLC analysis revealed that 97% of the product resided in one peak. Amino acid and mass spectral analyses supported the expected structure. The dominant mass spectral peak corresponded to a molecular weight of 3200 (calculated molecular weight = 3200.75). Since chemically synthesized L-GRP is known to fold into its active conformation spontaneously (Bachem, CA), the B-D-GRP target was assumed to also fold spontaneously into the mirror image structure, assuming minimal, if any, interference from the N-terminal biotinyl-gly-gly structure.

ii) B-D-neuRL2

As with D-GRP, the eight-amino-acid D-peptide derivative of human neuRL2 was chemically synthesized and purified commercially by Peptide Technologies Corporation, MD, with a biotinyl-gly-gly spacer sequence incorporated at its N-terminus:

Biotinyl-GGLGLRSLRE-amide (all D-amino acids)

After purification by HPLC, HPLC analysis revealed that 100% of the product resided in one peak. Amino acid and mass spectral analyses supported the expected structure. The dominant mass spectral peak corresponded to a molecular weight of 1286.7 (calculated molecular weight = 1286.53).

Since chemically synthesized neuRL2 is known to fold into its active conformation spontaneously, the B-D-neuRL2 target was assumed to also fold spontaneously into the mirror image structure, assuming minimal, if any, interference from the N-terminal biotinyl-gly-gly structure.

iii) B-D-IL-8(4-72)

Chemical synthesis of D-IL-8(4-72)

The D-enantiomer of the human 69-amino-acid IL-8(4-72) was kindly synthesized by Dr. Ian Clark-Lewis (Institute of Medical and Veterinary Science, Adelaide, South Australia) in the un-biotinylated form to aid purification. The three N-terminal amino acids of IL-8 were omitted because they are unnecessary for biological activity (Rajarathnam et al. (1994) Science 264:90).

After purification by HPLC, the peptide was refolded under conditions known to refold L-IL-8. HPLC analysis revealed that 98% of the product resided in one peak, and amino acid and mass spectral analyses supported the expected structure. All analyses gave identical results to those normally obtained with its enantiomer, L-IL-8(4-72), as expected.

Biotinylation of D-IL-8(4-72)

The following biotinylation method, used for all protein biotinylations, was that given in Dr. George Smith's selection and amplification manual (see also Smith et al. (1993) Gene 128:37-42). The general protocol is given together with comments specific to D-IL-8.

The protein to be biotinylated (400 μg of D-IL-8, in this case) in 39.2 μl of water was added to 8.8 μl of 1 M NaHCO3 pH 8.6. Approximately 1 mg of the biotinylating agent, NHS-LC-biotin (Pierce), was dissolved in water at a concentration of 0.5 mg/ml. 40 μl of this solution was added immediately to the protein solution and allowed to react for 2 hours at room temperature. The final concentration of D-IL-8 was 560 μM, sufficient to be almost completely dimerized (Kd = 20 μM).
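The stated final concentration can be checked with a short calculation. The sketch below assumes the formula weight of 8094 quoted elsewhere in this description for D-IL-8(4-72), a total reaction volume equal to the sum of the three volumes given (88 μl), and an idealized monomer-dimer equilibrium 2M <-> D with Kd = [M]^2/[D]. The function names are illustrative.

```python
import math

def molar_concentration_uM(mass_ug, fw_g_per_mol, volume_ul):
    """Concentration in micromolar from mass (ug), formula weight and volume (ul).

    ug / (g/mol) gives umol; umol / ul gives mol/l; multiply by 1e6 for uM.
    """
    return mass_ug / fw_g_per_mol / volume_ul * 1e6

def dimer_fraction(total_uM, kd_uM):
    """Fraction of protein chains present as dimer for 2M <-> D.

    Solves total = [M] + 2[D] with [D] = [M]^2 / Kd for the free monomer.
    """
    monomer = kd_uM * (math.sqrt(1.0 + 8.0 * total_uM / kd_uM) - 1.0) / 4.0
    return 1.0 - monomer / total_uM

# Values from the protocol; the formula weight is the one quoted for D-IL-8(4-72)
reaction_volume_ul = 39.2 + 8.8 + 40.0
d_il8_uM = molar_concentration_uM(400.0, 8094.0, reaction_volume_ul)
fraction_as_dimer = dimer_fraction(d_il8_uM, 20.0)

if __name__ == "__main__":
    print(f"D-IL-8 concentration: {d_il8_uM:.0f} uM")
    print(f"fraction of chains in dimer: {fraction_as_dimer:.2f}")
```

Under this simple model the concentration comes out close to the stated 560 μM, and most, though not all, of the protein is predicted to be dimeric.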

In the case of D-IL-8, half of the reaction mixture was removed at this stage for the cross-linking reaction (see below). Next, 500 μl of 1 M ethanolamine buffer (pH 9, adjusted with HCl) was added to the remaining reaction mixture, and the mix left at room temperature for 2 hours. After this incubation, 20 μl of 50 mg/ml of dialyzed bovine serum albumin (BSA) and 1 ml of TRIS buffered saline (TBS, 50 mM TRIS HCl pH 7.5, 150 mM NaCl) were added. This was then concentrated on a Centricon ultrafilter (Amicon) with the appropriate molecular weight cut-off, and the retentate was washed three times with 2 ml TBS to remove all the unconjugated biotin. The last wash was with TBS/0.02% NaN3. The retentate was collected in a conical cup by back centrifugation and stored at 4°C. The volume obtained was generally 50-80 μl. The concentration of biotinylated protein was calculated assuming no losses (analysis by gel electrophoresis proved this assumption to be reasonable).

iv) X-B-D-IL-8 (cross-linked B-D-IL-8)

The published method for efficiently cross-linking L-IL-8 was adapted for B-D-IL-8 (Paolini et al. (1994) J Immunology 153:2704).

Half (44 μl) of the biotinylation reaction mixture described above that had already been reacted for 2 hours was added to 11 μl of freshly dissolved ethylene glycol bis(sulfosuccinimidylsuccinate) (Pierce), and the mixture left at room temperature for 2 hours. Next, 500 μl of 1 M ethanolamine buffer (pH 9, adjusted with HCl) was added to the mixture, and the mix left at room temperature for 2 hours. After this incubation, BSA was added and the X-B-D-IL-8 purified as described above for B-D-IL-8.

The efficiency of cross-linking was measured by gel electrophoresis using an SDS/20% polyacrylamide gel (Harlow E, Lane D: Antibodies: A Laboratory Manual. Cold Spring Harbor Laboratory, NY. 1988) and silver staining (Bio-Rad).

B. Biotinylation of L-protein control targets

i) B-MAb DF3-P

Purified, quantitated murine IgG MAb DF3-P (Perey et al. (1992) Cancer Research 52:2563) was kindly provided. The protein was biotinylated and purified according to the method given above.

ii) B-S-protein

Bovine pancreatic S-protein (Sigma) was biotinylated and purified according to the method given above.

C. Bacteriophage libraries expressing random 6-mer and 15-mer L-peptides

Two different combinatorial bacteriophage peptide libraries were obtained. Both contain a randomized peptide epitope inserted in the same position of the pIII coat protein that, after removal of the signal peptide by processing, starts four amino acids from the N-terminus (Figures 2A and 2B). One library contains a randomized hexapeptide epitope (Scott et al. (1990) Science 249:386; Figure 2B), while the other contains a randomized 15-mer peptide epitope (Figure 2A). The 10^8 different peptide variants per library represent most of all possible hexamer sequences (20^6 = 6.4 x 10^7), but only a very small fraction of all possible 15-mer sequences (20^15 = 3.3 x 10^19).
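The comparison of library size with sequence space can be reproduced with a brief calculation. The sketch assumes that each of the 10^8 clones is drawn uniformly and independently from all possible sequences, so that the expected fraction of sequences present at least once is 1 - exp(-N/M). This is an idealization (libraries encoded by degenerate codons are not uniform), and the function names are illustrative.

```python
import math

def sequence_space(length, alphabet_size=20):
    """Number of distinct peptides of the given length."""
    return alphabet_size ** length

def expected_coverage(n_clones, n_possible):
    """Expected fraction of possible sequences represented at least once.

    Assumes uniform, independent sampling: coverage = 1 - exp(-N/M).
    """
    return -math.expm1(-n_clones / n_possible)

LIBRARY_SIZE = 10 ** 8  # peptide variants per library, as stated above

hexamer_space = sequence_space(6)
pentadecamer_space = sequence_space(15)
hexamer_coverage = expected_coverage(LIBRARY_SIZE, hexamer_space)
pentadecamer_coverage = expected_coverage(LIBRARY_SIZE, pentadecamer_space)

if __name__ == "__main__":
    print(f"6-mer space:  {hexamer_space:.2e}, coverage {hexamer_coverage:.2f}")
    print(f"15-mer space: {pentadecamer_space:.2e}, coverage {pentadecamer_coverage:.1e}")
```

Under the uniform-sampling assumption roughly four fifths of all hexamers would be present, whereas the 15-mer library samples only a few parts in 10^12 of its sequence space.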

D. Selection and amplification protocols

A detailed account of the general protocol which can be adapted for the present method has been published (Smith et al. (1993) Methods Enzymol 217:228). The protocol is summarized below.

i) Coating petri dishes with streptavidin

On the bottom of a 35 mm polystyrene petri dish (Falcon), 1 ml of 0.1 M NaHCO3 pH 8.6 was added. 10 μl of 1 μg/μl streptavidin was then added and the dish rocked to completely coat the bottom surface. The dish was placed in a humidified plastic box on a rocker overnight in the refrigerator. The next day, the streptavidin was discarded and the dish was filled with blocking solution (0.1 M NaHCO3, 5 mg/ml BSA, 0.1 μg/ml streptavidin, 0.02% NaN3) and left for 1 hour in the cold. Then the dish was washed six times rapidly with TBS/Tween (50 mM Tris HCl pH 7.5, 150 mM NaCl, 0.5% Tween 20).

ii) Panning bacteriophage after pre-reacting the dish with biotinylated target (round 1)

Into each streptavidin-coated dish, 400 μl of TBS/Tween containing 1 mg/ml dialyzed BSA and 0.02% NaN3 was added. Then the desired biotinylated target (10 μg of MAb, or the equivalent moles of an alternative target) was added and the dish rocked in the cold for 2 hours. Then, 4 μl of 10 mM biotin was added, and the dish rocked for an additional hour in the cold. The dish was then washed six times rapidly with TBS/Tween. Next, 400 μl of TBS/Tween and 4 μl of 10 mM biotin were added. An aliquot of the 6-mer or 15-mer bacteriophage library was then added to the dish, and the dish rocked for 4 hours in the cold.

iii) Washing and eluting the dish

The bacteriophage were shaken out of the dish, and the dish was washed ten times with TBS/Tween. 400 μl of elution buffer (0.1 M HCl adjusted to pH 2.2 with glycine, 1 mg/ml BSA) was added, and the dish rocked gently for 10 minutes. The eluate was pipetted into a tube containing 75 μl of 1 M Tris HCl pH 9.1 to give a neutralized eluate.

iv) Amplifying the bacteriophage eluate

The neutralized eluate was concentrated on a Centricon 30-kDa ultrafilter, washed once with TBS, and collected in a total volume of 100 μl. Next, 100 μl of K91Kan terrific broth cells (see below) were added, and the tube was mixed gently and left at room temperature for 30 minutes. The terrific broth cells had been prepared by (i) inoculating a few ml of LB (Luria-Bertani solution) containing 100 μg/ml kanamycin with K91Kan E. coli cells on the previous day and allowing it to shake overnight at 37°C; (ii) inoculating 10 ml of terrific broth in a 125-ml flask with 100 μl of the overnight culture and shaking vigorously at 37°C until a 10x dilution reached an optical density of 0.12-0.25; and (iii) slowing down the shaking to allow sheared F-pili to regenerate, with the cells being used within 1 hour.

The above mixture of eluted bacteriophage and E. coli was added to 20 ml of pre-warmed LB medium containing 0.2 μg/ml tetracycline in a 125 ml culture flask and shaken vigorously at 37°C for 30-60 minutes. 20 μl of 20 mg/ml tetracycline was added, and the flask shaken vigorously overnight.

v) Purification of the amplified bacteriophage eluate

The culture was poured into an Oak Ridge tube and centrifuged for 10 minutes at 5K rpm. The supernatant was poured into a second Oak Ridge tube and centrifuged at 10K rpm for 10 minutes. The supernatant was then poured into a fresh Oak Ridge tube containing 3 ml of 16.7% PEG/3.3 M NaCl, mixed by inverting 100 times, and left on ice for at least 4 hours. Next, this mixture was centrifuged for 15 minutes at 10K rpm to pellet bacteriophage. The bacteriophage were then dissolved in 1 ml of TBS, transferred to a fresh tube, vortexed briefly, and centrifuged to pellet insoluble matter. The supernatant was transferred to a second tube containing 150 μl of 16.7% PEG/3.3 M NaCl. This was mixed by 100 inversions and placed in the refrigerator for at least 1 hour. It was centrifuged for 10 minutes to pellet the bacteriophage. The bacteriophage were dissolved in 200 μl TBS/0.02% NaN3, followed by vortexing and centrifuging. The supernatant, termed the amplified eluate 1, was removed and stored at 4°C. This completed selection and amplification round 1.

vi) Panning after pre-reacting bacteriophage with the biotinylated target (round 2)

100 μl of amplified eluate 1 from round 1 was mixed with a few μl of biotinylated target to give a biotinylated target concentration of 100 nM. The reaction was left overnight in the cold to allow it to reach equilibrium. 400 μl of TBS/Tween was added, and the solution was added to a freshly prepared streptavidin-coated dish (see step i above). Then steps iii, iv (omitting the Centricon concentration step) and v were repeated to yield amplified eluate 2. This completed selection and amplification round 2.

vii) Panning after pre-reacting bacteriophage with the biotinylated target (round 3)

100 μl of amplified eluate 2 from round 2 was mixed with a few μl of biotinylated target to give a biotinylated target concentration of 0.1 nM. The reaction was left overnight in the cold to allow it to reach equilibrium. 400 μl of TBS/Tween was added, and the solution was added to a freshly prepared streptavidin-coated dish (see step i above). Then step iii was again performed, but, in contrast to previous rounds, amplification was delayed until individual clones had been isolated.

viii) Isolation of individual bacteriophage clones

Serial dilutions of the neutralized eluate were prepared in TBS containing 1 mg/ml gelatin. In 15 ml tubes, 10 μl aliquots of the dilutions were mixed with 10 μl aliquots of K91Kan terrific broth culture (see above) and incubated at room temperature for 10 minutes. 1 ml of LB containing 0.2 μg/ml tetracycline was added, and the tubes incubated at 37°C for 20-40 minutes. 200 μl of the infected cells were then spread on LB plates containing 40 μg/ml of tetracycline and 100 μg/ml of kanamycin, and the plates were incubated overnight at 37°C.

ix) Small-scale propagation and processing of bacteriophage

Individual colonies were inoculated in 1.7 ml aliquots of LB containing 20 μg/ml tetracycline, and the tubes shaken vertically on a shaker incubator for 16-24 hours at 37°C. Each culture was poured into a 1.5 ml microcentrifuge tube and centrifuged briefly to pellet the cells. 1 ml of each supernatant was pipetted into a second tube to which 150 μl of 16.7% PEG/3.3 M NaCl was already added. This was mixed by 100 inversions and incubated at 4°C for at least 4 hours. The tubes were then centrifuged, and the pellets dissolved in 500 μl TBS by vigorous vortexing. These bacteriophage were adequately pure for preparing DNA sequencing templates. This completed selection and amplification round 3.
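The PEG/NaCl precipitation steps in the protocol use the same stock (16.7% PEG/3.3 M NaCl) at two scales: 3 ml added to the culture supernatant, and 150 μl added to 1 ml. The sketch below computes the resulting final concentrations by simple dilution. It assumes the large-scale supernatant volume is about 20 ml (the stated culture volume) and ignores volume changes on mixing; the function name is illustrative.

```python
def final_concentration(stock_conc, stock_volume, sample_volume):
    """Concentration after mixing a stock into a sample (same volume units)."""
    return stock_conc * stock_volume / (stock_volume + sample_volume)

PEG_STOCK_PERCENT = 16.7
NACL_STOCK_MOLAR = 3.3

# Large scale: 3 ml stock into an assumed ~20 ml of culture supernatant
peg_large = final_concentration(PEG_STOCK_PERCENT, 3.0, 20.0)
nacl_large = final_concentration(NACL_STOCK_MOLAR, 3.0, 20.0)

# Small scale: 0.150 ml stock into 1 ml of supernatant
peg_small = final_concentration(PEG_STOCK_PERCENT, 0.150, 1.0)
nacl_small = final_concentration(NACL_STOCK_MOLAR, 0.150, 1.0)

if __name__ == "__main__":
    print(f"large scale: {peg_large:.2f}% PEG, {nacl_large:.2f} M NaCl")
    print(f"small scale: {peg_small:.2f}% PEG, {nacl_small:.2f} M NaCl")
```

Both scales use the same 3:20 stock-to-sample ratio, so under these assumptions they give the same final precipitation conditions.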

E. Characterization of selected bacteriophage

A detailed account of the general protocol which can be adapted for the present method has been published (Smith et al. (1993) Methods Enzymol 217:228). The protocol is summarized below.

i) Preparation of DNA sequencing templates

200 μl aliquots of purified bacteriophage (2 x 10^11 virions) from the previous step were extracted once with 200 μl of phenol and once with 200 μl of chloroform. The final aqueous phase (150 μl) was mixed with 250 μl 10 mM Tris HCl pH 8.0, 1 mM EDTA, and 40 μl of 3 M NaOAc pH 6, and 1 ml ethanol was added. The DNA was precipitated for 1 hour on ice, centrifuged for 15 minutes, and washed with 1 ml of 70% EtOH. The pellet was dissolved in 7 μl of water and stored in the freezer.

ii) DNA sequence analysis

DNA sequence analyses with purified bacteriophage single-stranded DNA templates used an 18-mer synthetic DNA primer: 5'-TGAATTTTCTGTATGAGG. The priming site on the bacteriophage DNA is shown in Figure 2.
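Because the primer anneals to the single-stranded template, the priming site on the template strand, read 5' to 3', is the reverse complement of the primer. A minimal sketch of that relationship follows; the helper name is illustrative, and only the primer sequence itself is taken from the text above.

```python
PRIMER = "TGAATTTTCTGTATGAGG"  # 18-mer sequencing primer, 5' to 3'

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

# Sequence on the template strand (5' to 3') to which the primer anneals
priming_site = reverse_complement(PRIMER)

if __name__ == "__main__":
    print(f"primer (5'->3'):       {PRIMER}")
    print(f"priming site (5'->3'): {priming_site}")
```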

Initially, automated sequencing reactions with fluorescently labeled dideoxynucleotide triphosphates were performed by the Cellular and Molecular Biology Core Facility of the Dana-Farber Cancer Institute. Later, manual sequencing reactions using 32P-end-labeled primer and PCR were used because this required less DNA template and allowed more rapid characterization of bacteriophage clones. The protocol is detailed in the manual of Promega's fmol DNA Sequencing System.

F. Selection and amplification of peptide ligands with B-D-GRP, B-D-neuRL2, and B-MAb targets

Initial experiments employed three rounds of selection and amplification with the 6-mer and 15-mer peptide libraries using B-D-GRP, B-D-neuRL2, and B-MAb as targets (see above). The results are summarized in Table 1, with + and - results indicating whether or not a peptide consensus sequence was selected in an experiment, respectively. The sequences of several different bacteriophage clones were determined per experiment (see Methods). The B-D-neuRL2 and most of the B-D-GRP experiments were performed in parallel with a successful B-MAb control experiment.

It is evident from Table 1 that none of the B-D-neuRL2 or B-D-GRP experiments yielded consensus sequences. In contrast, seven of the eight B-MAb experiments yielded consensus sequences (Table 1), with most (usually all) of the selected bacteriophage clones in each experiment sharing sequence identity with the known eight-amino-acid epitope of the MAb. Only a small number of different sequences were observed because the same sequences were selected repeatedly in the same or different experiments with the MAb target. Interestingly, the consensus sequence match in the 15-mer was located within a 10-amino-acid sequence that was probably bridged by a disulphide bond between two cysteines. Such cysteine pairs have been found at a very high frequency in selected peptides, perhaps because they stabilize the conformation of potentially flexible peptide sequences (Scott (1992) TIBS 17:241).

Because of the difficulty in isolating consensus sequences with the drug targets, several variations of the standard selection protocol, termed P+LS,2(PL+S) by Smith (Smith et al. (1993) Methods Enzymol 217:228; Table 1), were used in an attempt to identify low affinity ligands for B-D-GRP. The probability that a library contains a low affinity ligand for a target is much higher than the probability of it containing a high affinity ligand. One protocol variant, termed 3(P+LS), used a high concentration of biotinylated D-GRP bound to streptavidin on a plate as the target for all three rounds of affinity purification, enabling multivalent interactions between the bacteriophage, which displays about four copies of each peptide, and two or more target molecules bound to a single streptavidin. In another variant, the bacteriophage were eluted from the target with excess target instead of acid, thereby reducing background elution and increasing specificity (O'Neil et al. (1992) Proteins: Struct. Funct. Genet. 14:509). These protocol variants failed to isolate consensus motifs for D-GRP from the libraries (Table 1).

The lack of a consensus sequence with the small B-D-GRP and B-D-neuRL2 targets was not surprising, since it is known that these libraries have a much higher probability of yielding ligands for large targets (Smith et al. (1993) Methods Enzymol 217:228). Interestingly, peptide ligands selected by large targets generally bind in the clefts of the targets, presumably because only clefts can form simultaneous contacts with several sides of a peptide ligand, thereby enabling a high affinity interaction. Targets of less than 30 amino acids in length, such as B-D-GRP and B-D-neuRL2, contain very small surfaces and are unlikely to contain clefts that may be necessary for high affinity interactions with ligands. However, ligands have been successfully selected for two targets smaller than B-D-GRP and B-D-neuRL2 (Saggio et al. (1993) Biochem. J 293:613). Cyclic peptide ligands were inadvertently selected for binding to the biotinyl lysine group of a biotin-labeled protein (Saggio et al., supra), and peptides were selected for binding to a 9-amino acid peptide (Sasaki et al. (1996) Tetrahedron Lett 37:85-88).

G. Chemical synthesis of B-D-IL-8 and X-B-D-IL-8 D-peptide targets

A 69-amino-acid N-terminally truncated form of the D-enantiomer of IL-8, D-IL-8(4-72), was obtained from Dr. Ian Clark-Lewis. D-IL-8(4-72) was chemically labeled with biotin to give B-D-IL-8, and a portion of this was chemically cross-linked to give X-B-D-IL-8 (supra). Cross-linking was necessary to guarantee the existence of dimers under selection conditions, which necessarily use very low B-D-IL-8 concentrations because of stringent washing conditions.

Analysis of the modified proteins by protein electrophoresis showed that D-IL-8 had the expected mobility, and the biotin-labeling reaction did not significantly alter the mobility of D-IL-8. The latter was not unexpected, because the biotinyl group (formula weight of 227) is only 3% as heavy as D-IL-8(4-72) (formula weight of 8094), and an average of less than one biotin per D-IL-8(4-72) was expected. X-B-D-IL-8 contained approximately one third dimer (formula weight of 16,200) and two thirds monomer by weight.

H. Preparation of B-S-protein control target

Since the bivalent B-MAb control target has the potential to form high affinity multivalent interactions with a bacteriophage (which displays about four copies of a peptide variant), it may have formed high affinity interactions with the selected bacteriophage. It was therefore not an ideal control target to test whether the selection and amplification protocols were able to select low affinity binders for drug targets. An alternative control target, S-protein, was therefore selected because this target is more like the drug targets in that it is small, monovalent, and only forms low affinity interactions with peptides selected from peptide libraries.

S-protein was chemically labeled with biotin to give B-S-protein. Analysis of the labeled S-protein by protein electrophoresis showed that it had the expected mobility, with the biotin-labeling reaction not significantly altering the mobility of S-protein. This was not unexpected, because the biotinyl group (formula weight of 227) is only 2% as heavy as S-protein (formula weight of approximately 11 000).

I. Selection and amplification of peptide ligands for B-D-IL-8, X-B-D-IL-8 and B-S-protein targets

Selection and amplification experiments using the 6-mer and 15-mer peptide libraries and the B-D-IL-8, X-B-D-IL-8 and B-S-protein targets were performed as described above. The standard three rounds of selection and amplification yielded consensuses for S-protein, but not for the B-D-IL-8 and X-B-D-IL-8 targets.

Almost all of the selected 6-mer peptide clones in the experiment with S-protein shared sequence identity with the known consensus sequence of 6-mer peptides selected by S-protein (Smith et al. (1993) Gene 128:37-42), confirming the efficacy of the method with small monomeric targets. The different selected sequences are shown in Figure 3. A very different sequence was efficiently selected from the 15-mer library (Figure 4). This was unexpected, but not incompatible with published experiments, because selection experiments for S-protein with a 15-mer library have not been reported. The 15-mer sequence does not closely resemble the sequence of S-peptide (Figure 4), a known ligand for S-protein.

Given that varying the incubation and elution conditions had failed to yield a consensus sequence for B-D-GRP, it was decided to try a different variation in conditions with the B-D-IL-8 targets. Four rounds of selection and amplification were carried out instead of the usual three.

After three rounds, consensus sequences were once again evident for the control B-S-protein target, but not the B-D-IL-8 targets. However, after four rounds, a single consensus sequence was successfully identified from the 6-mer and 15-mer sequences with both the B-D-IL-8 and X-B-D-IL-8 targets (two representative DNA sequences are shown in Figure 5, all selected peptide sequences are shown in Figures 6 and 7, and the consensus sequence alignment is shown in Figure 8). Interestingly, when the 6-mer and 15-mer sequences were aligned, three of the four amino acids at the C-terminal end of the random 15-mer insert matched the corresponding invariant amino acids (gly-ala-ala-gly) encoded by the vector sequence of the 6-mer library (Figure 8). The consensus sequence is therefore drawn to include these vector sequences, although evidence for their importance is not as strong as for the variable sequence of the 6-mers. The first two positions of the consensus are not absolutely conserved, but they are always basic amino acids. The remaining positions are neutral amino acids, with trp and ile being absolutely conserved. This suggests that electrostatic and hydrophobic interactions are important for binding, and that the peptides bind to an acidic region of D-IL-8.
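As a rough illustration of how a consensus of this kind can be read off a set of aligned clones, the sketch below marks each alignment column as absolutely conserved, always basic, or variable. The sequences are hypothetical toy data invented to mirror the pattern described above; they are not the clones of Figures 6 to 8. The function names and the choice to count lys, arg and his as basic are likewise assumptions.

```python
from collections import Counter

# Residues treated as basic here (an assumption: lys, arg and his).
BASIC = set("KRH")


def column_summary(aligned_peptides):
    """Return (most common residue, its frequency, all-basic flag) per column."""
    length = len(aligned_peptides[0])
    if any(len(p) != length for p in aligned_peptides):
        raise ValueError("peptides must be pre-aligned to equal length")
    summary = []
    for column in zip(*aligned_peptides):
        residue, count = Counter(column).most_common(1)[0]
        all_basic = all(r in BASIC for r in column)
        summary.append((residue, count / len(column), all_basic))
    return summary


def consensus_string(summary):
    """Render conserved residues as letters, basic-only columns as '+', others as 'x'."""
    symbols = []
    for residue, frequency, all_basic in summary:
        if frequency == 1.0:
            symbols.append(residue)
        elif all_basic:
            symbols.append("+")
        else:
            symbols.append("x")
    return "".join(symbols)


# Hypothetical toy sequences, NOT data from the figures.
toy_clones = ["KRWSIA", "RKWTIA", "KKWSIG", "RRWAIA"]
toy_consensus = consensus_string(column_summary(toy_clones))
print(toy_consensus)
```

For the toy input, the first two columns vary but are always basic, while trp and ile are invariant, which is the kind of pattern the text describes for the selected ligands.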

It was theoretically possible that the identified consensus represented a type of background binding that only appears after four rounds, since four rounds have not been carried out with D-peptide targets before. However, prior selection and amplification experiments have not detected a single sequence that perfectly matches the consensus. Furthermore, this consensus does not match any known background sequences, including those that bind to plastic, streptavidin or biotin, so it is very likely to represent bona fide ligands of D-IL-8.

J. General conclusions

The goal of this work was to invent a method for ligand and drug discovery that may enable one to rapidly discover drug candidates for protein targets. Such a method was tested with three different drug targets and two different control targets. The results presented support the feasibility of the method. Two general conclusions may be drawn:

(i) The synthesis of several extracellular protein targets in mirror image (D-protein) form is feasible.

(ii) The selection and amplification of short consensus L-peptide sequences for a D-peptide target using standard bacteriophage libraries and methods is feasible, as demonstrated using the B-D-IL-8 target.

By the laws of stereochemistry, the D-peptide enantiomers of the L-peptide sequences selected with B-D-IL-8 are candidate ligands for IL-8.
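The step from a selected L-peptide to its D-peptide enantiomer leaves the sequence order unchanged and inverts only the chirality of each residue. A minimal sketch follows, using the common notation in which lowercase one-letter codes denote D-residues; this notation, the function name and the example sequence are assumptions made for illustration and do not come from the text.

```python
# Glycine has no chiral alpha-carbon, so it is identical in both forms.
ACHIRAL = {"G"}


def mirror_image_notation(l_peptide):
    """Write the D-enantiomer of an L-peptide given in one-letter code.

    The residue order is kept as-is (this is the enantiomer, not a
    retro-inverso analogue). Lowercase marks a D-residue. For ile and
    thr, which have a second stereocentre, the true enantiomer inverts
    both centres; that detail is not captured by the notation.
    """
    converted = []
    for residue in l_peptide.upper():
        if not residue.isalpha():
            raise ValueError(f"unexpected character in sequence: {residue!r}")
        converted.append(residue if residue in ACHIRAL else residue.lower())
    return "".join(converted)


# Hypothetical example sequence, not one of the selected ligands.
d_form = mirror_image_notation("KRWGIA")
print(d_form)
```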

K. Potential contacts between selected L-peptide ligands and D-IL-8

Results demonstrated that the same consensus sequence was selected with either the monomeric IL-8 at very low concentrations (due to washing) or the covalent dimer, so the binding site may reside in the monomer. However, the existence of dimeric biotinylated targets stabilized by the streptavidin tetramers or selected bacteriophage cannot be ruled out, and it is conceivable that the selected peptides are cationic because they may bind in the anionic groove of a D-IL-8 dimer.

It is also possible that the selected peptides recognize monomeric IL-8. If they bind to a site on the monomer that is not disrupted by IL-8 dimerization, it is possible that an IL-8 dimer may bind two identical peptides. This would imply that a dimer of the selected peptide connected covalently with a spacer arm of appropriate length would have a much higher affinity for IL-8 dimers than would a monomeric selected peptide. Such a dimeric (or oligomeric) peptide ligand can be readily synthesized commercially by the multimeric antigenic peptide (MAP) method (Tam et al. (1988) PNAS 85:5409), and may be a more potent drug than the monomeric peptide ligand. It may also find utility in studies testing the effect of increased dimer to monomer ratios of IL-8. In the future, it may be possible to determine the peptide binding site on IL-8 from alterations in the known NMR spectrum of IL-8.

K. Improving the biological selection and amplification methods

The standard three rounds of selection and amplification from the 6-mer and 15-mer libraries routinely identified ligands for the two control targets, but never identified consensuses for the four D-peptide drug targets (supra). However, four rounds of selection and amplification did yield 6-mer and 15-mer consensuses for the two D-IL-8 targets. Since four rounds have not been used for the B-D-GRP and B-D-neuRL2 targets, it would be worthwhile to attempt four or more rounds with these and any subsequent targets. Nevertheless, it was more difficult to isolate ligands for the small D-peptide targets than the larger control targets. Other workers have also been unable to select ligands for certain targets, and have suggested three potential ways of improving the selection methodology to overcome this problem (Mattheakis et al. (1994) PNAS 91:9022; Scott (1992) TIBS 17:241; and Smith et al. (1993) Methods Enzymol 217:228):

(i) The conformations of the displayed random peptides could be stabilized by a disulphide bridge formed from two cysteines on either side of the insert. This has been shown to allow higher affinity peptides to be selected (Saggio et al. (1993) Biochem. J 293:613; Scott (1992) TIBS 17:241; and O'Neil et al. (1992) Proteins: Struct. Funct. Genet. 14:509). Indeed, a potential disulphide bridge was isolated in some of the selection experiments presented here.

(ii) The diversity of the libraries could be increased by the simultaneous expression of greater numbers of peptides. The record for the number of peptides displayed simultaneously has recently increased from the 10^8 variants in bacteriophage display libraries to 10^12 variants in polysome display libraries (Mattheakis et al., supra).

(iii) The binding affinities of selected and amplified ligands could be further increased by introducing repeated mutagenesis steps into the amplification process, thereby allowing true evolution of ligands (Mattheakis et al., supra). This has been done in RNA evolution experiments and can be readily adapted to peptide libraries (Lippman et al. (1993) Science 259:631).

Ideas (i) to (iii) could all be combined to increase the power of the selection and amplification procedure. Such a method may be successful in identifying high affinity ligands for D-GRP and D-neuRL2. The mutation, selection and amplification methods are roughly analogous to the immune system's use of germ line and recombinational diversity, followed by selection and amplification for the initial synthesis of antibodies specific to an antigen during the primary immune response, while the evolution methods are more analogous to the immune system's completed response to an antigen that, in addition to germ line and recombinational diversity and selection and amplification, incorporates somatic mutation and affinity maturation.

L. Comparison of enantiomeric ligand selection with existing drug discovery technology

The potential advantages of enantiomeric ligand selection over alternative drug discovery methods are great. One of the non-biological screening methods, published during the course of this work, also enables the selection of D-peptide ligands for L-peptide targets via a very different approach from enantiomeric ligand selection. In this approach, a chemically synthesized D-peptide library is directly screened for ligands that bind to L-protein targets, without any ligand amplification step (Lam et al. (1993) Gene 137:13-16; and Dooley et al. (1994) Science 266:2019). Since L-peptide targets may be used (which can be synthesized by ribosomes), there is no limitation on the size of the target. Furthermore, both extracellular and intracellular proteins may be targeted in the future if a library can be constructed with hydrophobic analogs of peptides capable of crossing membranes.

By combining (i) enantiomeric ligand selection, (ii) polysome libraries containing at least 10^12 variants (7), and (iii) true in vitro evolution, it should become possible to screen a large fraction of the possible random 15-mer cyclic D-peptide sequences, a total of 20^15 = 3.3 x 10^19 variants, for binding to a target. Such massive molecular variation rivals that of the immune system, and should be sufficient to virtually guarantee the isolation of a high affinity D-peptide ligand and drug candidate for the employed drug target.
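The size of the sequence space and the share of it that a single library could sample can be worked out directly. The sketch below is a simple upper-bound calculation that assumes 20 amino acids per position and that every library member is a distinct sequence; the function names are illustrative only.

```python
AMINO_ACIDS = 20  # standard genetically encoded amino acids


def sequence_space(length, alphabet=AMINO_ACIDS):
    """Number of possible peptides of the given length."""
    return alphabet ** length


def max_coverage(library_size, length):
    """Upper bound on the fraction of sequence space a library samples.

    Assumes every library member is a distinct sequence, which real
    libraries only approximate.
    """
    return min(1.0, library_size / sequence_space(length))


six_mer_space = sequence_space(6)
fifteen_mer_space = sequence_space(15)

print(f"6-mer space:  {six_mer_space:.2e}")
print(f"15-mer space: {fifteen_mer_space:.2e}")
print(f"10^8 library vs 6-mers:   {max_coverage(10**8, 6):.2e}")
print(f"10^12 library vs 15-mers: {max_coverage(10**12, 15):.2e}")
```

Under these assumptions, a 10^8-member library can in principle span the 6.4 x 10^7 possible 6-mers, whereas a single 10^12-member library samples only about 3 in 10^8 of the 15-mer space without further rounds of mutagenesis.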