Title:
SYNTHESIS OF ENCODED POLYMERS
Document Type and Number:
WIPO Patent Application WO/1994/013623
Kind Code:
A1
Abstract:
Conjugates and methods of producing the conjugates and mixtures thereof are disclosed. Conjugates are comprised of an active polymer made up of monomer units selected from the group consisting of the monomer units of peptides and/or peptoids, and an encoding polymer comprised of encoding monomers wherein the encoding polymer corresponds to the active polymer, and a coupling moiety covalently coupled to the active polymer and the encoding polymer. In accordance with the synthesis methodology, mixtures of large numbers of conjugates are produced by providing a coupling moiety and covalently binding it to an active monomer and an encoding monomer. Additional monomer units are added to the active monomer to create an active polymer and additional monomer units are added to the encoding monomer to produce an encoding polymer until the desired length is reached for the active polymer. Mixtures of conjugates attached to support bases can be used to assay a sample. The sample is brought into contact with the conjugates and a determination is made with respect to which active proteins bind to receptor sites within the sample. When active binding proteins are determined, the encoding polymer associated with the active polymer is sequenced, and by deduction, the sequence of the active polymer is determined.

Inventors:
ZUCKERMANN RONALD N
KERR JANICE M
Application Number:
PCT/US1993/012013
Publication Date:
June 23, 1994
Filing Date:
December 10, 1993
Assignee:
CHIRON CORP (US)
International Classes:
C07H21/02; C07C237/22; C07H21/00; C07H21/04; C07K1/04; C40B70/00; (IPC1-7): C07C237/00; C07H21/00; C07K1/00; C07K5/00; C07K7/00; C07K13/00; C07K17/00
Foreign References:
US4359353A 1982-11-16
Other References:
Proceedings of the National Academy of Sciences USA, Vol. 89, issued June 1992, S. BRENNER et al., "Encoded Combinatorial Chemistry", pages 5381-5383, see the entire document.
Science, Vol. 257, issued 17 July 1992, I. AMATO, "Speeding Up a Chemical Game of Chance", pages 330-331, see the entire document.
Proceedings of the National Academy of Sciences USA, Vol. 90, issued November 1993, M.C. NEEDELS et al., "Generation and Screening of an Oligonucleotide-Encoded Synthetic Peptide Library", pages 10700-10704, see the entire document.
Science, Vol. 254, issued 06 December 1991, P.E. NIELSEN et al., "Sequence-Selective Recognition of DNA by Strand Displacement with a Thymine-Substituted Polyamide", pages 1497-1500, see page 1498.
BioTechniques, Vol. 13, No. 3, issued September 1992, R.A. HOUGHTEN et al., "The Use of Synthetic Peptide Combinatorial Libraries for the Identification of Bioactive Peptides", pages 412-421, see the entire document.
Nature, Vol. 354, issued 07 November 1991, K.S. LAM et al., "A New Type of Synthetic Peptide Library for Identifying Ligand-Binding Activity", pages 82-84.
Science, Vol. 249, issued 27 July 1990, J.J. DEVLIN et al., "Random Peptide Libraries: a Source of Specific Protein Binding Molecules", pages 404-406.
See also references of EP 0675873A4
Claims:
CLAIMS
1. An assay conjugate, comprising: an active polymer comprising monomers selected from the group consisting of peptide and peptoid monomers; an encoding polymer comprising encoding monomers, wherein the encoding polymer corresponds to and allows identification of the active polymer; and a coupling moiety covalently coupled to the active peptide and the encoding polymer.
2. The conjugate of claim 1, wherein the coupling moiety comprises a solid support.
3. The conjugate of claim 1, wherein the coupling moiety comprises a soluble linking group.
4. The conjugate of claim 1, wherein all active polymers coupled to a single selected solid support are identical.
5. The conjugate of claim 1, wherein the active polymer comprises a polypeptide and the encoding polymer comprises a DNA or RNA oligonucleotide.
6. The conjugate of claim 1, wherein the active polymer comprises a polypeptoid and the encoding polymer comprises a DNA or RNA oligonucleotide.
7. The conjugate of claim 6, wherein the polypeptoid comprises a polymer of monomers of the formula: wherein R is alkyl of 2-6 carbon atoms, haloalkyl of 1-6 carbon atoms wherein halo is F, Cl, Br, or I, alkenyl of 2-6 carbon atoms, alkynyl of 2-6 carbon atoms, cycloalkyl of 3-8 carbon atoms, alkoxyalkyl of 2-8 carbon atoms, aryl of 6-10 carbon atoms, arylalkyl of 7-12 carbon atoms, arylalkyl of 7-12 carbon atoms substituted with 1-3 radicals independently selected from halo and nitro and hydroxy, aminoalkyl of 1-6 carbon atoms, hydroxyalkyl of 1-6 carbon atoms, carboxy, carboxyalkyl of 2-6 carbon atoms, carboalkoxyalkyl of 3-10 carbon atoms, carbamyl, carbamylalkyl of 2-6 carbon atoms, imidazolyl, imidazolylalkyl of 4-10 carbon atoms, pyridyl, pyridylalkyl of 6-10 carbon atoms, piperidyl, piperidylalkyl of 5-10 carbon atoms, indolyl, or indolylalkyl of 9-15 carbon atoms.
8. A mixture of conjugates of claim 1, wherein the mixture comprises at least two distinct active polymers, the coupling moiety is a solid support, distinct active polymers are covalently coupled to separate solid supports and a distinct encoding polymer corresponding to each active polymer is covalently coupled to the support coupled to its corresponding active polymer.
9. The mixture of claim 8, wherein the mixture comprises at least ten distinct active polymers.
10. The mixture of claim 9, wherein the mixture comprises at least 100 distinct active polymers.
11. A conjugate, comprising: a biologically active peptide comprised of five or more amino acids; an encoding polymer comprised of nucleic acids wherein one or more nucleic acids within the encoding polymer correspond to and can be readily identified with the amino acids of the active polymer; and a coupling moiety covalently coupled to the active peptide and the encoding polymer.
12. The conjugate of claim 11, wherein the amino acids are selected from the group consisting of alanine, cysteine, aspartic acid, glutamic acid, phenylalanine, glycine, histidine, isoleucine, lysine, leucine, methionine, asparagine, proline, glutamine, arginine, serine, threonine, valine, tryptophan, and tyrosine.
13. The conjugate of claim 11, wherein the encoding polymer includes three nucleotides for each amino acid of the active polymer and the three nucleotides are nucleotides which naturally encode the corresponding amino acid of the active polymer.
14. The conjugate of claim 11, wherein the coupling moiety comprises a solid support.
15. The conjugate of claim 14, wherein the solid support is in the form of a spherical bead having a diameter of less than one centimeter.
16. The conjugate of claim 11, wherein the coupling moiety comprises a soluble linking group.
17. A method of synthesizing an encoded peptide or peptoid polymer, the method comprising: a) providing a coupling moiety; b) coupling to the coupling moiety an active monomer, and coupling to the coupling moiety an encoding monomer which corresponds to the active monomer to form a conjugate having a bound active monomer and a bound encoding monomer; and c) repeating step b) until an active polymer of the desired length is obtained.
18. The method of claim 17 wherein the active monomers are amino acids and the encoding monomers are nucleic acid bases.
19. A method of synthesizing a mixture of encoded polymers, the method comprising: a) providing a plurality of coupling moieties; b) dividing the plurality of coupling moieties into a plurality of aliquots; c) within each aliquot, coupling to each coupling moiety an active monomer, and coupling to each said coupling moiety an encoding monomer which corresponds to the active monomer to form a plurality of conjugates having a bound active monomer and a bound encoding monomer, each in a separate aliquot, wherein different aliquots may contain different active monomers; d) combining the aliquots of conjugates to form a mixture of conjugates; and e) repeating steps b)-d) until a mixture of active polymers of the desired length is obtained.
20. The method of claim 19, wherein the active monomers comprise activated amino acids and the coupling moiety comprises a solid support.
Description:
SYNTHESIS OF ENCODED POLYMERS

Field of the Invention

This invention relates to the fields of biopolymer synthesis and drug design. More particularly, the invention relates to methods for synthesizing libraries of biologically active polymers in association with an included polymer which is encoded to facilitate deciphering.

Background of the Invention

Modern pharmaceutical technology has taken two divergent paths in pursuit of new therapeutic compounds. Rational drug design achieves results by intensive analysis of the molecular structure of binding sites, and designing compounds specifically to complement a desired binding site. For example, one interested in preparing new antihypertensive compounds might analyze the molecular structure of the β-adrenergic receptor binding site using X-ray crystallography and/or advanced NMR techniques, and then synthesize compounds calculated to fit within the binding site and complement the charge distribution.

The other approach is to prepare an enormous library of compounds and select only those compounds which exhibit a desired activity. This approach differs from the traditional pharmaceutical cycle of design/synthesize/test/synthesize variants by conducting the screening step in a massively parallel fashion, screening an enormous number of different compounds simultaneously. The challenge to this approach is first to provide a group of compounds for screening that is sufficiently numerous and diverse to ensure that the activity sought is represented in the group, and second to identify the active compounds at low concentration within the group.

Rutter et al., US 5,010,175 disclosed a method of making diverse mixtures of peptides by adjusting the concentration of each activated peptide in proportion to its reaction rate, in order to obtain a substantially equimolar mixture of peptides. Rutter also disclosed the process of providing a mixture of peptides (having at least 50 different peptides), and selecting one or more peptides having a desired property and separating them from the rest of the peptides.

Zuckermann et al., PCT WO91/17823 disclosed an alternative method for preparing diverse mixtures of oligopeptides on solid-phase resins, and a robotic device for performing the necessary manipulations. In this method, a pool of resin particles is separated into a number of groups (wherein each group is defined as one or more separate reactions), and a different amino acid coupled to the resin in each group. The groups are then mixed together, separated into a number of groups, and again coupled with a different amino acid for each group. This cycle is repeated until the desired number of amino acids per oligopeptide is obtained. One advantage of this approach is that each coupling reaction occurs in isolation from other reactants, which permits one to drive each reaction to completion without carefully adjusting the initial concentration of each reactant. This method also facilitates the preparation of oligopeptides wherein some positions within the peptide chain are held constant, and where some positions are restricted to less than all amino acids. For example, one may use this method to prepare a peptide of the formula X1-X2-X3-Glu-Ala-X4-X5-X6, where Xn can be any amino acid. If desired, one could limit, for example, X3 and X5 to hydrophobic residues. Zuckermann also disclosed that this method may be applied to the synthesis of oligonucleotides, which may then be inserted into cloning and expression vectors for biological expression.
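The split, couple, and recombine cycle described above can be illustrated with a short simulation. This is only a sketch under stated assumptions: the function name `split_and_pool`, the small monomer sets, and the bead count are hypothetical and do not come from the cited application, and the direction of chain growth in real solid-phase synthesis is ignored.

```python
import random

def split_and_pool(n_beads, position_sets, seed=0):
    """Simulate split-and-pool synthesis on resin beads.

    position_sets[i] lists the monomers allowed at position i; a
    one-element list holds that position constant (illustrative model).
    """
    rng = random.Random(seed)
    beads = [[] for _ in range(n_beads)]      # each bead carries one growing chain
    for allowed in position_sets:
        rng.shuffle(beads)                    # pool and mix all beads
        for i, bead in enumerate(beads):      # split into one group per monomer
            bead.append(allowed[i % len(allowed)])  # couple in isolation
    return ["-".join(bead) for bead in beads]

# Hypothetical monomer sets, chosen only for illustration.
ANY = ["Ala", "Gly", "Lys", "Ser"]
HYDROPHOBIC = ["Leu", "Val"]

# Pattern X1-X2-X3-Glu-Ala-X4-X5-X6 with X3 and X5 restricted.
pattern = [ANY, ANY, HYDROPHOBIC, ["Glu"], ["Ala"], ANY, HYDROPHOBIC, ANY]
library = split_and_pool(1000, pattern)
print(library[0])
print(len(set(library)), "distinct sequences on", len(library), "beads")
```

Because every bead sees exactly one monomer per cycle, each bead ends up with a single sequence, and the number of distinct sequences cannot exceed the product of the sizes of the position sets.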

Bartlett et al., PCT WO91/19735 disclosed a variation of the Zuckermann et al. method in which a diverse set of non-amino acid monomers is employed to form mixtures of compounds called "peptoids."

Peptoids sample a different region of physico-chemical parameter space than traditional oligopeptides, depending on the type of linkage between monomers, and may be able to exhibit activities unavailable to peptide libraries due to the diversity (or difference) in side chains.

Houghten, US 4,631,211 disclosed a "tea-bag" peptide synthesis method. The "tea bags" are mesh bags containing resin beads for peptide synthesis. Houghten's method enables one to add the same amino acid to a number of different oligopeptides without mixing the products: a number of "tea bags" may be reacted with an amino acid in a common pot, then separated physically.

Cook, EP 383620 described synthesis of COP-1, a random polymer of Ala, Glu, Lys, and Tyr, having an average molecular weight of 23 kDa and having activity in the treatment of multiple sclerosis. COP-1 is made in the prior art by chemical polymerization of the amino acids. However, Cook described expression from genes made by random polymerization of oligonucleotides, and selection for those clones expressing COP-1 with the highest activity.

Lebl et al., EP 445915 described a machine for performing multiple simultaneous peptide syntheses using a planar support surface. The planar support is, for example, paper or cotton.

Kauffman et al., WO86/05803 disclosed production of peptide libraries by expression from synthetic genes which are partially or wholly "stochastic." Stochastic genes are prepared by polymerizing a mixture of at least three oligonucleotides (at least heptamers) to form a double-stranded stochastic sequence, and ligating the stochastic sequence into an expression vector.

Lam et al., WO92/00091 disclosed libraries of oligonucleotides, oligopeptides, and peptide/nucleotide chimeras, and methods for screening the libraries for active compounds. However, Lam did not disclose conjugates having an active sequence and a coding sequence.

K.M. Derbyshire et al., Gene (1986) 46:145-52 disclosed a method for "saturation mutagenesis" of a segment of DNA, by synthesizing oligonucleotides using contaminated pools of monomer. Each A, C, G, and T reservoir contained 1/54 parts of each of the other bases. The object was to prepare a DNA segment mixture having one or two mutations per sequence. They did not observe equal frequencies of mutation, presumably due to differences in coupling efficiency. The authors suggested synthesizing sequences using four reservoirs containing pure bases, and one reservoir containing a mixture of all four bases in the concentrations necessary to balance the coupling efficiencies.

J.F. Reidhaar-Olson et al., Science (1988) 241:53-57 disclosed the generation of mutant λ repressor proteins by replacing two codons with random nucleotides (NNG/C). The resulting mutant proteins were assayed for activity to determine which amino acid positions were critical, and which positions should be conserved.
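As an aside on the NNG/C randomization scheme mentioned above, the sketch below enumerates the codons it permits and translates them with the standard genetic code. The cited paper is not quoted here; the variable names are illustrative, and the compact TCAG-ordered table is simply a conventional way of writing the standard code.

```python
from itertools import product

# Standard genetic code in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
GENETIC_CODE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# NNG/C: any base in positions 1 and 2, only G or C in position 3.
nns_codons = [a + b + c for a, b in product(BASES, repeat=2) for c in "GC"]
encoded = {GENETIC_CODE[codon] for codon in nns_codons}
stop_codons = [codon for codon in nns_codons if GENETIC_CODE[codon] == "*"]

print(len(nns_codons), "codons")
print(len(encoded - {"*"}), "amino acids covered")
print("stop codons:", stop_codons)
```

Restricting the third position in this way halves the codon set while still covering every natural amino acid and leaving a single stop codon.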

I.S. Dunn et al., Prot Eng (1988) 2:283-91 disclosed the use of random polynucleotides to generate mutant β-lactamase α-peptides, some of which exhibited properties superior to the native sequence α-peptide.

A.R. Oliphant & K. Struhl, Nuc Acids Res (1988) 16:7673-83 disclosed the use of random polynucleotides to investigate promoter function. A section of random polynucleotide was inserted into the -35 to -10 region of a gene conferring drug resistance in E. coli, and the transformants screened for resistance. Survivors were cloned and sequenced to provide a functional consensus sequence.

F.W. Studier, Proc Natl Acad Sci USA (1989) 86:6917-21 disclosed a method for sequencing large volumes of DNA by random priming of cosmid libraries.

A.R. Oliphant et al., Proc Natl Acad Sci USA (1989) 86:9094-98 disclosed the generation of β-lactamase mutants having altered properties, by cloning a random polynucleotide into the β-lactamase gene.

D.K. Dube et al., Biochem (1989) 28:5703-07 disclosed the generation of β-lactamase mutants having altered properties, by cloning a random polynucleotide into the β-lactamase gene.

R.A. Owens et al., Biochem Biophys Res Comm (1991) 181:402-08 disclosed the selection of an HIV protease inhibitor from a library of 240,000 tetrapeptides (in 22 mixtures). The mixtures were prepared by the "mixed resin" technique.

These techniques enable one to prepare libraries of diverse compounds. However, the problem of identifying the resulting compounds has seldom been addressed. Oligopeptides are typically sequenced by stepwise cleavage of each amino acid from the parent compound (which is usually immobilized on a resin), with chromatographic analysis of the cleaved moiety. Sensitive techniques are required to distinguish between twenty or more amino acids. Analysis is further complicated when uncommon amino acids are employed (using current techniques), especially when monomers are linked without using amide bonds.

Summary of the Invention

The present invention provides a method of synthesizing true mixtures of diverse oligopeptides and/or peptide-like compounds along with an associated encoding polymer, making it possible to easily analyze those compounds exhibiting a desired activity. The invention involves synthesizing an encoding DNA strand simultaneously with the peptide/peptoid. Each unique peptide/peptoid sequence is associated with its own unique DNA strand to provide the conjugates of the invention. These conjugates are screened to determine which peptide/peptoid compounds exhibit a desired activity, and the active conjugates are analyzed by DNA sequencing methods to determine the attached peptide/peptoid sequence by deduction, i.e., since each DNA sequence is associated with a known peptide/peptoid, once the DNA sequence is determined, the sequence of the peptide/peptoid can be deduced.
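The deduction step described in this summary amounts to a table lookup. The following sketch assumes a hypothetical two-base code and a hypothetical monomer set (including norleucine, "Nle", as an example of a nonconventional residue); none of these assignments are taken from the application, and the function names are illustrative.

```python
# Hypothetical code table: one 2-base codon per active monomer (an assumption).
CODE = {"Ala": "AA", "Gly": "AC", "Lys": "AG", "Leu": "AT", "Nle": "CA"}
DECODE = {codon: monomer for monomer, codon in CODE.items()}

def synthesize_conjugate(monomers):
    """Extend the active polymer and the coding strand in lockstep."""
    active, coding = [], []
    for monomer in monomers:
        active.append(monomer)        # couple the active monomer
        coding.append(CODE[monomer])  # couple its corresponding codon
    return active, "".join(coding)

def deduce_active_sequence(coding_strand, codon_len=2):
    """Recover the active sequence from a sequenced coding strand."""
    if len(coding_strand) % codon_len:
        raise ValueError("coding strand is not a whole number of codons")
    return [DECODE[coding_strand[i:i + codon_len]]
            for i in range(0, len(coding_strand), codon_len)]

active, strand = synthesize_conjugate(["Leu", "Nle", "Gly"])
print(strand)
print(deduce_active_sequence(strand))
```

Only the coding strand is read in the final step; the active polymer itself never needs to be sequenced, which is why residues that are hard to sequence directly pose no extra difficulty in this model.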

Another aspect of the invention is a conjugate comprising a peptide or peptoid coupled to and/or directly associated with a coding polymer (CP), e.g. a nucleic acid (NA). The peptide/peptoid/CP conjugate may be linked directly (i.e., covalently bound either directly or through a small organic molecule), or by linkage to the same support (e.g., by synthesizing both peptide/peptoid and CP strand on the same particle or bead of resin).

An important object of the invention is to provide a chemical synthesis method which allows the production of libraries of peptides and/or peptoids along with a unique encoded polymer such as a DNA strand which makes it possible to readily determine the sequence of the peptide or peptoid.

An advantage of the present invention is that the methodology makes it possible to readily identify and sequence peptides and/or peptoids having desirable biological activities. A feature of the present invention is that sequences of peptoids or peptides which contain nonconventional amino acids can still be readily determined by sequencing associated polymers such as DNA sequences which are simultaneously synthesized with the peptoids and encode them.

These and other objects, advantages and features of the present invention will become apparent to those persons skilled in the art upon reading the details of the structure, synthesis and use as more fully set forth below, reference being made to the accompanying figures forming a part hereof.

Brief Description of the Drawings

Figure 1 is a schematic diagram showing a specific embodiment of a conjugate of the invention which conjugate includes a "binding" strand or active polymer attached to a solid-support substrate which substrate is also bound to an information storage or "coding" strand;

Figure 2 is a schematic flow diagram demonstrating how encoded libraries can be synthesized on beads as the solid-support substrate;

Figure 3 is a schematic diagram showing methods of the synthesis of both solid-phase and solution-phase libraries;

Figure 4 is a schematic diagram showing resin-bound libraries generated by the derivatization of non-hydrolyzable resins;

Figure 5 is an HPLC chromatogram of binding and coding peptide strands simultaneously synthesized via non-hydrolyzable resin linkage;

Figure 6 is an HPLC chromatogram of a coding and binding strand adduct which was synthesized via a hydrolyzable resin linker;

Figure 7 is a plotted graph resulting from ELISA competition of binding sequences versus binding/encoding sequences;

Figure 8 is a schematic diagram showing the analysis of a solid-phase amptide; and

Figure 9 is a schematic flow diagram showing the analysis of a solution-phase amptide.

Detailed Description of Preferred Embodiments

Before the present method of synthesis, conjugates and methods of using such are described, it is to be understood that this invention is not limited to the particular methodologies, conjugates, or methods of use described as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting since the scope of the present invention will be limited only by the appended claims.

It must be noted that as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a peptide" includes mixtures of peptides, reference to "an amino acid" includes mixtures thereof, and reference to "the reaction" includes one or more reactions of the same type as generally understood by those skilled in the art, and so forth.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are now described. All publications mentioned herein are incorporated herein by reference to describe and disclose the specific information in connection with which the reference was cited.

In general, the invention provides a rapid method of synthesizing large numbers of conjugates which conjugates are comprised of a peptide/peptoid sequence, e.g., an amino acid sequence associated with a unique encoding sequence, e.g., a DNA sequence. The conjugates can be readily synthesized and thereafter screened for biological activity, and when activity is found, the particular peptide/peptoid sequence found to be active can be readily identified by its associated encoding (DNA) strand. Each conjugate of the invention is comprised of at least two components with one of the components being the peptide or peptoid sequence which binds to a receptor of interest and the other sequence being a polymer which encodes the binding sequence. The invention may utilize standard amino acids and DNA as encoding monomers to produce a chemically diverse library of solution-phase or solid-phase conjugates. In order to further describe the invention in detail, the following definitions are provided.

A. Definitions

The terms "nucleic acid" and "NA" refer to oligomers constructed from DNA and/or RNA bases which may be sequenced using standard DNA sequencing techniques. The NAs used herein may include uncommon bases so long as such bases are distinguishable from the other bases employed under the DNA sequencing methods to be used and include peptide-nucleic acids (PNAs) (disclosed by Nielsen, P.E., Egholm, M., Berg, R.H. & Buchardt, O., Science (1991) 254, 1497-1500). Such PNAs could serve as coding strands and the detection would be by hybridization. NAs will usually be constructed from monomers linked by phosphodiester bonds, but other similar linkages may be substituted if desired. For example, phosphorothioates may be employed to reduce lability.

The term "peptide" as used herein refers to the 20 commonly-occurring amino acids: alanine (A), cysteine (C), aspartic acid (D), glutamic acid (E), phenylalanine (F), glycine (G), histidine (H), isoleucine (I), lysine (K), leucine (L), methionine (M), asparagine (N), proline (P), glutamine (Q), arginine (R), serine (S), threonine (T), valine (V), tryptophan (W), and tyrosine (Y).

The term "peptoid" as used herein refers to a non-peptide monomer of the general formula (R)n-X-(L)m, where R is a side chain group, n is at least 1, L is a linking group, m is at least 2, and X is a small organic radical. It is preferred to select L radicals that may be individually protected and deprotected. Preferably n will be 1 or 2 and m will be 2. Monomers wherein m is 3 or greater may be used to form branched active polymers. Presently preferred monomers are N-substituted glycine derivatives of the formula

wherein R is alkyl of 2-6 carbon atoms, haloalkyl of 1-6 carbon atoms wherein halo is F, Cl, Br, or I, alkenyl of 2-6 carbon atoms, alkynyl of 2-6 carbon atoms, cycloalkyl of 3-8 carbon atoms, alkoxyalkyl of 2-8 carbon atoms, aryl of 6-10 carbon atoms, arylalkyl of 7-12 carbon atoms, arylalkyl of 7-12 carbon atoms substituted with 1-3 radicals independently selected from halo and nitro and hydroxy, aminoalkyl of 1-6 carbon atoms, hydroxyalkyl of 1-6 carbon atoms, carboxy, carboxyalkyl of 2-6 carbon atoms, carboalkoxyalkyl of 3-10 carbon atoms, carbamyl, carbamylalkyl of 2-6 carbon atoms, imidazolyl, imidazolylalkyl of 4-10 carbon atoms, pyridyl, pyridylalkyl of 6-10 carbon atoms, piperidyl, piperidylalkyl of 5-10 carbon atoms, indolyl, or indolylalkyl of 9-15 carbon atoms. Thus, active polymers composed of these monomers are equivalent to polyglycine having side chains attached at each nitrogen. These and other monomers are described in copending application USSN 07/715,823, incorporated herein by reference, and PCT WO91/19735.

The terms "coding" and "encoding" indicate that one or more coding monomers correspond directly and uniquely to a given active monomer, e.g., conventional nucleic acids encode (in groups of three) the 20 natural amino acids. The number of coding monomers used for each code depends on the number of different coding monomers and the number of different active monomers. Typically, the number of different active monomers used will range from about 5 to about 30. A basis set of 4 coding monomers can encode up to 16 active monomers taken in "codons" of 2 coding monomers. By increasing the coding monomer basis set to five distinct monomers, one can encode up to 25 different peptide/peptoid monomers. A basis set of 4 coding monomers can encode up to 64 peptide/peptoid monomers taken in "codons" of 3 coding monomers. Note that one can make the code degenerate or nondegenerate, and can insert additional coding information into the sequence. For example, one may wish to begin each codon with the same base (e.g., G), using that base only in the first position, thus unambiguously identifying the beginning of each codon. As a practical matter, the group of monomers selected for use as coding monomers will form polymers that are easier to sequence than the active polymers, i.e., the coding monomers may be more readily identified using present day sequencing technology as compared to the monomers of the active polymers. With current technology, the order of preference for coding monomers is nucleic acids > peptides > peptoids. Nucleic acids have the additional advantage that the coding sequence may be amplified by cloning or PCR (polymerase chain reaction) methods known in the art.

The term "active polymer" and/or "binding polymer" refers to a polymer having a desired biological activity. Suitable biological activities include binding to natural receptors, pharmaceutical effects, immunogenicity/antigenicity, and the like. "Immunogenicity" refers to the ability to stimulate an immune response (whole or partial serum-mediated immunity and/or cell-mediated immunity) in a bird or mammal following administration. Antigenicity requires only that the active polymer bind to the antigen-binding site of an antibody. Pharmaceutical activities, for the purposes of this invention, will generally depend on the ability of the active polymer to bind a protein, carbohydrate, lipid, nucleic acid, or other compound present in the subject. For example, an active polymer may bind to a cell surface receptor and compete with the receptor's natural ligand, with or without activation of the receptor. Other useful pharmaceutical activities include cleavage of endogenous molecules (e.g., protease activity, nuclease activity, and the like), catalysis of reactions either primarily or as a cofactor, donation of functional groups (e.g., acyl, ATP, alkyl, and the like), pore formation, and the like. Active polymers comprise a series of monomers which are linked sequentially. The monomers will generally be peptides, peptoids, or carbohydrates in the practice of the instant invention.
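The codon arithmetic in the definition of "coding" above (16, 25, and 64 encodable monomers) follows from raising the basis-set size to the codon length. The sketch below reproduces those figures; the function names are illustrative, and the last function reflects one reading of the "begin each codon with the same base" example, in which the reserved base is excluded from every other position.

```python
def codon_capacity(basis_size, codon_len):
    """Number of distinct codons of a given length over a basis set."""
    return basis_size ** codon_len

def min_codon_length(n_active, basis_size):
    """Shortest codon length able to encode n_active distinct monomers."""
    if basis_size < 2:
        raise ValueError("need at least two coding monomers")
    length = 1
    while basis_size ** length < n_active:
        length += 1
    return length

def punctuated_capacity(basis_size, codon_len):
    """Capacity when one base is reserved as a start marker in position 1.

    Assumption: the remaining codon_len - 1 positions may use only the
    other basis_size - 1 bases.
    """
    return (basis_size - 1) ** (codon_len - 1)

print(codon_capacity(4, 2))       # 4 coding monomers, codons of 2
print(codon_capacity(5, 2))       # 5 coding monomers, codons of 2
print(codon_capacity(4, 3))       # 4 coding monomers, codons of 3
print(min_codon_length(30, 4))    # upper end of the typical monomer range
print(punctuated_capacity(4, 3))  # G-first codons over A, C, T
```

Under that reading, reserving a base for punctuation trades capacity for unambiguous framing, so a larger basis set or longer codons would be needed to cover the same number of active monomers.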

The term "mixture" as used herein refers to a composition having a plurality of similar components in a single vessel.

The term "couple" as used herein refers to formation of a covalent bond.

The term "coupling moiety" refers to a soluble or insoluble support to which can be attached one or more active monomers and the corresponding encoding monomers. Insoluble supports ("solid support means") may be any solid or semi-solid surface which is stable to the reaction conditions required for synthesis of the active and coding polymers, and is suitable for covalently attaching and immobilizing both polymers, for example, most resins commonly employed in DNA and peptide synthesis, such as MBHA, Rink, and the like. The particular resin used will depend upon the choice of coding and active polymers and their associated synthetic chemistries. Soluble coupling moieties are molecules having functional groups to which active and coding monomers may be attached. Each soluble coupling moiety must be able to accommodate at least one coding polymer and at least one active polymer, although the active and coding polymers need not be present in a 1:1 ratio. The soluble coupling moiety may be as simple as an amino acid having a functional group in its side chain, or may be as complex as a functionalized (soluble) polymer.

The term "conjugate" as used herein refers to the combination of any "active polymer" and its associated "coding" polymer. The conjugate may be formed using a "coupling moiety" or by binding both the "active polymer" and "encoding polymer" to the same support surface in close proximity with each other so that the two polymers are "associated" with each other. When both polymers are bound to the same support surface, such as a small bead, the encoding polymer can be readily sequenced off of the bead and the other "active polymers" remaining on the bead will be identified once the encoding sequence is known.

B. Related Libraries and Synthesis Methodologies for producing same.

There are many limitations with the current technologies for probing the receptor-binding properties of peptide libraries. Filamentous bacteriophage libraries offer the largest source of peptide diversity (≈10^7-10^8 different components) of any current technology to date (Scott, J. & Smith, G., Science (1990), 249, 386-390; Devlin, J., Panganiban, L. & Devlin, P., Science (1990), 249, 404-406; Cwirla, S., Peters, E., Barret, R. & Dower, W., Proc. Natl. Acad. Sci. U.S.A. (1990), 87, 6378-6382).

These libraries, however, are limited to the natural set of amino acids, suffer from biological biases (i.e., varying rates of growth, proteolysis, etc.) and also suffer, in practice, from high levels of background binding. The present invention is designed to overcome these difficulties.

Multiple-peptide synthesis technology has substantially increased the ability to generate individual peptides (Geysen, H., Meloen, R. & Barteling, S., Proc. Natl. Acad. Sci. U.S.A. (1984), 81, 3998-4002; Houghten, R., Proc. Natl. Acad. Sci. U.S.A. (1984), 5131-5135; Schnorrenberg, G. & Gerhardt, H., Tetrahedron (1989), 45, 7759-7764; Gausepohl, H., Kraft, M., Boulin, C. & Frank, R. in Peptides: Chemistry, Structure and Biology (Proceedings of the 11th American Peptide Symposium) (1990), eds. Rivier, J. & Marshall, G. (ESCOM, Leiden), pp. 1003-1004; Frank, R. & Döring, R., Tetrahedron (1988), 44, 6031-6040; Fodor, S., Read, J., Pirrung, M., Stryer, L., Lu, A. & Solas, D., Science (1991), 251, 767-773). The synthesis of ≈10^4 individual peptides per cm^2 of glass wafer represents the diversity limit of this technology (Fodor, S., Read, J., Pirrung, M., Stryer, L., Lu, A. & Solas, D., Science (1991), 251, 767-773). A mixed-resin algorithm method (Furka, A., Sebestyen, M., Asgedom, M. & Dibó, G., Int. J. Peptide Protein Res. (1991), 37, 487-493) has recently been used to generate solution-phase libraries (Houghten, R., Pinilla, C., Blondelle, S., Appel, J., Dooley, C. & Cuervo, J., Nature (1991), 354, 84-86) and resin-bound peptide libraries (Lam, K., Salmon, S., Hersh, E., Hruby, V., Kazmiersky, W. & Knapp, R., Nature (1991), 354, 82-84) that contain ≈10^6 and ≈10^7 components, respectively. The solution-phase libraries offer the advantage of providing quantitative receptor-binding information (Zuckermann, R., Kerr, J., Siani, M., Banville, S. & Santi, D.V., Proc. Natl. Acad. Sci. U.S.A. (1992), 89, 4505-4509). Furthermore, these libraries allow the affinity of the solution conformation of a ligand to be determined, a quantity that is essential for rational drug design. An apparatus for the automated synthesis of equimolar peptide mixtures is disclosed in Zuckermann, R.N., Kerr, J.M., Siani, M.A. & Banville, S.C., Int. J. Pep. Pro. Res. (1992), 40, 498-507.

The publications cited and discussed above can be used in producing the active or binding polymer which is used in producing the conjugate of the present invention. Accordingly, the disclosures of all of these publications are incorporated herein by reference in order to disclose peptide and peptoid synthesis methodology. Although the methodology discussed within these references is extremely valuable with respect to the production of large amounts of different types of binding polymers, the mixtures of polymers produced by this methodology are often so large and complex that there are many practical limitations with respect to their actual analysis and use. The present invention can be readily applied with such synthesis methodologies in order to provide an efficient, commercially practical method of analyzing the proteins produced using such methodology.

Neither the mixed-resin nor the solution-phase method, however, allows incorporation of many non-standard amino acids, because of the limitations of peptide analysis. Resin-bound peptide libraries, in particular, suffer from a relatively slow rate of analysis (peptide sequencing at 3 beads per day) and are limited in complexity to ≈10⁷ beads/ml. In order to generate a "complete" peptide library, there must be multiple copies (>10) of any given peptide sequence. This becomes problematic at the sequencing stage because the same "hit" sequence may appear multiple times. The alternative is to work with libraries that are not complete, at the risk of losing sequences that bind.
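The trade-off between completeness and bead count can be illustrated with a short calculation. The following Python sketch is illustrative only: the choice of pentapeptides built from 20 amino acids is an assumed example, and the Poisson estimate of the chance that a sequence is absent is an approximation introduced here, not a formula taken from the cited publications.

```python
import math

def library_size(n_monomers: int, length: int) -> int:
    """Number of distinct sequences of a given length."""
    return n_monomers ** length

def beads_required(n_sequences: int, copies: int = 10) -> int:
    """Beads needed so each sequence is present `copies` times on average."""
    return n_sequences * copies

def prob_sequence_missing(n_sequences: int, n_beads: int) -> float:
    """Poisson approximation (an assumption of this sketch): chance that a
    given sequence is carried by no bead when beads are assigned at random."""
    return math.exp(-n_beads / n_sequences)

# Assumed example: pentapeptides built from 20 amino acids.
size = library_size(20, 5)
beads = beads_required(size, copies=10)
print(size, beads, prob_sequence_missing(size, beads))
```

Under these assumptions a tenfold redundant pentapeptide library already needs more beads than the stated limit of about 10⁷ beads per ml, which is the practical difficulty described above.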

When using the methodology of the present invention, the sequence of a biologically active protein can be determined even without isolating the protein of interest. This can be done by synthesizing large numbers of different proteins on large numbers of different support surfaces such as small beads. An encoding polymer is attached to each bead to identify the protein on that bead. A sample to be tested is then brought into contact with the beads, and the beads are observed to determine which proteins bind to a receptor site in the sample. The bead having the receptor bound thereon is analyzed by sequencing the coding polymer which has also been synthesized on the bead. When the encoding polymer has been sequenced, the sequence of the active polymer, which may be a peptide, can be readily deduced. Thus, the present invention makes it possible to determine the activity and sequence of an active polymer, such as a biologically active peptide, without ever isolating the peptide.

C. General Methodology

This invention describes a methodology for the synthesis and screening of large synthetic polymer libraries that contain non-standard amino acids and even non-amide based polymers. The strategy utilizes a modified mixed-resin peptide synthesis methodology to simultaneously synthesize two polymer sequences: one polymer strand (the "binding" strand) is synthesized for the intended purpose of receptor binding, and the second strand (the "coding" strand) contains standard amino acids or deoxyribonucleotides that encode the binding strand (Figure 1). The ability to decipher the binding sequence by analysis of the coding strand with standard peptide or oligonucleotide techniques allows the inclusion of a wide variety of novel building blocks and conformational constraints into a diverse ligand library.

This invention also describes a methodology to increase the size (>10⁸) and screening rate of a ligand library. The method uses two polymers as above, but specifically utilizes an oligodeoxyribonucleotide for the "coding" strand. The use of DNA as the coding strand allows for an increased sensitivity of detection (fmol vs. pmol for peptide analysis). This increased sensitivity allows for a larger library size since the amount of polymer needed for detection is reduced dramatically. The rate of sequence determination of receptor binders is increased since many samples can be analyzed in parallel.

In order to couple a polymer's sequence information with a peptide or oligonucleotide sequence, there needs to be a method that unambiguously correlates the two polymers with each other. Thus, when any particular non-standard amino acid (or other monomer) is added to a "binding" polymer chain, the corresponding information (amino acid or nucleotide monomer) must also be added to the "coding" strand. A "genetic code" is thus established (Table 1) where each binding monomer corresponds to one, or a multiple, of standard amino acids or nucleotides on the coding strand. For example, the use of three standard amino acids or nucleotides, in a 3:1 ratio with a novel monomer, would allow for the unambiguous representation of 27 (that is, 3³) novel monomers.
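The 27-monomer example can be sketched in Python as follows. This is a minimal illustration, not the code of Table 1: the coding alphabet (Ala, Gly, Phe) and the monomer names X1 to X27 are placeholders assumed here, and the assignment of codons to monomers is arbitrary.

```python
from itertools import product

# Hypothetical choice of three standard coding residues.
CODING_ALPHABET = ("Ala", "Gly", "Phe")
CODON_LENGTH = 3  # three coding residues per binding monomer (3:1 ratio)

def build_code(monomers, alphabet=CODING_ALPHABET, codon_length=CODON_LENGTH):
    """Assign each binding monomer a unique codon of coding residues."""
    codons = list(product(alphabet, repeat=codon_length))
    if len(monomers) > len(codons):
        raise ValueError("not enough codons for the number of monomers")
    return dict(zip(monomers, codons))

def encode(binding_seq, code):
    """Coding strand that records a binding sequence."""
    return [residue for monomer in binding_seq for residue in code[monomer]]

def decode(coding_seq, code, codon_length=CODON_LENGTH):
    """Recover the binding sequence from a sequenced coding strand."""
    if len(coding_seq) % codon_length:
        raise ValueError("coding strand length is not a multiple of the codon length")
    reverse = {codon: monomer for monomer, codon in code.items()}
    return [reverse[tuple(coding_seq[i:i + codon_length])]
            for i in range(0, len(coding_seq), codon_length)]

# 27 placeholder names standing in for novel monomers.
monomers = [f"X{i}" for i in range(1, 28)]
code = build_code(monomers)
binding = ["X5", "X27", "X1"]
coding = encode(binding, code)
print(coding)
print(decode(coding, code))
```

Because every codon is used for exactly one monomer, sequencing the coding strand is sufficient to recover the binding sequence without analyzing the binding strand itself.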

TABLE 1. Custom genetic code.

The synthesis of coded libraries requires a modified mixed-resin algorithm (Figure 2). The resin beads are divided into equal portions, a unique monomer is added to the "binding" strand, followed by the coupling of a corresponding amino acid or nucleotide to the "coding" strand. The resin aliquots are then combined to generate a mixture. A set of compatible protecting groups is thus required to preferentially deprotect and extend each strand independently.
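The divide, couple and recombine cycle can be simulated in a few lines of Python. This is a sketch under stated assumptions: the three monomer names, the 1:1 code, the bead count and the use of a seeded random shuffle to model mixing are all hypothetical, and protecting-group chemistry is not modeled. Each bead is represented by two lists that record the order in which residues were added to its two strands.

```python
import random

# Hypothetical 1:1 code: each binding monomer is recorded by one coding residue.
CODE = {"X1": ("Gly",), "X2": ("Ala",), "X3": ("Phe",)}

def split_and_mix(n_beads, code, rounds, seed=0):
    """Simulate the modified mixed-resin cycle on beads carrying two strands."""
    rng = random.Random(seed)
    monomers = list(code)
    # Each bead is (binding strand, coding strand), both initially empty.
    beads = [([], []) for _ in range(n_beads)]
    for _ in range(rounds):
        rng.shuffle(beads)  # models mixing of the pooled resin
        # Divide the resin into equal portions, one per monomer.
        portions = [beads[i::len(monomers)] for i in range(len(monomers))]
        for monomer, portion in zip(monomers, portions):
            for binding, coding in portion:
                binding.append(monomer)       # extend the binding strand
                coding.extend(code[monomer])  # then the coding strand
        # Recombine the aliquots into one mixture.
        beads = [bead for portion in portions for bead in portion]
    return beads

library = split_and_mix(n_beads=900, code=CODE, rounds=2)
distinct = {tuple(binding) for binding, _ in library}
print(len(distinct), "distinct binding sequences")
```

In this model every bead carries a single binding sequence together with the coding sequence that records it, which is the property the screening step relies on.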

Two synthesis formats are possible for amptide libraries, one that generates resin-bound libraries and one that generates solution-phase libraries (Figure 3). Resin-bound libraries can be synthesized using non-hydrolyzable linkers that are derivatized with the "binding" and "coding" monomer strands. Solution-phase libraries can be synthesized as a 1:1 polymer:peptide/DNA conjugate via a hydrolyzable linker attached to the resin.

Peptide as the "Coding" Strand

The use of base-labile Fmoc-protected monomers and acid-labile Nα-Ddz-protected amino acids (Birr, C., Nassal, M., Pipkorn, R., Int. J. Peptide Protein Res., (1979), 13, 287-295), for example, allows for selective deprotection and coupling to two individual polymer strands. Resin-bound libraries can be generated by the derivatization of non-hydrolyzable resins with a 1:1 ratio (or any desired ratio) of Fmoc:Ddz monomers (Figure 4). This introduces two differently protected amino acids that can be extended independently. Solution-phase libraries that contain a 1:1 ratio of binding:coding strands can be synthesized by using a hydrolyzable Fmoc-Lysine(Moz)-OH linker that allows for chain growth at both the α- and ε-amino groups. Amino acids which do not contain functional groups are preferred for the "coding" strand in order to minimize unwanted binding interactions.

The receptor-binding ligand can be identified by bead staining techniques (Lam, K., Salmon, S., Hersh, E., Hruby, V., Kazmiersky, W. & Knapp, R., Nature, (1991), 354, 82-84) and the sequence determined by N-terminal Edman degradation. In order to ensure that only the "coding" strand is sequenced, it is essential that the N-terminus of the "binding" strand be acetylated or otherwise made non-sequenceable.

DNA as the "Coding" Strand

The construction of libraries with DNA as the coding strand is similar to that of libraries with peptides but offers several advantages: the information storage and replicative properties of DNA allow for increased sensitivity of detection, a larger library size and an increased rate of sequence determination. The synthesis of DNA as the coding polymer requires compatibility between the assembly of Fmoc-based monomers and standard DNA chemistry. These synthesis strategies are likely to be compatible ((a) Juby, C., Richardson, C. & Brousseau, R., Tet. Letters, (1991), 32, 879-882; (b) Haralambidis, J., Duncan, L., Angus, B. & Tregear, W., Nucleic Acid Res., (1990), 18, 493-499) (see Table 2). Alternatively, allyl-based protection strategies exist for both peptide (Lyttle, M.H.; Hudson, D., Peptides: Chemistry and Biology (Proceedings of the 12th American Peptide Symposium); Smith, J. and Rivier, J.E., Eds.; ESCOM, Leiden, 1992, pp. 583-584) and oligodeoxyribonucleotide (Hayakawa, Y., Wakabayashi, S., Kato, H. & Noyori, R., J. Am. Chem. Soc., (1990), 112, 1691-1696) synthesis. The assay of solution-phase libraries can be facilitated by using only pyrimidines in the coding strand, thereby avoiding the potential problem of base pairing between individual strands.
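The consequence of a pyrimidine-only coding strand can be checked with a short Python sketch. It assumes that only canonical Watson-Crick pairs (A with T, G with C) matter, which is a simplification introduced here, and the figure of 27 monomers is reused from the genetic-code example purely for illustration.

```python
# Canonical Watson-Crick complements; other pairing modes are ignored
# in this sketch (an assumption, not a statement from the source).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
PYRIMIDINES = ("C", "T")
ALL_BASES = ("A", "C", "G", "T")

def can_pair(base_a: str, base_b: str) -> bool:
    """True if the two bases form a canonical Watson-Crick pair."""
    return COMPLEMENT[base_a] == base_b

def codon_length_needed(n_monomers: int, alphabet_size: int) -> int:
    """Shortest codon length giving at least n_monomers distinct codons."""
    length = 1
    while alphabet_size ** length < n_monomers:
        length += 1
    return length

# No pair of pyrimidines is complementary, so C/T-only strands cannot
# form canonical base pairs with one another.
pairs_possible = any(can_pair(a, b) for a in PYRIMIDINES for b in PYRIMIDINES)
print("pyrimidine-pyrimidine pairing possible:", pairs_possible)

# Cost of the restriction: longer codons for the same number of monomers.
print(codon_length_needed(27, len(PYRIMIDINES)))  # two-letter alphabet
print(codon_length_needed(27, len(ALL_BASES)))    # four-letter alphabet
```

The restriction to two bases therefore trades a longer coding strand for the absence of complementary pairing between strands.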

Strategies for the synthesis of coding and active polymers, as well as for matching the active polymer with a coding sequence to provide a genetic tag, are described in Brenner et al., Proc. Natl. Acad. Sci. U.S.A., 89:5381-5383 (June 1992), which is incorporated herein by reference.

TABLE 2

Compatibility of DNA vs. Peptide Synthesis Chemistry

                               Peptide chemistry             Oligonucleotide chemistry

Permanent protecting groups:   -OH: t-butyl ether            A, C: benzoyl amide
                               -CO2H: t-butyl ester          G: isobutyryl
                               -NH2: t-Boc                   -P=O: cyanoethyl, methyl
                               His, Cys: trityl
                               Arg: sulfonyl

  removed by:                  85% trifluoroacetic acid,     conc. NH4OH,
                               2 hours @ room temp           55 °C, 5 hours

Temporary protecting groups:   9-fluorenylmethoxycarbonyl    4,4'-dimethoxytrityl

  removed by:                  20% piperidine                3% trichloroacetic acid

Cleavage from solid support:   85% trifluoroacetic acid,     conc. NH4OH,
                               2 hours @ room temp           55 °C, 5 hours


Resin-bound libraries can be synthesized by using non-hydrolyzable linkers to attach both the C-terminus of the peptide and the 3'-end of the oligonucleotide to the same bead. Solution-phase libraries can be synthesized as a 1:1 peptide-oligonucleotide conjugate, in which the C-terminus of the peptide is attached to the 3'-end of the oligonucleotide through a hydrolyzable Fmoc-Ser(O-Dmt) linker which is attached to the resin. Binders in the resin-bound peptide libraries can be identified by the bead staining methodology (Lam, K., Salmon, S., Hersh, E., Hruby, V., Kazmiersky, W. & Knapp, R., Nature, (1991), 354, 82-84). Although the peptides are bound to a solid phase, there does not have to be a 1:1 peptide-oligonucleotide ratio since the DNA can be amplified prior to the determination of its sequence. In fact, less DNA is preferred so that there will be less interference with the polymer's binding properties. Once a bead is identified, the DNA sequence is determined (Stahl, S., Hultman, T., Olsson, A., Moks, T., et al., Nucleic Acid Res., (1988), 16, 3025-3038) after PCR amplification or by thermal-cycle sequencing (Figure 8). This requires the inclusion of one or more primer sites neighboring the coding region of the oligonucleotide. Similarly, the use of solution-phase libraries requires isolating the individual sequences from one another. This can be accomplished by restricting the DNA after PCR amplification and inserting it into M13 (or other suitable vector) for clonal isolation and sequencing (Figure 9).
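Once a sequencing read is available, the coding region can be located between its flanking primer sites and split into codons. The Python sketch below is illustrative only: the primer-site sequences, the example read and the codon length of three are hypothetical values chosen for the demonstration, and the read is assumed to be error-free.

```python
def extract_coding_region(read: str, upstream_site: str, downstream_site: str) -> str:
    """Return the bases lying between the two primer sites in a read."""
    start = read.find(upstream_site)
    if start == -1:
        raise ValueError("upstream primer site not found")
    start += len(upstream_site)
    end = read.find(downstream_site, start)
    if end == -1:
        raise ValueError("downstream primer site not found")
    return read[start:end]

def split_codons(region: str, codon_length: int = 3):
    """Split a coding region into consecutive codons."""
    if len(region) % codon_length:
        raise ValueError("coding region length is not a multiple of the codon length")
    return [region[i:i + codon_length] for i in range(0, len(region), codon_length)]

# Hypothetical primer sites and a pyrimidine-only coding region.
UPSTREAM_SITE = "GACTGACT"
DOWNSTREAM_SITE = "AGTCAGTC"
read = UPSTREAM_SITE + "CTTCCTTCC" + DOWNSTREAM_SITE

region = extract_coding_region(read, UPSTREAM_SITE, DOWNSTREAM_SITE)
print(region, split_codons(region))
```

Each codon would then be looked up in the library's genetic code to recover the corresponding binding monomer.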

EXAMPLES

The following examples will provide those skilled in the art with a complete disclosure of how to make and use the invention and are not intended to limit the scope of the invention. Efforts have been made to ensure accuracy with respect to numbers used (e.g., amounts, temperature, etc.), but some experimental error and deviation should be accounted for. Unless indicated otherwise, parts are parts by weight, molecular weight is weight average molecular weight, temperature is in degrees centigrade and pressure is at or near atmospheric.

Example 1

The independent synthesis of two unambiguously correlated sequences has been successfully completed. The subsequent sequence analysis of the "coding" strand has also been demonstrated. For convenience, two peptide sequences were chosen. The "binding" strand was synthesized with Nα-Fmoc-protected amino acids and the "coding" strand was synthesized with Nα-Ddz-protected amino acids. The simultaneous synthesis of the two peptide strands was tested in two formats: 1) resin-bound peptide library synthesis and 2) solution-phase peptide libraries using a hydrolyzable Fmoc-Lys(Moz)-OH linker (Wang, S.S.; Chen, S.T.; Wang, K.T. and Merrifield, R.B., Int. J. Peptide Protein Res., (1987), 30, 662-667). These syntheses were performed on single peptides (not libraries) as a demonstration of research concept and in order to allow the full characterization of the synthesis products.


A. Synthesis of a Resin-Bound Library Model

"Binding" Sequence: Ac-Arg-Leu-Val-Thr-His (Fmoc peptide)

Resin

"Coding" Sequence: H2N-Ala-Ser-Gly-Glu-Phe-Ala (Ddz peptide)

Synthesis Scheme:

After the TFA deprotection, the model library bead has two independently synthesized sequences and is ready for assay. Only the coding strand has a free α-amino group and can be characterized by N-terminal Edman degradation. The binding strand is acetylated and therefore will not interfere with the sequencing. The two peptides were cleaved from the resin with HF, thereby providing both the "binding" and "coding" sequences as free peptides. The amino acid composition, mass spectroscopy and N-terminal sequencing data are consistent with the correct products. (See Figures 5, 6 and 7.)
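The reason the acetylated binding strand does not interfere can be shown with a simplified model of stepwise N-terminal sequencing. The Python sketch below is an idealization assumed for illustration (each cycle releases exactly one residue from every strand with a free N-terminus, with no carry-over or yield loss); the two sequences are those of the model bead above, written from N-terminus to C-terminus.

```python
def edman_calls(strands, cycles):
    """Idealized Edman degradation on a bead carrying several strands.

    `strands` is a list of (n_terminus_blocked, residues) pairs. Each cycle
    releases the N-terminal residue of every unblocked strand; blocked
    (e.g. acetylated) strands release nothing.
    """
    working = [list(residues) for blocked, residues in strands if not blocked]
    calls = []
    for _ in range(cycles):
        released = [strand.pop(0) for strand in working if strand]
        if not released:
            break
        calls.append(released)
    return calls

# Model bead from this example: acetylated binding strand, free coding strand.
BINDING = (True, ["Arg", "Leu", "Val", "Thr", "His"])
CODING = (False, ["Ala", "Ser", "Gly", "Glu", "Phe", "Ala"])

print(edman_calls([BINDING, CODING], cycles=6))

# Without acetylation both strands would release a residue in each cycle,
# and the calls could not be assigned to one strand.
unblocked_binding = (False, BINDING[1])
print(edman_calls([unblocked_binding, CODING], cycles=6)[0])
```

In this model the blocked bead yields one residue per cycle, all from the coding strand, whereas the unblocked bead yields two residues in the first cycle.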

Mass Spectrometry:

Amino Acid Composition:

N-Terminal Edman Sequencing of Resin Beads (coding peptide only):

Example 2

Synthesis of a Solution-Phase Peptide Library Model

In this example, a 1:1 solution-phase adduct between a "binding" and a "coding" strand was synthesized and fully characterized. The "binding" strand was assembled with Fmoc-protected monomers, and the "coding" strand was assembled with Ddz-protected monomers.

"Binding" sequence: Ac-Glu-Ser-Thr-Arg-Pro-nLeu-Lys-β-Ala-NH2 (Fmoc peptide)

"Coding" sequence: H2N-Gly-Ala-Phe-Gly-Ala-Phe-CONH (Ddz peptide)

Synthesis Scheme:

Following TFA cleavage and deprotection, the model solution-phase library contains a 1:1 Fmoc/Ddz conjugate peptide. A single peptide sequence, rather than a mixture, was synthesized in order to fully characterize the reaction product. The amino acid composition and mass spectroscopy data are consistent with the correct product. In addition, the "binding" and "coding" hybrid peptides were tested in a competition ELISA format. The ELSTRPnL "binding" sequence binds to an anti-gp120 antibody with submicromolar affinity. This value was not affected by the presence of the "coding" peptide.

Mass Spectroscopy:

                              Theoretical    Observed
Fmoc/Ddz peptide conjugate:   1492.7         1492.6

Amino Acid Composition:

The instant invention is shown and described herein in what are considered to be the most practical and preferred embodiments. It is recognized, however, that departures may be made therefrom which are within the scope of the invention, and that obvious modifications will occur to one skilled in the art upon reading this disclosure.