Title:
IDENTIFICATION OF LIGAND RECOGNITION DOMAINS
Document Type and Number:
WIPO Patent Application WO/2009/138250
Kind Code:
A3
Abstract:
The present invention relates to methods for the identification of ligand recognition domains (LRD) in ligand-binding proteins or the identification of ligand-containing molecules, to diagnostic assays and kits for carrying out said methods and to peptides comprising LRD sequences.

Inventors:
PRZYBYLSKI MICHAEL (DE)
MOISE ADRIAN (DE)
GABIUS HANS-JOACHIM (DE)
Application Number:
PCT/EP2009/003495
Publication Date:
January 07, 2010
Filing Date:
May 15, 2009
Assignee:
STEINBEIS TRANSFERZENTRUM BIOP (DE)
PRZYBYLSKI MICHAEL (DE)
MOISE ADRIAN (DE)
GABIUS HANS-JOACHIM (DE)
International Classes:
G01N33/68; C07K14/435
Other References:
CHEN Y -J ET AL: "Carbohydrate-encapsulated gold nanoparticles for rapid target-protein identification and binding-epitope mapping", CHEMBIOCHEM 200507 DE, vol. 6, no. 7, July 2005 (2005-07-01), pages 1169 - 1173, XP002498396, ISSN: 1439-4227
HOCHLEITNER ELISABETH O ET AL: "Characterization of a discontinuous epitope of the human immunodeficiency virus (HIV) core protein p24 by epitope excision and differential chemical modification followed by mass spectrometric peptide mapping analysis", PROTEIN SCIENCE, vol. 9, no. 3, March 2000 (2000-03-01), pages 487 - 496, XP002498397, ISSN: 0961-8368
PARKER C E ET AL: "MALDI/MS-BASED EPITOPE MAPPING OF ANTIGENS BOUND TO IMMOBILIZED ANTIBODIES", MOLECULAR BIOTECHNOLOGY, HUMANA PRESS, INC, US, vol. 20, no. 1, 1 January 2002 (2002-01-01), pages 49 - 62, XP001156661, ISSN: 1073-6085
MANN MATTHIAS ET AL: "Analysis of proteins and proteomes by mass spectrometry", ANNUAL REVIEW OF BIOCHEMISTRY ANNUAL REVIEWS {A}, 4139 EL CAMINO WAY, PALO ALTO, CA, 94303-0139, USA SERIES : ANNUAL REVIEW OF BIOCHEMISTRY (ISSN 0066-4154), 2001, pages 437 - 473, XP002498398, ISSN: 0-8243-0870-0
BRIDONNEAU P ET AL: "SITE-DIRECTED SELECTION OF OLIGONUCLEOTIDE ANTAGONISTS BY COMPETITIVE ELUTION", ANTISENSE & NUCLEIC ACID DRUG DEVELOPMENT, MARY ANN LIEBERT, INC., NEW YORK, US, vol. 9, no. 1, 1 February 1999 (1999-02-01), pages 1 - 11, XP009024412, ISSN: 1087-2906
Attorney, Agent or Firm:
LENHARD, Konstanze (Grafinger Str. 2, München, DE)
Claims:

1. A method for the identification and/or determination of at least one ligand recognition domain (LRD) of a ligand-binding proteinaceous compound, comprising the steps of:

(a) contacting said proteinaceous compound with a proteolytically stable ligand immobilized on a matrix under conditions suitable for binding said proteinaceous compound to said ligand;

(b) proteolytically degrading the proteinaceous compound bound to the immobilized ligand of step (a);

(c) releasing the ligand-bound LRD-containing fragment(s) of the proteinaceous compound obtained in step (b) from the ligand; and

(d) analyzing the fragment(s) obtained in step (c) using mass spectrometry (MS).

2. The method according to claim 1, wherein the ligand comprises a carbohydrate moiety or a nucleic acid moiety.

3. The method according to claim 1 or 2, wherein the proteolytical degradation of step (b) is carried out enzymatically.

4. The method according to claim 3, wherein the enzyme is trypsin and/or Pronase.

5. The method according to any of claims 1 to 4, wherein the releasing of the ligand-bound LRD-containing fragment(s) from the ligand in step (c) is carried out by competitive elution.

6. A kit for carrying out the method according to any of claims 1 to 5, comprising a proteolytically stable ligand immobilized on a matrix, a suitable elution buffer, and a proteolytic agent.

7. A method for the identification and/or determination of at least one ligand-containing molecule in a sample comprising the steps of:

(a) contacting said sample with at least one ligand recognition domain (LRD)-containing peptide obtainable by the method of any of claims 1 to 5 immobilized on a matrix under conditions suitable for binding said ligand-containing molecule to said peptide;

(b) releasing the LRD-bound ligand-containing molecules obtained in step (a) from said peptide; and

(c) analyzing the ligand-containing molecule obtained in step (b).

8. The method according to claim 7, wherein the releasing of the LRD-bound ligand-containing molecule from said peptide in step (b) is carried out by competitive elution.

9. The method according to any of claims 7 or 8, wherein the ligand-containing molecule is presented by a cell, selected from the group consisting of mammalian cells, plant cells, insect cells and bacterial cells.

10. The method of claim 9, wherein the cell is a tumor cell.

11. The method according to any of claims 1 to 5 and 7 to 10, wherein the ligand comprises a carbohydrate moiety or a nucleic acid moiety.

12. A biochip comprising at least one immobilized ligand recognition domain (LRD)-containing peptide obtainable by the method of any of claims 1 to 5 for the identification and/or determination of ligand-containing molecules in a sample.

13. A peptide comprising the amino acid sequence of a ligand recognition domain (LRD) of a proteinaceous compound obtainable by the method of any of claims 1 to 5.

Description:

"Identification Of Ligand Recognition Domains"


The present invention relates to methods for the identification of ligand recognition domains (LRD) in ligand-binding proteins or the identification of ligand-containing molecules, to diagnostic assays and kits for carrying out said methods and to peptides comprising LRD sequences.

The vast majority of proteins have the ability to specifically interact with one or more ligands. Such proteins include not only enzymes interacting with a substrate, but also, for example, transport proteins, receptors, and transcription factors. Two very important classes of ligands for proteins are carbohydrates and nucleic acids. Interactions of proteins with carbohydrates or nucleic acids play an important role in a multitude of physiological and cellular processes such as cellular recognition processes, intracellular regulation pathways, immunological reactions and the transcriptional or post-transcriptional regulation of gene expression. The identification and/or determination of the LRD of a protein binding such ligands is a prerequisite for the thorough mechanistic understanding of that protein's function and for the development of potential biochemical, analytical or biomedical applications involving that protein.

The complete structures of proteins with their bound ligand have been determined in a few cases both by X-ray crystallography and by nuclear magnetic resonance (NMR) spectroscopy, e.g. for oligo- and polysaccharide complexes bound to lectins. To accurately identify the position and the mode of binding of a ligand, a resolution of at least 2 to 2.5 Å is required. Such a resolution is often difficult or impossible to obtain, and in general crystallisation of protein-ligand complexes suitable for structure and interaction determinations is a tedious and time-demanding process. In some cases, a crystal structure can be obtained for a ligand-binding protein independent of the ligand, and its structure and potential binding site can be predicted using information from the three-dimensional structure of a homologous protein. For the determination of the structure of biomolecules and their interactions in solution, NMR is often a preferred technique. In NMR, the proton-proton distances can be obtained following assignment of the proton resonances through multidimensional techniques, such as the nuclear Overhauser effect (NOE). This information coupled with computational methods employing computer modelling allows the prediction of e.g. glycan conformations in solution. It is possible to extend this information to examine the interaction of e.g. a glycan with a protein in solution using transfer NOE (TRNOE).

However, this approach has a number of premises and limitations, and requires relatively large amounts of purified samples. Available computer-assisted modelling strategies and information from, e.g. glycan solution conformations and protein three-dimensional structures can be combined with NMR. Although these approaches can be highly informative, they are often limited by the requirement of relatively large amounts of highly purified material and by the problem that many of these interactions are performed under non-equilibrium conditions. Hence, a molecular chemical method capable of identifying the binding structures at high molecular specificity and sensitivity (low sample amounts), and with lower requirements in purity is needed.

Direct methods of chemical structure identification of biological ligand complexes such as protein-peptide interactions, receptor-ligand complexes and antigenic determinants (epitopes) in peptide/protein-antibody complexes using combinations of selective proteolytic and affinity mass spectrometric methods have been developed over several years, for example the isolation of antibody-bound peptides using immuno-affinity techniques followed by the precise identification of epitope peptides by mass spectrometry. Mass spectrometric methods can provide structural details on many levels for different classes of biomolecules. Proteins can now be analyzed by mass spectrometry to reveal elemental composition, complete or partial amino acid sequences, post-translational modifications, protein-protein interaction sites, and even provide insight into higher order structure. One key advantage of mass spectrometric analyses over other analytical techniques is their capability to study extremely small quantities of molecules with high sensitivity. The "soft" ionization/desorption techniques such as electrospray (ESI) and matrix assisted laser desorption/ionization (MALDI) are major tools for these analyses.

In summary, major problems and limitations of present methods for the elucidation of protein structures, with or without bound ligands, are inter alia due to (i) high requirements regarding the purity of proteins for obtaining X-ray crystal structure determinations or NMR structure analyses; (ii) large sample amounts needed in pure/homogeneous form, which may hinder, limit or preclude the application of these methods to biological materials; and (iii) the time-consuming nature of NMR and X-ray crystal structure analyses of protein-ligand complexes, which require elaborate and lengthy experimental protocols.

Thus, the technical problem underlying the present invention is to provide a novel method for the identification and/or determination of ligand recognition domains in ligand-binding proteins that requires only a low amount of input material, which may have a lower purity, and that can be carried out with high accuracy while maintaining the native structure of the proteins.

The solution to the above technical problem is achieved by the embodiments characterized in the claims.

In particular, the present invention relates to a method for the identification and/or determination of at least one ligand recognition domain (LRD) of a ligand-binding proteinaceous compound, comprising the steps of:

(a) contacting said proteinaceous compound with a proteolytically stable ligand immobilized on a matrix under conditions suitable for binding said proteinaceous compound to said ligand;

(b) proteolytically degrading the proteinaceous compound bound to the immobilized ligand of step (a);

(c) releasing the ligand-bound LRD-containing fragment(s) of the proteinaceous compound obtained in step (b) from the ligand; and

(d) analyzing the fragment(s) obtained in step (c) using mass spectrometry (MS).

In another aspect, the present invention relates to a diagnostic assay for detecting a disorder associated with an increase or decrease of a ligand-binding proteinaceous compound and/or with an altered ligand recognition of a ligand-binding proteinaceous compound, comprising the steps of:

(a) contacting a sample containing a proteinaceous compound with a proteolytically stable ligand immobilized on a matrix under conditions suitable for binding said proteinaceous compound to said ligand;

(b) proteolytically degrading the proteinaceous compound bound to the immobilized ligand of step (a);

(c) releasing the ligand-bound ligand recognition domain (LRD)-containing fragment(s) of the proteinaceous compound obtained in step (b) from the ligand; and

(d) analyzing the fragment(s) obtained in step (c) using mass spectrometry (MS).

In one embodiment of the present invention, the disorder associated with an increase or decrease of a ligand-binding proteinaceous compound and/or with an altered ligand recognition of a ligand-binding proteinaceous compound is associated with an increase or decrease in abundance or concentration of a ligand-binding proteinaceous compound.

Further, in another embodiment of the present invention, the LRD is defined as a partial amino acid sequence of the ligand-binding proteinaceous compound.
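Because the LRD is treated here as a partial amino acid sequence of the parent compound, the fragments identified in step (d) can be mapped back onto the full sequence to give residue ranges. The following is only a minimal sketch of that bookkeeping; the function name `locate_fragments` and the example sequences are hypothetical and are not taken from the application.

```python
def locate_fragments(protein: str, fragments: list[str]) -> dict[str, list[tuple[int, int]]]:
    """Map each fragment to its 1-based (start, end) positions in the protein sequence."""
    hits = {}
    for frag in fragments:
        positions = []
        start = protein.find(frag)
        while start != -1:
            # Convert the 0-based index to 1-based residue numbering.
            positions.append((start + 1, start + len(frag)))
            start = protein.find(frag, start + 1)
        hits[frag] = positions
    return hits


# Hypothetical parent sequence and eluted fragments, for illustration only.
example = locate_fragments("MAGKRPLMKTTR", ["RPLMK", "XYZ"])
```

A fragment that maps to no position (an empty list) would indicate a sequence that does not originate from the assumed parent protein.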

The term "ligand" as used herein relates to any compound having the ability to specifically interact with a certain proteinaceous compound resulting in a binding of the proteinaceous compound to the ligand. The ligand used herein is proteolytically stable, which means that it does not decompose in step (b) of the above method, or decomposes at a significantly slower rate, and is a more stable ligand than the binder under the assay conditions. In a preferred embodiment of the present invention, the ligand according to the present invention comprises a carbohydrate moiety. In a more preferred embodiment of the present invention, the carbohydrate moiety is selected from the group consisting of monomeric carbohydrates, dimeric carbohydrates, linear polymeric carbohydrates and branched polymeric carbohydrates. Examples of such a carbohydrate moiety are lactose; monosaccharides and their N-acetylated derivatives, like galactose, glucose, mannose, fucose, N-acetylglucosamine and N-acetyl-galactosamine; sialic acids, like N-acetylneuraminic acid and related carbohydrates; disaccharides and their N-acetylated derivatives, like saccharose, lactose, N-acetyl-lactosamine; blood group A trisaccharide, blood group B trisaccharide, blood group H trisaccharide; blood group A tetrasaccharide, blood group B tetrasaccharide; Man5 and other branched oligosaccharides.

In another preferred embodiment of the present invention, the ligand according to the present invention comprises a nucleic acid moiety. In a more preferred embodiment of the present invention, the nucleic acid moiety is selected from the group consisting of double-stranded (ds) DNA, single stranded (ss) DNA, ssRNA, dsRNA, mRNA, tRNA, siRNA, rRNA, DNA aptamers, RNA aptamers, or natural and synthetic G - A - T - C combinations of any sequence.

The ligand used in the method according to the present invention is not limited to any structural properties such as the length of a nucleic acid or the bulk of carbohydrate as long as the structural properties of the ligand do not have any negative impact on the binding of the ligand-binding proteinaceous compound.

The term "ligand recognition domain" as used herein relates to any part of the proteinaceous compound being in direct, full or partial physical contact to the ligand, for example a certain portion of an amino acid chain. In a preferred embodiment, the ligand recognition domain is a continuous epitope or a discontinuous epitope. If the ligand recognition domain is a discontinuous epitope, the ligand is recognized by the three-dimensional structure of the ligand recognition domain.

The term "proteinaceous compound" as used herein relates to any peptide, oligopeptide, polypeptide, monomeric protein, multimeric protein or multisubunit protein complex.

The expression "ligand-binding proteinaceous compound" as used herein relates to any proteinaceous compound having the ability to specifically interact with a certain ligand, resulting in a binding of the ligand by the proteinaceous compound. In a preferred embodiment of the present invention, the proteinaceous compound is a lectin.

The term "matrix" as used herein relates to any solid carrier material having the ability to bind and present a ligand which is inert to the procedural steps of the above method, especially the proteolytic degradation of the proteinaceous compound in the above method. The term "matrix" does not have any specific limitations, and relates, for example, to an insoluble polymer material, which can be an organic polymer, such as polyamide or a vinyl polymer (e.g. poly(meth)acrylate, polystyrene and polyvinyl alcohol, or derivatives thereof), a natural polymer such as cellulose, dextran, agarose, chitin and polyamino acids, or an inorganic polymer, such as glass or metallohydroxide. The matrix can be in the form of a microcarrier, particles, membranes, strips, paper, film, pearls or plates, such as microtiter plates or microarrays. The term "microarray" as used herein means any arrangement of ligands in addressable locations on a matrix resulting in a so-called "biochip". An example of the immobilization matrix is epoxy-activated Sepharose 6B.

The term "proteolytical" as used herein relates to any form of degradation of a proteinaceous compound. Means of proteolytical degradation according to the present invention include chemical and enzymatic degradation. For example, the proteolytical degradation is carried out enzymatically. The enzymatic degradation can be carried out using, for example, an enzyme selected from the group consisting of aminopeptidases, carboxypeptidases, cathepsins, endopeptidases, exopeptidases, metalloproteases, serine proteases, chymotrypsin, papain, pepsin, Pronase, proteinase K, trypsin, Arg-C protease, Lys-C protease, Glu-C protease, dipeptidyl-aminopeptidase, and any mixture or combination of these proteases. In a preferred embodiment of the present invention the enzyme is trypsin and/or Pronase.
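To illustrate which fragments a tryptic degradation could in principle produce, the sketch below performs an in silico digest using the commonly cited rule that trypsin cleaves C-terminal to lysine (K) or arginine (R) unless the next residue is proline (P). This is an assumption-based illustration, not a procedure from the application; the function name `tryptic_digest`, the `missed_cleavages` parameter and the example sequence are hypothetical.

```python
import re


def tryptic_digest(sequence: str, missed_cleavages: int = 0) -> list[str]:
    """Return peptides from cleavage after K or R, except when followed by P."""
    # Zero-width split: position after K/R where the next residue is not P.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for i in range(len(fragments)):
        # Join up to `missed_cleavages` additional neighbouring fragments.
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


# Hypothetical sequence, for illustration only.
peptides = tryptic_digest("AGKRPLMKTTR")
```

In the method described here, digestion acts on the protein while it is bound to the immobilized ligand, so cleavage sites shielded by the ligand may remain uncut; a plain sequence-based digest such as this one does not model that shielding.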

The expression "ligand recognition domain (LRD)-containing fragment" as used herein relates to any fragment of a proteinaceous compound comprising at least one of the proteinaceous compound's LRDs. In a preferred embodiment of the present invention, the LRD is a carbohydrate recognition domain (CRD) and the ligand comprises a carbohydrate moiety.

The expression "altered ligand recognition" as used herein relates to any significantly strengthened, weakened or completely abolished binding of a proteinaceous compound to its ligand which might be caused by for example an altered amino acid sequence of the proteinaceous compound or conformational changes which might be due to for example disease-related changes in the processing of the proteinaceous compound.

The term "sample" as used herein relates to any composition that contains or is suspected of containing a proteinaceous compound according to the present invention. Examples of such a sample are whole blood, serum, plasma, urine or cerebrospinal fluid (liquor) of an individual, or eukaryotic or prokaryotic cell extracts or plant tissues. Further, the sample may comprise a solution derived from naturally occurring systems, e.g. a solution containing isolated blood compounds or processed blood products. In one embodiment of the present invention, the sample may comprise cells expressing the proteinaceous compound, e.g. a cell culture system. Methods for obtaining the above samples are known in the prior art.

In a preferred embodiment of the present invention, step (a) of the above methods is further carried out under conditions suitable for maintaining the native form of the proteinaceous compound.

In another preferred embodiment of the present invention, the above methods further comprise at least one washing step. In a more preferred embodiment, the above methods comprise the step of:

(i) washing the immobilized ligand after contacting it with the proteinaceous compound and analyzing the washing fraction by mass spectrometry (MS); and/or

(ii) washing the immobilized ligand after proteolytical degradation of the proteinaceous compound to obtain the non-LRD-containing fragments of the proteinaceous compound and analyzing these by mass spectrometry.
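Analyzing the wash fraction alongside the later elution fraction allows the two peak lists to be compared, so that masses appearing only in the elution fraction can be flagged as candidate LRD-containing fragments. The sketch below shows one simple way to make that comparison; the function name `unique_to_elution`, the 10 ppm default tolerance and all mass values are assumptions for illustration and do not come from the application.

```python
def unique_to_elution(elution_masses, wash_masses, tol_ppm=10.0):
    """Return elution-fraction masses that have no wash-fraction mass within tol_ppm."""
    def matches(mass, reference):
        # Relative deviation expressed in parts per million.
        return abs(mass - reference) / reference * 1e6 <= tol_ppm

    return [m for m in elution_masses
            if not any(matches(m, w) for w in wash_masses)]


# Hypothetical peak lists (monoisotopic masses in Da), for illustration only.
candidates = unique_to_elution([1000.0, 1500.5], [1000.005, 2000.0])
```

A real comparison would also have to consider signal intensities and incomplete washing, which this sketch ignores.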

In one embodiment of the present invention, releasing the ligand-bound LRD-containing fragment(s) of the proteinaceous compound from the ligand being fixed to the matrix comprises competitive elution. The term "competitive elution" as used herein relates to any method of releasing a ligand-bound compound from the ligand by using a compound that can competitively bind to the ligand-bound compound, using the same binding site and thus displacing the immobilized ligand.

In another embodiment of the present invention, mass spectrometry comprises electrospray ionization (ESI) and/or matrix assisted laser desorption/ionization (MALDI). These so called "soft" ionization/desorption techniques are suitable to directly analyze intact peptide/protein-ligand interactions and to yield stable, intact and highly accurate molecular mass data of polypeptides, carbohydrates and nucleic acids. In a preferred embodiment of the present invention, mass spectrometry comprises time of flight mass spectrometry (TOF-MS) and/or Fourier transform ion cyclotron resonance mass spectrometry (FTICR-MS).
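As a worked illustration of the molecular mass data such measurements are compared against, the sketch below computes the monoisotopic mass of an unmodified peptide and the m/z of its protonated ions, as observed for example in positive-mode ESI. The residue masses are standard monoisotopic values; the function names and the choice of example peptide are assumptions made for illustration.

```python
# Standard monoisotopic residue masses (Da) of the 20 proteinogenic amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added for the free N- and C-termini
PROTON = 1.007276   # mass of a proton


def monoisotopic_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(MONO[residue] for residue in peptide) + WATER


def mz(peptide: str, charge: int) -> float:
    """m/z of the [M + zH]z+ ion for a positive integer charge z."""
    return (monoisotopic_mass(peptide) + charge * PROTON) / charge


# Hypothetical peptide, for illustration only.
singly_charged = mz("RPLMK", 1)
doubly_charged = mz("RPLMK", 2)
```

Post-translational modifications or chemical derivatization would shift these values and are not handled here.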

An example of the present invention is the so-called CREDEX-MS (carbohydrate recognition domain excision mass spectrometry). In this embodiment, the ligand-binding proteinaceous compound according to the present invention is a carbohydrate binding protein. In a preferred embodiment of the present invention, the carbohydrate binding protein is selected from the group consisting of lectins, anti-carbohydrate antibodies, carbohydrate binding enzymes and carbohydrate binding transport proteins. In a more preferred embodiment of the present invention, the carbohydrate binding protein is a lectin. Accordingly, the ligand in CREDEX-MS comprises a carbohydrate moiety. In a more preferred embodiment, the carbohydrate moiety is lactose.

In another embodiment of the present invention, the proteinaceous compound is bound to a cell. The expression "proteinaceous compound presented by a cell" as used herein relates to any proteinaceous compound that is directly or indirectly bound on the surface of a cell. The cell may be for example selected from the group consisting of mammalian cells, plant cells, insect cells, bacterial cells, protozoa, viruses, fungi, or neuronal cells. In a preferred embodiment of the present invention, the cell is a tumor cell.

In one embodiment, the present invention relates to a biochip comprising at least one immobilized ligand for the identification and/or determination of at least one ligand recognition domain (LRD) of a ligand-binding proteinaceous compound, wherein the ligand comprises a carbohydrate moiety or a nucleic acid moiety according to the present invention. In a preferred embodiment of the present invention, the biochip comprises only one type of ligand. In another preferred embodiment of the present invention, the biochip comprises at least two different types of ligand.

In another embodiment, the present invention relates to the use of a biochip comprising at least one immobilized ligand in the method for the identification and/or determination of at least one ligand recognition domain (LRD) of a ligand-binding proteinaceous compound according to the present invention.

In a further embodiment, the present invention relates to the use of a biochip comprising at least one immobilized ligand in the diagnostic assay for detecting a disorder associated with an increase or decrease of a ligand-binding proteinaceous compound and/or with an altered ligand recognition of a ligand-binding proteinaceous compound according to the present invention.

The present invention also relates to a kit for carrying out the above method, comprising a proteolytically stable ligand immobilized on a matrix, a suitable elution buffer, and a proteolytic agent. In a preferred embodiment, the kit further comprises at least one suitable washing buffer.

Further, the present invention relates to a kit comprising means for carrying out at least one of the methods according to the present invention. Means for carrying out the method for the identification and/or determination of at least one ligand recognition domain (LRD) of a ligand-binding proteinaceous compound include, but are not limited to, suitable buffers, ligands, reagents, consumables, matrices and biochips.

Moreover, the present invention relates to a diagnostic kit comprising means for carrying out at least one of the diagnostic assays according to the present invention. Means for carrying out the diagnostic assay for detecting a disorder associated with an increase or decrease of a ligand-binding proteinaceous compound and/or with an altered ligand recognition of a ligand-binding proteinaceous compound include, but are not limited to, suitable buffers, reagents, consumables, matrices and biochips.

In one embodiment, the present invention relates to a peptide comprising the amino acid sequence of a ligand recognition domain (LRD) of a proteinaceous compound obtainable by any of the preceding methods according to the present invention. In a preferred embodiment of the present invention, the peptide has a length of 8 to 20 amino acids, more preferred of 5 to 12 amino acids, and most preferred of 5 to 7 amino acids.

In another embodiment, the present invention relates to a method for the preparation of peptides comprising the amino acid sequence of a ligand recognition domain (LRD) of a proteinaceous compound identified by any of the preceding methods according to the present invention comprising solid phase peptide synthesis. In one embodiment of the present invention, the peptide is a peptide as defined above.

Additionally, the present invention relates to a method for the identification and/or determination of at least one ligand-containing molecule in a sample comprising the steps of:

(a) contacting said sample with at least one ligand recognition domain (LRD)-containing peptide obtainable by any of the above methods immobilized on a matrix under conditions suitable for binding said ligand-containing molecule to said peptide;

(b) releasing the LRD-bound ligand-containing molecules obtained in step (a) from said peptide; and

(c) analyzing the ligand-containing molecules obtained in step (b).

In another aspect, the present invention relates to a diagnostic assay for detecting a disorder associated with an increase or decrease of a specific ligand comprising the steps of:

(a) contacting a sample containing at least one ligand-containing molecule with at least one ligand recognition domain (LRD)-containing peptide according to the present invention immobilized on a matrix under conditions suitable for binding said ligand-containing molecule to said peptide;

(b) releasing the LRD-bound ligand-containing molecules obtained in step (a) from said peptide; and

(c) analyzing the ligand-containing molecule obtained in step (b).

The terms "ligand", "ligand recognition domain", "proteinaceous compound", "sample", "ligand recognition domain (LRD)-containing peptide", and "matrix" as used herein are as defined above.

The term "ligand-containing molecule" as used herein relates to any compound containing at least one moiety that can bind to a ligand recognition domain (LRD)-containing peptide according to the present invention.

The increase or decrease of a specific ligand may be due to any alteration of physiological functions in the individual from which the sample is taken. For example the increase or decrease of a specific ligand may be due to a reduction or activation of glycosylation mechanisms or an altered expression of the specific ligand. The expression "altered expression of the specific ligand" as used herein relates to any change in the expression of a specific ligand that leads to significantly strengthened, weakened or completely abolished binding of the ligand to a ligand recognition domain (LRD)-containing peptide. For example if changes in a glycosylation pattern of a cell occur, a ligand can turn into a different ligand being specific for a diseased cell which is a viable ligand for a different LRD.

The analyzing of the ligand-containing molecule can be carried out by all suitable methods known in the art. For example, the ligand-containing molecules can be evaluated qualitatively and/or quantitatively by methods well known in the art. Examples for the evaluation of the ligand-containing molecules include, but are not limited to, mass spectrometry, sequencing, SPR, NMR, XRD, ELISA, HPLC, gel electrophoresis, Edman sequencing, BCA assay, and various ligand-specific chemical tests.

In one embodiment of the present invention, at least steps (a) and (b) of the above methods are carried out under conditions suitable for maintaining the native form of the ligand.

In a preferred embodiment of the present invention, the above methods further comprise at least one washing step. In a more preferred embodiment, the above methods comprise the step of washing the immobilized ligand recognition domain (LRD)-containing peptide after contacting it with the sample and analyzing the washing fraction by mass spectrometry (MS).

In another preferred embodiment of the present invention, releasing the LRD-bound ligand-containing molecules from the peptide comprises competitive elution. The term "competitive elution" as used herein relates to any method of releasing a ligand-bound compound from the ligand by using a compound that can competitively bind to the ligand-bound compound, the compound being the same as the ligand or different from it.

The ligand-containing molecule may be bound to a cell. The expression "molecule bound to a cell" as used herein relates to any compound that is directly or indirectly bound on the surface of a cell. The cell may be selected from the group consisting of mammalian cells, plant cells, insect cells, bacterial cells, protozoa, viruses, fungi, or neuronal cells. In a preferred embodiment of the present invention, the cell is a tumor cell.

Further, the present invention relates to the use of ligand recognition domain (LRD)-containing peptides according to the present invention in the manufacture of a biochip comprising at least one immobilized ligand recognition domain (LRD)-containing peptide obtainable by any of the above methods for the identification and/or determination of ligand-containing molecules in a sample.

Moreover, the present invention relates to a biochip comprising at least one immobilized ligand recognition domain (LRD)-containing peptide obtainable by any of the above methods for the identification and/or determination of ligand-containing molecules in a sample. In a preferred embodiment of the present invention, the biochip comprises only one type of ligand. In another preferred embodiment of the present invention, the biochip comprises at least two different types of ligand.

In one embodiment, the present invention relates to the use of a biochip comprising at least one immobilized ligand recognition domain (LRD)-containing peptide according to the present invention in the method for the identification and/or determination of at least one ligand-containing molecule in a sample according to the present invention.

In another embodiment, the present invention relates to the use of a biochip comprising at least one immobilized ligand recognition domain (LRD)-containing peptide according to the present invention in the diagnostic assay for detecting a disorder associated with an increase or decrease of a specific ligand and/or with an altered expression of a specific ligand according to the present invention.

Additionally, the present invention relates to a kit comprising means for carrying out at least one of the methods according to the present invention. Means for carrying out the method for the identification and/or determination of at least one ligand-containing molecule in a sample include, but are not limited to, suitable buffers, reagents, consumables, matrices and biochips.

In one embodiment, the present invention relates to a diagnostic kit comprising means for carrying out at least one of the diagnostic assays according to the present invention. Means for carrying out the diagnostic assay for detecting a disorder associated with an increase or decrease of a specific ligand and/or with an altered expression of a specific ligand include, but are not limited to, suitable buffers, reagents, consumables, matrices and biochips.

The figures show:

Figure 1: Principle of CREDEX-MS: Analytical concept and scheme for proteolytic-excision mass spectrometry identification of LRD sequences. CREDEX-MS consists of the proteolytic excision and mass spectrometric identification of LRD sequences. The ligand-binding protein is bound on a column with immobilised ligand and digested with proteases. The protein fragments not binding to the ligand are washed from the column and analyzed by FTICR-MS. The remaining bound fragments are eluted with the same or a different ligand (competitive elution) and analyzed by mass spectrometry. The native conformation of the protein is maintained by running the experiment under physiological conditions.

Figure 2: Epoxy activated Sepharose 6B is the immobilization matrix used for carbohydrate immobilisation. The Sepharose 6B contains 6% (w/v) agarose which is a polysaccharide containing β-D-galactose. All lectins and antibodies capable of recognizing and binding to β-D-galactose and oligo and polysaccharides which contain β-D-galactose have to be tested for Sepharose affinity in order to eliminate a false-positive response.

Figure 3: Galectin-3 interaction study with Sepharose matrix by affinity MS as negative control. Since the galectins specifically recognize β-D-galactose, a galectin-Sepharose affinity test is necessary in order to eliminate a false-positive response.

Figure 4: FTICR-MS of tryptic peptide fragments of galectin-3. In-solution digestion was performed on 100 μg galectin-3 using trypsin (enzyme:substrate ratio 1:50) for three hours at 37°C in 100 μL PBS 50 mM pH 7.5.

Figure 5: CREDEX-MS of galectin-3 interaction with lactose provides two specific LRD peptides (152-162) and (177-183). The upper spectrum shows the fragments identified in the supernatant fraction and the lower spectrum shows the fragments identified in the elution fraction (the LRD peptides (152-162) and (177-183)). FTICR-MS was obtained with a Bruker Apex II 7T FT-ICR mass spectrometer (Bruker Daltonik, Bremen, Germany).

Figure 6: Topology of LRD peptides identified by CREDEX-MS (152-162) and (177-183) in the crystal structure of galectin-3. The molecular graphics image presented was produced using the UCSF Chimera package from the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco (supported by NIH P41 RR-01081).

Figure 7: Comparison of CREDEX-MS and crystallographic data of galectin-3 LRD. On the left side is the crystal structure of galectin-3 with the peptides identified by CREDEX-MS (152-162) and (177-183). On the right side is the crystal structure of galectin-3 with the binding site of the carbohydrate (152-186). The amino acids that are in direct contact with the carbohydrate are marked in bold (H158, N160, R162, N174, N171, W181, E184, R186). The molecular graphics images presented were produced using the UCSF Chimera package from the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco (supported by NIH P41 RR-01081).

Figure 8 a, b, c: Characterization of synthetic LRD peptides by affinity MS. Two galectin-3 peptides (157-163) and (157-175) were synthesized by Fmoc-SPPS, purified by RP-HPLC and characterized by MALDI-TOF-MS. The synthetic peptide solutions were added to the lactosyl-Sepharose column and after 12 hours of shaking at 37°C the supernatant was analysed by MALDI-MS. The column was washed with the binding solvent until the washing fractions did not present any signals when analysed by MALDI-MS. Lactose elution was performed with 400 μL 0.3 M lactose in PBS pH 7.5, shaking at 37°C for 15 min, and the elution fractions were analyzed by MALDI-MS.

Figure 9: Analytical concept of identification of thrombin interaction domains with aptamer polynucleotide by proteolytic extraction and affinity-mass spectrometry.

Figure 10: SDS-PAGE of digestibility of reduced and alkylated thrombin using chymotrypsin. The gel shows that after 48 h of incubation at 25°C, modified thrombin was effectively digested by chymotrypsin. The reaction conditions were as follows: E/S 1:20, 25°C (RT), 48 h with gentle shaking. The arrow indicates that reduced and alkylated thrombin is digested by chymotrypsin.

Figure 11: MALDI-TOF mass spectrum of in-solution tryptic digested R&A thrombin fragments. HC = heavy chain, LC = light chain.

Figure 12: MALDI-ToF mass spectrum of in-solution chymotryptic digested R&A thrombin fragments. HC = heavy chain, LC = light chain.

Figure 13: MALDI-ToF mass spectrum of elution fraction in chymotryptic epitope extraction. All three peptides ERNIEKISMLEKIY (108-121), VMKSPFNNRWY (246-256) and QMGIVSWGEGCDRDGKY (257-273) are located on HC. HC = heavy chain, LC = light chain.

Figure 14: MALDI-ToF mass spectrum of elution fraction in tryptic epitope extraction. HC = heavy chain, LC = light chain.

Figure 15: LC-MS/MS spectrum of the peptide identified in binding studies by tryptic proteolytic excision.

Figure 16: Structure of human alpha-thrombin in complex with the aptamer. Tulinsky, A. et al. (1993) The structure of alpha-thrombin inhibited by a 15-mer single-stranded DNA aptamer. J. Biol. Chem. 268: 17651-17654 (PDB file: 1HUT)

Figure 17: Structure modeling of aptamer recognition site identified in thrombin affinity-MS using chymotrypsin.

One goal of the present invention is to provide a method for the determination of relevant ligand recognition domain (LRD) peptide sequences from ligand-binding proteins with high sensitivity and by application of the molecular specificity of mass spectrometry. The combination of specific proteolytic excision of affinity-bound proteins and mass spectrometric analysis of the eluted peptides has been shown to be a highly efficient tool for elucidating specific LRD structures. The method is generally applicable to a wide range of ligand-binding proteins and to a wide range of non-proteinaceous and non-peptidic ligands that can be immobilised on a solid surface or related substrate. Furthermore, the method is feasible with a range of soft ionisation/desorption methods of mass spectrometry, such as electrospray ionisation (ESI) and matrix-assisted laser desorption/ionisation (MALDI), and a wide range of mass spectrometer types, such as quadrupole, ion-trap and time-of-flight analysers, although the use of high resolution FTICR-MS is of particular advantage for unequivocal peptide identification. The method is also of clinical relevance in the diagnosis of disorders that are associated with qualitative and/or quantitative alterations in the expression of a ligand-binding protein.

An innovative aspect of the invention is that it provides inter alia the direct identification of, and access to, molecular chemical structures, in the form of LRD-containing peptides, that are relevant for specific ligand affinity of proteins. Peptide identification is fast, easy, requires only minimal amounts of starting material and can be achieved directly out of complex starting materials such as cells. For a given ligand, all proteins and their ligand recognition domains that are involved in interactions with that ligand can be identified. The corresponding peptides represent useful model structures which are readily amenable to chemical synthesis, thus facilitating the study, mechanistic understanding and biomedical applications of ligand-binding biomolecules, as well as the use of such peptides as lead compounds in the development of new diagnostic and therapeutic drugs. Making use of such peptides, the present invention also provides a method for the identification of ligand-containing molecules in a sample. This method can be used in the diagnosis of disorders that are associated with qualitative and/or quantitative alterations in the expression of a specific ligand.

The present invention will now be further illustrated in the following examples without being limited thereto.

Examples

Example 1: The ligand recognition domains of lactose-binding Galectin-3

Interaction of Galectin-3 with Sepharose matrix as negative control

Epoxy-activated Sepharose 6B is the immobilization matrix used for the lactose immobilisation. Sepharose 6B contains 6% (w/v) agarose which is a polysaccharide containing β-D-galactose. All proteins capable of recognizing and binding to β-D-galactose and/or oligo- and polysaccharides which contain β-D-galactose have to be tested for Sepharose affinity in order to eliminate a false-positive response.

The matrix preparation for the control test was made as follows: Epoxy-activated Sepharose 6B was subjected to swelling in MilliQ water for 1 hour, with 2 changes of solution. The resulting gel was washed for 1 hour on a sintered glass filter. The gel was mixed 1:1 with 0.2 M K2HPO4 pH 13 and shaken at 37°C for 24 hours. The gel was then washed using 100 mL of the same buffer. A blocking step followed, with 1 M ethanolamine pH 8, in a shaker at 40°C overnight. The gel was then washed with 100 mL 0.2 M K2HPO4 pH 13 followed by 4 cycles of alternating washes with 10 mL volumes of buffers of different pH (0.1 M acetate buffer containing 0.5 M NaCl pH 4 and 0.1 M Tris-HCl buffer containing 0.5 M NaCl pH 8). The gel was then equilibrated with 50 mL PBS 50 mM pH 7.5.

For the galectin-3 negative control, a solution of 100 μg galectin-3 in 100 μL PBS 50 mM pH 7.5 was mixed with 200 μL pure Sepharose gel (prepared as specified above) and shaken gently at 37°C for 24 h. The supernatant was collected and analysed by MALDI-TOF-MS in order to verify the extent of binding. The free (unbound) lectin was washed out with binding buffer (30-40 mL) and the washing fractions were analysed by MALDI-TOF-MS. The next step was an elution with lactose, 400 μL 0.3 M lactose in PBS pH 7.5, shaking at 37°C for 15 min. This step was repeated two times. This was followed by another washing step with 1.2 mL 0.3 M lactose in PBS pH 7.5. The elution fractions were analysed by MALDI-TOF-MS. The mass spectrum of the first washing fraction shows the signals for the singly, doubly and triply charged ions. The MALDI-TOF spectra of the final washing fraction and elution fraction presented no signal. This shows that Sepharose is a suitable matrix, since no binding of Galectin-3 was observed.

Immobilisation of lactose to epoxy-activated Sepharose

Epoxy-activated Sepharose 6B was subjected to swelling in MilliQ water for 1 hour, with 2 changes of solution. The resulting gel was washed for 1 hour on a sintered glass filter. For the lactose immobilisation the gel was mixed with a solution of 50 mg lactose for each mL of gel in 0.2 M K2HPO4 pH 13 and shaken at 37°C for 24 hours. The excess ligand was washed out using the coupling buffer (100 mL). The remaining active epoxy groups were blocked with 1 M ethanolamine pH 8, in a shaker at 40°C overnight. The gel was then washed with 100 mL coupling buffer followed by 4 cycles of alternating washes with 10 mL volumes of buffers of different pH (0.1 M acetate buffer containing 0.5 M NaCl pH 4 and 0.1 M Tris-HCl buffer containing 0.5 M NaCl pH 8) in order to remove the electrostatically bound lactose. The gel was then equilibrated with 50 mL binding buffer (PBS 50 mM pH 7.5).

Identification of the lactose recognition domains in Galectin-3

A solution of 100 μg galectin-3 in 100 μL PBS 50 mM pH 7.5 was mixed with 200 μL lactosyl-Sepharose gel and shaken gently at 37°C for 24 h. The supernatant was collected and analyzed by MALDI-TOF-MS in order to verify the extent of binding. The free (unbound) Galectin-3 was washed out with binding buffer (30 to 40 mL) and the washing fractions were analysed by MALDI-TOF-MS. The digestion of bound galectin was performed using trypsin, in the binding buffer, with an enzyme to substrate ratio of 1:100, at 37°C, for 3 h. After digestion, the supernatant was measured by MALDI-FTICR-MS. The free peptides (not bound to lactose) were washed out with 30 to 40 mL PBS pH 7.5. The washing fractions were analysed by MALDI-FTICR-MS. The peptides bound to lactose were eluted by shaking three times with 400 μL 0.3 M lactose in PBS pH 7.5 at 37°C for 15 min. This was followed by another washing step with 1.2 mL 0.3 M lactose in PBS pH 7.5. The eluate was analyzed by MALDI-FTICR-MS.
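As an illustration of the tryptic digestion step, the following minimal sketch applies the commonly used trypsin rule (cleavage C-terminal to K or R, but not before P) to the galectin-3 segment FHFNPRFNENNRRVIVCNT (residues 157-175), the sequence given in this document for the longer synthetic peptide. The function name and the simplified cleavage rule are assumptions made for illustration only; the sketch is not part of the described method and does not model missed cleavages or the on-column conditions.

```python
def tryptic_digest(seq, start=1):
    """In-silico trypsin digest (illustrative, simplified rule).

    Cleaves C-terminal to K or R unless the next residue is P.
    Returns a list of (first_residue, last_residue, peptide) tuples,
    numbered from `start`.
    """
    peptides = []
    begin = 0
    for i, aa in enumerate(seq):
        is_last = i == len(seq) - 1
        if is_last or (aa in "KR" and seq[i + 1] != "P"):
            peptides.append((start + begin, start + i, seq[begin:i + 1]))
            begin = i + 1
    return peptides


if __name__ == "__main__":
    # Segment 157-175 of galectin-3 as given in this document.
    for first, last, pep in tryptic_digest("FHFNPRFNENNRRVIVCNT", start=157):
        print(f"{first}-{last}: {pep}")
```

Under this simplified rule a cleavage site falls after R162, which is consistent with the C-terminus of the LRD peptide (152-162) reported for the elution fraction.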

Mass spectrometry

All FTICR-MS spectra were obtained with a Bruker Apex II 7T FT-ICR mass spectrometer (Bruker Daltonik, Bremen, Germany) equipped with an external Scout 100 fully-automated X-Y target stage MALDI source with pulsed collision gas. The pulsed nitrogen laser was operated at 337 nm. Ions generated by laser shots were accumulated in the hexapole for 0.5-1 sec at 15 V and extracted at -7 V into the analyzer cell. A 100 mg/mL solution of 2,5-dihydroxybenzoic acid (DHB, Aldrich, Germany) in acetonitrile:0.1% TFA in water (2:1) was used as the matrix. 0.5 μL of sample solution were mixed on the stainless-steel MALDI sample target and allowed to dry.

Synthesis of ligand recognition domain-containing Galectin-3 peptides

Peptides Gal-3(157-163) (FHFNPRF) and Gal-3(157-175) (FHFNPRFNENNRRVIVCNT) were synthesized by solid-phase peptide synthesis (SPPS) on a NovaSyn TGR resin, containing a polystyrene-polyethyleneglycol resin and a Rink-amide-linker cleavable under acidic conditions, according to commercially available material and published literature procedures. 9-Fluorenylmethoxycarbonyl/t-butyl (Fmoc/tBu) chemistry was used throughout the synthesis using a semi-automated Economy Peptide Synthesizer EPS-221 (ABIMED, Germany). The following Nα-Fmoc and side-chain protected amino acid derivatives were used: Fmoc-Ala-OH, Fmoc-Cys(Trt)-OH, Fmoc-Asp(OtBu)-OH, Fmoc-Glu(OtBu)-OH, Fmoc-Phe-OH, Fmoc-Gly-OH, Fmoc-His(Trt)-OH, Fmoc-Ile-OH, Fmoc-Lys(Boc)-OH, Fmoc-Asn(Trt)-OH, Fmoc-Pro-OH, Fmoc-Gln(Trt)-OH, Fmoc-Arg(Pbf)-OH, Fmoc-Ser(tBu)-OH, Fmoc-Thr(tBu)-OH, Fmoc-Val-OH, Fmoc-Trp(Boc)-OH. The synthesis was performed according to the following general protocol: (i) DMF washing (3x1 min), (ii) Fmoc deprotection with 2% DBU, 2% piperidine in DMF (15 min), (iii) DMF washing (6x1 min), (iv) coupling of 5 equivalents of Fmoc amino acid : PyBOP : NMM in DMF (40 to 60 min), (v) DMF washing (3x1 min). After completion of the syntheses, the peptides were cleaved from the resin using a TFA, triethylsilane and deionized water mixture (95:2.5:2.5, v/v/v) for 3 h at room temperature. The crude product was precipitated with cold tert-butylmethylether, washed three times with diethyl ether and solubilized in 10% acetic acid (aqueous solution) prior to freeze-drying. Purification of the peptides was performed by semipreparative HPLC. Subsequent characterisation by HPLC and MALDI-TOF mass spectrometric analysis ensured molecular homogeneity of the peptides. The synthetic peptides were used for affinity-MS characterisation.
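The mass spectrometric characterisation of the synthetic peptides relies on comparing measured masses with masses calculated from the sequences. The following sketch shows one way such a calculation could be done, using standard monoisotopic residue masses. The function names are hypothetical, and the C-terminal amide option is an assumption based on the use of a Rink-amide linker; no calculated or measured mass values are stated in this document.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565        # H2O added for the free termini
PROTON = 1.007276        # mass of H+
AMIDE_SHIFT = -0.984016  # C-terminal -NH2 instead of -OH


def monoisotopic_mass(seq, c_term_amide=False):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    mass = sum(MONO[aa] for aa in seq) + WATER
    if c_term_amide:
        mass += AMIDE_SHIFT
    return mass


def mz(mass, charge=1):
    """m/z of the [M + nH]n+ ion for a given neutral mass."""
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    for name, seq in [("Gal-3(157-163)", "FHFNPRF"),
                      ("Gal-3(157-175)", "FHFNPRFNENNRRVIVCNT")]:
        m = monoisotopic_mass(seq, c_term_amide=True)  # assumed peptide amide
        print(f"{name}: M = {m:.4f}, [M+H]+ = {mz(m):.4f}")
```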

Characterization of synthetic ligand recognition domain-containing Galectin-3 peptides by MS

Two galectin-3 peptides (157-163) and (157-175) were synthesized by Fmoc-SPPS, purified by RP-HPLC and characterized by MALDI-TOF-MS. The synthetic peptide solutions were added to the lactosyl-Sepharose columns and after 12 hours of shaking at 37°C the supernatant was analysed by MALDI-TOF. The columns were washed with the binding solvent until the washing fractions did not present any signals when analysed by MALDI-TOF. After this, the lactose elution was performed with 400 μL 0.3 M lactose in PBS pH 7.5, shaking at 37°C for 15 min, and repeated two times. The elution fractions for both peptides, (157-163) and (157-175), presented signals (Figure 8 a and b). This experiment shows that the peptides chosen for synthesis indeed present affinity for lactose, thus supporting the CREDEX-MS results.

Example 2: Characterization of thrombin-aptamer interaction domains by proteolytic excision/extraction mass spectrometry

The characterisation of DNA-interaction domains is described in the following by the example of the thrombin-aptamer interaction. The structure of the thrombin-aptamer complex was resolved by X-ray crystallography (see Figure 16). In the following, the epitope is described as the region of thrombin which is in contact with the aptamer.

Proteolytic epitope extraction and epitope excision-mass spectrometry were used to identify the protein-ligand interaction. In the first steps of the procedure the digestibility of the ligand and the affinity to the aptamer binder were characterised (Figures 9 to 12). In the epitope extraction procedure, the digestion step was carried out before the affinity binding. In both epitope extraction and epitope excision, the affinity of the thrombin-aptamer was tested and confirmed before the mass spectrometric identification of the binding domains.

The preparation of an affinity column with the aptamer nucleotide was performed as follows:

The aptamer was dissolved in MilliQ water. Before the binding experiment, 423 μL aptamer solution containing 200 μg aptamer (42.3 nmol) were lyophilized first, and the lyophilized aptamer powder was redissolved in 200 μL coupling buffer solution (0.2 M NaHCO3, 0.5 M NaCl, pH 8.3; 1.68 g NaHCO3 and 2.92 g NaCl in 100 mL MilliQ). During the aptamer pretreatment, 100 μL NHS-activated Sepharose 4 Fast Flow were applied into an affinity column and washed gently with 1.5 mL acidification solution (1 mM HCl, kept on ice).
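The quantities given for the coupling buffer and the aptamer can be cross-checked arithmetically. The sketch below is only an illustration: the helper names are hypothetical and the molar masses are standard rounded values that are not stated in this document.

```python
# Standard molar masses in g/mol (rounded; assumed, not from the source text).
MOLAR_MASS = {"NaHCO3": 84.01, "NaCl": 58.44}


def grams_needed(compound, molarity, volume_ml):
    """Mass of solute (g) for a target molarity (mol/L) in a volume (mL)."""
    return MOLAR_MASS[compound] * molarity * volume_ml / 1000.0


def implied_molar_mass(mass_ug, amount_nmol):
    """Molar mass (g/mol) implied by a mass in ug and an amount in nmol."""
    return mass_ug * 1000.0 / amount_nmol  # 1 ug/nmol equals 1000 g/mol


if __name__ == "__main__":
    # Coupling buffer: 0.2 M NaHCO3 and 0.5 M NaCl in 100 mL.
    print(f"NaHCO3: {grams_needed('NaHCO3', 0.2, 100):.2f} g")
    print(f"NaCl:   {grams_needed('NaCl', 0.5, 100):.2f} g")
    # Aptamer: 200 ug stated to correspond to 42.3 nmol.
    print(f"Implied aptamer molar mass: {implied_molar_mass(200, 42.3):.0f} g/mol")
```

The calculated masses agree with the stated 1.68 g NaHCO3 and 2.92 g NaCl, and the stated 200 μg and 42.3 nmol imply a molar mass of roughly 4.7 kDa for the aptamer.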

Proteolytic Epitope extraction

Initial proteolytic degradation studies were performed with modified (reduced and alkylated) human alpha thrombin applied to an Eppendorf tube with buffer solution (5 mM NH4HCO3, pH 8). Protease (enzyme:substrate ratio 1:20) was added to the Eppendorf tube, and the mixture was incubated at 37°C for 24 hours (trypsin) or 48 hours (chymotrypsin) with gentle shaking.

The product mixture was characterized by SDS-PAGE and MALDI-ToF MS.

After the proteolytic cleavage, the mixture of modified human alpha thrombin fragments was added to the affinity column, which had been loaded with aptamer-coupled Sepharose. Another 200 μL 5 mM NH4HCO3 (pH 8) were added to the same affinity column. The column was incubated at 37°C with gentle shaking for 24 hours.

After overnight incubation, the supernatant was collected and lyophilized. The column was washed with 8 mL 5 mM NH4HCO3 (pH 8) and eluted twice with 300 μL MALDI solvent (ACN:0.1% TFA 2:1). The last 1 mL of the wash fraction and both elution fractions were collected and lyophilized.

The supernatant, the last 1 mL of the wash fraction and the elution fractions were analysed by MALDI-ToF-MS.

Proteolytic Epitope Excision

20 μg modified human alpha thrombin dissolved in 5 mM NH4HCO3, pH 8, and 220 μL 5 mM NH4HCO3 (pH 8) were mixed and incubated in the aptamer-coupled Sepharose affinity column at 37°C for 24 hours with gentle shaking.

After the incubation, the supernatant was collected, and the column washed with 5 mL 5 mM NH4HCO3 (pH 8). Then 1 μg protease (E:S 1:20) was added to the affinity column. The mixture was incubated at 37°C for 24 hours (trypsin) or 48 hours (chymotrypsin) with gentle shaking. The supernatant was collected and lyophilized, and the column was washed with 8 mL 5 mM NH4HCO3 (pH 8) and eluted twice with 300 μL MALDI solvent (ACN:0.1% TFA 2:1). The final 1 mL of the washing fraction and both elution fractions were collected and lyophilized.
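The protease amounts used in the digestions follow directly from the stated enzyme:substrate (E:S) ratios. A minimal sketch of this arithmetic is given below; the function name is hypothetical and the ratios are interpreted as weight ratios, which is an assumption.

```python
def protease_mass_ug(substrate_ug, enzyme_parts=1, substrate_parts=20):
    """Protease mass (ug) for a given substrate mass and E:S ratio (w/w assumed)."""
    return substrate_ug * enzyme_parts / substrate_parts


if __name__ == "__main__":
    # Epitope excision on thrombin: 20 ug substrate at E:S 1:20.
    print(protease_mass_ug(20, 1, 20))
    # On-column galectin-3 digestion: 100 ug substrate at E:S 1:100.
    print(protease_mass_ug(100, 1, 100))
    # In-solution galectin-3 digestion (Figure 4): 100 ug at E:S 1:50.
    print(protease_mass_ug(100, 1, 50))
```

For the thrombin excision experiment this reproduces the stated 1 μg of protease for 20 μg of substrate.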

Identified Epitopes (Figures 13 to 17)

Through affinity mass spectrometry and proteolytic epitope extraction/excision using chymotrypsin, the thrombin peptide [108-121] was identified as the epitope, which is in agreement with results from X-ray crystallography of the thrombin-aptamer complex. However, several additional epitope peptides, which have not been described previously in the literature, were also identified. These results show proteolytic epitope extraction and epitope excision-mass spectrometry to be efficient approaches for characterizing DNA-binding epitope structures. This Example shows the identification of discontinuous epitope peptides.
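The three chymotryptic peptides listed for Figure 13 can be checked for internal consistency: the sequence length should match the residue numbering, and the C-terminal residue should be one after which chymotrypsin typically cleaves. The sketch below performs this check; the set of cleavage residues (F, Y, W, L) is a common simplification and, like the function name, is an assumption rather than part of the described method.

```python
# Peptides from the chymotryptic elution fraction (Figure 13):
# (first residue, last residue, sequence)
PEPTIDES = [
    (108, 121, "ERNIEKISMLEKIY"),
    (246, 256, "VMKSPFNNRWY"),
    (257, 273, "QMGIVSWGEGCDRDGKY"),
]

# Simplified chymotrypsin specificity (assumed): cleavage after F, Y, W or L.
CHYMOTRYPSIN_SITES = set("FYWL")


def check_peptide(first, last, seq):
    """Return simple consistency checks for one annotated peptide."""
    return {
        "length_ok": len(seq) == last - first + 1,
        "c_term_ok": seq[-1] in CHYMOTRYPSIN_SITES,
        # Potential cleavage sites inside the peptide that were not cut.
        "internal_sites": sum(aa in CHYMOTRYPSIN_SITES for aa in seq[:-1]),
    }


if __name__ == "__main__":
    for first, last, seq in PEPTIDES:
        print(f"{first}-{last} {seq}: {check_peptide(first, last, seq)}")
```

All three peptides end in tyrosine and their lengths match the stated residue ranges; each also contains at least one uncleaved potential site under this simplified rule.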