Title:
REAGENTS AND METHODS FOR CROSS-LINKING BIOLOGICAL MOLECULES
Document Type and Number:
WIPO Patent Application WO/2007/113575
Kind Code:
A3
Abstract:
The invention provides novel reagents based on calixarene and particularly resorcinarene scaffolds. The reagents are useful as cross-linking reagents for investigating non-covalent interactions between biological macromolecules. Methods of using the novel reagents are also provided.

Inventors:
DOWDEN JAMES (GB)
WELHAM MELANIE JOANNE (GB)
Application Number:
PCT/GB2007/001277
Publication Date:
December 13, 2007
Filing Date:
April 05, 2007
Assignee:
UNIV BATH (GB)
DOWDEN JAMES (GB)
WELHAM MELANIE JOANNE (GB)
International Classes:
C07D207/46; A61K31/4015; A61K31/4188; C07D495/04
Domestic Patent References:
WO2006042104A22006-04-20
WO2001012586A12001-02-22
Foreign References:
DE10352466A12005-06-16
US20020134280A12002-09-26
Other References:
CHEN X ET AL: "SYNTHESIS, ANTIBIOTIC ACTIVITY, AND ANTIANGIOGENIC ACTIVITY OF CALIXARENE DERIVATIVES THAT ARE TOPOLOGICAL MIMETICS OF AMPHIPATHIC PEPTIDES", AMERICAN CHEMICAL SOCIETY. ABSTRACTS OF PAPER. AT THE NATIONAL MEETING, AMERICAN CHEMICAL SOCIETY, WASHINGTON, DC, US, vol. 229, no. PART 2, March 2005 (2005-03-01), pages V175, XP008065328, ISSN: 0065-7727
LAMARTINE ROGER ET AL: "Antimicrobial activity of calixarenes", 2002, COMPTES RENDUS - CHIMIE, ELSEVIER, PARIS, FR, PAGE(S) 163-169, ISSN: 1631-0748, XP002385707
Attorney, Agent or Firm:
FORREST, Graham et al. (York House, 23 Kingsway, Greater London, London WC2B 6HP, GB)
Claims:

1. A cross-linking reagent having the formula I

(I)

or formula II:

wherein each X is independently -H, -OH, -R, -O-R, or -O-Y-Z, where Z is a reactive functional group capable of reacting with a group found on a target molecule and Y is a spacer arm;

provided that at least two X groups are -O-Y-Z.

2. A cross-linking reagent according to claim 1 having the formula III:

(III)

or formula IV:

3. A cross-linking reagent according to claim 1 or claim 2 having the formula V:

(V)

or formula VI:

(VI)

4. A cross-linking reagent according to any one of claims 1 to 3 wherein none, one or two X groups have the formula -R or -O-R.

5. A cross-linking reagent according to any one of claims 1 to 4 wherein at least one X group attached to each phenyl group is -O-Y-Z.

6. A cross-linking reagent according to claim 5 wherein at least two X groups attached to each phenyl group are -O-Y-Z.

7. A cross-linking reagent according to claim 6 wherein all X groups are -O-Y-Z.

8. A cross-linking reagent according to any one of claims 1 to 7 wherein at least one Z group is an amine-reactive functional group.

9. A cross-linking reagent according to claim 8 wherein the amine-reactive functional group is an isothiocyanate, isocyanate, acyl azide, N-hydroxysuccinimide (NHS) ester, sulfonyl chloride, aldehyde, glyoxal, epoxide, carbonate, arylating agent, imidoester, carbodiimide or acid anhydride.

10. A cross-linking reagent according to any one of claims 1 to 7 wherein at least one Z group is a thiol-reactive functional group.

11. A cross-linking reagent according to claim 10 wherein the thiol-reactive functional group is a haloacetyl or alkyl halide derivative, maleimide, aziridine, acryloyl derivative or disulfide.

12. A cross-linking reagent according to any one of claims 1 to 7 wherein at least one Z group is a carboxylate-reactive functional group.

13. A cross-linking reagent according to claim 12 wherein the carboxylate-reactive functional group is a diazoalkane, diazoacetyl compound, diazoacetate ester, diazoacetamide, carbonyldiimidazole or carbodiimide.

14. A cross-linking reagent according to any one of claims 1 to 7 wherein at least one Z group is a hydroxyl-reactive functional group.

15. A cross-linking reagent according to claim 14 wherein the hydroxyl-reactive functional group is a carbonate.

16. A cross-linking reagent according to any one of claims 1 to 7 wherein at least one Z group is an aldehyde- or ketone-reactive functional group.

17. A cross-linking reagent according to claim 16 wherein the aldehyde- or ketone-reactive functional group is a hydrazine or amine.

18. A cross-linking reagent according to any one of claims 1 to 7 wherein at least one Z group is capable of reacting with a carbon atom carrying an active hydrogen atom.

19. A cross-linking reagent according to claim 18 wherein the group capable of reacting with a carbon atom carrying an active hydrogen atom is a diazonium derivative.

20. A cross-linking reagent according to any one of claims 1 to 7 wherein at least one Z group is a photoreactive group.

21. A cross-linking reagent according to claim 20 wherein the photoreactive group is an aryl azide, benzophenone or diazirine.

22. A cross-linking reagent according to any one of claims 1 to 7 wherein at least one Z group is a benzophenone, diazirine, aryl azide, hydrazide, alkyl halide, maleimide, epoxide, alkyne, phosphine, reactive ester, carbonate or anhydride.

23. A cross-linking reagent according to any one of the preceding claims wherein the spacer arms Y independently consist of or comprise saturated or unsaturated chains comprising one or more substituted or unsubstituted alkyl, alkene, aldehyde, ketone, alcohol or aryl groups, amino acids, or sugars, linked by carbon or heteroatom linkages such as ether, amine, amide, alkane, alkene, alkyne, thiol or ester linkages.

24. A cross-linking reagent according to any one of the preceding claims wherein each spacer arm Y independently comprises a linear chain of between 1 and 20 atoms between the reactive functional group and the respective phenol-derived oxygen of the calixarene ring.

25. A cross-linking reagent according to any one of the preceding claims wherein the spacer arms Y are independently straight-chain or branched.

26. A cross-linking reagent according to claim 25 wherein one or more spacer arms Y is branched or dendritic.

27. A cross-linking reagent according to claim 26 wherein said branched or dendritic spacer arm Y comprises two or more reactive functional groups Z.

28. A cross-linking reagent according to claim 27 wherein said branched or dendritic spacer arm Y carries a reactive functional group Z at each chain terminus.

29. A cross-linking reagent according to any one of the preceding claims wherein one or more spacer arms comprise a selectively cleavable group.

30. A cross-linking reagent according to claim 29 wherein the selectively cleavable group is cleaved by a chemical reaction, enzymatic reaction or irradiation.

31. A cross-linking reagent according to claim 29 or 30 wherein the selectively cleavable group is an alkene, disulfide, hydrazinobenzoic acid ester, amide or anhydride, diol, dithionite derivative, hydroxylamine-cleavable ester, base-labile sulfone derivative, hydrazide or photolabile group such as a nitrophenylethyl ether, ester, or amide.

32. A cross-linking reagent according to any one of claims 29 to 31 comprising one selectively cleavable group per reactive functional group.

33. A cross-linking reagent according to any of the preceding claims wherein all Y and Z groups are identical.

34. A cross-linking reagent according to claim 33 having the formula:

35. A cross-linking reagent according to any one of the preceding claims wherein the R groups independently comprise one or more substituted or unsubstituted alkyl, alkene, aldehyde, ketone, alcohol or aryl groups, amino acids, or sugars, linked by carbon or heteroatom linkages such as ether, amine, amide, alkane, alkene, alkyne, thiol or ester linkages.

36. A cross-linking reagent according to any one of the preceding claims wherein one or more R groups comprises a reactive functional group Z.

37. A cross-linking reagent according to claim 36 wherein said R group consists of or comprises -Y-Z, wherein Y is a spacer arm and Z is a reactive functional group.

38. A cross-linking reagent according to any one of the preceding claims wherein the R group comprises an affinity tag, radiolabel, spectrophotometrically detectable moiety or peptide.

39. A cross-linking reagent according to claim 38 wherein the spectrophotometrically detectable moiety is a fluorescent dye.

40. A cross-linking reagent according to claim 38 wherein the peptide comprises a sequence which is capable of targeting a protein molecule to a particular intracellular compartment or transducing an associated molecule across the plasma membrane.

41. A cross-linking reagent according to claim 40 wherein the peptide comprises a nuclear targeting sequence, an ER localisation sequence or Penetratin.

42. A cross-linking reagent according to claim 41 comprising the sequence PKKKRKV or KDEL.

43. A cross-linking reagent according to any one of the preceding claims wherein one or more R groups comprises a morpholine or arginine group.

44. A cross-linking reagent according to any of the preceding claims comprising one or more atoms of ¹³C, ¹⁴C, ¹⁸O, ²H or ³H.

45. A cross-linking reagent according to any of the preceding claims which is a calix[4]arene and has the configuration C4v or C2h.

46. A method of cross-linking two or more sites in a complex containing two or more components including a biological macromolecule, comprising contacting said complex with a cross-linking reagent according to any one of the preceding claims such that at least two reactive functional groups of said reagent react independently with components of said complex to form a cross-linked complex.

47. A method according to claim 46 wherein the biological macromolecule is a protein.

48. A method according to claim 47 wherein the complex further comprises at least one other protein, a nucleic acid, a protein cofactor, a modulator of protein activity or a drug molecule.

49. A method according to any one of claims 46 to 48 wherein a component of said complex is a predetermined target biological macromolecule.

50. A method according to any one of claims 46 to 49 wherein said complex is present in a cell.

51. A method according to any one of claims 46 to 49 wherein said complex is present in a cell-free system.

52. A method according to any one of claims 46 to 51 further comprising isolating said cross-linked complex.

53. A method according to claim 52 comprising isolating said cross-linked complex from unreacted reagent and/or from un-cross-linked complex, and/or from target biological macromolecules not part of a complex.

54. A method according to claim 52 further comprising isolating said cross-linked complex from one or more other biological macromolecules.

55. A method according to any one of claims 52 to 54 wherein said isolation comprises affinity purification.

56. A method according to claim 55 wherein said affinity purification comprises contacting said cross-linked complex with a binding partner capable of binding specifically to a component of the complex.

57. A method according to claim 56 wherein said affinity purification comprises contacting said cross-linked complex with a binding partner capable of binding specifically to the cross-linking reagent.

58. A method according to claim 57 wherein the component of the complex is a target biological macromolecule.

59. A method according to any one of claims 56 to 58 wherein the binding partner is an antibody, streptavidin, avidin, or a metal ion.

60. A method according to any one of claims 56 to 59 wherein the binding partner is immobilised on a solid phase.

61. A method according to any one of claims 52 to 60 wherein said isolation comprises purification on the basis of molecular mass, size (e.g. hydrodynamic radius) or charge.

62. A method according to claim 61 wherein said isolation comprises chromatography, SDS-PAGE or differential centrifugation.

63. A method according to claim 62 wherein said purification comprises HPLC.

64. A method according to any one of claims 46 to 63 comprising determining the molecular weight or size of the cross-linked complex.

65. A method according to any one of claims 46 to 64 comprising determining or verifying the identity of one or more components of the complex.

66. A method according to any one of claims 46 to 65 further comprising analysing said cross-linked complex by mass spectroscopy.

67. A method according to claim 66 comprising disrupting said cross-linked complex prior to mass spectroscopy.

68. A method according to claim 67 comprising contacting said cross-linked complex with a protease.

69. A method according to claim 67 or claim 68 wherein the reagent comprises a cleavable group and the method comprises selectively cleaving said cleavable group.

70. A method according to claim 69 wherein the cleavage is performed by a chemical or enzymatic reaction.

71. A method according to any one of claims 46 to 70 comprising acquiring data concerning molecular weight and/or charge of a component of the complex or a fragment thereof, and interrogating a suitable database to identify possible matches for the component or fragment thereof.

Description:

REAGENTS AND METHODS FOR CROSS-LINKING BIOLOGICAL MOLECULES

Field of the Invention

The present invention relates to the study of interactions between biological molecules, such as proteins and nucleic acids. It provides novel reagents which may be used to crosslink biological molecules which interact non-covalently and often transiently with one another, and so facilitate the identification and characterisation of such interactions.

Background to the Invention

Transient protein-protein interactions regulate a diversity of cellular responses.[1,2] Currently, it is difficult to predict such interactions a priori from sequence information. Thus, methods to characterize protein-protein interactions are of significant interest for cell biology,[3] as such characterization may enable the development of small molecule interventions and design of specific small molecule modulators of such interfaces.[4] A comparison of wide-scale studies of the Saccharomyces cerevisiae protein interactome,[5] including yeast two-hybrid ts ' sl and tandem affinity purification,[7,8] has highlighted poor overlap of data-sets and the need for an intersection of data derived from diverse techniques.[3,9] Furthermore, few methods reveal information about the topology of individual multimeric protein complexes.

Chemical cross-linking[10] provides covalent capture of transient protein-protein interactions and can facilitate topological analysis using mass spectrometry.[11,12] A limitation of chemical cross-linkers is that tether length and reactivity must be optimized for individual protein-protein complexes. Additionally, reagents do not discriminate between inter- versus intra-molecular links or non-productive modifications. A modular synthetic route was recently reported that offered rapid access to a versatile arsenal of cross-linkers.[13] However, individual compound evaluation is still necessary to select the optimum reagent for each system.

Summary of the Invention

A chemical cross-linker that does not require specific optimization for use with different target molecules or complex mixtures would be highly desirable. A further desirable feature would be to increase the likelihood of forming inter-molecular cross-links over intra-molecular links, to increase the efficiency of such a reagent in cross-linking different members of non-covalently associated complexes.

Calixarenes are cyclic oligomers derived from condensation reactions between an aldehyde and a phenol. Calixarenes typically contain either 6 phenyl units in a ring (referred to as calix[6]arenes) or 4 phenyl units in a ring (referred to as calix[4]arenes). The phenol component is typically phenol, resorcinol (1,3-benzenediol) or pyrogallol (1,2,3-benzenetriol) but could also be catechol (1,2-benzenediol), or a derivative of any of these.

For example, a reaction between the aldehyde RCHO and resorcinol provides a resorcinarene (calix[4]arene) as follows:

Calixarenes have a lower (or inner) rim and an upper (or outer) rim. In the resorcinarene formula above, the lower rim is shown as the interior of the macrocyclic ring, and the upper rim is that carrying the eight hydroxyl groups. The same nomenclature is applied to other calixarenes.

Where the R groups are comparatively small, the molecule can be relatively flexible, with a degree of rotational freedom for the phenyl groups. However, when the R groups are larger, they restrict rotation and tend to lock the molecule into a particular configuration.

The present inventors have realised that such a rigid calixarene core would provide a well-defined geometrical structure for a cross-linking reagent, which could display multiple functional groups (e.g. groups reactive with various kinds of biological macromolecules) over a large surface area by tethering them to the hydroxy groups of the calixarene via spacer arms. By offering multiple copies of reactive functionality in an organised arrangement, such a structure should bias cross-linking reactions toward inter-molecular links between proteins and thus lead to greater cross-linking efficiency.

Furthermore, careful selection of the aldehyde or aldehydes incorporated into the calixarene synthesis allows the properties of the cross-linking reagent to be finely tailored.

Thus, the invention provides a molecule having the formula I:

(I)

or formula II:

Thus formula I shows a calix[4]arene and formula II shows a calix[6]arene.

The invention further provides compounds having formula III

(III)

or formula IV:

Calixarenes based on resorcinol have been found to be particularly suitable. Thus, in certain embodiments, the invention provides compounds of formula V:

(V)

or formula VI:

(VI)

In all formulae above, the R groups are the same or different. They can be independently chosen according to the properties desired for the molecule, and will be described in more detail below.

The X groups are also the same or different and may independently be -H, -OH, or may have the formula -O-Y-Z, where Z is a terminal functional group and Y is a spacer arm. The functional group is preferably a reactive functional group, capable of reacting with a group found on a target molecule. Alternatively, one or more X groups may have the formula -R or -O-R. Preferably none, one or two X groups per molecule have the formula -R or -O-R.

The compounds described herein are used as cross-linking reagents. They may be used to form intra-molecular crosslinks (by linking groups found on the same target molecule) and/or inter-molecular links (by linking groups found on two or more non-covalently associated target molecules). Thus the compounds of the invention must carry at least two such reactive functional groups. Although R groups may carry reactive functional groups (see below), preferably at least two X groups comprise reactive functional groups and therefore have the formula -O-Y-Z.

Preferably at least one X group on each phenyl group in the calixarene ring has the formula -O-Y-Z. In some embodiments at least two X groups on each phenyl group in the calixarene ring have the formula -O-Y-Z. Optionally, all X groups have the formula -O-Y-Z.

Preferably all X groups in formulae V and VI represent the formula -O-Y-Z.
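The conditions on the X groups set out above can be read as a simple checklist. The following Python sketch is illustrative only: the function name `check_x_groups`, the string labels, and the representation of a reagent as a list of phenyl units each carrying a list of X groups are assumptions made for this example and are not part of the specification.

```python
from typing import Dict, List

REACTIVE = "O-Y-Z"  # hypothetical label: X group with spacer arm Y and reactive group Z
ALLOWED = {"H", "OH", "R", "O-R", REACTIVE}


def check_x_groups(phenyl_units: List[List[str]]) -> Dict[str, bool]:
    """Report which of the stated X-group conditions a reagent satisfies.

    phenyl_units holds one inner list per phenyl ring, listing its X groups.
    """
    flat = [x for unit in phenyl_units for x in unit]
    unknown = [x for x in flat if x not in ALLOWED]
    if unknown:
        raise ValueError(f"unrecognised X group labels: {unknown}")
    n_reactive = flat.count(REACTIVE)
    n_r = sum(1 for x in flat if x in ("R", "O-R"))
    return {
        # required: at least two X groups are -O-Y-Z
        "at_least_two_reactive": n_reactive >= 2,
        # preferred: none, one or two X groups are -R or -O-R
        "at_most_two_R_or_OR": n_r <= 2,
        # preferred: at least one -O-Y-Z on each phenyl group
        "one_reactive_per_phenyl": all(REACTIVE in unit for unit in phenyl_units),
        # optional: every X group is -O-Y-Z
        "all_reactive": n_reactive == len(flat),
    }
```

For instance, a resorcinarene modelled as four phenyl units with X groups `["O-Y-Z", "OH"]` on each would satisfy the first three conditions but not the last.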

Typical target molecules are biological macromolecules such as peptides, proteins and nucleic acids (DNA or RNA).

In this specification the term "peptide" is used to refer to peptides of less than 100 amino acids, and "polypeptide" or "protein" for a molecule of 100 amino acids or more. However, reference to one should be taken to include the other unless the context specifically demands otherwise.

Other target molecules include anything capable of forming a non-covalent complex with a biological macromolecule such as a protein, peptide, or nucleic acid. These include other organic molecules such as drug molecules, modulators of protein activity, isolated amino acids, nucleotides or nucleosides, protein cofactors such as heme groups, vitamins, etc.

Proteins and peptides display various chemical groups which can be targeted by cross-linking reagents, such as free amine groups (e.g. in lysine side chains and at the N-terminus), free hydroxy groups (e.g. in serine and threonine side chains), thiol groups (cysteine side chains) and carboxyl groups (aspartate and glutamate side chains, and the C-terminus of the molecule). Suitable reactive functional groups are well-known in the art; see e.g. Bioconjugate Techniques, by Greg T Hermanson; ISBN 0-12-342336-8.

Thus amine-reactive functional groups which may be used in the cross-linking reagents described here include isothiocyanates, isocyanates, acyl azides, N-hydroxysuccinimide (NHS) esters, sulfonyl chlorides, aldehydes, glyoxals, epoxides, carbonates, arylating agents, imidoesters, carbodiimides and acid anhydrides.

Thiol-reactive functional groups which may be used in the cross-linking reagents described here include haloacetyl and alkyl halide derivatives, maleimides, aziridines, acryloyl derivatives, and disulfides, which can participate in thiol-disulfide exchange reactions.

Carboxylate-reactive functional groups which may be used in the cross-linking reagents described here include diazoalkanes and diazoacetyl compounds (diazoacetate esters and diazoacetamides), carbonyldiimidazoles and carbodiimides.

Hydroxyl-reactive functional groups which may be used in the cross-linking reagents described here include carbonates.

Aldehyde- or ketone-reactive functional groups which may be used in the cross-linking reagents described here include hydrazines and amines. These groups may be useful for targeting carbohydrates (e.g. those found in protein glycosylation) and non-naturally occurring amino acids.

Reactive functional groups which may be used to react with carbon atoms carrying "active hydrogen" atoms include diazonium derivatives. These may be used, inter alia, for reaction with carbon atoms carrying hydrogen in aromatic rings by electrophilic addition.

The reactive functional group may not be capable of reacting with a target molecule until a particular stimulus is applied. For example, certain groups known generally as photoreactive groups will only react on irradiation, typically with visible or UV irradiation. Lasers are often used to deliver the radiation because of their ability to deliver high-intensity monochromatic radiation at a very closely defined location. Photoreactive groups include aryl azides, benzophenones and diazirine compounds. Photoactivation leads to formation of reactive free radical groups, which can react with various groups on the target molecule. They tend to be less specific in their reaction than the other reactive functional groups described above, but normally react with alkyl and aryl groups on the target molecule. They can be particularly useful for cross-linking nucleic acids, either to other nucleic acids or to other molecules such as proteins.

Particular selected examples of reactive functional groups include benzophenone and diazirine, aryl azides, hydrazides, alkyl halides and maleimides, epoxides, alkynes, phosphines, reactive esters, carbonates and anhydrides.

The spacer arms Y may be straight-chain or branched groups which are O-linked to the phenyl group at one end and linked to the reactive functional group at the other.

The spacer arms may consist of or comprise saturated or unsaturated chains of any desired length comprising, for example, one or more substituted or unsubstituted alkyl, alkene, aldehyde, ketone, alcohol or aryl groups, amino acids, or sugars, which may be linked by carbon or heteroatom (O, N, S, etc.) linkages such as ether, amine, amide, alkane, alkene, alkyne, thiol or ester linkages, or any other suitable linkage known to the skilled person.

There will typically be a linear chain of between 1 and 20 atoms between the reactive functional group and the respective phenol-derived oxygen of the calixarene ring to which it is linked. The linear chain may be 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 atoms in length, or more depending on the circumstances.

Thus the spacer arms Y may comprise or consist of any one of the following:

C1-10 alkyl, optionally substituted with one or more substituents as defined herein, e.g. a group which is a substituted or unsubstituted C1-10 alkyl, C1-10 haloalkyl, C1-10 hydroxyalkyl, C1-10 carboxyalkyl, C1-10 aminoalkyl group;

C1-10 cycloalkyl, optionally substituted with one or more substituents as defined herein;

C1-10 cycloalkyl-C1-10 alkyl, optionally substituted with one or more substituents as defined herein;

C5-20 aryl, optionally substituted with one or more substituents as defined herein, e.g. C5-20 carboaryl or C5-20 heteroaryl;

C1-10 alkyl-C5-20 aryl and C5-20 haloaryl, optionally substituted with one or more substituents as defined herein;

C5-20 aryl-C1-10 alkyl, optionally substituted with one or more substituents as defined herein;

C3-20 heterocyclyl, optionally substituted with one or more substituents as defined herein.

For example, a spacer arm may comprise the formula -OCH2COO-.

Where the spacer arm is branched, it may carry two or more reactive functional groups. Hyperbranched or dendritic architectures are possible. Dendritic polymers are polymers derived from branched monomers attached to a central core. Thus each successive addition of monomer (referred to as a generation) results in an increase in the number of terminal groups, as the number of branches increases. Commonly used dendrimers include polyamidoamines, polyamines, polyamides, poly(aryl ethers), polyesters and carbohydrates. (See Lee et al. 2005, Nature Biotech 23(12), 1517 for a review, and references cited therein.) Dendrimeric spacer arms may therefore carry a plurality of reactive functional groups, to a maximum of one per chain terminus.
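The growth in terminal groups with each generation can be illustrated numerically. The Python sketch below assumes an idealised, defect-free dendron in which every chain terminus gains one monomer per generation and each monomer splits into a fixed number of new branches; the function name `terminal_groups` and its parameters are hypothetical and chosen only for this illustration.

```python
def terminal_groups(core_arms: int, branching: int, generations: int) -> int:
    """Number of chain termini after a given number of generations.

    core_arms   -- chains leaving the attachment point (1 for a single spacer arm)
    branching   -- new branches created by each monomer (assumed constant)
    generations -- number of successive monomer additions
    """
    if core_arms < 1 or branching < 1 or generations < 0:
        raise ValueError("core_arms and branching must be >= 1, generations >= 0")
    # Each generation multiplies the number of termini by the branching factor.
    return core_arms * branching ** generations


# A single spacer arm built from doubly branching monomers over three generations
# has 1 * 2**3 = 8 termini, and so could carry at most 8 reactive functional groups.
print(terminal_groups(core_arms=1, branching=2, generations=3))
```

Under these assumptions the upper limit of one reactive functional group per chain terminus grows geometrically with the number of generations; real dendrimers may fall short of this ideal count.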

One Y group may also be connected to two or more phenyl groups within the calixarene ring, as shown in the following illustration:

Y = alkyl, aryl or other bridging atoms

This may be achieved using Y groups such as alkyl, polyethylene glycol, polyalkyl amine, enone derivatives and polyamides, or by olefin polymerisation.

The spacer arms may contain selectively cleavable groups, which can be cleaved e.g. by a suitable chemical reaction, enzymatic reaction or irradiation. Such groups facilitate the analysis of complexes cross-linked by the reagents described because the residues of reactive groups bound to target molecules can be detached from the calixarene core and separated from one another. Thus a spacer arm may contain a selectively cleavable group such as an alkene, disulfide, hydrazinobenzoic acid derivative (e.g. ester, amide or anhydride), diol or other group cleavable on contact with periodate, dithionite derivative, hydroxylamine-cleavable ester, base-labile sulfone derivative, hydrazide or photolabile group such as a nitrophenylethyl ether, ester, or amide. Again, these groups are exemplary only. The skilled person will be aware of other alternatives which may be used.

Any given spacer arm may contain two or more different types of selectively cleavable group if desired.

Some or all spacer arms may contain selectively cleavable groups as described above. Branched spacer arms having two or more terminal reactive functional groups Z may contain one or more selectively cleavable groups for each reactive functional group Z, so that each can be cleaved from the calixarene core structure and separated from other groups carried by the same spacer arm. It is therefore possible to disrupt a cross-linking reagent so that each reactive functional group (or each residue of a reactive functional group bound to a target molecule) may be separated from all other reactive functional groups or residues thereof. That is to say, so that none of the reactive functional groups or residues thereof remain covalently linked to any other such group or residue, either via the calixarene core, or via a branched spacer arm or portion thereof.

Individual Y and Z groups in any one calixarene molecule may be the same or different. Thus one molecule may contain spacer arms of two or more different lengths, and of different compositions. Different cross-linking applications may require different length spacer arms for optimal results.

Furthermore, any one cross-linking reagent may comprise more than one type of reactive functional group, e.g. groups capable of reacting with different target groups on target molecules. For example, a single resorcinarene-based reagent having 8 spacer arms may have 4 groups capable of reacting with amine groups (such as NHS groups) and 4 photoreactive groups.

Where two types of reactive functional group are used, the reagent may have an [(AB)n] configuration in which spacer arms having different groups alternate around the calixarene ring. This could be achieved during initial synthesis, or by later modification of the upper rim.

Alternatively, an [(A)m(B)n] configuration might be used. This would provide different binding functionalities at the two opposite hemispheres of the calixarene molecule.
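The difference between the two arrangements described above, one alternating and one placing each type of group in its own hemisphere, can be made concrete with a small Python sketch. It treats the spacer arms simply as an ordered, cyclic list of positions around the ring, which is a simplification assumed for illustration; the function names `alternating`, `hemispheric` and `unlike_neighbours` are likewise hypothetical.

```python
from typing import List


def alternating(n: int) -> List[str]:
    """Arms bearing groups A and B alternate around the ring, as in [(AB)n]."""
    return ["A", "B"] * n


def hemispheric(m: int, n: int) -> List[str]:
    """m consecutive A arms followed by n consecutive B arms."""
    return ["A"] * m + ["B"] * n


def unlike_neighbours(ring: List[str]) -> int:
    """Count adjacent positions carrying different groups, treating the list as cyclic."""
    size = len(ring)
    return sum(1 for i in range(size) if ring[i] != ring[(i + 1) % size])


# Eight spacer arms, four of each type (e.g. four amine-reactive, four photoreactive).
print(alternating(4), unlike_neighbours(alternating(4)))
print(hemispheric(4, 4), unlike_neighbours(hemispheric(4, 4)))
```

With eight arms, the alternating layout places unlike groups next to each other at every position, whereas the hemispheric layout has only two A/B boundaries, so each half of the ring presents a single type of functionality.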

An example of a structure in which all reactive functional groups and spacer arms are the same is as follows:

(VII)

The R groups in any given molecule may also be the same or different. Using two aldehydes having different R groups in the original synthesis of the calixarene ring will result in a calixarene having both types of R group. Alternatively one or more R groups can be selectively modified after calixarene synthesis.

Any desired group may be used as an R group, as long as it does not interfere with the cross-linking function of the molecule. Thus the R groups may comprise one or more substituted or unsubstituted alkyl, alkene, aldehyde, ketone, alcohol or aryl groups, amino acids, sugars, etc. which may be linked by carbon or heteroatom (O, N, S, etc.) linkages such as ether, amine, amide, alkane, alkene, alkyne, thiol or ester linkages, or any other suitable linkage known to the skilled person.

Thus each R group may comprise or consist of any one of the following:

C1-10 alkyl, optionally substituted with one or more substituents as defined herein, e.g. a group which is a substituted or unsubstituted C1-10 alkyl, C1-10 haloalkyl, C1-10 hydroxyalkyl, C1-10 carboxyalkyl, C1-10 aminoalkyl group;

C1-10 cycloalkyl, optionally substituted with one or more substituents as defined herein;

C1-10 cycloalkyl-C1-10 alkyl, optionally substituted with one or more substituents as defined herein;

C5-20 aryl, optionally substituted with one or more substituents as defined herein, e.g. C5-20 carboaryl or C5-20 heteroaryl;

C1-10 alkyl-C5-20 aryl and C5-20 haloaryl, optionally substituted with one or more substituents as defined herein;

C5-20 aryl-C1-10 alkyl, optionally substituted with one or more substituents as defined herein;

C3-20 heterocyclyl, optionally substituted with one or more substituents as defined herein.

As indicated above, the compounds of the present invention (and particularly the Y and R groups) may be unsubstituted or substituted by one or more functional groups. Unless otherwise specified, the term "substituted" means a parent group which bears one or more substituents. The term "substituent" is used herein in the conventional sense and refers to a chemical moiety which is covalently attached to, appended to, or if appropriate, fused to, a parent group. A wide variety of substituents are well known in the art, and methods for their formation and introduction into a variety of parent groups are also well known.

In the present invention, "aromatic substituents" as defined herein are independently selected from hydrogen, -F, -Cl, -Br, -I, -OH, -OMe, -OEt, -SH, -SMe, -SEt, -C(=O)Me, -C(=O)OH, -C(=O)OMe, -CONH2, -CONHMe, -NH2, -NMe2, -NEt2, -N(nPr)2, -N(iPr)2, -CN, -NO2, -Me, -Et, -CF3, -OCF3, -CH2OH, -CH2CH2OH, -CH2NH2, -CH2CH2NH2, -Ph; ether (e.g., C1-7 alkoxy); ester; amido; amino; and C1-7 alkyl (including, e.g., unsubstituted C1-7 alkyl, C1-7 haloalkyl, C1-7 hydroxyalkyl, C1-7 carboxyalkyl, C1-7 aminoalkyl, C5-20 aryl-C1-7 alkyl).

In the present invention, "substituents" as defined herein are independently selected from hydrogen; halo; hydroxy; oxo; ether (e.g., C1-7 alkoxy); formyl; acyl (e.g., C1-7 alkylacyl, C5-20 arylacyl); acylhalide; carboxy; ester; acyloxy; amido; acylamido; thioamido; tetrazolyl; amino; nitro; nitroso; azido; cyano; isocyano; cyanato; isocyanato; thiocyano; isothiocyano; sulfhydryl; thioether (e.g., C1-7 alkylthio); sulfonic acid; sulfonate; sulfone; sulfonyloxy; sulfinyloxy; sulfamino; sulfonamino; sulfinamino; sulfamyl; sulfonamido; C1-7 alkyl (including, e.g., unsubstituted C1-7 alkyl, C1-7 haloalkyl, C1-7 hydroxyalkyl, C1-7 carboxyalkyl, C1-7 aminoalkyl, C5-20 aryl-C1-7 alkyl); C3-20 heterocyclyl (including C5-6 heterocyclyl); or C5-20 aryl (including, e.g., C5-20 carboaryl, C5-20 heteroaryl, C1-7 alkyl-C5-20 aryl and C5-20 haloaryl, and especially C5-6 aryl).

In one preferred embodiment, the substituent(s) are independently selected from:

-F, -Cl, -Br and -I;

=O;

-OH;

-OMe, -OEt, -O(tBu) and -OCH2Ph;

-SH;

-SMe, -SEt, -S(tBu) and -SCH2Ph;

-C(=O)H;

-C(=O)Me, -C(=O)Et, -C(=O)(tBu) and -C(=O)Ph;

-C(=O)OH;

-C(=O)OMe, -C(=O)OEt and -C(=O)O(tBu);

-C(=O)NH2, -C(=O)NHMe, -C(=O)NMe2 and -C(=O)NHEt;

-NHC(=O)Me, -NHC(=O)Et, -NHC(=O)Ph, succinimidyl and maleimidyl;

-NH2, -NHMe, -NHEt, -NH(iPr), -NH(nPr), -NMe2, -NEt2, -N(iPr)2, -N(nPr)2, -N(nBu)2 and -N(tBu)2;

-CN;

-NO2;

-Me, -Et, -nPr, -iPr, -nBu and -tBu;

-CF3, -CHF2, -CH2F, -CCl3, -CBr3, -CH2CH2F, -CH2CHF2 and -CH2CF3;

-OCF3, -OCHF2, -OCH2F, -OCCl3, -OCBr3, -OCH2CH2F, -OCH2CHF2 and -OCH2CF3;

-CH2OH, -CH2CH2OH and -CH(OH)CH2OH;

-CH2NH2, -CH2CH2NH2 and -CH2CH2NMe2; and,

substituted or unsubstituted phenyl.

For phenyl substituents, if the phenyl group has less than the full complement of substituents, they may be arranged in any combination. For example, if the phenyl group has a single substituent other than hydrogen, it may be in the 2-, 3-, or 4-position. Similarly, if the phenyl group has two substituents other than hydrogen, they may be in the 2,3-, 2,4-, 2,5-, 2,6-, 3,4-, or 3,5-positions. If the phenyl group has three substituents other than hydrogen, they may be in, for example, the 2,3,4-, 2,3,5-, 2,3,6-, 2,4,5-, 2,5,6-, or 3,4,5-positions. If the phenyl group has four substituents other than hydrogen, they may be in, for example, the 3,4,5,6-, 2,4,5,6-, 2,3,5,6-, 2,3,4,6-, or 2,3,4,5-positions.
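By way of illustration only, the positional alternatives for one or two substituents can be enumerated programmatically. The following Python sketch is not part of the described methods; it assumes that the phenyl group is attached through position 1, that all substituents are identical, and that positions related by the ring's mirror symmetry (2 with 6, 3 with 5) are equivalent. The function name `phenyl_patterns` is hypothetical.

```python
from itertools import combinations


def phenyl_patterns(n_substituents):
    """List distinct substitution patterns on a phenyl ring attached at position 1.

    Assumes identical substituents, so that a pattern and its mirror image
    (position p mapped to 8 - p) count as the same arrangement.
    """
    patterns = set()
    for combo in combinations(range(2, 7), n_substituents):
        mirrored = tuple(sorted(8 - p for p in combo))
        # Keep the lexicographically smaller of the pattern and its mirror.
        patterns.add(min(combo, mirrored))
    return sorted(patterns)


if __name__ == "__main__":
    for n in (1, 2):
        print(n, phenyl_patterns(n))
```

Under these assumptions the sketch reproduces the single- and double-substituent positions listed above. Where the substituents differ from one another, mirror-related patterns are no longer equivalent and more arrangements arise.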

In one preferred embodiment, the substituent(s) are independently selected from:

-OH;

=O;

-OMe, -OEt, -O(tBu) and -OCH2Ph;

-C(=O)OMe, -C(=O)OEt and -C(=O)O(tBu);

-C(=O)NH2, -C(=O)NHMe, -C(=O)NMe2 and -C(=O)NHEt;

-NH2, -NHMe, -NHEt, -NH(iPr), -NH(nPr), -NMe2, -NEt2, -N(iPr)2, -N(nPr)2, -N(nBu)2 and -N(tBu)2;

-Me, -Et, -nPr, -iPr, -nBu and -tBu;

-CF3, -CHF2, -CH2F, -CCl3, -CBr3, -CH2CH2F, -CH2CHF2 and -CH2CF3;

-CH2OH, -CH2CH2OH and -CH(OH)CH2OH; and,

-CH2NH2, -CH2CH2NH2 and -CH2CH2NMe2.

The compounds of the invention may be derivatised in various ways. As used herein "derivatives" of the compounds includes well known ionic, salt, solvate and protected forms of the compounds or their substituents mentioned herein. For example, a reference to carboxylic acid (-COOH) also includes the anionic (carboxylate) form (-COO⁻), a salt or solvate thereof, as well as conventional protected forms. Similarly, a reference to an amino group includes the protonated form (-N⁺HR¹R²), a salt or solvate of the amino group, for example, a hydrochloride salt, as well as conventional protected forms of an amino group. Similarly, a reference to a hydroxyl group also includes the anionic form (-O⁻), a salt or solvate thereof, as well as conventional protected forms.

Certain compounds may exist in one or more particular geometric, optical, enantiomeric, diastereomeric, epimeric, atropic, stereoisomeric, tautomeric, conformational, or anomeric forms, including but not limited to, cis- and trans-forms; E- and Z-forms; c-, t-, and r-forms; endo- and exo-forms; R-, S-, and meso-forms; D- and L-forms; d- and l-forms; (+) and (-) forms; keto-, enol-, and enolate-forms; syn- and anti-forms; synclinal- and anticlinal-forms; α- and β-forms; axial and equatorial forms; boat-, chair-, twist-, envelope-, and halfchair-forms; and combinations thereof, collectively referred to as "isomers" (or "isomeric forms").

Note that, except as discussed below for tautomeric forms, specifically excluded from the term "isomers", as used herein, are structural (or constitutional) isomers (i.e., isomers which differ in the connections between atoms rather than merely by the position of atoms in space). For example, a reference to a methoxy group, -OCH3, is not to be construed as a reference to its structural isomer, a hydroxymethyl group, -CH2OH. Similarly, a reference to ortho-chlorophenyl is not to be construed as a reference to its structural isomer, meta-chlorophenyl. However, a reference to a class of structures may well include structurally isomeric forms falling within that class (e.g., C1-7 alkyl includes n-propyl and iso-propyl; butyl includes n-, iso-, sec-, and tert-butyl; methoxyphenyl includes ortho-, meta-, and para-methoxyphenyl).

The above exclusion does not pertain to tautomeric forms, for example, keto-, enol-, and enolate-forms, as in, for example, the following tautomeric pairs: keto/enol (illustrated below), imine/enamine, amide/imino alcohol, amidine/amidine, nitroso/oxime, thioketone/enethiol, N-nitroso/hydroxyazo, and nitro/aci-nitro.

[Scheme: keto, enol and enolate forms]

Note that specifically included in the term "isomer" are compounds with one or more isotopic substitutions. For example, H may be in any isotopic form, including ¹H, ²H (D), and ³H (T); C may be in any isotopic form, including ¹²C, ¹³C, and ¹⁴C; O may be in any isotopic form, including ¹⁶O and ¹⁸O; and the like.

Unless otherwise specified, a reference to a particular compound includes all such isomeric forms, including (wholly or partially) racemic and other mixtures thereof. Methods for the preparation (e.g. asymmetric synthesis) and separation (e.g., fractional crystallisation and chromatographic means) of such isomeric forms are either known in the art or are readily obtained by adapting the methods taught herein, or known methods, in a known manner.

It may be convenient or desirable to prepare, purify, and/or handle a corresponding salt of the active compound.

For example, if the compound is anionic, or has a functional group which may be anionic (e.g., -COOH may be -COO⁻), then a salt may be formed with a suitable cation. Examples of suitable inorganic cations include, but are not limited to, alkali metal ions such as Na⁺ and K⁺, alkaline earth cations such as Ca²⁺ and Mg²⁺, and other cations such as Al³⁺. Examples of suitable organic cations include, but are not limited to, ammonium ion (i.e., NH4⁺) and substituted ammonium ions (e.g., NH3R⁺, NH2R2⁺, NHR3⁺, NR4⁺). Examples of some suitable substituted ammonium ions are those derived from: ethylamine, diethylamine, dicyclohexylamine, triethylamine, butylamine, ethylenediamine, ethanolamine, diethanolamine, piperazine, benzylamine, phenylbenzylamine, choline, meglumine, and tromethamine, as well as amino acids, such as lysine and arginine. An example of a common quaternary ammonium ion is N(CH3)4⁺.

If the compound is cationic, or has a functional group which may be cationic (e.g., -NH2 may be -NH3⁺), then a salt may be formed with a suitable anion. Examples of suitable inorganic anions include, but are not limited to, those derived from the following inorganic acids: hydrochloric, hydrobromic, hydroiodic, sulfuric, sulfurous, nitric, nitrous, phosphoric, and phosphorous.

Examples of suitable organic anions include, but are not limited to, those derived from the following organic acids: 2-acetoxybenzoic, acetic, ascorbic, aspartic, benzoic, camphorsulfonic, cinnamic, citric, edetic, ethanedisulfonic, ethanesulfonic, fumaric, glucoheptonic, gluconic, glutamic, glycolic, hydroxymaleic, hydroxynaphthalene carboxylic, isethionic, lactic, lactobionic, lauric, maleic, malic, methanesulfonic, mucic, oleic, oxalic, palmitic, pamoic, pantothenic, phenylacetic, phenylsulfonic, propionic, pyruvic, salicylic, stearic, succinic, sulfanilic, tartaric, toluenesulfonic, and valeric.

Illustrative R groups are shown in the examples below, and include -C6H4(OCH2CH2)2OCH3, -C6H4(OCH2CH2)3OCH3, -C6H4O(CH2)3CH3, -(CH2)2CHO, -(CH2)4CH3 and -(CH2)2CH2OH.

Any R group may consist of or comprise a reactive functional group as described above. Thus one or more R groups (e.g. 1, 2, 3, 4, 5, or 6 R groups) in a given molecule may consist of or comprise the formula -Y-Z as defined above in relation to the X groups. In general, though, the R groups will not normally carry reactive functional groups of the type carried by the X groups.

R groups may also carry other functional groups. Examples include affinity tags, such as biotin, hexahistidine tags, or epitope peptides, which may be used to aid detection and/or purification of cross-linked complexes.

Additionally or alternatively, other groups to aid detection of the cross-linked complex may be incorporated. These include tags such as radiolabels and spectrophotometrically detectable moieties such as fluorescent dyes (BODIPY®, fluorescein, rhodamine, etc.). Of course, if binding agents (such as antibodies) specific for these groups are available, then they may be used in purification if required.

Peptide sequences which are capable of targeting protein molecules to particular intracellular compartments (such as nuclear, endoplasmic reticulum or Golgi targeting sequences) may also be added, in order to target the cross-linking reagent to those compartments. An example of a nuclear localisation sequence is PKKKRKV (derived from the SV40 large T antigen). An example of an ER retention sequence is KDEL, which will target proteins to the KDEL receptor on the endoplasmic reticulum. Other moieties which direct molecules into cells or to particular sub-cellular compartments may also be used. For example, certain peptides are known which are capable of crossing the plasma membrane and can also transduce associated (e.g. covalently attached) molecules across the plasma membrane, such as the 16 amino acid peptide named "Penetratin", which is derived from Antennapedia protein.

For example, some of the examples below utilise a monohydroxy resorcinarene in which 3 of the R groups are identical and the other contains a free OH group, which is readily derivatised by conventional chemistries. The examples show incorporation of a biotin tag and a fluorescein tag.

R groups can also be derivatised as required by rapid and conventional techniques such as transamidation, "click" chemistry, and native Staudinger ligation. "Click" chemistry is a set of reactions which are normally used for rapid modification of molecules, particularly biological molecules. The 1,3 dipolar addition of azides to alkynes is one example of click chemistry:

[Scheme: azide R1-N=N=N + alkyne ≡-R2, in the presence of Cu, giving the triazole bearing R1 and R2]

Native Staudinger ligation involves the addition of phosphines to azides, then intramolecular transfer to form a stable amide; e.g.

Native Staudinger ligation has been proposed for assembly of proteins and peptides.

An R group which consists of or comprises a peptide may be further derivatised by peptide ligation techniques, e.g. by intein-mediated ligation.

The inventors have found that suitable choice of R groups enables the compounds of the invention to cross the plasma membranes of living cells, which is highly desirable as it allows the trapping of interactions in a cellular environment without disrupting the cells. This may be particularly useful for trapping transient and/or low affinity interactions which may not survive cell lysis or may not take place in cell-free systems.

Amine functional groups, such as morpholine derivatives, may be desirable, as these may be protonated in physiological milieu (which may improve solubility), but may also cross membranes.

Where the reagents are not required to cross biological membranes, solubilising groups, particularly charged groups may be useful to increase compound solubility. Arginine derivatives are particularly interesting because they are known to have cell penetrating properties.

Thus the R groups may have, or may comprise, the formula -Y-Z', where Y is a spacer as defined above in relation to X groups, and Z' is a reactive functional group Z or is an alternative functional group such as an affinity tag, a spectrophotometrically-detectable moiety such as a fluorescent dye, a radiolabel, a peptide (e.g. of 2 to 50 amino acids, 2 to 30 amino acids, 2 to 20 amino acids or 2 to 10 amino acids), etc.

Radioactive or non-radioactive isotope tags may be incorporated into R groups (or elsewhere in the molecule) to assist in mass spectroscopic analysis of the resulting complexes. Unusual isotopes are readily identifiable by mass spectroscopy and can provide useful information about the number of cross-linking reagents attached to a given protein or fragment. Thus the molecule may comprise one or more atoms of, e.g., ¹³C, ¹⁴C, ¹⁸O, ²H or ³H.

Calixarenes exist in different configurations, depending on the orientation of the R groups. For example, calix[4]arenes (such as the resorcinarenes described in the examples) exist in the two configurations:

[Structures: C4v and C2h configurations]

where the black circles represent the R groups.

The orientation of the spacer arms and hence the reactive functional groups also varies depending on the configuration. Particular configurations may be preferred for particular applications.

As will be apparent from the description above, the compounds described in this specification find use as cross-linking agents, particularly for the study of structures of biological macromolecules, interactions between biological macromolecules, and also interactions between biological macromolecules and other molecules.

Thus the invention provides a method of cross-linking two or more sites in a complex containing two or more components including at least one biological macromolecule, comprising contacting said complex with a cross-linking reagent as described above such that at least two reactive functional groups of said reagent react independently with components of said complex to form a cross-linked complex.

The complex consists of two or more molecules non-covalently associated with one another, e.g. by hydrophobic and/or electrostatic interactions. Each individual molecule present in the complex is referred to as a component. The complex comprises at least one biological macromolecule, which is normally a peptide or protein, although it may also be a nucleic acid. It may be non-covalently complexed with one or more other peptides or proteins, or with other non-peptide molecules. Thus the complex may comprise two or more copies of said target protein. The complex may further comprise at least one other protein, a nucleic acid, a protein cofactor, a modulator of protein activity or a drug molecule.

The complex may comprise a predetermined target biological macromolecule. That is to say, cross-linking may be performed with the specific intention of identifying molecules which associate non-covalently with a predetermined biological macromolecule. However, it is not always necessary to look for binding partners of a predetermined molecule. It is also possible to perform cross-linking in a system containing a plurality of biological macromolecules and subsequently to identify components of any cross-linked complexes obtained without prior knowledge of any components of the complex.

The complex to be cross-linked by the reagents described here may be present in an intact cell, as some of the cross-linking reagents described are capable of crossing a cell's plasma membrane. Alternatively, it may be present in a cell-free solution, such as a cell lysate (which may contain intact cellular organelles such as nuclei, endoplasmic reticulum or Golgi vesicles, mitochondria, etc.) or a reconstituted system containing only selected target molecules. The method typically comprises contacting a solution containing said complex with said reagent.

The inventors believe that the geometry of the cross-linker compounds described in this specification makes them particularly suitable for cross-linking of complexes containing more than two non-covalently associated protein subunits. The relatively large rigid core structure tends to prevent the compound from penetrating the interior of most proteins. As a result, the compound is likely to be restricted to the exterior of a protein complex and so may be less prone to forming purely intramolecular links than many commercially available bifunctional cross-linking reagents.

In addition, because the compounds preferably contain a large number of reactive functional groups, on tether arms spaced around the core, one molecule is well-adapted to cross-link more than two different components of a complex. This would require two or more molecules of a conventional bifunctional reagent having only two reactive functional groups. Therefore cross-linking studies can use lower concentrations of the compounds described herein than of many conventional reagents, which in turn is likely to improve the specificity of the cross-linking reaction, and reduce the chance of forming "false-positive" links between molecules which do not naturally associate with one another. Consequently it is believed that the results achieved with the cross-linking reagents described in this specification may be more physiologically relevant than those achieved with conventional cross-linkers.

Therefore the complex may comprise at least three non-covalently associated biological macromolecule components. Potentially it may contain even more, such as four, five, six, seven, eight, nine, ten, twenty, thirty or more components, such as a ribosome, viroid or virus.

Where the reagent contains activatable reactive functional groups, it may be necessary to provide an appropriate stimulus to induce reaction between the reagent and the complex. For example, where the reagent contains photoreactive groups it will normally be necessary to apply electromagnetic radiation of a suitable wavelength.

Typically the method will further comprise the step of isolating the resulting cross-linked complex, e.g. from other components in the solution. For example, the cross-linked complex may be isolated from unreacted reagent and/or from un-cross-linked complex, and/or from target biological macromolecules not part of a complex. Typically, the cross-linked complex will also be isolated from other biological macromolecules in the solution.

The purification may involve affinity purification. Typically, this will involve contacting the cross-linked complex with a binding partner capable of binding specifically to a component of the complex. The affinity purification may comprise contacting said cross-linked complex with a binding partner capable of binding specifically to the cross-linking reagent.

The reagent may possess an affinity tag to facilitate this. Affinity tags include epitopes for known antibodies. Suitable epitopes include small molecules or short (approximately 6 or more amino acid) peptides. Alternatively the reagent may carry a tag which is capable of binding by coordination to a metal ion, such as a hexa-histidine peptide, which can bind to immobilised nickel ions. The reagent may also carry a biotin moiety, which can be bound by avidin or streptavidin. Yet another alternative is a nucleic acid tag, which could be bound by hybridisation to a complementary nucleic acid probe. These possibilities are not exclusive, and the skilled person will be aware of many others.

Alternatively, the affinity purification may use a binding partner specific for a component of the complex, such as the target biological macromolecule. Antibodies may be used as binding partners for almost any complex component, but particularly for proteins. Complementary nucleic acids may be used as binding partners for nucleic acids.

The component for which the binding partner is specific may have been specifically engineered or modified to facilitate isolation. For example, it may be genetically engineered to contain a specific nucleic acid or peptide sequence, for binding to a chosen binding partner such as a complementary nucleic acid probe, antibody or metal ion. Alternatively, it may be chemically modified, e.g. to incorporate an affinity tag such as a biotin moiety. Thus, such a modified component may be introduced into a system (e.g. into a cell) in order to isolate and identify other molecules capable of forming a complex with that component.

The binding partner is typically immobilised on a solid phase so that the bound cross-linked complex can easily be isolated from undesirable contaminants.

Additionally or alternatively, the purification may comprise one or more steps of purification on the basis of a physical property such as molecular mass, size (e.g. hydrodynamic radius) or charge.

Preferred separation methods include differential centrifugation/sedimentation analysis (e.g. analytical ultracentrifugation), chromatography, particularly HPLC (high pressure liquid chromatography), SDS-PAGE or isoelectric focussing. Chromatographic purification techniques such as HPLC are particularly convenient, as the resultant purified fractions can readily be analysed by mass spectroscopy (see below). A combination of HPLC and mass spectroscopic analysis also lends itself readily to automation.

Such methods may also comprise the step of determining the molecular mass of the cross-linked complex. This may give an indication of the number or nature of the components of the complex.

The method may comprise the step of determining (or verifying) the identity of one or more components of the complex. This may be achieved by contacting the complex with a binding partner capable of binding specifically to a component suspected of being present in the complex. Suitable binding partners for proteins include antibodies, which may be used in immunoblotting techniques, or immunosorbent assays such as an enzyme-linked immunosorbent assay (ELISA). Suitable binding partners for nucleic acids include hybridisation probes and polymerase chain reaction primers. Thus it may be possible to identify a particular nucleic acid in a complex by a hybridisation reaction or PCR.

It may be desirable to disrupt the complex, as described in more detail below, before the determination or verification step.

Mass spectroscopy has become a common technique in proteomic analysis. It is often used in conjunction with chemical cross-linking studies to determine structures of macromolecules and macromolecular complexes (especially proteins) and to identify components of such complexes.

Thus the methods described may further comprise the step of analysing the cross-linked complex by mass spectroscopy. It may follow a purification or separation step, or be used as an alternative to a purification or separation step.

Prior to mass spectroscopy, it may be desirable to disrupt the cross-linked complex, e.g. for reasons described below, to separate the various components of the complex from one another and/or from the cross-linking reagent. It may also be desirable to disrupt individual components of the complex into smaller fragments, e.g. by cleaving proteins into peptide fragments.

Thus, disruption may comprise chemical cleavage of the cross-linked complex, e.g. with periodate, bases, acids, hydrazides, by photolysis, or by any other suitable technique known to the skilled person. Alternatively disruption may comprise contacting the cross-linked complex with a protease, such as trypsin. Trypsin cleaves proteins on the C-terminal side of lysine or arginine residues, except where they are followed by proline.
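By way of illustration only, the trypsin cleavage rule stated above may be applied to a protein sequence in silico. The following Python sketch assumes complete digestion with no missed cleavages and takes no account of residues modified by a cross-linking reagent (a modified lysine may in practice not be cleaved). The function name `tryptic_digest` and the example sequence are hypothetical.

```python
def tryptic_digest(sequence):
    """Split a one-letter protein sequence into idealised tryptic peptides.

    Cleavage is made on the C-terminal side of K or R, except where the
    next residue is P. Complete digestion is assumed.
    """
    if not sequence:
        return []
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Whatever remains after the final cleavage site is the last peptide.
    peptides.append(sequence[start:])
    return peptides


if __name__ == "__main__":
    # Hypothetical sequence: the R at position 3 is followed by P and is not cleaved.
    print(tryptic_digest("MKRPAKGR"))
```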

Additionally or alternatively, the cross-linking reagent may comprise a selectively cleavable group between two of the reactive functional groups bound to the complex. In some embodiments, as described above, there is at least one such selectively cleavable group between each reactive functional group and the core, allowing all cross-linked components of the complex to be separated from one another before analysis. The method will therefore comprise selectively cleaving said cleavable group. Depending on the nature of the cleavable group, cleavage may be performed by a chemical or enzymatic reaction, or irradiation.

After disruption of the cross-linked complex, there may be further purification steps to isolate the various disrupted components or fragments thereof, to facilitate subsequent analysis. HPLC is a preferred technique.

It is possible to use the compounds described herein to form intra-molecular cross-links in a single target macromolecule. Thus the present invention also provides methods as described above (mutatis mutandis) for the formation of intra-molecular cross-links in a single target molecule, although this is less preferred than the formation of inter-molecular links in a molecular complex between two or more molecules. Such a method comprises cross-linking two or more sites in a target biological macromolecule, by contacting said macromolecule with a cross-linking reagent as described above such that at least two reactive functional groups of said reagent react independently with sites of said macromolecule to form a cross-linked macromolecule.

Recently, it has become possible to use a combination of chemical cross-linking and mass spectroscopy to analyse a number of features of proteins. These include the folding of individual protein molecules, as well as analysis of the structure and composition of complexes containing two or more protein subunits.

This has been enabled by the explosion in protein sequence data which is now available for many organisms including humans, coupled with advances in sensitivity of mass spectroscopy (MS) techniques.

In simple terms, MS typically involves measuring a mass/charge (m/z) ratio for an analyte molecule. Having determined the m/z ratio for a particular protein analyte, a database of protein sequences can be interrogated to find likely candidate sequences for the analyte, leading to its identification.
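By way of illustration only, the relationship between molecular mass and the measured m/z ratio can be sketched as follows. The Python sketch assumes positive-ion measurement in which each unit of charge arises from one added proton, an assumption not stated in this specification; the function names `mz_from_mass` and `mass_from_mz` are hypothetical.

```python
PROTON_MASS = 1.007276  # Da; mass of a proton


def mz_from_mass(neutral_mass, charge):
    """Return m/z for an analyte carrying `charge` added protons (assumed ionisation)."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_mz(mz, charge):
    """Recover the neutral mass from an observed m/z and an assigned charge state."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS)


if __name__ == "__main__":
    # Hypothetical analyte of neutral mass 1000.0 Da observed at charge 2+.
    print(mz_from_mass(1000.0, 2))
```

The neutral mass recovered in this way is the quantity compared against candidate sequences when a database is interrogated.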

These techniques can be expanded to identification of protein fragments. Thus, when a parent protein is cleaved at particular residues or sequences (e.g. by a protease with a preference for specific residues or sequences), the m/z ratios of the resulting peptide fragments can be determined and the database then interrogated to identify peptide sequences which match both the protease's cleavage preferences and the appropriate m/z ratio.

When a protein, or a protein complex comprising more than one subunit, is reacted with a cross-linking reagent, a number of different reactions may occur. Firstly, the cross-linking reagent may react at only one site on the protein or complex (i.e. not form any cross-links), yielding a residue carrying a tag. Secondly, an intra-molecular link may form, where the reagent links two residues within the same protein chain. Thirdly, an inter-molecular link may form, between residues in different protein chains.

Digestion of the cross-linked assembly can therefore yield free peptides unconjugated to any linker, single peptides conjugated to one or more cross-linker molecules which are not linked to any other peptides, and pairs of peptides derived from the same or different protein chain connected by one or more cross-linker molecules.

(This assumes a cross-linking reagent with two reactive functional groups. The situation is clearly more complex if three or more groups are present, as combinations of inter- and intra-molecular interactions are possible.)

Clearly a peptide bound to cross-linker(s) will not have the same molecular mass or m/z ratio as an identical unconjugated peptide. However, knowing the molecular mass of the cross-linking reagent and the chemistry by which it reacts with its target groups on a protein, it will be possible to predict the effect of the cross-linker on the m/z ratio of any given peptide. Therefore, if the algorithm used to interrogate the protein sequence database is able to take account of the molecular mass of the cross-linker when conjugated to one or more peptides (or the database itself contains suitable data), it remains possible to identify the peptide.
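By way of illustration only, the prediction described above may be sketched in Python. The sketch assumes monoisotopic masses and unmodified standard amino acid residues, and takes the net mass added by each conjugated cross-linker (`linker_shift`) as a user-supplied value, since that value depends on the reagent and on the chemistry by which it reacts. The function names and the example shift of 100.0 Da are hypothetical and do not correspond to any reagent described herein.

```python
# Standard monoisotopic residue masses (Da) for the twenty common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565  # Da; accounts for the free N- and C-termini


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified, unconjugated peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS


def conjugated_mass(sequence, linker_shift, n_linkers=1):
    """Mass of a peptide carrying `n_linkers` cross-linker molecules.

    `linker_shift` is the net mass added per conjugated cross-linker,
    i.e. the reagent mass less any leaving groups (supplied by the user).
    """
    return peptide_mass(sequence) + n_linkers * linker_shift


if __name__ == "__main__":
    # Hypothetical peptide and hypothetical linker shift of 100.0 Da.
    print(peptide_mass("GG"), conjugated_mass("GG", 100.0))
```

A search algorithm can then compare observed masses against both the unconjugated and the conjugated values for each candidate peptide.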

Knowledge of the three-dimensional structure of the cross-linking reagent may suggest the footprint of the reagent in relation to the cross-linked molecule(s) and thus lead to information about the topology of protein-protein interactions within a cross-linked complex. In addition, incorporation of isotope tags into that part of the cross-linking reagent which remains associated with a target molecule or fragment thereof after any disruption step can be used for comparison of fragment populations.

If no single peptide match is found for a particular product, then that product is likely to contain two peptides connected by one or more cross-linkers. It may therefore be possible to interrogate the database to identify pairs of peptides from the same or different proteins in order to identify the sequences of those peptides. This analysis will be made easier if information is available about the identity of the proteins which might be present in the cross-linked complex.
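By way of illustration only, the search for pairs of peptides described above may be sketched in Python. The sketch assumes that exactly one cross-linker joins the two peptides, that candidate peptide masses have already been computed, and that a simple absolute mass tolerance is adequate; the function name `find_linked_pairs`, the peptide labels and all numerical values are hypothetical.

```python
from itertools import combinations_with_replacement


def find_linked_pairs(observed_mass, peptide_masses, linker_shift, tolerance=0.01):
    """Return peptide pairs whose combined mass plus one linker matches the observation.

    `peptide_masses` maps a peptide label to its unconjugated mass (Da).
    A peptide may pair with itself, which covers links between two copies
    of the same peptide.
    """
    matches = []
    candidates = sorted(peptide_masses.items())
    for (name_a, mass_a), (name_b, mass_b) in combinations_with_replacement(candidates, 2):
        predicted = mass_a + mass_b + linker_shift
        if abs(predicted - observed_mass) <= tolerance:
            matches.append((name_a, name_b))
    return matches


if __name__ == "__main__":
    # Hypothetical candidate peptides and a hypothetical linker shift of 138.0 Da.
    masses = {"PEP_A": 500.0, "PEP_B": 700.0, "PEP_C": 900.0}
    print(find_linked_pairs(1338.0, masses, linker_shift=138.0))
```

Restricting `peptide_masses` to peptides from proteins known or suspected to be present in the complex reduces the number of pairs to be tested, in line with the observation above.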

This analysis may be made easier by using cross-linking reagents which include groups which are cleavable (e.g. by a suitable chemical or enzymatic cleavage reaction) between reactive functional groups. This allows the peptides linked by a single cross-linking reagent to be separated when desired and analysed independently.

Cross-linking and MS can also be used to obtain structural information about proteins and protein complexes. Proteins have complicated three-dimensional structures. A number of factors restrict the number and relative spatial location of groups on a protein which can be cross-linked by any given reagent. These include the conformation of the protein, the length and flexibility of the spacer between two reactive functional groups, whether the target groups in the protein are available on the surface of the molecule or buried internally, and whether the linker can access the interior and exterior of the protein or only the exterior. This can be restricted by steric and electrostatic effects, as the interior of many proteins is hydrophobic, while parts of cross-linking reagents (particularly the reactive functional groups) are often hydrophilic.

Thus, identifying groups within a single protein or a protein complex which can be linked by a given cross-linking reagent provides information about the relative location of those residues. This may be used to deduce information about the structure of a molecule, the relative topological location of subunits within a complex, etc.

For more information about these techniques, see references [11] to [13] and documents cited therein. The compounds described in this specification may be used in any such method as desired, and more applications will be apparent to the skilled person based on the teaching provided here.

Thus in general terms, MS analysis involves acquiring data concerning molecular weight and/or charge of a candidate molecule, such as a protein, or protein fragment. In the methods described here, the candidate molecule will be a component of a cross-linked complex or a fragment thereof. The candidate molecule may be covalently linked to a cross-linking reagent as described in this specification or to a fragment of such a reagent. Having acquired appropriate data (e.g. an m/z ratio), a suitable database is interrogated in order to identify possible matches for the candidate compound, e.g. proteins whose sequences satisfy the requirements imposed by the data (or which contain peptide sequences which would satisfy those requirements when present in isolated form) and any other requirements imposed by the experimental methodology, such as any enzymatic or chemical cleavage reactions performed on the complex or the cross-linking reagent.
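By way of illustration only, the conversion between an m/z value and a neutral mass, and the subsequent comparison with database masses, might be sketched as below. The sketch assumes positive-ion mode with protonated species, so that m/z = (M + z x 1.007276)/z. The function names, the database entries and the 10 ppm tolerance are hypothetical and are not taken from this specification.

```python
PROTON_MASS = 1.007276  # Da


def mz_from_mass(neutral_mass, charge):
    """m/z of an ion carrying `charge` protons (positive-ion mode assumed)."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_mz(mz, charge):
    """Neutral mass recovered from an observed m/z and an assumed charge."""
    return mz * charge - charge * PROTON_MASS


def candidates_within_tolerance(observed_mz, charge, database, ppm=10.0):
    """Names of database entries whose mass matches the observation."""
    observed_mass = mass_from_mz(observed_mz, charge)
    return [
        name
        for name, mass in database.items()
        if abs(mass - observed_mass) / mass * 1e6 <= ppm
    ]


# Hypothetical database of theoretical neutral masses (Da).
database = {"peptide_1": 1500.7500, "peptide_2": 1500.9000}
observed = mz_from_mass(1500.7500, 2)
matches = candidates_within_tolerance(observed, 2, database)
print(matches)
```

Further constraints from the experimental methodology, such as the cleavage specificity of the enzyme used, would in practice be applied when the database of theoretical masses is generated.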

The invention will now be described in more detail, by way of example and not limitation, by reference to the accompanying figures.

Description of the Figures

Figure 1. Crystal structure for intermediate octaester 3 clearly reveals the geometric distribution of the eight esters arising from the molecule's C2h symmetry.

Figure 2. Cross-linking of GST was performed with molar equivalents of either SOXL 1 (lanes 1, 3, 5, 7, 9, 11) or DSS 5 (lanes 2, 4, 6, 8, 10, 12) as indicated. In lanes 11 and 12 an equimolar mixture of GST and BSA was used and a 4-fold excess of cross-linker added. Lanes 13 and 14 represent GST and BSA alone, respectively, in the absence of added cross-linkers. a) Immunoblotting with anti-GST antibody. b) Immunoblotting performed on the same gel using anti-BSA antibody.

Figure 3. Oligomeric states of Nd and Cd.

A. MW distribution, c(M), of Nd and Cd from sedimentation velocity studies. Nd and Cd have mass distributions of 62 and 13 kDa, corresponding to a tetramer and monomer, respectively.

B. Sedimentation equilibrium data for Nd and Cd. Only one loading concentration is shown for clarity. Data were obtained at 12,000, 16,000, 22,000 and 28,000 rpm. Residual plots of the fit to a single species are shown below each data set. Nd has a molecular weight corresponding to a tetramer, whereas Cd is a monomer.

C. SDS-PAGE analysis showing cross-linking of DnaD with SOXL. Lanes 1 (9 μM DnaD) and 2 (4.5 μM DnaD) indicate control reactions in the absence of SOXL whereas lanes 3 {protein (18 μM)/SOXL (180 μM)} and 4 {protein (18 μM)/SOXL (504 μM)} indicate reactions with increasing relative ratios of SOXL to DnaD or Nd. Monomeric and dimeric species of DnaD are shown as (D)1 and (D)2 whereas monomeric, dimeric and tetrameric species of Nd are shown as (Nd)1, (Nd)2 and (Nd)4, respectively. Molecular weight markers (kDa) are shown in lanes M.

Figure 4. A. The gel filtration profile of a mixture of Nd and Cd. The two domains elute at different positions through a Superdex S200 analytical column. The individual proteins alone elute at exactly the same positions (data not shown). Nd elutes earlier, indicating an oligomer. No interaction between the domains was observed. The presence of Nd and Cd in the early and late peaks, respectively, was verified by SDS-PAGE analysis of samples from the peaks. The MW of Nd (16,056 Da) is slightly bigger than that of Cd (13,730 Da) and therefore it appears at a slightly higher position in the gel. B. Cross-linking of Nd at high concentration using SOXL. Nd (30 μM) was cross-linked with SOXL (lane 2: 840 μM; lane 3: 1680 μM). Lane M and lane 1 show molecular weight markers (kDa) and unlinked Nd, respectively. Internal cross-linking is apparent in the monomeric (Nd)1 and dimeric (Nd)2 species. Higher species corresponding to trimers (Nd)3, tetramers (Nd)4 and multimers (Nd)X are also apparent.

Figure 5. SOXL cross-linking of Cd in the presence and absence of a ss19mer oligonucleotide. Panel A shows SOXL cross-linking of Cd (18 μM) in the absence of oligonucleotide and control Cd in the absence of cross-linker, as indicated. Panel B shows control Cd (18 μM) in the absence of cross-linker and also cross-linking reactions with increasing concentrations of SOXL (0.5, 1.0 and 1.5 mM), at two different oligonucleotide concentrations (0.9 and 0.19 μM), as indicated. Internal cross-links as well as large cross-linked species are indicated as Cd and (Cd)X, respectively. Molecular weight markers are indicated in kDa in lanes M. In the presence of the oligonucleotide large cross-linked Cd species are clearly visible whereas in the absence of DNA only internal cross-links can be seen, confirming the DNA-induced oligomerisation activity of Cd.

Figure 6. Comparison of SOXL 1 cross-linking with cross-linking achieved by divalent reagents having equivalent span between reactive groups. Cross-linking of GST was performed with 1, 2 and 4 molar equivalents of either SOXL 1 (lanes 1, 5 and 9), DSS 4 (lanes 2, 6 and 10), DSG 5 (lanes 3, 7 and 11), or SAB 6 (lanes 4, 8 and 12), as indicated. After 30 min of cross-linking, proteins were denatured and separated by SDS-PAGE prior to immunoblotting with anti-GST antibodies. Positions of GST complexes are indicated.

Figure 7. Cross-linking of phosphoinositide 3-kinase heteromers performed with SOXL 1, SOXL 2, SOXL 3 and DSS. The p110/p85 heteromeric complex alone (No XL) and bound to nickel beads (Beads) were used as controls. Positions of molecular weight standards (in kDa) and PI3K heteromeric complexes are indicated.

Figure 8. Comparison of the time-course of SOXL 1 and DSS-mediated cross-linking. Cross-linking of GST was performed with 4 molar equivalents of either SOXL 1 (S) or DSS (D), for the times indicated (in minutes) prior to the reaction being terminated. Proteins were denatured and separated by SDS-PAGE prior to immunoblotting with anti-GST antibodies.

Figure 9. SDS-PAGE gel showing cross-linking of DnaD Nd using cross-linking reagent 26 with cleavable groups in the linker arms. 'Nd' contains protein (N domain of DnaD) only. The other lanes contain cross-linked protein and are labelled to reflect the molar ratio of protein:reagent in the cross-linking reaction. The numerical annotations 1-6 represent samples collected for cleavage and mass spectrometry, thus 1 = Nd untreated control; 2 = Nd monomers + linker 26; 3 = Nd dimers + linker 26; 4-6 = higher oligomers of Nd + linker 26.

Examples

Compound Syntheses

Synthesis of SOXL 1

Resorcinarene 2

All chemicals were purchased from commercial suppliers and used as received. Thin layer chromatography was performed with silica gel 60 pre-coated aluminium plates (0.20 mm thickness) from Macherey-Nagel, with visualization by UV light (254 nm) and stains. Flash chromatography was performed on silica gel Matrex 60 from Fisher Chemicals. Infrared spectra were determined as KBr discs or films as stated using a Perkin-Elmer Spectrum RXI FT-IR and peak positions are expressed in cm⁻¹. ¹H and ¹³C NMR spectra were recorded using either JEOL JNM GX-270 or EX-400 spectrometers; chemical shifts (δ) are reported relative to residual solvent in parts per million (ppm) and J values are quoted in Hz. Mass spectra were recorded using electrospray ionization on a Micromass LCT at the University of Nottingham.

Numbering scheme for NMR assignment of C2h resorcin[4]arene derivatives.

2,8,14,20-Tetra{4-[2-(2-methoxyethoxy)ethoxy]phenyl}pentacyclo[19.3.1.1(3,7).1(9,13).1(15,19)]octacosa-1(25),3,5,7(28),9,11,13(27),15,17,19(26),21,23-dodecaene-4,6,10,12,16,18,22,24-octol (2)

Concentrated HCl (2.5 mL, 37 %) was added dropwise to a stirred mixture of resorcinol (1.37 g, 12.5 mmol) and 4-(2-(2-(2-methoxyethoxy)ethoxy)ethoxy)benzaldehyde [19] (3.34 g, 12.5 mmol) in absolute ethanol (10 mL). The red-coloured solution was stirred for 18 h at 70 °C. The mixture was cooled to 0 °C and the resulting precipitate was collected by suction filtration. The solid was recrystallised from absolute ethanol and dried to give resorcinarene 3 as a grey solid (4.49 g, 78 %); m.p. (EtOH) 214-216 °C; νmax/cm⁻¹ (KBr) 3432vs, 2878s, 1611m, 1511m, 1429w, 1381w, 1283w, 1251m, 1202m, 1079m, 1021w, 932w, 847w; δH (399.8 MHz; DMSO; Me4Si) 3.26 (12 H, s, OCH3), 3.46 (8 H, m, CH3OCH2), 3.55 (8 H, m, CH3OCH2CH2), 3.58 (8 H, m, CH3OCH2CH2OCH2), 3.62 (8 H, m, ArOCH2CH2OCH2), 3.74 (8 H, t, J 5 Hz, ArOCH2CH2), 3.97 (8 H, t, J 5 Hz, ArOCH2), 5.50 (4 H, s, Hc), 5.70 (2 H, s, Hb1), 6.16 (2 H, s, Hb2), 6.32 (2 H, s, Ha2), 6.34 (2 H, s, Ha1), 6.49 (8 H, d, J 9 Hz, He), 6.53 (8 H, d, J 9 Hz, Hd), 8.47 (4 H, s, ArOH), 8.54 (4 H, s, ArOH); δC (100.5 MHz; DMSO; Me4Si) 41.6 (CH), 58.6 (CH3), 67.3 (CH2), 69.6 (CH2), 70.1 (CH2), 70.3 (CH2), 70.5 (CH2), 71.8 (CH2), 102.0 (CH), 102.1 (CH), 113.5 (CH), 121.3 (C), 121.8 (C), 129.4 (CH), 130.3 (CH), 132.3 (CH), 137.0 (C), 152.9 (C), 153.1 (C), 156.1 (C); HRMS-ES− (m/z): [M]− calcd for C80H95O24: 1439.6212; found 1439.6296.
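As an aid to reading the high-resolution mass data, the agreement between a calculated and a found m/z value is commonly expressed as a relative error in parts per million. The short sketch below applies that standard formula to the two values quoted above; the function name `ppm_error` is an illustrative choice and no acceptance threshold is implied by this specification.

```python
def ppm_error(calculated_mz, found_mz):
    """Relative mass error, in parts per million, of a found m/z value."""
    return (found_mz - calculated_mz) / calculated_mz * 1e6


# Values quoted above for the resorcinarene: calcd 1439.6212; found 1439.6296.
error = ppm_error(1439.6212, 1439.6296)
print(f"{error:.1f} ppm")
```

For these values the difference of 0.0084 corresponds to an error of roughly 5.8 ppm.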

2,2',2'',2''',2'''',2''''',2'''''',2'''''''-[(2,8,14,20-Tetra{4-[2-(2-methoxyethoxy)ethoxy]phenyl}pentacyclo[19.3.1.1(3,7).1(9,13).1(15,19)]octacosa-1(25),3,5,7(28),9,11,13(27),15,17,19(26),21,23-dodecaene-4,6,10,12,16,18,22,24-octayl)octakis(oxy)]octakis-acetic acid octaethyl ester (3).

Resorcinarene (3) (2.09 g, 1.45 mmol) and Cs2CO3 (4.72 g, 14.5 mmol) were combined in dry DMF (15 mL) and stirred at 50 °C for 1 h. Ethyl bromoacetate (7.26 g, 43.5 mmol) was added and the mixture stirred at 50 °C for 18 h. After evaporation of the DMF under reduced pressure, the residue was partitioned between CH2Cl2 and water. The organic phase was washed with water and dried over MgSO4. After evaporation of the solvent, the crude material was purified by silica flash chromatography [CH2Cl2 - CH3OH (10:1), Rf = 0.56]. The fractions containing product were collected and the solvent removed in vacuo to give the octaester (4) as a pale yellow solid (1.8 g, 58 %); m.p. 62-64 °C; νmax/cm⁻¹ (KBr) 3432vs, 2928s, 2361s, 1752s, 1610w, 1585w, 1512m, 1498m, 1456s, 1406s, 1381s, 1381w, 1297m, 1247s, 1208m, 1176m, 1158m, 1112m, 1081w, 1023w, 932w, 857w, 825w; δH (400.13 MHz; DMSO; Me4Si) 1.16 (12 H, t, J 8 Hz, CH2CH3), 1.17 (12 H, t, J 8 Hz, CH2CH3), 3.24 (12 H, s, OCH3), 3.44 (8 H, m, CH3OCH2), 3.53 (8 H, m, CH3OCH2CH2), 3.55 (8 H, m, CH3OCH2CH2OCH2), 3.60 (8 H, m, ArOCH2CH2OCH2), 3.73 (8 H, t, J 5 Hz, ArOCH2CH2), 3.97 (8 H, t, J 5 Hz, ArOCH2), 4.10 (8 H, q, J 8 Hz, CH3CH2), 4.11 (8 H, q, J 8 Hz, CH3CH2), 4.28 (4 H, d, J 16 Hz, CHAHBCOO), 4.46 (4 H, d, J 16 Hz, CHAHBCOO), 4.57 (4 H, d, J 16 Hz, CHAHBCOO), 4.65 (4 H, d, J 16 Hz, CHAHBCOO), 5.69 (4 H, s, Hc), 5.94 (2 H, s, Hb1), 6.19 (2 H, s, Hb2), 6.41 (2 H, s, Ha2), 6.55 (8 H, d, J 9 Hz, He), 6.59 (8 H, d, J 9 Hz, Hd), 6.65 (2 H, s, Ha1); δC (100.5 MHz; CDCl3; Me4Si) 14.2 (CH3), 42.1 (CH), 59.0 (CH3), 61.0 (CH2), 61.1 (CH2), 66.9 (CH2), 67.2 (CH2), 67.3 (CH2), 69.8 (CH2), 70.6 (CH2), 70.7 (CH2), 70.8 (CH2), 71.9 (CH2), 99.5 (CH), 100.9 (CH), 113.8 (CH), 126.8 (C), 127.8 (C), 128.6 (CH), 130.0 (CH), 132.2 (CH), 134.5 (C), 154.5 (C), 154.7 (C), 156.6 (C), 168.8 (C), 169.2 (C); LRMS (MALDI-TOF) m/z: calcd for C112H144O40 M+ requires 2128.9; found 2128.0.

2,2',2'',2''',2'''',2''''',2'''''',2'''''''-[(2,8,14,20-Tetra{4-[2-(2-methoxyethoxy)ethoxy]phenyl}pentacyclo[19.3.1.1(3,7).1(9,13).1(15,19)]octacosa-1(25),3,5,7(28),9,11,13(27),15,17,19(26),21,23-dodecaene-4,6,10,12,16,18,22,24-octayl)octakis(oxy)]octakis-acetic acid (4).

Octaester (4) (1.8 g, 0.85 mmol) was dissolved in THF (50 mL). To this was added a KOH solution (5.7 g KOH in 30 mL water) and the reaction mixture was stirred vigorously at room temperature for 1 h. The mixture was washed with ether and the aqueous phase adjusted to pH 2. The mixture was left at -20 °C for 18 h and the resultant precipitate was collected by suction filtration. The solid was dried (0.1 mmHg) to give the octaacid (5) as a white solid (1.30 g, 80 %); m.p. 168-172 °C; νmax/cm⁻¹ (KBr) 3398s, 2931vs, 1754vs, 1610m, 1584m, 1510s, 1440m, 1403m, 1299m, 1248m, 1186m, 1160m, 1111m, 1075m, 931w, 844w, 823w, 733w; δH (399.8 MHz; CDCl3; Me4Si) 3.37 (12 H, s, OCH3), 3.56 (8 H, m, CH3OCH2), 3.66 (8 H, m, CH3OCH2CH2), 3.69 (8 H, m, CH3OCH2CH2OCH2), 3.74 (8 H, m, ArOCH2CH2OCH2), 3.85 (8 H, t, J 5 Hz, ArOCH2CH2), 4.08 (8 H, t, J 5 Hz, ArOCH2), 4.36 (4 H, d, J 16 Hz, CHAHBCOO), 4.42 (4 H, d, J 16 Hz, CHAHBCOO), 4.50 (4 H, d, J 16 Hz, CHAHBCOO), 4.52 (4 H, d, J 16 Hz, CHAHBCOO), 5.89 (4 H, s, Hc), 5.95 (2 H, s, Hb1), 6.27 (2 H, s, Hb2), 6.40 (2 H, s, Ha2), 6.47 (2 H, s, Ha1), 6.58 (16 H, m, He,d); δC (100.5 MHz; CDCl3; Me4Si) 42.1 (CH), 58.2 (CH3), 66.5 (CH2), 66.7 (CH2), 67.2 (CH2), 69.7 (CH2), 70.2 (CH2), 70.3 (CH2), 70.5 (CH2), 71.7 (CH2), 99.5 (CH), 100.5 (CH), 113.7 (CH), 126.7 (C), 127.6 (C), 128.7 (CH), 130.0 (CH), 132.2 (CH), 134.5 (C), 154.4 (C), 154.6 (C), 156.7 (C), 169.8 (C), 170.1 (C); HRMS-ES− (m/z): [M]2− calcd for [C96H110O40]2−: 951.8322; found 951.8224.

2,2',2'',2''',2'''',2''''',2'''''',2'''''''-[(2,8,14,20-Tetra{4-[2-(2-methoxyethoxy)ethoxy]phenyl}pentacyclo[19.3.1.1(3,7).1(9,13).1(15,19)]octacosa-1(25),3,5,7(28),9,11,13(27),15,17,19(26),21,23-dodecaene-4,6,10,12,16,18,22,24-octayl)octakis(oxy)]octakis-acetic acid octa-2,5-dioxo-pyrrolidin-1-yl ester (1).

To a solution of octaacid (5) (100 mg, 0.052 mmol) in dry CH2Cl2 (1 mL) were added oxalyl chloride (109.8 μL, 1.26 mmol) and a catalytic amount of DMF (1 drop of a solution of 1 drop of DMF in 1 mL CH2Cl2) at 0 °C. After 5 min, the solution was allowed to warm to room temperature and stirred for 1 h. The solvent was removed by evaporation, and CH2Cl2 was evaporated twice from the remaining octaacid chloride, which was then dissolved in CH2Cl2 (1 mL). In a separate flask, N-hydroxysuccinimide was co-evaporated with dry toluene (3 x 1 mL), dissolved in dry THF (1 mL), and cooled to -10 °C. Then the solution of the octaacid chloride and piperidinomethyl polystyrene HL (200 - 400 mesh) (413.5 mg, 1.65 mmol) was added, and the mixture was stirred for 2 h at room temperature. After 2 h the mixture was filtered through celite to remove the piperidinomethyl polystyrene HL (200 - 400 mesh) and the resulting octasuccinimidyl ester, SOXL 1, was used without further purification to prepare cross-linked GST; (139 mg, quant.); ¹H NMR (399.8 MHz, CDCl3, 25 °C): δ = 2.74 (s, 16H), 2.82 (br s, 16H), 3.29 (s, 12H; OCH3), 3.46 (m, 8H; CH3OCH2), 3.55 (m, 8H; CH3OCH2CH2), 3.57 (m, 8H; CH3OCH2CH2OCH2), 3.60 (m, 8H; ArOCH2CH2OCH2), 3.70 (t, ³J(H,H) = 5 Hz, 8H; ArOCH2CH2), 3.92 (t, ³J(H,H) = 5 Hz, 8H; ArOCH2), 4.70 (d, ²J(H,H) = 16 Hz, 4H; CHAHBCOO), 4.78 (d, ²J(H,H) = 16 Hz, 4H; CHAHBCOO), 4.83 (d, ²J(H,H) = 16 Hz, 4H; CHAHBCOO), 4.92 (d, ²J(H,H) = 16 Hz, 4H; CHAHBCOO), 5.77 (s, 4H; Hc), 5.91 (s, 2H; Hb1), 6.24 (s, 2H; Hb2), 6.35 (s, 2H; Ha2), 6.37 (s, 2H; Ha1), 6.39 (d, ³J(H,H) = 9 Hz, 8H; He), 6.53 (d, ³J(H,H) = 9 Hz, 8H; Hd); ¹³C NMR (100.5 MHz, CDCl3, 25 °C, TMS): δ = 25.5 (CH2), 25.6 (CH2), 42.0 (CH), 58.9 (CH3), 63.6 (CH2), 64.5 (CH2), 66.8 (CH2), 69.7 (CH2), 70.5 (CH2), 70.6 (CH2), 71.8 (CH2), 98.5 (CH), 100.5 (CH), 113.9 (CH), 125.8 (C), 127.5 (C), 128.5 (CH), 130.2 (CH), 132.4 (CH), 134.0 (C), 153.6 (C), 156.4 (C), 164.9 (C), 165.0 (C), 168.9 (C), 169.3 (C); IR (KBr): ν = 3503 cm⁻¹, 2936, 1830, 1791, 1735, 1507, 1436, 1361, 1302, 1250, 1202, 1076; Anal. Calc. for C128H136N8O56·H2O: C, 56.93; H, 5.15; N, 4.15; found: C, 56.90; H, 5.30; N, 4.06.

2,2',2'',2''',2'''',2''''',2'''''',2'''''''-[(2,8,14,20-Tetra{4-[2-(2-methoxyethoxy)ethoxy]phenyl}pentacyclo[19.3.1.1(3,7).1(9,13).1(15,19)]octacosa-1(25),3,5,7(28),9,11,13(27),15,17,19(26),21,23-dodecaene-4,6,10,12,16,18,22,24-octayl)octakis(oxy)]octakis[oxy(1-oxo-2,1-ethanediyl)imino]]octakis-acetic acid octaethylester.

To confirm identity, a portion of SOXL 1 was used to prepare the corresponding octaglycine ethyl ester.

Glycine ethyl ester (115.8 mg, 0.83 mmol) was dissolved in CH2Cl2 (2 mL). After addition of Et3N (230.8 μL, 1.66 mmol), the mixture was cooled to 0 °C, and the octasuccinimidyl ester (1) (139 mg, 0.052 mmol) in dry CH2Cl2 (1 mL) was added slowly. The mixture was stirred for 10 min at 0 °C and 2 h at room temperature. After evaporation of solvent under reduced pressure, the residue was partitioned between CH2Cl2 and water. The organic phase was washed with water and dried over MgSO4. After evaporation of the solvent, the crude material was purified by silica flash chromatography [CH2Cl2 - CH3OH (10:1), Rf = 0.52]. The fractions containing product were collected and the solvent removed in vacuo to give (6) as a pale yellow solid (33.1 mg, 74 %); m.p. 204-206 °C; νmax/cm⁻¹ (KBr) 3409vs, 2934vs, 2361m, 2342m, 1750vs, 1684m, 1540m, 1507s, 1437m, 1406m, 1303m, 1250m, 1196m, 1103m, 1060m, 1024m, 939w, 855w; δH (400.13 MHz; DMSO; Me4Si) 1.15 (12 H, t, J 7.2 Hz, CH2CH3), 1.19 (12 H, t, J 7.2 Hz, CH2CH3), 3.24 (12 H, s, OCH3), 3.44 (8 H, m, CH3OCH2), 3.53 (8 H, m, CH3OCH2CH2), 3.55 (8 H, m, CH3OCH2CH2OCH2), 3.60 (8 H, m, ArOCH2CH2OCH2), 3.73 (8 H, m, ArOCH2CH2), 3.75-3.91 (16 H, m, NHCH2), 3.96 (8 H, m, ArOCH2), 4.02 (8 H, q, J 7.2 Hz, CH2CH3), 4.11 (8 H, q, J 7.2 Hz, CH2CH3), 4.21 (4 H, d, J 16 Hz, CHAHBCOO), 4.26 (4 H, d, J 16 Hz, CHAHBCOO), 4.51 (4 H, d, J 16 Hz, CHAHBCOO), 4.65 (4 H, d, J 16 Hz, CHAHBCOO), 5.90 (4 H, s, Hc), 6.03 (2 H, s, Hb1), 6.25 (2 H, s, Hb2), 6.63 (2 H, s, Ha2), 6.65 (16 H, m, He,d), 6.98 (2 H, s, Ha1), 7.19 (4 H, t, J 6 Hz, NH), 7.86 (4 H, t, J 6 Hz, NH); δC (100.5 MHz; CDCl3; Me4Si) 14.0 (CH3), 14.1 (CH3), 40.5 (CH2), 42.3 (CH), 59.0 (CH3), 61.4 (CH2), 61.6 (CH2), 67.2 (CH2), 67.5 (CH2), 67.9 (CH2), 69.6 (CH2), 70.5 (CH2), 70.6 (CH2), 70.7 (CH2), 71.8 (CH2), 97.1 (CH), 99.5 (CH), 114.2 (CH), 125.4 (C), 126.3 (C), 129.1 (CH), 129.4 (CH), 132.1 (CH), 132.7 (C), 153.4 (C), 153.7 (C), 157.3 (C), 167.8 (C), 168.1 (C), 169.3 (C); HRMS-ES− (m/z): [M]2+ calcd for [C128H169N8O48]2+: 1293.5549; found 1293.5669. Anal. Calc. for C128H168N8O48·2H2O: C, 58.62; H, 6.61; N, 4.27; found: C, 58.80; H, 6.47; N, 4.21.

The formula shown above (and in Scheme 1 below) is that originally assumed for SOXL 1. However, from analysis of the NMR data provided above, it will be perfectly clear to the skilled person that the group R1 is in fact C6H4(OCH2CH2)3OMe. References to SOXL and SOXL 1 throughout this specification should therefore be construed accordingly.

Synthesis is therefore more accurately shown as:

[Scheme: Resorcinarene 2 to SOXL 1', where R1 = C6H4(OCH2CH2)3OMe]

wherein the correct formula of the molecule designated SOXL 1 is shown as SOXL 1'. The compounds described above should be named as follows:

2,8,14,20-Tetra{4-{2-[2-(2-methoxyethoxy)ethoxy]ethoxy}phenyl}pentacyclo[19.3.1.1(3,7).1(9,13).1(15,19)]tetracosa-1(25),3,5,7(28),9,11,13(27),15,17,19(26),21,23-dodecaene-4,6,10,12,16,18,22,24-octol;

2,2',2'',2''',2'''',2''''',2'''''',2'''''''-[(2,8,14,20-Tetra{4-{2-[2-(2-methoxyethoxy)ethoxy]ethoxy}phenyl}pentacyclo[19.3.1.1(3,7).1(9,13).1(15,19)]tetracosa-1(25),3,5,7(28),9,11,13(27),15,17,19(26),21,23-dodecaene-4,6,10,12,16,18,22,24-octayl)octakis(oxy)]octakis-acetic acid octaethyl ester;

2,2',2'',2''',2'''',2''''',2'''''',2'''''''-[(2,8,14,20-Tetra{4-{2-[2-(2-methoxyethoxy)ethoxy]ethoxy}phenyl}pentacyclo[19.3.1.1(3,7).1(9,13).1(15,19)]tetracosa-1(25),3,5,7(28),9,11,13(27),15,17,19(26),21,23-dodecaene-4,6,10,12,16,18,22,24-octayl)octakis(oxy)]octakis-acetic acid;

2,2',2'',2''',2'''',2''''',2'''''',2'''''''-[(2,8,14,20-Tetra{4-{2-[2-(2-methoxyethoxy)ethoxy]ethoxy}phenyl}pentacyclo[19.3.1.1(3,7).1(9,13).1(15,19)]tetracosa-1(25),3,5,7(28),9,11,13(27),15,17,19(26),21,23-dodecaene-4,6,10,12,16,18,22,24-octayl)octakis(oxy)]octakis-acetic acid octa-2,5-dioxo-pyrrolidin-1-yl ester; and

2,2',2'',2''',2'''',2''''',2'''''',2'''''''-[(2,8,14,20-Tetra{4-{2-[2-(2-methoxyethoxy)ethoxy]ethoxy}phenyl}pentacyclo[19.3.1.1(3,7).1(9,13).1(15,19)]octacosa-1(25),3,5,7(28),9,11,13(27),15,17,19(26),21,23-dodecaene-4,6,10,12,16,18,22,24-octayl)octakis(oxy)]octakis[oxy(1-oxo-2,1-ethanediyl)imino]]octakis-acetic acid octaethylester.

For the avoidance of doubt, it is confirmed that a molecule defined as having the formula shown as SOXL 1' above, or as obtainable by the methods described, or as having the NMR data given herein, as well as methods of making the molecule having the formula SOXL 1' by the methods described herein, constitute part of the disclosure of the present example and of the invention.

Incorporation of alternative R group

(7)

Concentrated HCl (12.8 mL, 37 %) was added dropwise to a stirred mixture of resorcinol (8.5 g, 77.2 mmol) and hexanal (7.7 g, 77.2 mmol) in absolute ethanol (51 mL). The red-coloured solution was stirred for 168 h at 50 °C. The mixture was cooled to 0 °C and the resulting precipitate was collected by suction filtration. The crude material was purified by silica flash chromatography [CH2Cl2 - Acetone (6:4), Rf = 0.52]. The fractions containing product were collected and the solvent removed in vacuo to give resorcinarene (7) as a yellow solid (11.6 g, 78 %); m.p. (EtOH) 330-331 °C; νmax/cm⁻¹ 2957w, 2929m, 2862w, 1616m, 1498s, 1456s, 1293s, 1173s, 1080m, 1044w; δH (399.8 MHz; DMSO; Me4Si) 0.84 (12 H, m, CH3), 1.22 (24 H, m, CH2CH2CH2CH3), 2.00 (8 H, m, CH2CH), 4.22 (4 H, t, J 8 Hz, CHCH2), 6.15 (4 H, s, Ha), 7.16 (4 H, s, Hb), 7.16 (8 H, s, ArOH); δC (100.5 MHz; DMSO; Me4Si) 13.9 (CH3), 22.2 (CH2), 27.4 (CH2), 30.7 (CH), 31.7 (CH), 33.9 (CH2), 102.3 (CH), 123.0 (2 x C), 124.9 (CH), 151.6 (C), 151.9 (C); HRMS-ES+ (m/z): [M]+ calcd for C48H64O8: 769.0256; found 769.4785.
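The molar quantities quoted for the two reagents above can be checked from the weighed masses. The following sketch is illustrative only: it uses standard average atomic masses and the molecular formulae of resorcinol (C6H6O2) and hexanal (C6H12O), and the helper names `molar_mass` and `millimoles` are hypothetical.

```python
# Standard average atomic masses (g/mol) for the elements needed here.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def molar_mass(composition):
    """Molar mass (g/mol) from a dict of element symbol to atom count."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


def millimoles(mass_g, composition):
    """Amount of substance in mmol for a weighed mass in grams."""
    return mass_g / molar_mass(composition) * 1000.0


resorcinol = {"C": 6, "H": 6, "O": 2}
hexanal = {"C": 6, "H": 12, "O": 1}

print(f"resorcinol: {millimoles(8.5, resorcinol):.1f} mmol")
print(f"hexanal:    {millimoles(7.7, hexanal):.1f} mmol")
```

Both values come out close to the 77.2 mmol stated, consistent with the 1:1 ratio of resorcinol to aldehyde used in the condensation.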

(8)

Resorcinarene (7) (1.49 g, 1.94 mmol) and K2CO3 (5.37 g, 38.9 mmol) were combined in dry DMF (5 mL) and stirred at 50 °C for 1 h. Ethyl bromoacetate (6.49 g, 38.9 mmol) was added and the mixture stirred at 50 °C for 18 h. After evaporation of the DMF under reduced pressure, the residue was partitioned between CH2Cl2 and water. The organic phase was washed with water and dried over MgSO4. After evaporation of the solvent, the crude material was purified by silica flash chromatography [CH2Cl2 - Acetone (8:2), Rf = 0.58]. The fractions containing product were collected and the solvent removed in vacuo to give the octaester (8) as a pale yellow oil (1.2 g, 79 %); νmax/cm⁻¹ 2930m, 2362w, 1759s, 1728w, 1612w, 1497w, 1305w, 1195s, 1078m; δH (399.8 MHz; CDCl3; Me4Si) 0.85 (12 H, t, J 7 Hz, CH3), 1.30 (24 H, t, J 7.2 Hz, OCH2CH3), 1.36 (24 H, m, CH2CH2CH2CH3), 1.86 (8 H, m, CH2CH), 4.22 (16 H, q, J 7.2 Hz, OCH2CH3), 4.27 (16 H, s, ArOCH2), 4.61 (4 H, t, J 7.5 Hz, CHCH2), 6.24 (4 H, s, Ha), 6.62 (4 H, s, Hb); δC (100.5 MHz; CDCl3; Me4Si) 13.9 (3 x CH3), 22.5 (CH2), 27.6 (CH2), 32.8 (CH2), 34.4 (CH2), 35.7 (CH), 60.9 (2 x CH2), 66.9 (2 x CH2), 107.3 (CH), 126.4 (CH), 127.9 (2 x C), 154.2 (2 x C), 169.3 (2 x C); HRMS-ES+ (m/z): [M]+ calcd for C80H112O24: 1457.7408; found 1457.7872.

(9)

Octaester (8) (0.94 g, 0.65 mmol) was dissolved in THF (30 mL). To this was added a KOH solution (4.34 g KOH in 23 mL water) and the reaction mixture was stirred vigorously at room temperature for 1 h. The mixture was washed with ether and the aqueous phase adjusted to pH 2. The mixture was left at -20 °C for 18 h and the resultant precipitate was collected by suction filtration. The solid was dried (0.1 mmHg) to give the octaacid (9) as a white solid (0.8 g, 82 %); νmax/cm⁻¹ 2363w, 1725s, 1612w, 1500s, 1241s, 1187m, 1075m; δH (399.8 MHz; DMSO; Me4Si) 0.83 (12 H, m, CH3), 1.25 (24 H, m, CH2CH2CH2CH3), 1.77 (8 H, m, CH2CH), 4.26 (8 H, d, J 38 Hz, CHAHBCOO), 4.30 (8 H, d, J 38 Hz, CHAHBCOO), 4.48 (4 H, m, CHCH2), 6.36 (4 H, s, Ha), 6.49 (4 H, s, Hb); δC (100.5 MHz; DMSO; Me4Si) 13.9 (CH3), 22.2 (CH2), 27.5 (CH2), 31.5 (CH2), 33.9 (CH2), 35.1 (CH), 66.1 (2 x CH2), 100.1 (CH), 125.4 (CH), 126.0 (2 x C), 154.1 (2 x C), 170.5 (2 x C); HRMS-ES− (m/z): [M]− calcd for C64H80O24: 1231.3120; found 1231.4785.

(10)

To a solution of octaacid (9) (212 mg, 0.17 mmol) in dry CH2Cl2 (1 mL) were added oxalyl chloride (360.3 μL, 4.13 mmol) and a catalytic amount of DMF (1 drop of a solution of 1 drop of DMF in 1 mL CH2Cl2) at 0 °C. After 5 min, the solution was allowed to warm to room temperature and stirred for 1 h. The solvent was removed by evaporation, and CH2Cl2 was evaporated twice from the remaining octaacid chloride, which was then dissolved in CH2Cl2 (1 mL). In a separate flask, N-hydroxysuccinimide (163.8 mg, 1.423 mmol) was co-evaporated with dry toluene (3 x 1 mL), dissolved in dry THF (1 mL), and cooled to -10 °C. The solution of the octaacid chloride was added to the N-hydroxysuccinimide. To this was added piperidinomethyl polystyrene HL (200 - 400 mesh) (2.0 g, 8.23 mmol). The mixture was stirred for 2 h at room temperature. After 2 h the mixture was filtered through celite to remove the piperidinomethyl polystyrene HL (200 - 400 mesh) and the resulting octasuccinimidyl ester, SOXL 2, was used without further purification to prepare cross-linked GST; (344 mg, quant.); νmax/cm⁻¹ 2928s, 2362m, 1735s, 1501m, 1201s, 1076s; δH (399.8 MHz; CDCl3; Me4Si) 0.86 (12 H, m, CH3), 1.31 (24 H, m, CH2CH2CH2CH3), 1.85 (8 H, m, CH2CH), 2.84 (32 H, s), 4.66 (4 H, t, J 7.2 Hz, CHCH2), 4.73 (16 H, s, ArOCH2), 6.45 (4 H, s, Ha), 6.52 (4 H, s, Hb); δC (100.5 MHz; CDCl3; Me4Si) 13.1 (CH3), 21.6 (CH2), 24.6 (4 x CH2), 26.4 (CH2), 31.1 (CH2), 33.4 (CH2), 34.3 (CH), 63.6 (2 x CH2), 100.1 (CH), 125.5 (CH), 128.1 (2 x C), 152.8 (2 x C), 164.2 (4 x C), 168.1 (2 x C); HRMS-ES+ (m/z): [M]2+ calcd for C96H104N8O40: 1024.80288; found 1024.8025.

Incorporation of alternative R group

(11)

Concentrated HCl (4.2 mL, 37 %) was added dropwise to a stirred mixture of resorcinol (2.29 g, 20.9 mmol) and benzaldehyde derivative (5.48 g, 20.9 mmol) in absolute ethanol (17 mL). The red-coloured solution was stirred for 18 h at 70 °C. The mixture was cooled to 0 °C and the resulting precipitate was collected by suction filtration. The solid was recrystallised from absolute ethanol and dried to give resorcinarene (11) as a grey solid (5.2 g, 69 %); δH (399.8 MHz; CDCl3; Me4Si) 0.91 (12 H, m, CH3), 1.42 (48 H, m, CH2CH2CH2CH2CH2CH2), 1.58 (8 H, m, OCH2CH2CH2), 1.85 (8 H, m, OCH2CH2), 4.02 (8 H, t, J 6.4 Hz, OCH2), 5.78 (4 H, s, Hc), 6.29 (4 H, s, Hb), 6.44 (4 H, s, Ha), 6.65 (8 H, d, J 8.8 Hz, He), 6.80 (8 H, d, J 8.8 Hz, Hd), 7.37 (8 H, s, ArOH); δC (100.5 MHz; DMSO; Me4Si) 14.4 (CH3), 23.4 (CH2), 27.8 (CH2), 30.4 (CH2), 30.8 (CH2), 30.9 (CH2), 30.92 (CH2), 30.94 (CH2), 32.7 (CH2), 42.3 (CH), 68.5 (CH2), 103.3 (CH), 114.4 (CH), 122.4 (C), 130.7 (CH), 133.4 (CH), 137.6 (C), 154.0 (C), 157.9 (C).

(12)

Resorcinarene (11) (1.14 g, 0.80 mmol) and Cs2CO3 (3.71 g, 11.4 mmol) were combined in dry DMF (15 mL) and stirred at 50 °C for 1 h. Ethyl bromoacetate (2.15 g, 12.9 mmol) was added and the mixture stirred at 50 °C for 18 h. After evaporation of the DMF under reduced pressure, the residue was partitioned between CH2Cl2 and water. The organic phase was washed with water and dried over MgSO4. After evaporation of the solvent, the crude material was purified by silica flash chromatography [CH2Cl2 - CH3OH (10:1), Rf = 0.66]. The fractions containing product were collected and the solvent removed in vacuo to give the octaester (12) as a pale yellow solid (1.35 g, 80 %); δH (399.8 MHz; CDCl3; Me4Si) 0.91 (12 H, t, J 6.8 Hz, CH3), 1.26 (24 H, m, OCH2CH3), 1.39 (48 H, m, CH2CH2CH2CH2CH2CH2), 1.50 (8 H, m, OCH2CH2CH2), 1.81 (8 H, m, OCH2CH2), 3.92 (8 H, t, J 6.4 Hz, OCH2), 4.12 (2 H, d, J 16 Hz, CHACHBCOO), 4.20 (2 H, d, J 16 Hz, CHACHBCOO), 4.24 (2 H, d, J 16 Hz, CHACHBCOO), 4.34 (2 H, d, J 16 Hz, CHACHBCOO), 4.20 (16 H, m, OCH2CH3), 5.88 (4 H, s, Hc), 6.08 (2 H, s, Hb1), 6.18 (2 H, s, Hb2), 6.24 (2 H, s, Ha2), 6.39 (2 H, s, Ha1), 6.62 (8 H, d, J 8.8 Hz, He), 6.76 (8 H, d, J 8.8 Hz, Hd); δC (100.5 MHz; CDCl3; Me4Si) 14.1 (CH3), 22.7 (CH2), 26.2 (CH2), 29.4 (CH2), 29.6 (CH2), 29.7 (CH2), 31.9 (CH2), 42.3 (CH), 60.9 (CH2), 67.4 (CH2), 67.5 (CH2), 67.9 (CH2), 101.5 (CH), 101.7 (CH), 113.8 (CH), 127.2 (C), 128.2 (CH), 128.4 (CH), 129.9 (CH), 132.7 (CH), 134.8 (C), 154.7 (C), 157.2 (C), 169.2 (C), 169.2 (C).

(13)

Octaester (12) (0.81 g, 0.39 mmol) was dissolved in THF (50 mL). To this was added a KOH solution (5.7 g KOH in 30 mL water) and the reaction mixture was stirred vigorously at room temperature for 1 h. The mixture was washed with ether and the aqueous phase adjusted to pH 2. The mixture was left at -20 °C for 18 h and the resultant precipitate was collected by suction filtration. The solid was dried (0.1 mmHg) to give the octaacid (13) as a white solid (0.68 g, 94 %); δH (399.8 MHz; DMSO; Me4Si) 0.85 (12 H, m, CH3), 1.27 (48 H, m, CH2CH2CH2CH2CH2CH2), 1.44 (8 H, m, OCH2CH2CH2), 1.71 (8 H, m, OCH2CH2), 3.88 (8 H, m, OCH2), 3.94 (2 H, m, CHACHBCOO), 4.27 (4 H, m, CHACHBCOO, CHACHBCOO), 4.46 (2 H, m, CHACHBCOO), 5.71 (4 H, s, Hc), 6.04 (2 H, s, Hb1), 6.11 (2 H, s, Hb2), 6.36 (2 H, s, Ha2), 6.46 (2 H, s, Ha1), 6.61 (8 H, d, J 8.8 Hz, He), 6.69 (8 H, d, J 8.8 Hz, Hd); δC (100.5 MHz; DMSO; Me4Si) 13.9 (CH3), 22.1 (CH2), 25.7 (CH2), 28.7 (CH2), 29.0 (CH2), 29.1 (CH2), 31.3 (CH2), 41.6 (CH), 66.6 (CH2), 67.3 (CH2), 113.4 (CH), 129.7 (CH), 134.5 (C), 156.5 (C), 170.2 (C).

(14)

To a solution of octaacid (13) (120 mg, 0.064 mmol) in dry CH2Cl2 (1 mL) were added oxalyl chloride (167 μL, 1.91 mmol) and a catalytic amount of DMF (1 drop of a solution of 1 drop of DMF in 1 mL CH2Cl2) at 0 °C. After 5 min, the solution was allowed to warm to room temperature and stirred for 1 h. The solvent was removed by evaporation, and CH2Cl2 was evaporated twice from the remaining octaacid chloride, which was then dissolved in CH2Cl2 (1 mL). In a separate flask, N-hydroxysuccinimide (59.1 mg, 0.51 mmol) was co-evaporated with dry toluene (3 x 1 mL), dissolved in dry THF (1 mL), and cooled to -10 °C. Then the solution of the octaacid chloride and piperidinomethyl polystyrene HL (200-400 mesh) (467 mg, 1.91 mmol) were added, and the mixture was stirred for 2 h at room temperature. After 2 h the mixture was filtered through celite to remove the piperidinomethyl polystyrene HL (200-400 mesh), and the resulting octasuccinimidyl ester, SOXL 3 (14), was used without further purification to prepare cross-linked GST (159.3 mg, 94 %); δH (399.8 MHz; CDCl3; Me4Si) 0.90 (12 H, m, CH3), 1.35 (48 H, m, CH2CH2CH2CH2CH2CH2), 1.47 (8 H, m, OCH2CH2CH2), 1.80 (8 H, m, OCH2CH2), 2.82 (16 H, br s), 3.91 (8 H, t, J 6.8 Hz, OCH2), 4.42 (2 H, d, J 17.2 Hz, CH_ACH_BCOO), 4.56 (2 H, d, J 17.2 Hz, CH_ACH_BCOO), 4.92 (2 H, d, J 17.2 Hz, CH_ACH_BCOO), 5.02 (2 H, d, J 17.2 Hz, CH_ACH_BCOO), 5.93 (4 H, s, H_c), 6.15 (2 H, s, H_b1), 6.22 (2 H, s, H_b2), 6.49 (2 H, s, H_a2), 6.56 (2 H, s, H_a1), 6.64 (8 H, d, J 8.8 Hz, H_e), 6.77 (8 H, d, J 8.8 Hz, H_d); δC (100.5 MHz; CDCl3; Me4Si) 11.8 (CH3), 20.4 (CH2), 23.3 (CH2), 23.4 (CH2), 23.9 (CH2), 27.1 (CH2), 27.3 (CH2), 27.4 (CH2), 29.6 (CH2), 39.6 (CH), 62.4 (CH2), 63.1 (CH2), 65.6 (CH2), 111.6 (CH), 125.1 (C), 126.6 (CH), 127.6 (CH), 131.2 (CH), 151.7 (C), 151.9 (C), 155.0 (C), 162.7 (C), 162.9 (C), 166.7 (C), 167.0 (C).
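The reagent loadings in the preparation of (14) can be read as molar equivalents relative to the octaacid and to each of its eight carboxylic acid groups. The following is a minimal sketch of that arithmetic, using only the millimole quantities quoted above; the function name `equivalents` is illustrative and not part of the original procedure.

```python
def equivalents(reagent_mmol, substrate_mmol, sites=1):
    """Return (equivalents per molecule, equivalents per reactive site)."""
    per_molecule = reagent_mmol / substrate_mmol
    return per_molecule, per_molecule / sites


OCTAACID_MMOL = 0.064   # octaacid (13), from the procedure
ACID_GROUPS = 8         # carboxylic acid groups per octaacid molecule

# Millimole amounts as quoted in the procedure for (14).
reagents = {
    "oxalyl chloride": 1.91,
    "N-hydroxysuccinimide": 0.51,
    "piperidinomethyl polystyrene (base)": 1.91,
}

for name, mmol in reagents.items():
    per_molecule, per_site = equivalents(mmol, OCTAACID_MMOL, ACID_GROUPS)
    print(f"{name}: {per_molecule:.1f} eq per octaacid, {per_site:.2f} eq per acid group")
```

On these figures oxalyl chloride and the polymer-supported base are each present at about 30 equivalents per octaacid, while N-hydroxysuccinimide is used at about 8 equivalents, that is, roughly one per acid group.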

Preparation of fluorescein-tagged compound

(15)

Monohydroxy resorcinarene (Chemical Communications, 2005, 4164-4166) (350 mg, 0.46 mmol) and Cs2CO3 (1.51 g, 4.62 mmol) were combined in dry DMF (5 mL) and stirred at 50 °C for 1 h. Ethyl bromoacetate (1.24 g, 7.41 mmol) was added and the mixture stirred at 50 °C for 18 h. After evaporation of the DMF under reduced pressure, the residue was partitioned between CH2Cl2 and water. The organic phase was washed with water and dried over MgSO4. After evaporation of the solvent, the crude material was purified by silica flash chromatography [CH2Cl2 - acetone (8:2), Rf = 0.62]. The fractions containing product were collected and the solvent removed in vacuo to give the octaester (15) as a pale yellow oil (190 mg, 30 %); νmax/cm⁻¹ 2933m, 1758s, 1734m, 1612w, 1586w, 1499w, 1442w, 1407w, 1379w, 1304w, 1203s, 1182s, 1125w, 1080m; δH (399.8 MHz; CDCl3; Me4Si) 0.84 (9 H, t, J 7.2 Hz, CH3), 1.29 (42 H, m, OCH2CH3, CH2CH2CH2CH3), 1.52 (2 H, m, CH2CH2OH), 1.85 (6 H, m, CH2CH2CH2CH2CH3), 2.01 (2 H, m, CH2CH2CH2OH), 3.68 (2 H, m, CH2OH), 4.21 (16 H, m, OCH2CH3), 4.27 (16 H, s, ArOCH2), 4.59 (4 H, t, J 7.5 Hz, CHCH2), 4.71 (1 H, m, OH), 6.23 (4 H, s, H_a), 6.61 (2 H, s, H_b1), 6.63 (2 H, s, H_b2); δC (100.5 MHz; CDCl3; Me4Si) 14.2 (3 x CH3), 22.7 (CH2), 27.7 (CH2), 29.7 (CH2), 30.3 (CH2), 32.1 (CH2), 34.3 (CH2), 34.4 (CH2), 35.7 (CH), 60.9 (CH2), 61.0 (CH2), 61.1 (CH2), 61.8 (CH2), 66.8 (CH2), 67.2 (CH2), 67.3 (CH2), 100.4 (CH), 107.3 (CH), 126.5 (CH), 127.8 (C), 128.3 (C), 128.4 (C), 128.6 (C), 154.3 (C), 154.5 (C), 154.7 (C), 169.3 (C), 169.4 (C); HRMS-ES+ (m/z): [MNa]+ calcd for C78H108O25: 1468.6669; found 1467.7306.

(16)

Oxalyl chloride (26 μL, 0.28 mmol) was dissolved in CH2Cl2 (200 μL) under an argon atmosphere and cooled to -78 °C. DMSO (60 μL, 0.84 mmol) was added and the reaction mixture was stirred for 15 min at -78 °C. Alcohol (15) (109.1 mg, 0.076 mmol) dissolved in CH2Cl2 (100 μL) was added dropwise over 2 min and the reaction was stirred for 2 hr at -78 °C. Et3N (53.9 μL, 0.38 mmol) was added and the reaction was allowed to warm to 0 °C using an ice bath over 1 hr. Water (380 μL) was added and the aqueous layer was extracted with CH2Cl2 (3 x 2 mL). The organic phase was dried over MgSO4 and the solvent evaporated. The crude material was purified by silica flash chromatography [CH2Cl2 - acetone (9:1), Rf = 0.58]. The fractions containing product were collected and the solvent removed in vacuo to give aldehyde (16) as a pale yellow oil (142 mg, 79 %); νmax/cm⁻¹ 2929m, 1758s, 1735m, 1613w, 1499w, 1444w, 1408w, 1378w, 1303w, 1204s, 1182s, 1126w, 1081m; δH (399.8 MHz; CDCl3; Me4Si) 0.85 (9 H, m, CH3), 1.25 (42 H, m, OCH2CH3, CH2CH2CH2CH3), 1.84 (6 H, m, CH2CH2CH2CH2CH3), 2.21 (2 H, m, CH2CH2CH2OH), 2.52 (2 H, m, CH2CHO), 4.20 (16 H, m, OCH2CH3), 4.26 (16 H, s, ArOCH2), 4.61 (4 H, m, CHCH2), 6.20 (2 H, s, H_a1), 6.21 (2 H, s, H_a2), 6.60 (2 H, s, H_b1), 6.64 (2 H, s, H_b2); δC (100.5 MHz; CDCl3; Me4Si) 13.1 (CH3), 13.2 (CH3), 13.3 (CH3), 21.3 (CH2), 21.6 (CH2), 21.7 (CH2), 25.8 (CH2), 26.6 (CH2), 26.7 (CH2), 28.7 (CH2), 29.4 (CH2), 30.5 (CH2), 31.0 (CH2), 31.1 (CH2), 33.4 (CH2), 33.7 (CH2), 33.8 (CH2), 34.6 (CH), 34.7 (CH), 41.0 (CH), 59.8 (CH2), 59.9 (CH2), 60.0 (CH2), 60.1 (CH2), 60.6 (CH2), 65.7 (CH2), 65.9 (CH2), 66.2 (CH2), 66.3 (CH2), 66.9 (CH2), 99.0 (CH), 99.3 (CH), 99.7 (CH), 125.3 (CH), 125.4 (CH), 126.2 (C), 126.4 (C), 126.8 (C), 127.3 (C), 127.5 (C), 127.7 (C), 129.0 (C), 131.4 (C), 152.7 (C), 152.9 (C), 153.2 (C), 153.3 (C), 153.5 (C), 153.6 (C), 153.7 (C), 152.8 (C), 166.4 (C), 168.1 (C), 168.2 (C), 168.3 (C), 168.4 (C), 168.5 (C), 202.0 (CH); HRMS-ES+ (m/z): [MNa]+ calcd for C78H106O25: 1466.6669; found 1465.6621.

(17)

Aldehyde (16) (93 mg, 0.065 mmol) and fluorescein-5-thiosemicarbazide (27.2 mg, 0.065 mmol) were combined in CH3OH (4 mL) and heated for 72 hr at 60 °C. After evaporation of the solvent, the crude material was purified by silica flash chromatography [CH2Cl2 - CH3OH (10:1), Rf = 0.62]. The fractions containing product were collected and the solvent removed in vacuo to give (17) (66 mg, 55 %); δH (399.8 MHz; CDCl3; Me4Si) 0.83 (9 H, m, CH3), 1.28 (42 H, m, OCH2CH3, CH2CH2CH2CH3), 1.82 (6 H, m, CH2CH2CH2CH2CH3), 2.23 (2 H, m, CH2CH2CH2OH), 2.69 (2 H, m, CH2CHN), 4.21 (32 H, m, OCH2CH3, ArOCH2), 4.58 (4 H, m, CHCH2), 6.20 (4 H, s, H_a), 6.52 (4 H, s, H_b), 6.58 (4 H, m), 6.74 (4 H, s), 7.25 (2 H, m), 7.65 (1 H, m), 7.99 (1 H, s, OH), 10.99 (1 H, s, CHN); δC (100.5 MHz; CDCl3; Me4Si) 14.1 (CH3), 22.6 (CH2), 22.7 (CH2), 27.5 (CH2), 27.6 (CH2), 30.3 (CH2), 32.0 (CH2), 32.1 (CH2), 34.4 (CH2), 35.1 (CH), 35.5 (CH), 61.0 (CH2), 61.1 (CH2), 61.2 (CH2), 61.3 (CH2), 66.8 (CH2), 67.0 (CH2), 67.1 (CH2), 67.3 (CH2), 76.7 (CH2), 77.0 (CH2), 77.2 (CH2), 77.4 (CH2), 100.6 (CH), 103.1 (CH), 125.9 (CH), 126.4 (CH), 126.7 (C), 128.1 (C), 128.9 (C), 134.9 (C), 153.1 (C), 154.3 (C), 154.4 (C), 154.5 (C), 154.6 (C), 169.1 (C), 169.4 (C), 169.5 (C), 169.8 (C); HRMS-ES+ (m/z): [M]2+ calcd for [C99H119N3O29S]2+: 923.3839; found 923.3839.

(18)

Octaester (17) (35 mg, 0.019 mmol) was dissolved in THF (5 mL). To this was added a KOH solution (0.57 g KOH in 3 mL water) and the reaction mixture was stirred vigorously at room temperature for 1 h. The mixture was washed with ether and the aqueous phase adjusted to pH 2. The mixture was left at -20 °C for 18 h and the resultant precipitate was collected by suction filtration. The solid was dried (0.1 mmHg) to give the octaacid (18) as a white solid (24 mg, 78 %); HRMS-ES+ (m/z): [MK]2+ calcd for [C83H87N3O29S]2+: 830.88545; found 829.7273.

NHS groups may be added to the ends of the spacer arms as described above.

Preparation of biotinylated derivative

(19)

Aldehyde (16) (90 mg, 0.062 mmol) and (+)-biotin hydrazide (16.1 mg, 0.062 mmol) were combined in CH3OH (4 mL) and heated for 72 hr at 60 °C. After evaporation of the solvent, the crude material was purified by silica flash chromatography [CH2Cl2 - CH3OH (10:1), Rf = 0.51]. The fractions containing product were collected and the solvent removed in vacuo to give (19) (103 mg, 98 %); HRMS-ES+ (m/z): [MNa]2+ calcd for [C88H122N4O26S]2+: 852.9034; found 853.9055.

(20)

Octaester (19) (53.3 mg, 0.032 mmol) was dissolved in THF (5 mL). To this was added a KOH solution (0.57 g KOH in 3 mL water) and the reaction mixture was stirred vigorously at room temperature for 1 h. The mixture was washed with ether and the aqueous phase adjusted to pH 2. The mixture was left at -20 °C for 18 h and the resultant precipitate was collected by suction filtration. The solid was dried (0.1 mmHg) to give the octaacid (20) as a white solid (24 mg, 70 %); HRMS-ES+ (m/z): [M]+ calcd for C72H90N4O26S: 1459.571; found 1459.5829.

NHS groups may be added to the ends of the spacer arms as described above.

De novo synthetic route for preparing biotin- and fluorescein-tagged compounds

[Scheme showing compounds (vi 20a) and (vi 20b); structures not reproduced.]

Synthesis of compounds (vi 41) and (vi 48). i.) HCl, EtOH; ii.) Ethyl 2-bromoacetate, Cs2CO3, DMF, 80 °C; iii.) (COCl)2, DMSO, -78 °C, Et3N, rt; iv.) (+)-biotin hydrazide, MeOH, 72 hr, 60 °C; v.) KOH, H2O/THF, rt; vi.) fluorescein-5-thiosemicarbazide, MeOH, 72 hr, 60 °C; vii.) KOH, H2O/THF, rt.

Biotin derivative (vi 41) and fluorescein derivative (vi 48) were synthesized in 0.8% and 0.5% overall yields respectively. The poor yields are a consequence of the first synthetic step, which gave 5% compared with the literature yield of 25%. Compounds (vi 20a) and (vi 20b) were poorly separated by silica column chromatography, with 90% of fractions remaining mixed. All attempts to improve the recovery of the desired (vi 20b) were unsuccessful. Despite the poor overall yields, this remains an excellent route to mono-derivatised resorcin[4]arenes, as the starting materials are relatively inexpensive. Compounds (vi 41) and (vi 48) require activation to the NHS ester to give functional chemical cross-linkers.
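The quoted overall yields can be checked by multiplying the individual step yields. The sketch below assumes that the steps after the first one proceed with the same yields reported earlier for compounds (15) to (20), namely 30 % alkylation, 79 % oxidation, then 98 % and 70 % for the biotin branch or 55 % and 78 % for the fluorescein branch. That mapping is an assumption made for illustration; the text itself states only the 5 % first step and the overall figures.

```python
from math import prod


def overall_yield(step_yields_percent):
    """Overall yield (%) of a linear sequence, as the product of step yields."""
    return 100 * prod(y / 100 for y in step_yields_percent)


# Step yields in percent. The first entry is the 5 % resorcinarene-forming
# step; the remaining values are ASSUMED to match compounds (15)-(20) above.
biotin_route = [5, 30, 79, 98, 70]
fluorescein_route = [5, 30, 79, 55, 78]

print(f"biotin route:      {overall_yield(biotin_route):.1f} %")
print(f"fluorescein route: {overall_yield(fluorescein_route):.1f} %")
# For comparison: the same biotin route with the 25 % literature first step.
print(f"biotin route, 25 % first step: {overall_yield([25, 30, 79, 98, 70]):.1f} %")
```

Under these assumptions the products round to 0.8 % and 0.5 %, in line with the figures stated above, and the first step dominates the loss.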

Incorporation of biotin tag on upper rim of resorcinarene (X=R)

One equivalent of biotin hydrazide and Et3N were added to compound (v 6) in CH2Cl2 and the reaction was stirred at room temperature for 2 hours. After this time the solvent was evaporated and the remaining esters were hydrolyzed with 1 M NaOH to give a mixture of products that were identified by MS analysis. This mixture included (vii 16b) as well as compounds with masses corresponding to the octa-acid and to the addition of two, three and four biotins.

Synthesis of compound (vii 16b). i) biotin hydrazide, Et3N, DCM, 2 h; ii) 1 M NaOH.

Incorporation of a cleavable sulfone moiety into linker arms

[Scheme: structures not reproduced; labelled intermediates include (vi137) and (vi138).]

Synthesis of compound (vii 17). i.) DBU, DCM, 24 hr; ii.) mCPBA, DCM, 24 hr; iii.) TFA; iv.) (COCl)2, DCM, then Et3N, (vi139), 8 hr; v.) bis(tributyltin) oxide, toluene, reflux; vi.) (COCl)2, then N-hydroxysuccinimide, piperidinomethyl polystyrene, THF, -10 °C. This synthesis is explained in more detail in the following passages.

3-(2-tert-Butoxycarbonylamino-ethylsulfanyl)-propionic acid methyl ester (21)

Methyl mercaptopropionate (2.70 g, 22.5 mmol) and BocN-ethyl bromide (5.03 g, 22.5 mmol) were combined in dry DCM (10 mL) at -78 °C. To this stirred solution was added DBU (3.42 g, 22.5 mmol) dropwise over 30 minutes. The solution was stirred for 2 h at -78 °C and then allowed to warm to room temperature and stirred for a further 48 h. The solution was diluted with DCM (50 mL), washed with water and then brine, and dried over MgSO4. After evaporation of the solvent, the crude material was purified by silica flash chromatography [petroleum ether:ethyl acetate, 8:2, Rf = 0.47]. The fractions containing product were collected and the solvent removed in vacuo to give 3-(2-tert-butoxycarbonylamino-ethylsulfanyl)-propionic acid methyl ester 21 as a pale yellow oil (5.27 g, 89 %); δH (400 MHz; CDCl3; Me4Si) 1.45 (9 H, s, C(CH3)3), 2.62 (2 H, t, J 7.2 Hz, SCH2CH2CO), 2.67 (2 H, t, J 7.2 Hz, SCH2CH2N), 2.80 (2 H, t, J 7.2 Hz, COCH2), 3.32 (2 H, m, NHCH2), 3.70 (3 H, s, COCH3), 4.90 (1 H, s, NH); δC (100.5 MHz; CDCl3; Me4Si) 24.7 (CH2), 26.3 (CH3), 26.4 (CH3), 26.5 (CH3), 30.5 (CH2), 32.7 (CH2), 49.9 (CH3), 77.5 (C), 153.9 (C), 170.3 (C); HRMS-ES+ (m/z): [MNa]+ calcd for C11H21NO4S 286.1089, found 286.1082.

3-(2-tert-Butoxycarbonylamino-ethanesulfonyl)-propionic acid methyl ester (22)

vi138

3-(2-tert-Butoxycarbonylamino-ethylsulfanyl)-propionic acid methyl ester (2.91 g, 11.06 mmol) was dissolved in dry DCM (10 mL) and cooled to -10 °C. To this stirred solution was added mCPBA (5.73 g, 33.19 mmol) dropwise over 30 minutes and the mixture was stirred for a further 48 h at room temperature. After this time Na2S2O3 (5 mL, sat.) and Na2CO3 (5 mL, sat.) were added and the reaction stirred for 2 h at room temperature. The organic phase was then washed with water and then brine, and dried over MgSO4. After evaporation of the solvent, the crude material was purified by silica flash chromatography [petroleum ether:ethyl acetate, 1:1, Rf = 0.41]. The fractions containing product were collected and the solvent removed in vacuo to give 3-(2-tert-butoxycarbonylamino-ethanesulfonyl)-propionic acid methyl ester 22 as a pale yellow oil (3.2 g, 98 %); δH (400 MHz; CDCl3; Me4Si) 1.41 (9 H, s, C(CH3)3), 2.85 (2 H, t, J 7.2 Hz, SCH2CH2N), 3.24 (2 H, t, J 7.2 Hz, COCH2), 3.34 (2 H, t, J 7.2 Hz, SCH2CH2CO), 3.61 (2 H, m, NHCH2), 3.71 (3 H, s, COCH3), 5.40 (1 H, s, NH); δC (100.5 MHz; CDCl3; Me4Si) 26.5 (CH2), 28.2 (CH3), 28.4 (CH3), 28.6 (CH3), 34.4 (CH2), 49.1 (CH2), 52.5 (CH3), 53.0 (CH2), 80.1 (C), 155.8 (C), 170.8 (C); HRMS-ES+ (m/z): [MNa]+ calcd for C11H21NO6S 318.0987, found 318.0967.

3-(2-Amino-ethanesulfonyl)-propionic acid methyl ester (23)

vi139

Compound 3-(2-tert-butoxycarbonylamino-ethanesulfonyl)-propionic acid methyl ester 22 (128 mg, 0.44 mmol) was treated with TFA (15 mL). This mixture was stirred at room temperature for 5 min. After evaporation of the TFA using a directed nitrogen stream, the crude material was purified by silica flash chromatography [DCM - MeOH (9:1), Rf = 0.44]. The fractions containing product were collected and the solvent removed in vacuo to give 3-(2-amino-ethanesulfonyl)-propionic acid methyl ester 23, as the TFA salt, as a pale yellow oil (128 mg, 100 %); δH (400 MHz; MeOH; Me4Si) 2.91 (2 H, t, J 7.2 Hz, SCH2CH2N), 3.49 (2 H, t, J 7.2 Hz, COCH2), 3.50 (2 H, t, J 7.2 Hz, SCH2CH2CO), 3.55 (2 H, m, NHCH2), 3.76 (3 H, s, COCH3); δC (100.5 MHz; MeOH; Me4Si) 27.5 (CH2), 34.3 (CH2), 49.2 (CH2), 50.8 (CH2), 52.8 (CH3), 172.7 (C); HRMS-ES+ (m/z): [MH]+ calcd for C6H13NO4S 196.0643, found 196.0629.
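The calculated HRMS values quoted for the small sulfide and sulfone intermediates (21) to (23) can be reproduced from monoisotopic atomic masses. The sketch below is illustrative only: the helper names are hypothetical, the mass table covers just the elements needed here, and the electron mass is ignored, which appears to match the convention behind the quoted "calcd" values.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
    "Na": 22.9897693,
}


def monoisotopic_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C11H21NO4S'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def adduct_mz(formula, adduct):
    """m/z of a singly charged adduct ion; electron mass is neglected."""
    return monoisotopic_mass(formula) + MONOISOTOPIC[adduct]


# (compound, formula, adduct, calcd value quoted in the text)
for label, formula, adduct, quoted in [
    ("21", "C11H21NO4S", "Na", 286.1089),
    ("22", "C11H21NO6S", "Na", 318.0987),
    ("23", "C6H13NO4S", "H", 196.0643),
]:
    mz = adduct_mz(formula, adduct)
    print(f"({label}) [M{adduct}]+ computed {mz:.4f}, quoted {quoted:.4f}")
```

The same function could be applied to the larger resorcinarene formulae, although multiply charged ions would additionally need division by the charge.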

2,2',2'',2''',2'''',2''''',2'''''',2'''''''-[(2,8,14,20-Tetra{4-[2-(2-methoxyethoxy)ethoxy]phenyl}pentacyclo[19.3.1.1^3,7.1^9,13.1^15,19]octacosa-1(25),3,5,7(28),9,11,13(27),15,17,19(26),21,23-dodecaene-4,6,10,12,16,18,22,24-octayl)octakis(oxy)]octakis-3-(2-amino-ethanesulfonyl)-propionic acid methyl ester (24)

To a solution of octaacid (13) (650 mg, 0.34 mmol) in dry CH2Cl2 (5 mL) were added oxalyl chloride (2 mL, 17.1 mmol) and a catalytic amount of DMF (1 drop of a solution of 1 drop of DMF in 1 mL CH2Cl2) at 0 °C. After 5 min, the solution was allowed to warm to room temperature and stirred for 1 h. The solvent was removed by evaporation, and CH2Cl2 was evaporated twice from the remaining octaacid chloride, which was then dissolved in CH2Cl2 (5 mL). Compound (vi139) (115.8 mg, 0.83 mmol) was dissolved in CH2Cl2 (5 mL). After addition of Et3N (1.1 mL, 7.3 mmol), the mixture was cooled to 0 °C, and the octaacid chloride in dry CH2Cl2 (1 mL) was added slowly. The mixture was stirred for 10 min at 0 °C and 2 h at room temperature. After evaporation of solvent under reduced pressure, the residue was partitioned between CH2Cl2 and water. The organic phase was washed with water and dried over MgSO4. After evaporation of the solvent, the crude material was purified by silica flash chromatography [CH2Cl2 - CH3OH (9:1), Rf = 0.47]. The fractions containing product were collected and the solvent removed in vacuo to give (24) as a pale yellow solid (871 mg, 77 %); δH (400 MHz; CDCl3; Me4Si) 2.75 (4 H, t, J 7.2 Hz, COCH2), 2.82 (4 H, t, J 7.2 Hz, SCH2CH2N), 3.26 (4 H, t, J 7.2 Hz, SCH2CH2CO), 3.26 (12 H, s, OCH3), 3.37 (4 H, m, NHCH2), 3.53 (8 H, m, CH3OCH2), 3.63 (8 H, m, CH3OCH2CH2), 3.65 (8 H, m, CH3OCH2CH2OCH2), 3.68 (8 H, m, ArOCH2CH2OCH2), 3.69 (24 H, s, COOCH3), 3.80 (8 H, t, J 5 Hz, ArOCH2CH2), 4.01 (8 H, t, J 5 Hz, ArOCH2), 4.24 (4 H, d, J 16 Hz, CH_AH_BCOO), 4.35 (4 H, d, J 16 Hz, CH_AH_BCOO), 4.41 (4 H, d, J 16 Hz, CH_AH_BCOO), 4.45 (4 H, d, J 16 Hz, CH_AH_BCOO), 5.74 (4 H, s, H_c), 6.17 (2 H, s, H_b1), 6.45 (2 H, s, H_b2), 6.56 (16 H, m, H_e,d), 6.69 (4 H, s, H_a1,a2); δC (100.5 MHz; CDCl3; Me4Si) 32.3 (CH2), 32.4 (CH2), 32.7 (CH2), 33.2 (CH2), 42.4 (CH), 48.8 (CH2), 48.9 (CH2), 49.1 (CH2), 49.2 (CH2), 49.4 (CH2), 49.6 (CH2), 49.8 (CH2), 51.3 (CH2), 51.5 (CH2), 51.8 (CH2), 52.3 (CH3), 52.5 (CH2), 58.9 (CH3), 67.3 (CH2), 67.5 (CH2), 68.7 (CH2), 69.7 (CH2), 70.4 (CH2), 70.5 (CH2), 70.6 (CH2), 71.8 (CH2), 97.5 (CH), 100.9 (CH), 114.3 (CH), 125.3 (C), 127.3 (C), 129.0 (CH), 129.6 (CH), 132.0 (CH), 132.8 (C), 153.7 (C), 153.8 (C), 157.4 (C), 168.6 (C), 168.7 (C), 168.8 (C), 168.9 (C), 170.9 (C), 171.0 (C); HRMS-ES+ (m/z): [M]3+ calcd for [C144H200N8O64S8]3+: 1109.0209; found 1109.0192.

2,2',2'',2''',2'''',2''''',2'''''',2'''''''-[(2,8,14,20-Tetra{4-[2-(2-methoxyethoxy)ethoxy]phenyl}pentacyclo[19.3.1.1^3,7.1^9,13.1^15,19]octacosa-1(25),3,5,7(28),9,11,13(27),15,17,19(26),21,23-dodecaene-4,6,10,12,16,18,22,24-octayl)octakis(oxy)]octakis-3-(2-amino-ethanesulfonyl)-propionic acid (25)

To a suspension of compound (24) (462 mg, 0.0105 mmol) in dry toluene (15 mL) was added bis(tributyltin) oxide (1.3 g, 2.24 mmol). The mixture was refluxed (107 °C) under argon for 48 hr, resulting in the formation of a red solution. After this time the toluene was evaporated and the oil suspended in ethyl acetate (20 mL). To this was added 2 M HCl (5 mL) and the biphasic mixture was stirred for 2 hr at room temperature. An off-white precipitate formed between the two layers, which was collected, washed with water and diethyl ether, and residual solvent removed in vacuo to give (25) as an off-white solid (284 mg, 64 %); δH (400 MHz; CDCl3; Me4Si) 2.60 (4 H, t, J 7.2 Hz, COCH2), 2.68 (4 H, t, J 7.2 Hz, SCH2CH2N), 3.24 (12 H, s, OCH3), 3.26 (4 H, t, J 7.2 Hz, SCH2CH2CO), 3.39 (4 H, m, NHCH2), 3.44 (8 H, m, CH3OCH2), 3.49 (8 H, m, CH3OCH2CH2), 3.56 (8 H, m, CH3OCH2CH2OCH2), 3.61 (8 H, m, ArOCH2CH2OCH2), 3.74 (8 H, m, ArOCH2CH2), 3.98 (8 H, m, ArOCH2), 4.20 (4 H, d, J 16 Hz, CH_AH_BCOO), 4.21 (4 H, d, J 16 Hz, CH_AH_BCOO), 4.52 (4 H, d, J 16 Hz, CH_AH_BCOO), 4.56 (4 H, d, J 16 Hz, CH_AH_BCOO), 5.82 (4 H, s, H_c), 6.01 (2 H, s, H_b1), 6.21 (2 H, s, H_b2), 6.64 (2 H, s, H_a2), 6.69 (16 H, m, H_e,d), 6.94 (2 H, s, H_a1); δC (100.5 MHz; CDCl3; Me4Si) 26.7 (CH2), 27.1 (CH2), 40.6 (CH), 48.5 (CH2), 48.7 (CH2), 50.6 (CH2), 51.6 (CH2), 52.4 (CH2), 58.5 (CH3), 63.3 (CH2), 64.9 (CH2), 67.3 (CH2), 69.5 (CH2), 70.1 (CH2), 70.3 (CH2), 70.4 (CH2), 71.7 (CH2), 114.8 (CH), 127.7 (C), 129.5 (CH), 156.9 (C), 168.3 (C), 172.1 (C); HRMS-ES+ (m/z): [M]3+ calcd for [C136H184N8O64S8]3+: 1069.2816; found 1069.2816.

2,2',2'',2''',2'''',2''''',2'''''',2'''''''-[(2,8,14,20-Tetra{4-[2-(2-methoxyethoxy)ethoxy]phenyl}pentacyclo[19.3.1.1^3,7.1^9,13.1^15,19]octacosa-1(25),3,5,7(28),9,11,13(27),15,17,19(26),21,23-dodecaene-4,6,10,12,16,18,22,24-octayl)octakis(oxy)]octakis-3-(2-amino-ethanesulfonyl)-propionic acid octa-2,5-dioxo-pyrrolidin-1-yl ester (26)

To a solution of octaacid (25) (32.5 mg, 0.010 mmol) in dry CH2Cl2 (1 mL) were added oxalyl chloride (100 mg, 1.26 mmol) and a catalytic amount of DMF (1 drop of a solution of 1 drop of DMF in 1 mL CH2Cl2) at room temperature. The mixture was stirred at room temperature for 24 h, resulting in the formation of a red solution. The solvent was removed by evaporation, and CH2Cl2 was evaporated twice from the remaining octaacid chloride, which was then dissolved in CH2Cl2 (1 mL). In a separate flask, N-hydroxysuccinimide (9.2 mg, 0.08 mmol) was co-evaporated with dry toluene (3 x 1 mL), dissolved in dry THF (1 mL), and cooled to -10 °C. Then the solution of the octaacid chloride and piperidinomethyl polystyrene HL (200-400 mesh) (41.4 mg, 0.17 mmol) were added, and the mixture was stirred for 2 h at room temperature. After 2 h the mixture was filtered through sintered glass to remove the piperidinomethyl polystyrene HL (200-400 mesh), and the resulting octasuccinimidyl ester (26) was used without further purification to prepare cross-linked GST (data not shown) (39.9 mg, 48 %); δH (400 MHz; CDCl3; Me4Si) 2.59 (4 H, t, J 7.2 Hz, COCH2), 2.61 (32 H, s), 2.67 (4 H, t, J 7.2 Hz, SCH2CH2N), 3.24 (12 H, s, OCH3), 3.27 (4 H, t, J 7.2 Hz, SCH2CH2CO), 3.40 (4 H, m, NHCH2), 3.45 (8 H, m, CH3OCH2), 3.47 (8 H, m, CH3OCH2CH2), 3.57 (8 H, m, CH3OCH2CH2OCH2), 3.62 (8 H, m, ArOCH2CH2OCH2), 3.75 (8 H, m, ArOCH2CH2), 3.98 (8 H, m, ArOCH2), 4.19 (4 H, d, J 16 Hz, CH_AH_BCOO), 4.21 (4 H, d, J 16 Hz, CH_AH_BCOO), 4.53 (4 H, d, J 16 Hz, CH_AH_BCOO), 4.56 (4 H, d, J 16 Hz, CH_AH_BCOO), 5.82 (4 H, s, H_c), 6.02 (2 H, s, H_b1), 6.21 (2 H, s, H_b2), 6.65 (2 H, s, H_a2), 6.70 (16 H, m, H_e,d), 6.94 (2 H, s, H_a1); δC (100.5 MHz; CDCl3; Me4Si) 25.7 (CH2), 26.8 (CH2), 27.1 (CH2), 40.5 (CH), 48.5 (CH2), 49.1 (CH2), 50.6 (CH2), 51.7 (CH2), 52.2 (CH2), 58.5 (CH3), 63.2 (CH2), 64.9 (CH2), 67.2 (CH2), 69.5 (CH2), 70.0 (CH2), 70.2 (CH2), 70.4 (CH2), 71.7 (CH2), 114.7 (CH), 127.6 (C), 129.4 (CH), 156.9 (C), 162.7 (C), 168.3 (C), 170.4 (C), 172.1 (C), 173.3 (C).

The presence of a sulfone group in the cross-linking reagent can allow for cleavage of a conjugate through hydrolysis of the linkage under basic conditions. In 0.1 M sodium phosphate, adjusted to pH 11.6 by addition of Tris base, containing 6 M urea, 0.1 % SDS, and 2 mM DTT, sulfone groups have been successfully cleaved after incubation at 37 °C for 2 h.

Hydrolysis of the sulfone yields two cross-linker fragments, one terminating in a sulfonic acid group and the other containing a hydroxyl group, as shown below:

Cross-Linking of GST with SOXL

Production and purification of glutathione S-transferase (GST).

XL-1 Blue E. coli (Stratagene, CA, USA) were transformed with the pGEX2T plasmid (Amersham Biosciences). This plasmid drives bacterial expression of the Schistosoma japonicum form of GST. XL-1 Blue harbouring pGEX2T were cultured in 2xYT containing 100 μg/ml ampicillin at 37 °C to an OD600 of 0.6-0.8 before being transferred to 28 °C, at which time isopropyl-β-D-thiogalactoside was added to a final concentration of 0.1 mM, as previously described (1). After 16 h incubation, cells were harvested and lysed, and GST was purified over Glutathione-Sepharose according to the manufacturer's recommendations (Amersham Biosciences). Eluted GST was dialysed extensively against 4 changes of PBS and the protein concentration determined by Bradford assay. The integrity of the GST was verified by SDS-PAGE and Coomassie blue staining. Purified GST was stored in 100 μl aliquots at -80 °C until required.

Cross-linking experiments

GST was used at a constant final concentration of 2.7 μM in 30% (v/v) phosphate buffered saline, pH 7.2 (Invitrogen), and 70% (v/v) water. Where stated, BSA was added in equimolar concentrations. Stock solutions of cross-linkers 1 and 5 were prepared in DMSO at identical concentrations, and between 1 and 8 molar equivalents were added to the GST solution to a final volume of 100 μl. The overall DMSO concentration was constant for all experiments (8% v/v). Reactions were performed for 60 minutes at room temperature and quenched by addition of 1 μl of 1 M Tris-HCl, pH 7.5. After cross-linking, samples were heated with 2-mercaptoethanol, and 20 μl of each sample was loaded per lane and separated through 10% polyacrylamide SDS gels.
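The amounts of cross-linker implied by these conditions follow from simple unit arithmetic (μM x μl = pmol). The sketch below tabulates them for 1 to 8 molar equivalents. It assumes, purely for illustration, that the full 8 μl of DMSO in each 100 μl reaction is used to deliver the cross-linker; the text does not state how the DMSO volume was split between stock and make-up solvent, and the function name is hypothetical.

```python
def crosslinker_plan(protein_uM=2.7, volume_ul=100.0, dmso_fraction=0.08,
                     equivalents=range(1, 9)):
    """Return (equivalents, pmol cross-linker, stock concentration in uM).

    The stock concentration assumes the entire DMSO volume carries the
    cross-linker (an illustrative assumption, not stated in the text).
    """
    protein_pmol = protein_uM * volume_ul   # uM x ul = pmol
    dmso_ul = dmso_fraction * volume_ul     # DMSO volume per reaction
    plan = []
    for eq in equivalents:
        linker_pmol = eq * protein_pmol
        stock_uM = linker_pmol / dmso_ul    # pmol / ul = uM
        plan.append((eq, linker_pmol, stock_uM))
    return plan


for eq, pmol, stock in crosslinker_plan():
    print(f"{eq} eq: {pmol:.0f} pmol cross-linker, stock {stock:.2f} uM in DMSO")
```

Each reaction contains 270 pmol of GST, so the 8-equivalent sample carries 2160 pmol of cross-linker, or 17280 pmol of succinimidyl ester groups in the case of the octavalent SOXL 1.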

SDS-PAGE and Immunoblotting

SDS-PAGE and immunoblotting were carried out as described previously (2, 3). A monoclonal anti-GST antibody (BD Bioscience Pharmingen, Oxford) was used for detection of GST and its cross-linked forms at a concentration of 5 μg/ml.

Rabbit anti-rat secondary antibody conjugated to horseradish peroxidase was used at a concentration of 0.05 μg/ml (Dako, Denmark). Immunoblots were developed using the ECL system (Amersham Biosciences) and Kodak X-AR 5 film.

Results

Octavalent cross-linker, SOXL 1, was rapidly synthesized using acid-catalyzed Friedel-Crafts alkylation of resorcinol with 4-(2-(2-(2-methoxyethoxy)ethoxy)ethoxy)benzaldehyde in absolute ethanol at 80 °C to provide resorcinarene 2 in 70 % yield after recrystallization from hot ethanol (scheme 1).[14] Subsequent alkylation of all eight resorcin[4]arene phenols proceeded in good overall yield to give the octaethyl ester 3, using a two-fold molar excess of ethyl 2-bromoacetate for each phenolate. Saponification of the resulting ethyl esters provided the octa-acid, 4, which was treated with oxalyl chloride, then N-hydroxysuccinimide, to yield the prototypical resorcin[4]arene-derived succinimidyl octaester chemical cross-linker SOXL 1. We found that employing the polymer-supported base, piperidinomethyl polystyrene, in this transformation provided high yields of essentially pure product 1.

Scheme 2. Reagents and conditions: (a) Ethyl 2-bromoacetate, Cs2CO3, DMF, 80 °C; (b) KOH, H2O/THF, rt; (c) (COCl)2, then N-hydroxysuccinimide, piperidinomethyl polystyrene, THF, -10 °C.

Comparison of NMR data with similar polyether derivatives suggested a flattened cone structure,[14] but we subsequently obtained crystals of the ethyl ester 3,[15] that provided unambiguous assignment of the C2h isomer in which adjacent pairs of aldehyde-derived groups sit on opposite faces of the central macrocycle. The terminal esters occupy a distorted rhomboid geometry, with C-C distances for the carbonyls ranging from 4.5 to 13.5 Å (figure 1). We then set out to compare the octaester 1 with a commercial homobivalent cross-linker, disuccinimidyl suberate, DSS 5, for their efficiency in cross-linking a known protein dimer. Glutathione S-transferases (GST) [E.C. 2.5.1.18][16] exist as homodimers, although higher order oligomers have also been reported.[17] Experiments were performed on the Schistosoma japonicum form of GST purified from E. coli transformed with the pGEX2T expression vector.[18] Aliquots of between 1 and 8 molar equivalents of SOXL 1 or DSS 5 were added to GST solution in phosphate buffer to a constant final volume of DMSO. All solutions remained clear, suggesting complete dissolution of either cross-linker.

Immunoblotting provided the principal means for visual inspection of the results. In the absence of cross-linker, only monomeric GST is apparent (Figure 2a, lane 13). At 2, 4, 6 and 8 molar equivalents of cross-linker to GST, increased levels of dimer (~56 kDa) are apparent in samples treated with SOXL 1 (Figure 2a, lanes 3, 5, 7 & 9) compared to DSS 5 (Figure 2a, lanes 4, 6, 8 & 10). For SOXL 1 treated lanes, an additional band consistent with addition of at least one copy of SOXL 1 (~1.6 kDa) to GST monomer (~27 kDa) is apparent. SOXL 1 treated GST-dimer has a slightly higher molecular weight than that treated with DSS 5, most likely due to the additional molecular weight contributed by SOXL 1. The nature of the higher molecular weight band in the DSS 5 treated lanes is currently not clear.

Next we sought to explore whether the numerous reactive esters presented by the multivalent linker might survive long enough to react with non-neighboring proteins. To test this, we used BSA (Bovine Serum Albumin), which does not form strong complexes with GST. Equimolar mixtures of BSA and GST were exposed to either of the cross-linkers and the resulting mixtures examined by immunoblotting. As observed in Figure 2a, lanes 11 and 12, the presence of BSA did not alter GST-dimer formation. Importantly, probing with an anti-BSA antibody only detected BSA monomer (~66 kDa), with no evidence of covalently linked GST-BSA (~93 kDa), shown in Figure 2b. These results suggest that SOXL 1 only combines with near-neighbor proteins.
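The band assignments discussed above rest on simple mass addition. As a rough sketch, the approximate masses quoted in the text (GST monomer ~27 kDa, BSA ~66 kDa, SOXL 1 ~1.6 kDa) can be summed to give nominal masses for the species of interest. The helper below is illustrative; the values are nominal sums, not measured gel mobilities, and attached cross-linker is counted at its full intact mass.

```python
# Approximate masses in kDa, as quoted in the text.
MASS_KDA = {"GST": 27.0, "BSA": 66.0, "SOXL1": 1.6}


def complex_mass(proteins, n_linkers=0):
    """Nominal mass (kDa) of a cross-linked species.

    proteins  -- list of protein names, e.g. ["GST", "GST"]
    n_linkers -- number of SOXL 1 molecules attached (counted at full mass)
    """
    return sum(MASS_KDA[p] for p in proteins) + n_linkers * MASS_KDA["SOXL1"]


species = {
    "GST monomer": (["GST"], 0),
    "GST monomer + 1 SOXL 1": (["GST"], 1),
    "GST dimer + 1 SOXL 1": (["GST", "GST"], 1),
    "BSA monomer": (["BSA"], 0),
    "GST-BSA heterodimer": (["GST", "BSA"], 0),
}

for name, (proteins, n) in species.items():
    print(f"{name}: ~{complex_mass(proteins, n):.1f} kDa")
```

The GST-BSA sum reproduces the ~93 kDa figure used above as the position at which a non-specific cross-link would be expected, and a GST dimer carrying one SOXL 1 comes out close to the ~56 kDa dimer band.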

Incorporation of multiple succinimidyl ester functionality onto a semi-rigid scaffold furnished a reagent capable of cross-linking GST homo-dimers and higher-order species more effectively than the homobivalent DSS 5. We envisage that the semi-rigid architecture described biases modification toward the protein surface rather than the core. Meanwhile, the number of reactive functional groups presented over a large surface area increases cross-linking between neighboring proteins and the amount of stabilized multimeric species when compared to simple divalent cross-linkers, such as DSS 5. Modular improvement of the prototype cross-linker 1 will be facilitated by the rapid, adaptable synthesis described herein. For example, incorporation of photoactivatable diazirine functionality may offer a more reactive but short-lived species upon irradiation and allow the timing of cross-linking to be controlled more precisely. Meanwhile, adjustment of the aldehyde-derived component may achieve modulation of solubility, or incorporation of biotin for affinity purification. Finally, the defined geometry offered by SOXL 1 may offer insight into the topology of the interacting protein complex by analysis of its footprint, perhaps after partial digestion of the entrapped protein and mass spectrometry.

Addition of functionality to increase the ionization of the linker under MS conditions is anticipated.

Cross-Linking of DnaD with SOXL

The SOXL molecule has also been used in a study of the domain organisation of the DnaD protein of Bacillus subtilis. During DNA replication in low G+C content gram positive bacteria including B. subtilis, the DnaD protein participates in an essential primosomal cascade to recruit the replicative helicase and primase proteins to the oriC or to a restart site (Sonenshein et al, 2002).

DnaD consists of N-terminal (Nd) and C-terminal (Cd) domains. Cd is tetrameric, binds DNA and DNA-binding induces its oligomerisation. Although Cd binds to the plasmid pBR322 it fails to open it up either alone or in the presence of Nd.

Overall it appears that both Nd and Cd participate in different oligomerisation interfaces; Nd oligomerisation is DNA independent whereas Cd oligomerisation is DNA dependent.

Nd and Cd must be linked in the same polypeptide to exhibit DNA remodelling activity.

Cloning and purification of Nd and Cd

C-terminally His-tagged Nd and Cd were constructed from a pET22b-DnaD vector (Turner et al, 2004) by amplifying NdeI-XhoI gene fragments using PCR and cloning into the same sites of pET22b. Both domains were tagged with the ELH6 sequence at their C-termini. They were over-expressed in E. coli BL21(DE3) and purified using a HiTrap-Ni2+-chelating column and gel filtration through Superdex S75. Both proteins were purified in 50 mM Tris pH 7.5, 100 mM NaCl, 1 mM DTT and made up to 10% v/v glycerol before snap-freezing in liquid nitrogen for storage at -80 °C.

SOXL cross linking

Proteins (DnaD, Nd) were incubated with SOXL at different protein:SOXL molar ratios (1:14 and 1:28) in 50 mM Tris pH 7.5, 2 mM EDTA, 1 mM DTT, 350 mM NaCl (for DnaD) or 100 mM NaCl (for Nd and Cd) for 30 minutes at 37 °C. Samples were then resolved by SDS-PAGE through a 12% gel. A 25 mM stock of SOXL in DMSO was prepared for long-term storage; prior to a cross-linking experiment, SOXL was diluted further in 10% v/v DMSO and added to the protein solution, keeping the DMSO at a maximum of 2% v/v in the reaction mixture. Cross-linking of Cd (18 μM) was carried out with increasing concentrations of SOXL (0.5, 1.0, 1.5 mM) in the presence or absence of the same ss19mer oligonucleotide (0.9 and 0.18 μM) that was used in the gel shift assays.
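The concentrations quoted for the Cd experiment can be converted into SOXL:protein molar ratios, and the amount of 25 mM stock needed can be estimated with the usual C1V1 = C2V2 relation. The sketch below is illustrative: the function names are hypothetical, and the 20 μl reaction volume is an assumed value, since the text does not state the reaction volume.

```python
def soxl_to_protein_ratio(soxl_mM, protein_uM):
    """Molar ratio of SOXL to protein (SOXL per protein molecule)."""
    return soxl_mM * 1000.0 / protein_uM


def stock_volume_ul(final_mM, final_volume_ul, stock_mM=25.0):
    """Volume of stock (in ul) supplying the required amount: C1V1 = C2V2."""
    return final_mM * final_volume_ul / stock_mM


CD_UM = 18.0                 # Cd concentration from the text
REACTION_VOLUME_UL = 20.0    # ASSUMED reaction volume, not given in the text

for soxl_mM in (0.5, 1.0, 1.5):
    ratio = soxl_to_protein_ratio(soxl_mM, CD_UM)
    vol = stock_volume_ul(soxl_mM, REACTION_VOLUME_UL)
    print(f"SOXL {soxl_mM} mM: about 1:{ratio:.0f} protein:SOXL, "
          f"equivalent to {vol:.2f} ul of 25 mM stock per {REACTION_VOLUME_UL:.0f} ul")
```

At 18 μM Cd, the lowest SOXL concentration (0.5 mM) corresponds to a ratio of about 1:28, the higher of the two ratios quoted for the DnaD and Nd experiments.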

Analytical ultracentrifugation

Analytical ultracentrifugation was carried out in an Optima XL-A ultracentrifuge (Beckman Coulter, Palo Alto, CA, USA). Sedimentation velocity was carried out at three loading concentrations in two-channel centrepieces. All experiments were run at 40,000 rpm; data were taken at 5 minute intervals and analysed using SEDFIT (www.analyticalultracentrifugation.com). Sedimentation equilibrium was carried out in six-channel centrepieces and samples were centrifuged at 12,000, 16,000, 22,000 and 28,000 rpm. Data were fitted globally using SEDPHAT. Errors were estimated by 1,000 runs of a Monte Carlo simulation. Solvent density was determined using an Anton Paar DMA 5000 density meter (Anton Paar, Hertford, UK). Solution viscosity was determined using a Schott Geräte viscometry unit attached to an Ostwald viscometer (Schott Geräte, Germany).

Gel shift assays

Gel shift assays were carried out in 50 mM Tris pH 7.5, 0.1 mM EDTA, 4 mM MgCl2, 1 mM DTT, 10% v/v glycerol, 2.5 nM radioactively labelled probe (ss or ds 19mer, 30mer and 54mer oligonucleotides) at different concentrations of proteins, as indicated. Reactions were incubated at room temperature for 15 minutes. Samples were resolved through an 8% non-denaturing polyacrylamide gel made in TBE and run in 0.5x TBE. Gels were dried, visualised and analysed with a phosphorimager. Agarose shift assays were carried out as described before (Zhang et al, 2005).

RESULTS

Domain organisation of DnaD

Using the domain prediction programme InterPro (Apweiler et al, 2001) together with secondary structure prediction programmes (nnpredict and Predict Protein), we identified a domain boundary that separates two domains: Nd (residues 1-128) and Cd (residues 129-232). We cloned, over-expressed and purified Nd and Cd as C-terminally tagged proteins. From their gel filtration elution profiles it appeared that Nd (MW 16,056 Da, including the extra ELH6 tag) is oligomeric while Cd (MW 13,730 Da, including the extra M, ELH6 tag) is monomeric. No interaction between the two domains was detected (Fig. 3A). Velocity and equilibrium ultracentrifugation analysis established that Nd is a tetramer while Cd is a monomer. Sedimentation velocity data showed that both proteins sedimented as single species, and the molecular mass distributions were determined using the c(M) method of Schuck (Schuck, 2000). Nd has the MW of a tetramer (62 kDa), while Cd is a monomer (13 kDa). Direct fitting of the Lamm equation to the velocity traces yielded molecular weight values of a tetramer (58,842 [54,542 63,142] Da) for Nd and a monomer (13,092 [12,417 14,092] Da) for Cd; errors quoted are from the nonlinear fit at the 67% confidence level. Frictional coefficients were determined using the program SEDNTERP, and examination of these values showed that Nd was in a more elongated conformation than Cd (1.482 vs 1.282). All hydrodynamic data are summarized in Table 1 below. Sedimentation equilibrium experiments on Nd and Cd gave molecular weights of 61,118 [60,679 61,560] Da and 12,053 [11,659 12,242] Da, respectively, corresponding again to a tetramer and a monomer (Fig. 3B).
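The assignment of oligomeric state from the masses quoted above can be illustrated with a short calculation. The following is only a sketch: it divides each measured mass by the corresponding tagged monomer mass and rounds to the nearest whole number. The function and variable names are illustrative and are not part of the original analysis, which used SEDFIT and SEDPHAT.

```python
# Monomer masses in Da as quoted in the text (tags included).
MONOMER_DA = {"Nd": 16056, "Cd": 13730}

# Measured masses in Da as quoted in the text, keyed by method.
MEASURED_DA = {
    "Nd": {"c(M)": 62000, "Lamm fit": 58842, "equilibrium": 61118},
    "Cd": {"c(M)": 13000, "Lamm fit": 13092, "equilibrium": 12053},
}


def apparent_stoichiometry(measured_da, monomer_da):
    """Return the measured mass expressed in monomer units."""
    return measured_da / monomer_da


def oligomeric_states():
    """Map (domain, method) to (ratio, nearest whole number of subunits)."""
    states = {}
    for domain, by_method in MEASURED_DA.items():
        for method, mass in by_method.items():
            ratio = apparent_stoichiometry(mass, MONOMER_DA[domain])
            states[(domain, method)] = (ratio, round(ratio))
    return states


for (domain, method), (ratio, n) in oligomeric_states().items():
    print(f"{domain} {method}: {ratio:.2f} monomer units -> n = {n}")
```

In every case the Nd ratio rounds to four and the Cd ratio rounds to one, in line with the tetramer and monomer assignments.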

Table 1

The tetrameric state of Nd was also confirmed by chemical cross-linking using the multivalent cross-linker succinimidyl octaester (SOXL). Interestingly, dimers and tetramers of Nd (18 μM) were cross-linked by SOXL at 4.5-9 μM (Fig. 3C). At a higher Nd concentration (30 μM), bigger oligomers were also cross-linked, indicating a concentration-dependent oligomerisation tendency (Fig. 4B). It was also apparent that at 30 μM individual Nd monomers were cross-linked internally (Fig. 4B). By comparison, under the same conditions (4.5-9 μM) cross-linking of full-length DnaD produced only a dimeric species with no obvious signs of oligomers. However, at higher concentrations it can also form a mixture of concentration-dependent oligomers (Turner et al, 2004).

The Cd monomer binds DNA and exhibits a DNA-induced oligomerisation activity

Cd bound to small oligonucleotides in gel shift assays (not shown). By comparison, Nd did not exhibit any DNA binding activity. Although the Cd monomer is small, the observed shifts were large and comparable to those obtained with full-length DnaD. The complexes appear to be very large, as they fail to enter the gel. Similar shifts were observed with longer 54mer and shorter 19mer ss oligonucleotides, as well as with an intermediate 30mer ds oligonucleotide. Therefore the large complexes are not the result of individual Cd molecules binding side by side to the oligo substrates; instead there is an inherent oligomerisation activity that is induced by binding to DNA. Once a Cd molecule binds to DNA it acts as a 'seed' to induce the binding of more Cd molecules, forming large complexes. We examined this further using SOXL cross-linking in the presence and absence of DNA (Fig. 5). In the presence of a ss19mer oligonucleotide, large complexes of Cd molecules with variable stoichiometries were cross-linked, appearing as a smear higher up the gel (Fig. 5). In the absence of DNA no such complexes were detected, but some internal cross-linking was apparent, manifested as a smear immediately above the Cd band (Fig. 5). This behaviour is also apparent in the full-length protein (Turner et al). As the DNA-induced oligomerisation activity resides exclusively on Cd, it is distinct from the DNA-independent oligomerisation activity of Nd.

The efficiency of SOXL1 cross-linking is not simply due to span

The crystal structure of an intermediate ester of SOXL1 revealed a distribution of carbonyl groups between 4.5 and 13.5 Å. A panel of commercial bivalent linkers, ranging in span from approximately 8 to 13 Å, was tested for the ability to cross-link GST and compared with SOXL1. None of these bivalent cross-linkers was as efficient as SOXL1 (Figure 6), confirming the benefit of the geometric presentation of multiple reactive groups.

Comparison of lanes 10, 11 and 12 with lane 1 shows that cross-linking with SOXL1 is more efficient than that achieved using a four-fold quantity of DSS, DSG or SAB, which represents an equimolar amount of reactive groups.
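The four-fold figure follows from the number of reactive groups per molecule: the octaester carries eight reactive esters and a bivalent linker carries two. A minimal sketch of that normalisation is given below; the function name is illustrative only.

```python
def fold_excess_for_equal_groups(groups_multivalent, groups_bivalent):
    """Molar excess of the lower-valency linker needed to present the
    same total number of reactive groups as the multivalent linker."""
    return groups_multivalent / groups_bivalent


# Eight esters on the octaester versus two on a bivalent linker.
fold = fold_excess_for_equal_groups(8, 2)
print(f"Bivalent linker needed: {fold:g}-fold molar quantity")
```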

Resorcin[4]arene configuration may affect cross-linking

Cross-linkers with C4v symmetry (derived from alkyl aldehydes, SOXL2) were compared with C2h-symmetric analogues (derived from 4-alkoxyphenyl aldehydes, SOXL3). While each compound has the same number of reactive esters, distinct cross-linking efficiencies were apparent. The crown-shaped octavalent linker SOXL2 displays all eight esters on the same face, but is less efficient, showing lower cross-linking when compared directly with the bivalent linker DSS or with SOXL1. The possibility that this is a simple effect of solubility in the media can be discounted because the hydrophobic C2h compound, SOXL3, is just as effective as, if not more so than, its polyethylene glycol modified analogue, SOXL1. One possible explanation is that the C2h compound achieves a wide three-dimensional display, whereas the span of the C4v compound is relatively narrow and homogeneous. Alternatively, the C4v compound may self-associate due to its more ordered hydrophobic lower rim, which may reduce the amount of reagent available for cross-linking at the protein interface.

Cross-linking of phosphoinositide 3-kinase heteromers using SOXL1.

Cross-linking of phosphoinositide 3-kinase heteromers (p110β/p85α) was performed with a 20-fold molar excess of either SOXL1, SOXL2, SOXL3 or DSS. The p110/p85 heteromeric complex, either alone (No XL) or bound to nickel beads, was used as a control (the p110β subunit has a His-tag facilitating purification on a nickel resin). 60 minutes after cross-linking, proteins were denatured and separated by SDS-PAGE prior to immunoblotting with anti-p110β antibodies. Results are shown in Fig. 7. The 320 kDa species is consistent with a complex of two p110β subunits and one p85α subunit; the 282 kDa species is consistent with a complex of one p110β subunit and two p85α subunits.

Comparison of the time-course of SOXL1 and DSS-mediated cross-linking.

Cross-linking of GST was performed with 4 molar equivalents of either SOXL1 (S) or DSS (D), for 1, 5, 10 and 30 minutes before the reactions were terminated. Proteins were denatured and separated by SDS-PAGE prior to immunoblotting with anti-GST antibodies. The positions of molecular weight standards (in kDa) and GST complexes are indicated. The results (Fig. 8) indicate that SOXL1 initiates cross-linking within 1 minute and that the reaction plateaus at around 30 minutes.

Cross-linking of DnaD Nd using chemically cleavable cross-linker

A 50 mM stock of chemically cleavable linker (26) was made by dissolving 33.6 mg of linker in 168.5 μl of DMSO (100%).

Protein (DnaD, Nd) was incubated with the chemically cleavable linker 26 at different protein:linker molar ratios (1:14 and 1:28) in Tris pH 7.5, 2 mM EDTA, 1 mM DTT, 100 mM NaCl for 30 minutes at 37 °C. Prior to cross-linking, the chemically cleavable linker was diluted further to 5 mM in 10% v/v DMSO and added to the protein solution so that DMSO did not exceed 2% v/v in the reaction mixture. Molar ratios of 1:14 and 1:28 were achieved by adding 2.6 μl and 5.2 μl, respectively, of cleavable linker 26 (5 mM in 10% DMSO) to 30 μl of Nd (30.6 μM) and incubating for 30 minutes at 37 °C. The control experiment was run with 30.6 μM Nd + 2.6 μl of TE-100 buffer (50 mM Tris, 1 mM EDTA, 100 mM NaCl). Samples were then resolved by SDS-PAGE through a 12% gel (Figure 9).
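The stated volumes can be checked against the quoted molar ratios and the 2% v/v DMSO limit with a short calculation. This is a sketch only: it assumes that the protein solution contains no DMSO and that volumes are additive, and the function names are illustrative.

```python
def linker_to_protein_ratio(linker_ul, linker_mM, protein_ul, protein_uM):
    """Moles of linker per mole of protein in the mixture."""
    linker_pmol = linker_ul * linker_mM * 1000.0  # ul x mM = nmol, x1000 = pmol
    protein_pmol = protein_ul * protein_uM        # ul x uM = pmol
    return linker_pmol / protein_pmol


def final_dmso_percent(linker_ul, protein_ul, working_dmso_percent=10.0):
    """Final DMSO (% v/v), assuming DMSO-free protein and additive volumes."""
    return working_dmso_percent * linker_ul / (linker_ul + protein_ul)


for linker_ul in (2.6, 5.2):
    ratio = linker_to_protein_ratio(linker_ul, 5.0, 30.0, 30.6)
    dmso = final_dmso_percent(linker_ul, 30.0)
    print(f"{linker_ul} ul linker: 1:{ratio:.1f} protein:linker, "
          f"{dmso:.2f}% v/v DMSO")
```

Under these assumptions both additions give roughly the quoted ratios and stay below 2% v/v DMSO.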

Chemical cleavage of linker molecule and MS analysis

Bands 1-6 shown in Figure 9 were cut from the gel. Band 1 contains Nd protein alone, band 2 contains Nd monomers + linker 26, band 3 contains Nd dimers + linker 26, and bands 4 to 6 contain higher oligomers of Nd + linker 26.

Gel bands were excised, desiccated with acetonitrile, then treated with buffer containing 3 mM DTT, 6 M urea, 100 mM Tris buffer and 100 mM phosphoric acid, adjusted to pH 11.6 by addition of 10 M NaOH, for 2 hours at 37 °C. This treatment cleaves the sulfone groups in the linker arms. Cross-linked proteins will therefore be separated into monomers, but those monomers will still carry residual fragments of the cross-linking reagent (extending from the sulfone group to the reactive NHS group) covalently bound to those residues with which they reacted in the cross-linking reaction.

The chemically cleaved material was reduced with DTT and alkylated with iodoacetamide before trypsin digestion in situ. The extracted material was then digested with trypsin to yield a mixture of peptides. Peptides containing residues to which the cross-linking reagent had bound will have a greater molecular weight than expected from their amino acid sequence alone, because residual fragments of the cleaved reagent remain bound to them.

Mass spectrometry of the cleaved and digested material was performed at the Biopolymer Synthesis and Analysis Unit, University of Nottingham. Tryptic peptides were desalted by binding to and elution from a C18 Zip-tip (Millipore), and then analysed by electrospray mass spectrometry (Waters QTOF2 hybrid quadrupole mass spectrometer).

The following fragments are identified as arising from trypsin digestion of the N-terminal domain (Nd) of Bacillus subtilis DnaD protein. The sequences highlighted in bold (T12*, 1091.5 and T13*, 1897.0) represent fragments consistently observed alongside peaks at 1-200 Da higher mass (1194.6 and 2046) in the spectra of chemically cleaved material. A further peak at 2744.2 (marked (?)) is seen alongside a higher mass peak at 2898 in two out of four spectra of chemically cleaved material (data not shown).

DnaD - Average Mass = 14334.5909, Monoisotopic Mass = 14325.2830

N-terminus = H, C-terminus = OH

Modified amino acids: CAM(B) = CAM Cysteine

Digest: Trypsin: /K-\P /R-\P

Frag   Res#      Sequence                        [M+H]
T12*   97-105    (K)YSLQPLWGK(L)                 1091.59
T4*    27-40     (K)QLGLNETELILLLK(I)            1596.96
T10    79-93     (K)GFLFIEEBEDQNGIK(F)           1798.83
T13*   106-120   (K)LYEYIQLAQNQTQER(-)           1896.95
T3*    4-26      (K)QQFIDMQEQGTSTIPNLLLTHYK(Q)   2705.36
T7*    48-71     (K)GSYFPTPNQLQEGMSISVEEBTNR(L)  2744.23 (?)

These data confirm that "labelled" peptides carrying residual fragments of cross-linking reagents can be detected by mass spectrometry.
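The mass shifts implied by the peak pairs quoted above can be tabulated as follows. This is a simple illustration using only the peak values given in the text; the dictionary and function names are illustrative, and no assignment of the shifts to a particular reagent fragment is implied.

```python
# (unmodified peak, higher-mass companion peak) as quoted in the text.
PEAK_PAIRS = {
    "T12*": (1091.5, 1194.6),
    "T13*": (1897.0, 2046.0),
    "T7* (?)": (2744.2, 2898.0),
}


def mass_shift(unmodified, companion):
    """Difference between the companion peak and the unmodified peptide."""
    return companion - unmodified


shifts = {name: round(mass_shift(*pair), 1) for name, pair in PEAK_PAIRS.items()}

for name, shift in shifts.items():
    print(f"{name}: +{shift:.1f}")
```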

Cross-linking of multi-protein complexes.

Cell-free analyses.

A system has been established to investigate the ability of the cross-linking reagents described in this specification to cross-link multi-protein complexes, based on knowledge of the interactions between chosen proteins within mammalian cells. It is intended to mimic these interactions in a cell-free system.

Fusion proteins encoding growth factor receptor binder 2 (Grb2, 28 kDa), Grb2-associated binder 2 (Gab2, 90 kDa), src homology phosphatase 2 (Shp2, 68 kDa) and the 85 kDa regulatory subunit of class IA phosphoinositide 3-kinases (p85, 85 kDa) have all been expressed in XL1-Blue E. coli, purified and the fusion partner removed by thrombin cleavage. Further purification of each protein has been conducted to yield preparations of greater than 95% purity.

In addition, Gab2 has been expressed in XL1-Blue E. coli expressing an active tyrosine kinase, resulting in constitutive tyrosine phosphorylation of Gab2. It is envisaged that these purified proteins will be used in a series of in vitro experiments. In some cases, the uncleaved fusion proteins may be used instead of the cleaved protein product, especially if undesirably small amounts of pure protein are obtained following the cleavage and re-purification steps.

Previous analyses in mammalian cells have demonstrated that Grb2 binds via its SH3 domains to proline-rich motifs of Gab2. This represents a hetero-dimeric interaction.

Shp2 and p85 can bind to Gab2 when it is tyrosine phosphorylated, via SH2 domains present within Shp2 and p85. This represents a hetero-trimeric interaction.

Co-incubation of Grb2, p85, Shp2 and tyrosine phosphorylated Gab2 should yield a hetero-tetrameric multiprotein complex.

The compounds described may be used to capture heteromeric protein interactions in an incremental manner as described below.

Hetero-dimers of Grb2 and Gab2.

Purified Grb2 and non-phosphorylated Gab2 proteins will each be used at a final concentration between 0.1 μM and 3 μM and mixed together in 30% (v/v) PBS, pH 7.2, 70% (v/v) water, or used separately as controls. In additional control experiments, BSA will be added at equimolar concentrations. Stock solutions of the cross-linkers (SOXL and derivatives) will be dissolved in DMSO at identical concentrations and between 1 and 8 molar equivalents added to the protein solutions in a final reaction volume of 50 μl. Reactions will be incubated for 60 minutes at 21 °C and quenched by addition of 1 μl of 1 M Tris-HCl pH 7.5. 5x SDS sample buffer will then be added to each sample, prior to boiling to denature the proteins. Samples will then be subjected to SDS-PAGE and immunoblotting using standard techniques. Hetero-dimers of Grb2 and Gab2 will be detected by immunoblotting with either Grb2 or Gab2 specific antibodies. The anticipated molecular weight of a complex containing one molecule of each protein is 118 kDa. Mass spectrometry may be used to confirm the composition of the captured complex.
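The reaction set-up described above implies the following quantities. This is a rough sketch only: it assumes that molar equivalents are counted relative to a single protein at its final concentration and that the quench volume simply adds to the 50 μl reaction. The function names are illustrative.

```python
def quench_final_mM(stock_M, added_ul, reaction_ul):
    """Final quencher concentration (mM) after addition to the reaction."""
    return stock_M * 1000.0 * added_ul / (reaction_ul + added_ul)


def linker_pmol(protein_uM, reaction_ul, equivalents):
    """Cross-linker amount in pmol (uM x ul gives pmol of protein)."""
    return protein_uM * reaction_ul * equivalents


tris_mM = quench_final_mM(1.0, 1.0, 50.0)
lowest = linker_pmol(0.1, 50.0, 1)   # 0.1 uM protein, 1 equivalent
highest = linker_pmol(3.0, 50.0, 8)  # 3 uM protein, 8 equivalents

print(f"Final Tris after quench: {tris_mM:.1f} mM")
print(f"Cross-linker per reaction: {lowest:.0f} to {highest:.0f} pmol")
```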

Hetero-trimers of Shp2, p85 and Gab2.

Purified Shp2, p85, tyrosine phosphorylated and non-phosphorylated Gab2 proteins will each be used at a final concentration between 0.1 μM and 3 μM and incubated either individually or in combination in 30% (v/v) PBS, pH 7.2, 70% (v/v) water. In control experiments, BSA will be added at equimolar concentrations. Stock solutions of the cross-linkers (SOXL and derivatives) will be dissolved in DMSO at identical concentrations and between 1 and 8 molar equivalents added to the protein solutions in a final reaction volume of 50 μl. Reactions will be incubated for 60 minutes at 21 °C and quenched by addition of 1 μl of 1 M Tris-HCl pH 7.5. 5x SDS sample buffer will then be added to each sample, prior to boiling to denature the proteins. Samples will then be subjected to SDS-PAGE and immunoblotting using standard techniques. Hetero-trimers containing Shp2, p85 and tyrosine phosphorylated Gab2 will be detected by immunoblotting with Shp2, p85 or Gab2 specific antibodies. The anticipated molecular weight of a complex containing one molecule of each protein is 243 kDa. We may also detect hetero-dimers of Shp2-Gab2 (~158 kDa) and of p85-Gab2 (~175 kDa). The composition of these complexes will be distinguished through the use of protein-selective antibodies.

Multimeric complexes containing Grb2, Gab2, Shp2 and p85.

Purified Shp2, p85, tyrosine phosphorylated and non-phosphorylated Gab2 and Grb2 proteins will each be used at a final concentration between 0.1 μM and 3 μM and incubated either individually or in combination in 30% (v/v) PBS, pH 7.2, 70% (v/v) water. In control experiments, BSA will be added at equimolar concentrations. Stock solutions of the cross-linkers (SOXL and derivatives) will be dissolved in DMSO at identical concentrations and between 1 and 8 molar equivalents added to the protein solutions in a final reaction volume of 50 μl. Reactions will be incubated for 60 minutes at 21 °C and quenched by addition of 1 μl of 1 M Tris-HCl pH 7.5. 5x SDS sample buffer will then be added to each sample, prior to boiling to denature the proteins. Samples will then be subjected to SDS-PAGE and immunoblotting using standard techniques. A multimeric complex containing all four proteins would yield a protein complex of approximately 271 kDa, which is close to the limits of resolution by SDS-PAGE. However, we should also be able to detect hetero-trimers of varying composition and most likely also hetero-dimers of varying composition. The composition of such complexes will be distinguished through the use of protein-selective antibodies.
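The anticipated complex masses quoted in this section are simple sums of the nominal subunit masses given earlier (Grb2 28 kDa, Gab2 90 kDa, Shp2 68 kDa, p85 85 kDa). A minimal sketch of that bookkeeping, assuming one copy of each protein per complex and ignoring the mass of the cross-linker itself, is shown below; the names are illustrative.

```python
# Nominal subunit masses in kDa as quoted in the text.
MASS_KDA = {"Grb2": 28, "Gab2": 90, "Shp2": 68, "p85": 85}


def complex_mass(*members):
    """Sum of subunit masses for a complex with one copy of each member."""
    return sum(MASS_KDA[name] for name in members)


EXPECTED_COMPLEXES = [
    ("Grb2", "Gab2"),
    ("Shp2", "Gab2"),
    ("p85", "Gab2"),
    ("Shp2", "p85", "Gab2"),
    ("Grb2", "Gab2", "Shp2", "p85"),
]

for members in EXPECTED_COMPLEXES:
    print(f"{'-'.join(members)}: {complex_mass(*members)} kDa")
```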

Analyses within cells

It has been found that a fluorescent variant of SOXL can enter into BaF/3 cells, an IL-3-dependent pro-B cell line widely used for analysis of IL-3-induced signal transduction pathways. A number of derivatives of BaF/3 have been genetically engineered to express epitope-tagged versions of Gab2, Shp2 and p85.

BaF/3 cells will be expanded in IL-3-containing media. Prior to experimentation, the cells will be washed free of cytokine and incubated in serum-free and cytokine-free medium for 1 h at 37 °C, after which time SOXL or another compound of the invention will be added to the cells at concentrations between 0.5 and 20 μM and incubated for a further 1 h at 37 °C. A proportion of cells would be lysed directly and a further portion treated with 10 ng/ml rmIL-3 for 2-30 minutes to induce protein phosphorylation and subsequent protein-protein interactions. Control cells, to which no cross-linker is added, would be treated in an identical manner for comparison. After cell lysis, protein concentrations of the lysates would be determined and equivalent amounts of protein (at least 500 μg per sample) would be immunoprecipitated using standard techniques with antibodies specific to Shp2, Gab2, Grb2 or p85. After extensive washing, the immunoprecipitates would be denatured prior to separation by SDS-PAGE and immunoblotting. Immunoblotting would be performed as described above, using antibodies specific for the different proteins anticipated to participate in protein-protein interactions.

The stabilisation of protein complexes by the cross-linking reagents would be apparent if higher molecular weight complexes could be detected in samples generated in the presence of cross-linker, compared to in its absence. As an example, Shp2 forms a complex with tyrosine phosphorylated Gab2 following stimulation with IL-3. Therefore, Shp2 immunoprecipitates contain tyrosine phosphorylated Gab2. However, since this is a non-covalent interaction, the complex dissociates following boiling in SDS-PAGE sample buffer, such that each protein is detected as a discrete entity on immunoblots, Shp2 at 68 kDa and Gab2 at approximately 100 kDa. If cross-linking were achieved, however, it would be predicted that Shp2 and Gab2 reactive complexes could be detected in the 170 kDa range.

The use of cross-linkers containing biotin moieties would provide a convenient means to carry out affinity purification of cross-linked complexes prior to analysis by either immunoblotting or mass spectrometry, and would be incorporated into the analyses performed.

While the invention has been described in conjunction with the exemplary embodiments described above, many equivalent modifications and variations will be apparent to those skilled in the art when given this disclosure. Accordingly, the exemplary embodiments of the invention set forth are considered to be illustrative and not limiting. Various changes to the described embodiments may be made without departing from the spirit and scope of the invention. All documents cited herein are expressly incorporated by reference.

References and Footnotes

[1] J. A. Papin, T. Hunter, B. O. Palsson, S. Subramaniam, Nat. Rev. Mol. Cell. Biol. 2005, 6, 99-111.

[2] T. Pawson, P. Nash, Science 2003, 300, 445-452.

[3] R. B. Russell, F. Alber, P. Aloy, F. P. Davis, D. Korkin, M. Pichaud, M. Topf, A. Sali, Curr. Opin. Struct. Biol. 2004, 14, 313-324.

[4] S. Fletcher, A. D. Hamilton, Curr. Opin. Chem. Biol. 2005, 9, 632-638.

[5] T. Ito, T. Chiba, R. Ozawa, M. Yoshida, M. Hattori, Y. Sakaki, Proc. Natl. Acad. Sci. U.S.A. 2001, 98, 4569-4574.

[6] A. C. Gavin, M. Bosche, R. Krause, P. Grandi, M. Marzioch, A. Bauer, J. Schultz, J. M. Rick, A. M. Michon, C. M. Cruciat, M. Remor, C. Hofert, M. Schelder, M. Brajenovic, H. Ruffner, A. Merino, K. Klein, M. Hudak, D. Dickson, T. Rudi, V. Gnau, A. Bauch, S. Bastuck, B. Huhse, C. Leutwein, M. A. Heurtier, R. R. Copley, A. Edelmann, E. Querfurth, V. Rybin, G. Drewes, M. Raida, T. Bouwmeester, P. Bork, B. Seraphin, B. Kuster, G. Neubauer, G. Superti-Furga, Nature 2002, 415, 141-147.

[7] P. Uetz, L. Giot, G. Cagney, T. A. Mansfield, R. S. Judson, J. R. Knight, D. Lockshon, V. Narayan, M. Srinivasan, P. Pochart, A. Qureshi-Emili, Y. Li, B. Godwin, D. Conover, T. Kalbfleisch, G. Vijayadamodar, M. J. Yang, M. Johnston, S. Fields, J. M. Rothberg, Nature 2000, 403, 623-627.

[8] S. Maslov, K. Sneppen, Science 2002, 296, 910-913.

[9] C. von Mering, R. Krause, B. Snel, M. Cornell, S. G. Oliver, S. Fields, P. Bork, Nature 2002, 417, 399-403.

[10] G. T. Hermanson, Bioconjugate Techniques, Academic Press, London 1996, pp.169-297.

[11] J. W. Back, L. De Jong, A. O. Muijsers, C. G. De Koster, J. Mol. Biol. 2003, 331, 303-313.

[12] a) M. M. Young, N. Tang, J. C. Hempel, C. M. Oshiro, E. W. Taylor, I. D. Kuntz, B. W. Gibson, G. Dollinger, Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 5802-5806; b) F. X. Chu, S. O. Shan, D. T. Moustakas, F. Alber, P. F. Egea, R. M. Stroud, P. Walter, A. L. Burlingame, Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 16454-16459.

[13] M. Trester-Zedlitz, K. Kamada, S. K. Burley, D. Fenyoe, B. T. Chait, T. W. Muir, J. Am. Chem. Soc. 2003, 125, 2416-2425.

[14] A. J. Wright, S. E. Matthews, W. B. Fischer, P. D. Beer, Chem. Eur. J. 2001, 7, 3474-3481.

[15] Colourless tablet, 1.12 x 1.12 x 0.58 mm, triclinic, P1, a = 11.7830(8), b = 15.3349(11), c = 31.054(2) Å, α = 91.283(2), β = 91.879(2), γ = 102.979(2)°, V = 5462.3(6) Å³, Dcalcd = 1.295 g/cm³, 2θmax = 55°, Mo Kα, λ = 0.71073 Å, ω scans, T = 150(2) K, 50321 reflections measured, all 24449 unique used in the refinement, no absorption or extinction corrections applied, structure solution by direct and difference Fourier methods using SHELXS97, structure refinement used SHELXL97, 1374 parameters, H atoms geometrically placed and refined using a riding model, R = 0.0758, wR = 0.235, full-matrix least-squares on F², final residual electron density 1.19 and -0.75 e Å⁻³. CCDC 297190 contains the supplementary crystallographic data for this paper. These data can be obtained free of charge from the Cambridge Crystallographic Data Centre via www.ccdc.cam.ac.uk/data_request/cif.

[16] J. D. Hayes, J. U. Flanagan, I. R. Jowsey, Ann. Rev. Pharmacol. Toxicol. 2005, 45, 51-88.

[17] D. A. Fancy, K. Melcher, S. A. Johnston, T. Kodadek, Chem. Biol. 1996, 3, 551-559.

[18] D. B. Smith, M. R. Rubira, R. J. Simpson, K. M. Davern, W. U. Tiu, P. G. Board, G. F. Mitchell, Mol. Biochem. Parasitol. 1988, 27, 249-256.

[19] C. Lottner, K.-C. Bart, G. Bernhart, H. Brunner, J. Med. Chem. 2002, 45, 2079-2089.

[20] H. Bone, M. Welham, Cell. Signal. 2000, 12, 183-189.

[21] M. J. Welham, U. Dechert, K. B. Leslie, F. Jirik, J. W. Schrader, J. Biol. Chem. 1994, 269, 23764-23768.

[22] M. J. Welham, J. W. Schrader, J. Immunol. 1992, 149, 2772-2783.

[23] Schuck, P. (2000) Size-distribution analysis of macromolecules by sedimentation velocity ultracentrifugation and Lamm equation modeling. Biophys J 78: 1606-1619.

[24] Sonenshein, A. L., Hoch, J.A. and Losick, R. (2002) Bacillus subtilis and its closest relatives; from genes to cells. 73-86, ASM press.

[25] Turner, I.J., Scott, D.J., Allen, S., Roberts, C.J. and Soultanas, P. (2004) The Bacillus subtilis DnaD protein: a putative link between DNA remodelling and initiation of DNA replication. FEBS Lett 577: 460-464.

[26] Zhang, W., Carneiro, M.J.V.M., Turner, I.J., Allen, S., Roberts, C.J. and Soultanas, P. (2005) The Bacillus subtilis DnaD and DnaB proteins exhibit different DNA remodelling activities. J Mol Biol 351: 66-75.