

Title:
NUCLEIC ACID ENCODED CHEMICAL LIBRARIES
Document Type and Number:
WIPO Patent Application WO/2020/128064
Kind Code:
A1
Abstract:
This invention relates to the production of nucleic acid encoded chemical libraries using a population of first conjugates comprising a first nucleic strand coupled to a first reactive group and a first set of one or more chemical moieties at a first end and a population of second conjugates comprising a second nucleic strand coupled to a second reactive group and a second set of one or more chemical moieties. The first and second nucleic acid strands are hybridised together to produce a population of double stranded molecules having the first and second sets of chemical moieties at an end thereof, and the first and second reactive groups are then reacted to covalently link the first and second sets of chemical moieties and produce cyclised pharmacophores coupled to the double stranded molecules. The population of double stranded molecules form a chemical library. Nucleic acid encoded chemical libraries and their production and use are provided.

Inventors:
SAMAIN FLORENT (CH)
GORRE EMILE (CH)
MILLUL JACOPO (CH)
DONCKELE ETIENNE (CH)
GIRONDA MARTINEZ ADRIAN (CH)
Application Number:
PCT/EP2019/086831
Publication Date:
June 25, 2020
Filing Date:
December 20, 2019
Assignee:
PHILOCHEM AG (CH)
International Classes:
C12N15/10
Domestic Patent References:
WO2018166532A1 2018-09-20
WO2006135786A2 2006-12-21
WO2016080838A1 2016-05-26
WO2015091207A1 2015-06-25
WO2015135856A1 2015-09-17
WO2009077173A2 2009-06-25
WO2003076943A1 2003-09-18
WO2007124758A1 2007-11-08
WO2006048025A1 2006-05-11
Other References:
HONGFENG DENG ET AL: "Discovery of Highly Potent and Selective Small Molecule ADAMTS-5 Inhibitors That Inhibit Human Cartilage Degradation via Encoded Library Technology (ELT)", JOURNAL OF MEDICINAL CHEMISTRY, vol. 55, no. 16, 23 August 2012 (2012-08-23), US, pages 7061 - 7079, XP055677832, ISSN: 0022-2623, DOI: 10.1021/jm300449x
MATTHEW A CLARK ET AL: "Design, synthesis and selection of DNA-encoded small-molecule libraries (+ Errata)", vol. 5, no. 9, 1 September 2009 (2009-09-01), pages 647 - 654, 772, XP002677464, ISSN: 1552-4450, Retrieved from the Internet [retrieved on 20090917], DOI: 10.1038/NCHEMBIO.211
DARIO NERI ET AL: "DNA-Encoded Chemical Libraries: A Selection System Based on Endowing Organic Compounds with Amplifiable Information", ANNUAL REVIEW OF BIOCHEMISTRY, vol. 87, no. 1, 20 June 2018 (2018-06-20), US, pages 479 - 502, XP055677223, ISSN: 0066-4154, DOI: 10.1146/annurev-biochem-062917-012550
ROBERT A. GOODNOW ET AL: "DNA-encoded chemistry: enabling the deeper sampling of chemical space", NATURE REVIEWS. DRUG DISCOVERY, vol. 16, no. 2, 9 December 2016 (2016-12-09), GB, pages 131 - 147, XP055672163, ISSN: 1474-1776, DOI: 10.1038/nrd.2016.213
ALEXANDER LITOVCHICK ET AL: "Encoded Library Synthesis Using Chemical Ligation and the Discovery of sEH Inhibitors from a 334-Million Member Library", SCIENTIFIC REPORTS, vol. 5, no. 1, 10 June 2015 (2015-06-10), XP055677526, DOI: 10.1038/srep10916
LUCA MANNOCCI ET AL: "20 years of DNA-encoded chemical libraries", CHEMICAL COMMUNICATIONS, ROYAL SOCIETY OF CHEMISTRY, UK, vol. 47, no. 48, 28 December 2011 (2011-12-28), pages 12747 - 12753, XP002695069, ISSN: 1359-7345, [retrieved on 20111114], DOI: 10.1039/C1CC15634A
SCHEUERMANN J ET AL: "DNA-encoded chemical libraries for the discovery of MMP-3 inhibitors", BIOCONJUGATE CHEMISTRY, AMERICAN CHEMICAL SOCIETY, US, vol. 19, no. 3, 1 March 2008 (2008-03-01), pages 778 - 785, XP002737640, ISSN: 1043-1802, [retrieved on 20080207], DOI: 10.1021/BC7004347
MELKKO S ET AL: "Encoded self-assembling chemical libraries", NATURE BIOTECHNOLOGY, GALE GROUP INC., NEW YORK, US, vol. 22, no. 5, 1 May 2004 (2004-05-01), pages 568 - 574, XP002541364, ISSN: 1087-0156, [retrieved on 20040418], DOI: 10.1038/NBT961
SCHEUERMANN J ET AL: "DNA-encoded chemical libraries", JOURNAL OF BIOTECHNOLOGY, ELSEVIER, AMSTERDAM, NL, vol. 126, no. 4, 1 December 2006 (2006-12-01), pages 568 - 581, XP024956589, ISSN: 0168-1656, [retrieved on 20061201], DOI: 10.1016/J.JBIOTEC.2006.05.018
MARGIT HAAHR HANSEN ET AL: "A yoctoliter-scale DNA reactor for small-molecule evolution", JOURNAL OF THE AMERICAN CHEMICAL SOCIETY, AMERICAN CHEMICAL SOCIETY, US, vol. 131, no. 3, 28 January 2009 (2009-01-28), pages 1322 - 1327, XP002627212, ISSN: 0002-7863, [retrieved on 20090105], DOI: 10.1021/JA808558A
JIANG LIU ET AL: "Nucleoside macrocycles formed by intramolecular click reaction: efficient cyclization of pyrimidine nucleosides decorated with 5'-azido residues and 5-octadiynyl side chains", BEILSTEIN JOURNAL OF ORGANIC CHEMISTRY, vol. 14, 22 August 2018 (2018-08-22), pages 2404 - 2410, XP055677988, DOI: 10.3762/bjoc.14.217
GUIXIAN ZHAO ET AL: "Future challenges with DNA-encoded chemical libraries in the drug discovery domain", EXPERT OPINION ON DRUG DISCOVERY, vol. 14, no. 8, 3 August 2019 (2019-08-03), London, GB, pages 735 - 753, XP055677227, ISSN: 1746-0441, DOI: 10.1080/17460441.2019.1614559
L. MANNOCCI, M. LEIMBACHER, M. WICHERT, J. SCHEUERMANN, D. NERI, CHEM. COMMUN., vol. 47, 2011, pages 12747 - 12753
R. A. JR GOODNOW, C. E. DUMELIN, A. D. KEEFE, NAT. REV. DRUG DISCOV., vol. 16, 2017, pages 131 - 147
R. M. FRANZINI, C. RANDOLPH, J. MED. CHEM., vol. 59, 2016, pages 6629 - 6644
R. A. LERNER, S. BRENNER, PROC. NATL. ACAD. SCI. USA, vol. 89, 1992, pages 5381 - 5383
S. MELKKO ET AL., NAT. BIOTECHNOL., vol. 22, 2004, pages 568 - 574
Z. J. GARTNER ET AL., SCIENCE, vol. 305, 2004, pages 1601 - 1605
D. NERI, R. A. LERNER, ANNU. REV. BIOCHEM., vol. 87, 2018, pages 1 - 5
R. M. FRANZINI, D. NERI, J. SCHEUERMANN, ACC. CHEM. RES., vol. 47, 2014, pages 1247 - 1255
M. A. CLARK ET AL., NAT. CHEM. BIOL., vol. 5, 2009, pages 647 - 654
Y. LI, R. DE LUCA, S. CAZZAMALLI, F. PRETTO, D. BAJIC, J. SCHEUERMANN, D. NERI, NATURE CHEMISTRY, vol. 10, 2018, pages 441 - 448
MAIANTI, J. P ET AL., NATURE, vol. 511, 2014, pages 94 - 98
W. DECURTINS ET AL., NAT. PROTOC., vol. 11, 2016, pages 764 - 780
M. WICHERT ET AL., NAT. CHEM., vol. 7, 2015, pages 241 - 249
MANNOCCI ET AL., PNAS, vol. 105, 2008, pages 17670 - 17675
MANNOCCI ET AL., BIOCONJ. CHEM., vol. 21, 2010, pages 1836 - 1841
FRANZINI ET AL., BIOCONJ. CHEM., vol. 25, 2014, pages 1453 - 1461
FRANZINI ET AL., ANGEW. CHEM. INT. ED., vol. 54, no. 1, 2015, pages 3927 - 3931
FRANZINI ET AL., CHEM. COMMUN., vol. 51, 2015, pages 8014 - 8016
LI ET AL., NATURE CHEM., vol. 10, 2018, pages 441 - 448
BIGATTI ET AL., CHEMMEDCHEM., vol. 12, 2017, pages 1748 - 1752
ZIMMERMANN ET AL., CHEMISTRY, vol. 23, 2017, pages 8152 - 8155
LIU ET AL., BEILSTEIN J. ORG. CHEM., vol. 14, 2018, pages 2404 - 2410
CHALKER ET AL., CHEM. ASIAN J., vol. 4, 2009, pages 630 - 640
BULLER ET AL., BIOORGANIC & MEDICINAL CHEMISTRY LETTERS, vol. 18, 2008, pages 5926 - 5931
Y. LI ET AL., ACS COMB. SCI., vol. 18, 2016, pages 438 - 444
SATZ ET AL., BIOCONJUGATE CHEMISTRY, vol. 26, 2015, pages 1623 - 1632
LI J. Y. ET AL., BIOCONJUGATE CHEM., vol. 29, no. 11, 2018, pages 3841 - 3846
BIRON ET AL.: "Chap.", vol. 9, 2017, WILEY, pages: 205 - 241
KOLMEL ET AL., CHEMMEDCHEM., vol. 20, 2018, pages 2159 - 2165
FRANZINI ET AL., BIOCONJUGATE CHEMISTRY, vol. 25, 2014, pages 1453 - 1461
DECURTINS W, WICHERT M, FRANZINI RM, BULLER F, STRAVS MA, ZHANG Y, NERI D, SCHEUERMANN J, NAT PROTOC., vol. 11, no. 4, 2016, pages 764 - 780
Attorney, Agent or Firm:
MEWBURN ELLIS LLP (GB)
Claims:

1. A method of producing a nucleic acid encoded chemical library comprising;

providing a population of first conjugates comprising a first nucleic strand coupled to a first reactive group and a first set of one or more chemical moieties at a first end,

providing a population of second conjugates comprising a second nucleic strand coupled to a second reactive group and a second set of one or more chemical moieties,

hybridising the first and second nucleic acid strands together to produce a population of double stranded molecules having the first and second sets of chemical moieties at an end thereof, and

reacting the first and second reactive groups to covalently link the first and second sets of chemical moieties and produce cyclised pharmacophores coupled to the double stranded molecules,

said population of double stranded molecules forming a chemical library.

2. A method of producing a member of a nucleic strand encoded chemical library comprising;

providing a first conjugate comprising a first nucleic strand coupled to a first reactive group and a first set of one or more chemical moieties,

providing a second conjugate comprising a second nucleic strand coupled to a second reactive group and a second set of one or more chemical moieties,

hybridising the first and second nucleic acid strands together to produce a double stranded molecule having the first and second sets of chemical moieties at an end thereof, and

reacting the first and second reactive groups to covalently link the first and second sets of chemical moieties,

thereby producing a cyclised pharmacophore coupled to the end of the double stranded molecule.

3. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of an alkynyl and an azido group and the second reactive group comprises the other of the alkynyl and the azido group.

4. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of an alkenyl group and a thiol group and the second reactive group comprises the other of the alkenyl group and thiol group.

5. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of a sulfhydryl (thiol) group and a maleimide group and the second reactive group comprises the other of the sulfhydryl (thiol) group and the maleimide group.

6. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of a dienyl or imine group and an alkenyl group and the second reactive group comprises the other of the dienyl or imine group and the alkenyl group.

7. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of an amino group and a carbonyl group or activated version thereof and the second reactive group comprises the other of the amino group and the carbonyl group or activated version thereof.

8. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of a hydroxyl group or a halide group and the second reactive group comprises the other of the hydroxyl group or a halide group.

9. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of an amino group or isothiocyanates/isocyanates and the second reactive group comprises the other of the isothiocyanates and isocyanates or an amino group.

10. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of a boronyl group or a halide group and the second reactive group comprises the other of the boronyl or halide group.

11. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of a hydroxyl group or a halide group and the second reactive group comprises the other of the hydroxyl group or a halide group.

12. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of a diazoalkane group or a vinyl group and the second reactive group comprises the other of the diazoalkane group and vinyl group.

13. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of a nitrone group and an alkenyl or alkynyl group and the second reactive group comprises the other of the nitrone group and the alkenyl or alkynyl group.

14. A method according to claim 1 or claim 2 wherein the first and second reactive groups comprise thiol groups.

15. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of an alkenyl group and a disulfide group and the second reactive group comprises the other of the alkenyl group and disulfide group.

16. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of an amino group and an isothiocyanate or isocyanate group and the second reactive group comprises the other of the amino group and isothiocyanate or isocyanate group.

17. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of a carboxylic group or a vinyl group and the second reactive group comprises the other of the carboxylic group or a vinyl group.

18. A method according to claim 1 or claim 2 wherein the first and second reactive groups comprise alkenyl groups.

19. A method according to claim 1 or claim 2 wherein the first and second reactive groups comprise alkynyl groups.

20. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of a sulfonyl halide group and an amino group and the second reactive group comprises the other of the sulfonyl halide group and amino group.

21. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of an alkyl halide group and an amino group and the second reactive group comprises the other of the alkyl halide group and amino group.

22. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of a carbonyldiimidazole group and an amino group and the second reactive group comprises the other of the carbonyldiimidazole group and amino group.

23. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of an acyl halide group and an amino group and the second reactive group comprises the other of the acyl halide group and amino group.

24. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of an acrylamide group and an amino group and the second reactive group comprises the other of the acrylamide group and amino group.

25. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of an α,β-unsaturated carbonyl group and an alkenyl group and the second reactive group comprises the other of the α,β-unsaturated carbonyl group and alkenyl group.

26. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of an alkynyl group and an alkenyl group and the second reactive group comprises the other of the alkynyl group and alkenyl group.

27. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of a halide group and an alkenyl group and the second reactive group comprises the other of the halide group and alkenyl group.

28. A method according to claim 1 or claim 2 wherein the first reactive group comprises one of a halide group and an alkynyl group and the second reactive group comprises the other of the halide group and alkynyl group.

29. A method according to any one of the preceding claims wherein the first and second sets of chemical moieties comprise the first and second reactive groups, respectively.

30. A method according to any one of claims 1 to 29 wherein the first set of chemical moieties comprises the first reactive group and the second nucleic acid strand comprises the second reactive group.

31. A method according to any one of claims 1 to 29 wherein the second set of chemical moieties comprises the second reactive group and the first nucleic acid strand comprises the first reactive group.

32. A method according to any one of the preceding claims wherein the first set of chemical moieties is attached to the 5’ end of the first nucleic acid strand and the second set of chemical moieties is attached to the 3’ end of the second nucleic acid strand.

33. A method according to any one of claims 1 to 31 wherein the first set of chemical moieties is attached to the 3’ end of the first nucleic acid strand and the second set of chemical moieties is attached to the 5’ end of the second nucleic acid strand.

34. A method according to any one of the preceding claims wherein the nucleic acid strands are DNA, RNA or LNA strands.

35. A method according to any one of the preceding claims wherein the first nucleic strand comprises coding sequences that encode the chemical moieties in the first set.

36. A method according to any one of the preceding claims wherein the second nucleic strand comprises coding sequences that encode the chemical moieties in the second set.

37. A method according to claim 35 or claim 36 wherein each said coding sequence encodes one or more chemical moieties.

38. A method according to any one of the preceding claims wherein the first nucleic strand comprises coding sequences that code for the chemical moieties in the first and second sets.

39. A method according to claim 38 wherein the second nucleic acid strand comprises a coding sequence that encodes two chemical moieties in the first and/or the second set of chemical moieties.

40. A method according to any one of claims 35 to 39 wherein the second nucleic acid strand comprises one or more non-hybridisable spacer regions at positions that correspond, when the first and second nucleic acid strands are hybridised together, to the positions of the coding sequences in the first nucleic acid strand.

41. A method according to claim 40 wherein the second nucleic acid strand further comprises a coding sequence that encodes one or more chemical moieties in the second set.

42. A method according to claim 40 or 41 wherein the non-hybridisable spacer regions comprise an abasic linker.

43. A method according to claim 42 wherein the non-hybridisable spacer regions comprise an abasic deoxyribose phosphate linker.

44. A method according to any one of claims 35 to 43 wherein hybridisation of the first and second nucleic acid strands leaves a single-stranded overhanging region of the second nucleic acid and the method further comprises extending the first nucleic acid strand along the second nucleic acid strand, such that the first nucleic acid strand incorporates the complement coding sequences of the second nucleic acid strand.

45. A method according to claim 44 wherein the overhanging region of the second strand comprises the coding sequences for the chemical moieties in the second set and said extension incorporates the coding sequences into the first nucleic acid strand.

46. A method according to any one of claims 35 to 43 wherein hybridisation of the first and second nucleic acid strands leaves a single-stranded overhanging region of the second nucleic acid and the method further comprises ligating an oligonucleotide to the first nucleic acid strand, wherein said oligonucleotide comprises the complement coding sequences of the second nucleic acid strand.

47. A method according to claim 46 wherein the overhanging region of the second strand comprises the coding sequences for the chemical moieties in the second set and said ligation incorporates the coding sequences into the first nucleic acid strand.

48. A method according to any one of the preceding claims wherein the first set of chemical moieties comprises one chemical moiety and the second set of chemical moieties comprises one chemical moiety.

49. A method according to claim 48 wherein the first nucleic acid strand comprises a first coding sequence that codes for the chemical moiety in the first set.

50. A method according to claim 48 or 49 wherein the second nucleic acid strand comprises a second coding sequence that codes for the chemical moiety in the second set.

51. A method according to claim 49 or claim 50 wherein the second nucleic acid strand further comprises a non-hybridisable spacer at a position that corresponds, when the first and second nucleic acid strands are hybridised together, to the position of the first coding sequence in the first nucleic acid strand.

52. A method according to claim 48 wherein the first nucleic acid strand comprises a first coding sequence that codes for the chemical moiety in the first set and a second coding sequence that codes for the chemical moiety in the second set.

53. A method according to claim 52 wherein the second nucleic acid strand comprises a non-hybridisable spacer at a position that corresponds, when the first and second nucleic acid strands are hybridised together, to the position of the first coding sequence in the first nucleic acid strand.

54. A method according to claim 53 wherein the second nucleic acid strand further comprises a second coding sequence that codes for the chemical moiety in the second set.

55. A method according to claim 52 wherein the second nucleic acid strand comprises non-hybridisable spacers at positions that correspond, when the first and second nucleic acid strands are hybridised together, to the positions of the first coding sequence in the first nucleic acid strand and the second coding sequence that codes for the chemical moiety in the second set.

56. A method according to any one of claims 1 to 47 wherein the first set of chemical moieties comprises two chemical moieties and the second set of chemical moieties comprises one chemical moiety.

57. A method according to claim 56 wherein the first set of chemical moieties comprises a first chemical moiety attached to the first nucleic strand and a second chemical moiety attached to the first chemical moiety, and wherein the first reactive group is attached to the second chemical moiety.

58. A method according to claim 56 wherein the first set of chemical moieties comprises a first chemical moiety attached to the first nucleic strand and a second chemical moiety attached to the first chemical moiety, and wherein the first reactive group is attached to the first chemical moiety.

59. A method according to any one of claims 56 to 58 wherein the first nucleic acid strand comprises first and second coding sequences that code for the chemical moieties in the first set and a third coding sequence that codes for the chemical moiety in the second set.

60. A method according to any one of claims 56 to 59 wherein the second nucleic acid strand comprises two non-hybridisable spacer regions at positions that correspond, when the first and second nucleic acid strands are hybridised together, to the positions of the first and second coding sequences in the first nucleic acid strand.

61. A method according to claim 60 wherein the second nucleic acid strand further comprises a third coding sequence that codes for the chemical moiety in the second set.

62. A method according to any one of claims 1 to 47 wherein the first set of chemical moieties comprises two chemical moieties and the second set of chemical moieties comprises two chemical moieties.

63. A method according to claim 62 wherein the first nucleic acid strand comprises first and second coding sequences that code for the chemical moieties in the first set and third and fourth coding sequences that code for the chemical moieties in the second set.

64. A method according to claim 63 wherein the second nucleic acid strand comprises two non-hybridisable spacer regions at positions that correspond, when the first and second nucleic acid strands are hybridised together, to the positions of the first and second coding sequences in the first nucleic acid strand.

65. A method according to claim 64 wherein the second nucleic acid strand further comprises third and fourth coding sequences that code for the chemical moieties in the second set.

66. A method according to any one of claims 1 to 47 wherein the first set of chemical moieties comprises three chemical moieties and the second set of chemical moieties comprises one chemical moiety.

67. A method according to claim 66 wherein the first nucleic acid strand comprises first, second and third coding sequences that code for the chemical moieties in the first set and a fourth coding sequence that codes for the chemical moiety in the second set.

68. A method according to claim 67 wherein the second nucleic acid strand comprises three non-hybridisable spacer regions at positions that correspond, when the first and second nucleic acid strands are hybridised together, to the positions of the first, second and third coding sequences in the first nucleic acid strand.

69. A method according to claim 68 wherein the second nucleic acid strand further comprises a fourth coding sequence that codes for the chemical moieties in the second set.

70. A method according to any one of the preceding claims comprising isolating and/or purifying the double stranded molecule or molecules following the covalent linkage of the first and second reactive groups.

71. A nucleic acid encoded chemical library produced by the method of any one of claims 1 to 70.

72. A method of screening a nucleic acid encoded chemical library comprising;

producing a nucleic acid encoded chemical library by the method of any one of claims 1 to 70, contacting the library with a target molecule and selecting one or more library members which bind to the target.

73. A method according to claim 72 comprising isolating the nucleic acid strands of the library members.

74. A method according to claim 72 or 73 comprising amplifying the nucleic acid strands of the selected library members.

75. A method according to any one of claims 72 to 74 comprising determining the sequence of the nucleic acid strands of the selected library members to identify the chemical moieties of the selected library members.

Description:
Nucleic Acid Encoded Chemical Libraries

Field

This invention relates to nucleic acid encoded chemical libraries, in particular cyclised nucleic acid-encoded self-assembling chemical libraries (ESACs), and methods of production and screening thereof.

Background

DNA-Encoded Chemical Libraries (DECLs) are collections of small organic chemical moieties covalently linked to identifier oligonucleotides encoding the identity of chemical moieties [L. Mannocci, M. Leimbacher, M. Wichert, J. Scheuermann, D. Neri Chem. Commun. 47, 12747-12753 (2011)]. Serving as a chemical solution for exponentially generating molecular diversity, DECLs are increasingly being employed for the isolation of small-molecule binders against target proteins of interest [R. A. Jr Goodnow, C. E. Dumelin, & A. D. Keefe Nat. Rev. Drug Discov. 16, 131-147 (2017)].

DECLs represent an innovative way to discover new ligands toward target proteins of therapeutic interest. The members of nucleic acid encoded chemical libraries display pharmacophores made up of one or more chemical moieties (also called “building blocks”). These chemical libraries can be used to identify pharmacophores which are candidate binding agents or have improved characteristics, for example improved binding [R. M. Franzini, C. Randolph J. Med. Chem., 59, 6629-6644 (2016)].

The basic concept of DECLs, i.e. the use of DNA as a tag for the unique identification of library members through sequencing, was proposed by Lerner and Brenner in 1992 [R. A. Lerner & S. Brenner Proc. Natl. Acad. Sci. USA, 89, 5381-5383 (1992)]. The first examples of functional libraries were published in 2004 by the Neri group at ETH [S. Melkko et al Nat. Biotechnol., 22, 568-574 (2004)] and by the Liu group at Harvard University [Z. J. Gartner et al Science, 305, 1601-1605 (2004)].

Various methods for generating DNA-encoded chemical libraries have also been disclosed in WO2003/076943, WO2009/077173 and WO2015/091207. Methods for screening DNA-encoded chemical libraries have been described in WO2015/135856.

Since DNA can be efficiently amplified by PCR and read by high-throughput DNA sequencing methods, the encoding of combinatorial libraries with DNA barcodes allows both the easy identification of specific ligands to protein targets immobilized on a solid support and the convenient handling of the libraries as mixtures of compounds. These libraries are typically synthesized by a stepwise assembly procedure, using a variety of chemical building blocks (BBs) which can be reacted to create the final structures of the library members [D. Neri, R. A. Lerner Annu. Rev. Biochem., 87, 1-5 (2018)].
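By way of illustration only, the decoding of sequencing reads into building-block counts may be sketched as follows. The codebook, the fixed barcode position and the enrichment ratio used here are assumptions made for this example and are not taken from the cited references; real barcodes are longer and real read processing involves error handling that is omitted.

```python
from collections import Counter

# Hypothetical codebook: barcode sequence -> building-block name (illustrative only).
CODEBOOK = {"ACGT": "BB1", "TGCA": "BB2", "GGCC": "BB3"}

def count_building_blocks(reads, codebook, start=0):
    """Count how often each known barcode occurs at a fixed position in the reads."""
    length = len(next(iter(codebook)))
    counts = Counter()
    for read in reads:
        code = read[start:start + length]
        if code in codebook:  # reads carrying an unknown barcode are ignored
            counts[codebook[code]] += 1
    return counts

def enrichment(selected, naive):
    """Ratio of read fractions after and before selection, per building block."""
    sel_total = sum(selected.values())
    naive_total = sum(naive.values())
    if sel_total == 0 or naive_total == 0:
        return {}
    return {
        bb: (selected.get(bb, 0) / sel_total) / (n / naive_total)
        for bb, n in naive.items()
        if n > 0
    }

# Toy reads: the first four bases are the barcode, the rest is filler.
naive_reads = ["ACGTAAAA", "TGCAAAAA", "GGCCAAAA", "ACGTTTTT"]
selected_reads = ["TGCAAAAA", "TGCATTTT", "TGCACCCC", "ACGTAAAA"]

naive_counts = count_building_blocks(naive_reads, CODEBOOK)
selected_counts = count_building_blocks(selected_reads, CODEBOOK)
ratios = enrichment(selected_counts, naive_counts)
```

In this toy data set the building block labelled BB2 is over-represented after selection, which is the kind of signal that would flag a candidate ligand.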

Two main types of DECLs have been constructed and have found application in drug discovery: single-pharmacophore libraries and dual-pharmacophore libraries. The first category displays an individual small molecule (regardless of its complexity) at the individual extremities of the double stranded DNA-tag fragment, which serves as an amplifiable identification barcode, whereas in the second class two sets of small organic compounds are attached to the adjacent complementary DNA strands [R. M. Franzini, D. Neri, J. Scheuermann Acc. Chem. Res., 47, 1247-1255 (2014)]. Most of the DECLs reported so far by both academia and industry were constructed by split-and-pool synthetic procedures [M. A. Clark et al. Nat. Chem. Biol. 5, 647-654 (2009)] aiming at drug-like molecules complying with Lipinski’s rule of five (RO5) [R. M. Franzini & C. Randolph J. Med. Chem. 59, 6629-6644 (2016)].

For instance, each member of the first set of amino acid reactive moieties is individually conjugated to a short oligonucleotide and HPLC purified. All the conjugates are then pooled together in equimolar amounts and split into a number of aliquots equal to the number of amino acid reactive moieties in the second set. From this point on, it is not possible to isolate the new conjugates individually. In order to ensure the quality and performance of the final library, the coupling reactions of each of the amino acid reactive moieties from the second set must be high yielding and clean (i.e. without formation of side products). Therefore, it is not advisable to add amino acid reactive moieties with a peculiar reactivity to single-pharmacophore libraries.
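The split-and-pool procedure described above can be sketched, purely for illustration, as an enumeration of building-block combinations. The building-block names and the yield figures are hypothetical, and treating the fraction of correct product as the simple product of the post-pooling coupling yields is a simplifying assumption of this sketch.

```python
from itertools import product
from math import prod

def split_and_pool(building_block_sets):
    """Enumerate every combination produced by successive split-and-pool cycles."""
    return list(product(*building_block_sets))

def fraction_correct(pooled_step_yields):
    """Fraction of correct product when steps after pooling cannot be purified.

    Assumes each unpurified coupling step contributes its yield multiplicatively.
    """
    return prod(pooled_step_yields)

first_set = ["A1", "A2", "A3"]   # individually conjugated and HPLC purified
second_set = ["B1", "B2"]        # coupled after pooling and splitting

library = split_and_pool([first_set, second_set])
library_size = len(library)      # 3 x 2 = 6 members

# A single post-pooling coupling with a hypothetical 70% yield.
purity = fraction_correct([0.7])
```

The library size grows as the product of the set sizes, while every coupling performed after pooling lowers the fraction of correct product, which is why clean, high yielding reactions are needed at those steps.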

Several examples of the discovery of new targeted ligands from single-pharmacophore DECLs displaying a macrocyclic scaffold have been reported in the literature. With the aim of achieving versatile and specific recognition of different target proteins, Li et al. have investigated a fixed macrocyclic scaffold with antiparallel β-sheets and further multiple diversity elements, which facilitates the investigation of specific binders against various target proteins: the resulting binders exhibited antibody-like properties, enabling biochemical and biological applications [Y. Li, R. De Luca, S. Cazzamalli, F. Pretto, D. Bajic, J. Scheuermann & D. Neri Nature Chemistry 10, 441-448 (2018)]. However, the intrinsically larger size and complexity of macrocyclic binders may make them difficult to optimize, because modification of the cyclic backbone may lead to unexpected conformational changes [Maianti, J. P. et al. Nature 511, 94-98 (2014)].

In dual-pharmacophore DECLs, two building blocks are simultaneously connected to the extremities of complementary DNA strands, thus enabling the formation of combinatorial libraries by the self-assembly of oligonucleotide conjugates. Each member of the two sub-libraries (E1 and E2) forming the ESAC library is individually synthesized and HPLC purified: this procedure allows compounds that give low conjugation yields or form side products to be included in ESAC libraries. The structure of ESAC libraries allows the members to explore a wide surface of the target protein, increasing the possibility of finding binding pockets. Moreover, ESAC libraries may facilitate the identification of synergistic amino acid reactive moieties, which recognize adjacent pockets on target proteins of interest [S. Melkko et al. Nat. Biotechnol., 22, 568-574 (2004)]. The library structure, the construction procedure and the synergistic effect make the ESAC library suitable for the identification of new binding molecules which are able to interact with large surfaces of target proteins.
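As an illustrative sketch of the self-assembly described above, each sub-library may be represented as a mapping from a DNA code to a building block, and the ESAC library as every pairing of an E1 member with an E2 member. The codes and building-block names below are hypothetical, and hybridisation is assumed to occur between every E1 and every E2 strand.

```python
from itertools import product

def assemble_esac(e1, e2):
    """Pair every E1 conjugate with every E2 conjugate.

    e1, e2: dicts mapping a DNA code to a building-block name.
    Returns a dict keyed by the (code1, code2) pair that identifies each duplex.
    """
    return {
        (c1, c2): (bb1, bb2)
        for (c1, bb1), (c2, bb2) in product(e1.items(), e2.items())
    }

# Hypothetical sub-libraries; every member is made and purified individually.
e1 = {"AAC": "acid-1", "AGT": "acid-2", "CTA": "acid-3"}
e2 = {"GGA": "amine-1", "TCC": "amine-2"}

library = assemble_esac(e1, e2)
purified_conjugates = len(e1) + len(e2)  # conjugates synthesised one by one
library_size = len(library)              # duplexes obtained by self-assembly
```

Here five individually purified conjugates give six library members; with larger sub-libraries the gap between the sum and the product of the sub-library sizes widens rapidly.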

While DNA encoded chemical libraries may preferentially yield binders for targets with defined pockets, such as kinases, proteases, or phosphatases, the recognition of large surfaces of target proteins remains a challenge. Typically, millions of molecules have to be screened in order to find a suitable candidate. Preparing very large libraries of organic molecules and ensuring their purity is cumbersome. Furthermore, the complexity associated with the identification of a specific binding molecule from a pool of candidates grows with the size of the chemical library to be screened.

Summary

The present inventors have recognised that covalent linkage between the chemical moieties (also called building blocks (BBs)) on complementary strands of members of nucleic acid encoded chemical libraries facilitates the generation of large, high purity libraries of cyclised pharmacophores through a ring closure of the chemical moieties (e.g. on the top). Such libraries may, for example, facilitate screening for molecules that bind to large surfaces of target proteins.

A first aspect of the invention provides a method of producing a nucleic acid encoded chemical library comprising;

providing a population of first conjugates comprising a first nucleic strand coupled to a first reactive group and a first set of one or more chemical moieties at a first end,

providing a population of second conjugates comprising a second nucleic strand coupled to a second reactive group and a second set of one or more chemical moieties,

hybridising the first and second nucleic acid strands together to produce a population of double stranded molecules having the first and second sets of chemical moieties at an end thereof, and

reacting the first and second reactive groups to covalently link the first and second sets of chemical moieties and produce cyclised pharmacophores coupled to the double stranded molecules,

said population of double stranded molecules forming a chemical library.

A second aspect of the invention provides a method of producing a member of a nucleic strand encoded chemical library comprising;

providing a first conjugate comprising a first nucleic strand coupled to a first reactive group and a first set of one or more chemical moieties,

providing a second conjugate comprising a second nucleic strand coupled to a second reactive group and a second set of one or more chemical moieties,

hybridising the first and second nucleic acid strands together to produce a double stranded molecule having the first and second sets of chemical moieties at an end thereof, and

reacting the first and second reactive groups to covalently link the first and second sets of chemical moieties and produce a cyclised pharmacophore coupled to the double stranded molecule.

A third aspect of the invention provides a nucleic acid encoded chemical library comprising;

a diverse population of members, each member comprising;

a first nucleic strand coupled to a first set of one or more chemical moieties,

a second nucleic strand coupled to a second set of one or more chemical moieties

wherein the first and second nucleic strands are hybridised together to form double stranded molecules, and the first and second sets of chemical moieties are covalently linked to form cyclised pharmacophores coupled to the double stranded molecules.

A suitable nucleic acid encoded chemical library may be produced by a method of the first aspect. Suitable members of the nucleic acid encoded chemical library may be produced by a method of the second aspect. In some preferred embodiments, one of the first and second nucleic acid strands of the first, second and third aspects may be coupled to one chemical moiety and the other of the first and second nucleic acid strands may be coupled to one chemical moiety (see for example Figure 1A).

In other preferred embodiments, one of the first and second nucleic acid strands of the first, second and third aspects may be coupled to two chemical moieties and the other of the first and second nucleic acid strands may be coupled to one chemical moiety (see for example Figures 1B and 1E). For example, one of the first and second nucleic acid strands of the first, second and third aspects may be coupled to a first chemical moiety. A second chemical moiety may be attached to the first chemical moiety. A reactive group may be attached to the first or the second chemical moiety. The other of the first and second nucleic acid strands may be coupled to one chemical moiety which is attached to a reactive group (see for example Figures 1B and 1E).

In other preferred embodiments, one of the first and second nucleic acid strands of the first, second and third aspects may be coupled to two chemical moieties and the other of the first and second nucleic acid strands may be coupled to two chemical moieties (see for example Figure 1C).

In other preferred embodiments, one of the first and second nucleic acid strands of the first, second and third aspects may be coupled to three chemical moieties and the other of the first and second nucleic acid strands may be coupled to one chemical moiety (see for example Figure 1D).

A fourth aspect of the invention provides a method of screening a nucleic acid encoded chemical library comprising;

producing a nucleic acid encoded chemical library of the third aspect,

contacting the library with a target molecule and

selecting one or more library members which bind to the target.

The populations of first and second conjugates and the libraries of the first to fourth aspects described above may display high purity.

These and other aspects and embodiments of the invention are described in more detail below.

Brief Description of the Figures

Figure 1 shows examples of different self-assembling chemical libraries (ESAC) embodiments.

Figure 2 shows examples of synthetic and encoding strategies for dual-pharmacophore chemical libraries with ring closure on the top.

Figure 3 shows some examples of chemical reactions/chemical compounds used for the ring closure of the complementary sub-population of nucleic acid conjugates.

Figures 4A and 4B show a strategy for the generation of nucleic acid conjugates in which the extremities of the two nucleic acid strands (strand A and strand B) are coupled to chemical moieties and azide (strand A) and alkyne (strand B) reactive groups. Figure 4A shows the synthesis of an amino-modified d-spacer oligonucleotide azide conjugate (strand A). Figure 4B shows the synthesis of the amino-modified 48-mer oligonucleotide alkyne conjugate (strand B).

Figure 5 shows an example of the production of a cyclic dual-pharmacophore chemical library by copper(I)-catalysed 1,2,3-triazole-forming reaction, in which chemical moieties are coupled to the strands of the library members. An azide modified chemical moiety is coupled to a first nucleic acid strand (strand A). An alkyne modified chemical moiety is coupled to the second nucleic strand (strand B). Ring closure is performed between diverse populations of chemical moieties which are displayed at the extremities of the two DNA strands using a copper(I)-catalysed 1,2,3-triazole-forming reaction.

Figure 6 shows analytical UV traces (recording absorbance at 260 nm) of (A) nucleic acid strand A comprising the azide modified chemical moiety, (B) strand B comprising the alkyne modified chemical moiety, (C) strands A and B without ring closure reagents and (D) nucleic acid strands A and B with ring closure.

Figure 7 shows the respective MS traces of (A) nucleic acid strand A comprising the azide modified chemical moiety, (B) strand B comprising the alkyne modified chemical moiety, (C) strands A and B without ring closure reagents and (D) nucleic acid strands A and B with ring closure.

Figure 8 shows the synthesis scheme of (A) the 5’-amino-modified oligonucleotide azide acid conjugate (strand A1) by coupling the 2-azidoacetic acid onto the Elib2_Code1 oligonucleotide using s-NHS/EDC as coupling reagents in MOPS buffer, (B) the 3’-amino-modified, 5’-phosphorylated d-spacer oligonucleotide alkyne acid conjugate (strand B1) by coupling the 6-heptynoic acid onto the d-spacer using DMT-MM as coupling reagent in Borate buffer and (C) the ring closure of the nucleic acid strands A1 and B1 by 1,2,3-triazole formation reaction using copper(I) catalyst to obtain Format 1.

Figure 9 shows respectively the analytical LC-ESI-MS trace references of (A) nucleic acid strand A1, (B) nucleic acid strand B1, (C) the mixture of nucleic acid strands A1 and B1 without the ring closure reaction and (D) the nucleic acid strand A1 and strand B1 mixture after the reaction depicted in Figure 8C.

Figure 10 shows (A) the synthesis scheme of the encoding step by the ligation of a custom oligonucleotide Elib4_Code2 onto the 5’-phosphorylated, 3’-amino-modified alkynyl acid strand B1, synthesising the encoded oligonucleotide alkynyl conjugate strand B2 and (B) the synthesis scheme of DNA duplex Format 2 conjugate by the ring closure of nucleic acid strands A1 and B2 by 1,2,3-triazole formation reaction using copper(I) catalyst to obtain Format 2.

Figure 11 shows respectively the analytical LC-ESI-MS trace references of (A) nucleic acid strand A1, (B) nucleic acid strand B2, (C) the mixture of nucleic acid strands A1 and B2 without the ring closure reaction and (D) the nucleic acid strand A1 and B2 mixture after the reaction depicted in Figure 10B.

Figure 12 shows (A) the synthesis scheme of the Klenow fill-in of the nucleic acid strand A1 to nucleic acid strand A2, constructing the DNA duplex used for the ring closure reaction and (B) the synthesis scheme of the ring closure of the nucleic acid strands A2 and B2 by 1,2,3-triazole formation reaction using copper(I) catalyst to obtain Format 3.

Figure 13 shows respectively (A) the analytical LC-ESI-MS trace references of the nucleic acid strand B2, (B) the analytical LC-ESI-MS trace references of the duplex of nucleic acid strands A2 and B2 without the ring closure reaction and (C) the analytical LC-ESI-MS trace of the nucleic acid strands A2 and B2 duplex after the reaction depicted in Figure 12B.

Figure 14 shows a denaturing electrophoresis gel of the ring closure formation by copper(I)-catalysed azide alkyne cycloaddition in different formats and their corresponding references. After the Gene Ladder (pb), the lanes are ordered as follows: (A) nucleic acid strand A1, (B) nucleic acid strand B1, (C) nucleic acid strands A1 and B1 reference mixture, (D) Ring closure formation of nucleic acid strand A1 and strand B1 (Format 1), (E) nucleic acid strand B2, (F) nucleic acid strands A1 and B2 reference mixture, (G) Ring closure formation of nucleic acid strand A1 and strand B2 (Format 2), (H) nucleic acid strands A2 and B2 reference mixture, (I) Ring closure formation of nucleic acid strand A2 and strand B2 (Format 3).

Figure 15 shows (A) the synthesis scheme of the 5’-amino-modified oligonucleotide scaffold conjugates (nucleic acid strand C1) by coupling 14 tri-functional carboxylic acid scaffolds (first chemical moiety, BB1) onto custom Elib5_Code1 oligonucleotides using s-NHS/EDC as coupling reagents in TEA/HCl buffer; the resulting oligonucleotide conjugates were then Fmoc-deprotected to obtain the 14 tri-functional nucleic acid strands C1, which were pooled in equimolar quantity, giving the nucleic acid strands C1 Pool, (B) the synthesis scheme of the encoding step by the ligation of 293 custom 5’-phosphorylated oligonucleotides Elib5_Code2 onto the pool of 14 5’-amino-modified tri-functional carboxylic acid scaffolds (nucleic acid strand C1), giving 293 Elib5 sub-pools and (C) the synthesis scheme of the nucleic acid strands C2 sub-library by coupling 293 carboxylic acids (second chemical moiety, BB2) to the corresponding Elib5 sub-pools using DMT-MM as coupling reagent in MOPS buffer, yielding the final nucleic acid strands C2 sub-library.

Figure 16 shows (A) the synthesis scheme of the nucleic acid strand D1 by coupling the 5-hexynoic acid onto the 3’-amino-modified, 5’-phosphorylated d-spacer oligonucleotide using DMT-MM as coupling reagent in Borate buffer and (B) the synthesis scheme of the encoding step by the ligation of a custom Elib6_Code3 oligonucleotide onto nucleic acid strand D1, giving the encoded oligonucleotide alkynyl conjugate nucleic acid strand D2.

Figure 17 shows the synthetic scheme of the ring closure of the nucleic acid strands C2 sub-library and nucleic acid strand D2 by 1,2,3-triazole cyclisation reaction using copper(I) catalyst to obtain Format 4.

Figure 18 shows, respectively, (A) the analytical LC-ESI-MS trace references of the nucleic acid strands C2 sub-library, (B) the analytical LC-ESI-MS trace references of the nucleic acid strand D2, (C) the analytical LC-ESI-MS trace references of the mixture of the nucleic acid strands C2 sub-library and D2 without the ring closure reaction and (D) the analytical LC-ESI-MS trace of the nucleic acid strands C2 sub-library and nucleic acid strand D2 mixture after reaction depicted on Figure 17, to obtain Format 4.

Figure 19 shows (A) the analytical LC-ESI-MS trace references of the mixture of nucleic acid strands C2 sub-library and D2 without the ring closure reaction at 80 °C (the maximum afforded by the device), unlike all the previous figures, which depict the LC-ESI-MS traces at 60 °C, and (B) the analytical LC-ESI-MS trace at 80 °C of the nucleic acid strands C2 sub-library and nucleic acid strand D2 mixture after the reaction depicted in Figure 17 to obtain Format 4.

Figure 20 shows a denaturing electrophoresis gel of the ring closure formation of Format 4 by copper(I)-catalysed azide alkyne cycloaddition and the corresponding references. After the Gene Ladder, the lanes are ordered as follows: (A) Pool of 4,102 Elib5 conjugates (nucleic acid strands C2 sub-library), (B) encoded alkyne conjugate (nucleic acid strand D2), (C) nucleic acid strands C2 sub-library and nucleic acid strand D2 mixture reference and (D) ring closure formation of C2 and D2 (Format 4).

Figure 21 shows (A) the synthesis scheme of the Klenow fill-in of the nucleic acid strands C2 sub-library into nucleic acid strands C3 sub-library, constructing the DNA duplex used for the ring closure reaction and (B) the synthesis scheme of the ring closure of the nucleic acid strands C3 sub-library and nucleic acid strand D2 by 1,2,3-triazole formation reaction using copper(I) catalyst.

Figure 22 shows (A) the analytical LC-ESI-MS trace references of the nucleic acid strand D2, (B) the analytical LC-ESI-MS trace references of the duplex of nucleic acid strands C3 sub-library and nucleic acid strand D2 without the ring closure reaction and (C) the analytical LC-ESI-MS trace of the nucleic acid strands C3 sub-library and nucleic acid strand D2 duplex after the reaction depicted in Figure 21B (Format 5).

Figure 23A shows the synthesis scheme of the Klenow fill-in of the strands C2 sub-library using a single custom YL_Code3 to yield the single-pharmacophore DNA duplex (Format 6A). Figures 23B, 23C and 23D show respectively the analytical LC-ESI-MS trace references of the strands C2 sub-library, the YL_Code3 and the single-pharmacophore DNA duplex (Format 6A). The analyses have been performed at 60 °C, except in the case of Figure 23D, to see the denaturation of the Format 6A at 80 °C.

Figure 24 shows the synthesis scheme of the 5’-amino-modified oligonucleotide azide acid conjugate (strands F1) by coupling the acid alkynes onto the d-spacer oligonucleotide using DMT-MM as coupling reagent in Borate buffer (Figure 24A), s-NHS/EDC as coupling reagents in MOPS buffer (Figure 24B) and DMT-MM as coupling reagent in MOPS buffer (Figure 24C).

Figure 25A shows the synthesis scheme of the encoding step by the ligation of custom oligonucleotides Elib6_Code3#1-n onto 5’-phosphorylated, 3’-amino-modified alkynyl acid oligonucleotide conjugates (strands F1), giving strands F2. Figure 25B shows the synthesis scheme of the Klenow fill-in of the strands C2 sub-library using the strands F2 sub-library to yield the dual-pharmacophore DNA duplex (Format 6B) composed of the strands F2 sub-library and the strands C3B sub-library. Figures 25C, 25D and 25E show respectively the analytical LC-ESI-MS trace references of the strands C2 sub-library, the strands F2 sub-library and the dual-pharmacophore DNA duplex (Format 6B).

Figure 26A shows the synthetic scheme of the ring closure of strands C3B and strands F2 sub-libraries by 1,2,3-triazole formation reaction using copper(I) catalyst, yielding a cyclised dual-pharmacophore library (Format 6C). Figures 26B and 26C show respectively the analytical LC-ESI-MS trace references of the dual-pharmacophore DNA duplex (Format 6B) and cyclised dual-pharmacophore library (Format 6C).

Figure 27 shows the electrophoresis denaturing gel of the three different library formats (Formats 6A, 6B and 6C) and their corresponding references. After the Gene Ladder, the lanes are ordered as follows: (a) strands C2 sub-library, (b) strands F2 sub-library, (c) strands C2 and F2 sub-libraries mixture, (d) Ring closure of strands C2 and strands F2 sub-libraries (no Klenow fill-in), (e) strands C3B and strands F2 sub-libraries (Format 6B), (f) Ring closure of filled-in strands C3B and strands F2 sub-libraries (Format 6C), (g) strands C2 sub-library, (h) YL_Code3, (i) Klenow fill-in of strands C2 sub-library and YL_Code3 (Format 6A).

Figures 28A and 28B show the DNA constructs built by the PCR1 of the single-pharmacophore (Format 6A) and the dual-pharmacophores (Formats 6B and 6C) respectively.

Figure 29 shows the agarose gel of the PCR#2 pool, merging all formats (6A, 6B and 6C) before the gel extraction.

Figure 30 shows the fingerprints of the Format 6A after sequencing of the affinity selections. Codes A, B and C on the figure correspond to Codes 1, 2 and 3, respectively. Figure 30A shows the Format 6A library selection without CAIX, highlighting the non-specific binders. Figure 30B shows the Format 6A selection against CAIX.

Figure 31 shows the fingerprints of the Format 6B after sequencing of the affinity selections. Codes A, B and C on the figure correspond to Codes 1, 2 and 3, respectively. Figure 31A shows the Format 6B library selection without CAIX, highlighting the non-specific binders. Figure 31B shows the Format 6B selection against CAIX.

Figure 32 shows the fingerprints of the Format 6C after sequencing of the affinity selections. Codes A, B and C on the figure correspond to Codes 1, 2 and 3, respectively. Figure 32A shows the Format 6C library selection without CAIX, highlighting the non-specific binders. Figure 32B shows the Format 6C selection against CAIX.

Figure 33A shows the most enriched compounds from the fingerprints of Format 6A. The fingerprint depicts two carboxylic acids, 6 and 53, as building block 2, described in Example 5.4. The building blocks 1, described in Example 5.2, do not show any preferential number. The single-pharmacophore setting displays preferential building blocks 2, keeping flexibility in respect of building blocks 1. Figure 33B shows the most enriched compounds from the fingerprints of Format 6B. The fingerprint depicts only the carboxylic acid 6 as building block 2, described in Example 5.4. The building blocks 1, described in Example 5.2, and building blocks 3, described in Example 8.2, do not show any preferential number. The dual-pharmacophore setting locks the carboxylic acid 6 as building block 2, inducing a rigidity of the structures. Figure 33C shows the most enriched compounds from the fingerprints of Format 6C. The fingerprint depicts only the carboxylic acid 6 as building block 2, described in Example 5.4, and the tri-functional scaffold 4 (the 4th scaffold among the 14 described in the synthesis of the strands C2 sub-library in Example 5) as building block 1, described in Example 5.2. The full set of acid alkynes, described in Example 8.2, is shown as building block 3. The cyclised dual-pharmacophore setting increases the rigidity by locking carboxylic acid 6 as building block 2 described in Example 5.4 and tri-functional scaffold 4 as building block 1 described in Example 5.2.

Figure 34A shows the synthesis scheme of the 5’-amino-modified oligonucleotide scaffold conjugates (strand G1) by coupling tri-functional carboxylic acid scaffolds onto custom Elib5_Code1. The resulting oligonucleotide conjugates were then Fmoc-deprotected to obtain tri-functional strands G1 that were pooled in equimolar quantity, yielding the strands G1 Pool. Figure 34B shows the synthesis scheme of the encoding step by the ligation of custom 5’-phosphorylated oligonucleotides Elib5_Code2 onto the pool of 5’-amino-modified tri-functional carboxylic acid scaffolds, yielding Elib5 sub-pools (Elib5-G sub-pools). Figure 34C shows the synthesis scheme of the strands G2 sub-library by coupling carboxylic acids to the corresponding Elib5 sub-pools, yielding the final strands G2 sub-library.

Figure 35A shows the synthesis scheme of the strands H1 by coupling azide acids onto the 3’-amino-modified, 5’-phosphorylated d-spacer. Figure 35B shows the synthesis scheme of the encoding step by the ligation of a custom oligonucleotide Elib6_Code3 onto the strands H1 to obtain the strands H2 that were pooled in equimolar quantity, yielding the strands H2 sub-library.

Figure 36A shows the synthesis scheme of the Klenow fill-in of the strand G2 to strand G3, yielding the DNA duplex used for the ring closure reaction. Figure 36B shows the synthesis scheme of the ring closure of the strands G3 and H2 by 1,2,3-triazole formation reaction (Format 7).

Figure 37A shows the synthesis scheme of the 5’-amino-modified oligonucleotide amino acid conjugates (strand I1) by coupling N-Fmoc protected amino acids onto custom Elib5_Code1. The resulting oligonucleotide conjugates were then Fmoc-deprotected to obtain free-amino strands I1 that were pooled in equimolar quantity, yielding the strands I1 Pool. Figure 37B shows the synthesis scheme of the encoding step by the ligation of custom 5’-phosphorylated oligonucleotides Elib5_Code2 onto the pool of 5’-amino-modified amino-conjugates, yielding Elib5 sub-pools (Elib5-I sub-pools). Figure 37C shows the synthesis scheme of the strands I2 sub-library by coupling acid azides to the corresponding Elib5 sub-pools, yielding the final strands I2 sub-library.

Figure 38A shows the synthesis scheme of the strands F1 by coupling acid alkynes onto the 3’-amino-modified, 5’-phosphorylated d-spacer. Figure 38B shows the synthesis scheme of the encoding step by the ligation of a custom oligonucleotide Elib6_Code3 onto strands F1, to obtain the strands F2 that were pooled in equimolar quantity, yielding the final strands F2 sub-library.

Figure 39A shows the synthesis scheme of the Klenow fill-in of the strand I2 to strand I3, yielding the DNA duplex used for the ring closure reaction. Figure 39B shows the synthesis scheme of the ring closure of the strands I3 and F2 by 1,2,3-triazole formation reaction (Format 8).

Figure 40A shows the synthesis scheme of the 5’-amino-modified oligonucleotide amino acid conjugates (strand J1) by coupling N-Fmoc protected amino acids onto custom Elib5_Code1. The resulting oligonucleotide conjugates were then Fmoc-deprotected to obtain free-amino strands J1 that were pooled in equimolar quantity, yielding the strands J1 Pool. Figure 40B shows the synthesis scheme of the encoding step by the ligation of custom 5’-phosphorylated oligonucleotides Elib5_Code2 onto the pool of 5’-amino-modified amino-conjugates, yielding Elib5 sub-pools (Elib5-J sub-pools). Figure 40C shows the synthesis scheme of the strands J2 sub-library by coupling acid alkynes to the corresponding Elib5 sub-pools, yielding the final strands J2 sub-library.

Figure 41A shows the synthesis scheme of the Klenow fill-in of the strand J2 to strand J3, yielding the DNA duplex used for the ring closure reaction. Figure 41B shows the synthesis scheme of the ring closure of the strands J3 and H2 by 1,2,3-triazole formation reaction (Format 9).

Figure 42A shows the synthesis scheme of the 5’-amino-modified oligonucleotide acid ester conjugates (strand K1) by coupling acid esters onto custom Elib5_Code1. The resulting oligonucleotide conjugates were then hydrolysed to obtain free-acido strands K1 that were pooled in equimolar quantity, yielding the strands K1 Pool. Figure 42B shows the synthesis scheme of the encoding step by the ligation of custom 5’-phosphorylated oligonucleotides Elib5_Code2 onto the pool of 5’-amino-modified acido-conjugates, yielding Elib5 sub-pools (Elib5-K sub-pools). Figure 42C shows the synthesis scheme of the strands K2 sub-library by coupling amine azides to the corresponding Elib5 sub-pools, yielding the final strands K2 sub-library.

Figure 43 shows the synthesis scheme of the Klenow fill-in of the strand K2 to strand K3, yielding the DNA duplex used for the ring closure reaction (upper panel) and the synthesis scheme of the ring closure of the strands K3 and F2 by 1,2,3-triazole formation reaction (Format 10; lower panel).

Figure 44A shows the synthesis scheme of the 5’-amino-modified oligonucleotide acid ester conjugates (strand L1) by coupling acid esters onto custom Elib5_Code1. The resulting oligonucleotide conjugates were then hydrolysed to obtain free-acido strands L1 that were pooled in equimolar quantity, yielding the strands L1 Pool. Figure 44B shows the synthesis scheme of the encoding step by the ligation of custom 5’-phosphorylated oligonucleotides Elib5_Code2 onto the pool of 5’-amino-modified acido-conjugates, yielding Elib5 sub-pools (Elib5-L sub-pools). Figure 44C shows the synthesis scheme of the strands L2 sub-library by coupling amine alkynes to the corresponding Elib5 sub-pools, yielding the final strands L2 sub-library.

Figure 45 shows the synthesis scheme of the Klenow fill-in of the strand L2 to strand L3, yielding the DNA duplex used for the ring closure reaction (upper panel) and the synthesis scheme of the ring closure of the strands L3 and H2 by 1,2,3-triazole formation reaction (Format 11; lower panel).

Table 1 shows the conditions for copper(I)-catalysed 1,2,3-triazole-forming reactions to achieve the ring closure between diverse populations of chemical moieties displayed at the extremities of the two DNA strands, wherein an azide modified chemical moiety is coupled to a nucleic acid strand A and an alkyne modified chemical moiety is coupled to the second nucleic acid strand B. All reactions were conducted using 1 nmol of strand A and 1 nmol of strand B in 1 µL H2O and 1 µL in Borate Buffer (500 mM, pH 9.4) for 2 h at 25 °C. Conversion was determined by LC-MS.

Table 2 shows the m/z results of a) nucleic acid strand A comprising the azide modified building block, b) nucleic acid strand B comprising the alkyne modified building block, and c) the ring closure example of nucleic acid strands A and B, and d) of the nucleic acid strands A and B without ring closure reagents.

Detailed Description

This invention relates to the production of a nucleic acid encoded chemical library through the hybridisation of a population of first conjugates, comprising a first nucleic strand coupled to a first set of one or more chemical moieties and a first reactive group, to a population of second conjugates, comprising a second nucleic strand coupled to a second set of one or more chemical moieties and a second reactive group, to produce double stranded molecules comprising first and second sets of chemical moieties. The first and second sets of chemical moieties may be located at an end of the double stranded molecules. The first and second reactive groups are then reacted to covalently link the first and second conjugates through ring closure, for example “on the top” of the sets of chemical moieties, and produce cyclised pharmacophores of high purity located at the end of the double stranded molecules that comprise the first and second sets of chemical moieties covalently linked together. The population of pharmacophores coupled to the double stranded molecules forms a nucleic acid-encoded chemical library. The methods described herein allow the generation of libraries of large pharmacophores that are both highly pure and highly diverse.

Members of a nucleic acid encoded chemical library as described herein comprise a double-stranded nucleic acid molecule linked at an end to a pharmacophore that comprises a first set of chemical moieties coupled to the first nucleic acid strand and covalently linked to a second set of chemical moieties coupled to the second nucleic acid strand. Preferably, one of the nucleic acid strands comprises coding sequences that encode the chemical moieties that constitute the pharmacophore displayed by the library member. A member of a nucleic acid encoded chemical library may be produced by a method comprising;

providing a first conjugate comprising a first nucleic strand coupled to a first reactive group and a first set of one or more chemical moieties,

providing a second conjugate comprising a second nucleic strand coupled to a second reactive group and a second set of one or more chemical moieties,

hybridising the first and second strands together to produce a double stranded molecule having the first and second sets of chemical moieties at an end, and

reacting the first and second reactive groups to covalently link the first and second sets of chemical moieties and produce a cyclised pharmacophore coupled to the end of the double stranded molecule.

In some embodiments, the first and second conjugates may be pure (i.e. they may display high purity) and may be used to generate a pure library member. A nucleic acid-encoded library is a collection of library members, each of which displays a cyclised pharmacophore that is made up of one or more chemical moieties. The identity of the chemical moieties that constitute the pharmacophore is encoded into each library member through a nucleic acid strand that incorporates coding sequences that allow the identification of the chemical moieties in the pharmacophore. The members of the library display a diverse population of pharmacophores. This allows the screening of a large number of pharmacophores. For example, a nucleic acid-encoded library may comprise 10^6 or more different pharmacophores for screening.

Preferably, members of a nucleic acid-encoded chemical library (‘library members’) may be formed from two nucleic acid strands (a first and a second strand) and two or more chemical moieties. The two or more chemical moieties may be composed of a first set of one or more chemical moieties coupled to a first nucleic strand and a second set of one or more chemical moieties coupled to a second nucleic strand. The first nucleic strand may be hybridised to the second nucleic acid strand to bring the first and second sets of chemical moieties into proximity. The first and second sets of chemical moieties are then covalently linked through a ring closure, for example on the top of the two strand extremities, to form the cyclised pharmacophore.

Preferably, the first and second sets of chemical moieties are diverse. For example, the population of first conjugates and the population of second conjugates may be sub-libraries which may be combined through hybridisation and covalent linkage as described herein to generate a nucleic acid-encoded library.
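By way of non-limiting illustration, the combination of two sub-libraries into a library may be sketched as a simple enumeration of pairings. The sketch assumes that every first conjugate may pair with every second conjugate; the function name `enumerate_library` and the conjugate labels are hypothetical and do not appear elsewhere herein.

```python
from itertools import product


def enumerate_library(first_sublibrary, second_sublibrary):
    """Return every (first conjugate, second conjugate) pairing.

    Assumes all-against-all pairing of the two sub-libraries.
    """
    return list(product(first_sublibrary, second_sublibrary))


# Hypothetical labels standing in for purified conjugates.
first_conjugates = ["first_1", "first_2", "first_3"]
second_conjugates = ["second_1", "second_2"]

members = enumerate_library(first_conjugates, second_conjugates)
print(len(members))  # number of distinct pairings
```

Under this assumption the number of library members is the product of the sizes of the two sub-libraries, so each conjugate need only be synthesised and purified once.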

The populations of first and second conjugates may be pure, i.e. the populations may be essentially free of contaminant molecules, such as reactants and by-products. For example, 70% or more, 80% or more, 90% or more, 95% or more, 99% or more or 99.5% or more of the molecular species in the populations may be conjugates as described herein. Purity may be determined by standard analytic techniques, such as HPLC, SDS-PAGE and MS. For example, purity may be expressed as the measured peak area of the first and second conjugates combined, as a percentage of the sum of the peak areas of all peaks, for example the sum of the peak areas of the first conjugate, the second conjugate and the contaminant molecules. The peak area may be measured under standard detection conditions, for example HPLC or MS conditions described elsewhere herein.
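By way of non-limiting illustration, the peak-area expression of purity may be sketched as follows. The function name `purity_percent` and the peak areas are hypothetical and are not measured values from the present disclosure.

```python
def purity_percent(conjugate_peak_areas, contaminant_peak_areas):
    """Purity as conjugate peak area over total peak area, in percent."""
    conjugate_total = sum(conjugate_peak_areas)
    total = conjugate_total + sum(contaminant_peak_areas)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * conjugate_total / total


# Hypothetical peak areas (arbitrary units): first conjugate,
# second conjugate, and a single contaminant peak.
purity = purity_percent([480.0, 470.0], [50.0])
print(f"{purity:.1f}%")
```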

The conjugates that form the library members described herein each comprise a nucleic strand, a set of one or more chemical moieties and a reactive group. The set of chemical moieties may be coupled to an end of the nucleic acid strand. The reactive group may be coupled to the end of the nucleic acid strand or more preferably may be coupled to the set of chemical moieties. The first conjugate may thus comprise a first nucleic strand, a first set of one or more chemical moieties and a first reactive group. The second conjugate may comprise a second nucleic strand, a second set of one or more chemical moieties and a second reactive group. A nucleic acid strand is a polynucleotide chain (e.g. a DNA, RNA, LNA or RNA/DNA chain) which may be coupled to a set of chemical moieties. The nucleic acid strands of the conjugates may be DNA, RNA or chimeric RNA/DNA. Preferably, the nucleic acid strand(s) in the chemical libraries described herein are DNA.

The first and second nucleic acid strands may comprise one or more complementary regions that comprise complementary nucleotide sequences. The first nucleic acid strand of the first conjugate may hybridise to the second nucleic acid strand of the second conjugate through the complementary regions in the two strands to form a double-stranded molecule.
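As an illustration of what complementary regions mean in practice, the following minimal sketch checks whether two DNA regions, both written 5' to 3', are perfect reverse complements of one another. The sequences are hypothetical and are not taken from the sequence listing; real hybridisation also depends on temperature, salt and mismatch tolerance, which this sketch ignores.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def can_hybridise(region_first, region_second):
    """True if the two regions (both 5'->3') are perfectly complementary."""
    return reverse_complement(region_first) == region_second.upper()


# Hypothetical complementary regions of a first and a second strand.
first_region = "GGAGCTTCTG"
second_region = "CAGAAGCTCC"
print(can_hybridise(first_region, second_region))
```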

The first nucleic acid strand may comprise coding sequences that encode the chemical moieties in the first set. The second nucleic acid strand may comprise coding sequences that encode the chemical moieties in the second set. In some preferred embodiments, the first nucleic acid strand may further comprise coding sequences that encode the chemical moieties in the second set. In other preferred embodiments, the second nucleic acid strand may further comprise coding sequences that encode the chemical moieties in the first set. A coding sequence may encode one chemical moiety or more than one chemical moiety, for example two chemical moieties. Suitable methods for the incorporation of the coding sequences for the second set of chemical moieties into the first nucleic acid strand or the coding sequences for the first set of chemical moieties into the second nucleic acid strand are known in the art and include for example ligation or extension using a polymerase, as discussed below.

A coding sequence (or coding region) can be any sequence of nucleic acid bases that is uniquely associated with a particular chemical moiety. This allows the identity of the chemical moiety to be determined by sequencing or otherwise ‘reading’ the coding sequence. A coding sequence may encode one chemical moiety or more than one chemical moiety. Preferably a coding sequence encodes one or two chemical moieties. A coding sequence contains sufficient nucleotides to uniquely identify the chemical moiety for which it is coding. For example, if the chemical moiety has 20 variants, the coding sequence needs to contain at least 3 nucleotides (4^2 = 16, 4^3 = 64). The coding sequence may be longer than necessary. The benefit of employing coding sequences that are longer than necessary is that they provide the opportunity to differentiate codes by more than just a single nucleotide difference, which gives more confidence in the decoding process. For example, a first chemical moiety from a population of 20 different moieties (20 compounds) may be encoded by 6 nucleotides, and a second chemical moiety from a population of 200 different moieties may be encoded by 8 nucleotides. The length of the coding sequence therefore depends on the number of chemical moieties to be encoded (i.e. the number of different chemical moieties in the library). A sequence of nucleotides and/or its complement may be used as a coding sequence to encode a chemical moiety. Suitable sequences for encoding chemical moieties in a library are well-known in the art. Examples of suitable coding sequences are shown in SEQ ID NOs: 1, 3, 4, 6, 7, 9, 11 and 12.
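The arithmetic behind the minimum code length, and the reason longer codes give more confidence in decoding, can be sketched as follows. The function names and the example codes are illustrative assumptions; the Hamming distance is used here only as one simple way to express "differing by more than a single nucleotide".

```python
from itertools import combinations


def min_code_length(n_variants, alphabet_size=4):
    """Smallest number of nucleotides L such that alphabet_size**L >= n_variants."""
    length = 0
    capacity = 1
    while capacity < n_variants:
        capacity *= alphabet_size
        length += 1
    return length


def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(codes):
    """Smallest Hamming distance between any two codes in the set."""
    return min(hamming(a, b) for a, b in combinations(codes, 2))


print(min_code_length(20))   # 4**2 = 16 is too few, 4**3 = 64 suffices
print(min_code_length(200))  # 4**3 = 64 is too few, 4**4 = 256 suffices

# Hypothetical 4-nucleotide codes; a larger minimum distance means a single
# sequencing error cannot turn one valid code into another.
print(min_pairwise_distance(["AAAA", "AATT", "TTTT"]))
```

Using 6 nucleotides for 20 moieties, as in the example above, leaves 4096 possible codes for 20 moieties, so the codes can be chosen well separated from one another.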

The second nucleic acid strand may not be complementary to the first nucleic acid strand when the strands are hybridised together in the double stranded nucleic acid molecule at positions where the coding sequences are located in the first nucleic acid strand. In some preferred embodiments, the second nucleic acid strand may comprise one or more spacer regions. Alternatively, the first nucleic acid strand may not be complementary to the second nucleic acid strand when the strands are hybridised together in the double stranded nucleic acid molecule at positions where the coding sequences are located in the second nucleic acid strand. In some preferred embodiments, the first nucleic acid strand may comprise one or more spacer regions. The spacer region is non-hybridisable and may be called a non-hybridisable spacer (also named d-spacer). The spacer region may be located in one of the first and second nucleic acid strands at a position that would otherwise hybridise with a coding sequence located in the other of the first and second nucleic acid strands in the double stranded nucleic acid molecule. In some embodiments, regions in one of the first and second nucleic acid strands that are complementary to all of the coding sequences in the other of the first and second nucleic acid strands may be replaced by spacer regions. The non-hybridisable spacers may be located at positions in one of the first and second nucleic acid strands that correspond, when the first and second strands are hybridised together, to the positions of the coding sequences in the other of the first and second nucleic acid strands. A nucleic acid strand containing one or more spacer regions at positions corresponding to coding sequences may hybridise to nucleic acid strands containing different coding sequences. This may be useful in the production of diversity in self-assembling libraries.
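The effect of placing spacer regions opposite coding sequences can be illustrated with a toy model in which each abasic position is written as `X` and simply skipped during pairing. The symbol `X`, the function name and all sequences are hypothetical; the sketch only shows why one spacer-containing strand can pair with partner strands that carry different codes.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
SPACER = "X"  # hypothetical symbol for one abasic (d-spacer) position


def pairs_with(strand_with_spacers, partner):
    """Both strands are written 5'->3'. Spacer positions are skipped;
    every other position must form a Watson-Crick base pair."""
    if len(strand_with_spacers) != len(partner):
        return False
    aligned_partner = partner[::-1]  # antiparallel alignment
    for base, opposite in zip(strand_with_spacers, aligned_partner):
        if base == SPACER:
            continue
        if COMPLEMENT.get(base) != opposite:
            return False
    return True


# Two hypothetical first strands: shared region CCTGAA plus different 4-nt codes.
first_a = "CCTGAA" + "ACGT"
first_b = "CCTGAA" + "TTGC"

# A second strand with spacers opposite the code pairs with both.
second_with_spacer = "XXXX" + "TTCAGG"
print(pairs_with(second_with_spacer, first_a), pairs_with(second_with_spacer, first_b))

# A fully complementary second strand pairs only with its own partner.
second_full = "ACGT" + "TTCAGG"
print(pairs_with(second_full, first_a), pairs_with(second_full, first_b))
```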

The spacer region (or d-spacer) is an abasic region that does not hybridise to nucleotide sequences and is not a template for a nucleic acid polymerase. Suitable spacer regions may comprise an abasic linker, such as an abasic phosphodiester backbone or a linker, such as an alkyl chain, polyethylene glycol or other oligomer that spans the spacer region. Suitable spacer regions may be obtained from commercial suppliers. An example of a spacer region or d-spacer sequence is shown in SEQ ID NO: 2.

Suitable methods for the production of nucleic acid strands comprising spacer regions, coding sequences and complementary regions are known in the art (see for example WO2003/076943, WO2009/077173, WO2015/091207; and W. Decurtins et al. Nat. Protoc. 11, 764-780 (2016); M. Wichert et al. Nat. Chem., 7, 241-249 (2015)).

A nucleic acid strand in a conjugate described herein may be coupled to a set of one or more chemical moieties. The first and second sets of chemical moieties may be coupled to the first and second nucleic acid strands directly or via a linker. The first set of chemical moieties may be coupled to one of the 5’ and 3’ ends of the first nucleic acid strand in the first conjugate and the second set of chemical moieties may be coupled to the other of the 5’ and 3’ ends of the second nucleic acid strand in the second conjugate. The two sets of chemical moieties may be located at the same end of the double stranded molecule following hybridisation of the first and second nucleic acid strands. This facilitates the reaction of the first and second reactive groups to covalently link the sets of chemical moieties through a ring closure and form the cyclised pharmacophore.

The first and/or second sets of chemical moieties may be diverse. The first and/or second conjugates that form the library members may themselves be members of a sub-library, each nucleic acid strand in the population of first and/or second conjugates being coupled to a different combination of chemical moieties.

A sub-library of conjugates may comprise different sets of chemical moieties coupled to nucleic acid strands. The conjugates in a sub-library may assemble through hybridisation of the first and second nucleic acid strands with conjugates from the same sub-library or a different sub-library, for example a sub-library of conjugates comprising a different number of chemical moieties, to produce a double-stranded library. For example, nucleic acid strands coupled to a set containing a single chemical moiety may assemble with nucleic acid strands coupled to a set containing two chemical moieties, thereby presenting, following reaction of the reactive groups, a pharmacophore consisting of three chemical moieties and a covalent linkage. Alternatively, nucleic acid strands coupled to a first set of two or more chemical moieties may assemble with nucleic acid strands coupled to a second set of two or more chemical moieties, thereby presenting, following reaction of the reactive groups, a cyclised pharmacophore consisting of four or more chemical moieties and a covalent linkage.

A set of chemical moieties may comprise 1, 2, 3, 4, 5 or more chemical moieties. For example, each of the first and second sets of chemical moieties may comprise 1, 2, 3 or more chemical moieties.

The reaction of the first and second reactive groups to form a covalent linkage generates a cyclised pharmacophore through a ring closure, for example on the top of the two strand extremities. The cyclised pharmacophore may comprise the first and second sets of chemical moieties and the covalent linkage.

The covalent linkage in between the two conjugates may be part of the pharmacophore or may be part of a branched chemical space between the nucleic acid strand and the pharmacophore. The covalent linkage provides one of a pair of ring closures that generate a cyclised population of library members (e.g. one of a top or bottom ring closure, preferably with a ring closure on the top). The other of the pair of ring closures (e.g. the other of the top and bottom closures) is provided by the hybridisation of the first and second nucleic acid strands to form a double stranded molecule.

The presence and identity of the covalent linkage may affect the binding properties of the pharmacophore, and may for example introduce lipophilic or hydrophilic chemical spaces, and/or additional functional groups, into the pharmacophore.

In some preferred embodiments, a nucleic acid-encoded library as described herein may include members with different covalent linkages. This may be useful for example in increasing the diversity contained in the library.

A pharmacophore is an assembly of molecular features or elements which is capable of specifically interacting with a target. Different combinations of chemical moieties produce different pharmacophores which are displayed by different members of the library. In addition, different covalent linkages may also produce different pharmacophores which may be combined following production of a library as described herein.

The pharmacophore may be formed from the first set of chemical moieties on the first nucleic acid strand, the second set of chemical moieties on the second nucleic acid strand and the covalent linkage between the two sets of moieties. Typically, the chemical moieties within a set on the same nucleic acid strand will be covalently bonded together and the different sets of chemical moieties on different nucleic acid strands will be brought into proximity by the assembly of the nucleic acid strands, followed by ring closure with a covalent linkage through reaction of the reactive groups. The reaction of the reactive groups covalently links the sets of chemical moieties to form a cyclised pharmacophore following hybridisation of the first and second strands, to form the library member.

A chemical library member may display a pharmacophore which comprises or consists of any of 2, 3, 4, 5 or more chemical moieties. The chemical moieties may be covalently linked together, for example with a covalent connection at an end of the first and second conjugates (e.g. “on the top” of the two conjugates). Both nucleic acid strands may be coupled to one or more chemical moieties in the pharmacophore. For example, a first strand may be coupled to the first set of chemical moieties and the second strand may be coupled to the second set of chemical moieties and the chemical moieties in the sets may be covalently linked by reaction of the reactive groups. The covalent linkage itself may form part of the pharmacophore. A pharmacophore presented on a double stranded nucleic acid molecule may form a closed structure through the covalent linkage and the non-covalent hybridisation of the nucleic acid strands and may be described as cyclic.

In some preferred embodiments, the total molecular weight of the chemical moieties in the cyclised pharmacophore may be less than 3 kDa, preferably less than 1 kDa, more preferably less than 500 Da.
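The molecular weight preferences stated above amount to a simple tiered threshold on the summed mass of the moieties. The sketch below assumes that the per-moiety masses are already known; the masses, labels and function name are hypothetical.

```python
def classify_total_mass(moiety_masses_da):
    """Sum the moiety masses (Da) and report which stated threshold is met."""
    total = sum(moiety_masses_da)
    if total < 500:
        tier = "less than 500 Da"
    elif total < 1000:
        tier = "less than 1 kDa"
    elif total < 3000:
        tier = "less than 3 kDa"
    else:
        tier = "3 kDa or more"
    return total, tier


# Hypothetical masses for three chemical moieties of one pharmacophore.
print(classify_total_mass([150.0, 200.0, 120.0]))
```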

Suitable chemical moieties include small organic molecules, amino acid residues or other amino-containing moieties (optionally with appropriate amino protection); and peptides or globular proteins (including antibody domains). In some embodiments, a chemical moiety may have a molecular weight of 300 Da or less, for example about 100 to 300 Da. Populations of chemical moieties for use in the generation of libraries are well-known in the art (see for example W. Decurtins et al. Nat. Protoc. 11, 764-780 (2016); M. Wichert et al. Nat. Chem., 7, 241-249 (2015); Mannocci et al., PNAS 105, 17670-17675 (2008); Mannocci et al., Bioconj. Chem. 21, 1836-1841 (2010); Franzini et al. Bioconj. Chem. 25, 1453-1461 (2014); Franzini et al., Angew. Chem. Int. Ed. 54, 3927-3931 (2015); Franzini et al., Chem. Commun. 51, 8014-8016 (2015); Franzini et al. Acc. Chem. Res. 47, 1247-1255 (2014); Li et al., Nature Chem. 10, 441-448 (2018); Bigatti et al., ChemMedChem. 12, 1748-1752 (2017); Zimmermann et al., Chemistry, 23, 8152-8155 (2017)).

A set of chemical moieties may be covalently coupled to the nucleic acid strand directly or indirectly, for example via a linker. Suitable linkers, such as alkyl chains, are well known in the art. The chemical moieties may be coupled directly using conventional synthetic chemistries, for example amide or other conventional linkages. Chemical moieties may be coupled to a nucleic acid strand via other chemical moieties. For example, each of the chemical moieties within a set may be covalently bonded to other chemical moieties and one of the chemical moieties may be coupled to the nucleic acid strand. Suitable methods for covalently bonding chemical moieties and coupling chemical moieties to nucleic acid strands are well known in the art (see for example W. Decurtins et al. Nat. Protoc. 11, 764-780 (2016); M. Wichert et al. Nat. Chem., 7, 241-249 (2015)).

As described above, the first and second sets of chemical moieties may be coupled to the nucleic acid strands such that they are located at the same end of the double-stranded molecule formed by hybridisation of the strands. For example, the first set of chemical moieties may be coupled to the 5’ end of the first nucleic acid strand and the second set of chemical moieties may be coupled to the 3’ end of the second nucleic acid strand; or the first set of chemical moieties may be coupled to the 3’ end of the first nucleic acid strand and the second set of chemical moieties may be coupled to the 5’ end of the second nucleic acid strand.

Hybridisation establishes non-covalent sequence-specific base-pairing between the complementary regions of the first and second nucleic acid strands and brings the set of chemical moieties attached to the strands into proximity. Under suitable hybridisation conditions, the complementary regions of the strands will anneal together such that the strands form a double stranded molecule with the sets of chemical moieties at one end. Suitable hybridisation conditions are well-known in the art. Typical hybridisation temperatures for the sequence specific annealing of two polynucleotide strands may be between 4°C and 70°C.

In some embodiments, the first nucleic acid strands of a sub-library of first conjugates may be coupled to a first diverse set of one or more chemical moieties and the second nucleic acid strands of a sub-library of second conjugates may be coupled to a second diverse set of one or more chemical moieties. The first and second diverse sets may be the same or different. When the nucleic acid strands of the first and second conjugates hybridise together to form double-stranded molecules and the first and second reactive groups are reacted to form a covalent linkage, pharmacophores may be generated from the different combinations of the sets of chemical moieties coupled to the nucleic acid strands. This increases the number of different pharmacophores in the library.
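The combinatorial gain from pairing two sub-libraries can be sketched by enumerating every combination of a first conjugate, a second conjugate and a covalent linkage. The member names and linkage names are placeholders chosen for illustration, and the sketch assumes that every first conjugate can hybridise and react with every second conjugate.

```python
from itertools import product


def assemble_library(first_sub_library, second_sub_library, linkages):
    """Enumerate (first set, second set, linkage) for every possible pairing."""
    return list(product(first_sub_library, second_sub_library, linkages))


# Hypothetical sub-libraries: labels stand for sets of chemical moieties.
first_sub_library = ["A1", "A2", "A3"]
second_sub_library = ["B1", "B2", "B3", "B4"]
linkages = ["triazole", "amide"]

library = assemble_library(first_sub_library, second_sub_library, linkages)
print(len(library))  # 3 x 4 x 2 combinations
print(library[0])
```

The library size is the product of the sub-library sizes and the number of linkages, so modest sub-libraries combine into a much larger set of pharmacophores.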

In some embodiments, the coding sequences in the second nucleic acid strand may be incorporated into the first nucleic acid strand. This allows the first strand to contain coding sequences for all of the chemical moieties in the first and second sets. The chemical moieties displayed by the library member can thus be identified by sequencing the first nucleic acid strand. In some embodiments, the hybridisation of the first and second nucleic acid strands may leave a single-stranded overhanging region of the second nucleic acid strand. The single-stranded region of the second nucleic acid strand may comprise coding sequences encoding the set of chemical moieties attached to the second strand. A method may comprise;

extending the first nucleic acid strand along the second nucleic acid strand,

such that the first nucleic acid strand incorporates the complement of the coding sequences in the second nucleic acid strand.

In other embodiments, the coding sequences in the first nucleic acid strand may be incorporated into the second nucleic acid strand. This allows the second strand to contain coding sequences for all of the chemical moieties in the first and second sets. The chemical moieties displayed by the library member can thus be identified by sequencing the second nucleic acid strand. In some embodiments, the hybridisation of the first and second nucleic acid strands may leave a single-stranded overhanging region of the first nucleic acid strand. The single-stranded region of the first nucleic acid strand may comprise coding sequences encoding the set of chemical moieties attached to the first strand. A method may comprise;

extending the second nucleic acid strand along the first nucleic acid strand,

such that the second nucleic acid strand incorporates the complement of the coding sequences in the first nucleic acid strand. Suitable techniques for 5’ to 3’ extension of nucleic acid strands along a template nucleic acid strand are well known in the art. For example, the second nucleic acid strand may be extended by addition of nucleotides for polymerisation (normally in excess), preferably deoxynucleotides (dNTPs), and a polymerase (e.g. Taq or Klenow polymerase) in a suitable buffer, incubated at a suitable temperature (e.g. 37°C for Klenow polymerase or 65°C or 72°C for Taq).
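The extension step can be pictured with a simple string model: the strand being extended acts as a primer whose 3' end is paired with the template strand, and the polymerase appends the reverse complement of the template's single-stranded 5' overhang. The function name, the `paired_length` parameter and the sequences are assumptions made for this sketch, which ignores spacer regions and enzyme behaviour.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def extend_along_template(primer, template, paired_length):
    """Extend `primer` 5'->3' along `template` (both written 5'->3').

    The last `paired_length` bases of the primer must pair with the last
    `paired_length` bases of the template; the remaining 5' part of the
    template is the single-stranded overhang that gets copied.
    """
    if reverse_complement(primer[-paired_length:]) != template[-paired_length:]:
        raise ValueError("primer 3' end is not hybridised to the template")
    overhang = template[:-paired_length]
    return primer + reverse_complement(overhang)


# Hypothetical strands: the template carries the code ACGTAC in its 5' overhang.
primer = "GGAGCTTCTG"
template = "ACGTAC" + "CAGAAGCTCC"
extended = extend_along_template(primer, template, paired_length=10)
print(extended)  # primer followed by the complement of the code
```

After extension, sequencing the extended strand alone reads both its own codes and the complement of the partner strand's codes.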

In other embodiments, the coding sequences in the second nucleic acid strand may be incorporated into the first nucleic acid strand by ligation. This allows the first strand to contain coding sequences for all of the chemical moieties in the first and second sets. The chemical moieties displayed by the library member can thus be identified by sequencing the first nucleic acid strand. The hybridisation of the first and second nucleic acid strands may leave a single-stranded overhanging region of the second nucleic acid strand. The single-stranded region of the second nucleic acid strand may comprise coding sequences encoding the set of chemical moieties attached to the second strand. A method may comprise;

ligating to the first nucleic acid strand a coding oligonucleotide comprising the complement of the coding sequences in the second nucleic acid strand,

such that the first nucleic acid strand incorporates said complement of the coding sequences.

In other embodiments, the coding sequences in the first nucleic acid strand may be incorporated into the second nucleic acid strand by ligation. This allows the second nucleic acid strand to contain coding sequences for all of the chemical moieties in the first and second sets. The chemical moieties displayed by the library member can thus be identified by sequencing the second nucleic acid strand. The hybridisation of the first and second nucleic acid strands may leave a single-stranded overhanging region of the first nucleic acid strand. The single-stranded region of the first nucleic acid strand may comprise coding sequences encoding the set of chemical moieties attached to the first strand. A method may comprise;

ligating to the second nucleic acid strand a coding oligonucleotide comprising the complement of the coding sequences in the first nucleic acid strand,

such that the second nucleic acid strand incorporates said complement of the coding sequences.

The coding oligonucleotide may be ligated to the first or second nucleic acid strand by any convenient technique. For example, an adaptor oligonucleotide may be used. The nucleic acid strand may comprise a proximal end that is coupled to the chemical moiety, for example the 5' end, and a distal end to which the coding sequence is added, for example the 3' end. The nucleic acid strand may further comprise an annealing region which hybridises with an adaptor oligonucleotide. The annealing region may be located adjacent the distal end of the nucleic acid strand to facilitate ligation of the coding oligonucleotide.

An adaptor oligonucleotide may serve as a template to facilitate the ligation of the first or second nucleic acid strand and the coding oligonucleotide. A single adaptor oligonucleotide may facilitate the ligation of multiple nucleic acid strands and coding oligonucleotides. For example, a set consisting of 1, 2, 3, 4, 5 or more adaptor oligonucleotides may be used to facilitate ligation of all of the nucleic acid strands in the sub-library to coding oligonucleotides. Preferably, the sequence of the adaptor oligonucleotide is the same regardless of the chemical moiety(s) coupled to the nucleic acid strand, i.e. only 1 adaptor oligonucleotide is used. This reduces the total number of oligonucleotides required to generate the nucleic acid encoded chemical library.

The adaptor oligonucleotide hybridises with the nucleic acid strand and the coding oligonucleotide and brings the ends of the nucleic acid strand and coding sequence into association within a double-stranded trimeric complex, such that they can be ligated together by a ligase. The adaptor oligonucleotide may bring into association the 3' end of the nucleic acid strand to the 5' end of the coding oligonucleotide or the 5' end of the nucleic acid strand to the 3' end of the coding oligonucleotide. Suitable hybridisation conditions for the hybridisation of polynucleotides are well-known in the art and include for example a temperature of between 0°C and 70°C. Suitable ligation conditions are well-known in the art.
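The templating role of the adaptor can be sketched as a check that the adaptor is the reverse complement of the junction formed by the 3' end of the nucleic acid strand and the 5' end of the coding oligonucleotide. All names and sequences are hypothetical, and the model covers only the 3'-strand to 5'-coding-oligonucleotide orientation with perfect pairing.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def splint_ligate(strand, coding_oligo, adaptor):
    """Return strand + coding_oligo if the adaptor bridges their junction.

    All sequences are written 5'->3'. The adaptor must pair with some
    number k of 3'-terminal bases of the strand and with the first
    len(adaptor) - k bases of the coding oligonucleotide.
    """
    for k in range(1, len(adaptor)):
        junction = strand[-k:] + coding_oligo[: len(adaptor) - k]
        if len(junction) == len(adaptor) and reverse_complement(junction) == adaptor:
            return strand + coding_oligo
    return None  # no trimeric complex, so nothing is ligated


# Hypothetical sequences: annealing region CTTCTG at the strand's 3' end,
# constant region TCAGGT at the 5' end of the coding oligonucleotide.
strand = "GGAGCTTCTG"
coding_oligo = "TCAGGT" + "ACGTAC"
adaptor = "ACCTGACAGAAG"
print(splint_ligate(strand, coding_oligo, adaptor))
```

Because the adaptor pairs only with the annealing region and the constant 5' region of the coding oligonucleotide, the same adaptor sequence can serve strands and coding oligonucleotides that differ elsewhere.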

The adaptor oligonucleotide may be DNA, RNA or chimeric (i.e. containing both deoxyribonucleotides and ribonucleotides). A suitable adaptor oligonucleotide may, for example, be 10 to 35 bases, preferably 14 to 30 bases in length. Examples of suitable adaptor oligonucleotides are shown in SEQ ID NOs: 5, 8 and 10.

Suitable adaptor oligonucleotides may be synthesized using appropriate techniques.

In some embodiments, the adaptor may remain hybridised to the first nucleic acid strand, for example within a nucleic acid spacer strand, and may form part of the library member that is produced.

In other embodiments, the adaptor may be removable or removed by purification following the ligation step. For example, adaptors may be separated under denaturing conditions on the basis of their small size relative to the nucleic acid strand incorporating the identifier oligonucleotide. More preferably, the adaptor oligonucleotide may be cleavable.

Cleavage or degradation of the adaptor results in separation of the adaptor oligonucleotide from the nucleic acid strand. For example, the adaptor oligonucleotide may be cleaved enzymatically, for example using RNase, or chemically, for example by base hydrolysis (typically, exposure to pH>12 at room temperature or greater).

In some embodiments, the nucleic acid strand may be purified following removal of the adaptor for example to remove fragments of a cleaved or degraded adaptor. Suitable purification methods are well known in the art.

The first and second conjugates are covalently linked in the libraries described herein by reaction of the first reactive group on the first conjugate with the second reactive group on the second conjugate. The reactive groups may be located at the same end of the nucleic acid strands as the sets of chemical moieties. In some preferred embodiments, the sets of chemical moieties comprise the reactive groups. For example, the first set of one or more chemical moieties may comprise the first reactive group and/or the second set of one or more chemical moieties may comprise the second reactive group. In some embodiments, one or both of the first and second sets may comprise a chemical moiety that is modified to further comprise a reactive group (i.e. a reactive-group modified chemical moiety). In other embodiments, one or both of the reactive groups may be directly linked to the nucleic acid strands. The reactive groups may be linked to the nucleic acid strands or sets of chemical moieties directly or through a linker, such as an alkyl or polyethylene glycol chain.

The first and second reactive groups of the first and second conjugates may be any pair of chemical groups that react together to form a covalent linkage.

Any convenient chemistry may be employed to react the reactive groups, including for example, a click reaction, such as azide alkyne cycloaddition or alkene hydrothiolation (see respectively for example Liu et al. Beilstein J. Org. Chem. (2018), 14, 2404-2410 and Chalker et al. Chem. Asian J. (2009), 4, 630-640) or 1,3-dipolar cycloaddition, nitrone-olefin cycloaddition, Diels-Alder reaction (see for example, Buller et al. Bioorganic & Medicinal Chemistry Letters (2008), 18, 5926-5931), amide bond formation (see for example Y. Li et al. ACS Comb. Sci. (2016) 18, 438-444), reductive amination, reductive alkylation (see for example Satz et al. Bioconjugate Chemistry (2015) 26, 1623-1632), Suzuki reaction (see for example Li J. Y. et al. Bioconjugate Chem. (2018) 29 (11), 3841-3846), disulfide formation, cyclic sulfide formation (see for example Biron et al. Wiley: New York, (2017), Chap. 9, 205-241), photo-redox decarboxylative reaction (Kolmel et al. ChemMedChem. (2018), 20, 2159-2165), metathesis reaction, thiourea or urea formation, sulfonylation (see for example Chalker et al. Chem. Asian J. (2009), 4, 630-640), alkylation (see for example Franzini et al. Bioconjugate Chemistry (2014) 25, 1453-1461), carbamate formation, carbamoylation, acylation, Michael reaction, Michael addition, quinazolinone/isoindolinone/thiazole formation, alkene-alkyne oxidative coupling, Heck reaction and Sonogashira reaction (see for example Satz et al. Bioconjugate Chemistry (2015) 26, 1623-1632). Reaction conditions suitable for covalently linking the first and second reactive groups are well-known in the art. For example, the first and second conjugates may be admixed in aqueous buffer in the presence of a suitable catalyst and shaken at a suitable temperature.

Suitable reactive groups may be selected from carboxyl, sulfonyl halide, acyl halide, aryl halide, isocyanate, isothiocyanate, carbonyl, alkyl halide, alkenyl, boronyl, amino, azido, alkynyl and thiol groups.

In some embodiments, covalent linkage may be achieved through click chemistry. Suitable click chemistry reactions include 1,3-dipolar cycloaddition, for example azide alkyne cycloaddition, such as copper(I)-catalysed azide alkyne cycloaddition (CuAAC). For example, one of the first and second reactive groups may comprise an azido group and the other of the first and second reactive groups may comprise an alkynyl (C≡C) group. A first reactive group comprising one of an alkynyl or an azido group may react with a second reactive group comprising the other of the alkynyl or the azido group to form a covalent linkage via a 1,2,3-triazole moiety.

The first and second conjugates may be admixed in aqueous buffer in the presence of a copper salt, such as CuSO4, and, if necessary, a reducing agent, at a suitable temperature to effect the azide alkyne cycloaddition of the first and second reactive groups [Liu et al. Beilstein J. Org. Chem. (2018), 14, 2404-2410].

Other suitable click chemistry reactions include alkene hydrothiolation (thiol-ene reaction). For example, one of the first and second reactive groups may comprise an alkenyl group and the other of the first and second reactive groups may comprise a thiol group. A first reactive group comprising one of an alkenyl group and a thiol group may react with a second reactive group comprising the other of the alkenyl or thiol group to form a covalent linkage [see for example Chalker et al. Chem. Asian J. (2009), 4, 630-640].

Covalent linkage may be achieved through 1,3-dipolar cycloaddition. For example, one of the first and second reactive groups may comprise a diazoalkane group and the other of the first and second reactive groups may comprise a vinyl group. A first reactive group comprising one of a diazoalkane group and a vinyl group may react with a second reactive group comprising the other of the diazoalkane group or vinyl group, to form a covalent linkage.

Covalent linkage may be achieved through nitrone-olefin cycloaddition. For example, one of the first and second reactive groups may comprise a nitrone group and the other of the first and second reactive groups may comprise an alkenyl or alkynyl group. A first reactive group comprising one of a nitrone group and an alkenyl/alkynyl group may react with a second reactive group comprising the other of the nitrone group and alkenyl/alkynyl group to form a covalent linkage.

Covalent linkage may be achieved through a sulfhydryl/maleimide reaction. For example, a first reactive group comprising one of a sulfhydryl or maleimide group may react with a second reactive group comprising the other of the sulfhydryl or maleimide group to form a 3-thiosuccinimidyl ether linkage [see for example Chalker et al. Chem. Asian J. (2009), 4, 630-640].

Covalent linkage may be achieved through a Diels-Alder cycloaddition reaction. For example, a first reactive group comprising one of a dienyl or imine group and an alkenyl group may react with a second reactive group comprising the other of the dienyl or imine group and the alkenyl group to form a substituted cyclohexene linkage. The first and second conjugates may be admixed in aqueous buffer in the presence, if necessary, of a suitable condensing agent at a suitable temperature to effect the cycloaddition reaction [see for example, Buller et al. Bioorganic & Medicinal Chemistry Letters (2008), 18, 5926-5931].

Covalent linkage may be achieved through an amination reaction. For example, a first reactive group comprising one of an amine group and a carbonyl group or activated version thereof (e.g. ester, acid anhydride, acid halide or activated ester such as N-hydroxysuccinimide ester) may react with a second reactive group comprising the other of the amine group or the carbonyl group or activated version thereof to form an amide linkage. A solution of one of the first and second conjugates may be prepared in DMSO and added to the other of the first and second conjugates in aqueous buffer in the presence, if necessary, of a suitable condensing agent, and shaken at a suitable temperature [see for example Y. Li et al. ACS Comb. Sci. (2016) 18, 438-444].

Covalent linkage may be achieved through reductive amination/reductive alkylation. For example, one of the first and second reactive groups may comprise a carbonyl group, such as an aldehyde group, and the other of the first and second reactive groups may comprise an amino group. A first reactive group comprising one of a carbonyl group and an amino group may react with a second reactive group comprising the other of the carbonyl group and amino group to form a covalent linkage.

Covalent linkage may be achieved through Suzuki cross-coupling. For example, a first reactive group comprising one of a boronyl group and a halide group, such as I, OTf, Br or Cl, may react with a second reactive group comprising the other of the boronyl or halide group, to form a covalent linkage [see for example Li J. Y. et al. Bioconjugate Chem. (2018) 29 (11), 3841-3846].

Covalent linkage may be achieved through disulfide formation. For example, first and second reactive groups comprising thiol groups may react together to form a disulfide linkage.

Covalent linkage may be achieved through cyclic sulfide formation. For example, a first reactive group comprising one of an alkenyl group and a disulfide group, may react with a second reactive group comprising the other of the alkenyl group and disulfide group to form a cyclic sulfide linkage.

Covalent linkage may be achieved through ether bond formation. For example, a first reactive group comprising one of a hydroxyl group and a halide group may react with a second reactive group comprising the other of the hydroxyl group and a halide group to form an ether linkage.

Covalent linkage may be achieved through urea/thiourea formation. For example, a first reactive group comprising one of an amino group and an isothiocyanate or isocyanate group may react with a second reactive group comprising the other of the amino group and isothiocyanate/isocyanate group to form a urea or thiourea linkage. A solution of one of the first and second conjugates may be prepared in DMSO or CH3CN, added to the other of the first and second conjugates in aqueous buffer, and shaken at a suitable temperature.

Covalent linkage may be achieved through a photo-redox decarboxylative reaction. For example, a first reactive group comprising one of a carboxylic group and a vinyl group may react with a second reactive group comprising the other of the carboxylic group and vinyl group to produce a carbon-carbon linkage.

Covalent linkage may be achieved through a metathesis reaction. For example, first and second reactive groups comprising alkenyl groups may react together to form a carbon-carbon linkage.

Covalent linkage may be achieved through an Eglinton or Hay reaction. For example, first and second reactive groups comprising alkynyl groups may react together to form a carbon-carbon linkage.

Covalent linkage may be achieved through a sulfonylation reaction. For example, a first reactive group comprising one of a sulfonyl halide group and an amino group may react with a second reactive group comprising the other of the sulfonyl halide or amino group, to form a sulfonamide linkage.

Covalent linkage may be achieved through an alkylation reaction. For example, a first reactive group comprising one of an amino group and an alkyl halide group may react with a second reactive group comprising the other of the amino and alkyl halide group to form an amino linkage.

Covalent linkage may be achieved through carbamate formation. For example, a first reactive group comprising one of an amino group and a carbonyldiimidazole group may react with a second reactive group comprising the other of the amino and carbonyldiimidazole group to form a carbamate linkage.

Covalent linkage may be achieved through an acylation reaction. For example, a first reactive group comprising one of an amino group and an acyl halide group may react with a second reactive group comprising the other of the amino and acyl halide group to form an amide linkage.

Covalent linkage may be achieved through a Michael reaction. For example, a first reactive group comprising one of an amino group and an acrylamide group may react with a second reactive group comprising the other of the amino and acrylamide group to form an amino linkage.

Covalent linkage may be achieved through a Michael addition. For example, a first reactive group comprising one of an α,β-unsaturated carbonyl group and an alkenyl group may react with a second reactive group comprising the other of the α,β-unsaturated carbonyl group and alkenyl group to form an amino linkage.

Covalent linkage may be achieved through alkene-alkyne oxidative coupling. For example, a first reactive group comprising one of an alkenyl group and an alkynyl group may react with a second reactive group comprising the other of the alkenyl group and alkynyl group to form a carbon-carbon linkage.

Covalent linkage may be achieved through a Heck reaction. For example, a first reactive group comprising one of an alkenyl group and a halide group may react with a second reactive group comprising the other of the alkenyl group and halide group to form a carbon-carbon linkage.

Covalent linkage may be achieved through a Sonogashira reaction. For example, a first reactive group comprising one of an alkynyl group and a halide group may react with a second reactive group comprising the other of the alkynyl group and halide group to form a carbon-carbon linkage.

Following the production of a nucleic acid encoded chemical library as described above, the library may be isolated and/or purified. For example, after the covalent linkage of the first and second reactive functionalities, the crude reaction mixture may be analysed, for example by liquid chromatography electrospray ionization mass spectrometry (LC-ESI-MS), in order to confirm the formation of the desired products. Library members comprising the covalent linkage may be isolated from the reaction mixture, for example by ethanol precipitation, high performance liquid chromatography (HPLC) purification and lyophilisation of the pure fraction.

In some embodiments, the library may be combined with one or more additional libraries to produce an expanded library. The library or expanded library may be stored or used in screening applications. For example, a nucleic acid encoded library may be screened for binding to a target molecule. Methods of screening using nucleic acid encoded libraries are described in more detail below.

A nucleic acid encoded chemical library contains members that together display a diverse population of pharmacophores. As described above, the nucleic acid strands hybridise to form a duplex nucleic acid molecule which is coupled to the sets of one or more chemical moieties attached to each strand. The nucleic acid strands may self-assemble through the hybridisation of complementary regions in each strand to form a double-stranded or partially double-stranded nucleic acid molecule. The first and second sets of chemical moieties coupled to the strands are then covalently connected together through the reactive groups to form a cyclised pharmacophore (i.e. both the nucleic acid strands and optionally the covalent linkage contribute to the pharmacophore), producing a population of library members comprising cyclised pharmacophores. In some embodiments, the population may display high purity.

A nucleic acid encoded chemical library may comprise;

a diverse population of members, each member comprising;

a first nucleic strand coupled to a first set of one or more chemical moieties,

a second nucleic strand coupled to a second set of one or more chemical moieties,

wherein the first and second nucleic strands are hybridised together to form double stranded molecules, and the first and second sets of chemical moieties are covalently linked to form cyclised pharmacophores coupled to an end of the double stranded molecules, forming a chemical library.

In some embodiments, the library may display high purity, i.e. the library may be essentially free of contaminant molecules, such as reactants and by-products, including unreacted first and second conjugates. For example, 70% or more, 80% or more, 90% or more, 95% or more, 99% or more or 99.5% or more of the molecular species in the population may be library members as described herein. Purity may be determined by standard analytic techniques, such as HPLC, SDS-PAGE and LC-MS. For example, purity may be expressed as the percentage of the measured peak area of the first and second conjugates combined relative to the sum of the peak areas of all peaks, for example the sum of the peak areas of the first conjugate, the second conjugate and the contaminant molecule. The peak area may be measured under standard detection conditions, for example HPLC or MS conditions described elsewhere herein.
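The peak-area calculation described above can be sketched as follows. This is a minimal illustration only: it assumes that "the first and second conjugates combined" refers to the peak of the covalently linked library member, and the function name `purity_percent` and all peak areas are hypothetical values chosen for the example, not data from this specification.

```python
def purity_percent(peak_areas, product_peaks):
    """Return the share of the total peak area carried by the product peaks, in percent."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    product = sum(peak_areas[name] for name in product_peaks)
    return 100.0 * product / total


# Hypothetical integrated peak areas (arbitrary units) from one chromatogram.
areas = {
    "library_member": 950.0,
    "unreacted_first_conjugate": 20.0,
    "unreacted_second_conjugate": 20.0,
    "by_product": 10.0,
}

purity = purity_percent(areas, ["library_member"])
print(f"{purity:.1f}%")  # 95.0%
```

With these assumed areas the library would fall in the "95% or more" band listed above.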

Libraries produced by the methods described above may comprise 1000 or more, 10000 or more, 100000 or more or 1000000 or more different library members, each different member displaying a different pharmacophore formed from a different combination of chemical moieties. For example, a library produced by the methods described above may comprise 10^3 to 10^9 library members.

The first and second sets of chemical moieties are diverse. Each pharmacophore in the library is formed from the covalent linkage of the first and second diverse sets of chemical moieties that are coupled to the first and second nucleic acid strands of the library members.

The first nucleic strand may comprise coding sequences that identify the chemical moieties in the pharmacophore attached to it.

Suitable nucleic acid encoded chemical libraries may be produced by the method described herein. Within a nucleic acid encoded chemical library, members may include nucleic acid strands which are coupled to the same number and type of chemical moiety, but which are linked in a different order to each nucleic acid strand. For example, where a nucleic acid strand is coupled to two chemical moieties, A and B, some nucleic acid strands may include the moieties linked in the order A-B, where A is distal to the nucleic acid strand and B is proximal to the nucleic acid strand, while others may contain the same two chemical moieties linked in the order B-A where B is distal to the nucleic acid strand and A is proximal to the nucleic acid strand. Assembly of each of these strands individually with a partner strand coupled to a single moiety ‘C’ will produce two library members having pharmacophores with different structures, even though they are composed of the same chemical moieties.

The same principle applies to members which include three chemical moieties, A’, B’ and C’, where members may include the moieties linked as A’-B’-C’, A’-C’-B’, B’-A’-C’, B’-C’-A’, C’-A’-B’ and/or C’-B’-A’ (ordered as proximal-middle-distal with respect to the nucleic acid strand in each case). Other arrangements of chemical moieties are possible, for example A’ and B’ may both be linked to C’ but not to each other, or all of A’, B’ and C’ may form a covalently linked compound.
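The linear orderings listed above can be enumerated mechanically. The following sketch simply lists the permutations of a set of moiety labels; the helper name `linkage_orders` is hypothetical, and branched arrangements (for example A’ and B’ both linked to C’) are not covered.

```python
from itertools import permutations


def linkage_orders(moieties):
    """List every linear order in which the given moieties can be linked."""
    return ["-".join(order) for order in permutations(moieties)]


two = linkage_orders(["A", "B"])
three = linkage_orders(["A'", "B'", "C'"])

print(two)         # ['A-B', 'B-A']
print(len(three))  # 6
```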

The same principle applies to members having four, five or more chemical moieties. Thus, it can be seen that the number of combinations of chemical moieties in the pharmacophore is increased which can aid selection.

The number of different members represents the complexity of a library and is defined by the number of different chemical moieties, the number of chemical moieties in each pharmacophore, the number of different covalent linkages and therefore the number of different pharmacophores in the library. The number of different pharmacophores of any particular library can be determined by multiplying together the number of variants available for each chemical moiety in the pharmacophore and the number of different types of covalent linkage. For example, if each library member has two chemical moieties in the pharmacophore, and there are twenty types of each chemical moiety and one type of covalent linkage, then the resulting library has 400 members. If, for example, there are three chemical moieties in the pharmacophore, each of which has twenty variants, and one type of covalent linkage, then the resulting library has 8000 members.
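The two worked figures above (400 and 8000 members) follow from a simple product. A minimal sketch, assuming one entry per moiety position and a hypothetical helper name `library_size`:

```python
from math import prod


def library_size(variants_per_position, linkage_types=1):
    """Multiply the variant count of each moiety position by the number of linkage types."""
    return prod(variants_per_position) * linkage_types


print(library_size([20, 20]))      # 400
print(library_size([20, 20, 20]))  # 8000
```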

Preferably, each chemical moiety is present in the library in approximately equimolar amounts.

If desired, the members of a nucleic acid encoded chemical library may be linked to a solid support such as a bead, array or other substrate surface. Alternatively, the library members can be free in solution.

Nucleic acid encoded chemical libraries produced as described herein may be used in a variety of screening methods. For example, the library can be used in a method for identifying a pharmacophore that participates in a preselected binding interaction with a biological macromolecule.

A nucleic acid encoded chemical library generated according to the methods described herein provides a repertoire of chemical diversity in which each chemical moiety is linked to a coding sequence that facilitates identification of the chemical moiety. The library may be used to screen for pharmacophores with particular properties, e.g. pharmacophores that bind a target molecule, such as a protein. By screening an encoded chemical library, it is possible to identify optimised chemical structures that participate in binding interactions with a biological macromolecule by drawing upon a repertoire of structures randomly formed by the association of diverse chemical moieties without the necessity of either synthesising them one at a time or knowing their interactions in advance. Nucleic acid-encoded libraries may in particular be useful in the identification of pharmacophores which are candidates for binding to a target of interest, such as a protein, or which have improved characteristics compared to previously known pharmacophores, such as improved binding affinity to a target of interest. Suitable targets for nucleic acid-encoded libraries of pharmacophores are well known in the art.

A method of screening a nucleic acid encoded chemical library may comprise contacting a nucleic acid encoded chemical library produced by a method described herein with a target molecule and selecting one or more library members which bind to the target.

The chemical library may be contacted with the target under binding conditions (i.e. in a binding reaction admixture) for a time period sufficient for the target molecule to interact with the library and bind to at least one member thereof.

Suitable binding conditions are generally compatible with the known natural binding function of the target molecule. Compatible conditions include buffer, pH and temperature conditions that maintain the biological activity of the target molecule, thereby maintaining the ability of the molecule to participate in its preselected binding interaction. Typically, those conditions include an aqueous, physiologic solution of pH and ionic strength normally associated with the target molecule of interest. For example, where the binding interaction is to identify a member in the library able to bind an antibody molecule, the preferred binding conditions would be conditions suitable for the antibody to immunoreact with its immunogen, or a known immunoreacting antigen. For a receptor molecule, the binding conditions would be those compatible with measuring receptor ligand interactions.

A time period sufficient for the target molecule to bind to at least one member of the library is typically that length of time required for the target molecule to interact with its normal binding partner under conditions compatible with interaction. Although the time periods can vary depending on the target molecule and its respective concentration, admixing times are typically for at least a few minutes, and usually not longer than several hours, although nothing is to preclude using longer admixing times for binding to occur.

Binding between a library member and the target molecule may result in the formation of a binding reaction complex, which is a stable product of the interaction between a target molecule and the pharmacophore of the library member as described herein. The product is referred to as a stable product in that the interaction is maintained over sufficient time that the complex can be isolated from the rest of the members of the library without the complex becoming significantly disassociated.

The admixture of a library and the target molecule may be a heterogeneous or homogeneous admixture. Thus, the members of the library may be in the solid phase with the target molecule present in the liquid phase. Alternatively, the target molecule may be in the solid phase with the members of the library present in the liquid phase. Alternatively, the library members and the target molecule may both be in the liquid phase.

The target molecule may be any molecule which the pharmacophore is a candidate for interacting with. The target molecule may be a biological target molecule as described herein or any other molecule of interest. Suitable target molecules include biological targets, for example biological macromolecules, such as proteins. The target may be a receptor, enzyme, for example a kinase, protease, or phosphatase, an antigen or an oligosaccharide.

The interaction with the target is generally through specific binding of all or part of the pharmacophore with the target. In other words, some or all of the chemical moieties, or parts of the chemical moieties which form the pharmacophore may specifically bind to the target.

The binding between the pharmacophore and target may occur through intermolecular forces such as ionic bonds, hydrogen bonds and van der Waals forces, which are generally reversible. The binding may occur through covalent bonding, which is generally irreversible, although this is generally rare in biological systems.

In some preferred embodiments, the one or more selected members may bind to a large surface of a target protein. The one or more selected members may for example inhibit protein-protein interactions (PPIs) of the target protein.

The one or more selected library members may be isolated and/or purified. Any suitable separation technique selective for library members bound to the target molecule may be employed to isolate the one or more selected library members from the binding reaction admixture. A variety of separation techniques may be employed. For example, a target which is a biological macromolecule may be provided in admixture in the form of a solid phase reagent, i.e., affixed to a solid support, and thus can readily be separated from the liquid phase, thereby removing the majority of library members. Separation of the solid phase from the binding reaction admixture can optionally be accompanied by washes of the solid support to rinse library members having lower binding affinities off the solid support.

Alternatively, for a homogeneous liquid binding reaction admixture, a secondary binding means specific for the target molecule can be used to separate the macromolecule from the binding reaction admixture. For example, an immobilised antibody immunospecific for the target molecule may be provided as a solid phase-affixed antibody to the binding reaction admixture after the binding reaction complex is formed. The immobilised antibody immunoreacts with the target molecule present in the binding reaction admixture to form an antibody-target molecule immunoreaction complex. Thereafter, by separation of the solid phase from the binding reaction admixture, the immunoreaction complex, and therefore any binding reaction complex, may be separated from the admixture to form an isolated library member.

Alternatively, a binding member can be operatively linked to the target molecule to facilitate its retrieval from the binding reaction admixture. Exemplary binding members include the following high affinity pairs: biotin-avidin, protein A-Fc receptor, ferritin-magnetic beads. Thus, the target molecule is operatively linked (conjugated) to biotin, protein A, ferritin or other binding member, and the binding reaction complex is isolated by the use of the corresponding binding partner in the solid phase, e.g., solid-phase avidin, solid-phase Fc receptor or solid-phase magnetic beads.

The use of solid supports on which to operatively link proteinaceous molecules is generally well known in the art. Useful solid support matrices are well known in the art and include cross-linked dextran such as that available under the tradename SEPHADEX™ from Pharmacia Fine Chemicals (Piscataway, N.J.); agarose, borosilicate, polystyrene or latex beads about 1 micron to about 5 millimetres in diameter, polyvinyl chloride, polystyrene, cross-linked polyacrylamide, nitrocellulose or nylon-based webs such as sheets, strips, paddles, plates, microtiter plate wells and similar insoluble matrices.

The cyclised pharmacophore formed by the covalently linked chemical moieties of the one or more selected members may, for example, be a ligand, substrate, inhibitor or activator or may be useful in the development of any one of these. The cyclised pharmacophore may be an agonist or antagonist or a candidate agonist or antagonist or may be used as a model or lead in the development of such an agonist or antagonist.

Following screening, the identity of the pharmacophore that is displayed by a selected library member may be determined by decoding the coding sequences that are incorporated into the first nucleic acid strand of the selected library member. For example, the first nucleic acid strands of the one or more selected library members may be amplified to produce amplification products and the amplification products may be sequenced to determine the coding sequences in the first nucleic acid strands. The identity of the first and second sets of chemical moieties in the one or more selected library members may be determined from the coding sequences in the first nucleic acid strands.
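The decoding step described above can be illustrated with a short sketch that looks up coding sequences in sequencing reads and counts how often each combination of moiety sets occurs. Everything specific here is a hypothetical assumption made for illustration: the four-nucleotide codes, their fixed positions in the read, the lookup tables `CODE_1` and `CODE_2`, and the example reads. None of these are taken from this specification.

```python
from collections import Counter

# Hypothetical lookup tables: coding sequence -> set of chemical moieties.
CODE_1 = {"ACGT": "first set 1", "TGCA": "first set 2"}
CODE_2 = {"GGAA": "second set 1", "CCTT": "second set 2"}


def decode(read, start_1=0, start_2=4, length=4):
    """Return the (first set, second set) pair encoded by a read, or None if unknown."""
    code_1 = read[start_1:start_1 + length]
    code_2 = read[start_2:start_2 + length]
    if code_1 in CODE_1 and code_2 in CODE_2:
        return CODE_1[code_1], CODE_2[code_2]
    return None


def count_members(reads):
    """Count decoded library members, skipping reads that cannot be decoded."""
    decoded = (decode(read) for read in reads)
    return Counter(pair for pair in decoded if pair is not None)


# Hypothetical reads from the amplified first strands of selected members.
reads = ["ACGTGGAA", "ACGTGGAA", "TGCACCTT", "NNNNGGAA"]
counts = count_members(reads)
print(counts.most_common(1))  # [(('first set 1', 'second set 1'), 2)]
```

In this toy data set the most frequently sequenced combination would be the first candidate to follow up after a selection.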

A preferred method for decoding the coding sequences in the first nucleic acid strand is the use of high throughput sequencing methods (NGS sequencing), such as the Illumina HTDS (Illumina high-throughput sequencing) or the 454-Roche Genome Sequencer system. For sequencing with the Illumina HTDS system, PCR products have to contain suitable adaptor sequences at their extremities (called adaptor sequence A and B), which can either be added after a PCR reaction by ligation or be incorporated in the PCR reactions, if the PCR primers contain on their 5’-ends sequences corresponding to an adaptor region. The next step of a particular sequencing process is the annealing of PCR amplicons on nucleic acid Capture Beads, emulsification of beads and PCR reagents in water-in-oil microreactors, and clonal emPCR amplification inside these microreactors. After breaking of the emulsion, the Capture Beads are mixed with Enzyme Beads, and loaded on a PicoTiterPlate. Pyrosequencing allows the recording of individual sequences for each nucleic acid species displayed on Capture Beads trapped in the wells of PicoTiterPlates. This allows the parallel sequencing of a vast number (typically more than 100,000 per PicoTiterPlate) of individual nucleic acid species at a time (Decurtins W, Wichert M, Franzini RM, Buller F, Stravs MA, Zhang Y, Neri D, Scheuermann J. Nat Protoc. (2016), 11 (4), 764-780).

One exemplary use for a nucleic acid encoded chemical library produced as described herein is for lead optimization. Lead optimization may involve combining a known pharmacophore, formed from one or more chemical moieties with one or more further chemical moieties, as described herein with the aim of improving the characteristics of the known pharmacophore, for example the binding affinity. In this case, nucleic acid strands from first and second sub-libraries of conjugates may be hybridised and the sets of chemical moieties covalently linked to form a library. The first sub-library may comprise library members which are coupled to the known pharmacophore and the second sub-library comprises members coupled to one or more candidate chemical moieties. The second sub-library generally comprises a variety of different chemical moieties, because this increases the variety of structure in the pharmacophores of the assembled library members. The identities of the chemical moieties in the resultant covalently linked cyclised pharmacophore are encoded into the library member using the methods known in the art and described elsewhere herein.

Other aspects and embodiments of the invention provide the aspects and embodiments described above with the term “comprising” replaced by the term “consisting of” and the aspects and embodiments described above with the term “comprising” replaced by the term “consisting essentially of”.

It is to be understood that the application discloses all combinations of any of the above aspects and embodiments described above with each other, unless the context demands otherwise. Similarly, the application discloses all combinations of the preferred and/or optional features either singly or together with any of the other aspects, unless the context demands otherwise.

Modifications of the above embodiments, further embodiments and modifications thereof will be apparent to the skilled person on reading this disclosure, and as such, these are within the scope of the present invention.

All documents and sequence database entries mentioned in this specification are incorporated herein by reference in their entirety for all purposes.

“and/or” where used herein is to be taken as specific disclosure of each of the two specified features or components with or without the other. For example, “A and/or B” is to be taken as specific disclosure of each of (i) A, (ii) B and (iii) A and B, just as if each is set out individually herein.

Experimental

Example 1: The copper(I)-catalysed 1,2,3-triazole-forming reaction ("top" ring closure) between azide-modified nucleic acid strand A and alkyne-modified nucleic acid strand B.

Methods

Custom oligonucleotides were lyophilized and further purified by EtOH precipitation and re-dissolved in H2O. The final concentration was determined by UV absorbance measurement at 260 nm using a NanoDrop 2000 instrument.

Amino-modified 48-mer oligonucleotide (SEQ ID NO: 1)

5’- GGAGCTTCTGAATTCTGTGTGCTGACGTAACGAGTCCCATGGCGCAGC -3’

Molecular Weight = 14’994.60 Da; ε = 0.46361 µM⁻¹·cm⁻¹

Amino-modified d-spacer oligonucleotide (SEQ ID NO: 2)

5’- CATGGGACTCGddddddCAGCACACAGAATTCAGAAGCTCC -3’

Molecular Weight = 12’058.74 Da; ε = 0.38283 µM⁻¹·cm⁻¹

Mass spectrometry (LC-ESI-MS) spectra were recorded on an Agilent 6100 Series Single Quadrupole MS system combined with an Agilent 1260 Series LC. An ACQUITY UPLC Oligonucleotide BEH C18 column (130 Å, 1.7 µm, 2.1 mm x 50 mm) was used and compounds were eluted by applying gradients of MeOH and 400 mM HFIP / 15 mM TEA in H2O. Calculated and measured m/z values are reported as dimensionless quantities. Preparative reversed-phase high-pressure liquid chromatography (RP-HPLC) for the oligonucleotide conjugates was performed on an Agilent 1200 Series with a C18-Xterra© Prep RP column (112 Å, 5 µm, 10 x 150 mm) using a gradient of eluent A (TEAA 100 mM) and eluent B.

1.1 Synthesis of azide-modified nucleic acid conjugates (strand A) (Figure 4A)

Synthesis of azide conjugate A: 4-Azidobenzoic acid (5 µL, 200 mM in tert-butyl methyl ether) and DMT-MM (2.5 µL, 400 mM in H2O) were incubated for 1 h at 25 °C. Amino-modified d-spacer oligonucleotide (SEQ ID NO: 2) (20 nmol) in Borate Buffer (6.4 µL, 250 mM, pH 9.4) was then added to the mixture and the reaction was stirred for 2 h at 25 °C. To the aqueous DNA solution, 10% (v/v) of 5 M NaCl was added, followed by 2.5-3 volumes of cold absolute EtOH. The colloidal solution was left at -20 °C for 72 h and then centrifuged at 4 °C for 30 min at 15000 rpm. The resulting supernatant was discarded and the pellet was dried using a SpeedVac. The crude mixture was purified by RP-HPLC on a C18-Xterra© Prep RP column (112 Å, 5 µm, 10 x 150 mm) using a gradient of eluent A (TEAA 100 mM) and eluent B (TEAA 100 mM in 80% ACN). The fractions containing the product were combined and lyophilised to obtain the azide product (A), as determined by measuring the UV absorbance at 260 nm of a water solution on a Thermofisher Nanodrop 2000. LC-ESI-MS: calculated 12203.87 m/z, found: 12202.60 m/z.

1.2 Synthesis of alkyne-modified nucleic acid conjugates (strand B) (Figure 4B)

Synthesis of alkynyl conjugate B: 5-hexynoic acid (2.5 µL, 200 mM in DMSO) and DMT-MM (1.25 µL, 400 mM in H2O) were incubated for 2 h at 25 °C. Amino-modified 48-mer oligonucleotide (SEQ ID NO: 1) (10 nmol) in Borate Buffer (13.2 µL, 250 mM, pH 9.4) was then added to the mixture and the reaction was stirred for 2 h at 25 °C. To the aqueous DNA solution, 10% (v/v) of 5 M NaCl was added, followed by 2.5-3 volumes of cold absolute EtOH. The colloidal solution was left at -20 °C for 72 h and then centrifuged at 4 °C for 30 min at 15000 rpm. The resulting supernatant was discarded and the pellet was dried using a SpeedVac. The crude mixture was purified by RP-HPLC on a C18-Xterra© Prep RP column (112 Å, 5 µm, 10 x 150 mm) using a gradient of eluent A (TEAA 100 mM) and eluent B (TEAA 100 mM in 80% ACN). The fractions containing the product were combined and lyophilised to obtain the alkyne product (B), as determined by measuring the UV absorbance at 260 nm of a water solution on a Thermofisher Nanodrop 2000. LC-ESI-MS: calculated 15088.73 m/z, found: 15088.01 m/z.
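As a plausibility check on the calculated masses reported in sections 1.1 and 1.2, the expected mass of each amide conjugate can be estimated as the oligonucleotide mass plus the acid mass minus one water. The sketch below is an assumption-laden illustration, not the method used in this specification: the molecular formulae (C7H5N3O2 for 4-azidobenzoic acid, C6H8O2 for 5-hexynoic acid) and the rounded standard atomic weights are supplied here, and the helper names are hypothetical.

```python
# Rounded standard atomic weights (assumption: average masses are appropriate here).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molecular_weight(formula):
    """Average molecular weight for a formula given as {element: count}."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in formula.items())


WATER = molecular_weight({"H": 2, "O": 1})


def amide_conjugate_mass(oligo_mw, acid_formula):
    """Amide formation joins amine and acid with loss of one water molecule."""
    return oligo_mw + molecular_weight(acid_formula) - WATER


# d-spacer oligonucleotide (12058.74 Da) + 4-azidobenzoic acid.
azide_a = amide_conjugate_mass(12058.74, {"C": 7, "H": 5, "N": 3, "O": 2})
# 48-mer oligonucleotide (14994.60 Da) + 5-hexynoic acid.
alkyne_b = amide_conjugate_mass(14994.60, {"C": 6, "H": 8, "O": 2})

print(f"{azide_a:.2f} {alkyne_b:.2f}")
```

Under these assumptions both estimates land within a few hundredths of a mass unit of the calculated values 12203.87 and 15088.73 given above.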

1.3 Synthesis of the 1,2,3-triazole-cyclic nucleic acid population (Figure 5).

Condition 1: Azide conjugate (strand A; 1 nmol) and alkynyl conjugate (strand B; 1 nmol) were diluted in 1 µL H2O and 1 µL Borate Buffer (500 mM, pH 9.4). TBTA (3 µL, 50 mM in DMSO), CuSO4·5H2O (2 µL, 50 mM in H2O) and (+)-Sodium L-ascorbate (4 µL, 50 mM in H2O) were added to the equimolar reaction mixture and the reaction was stirred for 2 h at 25 °C. The crude LC/MS profiles show a single peak of the desired 1,2,3-triazole product (C). LC-ESI-MS: calculated 27290.6 m/z, found: 27289.98 m/z.

Condition 2: Azide conjugate (strand A; 1 nmol) and alkynyl conjugate (strand B; 1 nmol) were diluted in 1 µL H2O and 1 µL Borate Buffer (500 mM, pH 9.4). The crude LC/MS profiles showed two distinct peaks from conjugate A and conjugate B. No reaction was observed.

ESI-LC-MS was used to analyse the reactants and products. Single peaks were observed for strand A comprising the azide modified building block (Figures 6A and 7A) and strand B comprising the alkyne modified building block (Figures 6B and 7B) in both the UV and MS traces. Double peaks were observed for the mixture of strands A and B without ring closure reagents (Figures 6C and 7C), whilst single peaks were observed for the mixture of strands A and B following ring closure (Figures 6D and 7D) in both the UV and the MS traces.

Example 2: Construction of a cyclised dual-pharmacophore oligonucleotide conjugate composed of an azido acid DNA-encoded compound and an alkynyl acid attached on a complementary DNA strand (Format 1, Figures 8 and 9)

Methods

The synthetic oligonucleotides used for the construction of the oligonucleotide conjugates are shown below. The synthetic oligonucleotides were stored as 100 mM stock solutions at -20 °C.

5’-phosphorylated, 3’-aminomodified, 41-mer oligonucleotide (d-spacer, SEQ ID NO: 2)

5’-CATGGGACTCGddddddCAGCACACAGAATTCAGAAGCTCC-3’

Molecular Weight = 12’058.74 Da; ε = 0.38283 µM⁻¹·cm⁻¹

5’-aminomodified, 48-mer oligonucleotide (Elib2_Code1, SEQ ID NO: 3)

5’-GGAGCTTCTGAATTCTGTGTGCTGACTATCCGAGTCCCATGGCGCAGC-3’

Molecular Weight = 14’944.88 Da; ε = 0.49068 µM⁻¹·cm⁻¹

2.1 Construction of the azide acid conjugate using 5’-amino-modified oligonucleotide [strand A1, Figure 8A]

5 µl of 200 mM azidoacetic acid [1 µmol in dry dimethyl sulfoxide (DMSO)] was activated for 30 min at 30 °C with 10 µl of 100 mM 1-Ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC, 1 µmol in dry DMSO) and 8 µl of 333 mM N-hydroxysulfosuccinimide [s-NHS, 2.7 µmol in DMSO/H2O (2:1)] in 90 µl dry DMSO and subsequently reacted overnight at 30 °C with 20 nmol amino-modified coding oligonucleotide (Elib2_Code1, SEQ ID NO: 3) dissolved in 30 µl of 66 mM 3-(N-morpholino)propanesulfonic acid (MOPS, 2 µmol, pH = 8.0). The DNA-compound was precipitated with cold absolute EtOH to obtain the oligonucleotide azido conjugate (strand A1) and re-dissolved in H2O before Nanodrop measurement and LC-ESI-MS analysis.

2.2 Construction of alkyne acid conjugate using 3’-amino-modified, 5’-phosphorylated d-spacer oligonucleotide [strand B1, Figure 8B]

50 nmol amino-modified d-spacer oligonucleotide was dissolved in 50 µl of 250 mM borate buffer (12.5 µmol, pH = 9.4) with 12.5 µl of 200 mM 6-heptynoic acid (2.5 µmol in dry DMSO) and 6.25 µl of 400 mM 4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methyl-morpholinium chloride (DMT-MM, 2.5 µmol in H2O) and reacted for 2 h at 25 °C. The DNA-compound was precipitated with cold absolute EtOH to obtain the oligonucleotide alkynyl conjugate (strand B1) and re-dissolved in H2O before Nanodrop measurement and LC-ESI-MS analysis.
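The reagent amounts quoted in sections 2.1 and 2.2 follow from volume multiplied by concentration. The sketch below recomputes them as a consistency check, reading the volumes as microlitres and the stated amounts as micromoles; the helper name `amount_umol` and the table layout are hypothetical.

```python
def amount_umol(volume_ul, conc_mM):
    """Amount in micromoles: µl x mM gives nmol, so divide by 1000."""
    return volume_ul * conc_mM / 1000.0


# reagent: (volume in µl, concentration in mM, amount stated in the text in µmol)
reagents = {
    "azidoacetic acid": (5.0, 200.0, 1.0),
    "EDC": (10.0, 100.0, 1.0),
    "s-NHS": (8.0, 333.0, 2.7),
    "MOPS": (30.0, 66.0, 2.0),
    "borate buffer": (50.0, 250.0, 12.5),
    "6-heptynoic acid": (12.5, 200.0, 2.5),
    "DMT-MM": (6.25, 400.0, 2.5),
}

for name, (volume, conc, stated) in reagents.items():
    calculated = amount_umol(volume, conc)
    print(f"{name}: {calculated:.2f} µmol (stated {stated})")
```

The s-NHS and MOPS entries agree with the stated amounts only after rounding, which is consistent with the one-decimal figures given in the text.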

2.3 Ethanol precipitation of oligonucleotide conjugates

After each reaction onto DNA, the oligonucleotide conjugates were precipitated by adding 10% v/v 5 M NaCl and 2.5-3 volumes of cold absolute EtOH. The oligonucleotide conjugates were stored at -20 °C for at least 2 h before centrifugation (1 h, 15’000 rpm, 4 °C). Immediately after the centrifugation, the supernatants were carefully discarded, and the remaining pellets were vacuum dried.
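The precipitation volumes can be worked out for any reaction size. The sketch below is illustrative only: the function name is hypothetical, and it assumes that both the 10% v/v of 5 M NaCl and the 2.5-3 volumes of EtOH are taken relative to the volume of the reaction mixture, which the text does not state explicitly.

```python
def precipitation_volumes(sample_ul: float, etoh_volumes: float = 2.5):
    """Return (µl of 5 M NaCl, µl of cold absolute EtOH) for a given sample.

    Assumption: both additions are scaled to the sample volume itself.
    """
    if not 2.5 <= etoh_volumes <= 3.0:
        raise ValueError("the text specifies 2.5-3 volumes of EtOH")
    nacl_ul = sample_ul / 10           # 10% v/v of 5 M NaCl
    etoh_ul = etoh_volumes * sample_ul
    return nacl_ul, etoh_ul


nacl, etoh = precipitation_volumes(100.0)
print(f"NaCl: {nacl:.1f} µl, EtOH: {etoh:.1f} µl")
```

For a 100 µl reaction this gives 10 µl of 5 M NaCl and 250 µl of EtOH at the lower end of the stated range.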

2.4 Nanodrop measurement

The oligonucleotide quantities were estimated by UV absorbance using a Nanodrop™ 2000/2000c spectrophotometer. 1 µl of the oligonucleotide conjugate solution was released directly onto the optical measurement surface. The absorbance at 260 nm was extracted from the computed data to calculate the amount of oligonucleotide, knowing the absorption coefficient (ε) of the corresponding DNA sequence.
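The calculation referred to here is the Beer-Lambert relation, c = A260 / (ε · l). A minimal Python sketch is shown below; the function names and the example absorbance reading are hypothetical, and the sketch assumes that the reported absorbance is normalised to a 1 cm path length and that ε is given in µM⁻¹·cm⁻¹ as in the sequence listings.

```python
def oligo_conc_uM(a260: float, epsilon_per_uM_cm: float, path_cm: float = 1.0) -> float:
    """Concentration in µM from the Beer-Lambert law, c = A / (epsilon * l)."""
    return a260 / (epsilon_per_uM_cm * path_cm)


def oligo_amount_nmol(a260: float, epsilon_per_uM_cm: float,
                      volume_ul: float, path_cm: float = 1.0) -> float:
    """Amount in nmol in `volume_ul` microlitres (µM x µl = pmol; /1000 -> nmol)."""
    return oligo_conc_uM(a260, epsilon_per_uM_cm, path_cm) * volume_ul / 1000


# Hypothetical reading for an Elib2_Code1 conjugate (epsilon = 0.49068 µM^-1 cm^-1)
conc = oligo_conc_uM(4.9068, 0.49068)
amount = oligo_amount_nmol(4.9068, 0.49068, volume_ul=100)
print(f"{conc:.1f} µM, {amount:.2f} nmol in 100 µl")
```

The coefficient of the unmodified sequence is used here; any contribution of the coupled chemical moiety to the absorbance at 260 nm is neglected in this sketch.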

2.5 Liquid chromatography-electrospray ionisation-mass spectrometry (LC-ESI-MS)

Mass analysis of the oligonucleotide-coupled compounds was performed by liquid chromatography combined with electrospray ionisation mass spectrometry (LC-ESI-MS) on a reverse-phase Agilent 1260 Series LC system. An ACQUITY UPLC Oligonucleotide BEH C18 column (130 Å, 1.7 µm, 2.1 mm x 50 mm) with organic/inorganic particles (silica and polymeric supports) was used as the stationary phase. As the mobile phase, buffer A [400 mM 1,1,1,3,3,3-hexafluoroisopropanol (HFIP), 2 mM triethylamine (TEA)] was applied with a gradient of buffer B (100% methanol). A tandem-quadrupole mass spectrometer (Agilent 6100 Series Single Quadrupole MS) with an electrospray ionisation (ESI) source was used for mass detection and analysis. Mass spectrometric analyses were performed in negative ion mode. ESI interface parameters were set as follows: desolvation temperature (200 °C), source temperature (110 °C), capillary voltage (3.0 kV), cone voltage (40 V), scan time (0.5 s), inter-scan delay time (0.1 s).

2.6 Cyclisation of the strand A1 and strand B1 (Format 1, Figure 8C)

200 pmol oligonucleotide azido conjugate (strand A1) and 200 pmol oligonucleotide alkynyl conjugate (strand B1) were diluted in 2 µl of 250 mM borate buffer (0.5 µmol, pH = 9.4) and subsequently incubated with 3 µl of 50 mM TBTA (tris[(1-benzyl-1H-1,2,3-triazol-4-yl)methyl]amine, 150 nmol in DMSO), 2 µl of 50 mM copper(II) sulfate pentahydrate (CuSO4·5H2O, 100 nmol in H2O), and 4 µl of 50 mM (+)-sodium L-ascorbate (Vitamin C, 200 nmol in H2O) at 25 °C for 2 h. The reactions were diluted in H2O for LC-ESI-MS analysis (Figure 9D) and polyacrylamide gel electrophoresis (Figure 14).
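The reagent excess used in the cycloaddition can be expressed as molar equivalents relative to each DNA strand. The short Python sketch below is illustrative only (the function and variable names are hypothetical); it uses the amounts stated in the paragraph above.

```python
DNA_PMOL = 200  # amount of each strand in the reaction

# Reagent amounts in nmol as stated for the cyclisation
reagents_nmol = {
    "TBTA": 150,
    "CuSO4": 100,
    "sodium ascorbate": 200,
}


def equivalents(reagent_nmol: float, dna_pmol: float) -> float:
    """Molar equivalents of a reagent relative to the DNA strand."""
    return reagent_nmol * 1000 / dna_pmol  # convert nmol to pmol first


for name, nmol in reagents_nmol.items():
    print(f"{name}: {equivalents(nmol, DNA_PMOL):.0f} equivalents")
```

With these numbers the ligand, copper and reductant are present in 750-, 500- and 1000-fold excess over each strand, with ascorbate in twofold excess over copper(II).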

2.7 Polyacrylamide electrophoresis

The cyclisation reaction was analysed on native polyacrylamide 20% TBE gels (1.0 mm, 15 wells) and on denaturing polyacrylamide 15% TBE-Urea gels (1.0 mm, 15 wells). A current of 120 mA with a voltage of 200 V was applied for 1 h on the electrophoresis box. The gels were then stained with SYBR Green I for 30 min and analysed by UV excitation (Figure 14).

2.8 Results

A cyclised dual-pharmacophore double-stranded oligonucleotide construct (Format 1) was generated from two hybridised single-stranded oligonucleotide conjugates (strand A1 and strand B1) by copper(I)-catalysed azide alkyne cycloaddition, as shown in Figure 8C. LC-ESI-MS was used to analyse the reactants and products. Single peaks were observed for strand A1 comprising the azide modified building block (Figure 9A) and strand B1 comprising the alkyne modified building block (Figure 9B) in the LC-ESI-MS trace. Double peaks were observed for the mixture of strands A1 and B1 without ring closure reagents (Figure 9C), whilst single peaks were observed for the mixture of strands A1 and B1 following ring closure (Figure 9D) in the LC-ESI-MS trace.

Example 3: Construction of a cyclised dual-pharmacophore oligonucleotide conjugate composed of a DNA-encoded azido compound and an alkynyl acid encoded by a complementary DNA strand (Figures 10 and 11)

Methods

The synthetic oligonucleotides used for the encoding ligation step are shown below. The synthetic oligonucleotides were stored as 100 µM stock solutions at -20 °C.

38-mer oligonucleotide (Elib4_Code2, SEQ ID NO: 4)

5’-CCTGCATCGAATGGATCCGTGAATTATTCGCAGCTGCG-3’

Molecular Weight = 11’958.53 Da; ε = 0.39620 µM⁻¹·cm⁻¹

21-mer oligonucleotide RNA adaptor (Elib4_aT, SEQ ID NO: 5)

5’-CGA-rG-5-Me-U-rC-CCATGGC-rG-rC-rA-rG-CTGC-3’

Molecular Weight = 6’520.17 Da; ε = 0.19057 µM⁻¹·cm⁻¹

3.1 Construction of 5’-amino-modified azido oligonucleotide (strand A1) and 3’-amino-modified, 5’-phosphorylated d-spacer alkynyl oligonucleotide (strand B1) conjugates

For the synthesis of strand A1, as described in Example 2.1, 13 µl of a solution containing 56 mM EDC and 148 mM s-NHS in 85% DMSO/15% H2O was added to 95 µl of a 10.5 mM solution of azidoacetic acid. After 30 min of activation at 30 °C, 30 µl of a solution of 667 µM of Elib2_Code1 oligonucleotide (SEQ ID NO: 3) in 66 mM MOPS, pH 8.0, was added. The reaction was stirred for 16 h at 30 °C. For the synthesis of strand B1, as described in Example 2.2, 50 µl of a solution of 1 mM of d-spacer oligonucleotide (SEQ ID NO: 2) in 250 mM borate, 12.5 µl of a 200 mM solution of 6-heptynoic acid, and 6.25 µl of a 400 mM solution of DMT-MM were stirred for 2 h at 25 °C.

The DNA-compounds (strands A1 and B1) were precipitated with cold absolute EtOH and re-dissolved in H2O before Nanodrop measurement and LC-ESI-MS analysis, as described in Examples 2.4 and 2.5 respectively, yielding the azido acid oligonucleotide conjugate (strand A1) and the alkynyl acid oligonucleotide conjugate (strand B1).

3.2 Encoding by ligation [oligonucleotide alkynyl conjugate strand B2, Figure 10A]

4.6 µl of 434 µM strand B1 (2 nmol in H2O), 62.5 µl of 48.3 µM coding oligonucleotide (Elib4_Code2, SEQ ID NO: 4, 3 nmol in H2O) and 106.7 µl of 30 µM RNA adaptor oligonucleotide (Elib4_aT, SEQ ID NO: 5) were vacuum-dried and subsequently dissolved in 89.5 µl H2O and 10 µl of 10x ligase buffer. The solution was mixed, heated up to 90 °C for 2 min and passively cooled down to 25 °C (hybridisation). Afterwards, 0.5 µl T4 ligase was added. Ligation was performed for 16 h at 16 °C. The ligase was inactivated for 10 min at 65 °C. The reaction mixture was purified by HPLC to obtain the encoded oligonucleotide alkynyl conjugate (strand B2). Strand B2 was vacuum-dried overnight and re-dissolved in H2O before LC-ESI-MS analysis.

3.3 Reversed-phase high-pressure liquid chromatography (RP-HPLC) of the oligonucleotide conjugates

The oligonucleotide alkynyl conjugate strand B2 was separated from the mixture of coding oligonucleotide Elib4_Code2, non-reacted strand B1 and Elib4_aT RNA adaptor by Agilent 1260 Infinity HPLC. A reverse-phase C18-Xterra© Prep RP column (112 Å, 5 µm, 10 x 150 mm) with organic/inorganic particles (silica and polymeric supports) was used as the stationary phase. As the mobile phase, an aqueous 100 mM triethylammonium acetate (TEAA) buffer A (pH = 7.0) was applied with an acetonitrile gradient (buffer B: 100 mM TEAA in 80% MeCN/20% H2O). After a signal corresponding to enzyme debris at the very beginning of the run (T = 25-30 °C, p = 100-170 bar, Abs = 260 nm), the Elib4_aT RNA adaptor eluted at 6-6.5 min and strand B1 at 8-9 min, followed by the Elib4_Code2 oligonucleotide at 9-11 min. The oligonucleotide alkynyl conjugate strand B2 showed a retention time of 12.5 min, with a peak ranging between 11.5 and 13.5 min. The purity of the fractions was verified by LC-ESI-MS analysis, as described in Example 2.5, and the recovered amount was estimated by Nanodrop measurement after lyophilisation, as described in Example 2.4.

3.4 Cyclisation of the strand A1 and strand B2 (Format 2, Figure 10B)

200 pmol oligonucleotide azido conjugate (strand A1) and 200 pmol oligonucleotide alkynyl conjugate (strand B2) were diluted in 2 µl of a 250 mM solution of borate buffer (0.5 µmol, pH = 9.4) and subsequently incubated with 3 µl of 50 mM TBTA (150 nmol in DMSO), 2 µl of 50 mM copper(II) sulfate pentahydrate (CuSO4·5H2O, 100 nmol in H2O), and 4 µl of 50 mM (+)-sodium L-ascorbate (Vitamin C, 200 nmol in H2O) for 2 h at 25 °C. The DNA-compound was diluted in H2O for Nanodrop measurement and LC-ESI-MS analysis, as described in Examples 2.4 and 2.5 respectively (Figure 11), and for polyacrylamide gel electrophoresis, as described in Example 2.7 (Figure 14).

3.5 Results

Two single-stranded DNA conjugates (strand A1 and strand B2) were synthesized. The oligonucleotide section of strand B1 was extended by ligation to yield the encoded oligonucleotide alkynyl conjugate strand B2, as shown in Figure 10A.

A cyclised dual-pharmacophore double-stranded oligonucleotide construct (Format 2) was generated from two hybridised single-stranded oligonucleotide conjugates (strand A1 and strand B2) by copper(I)-catalysed azide alkyne cycloaddition (Figure 10B). LC-ESI-MS was used to analyse the reactants and products. Single peaks were observed for strand A1 comprising the azide modified building block (Figure 11A) and strand B2 comprising the alkyne modified building block (Figure 11B) in the LC-ESI-MS trace. Double peaks were observed for the mixture of strands A1 and B2 without ring closure reagents (Figure 11C), whilst the mixture of strands A1 and B2 following ring closure (Figure 11D) elicited one large peak in the LC-ESI-MS trace for the conjugate containing strands A1 and B2 and minor peaks for the unreacted strands.

Example 4: Construction of a third cyclised dual-pharmacophore DNA duplex composed of a DNA-encoded azido compound and an alkynyl acid encoded by a complementary DNA strand (Format 3, Figures 1A, 12 to 14)

4.1 Construction of 5’-amino-modified oligonucleotide azido acid (strand A1) and 3’-amino-modified alkynyl acid d-spacer encoded oligonucleotide (strand B2) conjugates

For the synthesis of strand A1, 13 µl of a solution containing 56 mM EDC and 148 mM s-NHS in 85% DMSO/15% H2O was added to 95 µl of a 10.5 mM solution of azidoacetic acid. After 30 min of activation at 30 °C, 30 µl of a solution of 667 µM of Elib2_Code1 oligonucleotide (SEQ ID NO: 3) in 66 mM MOPS, pH 8.0, was added. The reaction was stirred for 16 h at 30 °C. Strand A1 was precipitated with cold absolute EtOH and re-dissolved in H2O before LC-ESI-MS analysis, as described in Example 2.5, to obtain the azide acid oligonucleotide conjugate (strand A1).

Strand B2 was obtained from strand B1 by extension in a ligation step (Figure 10A). To construct strand B1, 50 µl of a solution of 1 mM of d-spacer oligonucleotide (SEQ ID NO: 2) in 250 mM borate, 12.5 µl of a 200 mM solution of 6-heptynoic acid, and 6.25 µl of a 400 mM solution of DMT-MM were stirred for 2 h at 25 °C. The DNA-compound was precipitated with cold absolute EtOH and re-dissolved in H2O before Nanodrop measurement and LC-ESI-MS analysis, as described in Examples 2.4 and 2.5 respectively, to obtain the oligonucleotide alkynyl conjugate (strand B1). Strand B1 was encoded by ligation: 4.6 µl of 434 µM oligonucleotide alkynyl conjugate strand B1 (2 nmol), 62.5 µl of 48.3 µM coding oligonucleotide Elib4_Code2 (SEQ ID NO: 4) (3 nmol) and 106.7 µl of 30 µM RNA adaptor oligonucleotide (SEQ ID NO: 5) were vacuum-dried and re-dissolved in 89.5 µl H2O and 10 µl of 10x ligase buffer. The mixture was heated up to 90 °C for 2 min and passively cooled down to 25 °C (hybridisation). Afterwards, 0.5 µl T4 ligase was added. Ligation was performed for 16 h at 16 °C. The ligase was inactivated for 10 min at 65 °C. The DNA-compound was purified by HPLC, as described in Example 3.3, quantified by Nanodrop measurement, as described in Example 2.4, and analysed by LC-ESI-MS, as described in Example 2.5, to obtain the encoded oligonucleotide alkynyl conjugate (strand B2).

4.2 Construction of strand A2 by Klenow fill-in reaction (Figure 12A)

2.9 µl of 345.7 µM oligonucleotide azido conjugate (strand A1, 1 nmol in H2O), 46.5 µl of 21.5 µM encoded oligonucleotide alkynyl conjugate (strand B2, 1 nmol in H2O), 10 µl of NEB 2 10x Klenow buffer and 15.6 µl H2O were heated up at 90 °C and passively cooled down to 25 °C (hybridisation). Afterwards, 20 µl of 5 mM deoxyribose nucleosidic triphosphates (dNTPs, 100 nmol in H2O) and 5 µl DNA polymerase were added. Klenow fill-in was performed for 2 h at 25 °C. The reaction solution was cleaned up by cartridge purification, giving the DNA duplex composed of strand B2 and its complementary extended oligonucleotide (strand A2). The completion of the reaction was estimated by electrophoresis with the QIAxcel Advanced Instrument (see below for the clean-up purification and the automated gel electrophoresis procedure).
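The composition of the fill-in reaction can be checked with the dilution rule C_final = C_stock · V_added / V_total. The following Python sketch is illustrative only (the names are hypothetical); it reads the listed volumes as microlitres and the strand concentrations as micromolar.

```python
# Volumes in µl as listed for the Klenow fill-in reaction
components_ul = {
    "strand A1 (345.7 µM)": 2.9,
    "strand B2 (21.5 µM)": 46.5,
    "10x Klenow buffer": 10.0,
    "H2O": 15.6,
    "dNTPs (5 mM)": 20.0,
    "DNA polymerase": 5.0,
}

total_ul = sum(components_ul.values())


def final_conc(stock_conc: float, volume_ul: float, total: float) -> float:
    """Dilution rule C_final = C_stock * V_added / V_total (units follow the stock)."""
    return stock_conc * volume_ul / total


print(f"total volume: {total_ul:.1f} µl")
print(f"buffer: {final_conc(10, 10.0, total_ul):.2f}x")
print(f"dNTPs: {final_conc(5, 20.0, total_ul):.2f} mM")
print(f"strand A1: {final_conc(345.7, 2.9, total_ul):.1f} µM")
print(f"strand B2: {final_conc(21.5, 46.5, total_ul):.1f} µM")
```

Under these readings the components add up to a 100 µl reaction with 1x buffer, 1 mM dNTPs and about 10 µM of each strand, consistent with the 1 nmol amounts stated.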

4.3 Cartridge clean up purification of oligonucleotide duplex

The DNA duplex built up by the Klenow fill-in reaction was separated from the remaining DNA single strands and enzyme debris. A spin column for DNA (silica membrane, 800 µl loading capacity, QIAquick® PCR Purification Kit) was used to bind DNA oligonucleotides of at least 100 base pairs (bp) to the silica membrane by mixing 1 volume of the Klenow reaction mixture with 5 volumes of binding buffer (furnished with the PCR purification kit). The spin column was centrifuged (25 °C, 13’000 rpm, 1 min) to filter the solution. After the flow-through was discarded, 750 µl of washing buffer (furnished with the PCR purification kit) was added to the spin column before another centrifugation (25 °C, 13’000 rpm, 1 min). After the flow-through was discarded again, 50 µl of H2O was added to the spin column. The liquid was left to diffuse for 1 min before a last centrifugation (25 °C, 13’000 rpm, 90 s), which recovered the cleaned-up, eluted DNA duplex.

4.4 QIAxcel electrophoresis

The Klenow fill-in reaction was analysed by electrophoresis automated by QIAxcel Advanced Instrument.

The samples were loaded into the appropriate analysis tubes (row of 12 tubes). The unused tubes were filled with Dilution buffer (included with the QIAxcel Advanced Instrument or provided by QIAGEN). The resulting gel was computed by the QIAGEN software, before LC-ESI-MS analysis and Nanodrop measurement.

4.5 Cyclisation of the strand A2 and strand B2 (Format 3, Figure 12B)

200 pmol of DNA duplex composed of strand A2 and strand B2 were dissolved in 2 µl of 250 mM borate buffer (0.5 µmol, pH = 9.4) and subsequently incubated with 3 µl of 50 mM TBTA (150 nmol in DMSO), 2 µl of 50 mM copper(II) sulfate pentahydrate (CuSO4·5H2O, 100 nmol in H2O), and 4 µl of 50 mM (+)-sodium L-ascorbate (Vitamin C, 200 nmol in H2O) for 2 h at 25 °C. The DNA-compound was diluted in H2O for Nanodrop measurement and LC-ESI-MS analysis, as described in Examples 2.4 and 2.5 respectively (Figure 13), and for polyacrylamide gel electrophoresis, as described in Example 2.7.

4.6 Results

Two single-stranded DNA conjugates (strand A2 and strand B2) were synthesized. The oligonucleotide section of strand A1 was extended by Klenow fill-in polymerisation to yield strand A2 (Figure 12A). A cyclised dual-pharmacophore double-stranded oligonucleotide construct (Format 3) was generated from two hybridised single-stranded oligonucleotide conjugates (strand A2 and strand B2) by copper(I)-catalysed azide alkyne cycloaddition (Figure 12B).

LC-ESI-MS was used to analyse the reactants and products after the reaction depicted in Figure 12B.

Single peaks were observed for strand B2 comprising the alkyne modified building block (Figure 13A) in the LC-ESI-MS trace. Double peaks were observed for the mixture of strands A2 and B2 without ring closure reagents (Figure 13B), whilst the mixture of strands A2 and B2 following ring closure (Figure 13C) elicited one large peak in the LC-ESI-MS trace for the conjugate containing strands A2 and B2 and minor peaks for the unreacted A2 and B2 strands. Ring closure by copper(I)-catalysed azide alkyne cycloaddition in the different formats was assessed by denaturing electrophoresis. Bands corresponding to conjugates formed by ring closure were observed (Figure 14).

Example 5: Construction of a sub-library of oligonucleotide-compound conjugates displaying two chemical Building Blocks (2BB) using 5'-aminomodified oligonucleotides (Figure 15)

Methods

Oligonucleotides carrying a 5' primary amino group and an individual encoding sequence were coupled to trifunctional carboxylic acid scaffolds, referred to as Building Block 1 (BB1), which contained an N-Fmoc-protected amino group and an azide group.

The oligonucleotides were stored as 100 µM stock solutions at -20 °C. The sequences of the oligonucleotides used for the encoding ligation step are shown below. Each BB was tagged with a custom oligonucleotide carrying an encoding region, with unique nucleotides defined as X, and a non-coding region.

5’-amino-modified, 45-mer oligonucleotide (Elib5_Code1, SEQ ID NO: 6)

5’-GGAGCTTCTGAATTCTGTGTGCTGXXXXXXCGAGTCCCATGGCGC-3’

Average Molecular Weight = 13’880 Da; Average ε = 0.46100 µM⁻¹·cm⁻¹

5’-phosphorylated, 29-mer oligonucleotide (Elib5_Code2, SEQ ID NO: 7)

5’-CGGATCGACGXXXXXXXGCGTCAGGCAGC-3’

Average Molecular Weight = 8’945 Da; Average ε = 0.30900 µM⁻¹·cm⁻¹

25-mer oligonucleotide DNA adaptor (Elib5_aH, SEQ ID NO: 8)

5’-CGTCGATCCGGCGCCATGGGACTGG-3’

Molecular Weight = 7’059.98 Da; ε = 0.25075 µM⁻¹·cm⁻¹

5.1 Split and Pool Combinatorial Strategy

The split-and-pool strategy allows the construction of large DNA-encoded chemical libraries in a combinatorial fashion. Each BB1 chemical moiety was specifically coupled to a custom oligonucleotide (Elib5_Code1, SEQ ID NO: 6) containing a distinct encoded sequence X. 14 reactions were run in parallel, giving 14 different oligonucleotide conjugates (strands C1). The strands C1 were individually purified, normalised and mixed, yielding a pool of 14 custom oligonucleotides displaying a BB1 (strands C1 sub-pool) (Figure 15A).

The pool of strands C1 was aliquoted into 293 reaction vessels before the splint ligation of the second code (Elib5_Code2, SEQ ID NO: 7), each vessel receiving a distinct encoded sequence X, yielding 293 Elib5 sub-pools (Figure 15B). The 293 Elib5 sub-pools were coupled with 293 carboxylic acids, referred to as building blocks 2 (BB2), yielding the strands C2. The strands C2 were mixed and purified by HPLC, yielding the strands C2 sub-library (Figure 15C).
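The combinatorial gain of the split-and-pool scheme can be illustrated by enumerating the building-block combinations. The Python sketch below is illustrative only: the identifiers such as BB1-01 are placeholders, since the actual building blocks and code sequences are not listed here.

```python
from itertools import product

N_BB1 = 14   # tri-functional scaffolds, each with its own Elib5_Code1 sequence
N_BB2 = 293  # carboxylic acids, each with its own Elib5_Code2 sequence

# Placeholder identifiers standing in for (building block, code) pairs
bb1_ids = [f"BB1-{i:02d}" for i in range(1, N_BB1 + 1)]
bb2_ids = [f"BB2-{j:03d}" for j in range(1, N_BB2 + 1)]

# Every pooled strand C1 meets every BB2 in one of the 293 vessels
library = list(product(bb1_ids, bb2_ids))

coupling_reactions = N_BB1 + N_BB2  # reactions actually run in parallel
print(f"{len(library)} library members from {coupling_reactions} coupling reactions")
```

The sketch shows that 14 + 293 = 307 coupling reactions suffice to produce 14 × 293 = 4102 encoded combinations.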

5.2 DNA-conjugation of tri-functional carboxylic acid scaffolds as first chemical moiety (BB1) [strands C1 Pool (Figure 15A)]

To activate the carboxylic acid, 40 µl of a solution containing 333 mM s-NHS and 100 mM EDC in 85% DMSO/15% H2O was added to 191 µl of a 200 mM solution of the carboxylic acid in DMSO. After 30 min at 30 °C, 60 µl of a solution of 4 mM oligonucleotide (Elib5_Code1) in 420 mM TEA/HCl, pH 10, was added. All reactions were stirred for 12 h at 37 °C.

Purifications were performed by HPLC on a reverse-phase C18-Xterra© Prep RP column (112 Å, 5 µm, 10 x 150 mm) with organic/inorganic particles (silica and polymeric supports) as the stationary phase. As the mobile phase, an aqueous 100 mM triethylammonium acetate (TEAA) buffer A (pH = 7.0) was applied with an acetonitrile gradient (buffer B: 100 mM TEAA in 80% MeCN/20% H2O). The desired samples were redissolved in 100 µl of H2O. An amount of 1 µl was analysed by LC-ESI-MS as described in section 2.5. The samples containing the strand C1 were merged, precipitated by adding 10% v/v of 5 M NaCl and 2.5-3 volumes of cold absolute EtOH, and vacuum dried. Each strand C1 was then re-dissolved in 100 µl of H2O, and the recovered amounts were determined by Nanodrop, as described in Example 2.4. Equimolar amounts of strands C1 were then mixed together to generate the desired strands C1 Pool, displaying a chemical moiety (BB1).

The strands C1 Pool was then split into 10 tubes and dried under vacuum. The Fmoc group was removed by adding 5 µl triethylamine to solutions of 4 mM encoded compounds and stirring for 6 h at 37 °C. The DNA-compounds were ethanol precipitated and vacuum-dried, and the amounts were estimated by Nanodrop as described in Example 2.4, yielding the strands C1 Pool.

5.3 Encoding (Elib5_Code2) of the oligonucleotide-compound conjugates by ligation [Elib5 sub-pools (Figure 15B)]

20 µl of 33 µM strands C1 Pool (1.23 nmol), 10 µl of 100 µM Elib5_Code2 oligonucleotide (1.6 nmol) (SEQ ID NO: 7), 2 µl of 102 µM DNA adaptor oligonucleotide (Elib5_aH, SEQ ID NO: 8), 20 µl of 10x ligase buffer and 144.8 µl H2O were mixed and heated up to 90 °C for 2 min. Then the mixture was passively cooled down to RT (hybridisation). Afterwards, 3.2 µl of T4 ligase was added. Ligation was performed for 12 h at 16 °C. The ligase was inactivated for 15 min at 70 °C. The DNA-compounds were precipitated and vacuum-dried and the recovered amounts were determined by Nanodrop, as described in Example 2.4, yielding Elib5 sub-pools.

5.4 DNA-conjugation of carboxylic acids as second chemical moiety (BB2) [strands C2 sub-library (Figure 15C)]

To activate the carboxylic acid (BB2), 44.5 µl of a solution containing 208 mM DMT-MM in 65% DMSO/35% H2O was added to 12.5 µl of a 200 mM solution of the carboxylic acid in DMSO. After 15 min at 25 °C, the activated carboxylic acid solution was added to 72 µl of a solution of ligated oligonucleotide-compound conjugates (Elib5 sub-pool) in 100 mM MOPS, pH 8.0, 1 M NaCl. All reactions were stirred for 12 h at 25 °C. The reactions were quenched by adding 25 µl of 500 mM NH4OAc and stirred for an additional 30 min at 37 °C. The ligated oligonucleotide-compound conjugates carrying the BB1 and BB2 chemical moieties were precipitated and vacuum-dried as described above. All the ligated oligonucleotide-compound conjugates were then mixed together and vacuum-dried as described above, giving the sub-library pool (strands C2 sub-library). Purification of the strands C2 sub-library was performed by HPLC as described in Example 3.3 to generate the desired sub-library.

5.5 Results

A strands C2 sub-library of 4’102 members was synthesized from 14 tri-functional carboxylic acid scaffolds as chemical moiety BB1 and 293 carboxylic acids as chemical moiety BB2, assembled in combinatorial fashion.

Example 6: Construction of a cyclised dual-pharmacophore oligonucleotide library composed of a trifunctional DNA-encoded azido sub-library (strands C2 sub-library) and an alkynyl acid encoded by a complementary DNA strand (strand D2) (Format 4, Figures 16 to 19)

Methods

The sequences of the synthetic oligonucleotides used for the encoding ligation step are shown below. The synthetic oligonucleotides were stored as 100 µM stock solutions at -20 °C.

65-mer oligonucleotide (Elib6_Code3, SEQ ID NO: 9)

5’-GCTCTGCACGGTCGCCTGAGATGTAGGATCACGCTGCCTGACGCdddddddCGTCGATCCGGCGC-3’

Molecular Weight = 19’089.21 Da; ε = 0.58309 µM⁻¹·cm⁻¹

14-mer oligonucleotide DNA adaptor (Elib6_aH_14mer, SEQ ID NO: 10)

5’-TCCCATGGCGCCGG-3’

Molecular Weight = 4’240.78 Da; ε = 0.13435 µM⁻¹·cm⁻¹

6.1 Construction of 3’-amino-modified, 5’-phosphorylated alkynyl acid d-spacer oligonucleotide conjugate [strand D1 (Figure 16A)]

10 nmol of amino-modified d-spacer oligonucleotide (SEQ ID NO: 2) were dissolved in 10 µl of 250 mM borate buffer (2.5 µmol, pH = 9.4) with 2 µl of 200 mM 5-hexynoic acid (400 nmol in dry DMSO) and 1 µl of 400 mM DMT-MM (400 nmol in H2O) and reacted for 2 h at 25 °C. The DNA-compound was precipitated with cold absolute EtOH and re-dissolved in H2O before LC-ESI-MS analysis, as described in Example 2.5, and Nanodrop measurement, as described in Example 2.4, yielding the strand D1.

6.2 Encoding by ligation [strand D2 (Figure 16B)]

8.1 µl of 124 µM of strand D1 (1 nmol in H2O), 16.1 µl of 93.5 µM coding oligonucleotide (Elib6_Code3, SEQ ID NO: 9, 1.5 nmol in H2O), 1.8 µl of 868 µM DNA adaptor oligonucleotide (Elib6_aH_14mer, SEQ ID NO: 10), 60.8 µl H2O and 10 µl of 10x ligase buffer were mixed, heated up to 90 °C for 2 min and passively cooled down to 25 °C (hybridisation). Afterwards, 3.2 µl of T4 ligase was added. Ligation was performed for 16 h at 16 °C. The ligase was inactivated for 10 min at 65 °C. The DNA-compound was purified by HPLC, as described in Example 3.3, quantified by Nanodrop, as described in Example 2.4, and analysed by LC-ESI-MS, as described in Example 2.5, to obtain the encoded oligonucleotide alkynyl conjugate (strand D2).

6.3 Cyclisation of the strands C2 sub-library and strand D2 (Format 4, Figures 17 to 20)

500 pmol of oligonucleotide azido conjugates sub-library (strands C2), described in Example 5, and 500 pmol encoded oligonucleotide alkynyl conjugate (strand D2) were diluted in 1 µl of 250 mM borate buffer (250 nmol, pH = 9.4) and subsequently incubated with 1 µl of 50 mM TBTA (50 nmol in DMSO), 1 µl of 50 mM copper(II) sulfate pentahydrate (CuSO4·5H2O, 50 nmol in H2O), and 2 µl of 50 mM (+)-sodium L-ascorbate (Vitamin C, 100 nmol in H2O) at 25 °C for 2 h. The reaction was diluted in H2O and analysed by LC-ESI-MS at 60 and 80 °C, as described in Example 2.5 (Figures 18C, 18D, 19A and 19B), and by polyacrylamide gel electrophoresis, as described in Example 2.7 (Figure 20).

6.4 Results

A cyclised dual-pharmacophore double-stranded library was generated from a single-stranded azido sub-library (strands C2 pool) and a single-stranded alkyne conjugate (strand D2) by copper(I)-catalysed azide alkyne cycloaddition.

LC-ESI-MS was used to analyse the reactants and products at 60 °C after the reaction depicted in Figure 17. A peak was observed for the strands C2 sub-library (Figure 18A) and for the strand D2 (Figure 18B) in the LC-ESI-MS trace. The mixture of the strands C2 sub-library and strand D2 without the ring closure reaction (Figure 18C) elicited a peak (8.911). This peak was shifted for the strands C2 sub-library and strand D2 mixture after the ring closure reaction (7.883) in the LC-ESI-MS trace (Figure 18D).

LC-ESI-MS was also used to analyse the reactants and products at 80 °C after the reaction depicted in Figure 17. The mixture of strands C2 sub-library and D2 without the ring closure reaction elicited a peak (7.467) (Figure 19A). This peak was shifted for the strands C2 sub-library and strand D2 mixture after the ring closure reaction (7.206) (Figure 19B).

Ring closure of the Format 4 conjugate by copper(I)-catalysed azide alkyne cycloaddition was also assessed by denaturing electrophoresis and was observed (Figure 20, lane D).

Example 7: Construction of a cyclised dual-pharmacophore oligonucleotide final library format composed of a filled-in tri-functional DNA-encoded azido sub-library (strands C3 sub-library) and an alkynyl acid encoded by a complementary DNA strand (strand D2) (Format 5, Figures 1E, 21 and 22)

Methods

7.1 Construction of strands C3 sub-library by Klenow fill-in reaction (Figure 21A)

After construction of a tri-functional DNA-encoded azido sub-library (strands C2 sub-library), as described in Example 5, and an encoded oligonucleotide alkynyl conjugate (strand D2), as described in Example 6, 1.7 µl of 58.3 µM of strands C2 sub-library (100 pmol in H2O), 2.0 µl of 49 µM of strand D2 (100 pmol), 10 µl of NEB 2 10x Klenow buffer and 80.2 µl H2O were mixed, heated up at 90 °C and passively cooled down to 25 °C (hybridisation). Afterwards, 4 µl of 5 mM dNTPs (20 nmol in H2O) and 2 µl of DNA polymerase were added. Klenow fill-in was performed for 2 h at 25 °C. The reaction solution was cleaned up by cartridge purification as described in Example 4.3, giving the DNA duplex composed of strand D2 and its complementary extended oligonucleotides library (strands C3 sub-library). This reaction was performed in 5 batches, for a total amount of 500 pmol of each DNA strand. The completion of the reaction was estimated by electrophoresis with the QIAxcel Advanced Instrument and the product was then cleaned up by cartridge purification; both procedures are described in Examples 4.3 and 4.4.

7.2 Cyclisation of the strands C3 sub-library and strand D2 (Format 5, Figure 21B)

200 pmol of DNA duplex composed of strands C3 sub-library and strand D2 were dissolved in 0.4 µl of 250 mM borate buffer (0.1 µmol, pH = 9.4) and subsequently incubated with 0.6 µl of 50 mM TBTA (30 nmol in DMSO), 0.6 µl of 50 mM copper(II) sulfate pentahydrate (CuSO4·5H2O, 30 nmol in H2O), and 0.4 µl of 50 mM (+)-sodium L-ascorbate (Vitamin C, 20 nmol in H2O) for 2 h at 25 °C. The DNA-compound was diluted in H2O for Nanodrop measurement, LC-ESI-MS analysis and polyacrylamide gel electrophoresis, as described in Examples 2.4, 2.5 and 2.7 respectively.

7.3 Results

A cyclised dual-pharmacophore double-stranded library was generated from a single-stranded azido sub-library extended by Klenow fill-in (strands C3 sub-library) and a single-stranded alkyne conjugate (strand D2) by copper(I)-catalysed azide alkyne cycloaddition, as shown in Figure 21.

LC-ESI-MS was used to analyse the reactants and products after the reaction depicted in Figure 21B. Following the ring closure reaction, the mixture of the strands C2 sub-library and strand D2 (Format 5 conjugate) was found to elicit a peak in the LC-ESI-MS trace (Figure 22C).

Example 8: Construction of a cyclised dual-pharmacophore oligonucleotide library format composed of a trifunctional DNA-encoded azide scaffold sub-library (strands C3 sub-library) and an alkynyl acid encoded by a complementary DNA sub-library [strands F2 sub-library (Figures 23 to 33)].

Methods

The sequences of the synthetic oligonucleotides used for the encoding ligation step are shown below. The synthetic oligonucleotides were stored as 100 µM stock solutions at -20 °C.

34-mer oligonucleotide (YL_Code3, SEQ ID NO: 11)

5’-GCTCTGCACGGTCGCCTGAGATGCTGCCTGACGC-3’

Molecular Weight = 10’411.8 Da; ε = 0.33636 µM⁻¹·cm⁻¹

65-mer oligonucleotide (Elib6_Code3#1-n, SEQ ID NO: 12; 10 different codes were used, so n=10)

5’-GCTCTGCACGGTCGCXXXXXXXGTAGGATCACGCTGCCTGACGCdddddddCGTCGATCCGGCGC-3’

Average Molecular Weight = 19’037.39 Da; Average ε = 0.58307 µM⁻¹·cm⁻¹

51-mer oligonucleotide (Primer 1aXX, SEQ ID NO: 13)

5’-TACACGACGCTCTTCCGATCTXXXXXXGGAGCTTCTGAATTCTGTGTGCTG-3’

42-mer oligonucleotide (Primer 1b-a, SEQ ID NO: 14)

5’-CAGACGTGTGCTCTTCCGATCCGATATGCTCTGCACGGTCGC-3’

8.1 Construction of a double-stranded single-pharmacophore library (Format 6A, Figure 23) After construction of a tri-functional DNA-encoded azido sub-library (strands C2 sub-library), as described in Example 5, 6.9 pi of 29.17 pM of strands C2 sub-library (200 pmol in H2O), 0.7 pi of 287.7 pM of YL_Code3 (SEQ ID NO: 11 , 200 pmol), 5 pi of NEB 2 10x Klenow buffer and 31.4 pi H2O were mixed, heated up at 90 °C and passively cooled down to 25 °C (hybridisation). Afterwards, 4 pi of 5 mM dNTPs (20 nmol in H2O) and 2 pi of DNA polymerase were added. Klenow fill-in was performed for 2 h at 25 °C. The reaction solution was cleaned up by cartridge purification as described in Example 4.3, giving the single-pharmacophore DNA duplex (Format 6A). The completion of the reaction was estimated by electrophoresis with QIAxcel Advanced Instrument and then cleaned up by cartridge purification, both procedures being described in Examples 4.3 and 4.4.

8.2 Construction of 3’ -amino-modified, 5’-phosphorylated alkynyl acid d-spacer oligonucleotide conjugates [strands F1 (Figure 24)]

For most of the acid alkynes (seven out of ten), 20 nmol of amino-modified d-spacer oligonucleotide (SEQ ID NO: 2), were dissolved in 20 pi of 250 mM borate buffer (5 pmol, pH = 9.4) with 5 pi of 200 mM acid alkyne (1 pmol in dry DMSO) and 5 pi of 200 mM DMT-MM (1 pmol in H2O) and reacted for 2 h at 25 °C. The DNA- Compound was precipitated with cold absolute EtOH and re-dissolved in H2O before LC-ESI-MS analysis, as described in Example 2.5, yielding most of the strands F1 (Figure 24A).

For some other acid alkynes (two out of ten), 270 pi of a solution containing 56 mM EDC and 148 mM s-NHS in 85% DMSO/15% H2O was added to 237.5 pi of a 10.5 mM solution of acid alkyne in DMSO. After 30 min of activation at 30 °C, 75 pi of a solution of 267 pM of d-spacer oligonucleotide (SEQ ID NO: 2) in 66 mM MOPS, pH 8.0, was added. The reaction was stirred for 16 h at 30 °C. The strands F1 were precipitated with cold absolute EtOH and re-dissolved in H2O before LC-ESI-MS analysis, as described in Example 2.5, yielding some other strands F1 (Figure 24B).

For the last acid alkyne, 12.5 µl of a solution containing 200 mM DMT-MM in H2O was added to 44.5 µl of a 56 mM solution of acid alkyne in DMSO. After 30 min of activation at 25 °C, 72 µl of a 278 µM solution of d-spacer oligonucleotide (SEQ ID NO: 2) in 49 mM MOPS, pH 8.0, was added. The reaction was stirred for 16 h at 25 °C. The strand F1 was precipitated with cold absolute EtOH and re-dissolved in H2O before LC-ESI-MS analysis, as described in Example 2.5, yielding the last strand F1 (Figure 24C).

The strands F1 were individually purified by HPLC, as described in Example 3.3, vacuum-dried and redissolved in H2O before LC-ESI-MS analysis, as described in Example 2.5, and Nanodrop measurement, as described in Example 2.4.

8.3 Encoding by ligation [strands F2 sub-library (Figure 25A)]

20 µl of 100 µM strands F1 (2 nmol in H2O), 15 µl of 200 µM coding oligonucleotide (Elib6_Code3#1-n (SEQ ID NO: 12), 3.0 nmol in H2O), 5.35 µl of 598 µM DNA adapter oligonucleotide (Elib6_aH_14mer, SEQ ID NO: 10), 4.15 µl of H2O and 5 µl of 10x ligase buffer were mixed, heated to 90 °C for 2 min and passively cooled down to 25 °C (hybridisation). Afterwards, 0.5 µl of T4 ligase was added. Ligation was performed for 16 h at 16 °C. The ligase was inactivated for 10 min at 65 °C. The DNA-compound was purified by HPLC, as described in Example 3.3, quantified by Nanodrop, as described in Example 2.4, and analysed by LC-ESI-MS, as described in Example 2.5, to obtain encoded oligonucleotide alkynyl conjugates (strands F2). Equimolar amounts of strands F2 were mixed together to generate the desired strands F2 sub-library, as described in Example 5.1.

8.4 Construction of strands C3B by Klenow fill-in reaction (Format 6B, Figures 25B to 25E)

6.86 µl of 29.17 µM strands C2 sub-library (200 pmol in H2O), 20 µl of 10 µM strands F2 sub-library (200 pmol in H2O), 5 µl of NEB 2 10x Klenow buffer and 12.4 µl of H2O were heated to 90 °C and passively cooled down to 25 °C (hybridisation). Afterwards, 4 µl of 5 mM deoxyribonucleoside triphosphates (dNTPs, 20 nmol in H2O) and 2 µl of DNA polymerase were added. Klenow fill-in was performed for 1 h at 25 °C. The completion of the reaction was estimated by electrophoresis with a QIAxcel Advanced Instrument, and the reaction solution was cleaned up by cartridge purification, giving the dual-pharmacophore DNA duplex (Format 6B) composed of strands F2 sub-library and strands C3B sub-library. The cartridge purification and electrophoresis procedures are described in Examples 4.3 and 4.4, respectively.

8.5 Cyclisation of the strands C3B and strands F2 sub-libraries (Format 6C, Figures 26 and 27)

200 pmol of DNA duplex composed of strands C3B sub-library and strands F2 sub-library were dissolved in 2 µl of 250 mM borate buffer (0.5 µmol, pH = 9.4) and subsequently incubated with 3 µl of 50 mM TBTA (150 nmol in DMSO), 2 µl of 50 mM copper(II) sulfate pentahydrate (CuSO4·5H2O, 100 nmol in H2O), and 4 µl of 50 mM (+)-sodium L-ascorbate (Vitamin C; 200 nmol in H2O) for 2 h at 25 °C. The DNA-compound was diluted in H2O and cleaned up by cartridge purification, as described in Example 4.3. Format 6C was quantified by Nanodrop measurement, as described in Example 2.4, and analysed by LC-ESI-MS and polyacrylamide gel electrophoresis, as described in Examples 2.5 and 2.7, respectively.
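To put the cyclisation conditions in perspective, the reagent amounts can be expressed as molar equivalents relative to the 200 pmol of DNA duplex. The following is a minimal sketch that assumes reagent volumes in microlitres and 50 mM stock solutions, as quoted above; the variable names are illustrative.

```python
DNA_PMOL = 200.0  # amount of DNA duplex used in the cyclisation

# µl × mM = nmol for each reagent stock (all stocks are 50 mM)
reagents_nmol = {
    "TBTA": 3 * 50,
    "CuSO4": 2 * 50,
    "sodium ascorbate": 4 * 50,
}

# Convert nmol to pmol, then divide by the amount of DNA
equivalents = {
    name: nmol * 1000 / DNA_PMOL for name, nmol in reagents_nmol.items()
}

for name, eq in equivalents.items():
    print(f"{name}: {reagents_nmol[name]} nmol = {eq:.0f} equivalents")
```

On these figures the copper source, ligand and reductant are each present in a several-hundred-fold excess over the DNA.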

8.6 Affinity screening of three DNA-encoded chemical libraries (Formats 6A, 6B and 6C) against Carbonic Anhydrase IX (CAIX)

Affinity selections were performed using a Thermo Scientific KingFisher magnetic particle processor. In the case of biotinylated CAIX, Streptavidin-coated magnetic beads (0.01 mg) were re-suspended in 100 µl of PBS [Phosphate-Buffered Saline in H2O, pH = 7.4, 1.1 mM KH2PO4 (potassium phosphate monobasic), 155.2 mM NaCl, 3.0 mM Na2HPO4·7H2O (sodium phosphate dibasic)] and subsequently incubated with 100 µl of 1.0 µM biotinylated CAIX for 30 min with continuous gentle mixing. CAIX-coated beads were washed two times with 200 µl of PBST (Phosphate-Buffered Saline with Tween-20, 1x, pH = 7.4, 1.1 mM KH2PO4, 155.2 mM NaCl, 3.0 mM Na2HPO4·7H2O, 0.05% (v/v) Tween-20, in H2O) supplemented with 100 µM biotin in order to block the remaining binding sites on Streptavidin, and one time with 200 µl of PBST without biotin, and were subsequently incubated with 100 µl of the DNA-encoded chemical library (6.8 nM of Format 6A or 68 nM of Formats 6B and 6C in PBST) for 1 h with continuous gentle mixing. After removing unbound library members by washing five times with 200 µl of PBST, beads carrying bound library members were re-suspended in 100 µl of Tris buffer (trisaminomethane, 10 mM, pH = 8.5) and the DNA-compound conjugates were separated from the beads by heat denaturation of Streptavidin and CAIX at 95 °C for 5 min.

In the case of his-tagged CAIX, cobalt magnetic beads (0.01 mg) were re-suspended in 100 µl of PBST (300) (pH = 7.4, 1.06 mM KH2PO4, 310.17 mM NaCl, 2.97 mM Na2HPO4·7H2O, 0.01% (v/v) Tween-20, in H2O) and subsequently incubated with 100 µl of 2.5 µM his-tagged CAIX for 30 min with continuous gentle mixing. CAIX-coated beads were washed three times with 200 µl of PBST (300) and subsequently incubated with 100 µl of the DNA-encoded chemical library (6.8 nM of Format 6A or 68 nM of Formats 6B and 6C in PBST (300)) for 1 h with continuous gentle mixing. After removing unbound library members by washing five times with 200 µl of PBST (300), beads carrying bound library members were re-suspended in 100 µl of Tris buffer (trisaminomethane, 10 mM, pH = 8.5), giving a solution of binders (Template).

8.7 Polymerase Chain Reactions 1 and 2 (PCR1 and PCR2) of the binders (Figure 28)

PCR1 is performed first in order to encode the affinity selection experiments. A mixture composed of 25 µl of Phusion MasterMix, 3 µl of 10 µM Primer 1 b-a (SEQ ID NO: 14) and 14 µl of H2O was added to a mixture of 5 µl of Template and 3 µl of 10 µM Primer 1 aXX specific to the template (SEQ ID NO: 13). The mixtures were first incubated at 98 °C for 1 min and subsequently heated to 98 °C for 10 s (denaturation of the Template DNA duplex) and cooled down to 72 °C for 15 s (annealing of the Primers and elongation of the DNA strands). This cycle was repeated 35 times before a final incubation at 72 °C for 5 min, yielding the PCR1 DNA constructs (PCR#1). The completion of the PCR reactions was estimated by QIAxcel automated electrophoresis, as described in Example 4.4. The PCR#1 were then cleaned up by cartridge purification, as described in Example 4.3, and quantified by Nanodrop measurement, as described in Example 2.4, before normalization of the PCR#1 to 1 ng·µl⁻¹.
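The PCR1 temperature programme described above can be written down as a small data structure, which makes the total programmed hold time easy to compute. This is only an illustrative sketch: the `Step` class and `program_hold_time` function are hypothetical names, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str
    temp_c: float
    seconds: int


def program_hold_time(initial, cycle, n_cycles, final):
    """Sum of the programmed hold times in seconds (ramping not included)."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


# PCR1 programme as quoted in Example 8.7
pcr1_initial = Step("initial denaturation", 98, 60)
pcr1_cycle = [
    Step("denaturation", 98, 10),
    Step("annealing and elongation", 72, 15),
]
pcr1_final = Step("final incubation", 72, 300)

pcr1_seconds = program_hold_time(pcr1_initial, pcr1_cycle, 35, pcr1_final)
print(f"PCR1 programmed hold time: {pcr1_seconds} s ({pcr1_seconds / 60:.1f} min)")
```

The same structure could hold the three-step PCR2 cycle by adding a third `Step` to the cycle list.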

PCR2 is then performed in order to allow the sequencing of the DNA. To a mixture composed of 25 µl of Phusion MasterMix, 3 µl of 10 µM Illumina Primer 2a, 3 µl of 10 µM Illumina Primer 2b and 14 µl of H2O were added 5 µl of PCR#1. The mixtures were first incubated at 98 °C for 3 min and subsequently heated to 98 °C for 45 s (denaturation of the Template DNA duplex), cooled down to 72 °C for 45 s (annealing of the Primers) and heated up to 72 °C for 45 s (elongation of the DNA strands). This cycle was repeated 16 times before a final incubation at 72 °C for 5 min, yielding the PCR2 DNA constructs (PCR#2). The completion of the PCR reactions was estimated by QIAxcel automated electrophoresis, as described in Example 4.4. The PCR#2 were then cleaned up by cartridge purification, as described in Example 4.3, and quantified by Nanodrop measurement, as described in Example 2.4. The PCR#2 were then mixed together, yielding the PCR#2 pool, before the DNA extraction on agarose gel.

8.8 Agarose gel electrophoresis (Figure 29)

1.32 g of agarose was dissolved in 65 ml of TBE buffer (Tris/Borate/EDTA in H2O, 89 mM Tris, 89 mM boric acid, 0.4 mM ethylenediaminetetraacetic acid (EDTA), pH = 8.0). The mixture was boiled until the agarose had completely dissolved and was then cooled down to room temperature. 5 µl of 25 mM ethidium bromide were then mixed with the solution. The gel was poured into the electrophoresis chamber and left at room temperature for 30 min, until it had completely solidified. The electrophoresis chamber was then filled with TBE buffer to completely cover the agarose gel. The PCR#2 pool was mixed with loading dye and split into four different wells. The DNA migration was performed by applying a voltage of 110 V and an open current (about 70 mA) for 27 min. The migration was analysed by a brief UV excitation and the PCR#2 bands were excised from the gel.
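The gel composition follows directly from the quantities given. As a rough check, assuming the ethidium bromide volume is in microlitres and neglecting the small volume it adds to the gel, the agarose percentage and dye concentration work out as sketched below; the function name is illustrative.

```python
def percent_w_v(mass_g: float, volume_ml: float) -> float:
    """Weight/volume percentage: grams of solute per 100 ml of solution."""
    return 100 * mass_g / volume_ml


agarose_percent = percent_w_v(1.32, 65)

# 5 µl × 25 mM = 125 nmol of ethidium bromide; nmol per ml equals µM
etbr_nmol = 5 * 25
etbr_um = etbr_nmol / 65

print(f"Agarose: {agarose_percent:.2f} % (w/v)")
print(f"Ethidium bromide: {etbr_um:.2f} µM")
```

This corresponds to an approximately 2% agarose gel, a density commonly chosen for resolving short PCR products.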

8.9 DNA extraction and sequencing sample preparation

The gel slices were dissolved in 665 µl of NT1 buffer by incubation at 39 °C until complete dissolution. The solution was introduced into a NucleoSpin® column and centrifuged at 11'000 rpm for 30 s to bind the DNA to the silica membrane. The membrane was then washed two times by centrifugation of 665 µl of NT3 buffer at 11'000 rpm for 30 s and subsequently dried by centrifuging the column without buffer at 11'000 rpm for 1 min. 50 µl of H2O were finally diffused onto the membrane for 1 min before a last centrifugation at 11'000 rpm for 1 min to recover the eluted DNA. This last step was repeated one more time to end up with 100 µl of purified PCR#2 pool. The DNA amount was quantified by Nanodrop measurement, as described in Example 2.4.

The DNA-compounds were then precipitated by adding 10 µl of 5 M NaCl and 300 µl of absolute ethanol before overnight storage at -20 °C. After 1 h of centrifugation at 14'000 rpm, the supernatant was carefully removed and the pellet re-dissolved in 500 µl of cold 80% EtOH before an additional centrifugation at 14'000 rpm for 30 min. The supernatant was carefully removed and the pellet was air dried for 1 h. The DNA-compounds were re-dissolved in 100 µl. A small quantity was used for a last analysis by agarose gel, as described in Example 8.8. The sequencing sample was normalized to 14 ng·µl⁻¹ in 55 µl.
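The final normalisation step is a simple dilution (C1·V1 = C2·V2). The sketch below assumes a target of 14 ng per microlitre in a final volume of 55 microlitres, as stated above; the stock concentration of 40 ng per microlitre is a made-up value used purely for illustration, and `normalise` is a hypothetical helper.

```python
def normalise(stock_ng_per_ul: float,
              target_ng_per_ul: float = 14.0,
              final_ul: float = 55.0) -> tuple[float, float]:
    """Return (stock volume, water volume) in µl for the requested dilution."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    stock_ul = target_ng_per_ul * final_ul / stock_ng_per_ul
    return stock_ul, final_ul - stock_ul


# Hypothetical Nanodrop reading of 40 ng/µl for the re-dissolved DNA
stock_ul, water_ul = normalise(40.0)
print(f"Mix {stock_ul:.2f} µl of DNA with {water_ul:.2f} µl of H2O")
```

With these defaults the sample contains 770 ng of DNA in total, so a reading below 14 ng per microlitre would require concentrating the sample rather than diluting it.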

8.10 High Throughput DNA Sequencing (HTDS) and data analysis (Figures 30 to 33)

The Illumina HTDS consists of the capture of the different DNA oligonucleotides on the surface of the reaction flow cell, which presents sequences complementary to the Illumina Primers introduced during PCR2, as described in Example 8.7. These oligonucleotides were amplified until the formation of several “clusters” of DNA strands. Primers were introduced with polymerase in order to build the complementary strands.

Furthermore, some fluorescently labelled nucleic bases, able to interrupt the strand elongation, were also supplied among the usual dNTPs. In this way, a wide range of different fluorescently labelled DNA fragments were synthesised, allowing the identification of each base by an iterative scanning procedure. From the length of each fragment and the position of the label, the whole strand can be decoded.

Finally, after the sequencing of the DNA oligonucleotides of the selected binders, the data were analysed, using a C++ program, and the data were displayed with MATLAB software. The resulting fingerprints facilitate the distinction of the binding from nonbinding library members.
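As an illustration of the kind of decoding such an analysis involves, the sketch below counts how often each 6-nucleotide Code1 appears in a set of reads. It is not the C++ program referred to above: it is a simplified Python example which assumes that reads contain the constant regions flanking the variable positions of Elib5_Code1 (SEQ ID NO: 6) in the forward orientation, and the function name `count_codes` is hypothetical.

```python
import re
from collections import Counter

# Constant bases on either side of the six variable positions in SEQ ID NO: 6
CODE1_PATTERN = re.compile(r"GTGTGCTG([ACGT]{6})CGAGTCCC")


def count_codes(reads):
    """Count occurrences of each Code1 found between the constant flanks."""
    counts = Counter()
    for read in reads:
        match = CODE1_PATTERN.search(read.upper())
        if match:  # reads without both flanks are skipped
            counts[match.group(1)] += 1
    return counts


# Toy input: SEQ ID NO: 1 and SEQ ID NO: 3 carry the codes ACGTAA and ACTATC
seq_id_1 = "GGAGCTTCTGAATTCTGTGTGCTGACGTAACGAGTCCCATGGCGCAGC"
seq_id_3 = "GGAGCTTCTGAATTCTGTGTGCTGACTATCCGAGTCCCATGGCGCAGC"
toy_reads = [seq_id_1, seq_id_1, seq_id_3, "ACGT"]

code_counts = count_codes(toy_reads)
for code, n in code_counts.most_common():
    print(code, n)
```

In a real selection, counts obtained this way for each code combination would be compared between selected and unselected libraries to build the enrichment fingerprints.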

In summary, a single-pharmacophore double-stranded library (Format 6A) has been generated from a single-stranded azido sub-library (strands C2 sub-library) and its complementary oligonucleotide built by Klenow fill-in. A dual-pharmacophore double-stranded library (Format 6B) has been generated from a single-stranded azido sub-library (strands C2 sub-library) and a single-stranded alkynyl sub-library (strands F2 sub-library). A cyclised dual-pharmacophore double-stranded library (Format 6C) has been generated from a single-stranded azido sub-library (strands C3B sub-library) and a single-stranded alkynyl sub-library (strands F2 sub-library) by copper(I)-catalysed azide-alkyne cycloaddition. These three libraries have been screened against CAIX as a protein target of interest to provide sets of binders specific for each library. The structures of the binders did indeed show an increase in rigidity with the cyclised dual-pharmacophore setting.

Example 9: Construction of a cyclised dual-pharmacophore oligonucleotide library format composed of a trifunctional DNA-encoded alkynyl scaffold sub-library (strands G3 sub-library) and an azido encoded by a complementary DNA sub-library [strands H2 sub-library (Figures 34 to 36)]

Format 7 comprises a tri-functional scaffold (carboxylic acid, N-Fmoc protected amine and alkyne moiety) as first building block (BB1), encoded by Elib5_Code1 (SEQ ID NO: 6), a carboxylic acid as second building block (BB2), encoded by Elib5_Code2 (SEQ ID NO: 7) providing a first population of nucleic acid conjugates (strands G2 sub-library, Figure 34), and an acid azide as third building block (BB3), encoded by Elib6_Code3#1-n (SEQ ID NO: 12; wherein n is the total number of BB3) providing a second population of nucleic acid conjugates (strands H2 sub-library, Figure 35). The double-stranded molecules (strands G3 sub-library and strands H2 sub-library) are then covalently linked by 1,2,3-triazole formation reaction between the azide (strands H2 sub-library) and the alkyne (strands G3 sub-library, Figure 36A), yielding the final cyclised double-stranded population of molecules forming a chemical library (Format 7, Figure 36B).

Example 10: Construction of a cyclised dual-pharmacophore oligonucleotide library format composed of an azido DNA-encoded sub-library (strands I3 sub-library) and an alkynyl encoded by a complementary DNA sub-library [strands F2 sub-library (Figures 37 to 39)]

Format 8 comprises an N-Fmoc amino acid as first building block (BB1), encoded by Elib5_Code1 (SEQ ID NO: 6), an acid azide as second building block (BB2), encoded by Elib5_Code2 (SEQ ID NO: 7) providing a first population of nucleic acid conjugates (strands I2 sub-library, Figure 37), and an acid alkyne as third building block (BB3), encoded by Elib6_Code3#1-n (SEQ ID NO: 12; wherein n is the total number of BB3) providing a second population of nucleic acid conjugates (strands F2 sub-library, Figure 38). The double-stranded molecules (strands I3 sub-library and strands F2 sub-library) are then covalently linked by 1,2,3-triazole formation reaction between the azide (strands I3 sub-library, Figure 39A) and the alkyne (strands F2 sub-library), yielding the final cyclised double-stranded population of molecules forming a chemical library (Format 8, Figure 39B).

Example 11: Construction of a cyclised dual-pharmacophore oligonucleotide library format composed of an alkynyl DNA-encoded sub-library (strands J3 sub-library) and an azido encoded by a complementary DNA sub-library [strands H2 sub-library (Figures 40 and 41)]

Format 9 comprises an N-Fmoc amino acid as first building block (BB1), encoded by Elib5_Code1 (SEQ ID NO: 6), an acid alkyne as second building block (BB2), encoded by Elib5_Code2 (SEQ ID NO: 7) providing a first population of nucleic acid conjugates (strands J2 sub-library, Figure 40), and an acid azide as third building block (BB3), encoded by Elib6_Code3#1-n (SEQ ID NO: 12) providing a second population of nucleic acid conjugates (strands H2 sub-library, as described in Example 9 and Figure 35). The double-stranded molecules (strands J3 sub-library and strands H2 sub-library) are then covalently linked by 1,2,3-triazole formation reaction between the alkyne (strands J3 sub-library, Figure 41A) and the azide (strands H2 sub-library), yielding the final cyclised double-stranded population of molecules forming a chemical library (Format 9, Figure 41B).

Example 12: Construction of a cyclised dual-pharmacophore oligonucleotide library format composed of an azido DNA-encoded sub-library (strands K3 sub-library) and an alkynyl encoded by a complementary DNA sub-library [strands F2 sub-library (Figures 42 and 43)]

Format 10 comprises an acid ester as first building block (BB1), encoded by Elib5_Code1 (SEQ ID NO: 6), an amine azide as second building block (BB2), encoded by Elib5_Code2 (SEQ ID NO: 7) providing a first population of nucleic acid conjugates (strands K2 sub-library, Figures 42), and an acid alkyne as third building block (BB3), encoded by Elib6_Code3#1-n (SEQ ID NO: 12) providing a second population of nucleic acid conjugates (strands F2 sub-library, as described in Example 10, Figures 38). The double-stranded molecules (strands K3 sub-library and strands F2 sub-library) are then covalently linked by 1,2,3-triazole formation reaction between the azide (strands K3 sub-library, Figures 43 upper panel) and the alkyne (strands F2 sub-library), yielding the final cyclised double-stranded population of molecules forming a chemical library (Format 10, Figures 43 lower panel).

Example 13: Construction of a cyclised dual-pharmacophore oligonucleotide library format composed of an alkynyl DNA-encoded sub-library (strands L3 sub-library) and an azido encoded by a complementary DNA sub-library [strands H2 sub-library (Figures 44 and 45)]

Format 11 comprises an acid ester as first building block (BB1), encoded by Elib5_Code1 (SEQ ID NO: 6), an amine alkyne as second building block (BB2), encoded by Elib5_Code2 (SEQ ID NO: 7) providing a first population of nucleic acid conjugates (strands L2 sub-library, Figures 44), and an acid azide as third building block (BB3), encoded by Elib6_Code3#1-n (SEQ ID NO: 12) providing a second population of nucleic acid conjugates (strands H2 sub-library, as described in Example 9 and Figures 35). The double-stranded molecules (strands L3 sub-library and strands H2 sub-library) are then covalently linked by 1,2,3-triazole formation reaction between the alkyne (strands L3 sub-library, Figures 45 upper panel) and the azide (strands H2 sub-library), yielding the final cyclised double-stranded population of molecules forming a chemical library (Format 11, Figures 45 lower panel).

Tables

Table 1

Table 2

Sequence Listing

SEQ ID NO: 1; Amino-modified 48-mer oligonucleotide

GGAGCTTCTGAATTCTGTGTGCTGACGTAACGAGTCCCATGGCGCAGC

SEQ ID NO: 2; Amino-modified d-spacer oligonucleotide (MW = 12058.74 Da) wherein “dddddd” is an abasic region

CATGGGACTCGddddddCAGCACACAGAATTCAGAAGCTCC

SEQ ID NO: 3; 5’-aminomodified, 48-mer oligonucleotide (Elib2_Code1)

GGAGCTTCTGAATTCTGTGTGCTGACTATCCGAGTCCCATGGCGCAGC

SEQ ID NO:4; 38-mer oligonucleotide (Elib4_Code2)

CCTGCATCGAATGGATCCGTGAATTATTCGCAGCTGCG

SEQ ID NO: 5; 21-mer oligonucleotide RNA adaptor (Elib4_aT)

CGA-rG-5-Me-U-rC-CCATGGC-rG-rC-rA-rG-CTGC

SEQ ID NO: 6; 5’-amino-modified, 45-mer oligonucleotide (Elib5_Code1) wherein X is a nucleotide

GGAGCTTCTGAATTCTGTGTGCTGXXXXXXCGAGTCCCATGGCGC

SEQ ID NO: 7; 5’-phosphorylated, 29-mer oligonucleotide (Elib5_Code2) wherein X is a nucleotide

CGGATCGACGXXXXXXXGCGTCAGGCAGC

SEQ ID NO: 8; 25-mer oligonucleotide DNA adaptor (Elib5_aH)

CGTCGATCCGGCGCCATGGGACTGG

SEQ ID NO: 9; 65-mer oligonucleotide (Elib6_Code3)

GCTCTGCACGGTCGCCTGAGATGTAGGATCACGCTGCCTGACGCdddddddCGTCGATCCGGCGC

SEQ ID NO: 10; 14-mer oligonucleotide DNA adaptor (Elib6_aH_14mer)

TCCCATGGCGCCGG

SEQ ID NO: 11; 34-mer oligonucleotide (YL_Code3, S)

GCTCTGCACGGTCGCCTGAGATGCTGCCTGACGC

SEQ ID NO: 12; 65-mer oligonucleotide (Elib6_Code3#1-n)

GCTCTGCACGGTCGCXXXXXXXGTAGGATCACGCTGCCTGACGCdddddddCGTCGATCCGGCGC

SEQ ID NO: 13; 51-mer oligonucleotide (Primer 1 aXX)

TACACGACGCTCTTCCGATCTXXXXXXGGAGCTTCTGAATTCTGTGTGCTG

SEQ ID NO: 14; 42-mer oligonucleotide (Primer 1 b-a)

CAGACGTGTGCTCTTCCGATCCGATATGCTCTGCACGGTCGC