Title:
GENE SYNTHESIS USING TEMPLATED CHEMICAL LIGATION
Document Type and Number:
WIPO Patent Application WO/2018/138218
Kind Code:
A1
Abstract:
The present invention relates to a method for producing oligonucleotides, preferably gene fragments or genes. A plurality of gene strands, comprising at least one functional group on one end and where appropriate another functional group on the other end, and a plurality of staple strands are hybridized together to form a nanostructure. In this nanostructure the functional groups of the gene strands are in close proximity with each other. Then the functional groups are reacted with each other forming the oligonucleotide. The functional groups are preferably alkynes and azides and the gene strands are coupled using click chemistry.

Inventors:
MANETTO ANTONIO (DE)
Application Number:
PCT/EP2018/051878
Publication Date:
August 02, 2018
Filing Date:
January 25, 2018
Assignee:
MANETTO ANTONIO (DE)
METABION INT AG (DE)
International Classes:
C07H21/00; C12Q1/6806
Domestic Patent References:
WO2015177520A1, 2015-11-26
Foreign References:
US20130046084A1, 2013-02-21
US20130046083A1, 2013-02-21
EP2940150A1, 2015-11-04
Other References:
A. H. EL-SAGHEER ET AL: "Biocompatible artificial DNA linker that is read through by DNA polymerases and is functional in Escherichia coli", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES, vol. 108, no. 28, 12 July 2011 (2011-07-12), pages 11338 - 11343, XP055196445, ISSN: 0027-8424, DOI: 10.1073/pnas.1101519108
MARKUS KRAMER ET AL: "Enzyme-Free Ligation of 5'-Phosphorylated Oligodeoxynucleotides in a DNA Nanostructure", CHEMISTRY & BIODIVERSITY, vol. 14, no. 9, 11 August 2017 (2017-08-11), CH, pages e1700315, XP055457818, ISSN: 1612-1872, DOI: 10.1002/cbdv.201700315
MIKIEMBO KUKWIKILA ET AL: "Assembly of a biocompatible triazole-linked gene by one-pot click-DNA ligation", NATURE CHEMISTRY, vol. 9, no. 11, 11 September 2017 (2017-09-11), GB, pages 1089 - 1098, XP055454196, ISSN: 1755-4330, DOI: 10.1038/nchem.2850
XIAOHUA PENG ET AL: "A Template-Mediated Click-Click Reaction: PNA-DNA, PNA-PNA (or Peptide) Ligation, and Single Nucleotide Discrimination", EUROPEAN JOURNAL OF ORGANIC CHEMISTRY, WILEY - V C H VERLAG GMBH & CO. KGAA, DE, vol. 2010, no. 22, 1 August 2010 (2010-08-01), pages 4194 - 4197, XP002739264, ISSN: 1434-193X, [retrieved on 20100617], DOI: 10.1002/EJOC.201000615
SHAWN M DOUGLAS ET AL: "Self-assembly of DNA into nanoscale three-dimensional shapes", NATURE, MACMILLAN JOURNALS LTD., ETC, vol. 459, no. 7245, 21 May 2009 (2009-05-21), pages 414 - 418, XP002690757, ISSN: 0028-0836, DOI: 10.1038/NATURE08016
EL-SAGHEER ET AL., JACS, vol. 131, 2009, pages 3958 - 3964
N.BIRTS, ANGEW. CHEM., vol. 53, 2014, pages 2362 - 2365
EL-SAGHEER ET AL., PNAS, vol. 108, 2011, pages 11338 - 11343
KUKWIKILA, M.; GALE, N.; EL-SAGHEER, A. H.; BROWN, T.; TAVASSOLI, A., NAT. CHEM., 2017, pages 1089 - 1098
DUNN, K. E. ET AL.: "Guiding the folding pathway of DNA origami", NATURE, vol. 525, 2015, pages 82 - 86
S. DOUGLAS ET AL., NATURE, vol. 459, 2009, pages 414 - 418
Attorney, Agent or Firm:
PATENTANWÄLTE GIERLICH & PISCHITZIS PARTNERSCHAFT MBB (DE)
Claims:
CLAIMS

1. Method for producing an oligonucleotide, preferably a gene or gene fragment, comprising the steps

a) providing a plurality of staple strands,

b) providing a plurality of gene strands, which sequences are parts of the sequence to be produced, wherein the gene strands comprise at least one functional group and where appropriate a second functional group capable to react with the first functional group;

c) contacting the staple strands and the gene strands by forming a nanostructure, whereby the functional groups of different gene strands come into close proximity;

d) reacting the functional groups and thereby coupling the gene strands producing the oligonucleotide.

2. Method according to claim 1, wherein the functional groups are selected from alkyne and azide groups.

3. Method according to one of the claims 1 or 2, wherein the oligonucleotide produced is amplified by a PCR reaction as step e).

4. Method according to any of claims 1 to 3, wherein the staple strands are selected from DNA, RNA and nucleic acid analogues such as peptide nucleic acids (PNA), morpholino, locked nucleic acids (LNA) as well as glycol nucleic acids (GNA) and threose nucleic acids (TNA).

5. Method according to any of claims 1 to 4, wherein the gene strands are selected from DNA, RNA and nucleic acid analogues such as peptide nucleic acids (PNA), morpholino, locked nucleic acids (LNA) as well as glycol nucleic acids (GNA) and threose nucleic acids (TNA).

6. Method according to any of claims 1 to 5, wherein the oligonucleotide has a length of at least 500 nucleotides.

7. Method according to any of claims 1 to 6, wherein at least 10 gene strands are used.

8. Method according to any of claims 1 to 7, wherein in the nanostructure at least one gene strand is hybridised to sections of at least three different staple strands.

9. Method according to any of claims 1 to 8, wherein the nanostructure formed is a helix bundle.

10. Method according to claim 9, wherein the helix bundle is a six helix bundle.

11. Oligonucleotide comprising at least 11 triazole linkages in the backbone.

12. Oligonucleotide according to claim 11, wherein the oligonucleotide is based on DNA, RNA and nucleic acid analogues such as peptide nucleic acids (PNA), morpholino, locked nucleic acids (LNA) as well as glycol nucleic acids (GNA) and threose nucleic acids (TNA).

13. Use of the oligonucleotide according to any of claims 11 or 12, in PCR reaction.

14. Kit for producing an oligonucleotide, gene or gene fragment, comprising a plurality of staple strands, coupling reagents.

Description:
GENE SYNTHESIS USING TEMPLATED CHEMICAL LIGATION

FIELD OF THE INVENTION

The present invention relates to a method for producing oligonucleotides, preferably gene fragments or genes.

DESCRIPTION OF RELATED ART

The DNA nanotechnology and synthetic biology fields rely on synthetic oligonucleotides that are assembled to form nanostructures and artificial genetic systems. A good quality synthesis of nucleic acids - using solid phase synthesis - depends among other factors on the sequence length, nucleotide composition and the purification system used. Yields exceeding 99% are not rare for each coupling step, although even the most efficient synthesis setup cannot reach 100% coupling efficiency. Therefore, the overall percentage yield of oligonucleotides strongly depends on their length. For instance, the synthesis of a 200-mer, where each cycle has an efficiency of incorporation of 99%, yields 13% of full-length product without taking into account further purification steps.
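The 200-mer figure above can be reproduced with a short calculation. The following is a minimal sketch, assuming that every coupling step succeeds independently with the same efficiency and that a strand of n nucleotides needs n - 1 couplings (the first nucleotide being pre-attached to the support); the function name is illustrative and not taken from the application.

```python
def full_length_yield(length_nt: int, coupling_efficiency: float) -> float:
    """Estimate the fraction of full-length product of a solid phase synthesis.

    Assumption: each coupling step succeeds independently with the same
    efficiency, and a strand of `length_nt` nucleotides needs
    `length_nt - 1` coupling steps.
    """
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    return coupling_efficiency ** (length_nt - 1)


if __name__ == "__main__":
    # Compare short gene strands with the 200-mer discussed in the text.
    for length in (60, 100, 200):
        fraction = full_length_yield(length, 0.99)
        print(f"{length:>3} nt at 99% per step: {fraction:.1%} full-length")
```

Under these assumptions a 200-mer gives roughly 13.5% full-length product, in line with the 13% stated above, which is why strands shorter than 100 nt are preferred as building blocks.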

Therefore, synthesis of oligonucleotides that are shorter than 100 nt is preferred in order to achieve reliable yields. Furthermore, the acidic reagents used for the de-tritylation step can lead to the formation of abasic sites and cleavage of the biopolymer, further decreasing the yield of full-length oligonucleotides. Various approaches based on joining multiple short oligonucleotides have been developed to overcome these limitations with the goal of assembling long synthetic DNA strands (gene fragments). Currently two main strategies are used to generate gene fragments from oligonucleotides, both based on procedures that involve the use of enzymes. The first utilises DNA ligase whereas the second relies on the activity of DNA polymerases. In the ligation methods, which represent the earliest example of gene synthesis, the double-stranded DNA is assembled from complementary overlapping strands subsequently joined by the ligase to produce longer fragments; this requires 5'-phosphorylated oligonucleotides. This method is used extensively, but becomes inefficient when more than 14 oligonucleotides are ligated. In the second method, called DNA polymerase cycling assembly (PCA), which is based on the activity of DNA polymerase enzymes, the desired gene fragment is produced in a multiple step assembly. Although these methods give access to a large variety of DNA fragments, there are some limitations due to mispriming, formation of secondary structures, and mistakes that occur when assembling repetitive sequences that hinder the polymerase activity. Therefore, in the rapidly evolving genomic, epigenomic and DNA-nanotechnology fields the demand for functional gene fragments is not yet fully met, and alternative approaches are urgently needed.

It was demonstrated in 2009 that DNA and RNA polymerases are able to read through a triazole linkage in place of a phosphate group between two nucleobases [El-Sagheer et al. JACS 2009 131, 3958-3964]. The triazole was efficiently generated by the Cu(I)-mediated cycloaddition between two oligonucleotides bearing an alkyne and an azide group respectively. This discovery immediately opened new routes for the assembly of exceptionally long synthetic ssDNA strands. The compatibility of the triazole linkage was demonstrated not only with common polymerases, but also in complex systems such as E. coli expression systems and even in eukaryotic cells [N. Birts et al. Angew. Chem. 2014 53, 2362-2365]. A limited number of triazole linkages were used in those studies due to the difficulty of correctly assembling several oligonucleotides carrying the azide and the alkyne groups at their termini. A 300 nt long ssDNA - containing two triazole linkages - was for example assembled using three 100 nt long sequences. It was hypothesized that entire genes could be produced starting from 20-50 alkyne/azide fragments using this chemical ligation and the resulting biocompatible triazole linkage.

The method described in El-Sagheer et al. PNAS 2011 108, 11338-11343 uses short unmodified sequences (splints) designed to pair with two modified oligonucleotides, so that alkyne and azide groups are held in close proximity. The efficacy and selectivity of the chemical ligation (click reaction) is thus assured by a hybridization event of the two oligonucleotides with a splint. A correct stoichiometry is required for an efficient and specific chemical ligation. Therefore - in solution - full-length products are obtained in very low yields and sequences exceeding 300-600 nt are difficult to generate.

Kukwikila et al. demonstrated enzyme-free, click-mediated gene assembly starting from 10 functionalized oligonucleotides that overlap to create a small 335 bp gene. The assembled gene is functional both in vitro and in vivo, confirming the biocompatibility of the triazole linkage (Kukwikila, M., Gale, N., El-Sagheer, A. H., Brown, T. & Tavassoli, A. Nat. Chem. 9, 1089-1098 (2017)). The assembly approach used in Kukwikila et al. is based on splint oligonucleotides, but it is known that when increasing the number of strands to create a long gene, the complexity of the assembly procedure increases proportionally, often leading to failure in the synthesis of full-length sequences.

An alternative method to chemically ligate DNA strands is based on formation of phosphoramidate linkages. In this case, 3'-amino-modified oligonucleotides react with 5'-phosphorylated partner strands in templated reactions. This method has been recently used for gene synthesis and also to ligate DNA nanostructures.

BRIEF SUMMARY OF THE INVENTION

The object of the invention is to further improve the chemical ligation method in order to allow the assembly of long ssDNA and dsDNA strands or genes, which may include modifications such as epigenetic bases. The de novo gene synthesis offers the ability to optimize genes for unnatural hosts, alter existing or append new restriction sites, create chimeric fusion proteins, or even produce genes for completely artificial transcripts.

The problem is solved using the DNA self-assembly properties - used in DNA nanotechnology to create 2D and 3D nanostructures - to pre-organize several oligonucleotides with nanometer control. Using this approach it was possible to perform the chemical ligation in 13 positions simultaneously with a guaranteed 1:1 stoichiometry. This concept can be easily extended to much longer sequences, also in the presence of oligonucleotides containing modifications or other natural and unnatural nucleobases from the deoxy- or ribo-series, but also analogues thereof (LNA, PNA, UNA etc.).

DNA nanostructures are nanoscale structures made of DNA, which acts both as a structural and functional element. DNA nanostructures can be generated using - among others - the so-called DNA origami technique.

This aim is achieved by the inventions as claimed in the independent claims. Advantageous embodiments are described in the dependent claims.

Even if no multiple back-referenced claims are drawn, all reasonable combinations of the features in the claims shall be disclosed.

The object of the invention is also achieved by a method. In what follows, individual steps of a method will be described in more detail. The steps do not necessarily have to be performed in the order given in the text. Also, further steps not explicitly stated may be part of the method.

In the method a plurality of staple strands is provided. This plurality is contacted with a plurality of gene strands (gene oligos), whose sequences are parts of the gene sequence to be synthesized. The plurality of gene strands comprises at least one functional group and where appropriate a second functional group capable of reacting with the first functional group. The functional groups are situated at the end of the strands. By providing conditions to induce self-assembly of the staple strands and the gene strands a nanostructure is formed. This nanostructure is designed so that the functional groups of different gene strands come into close proximity, so that by coupling the functional groups the gene strands are coupled with each other and the gene sequence to be synthesized is formed. Preferably the ends of two gene strands with their functional groups form a double strand with the same staple strand in the region of the coupling.

In a preferred embodiment the functional groups are click functional groups capable of reacting with each other to form a 1,2,3-triazole linkage; these are alkyne and azide groups. Preferred are terminal alkynes and azides. These groups are situated at the end or ends of each gene strand, so that upon reaction to the triazole linkage the two strands are covalently linked. By choosing an appropriate linker, like ether or ethylene groups, a spacing similar to that of the usual linkage of the strands, for example the phosphate backbone of DNA or RNA, can be reached. The groups may substitute the usual functional groups present at the end of each strand, for example hydroxyl, amine or carboxylic acid groups.

The alkynes may be present as propargyl ether, for example in 3' position of DNA or RNA. The azide may substitute the hydroxyl group in 5' position of DNA or RNA.

Gene strands representing the ends of the oligonucleotide to be produced comprise only one functional group; all other gene strands comprise at least two functional groups, preferably two functional groups. The functional groups are located such that, when the gene strands are within the nanostructure, the functional group at the end of one gene strand is in close proximity to a functional group on a second gene strand that is capable of reacting with the functional group of the first gene strand. Preferably this is obtained by hybridising the end sections of the two gene strands to adjacent sections of one staple strand. The sections of the staple strand bridge the two different gene strands. The nanostructure provides the template to organize all gene strands in such a way that by reacting the functional groups the oligonucleotide is formed.
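The bridging of two gene strands by adjacent sections of one staple strand can be illustrated with a small sequence sketch. It assumes plain DNA strands written 5' to 3', Watson-Crick pairing only, and a hypothetical section length `arm`; the function names and the toy sequences are illustrative and are not the staple design used in the examples, which was made with caDNAno.

```python
# Complement table for unmodified DNA bases (assumption: only A, C, G, T occur).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def bridging_sections(upstream: str, downstream: str, arm: int = 8) -> str:
    """Return staple sequence (5' to 3') spanning the junction of two gene strands.

    The result pairs with the last `arm` nucleotides of `upstream` (its 3' end)
    and the first `arm` nucleotides of `downstream` (its 5' end), so the two
    reactive ends are held next to each other on one staple.
    """
    if arm < 1 or arm > min(len(upstream), len(downstream)):
        raise ValueError("arm must fit inside both gene strands")
    junction = upstream[-arm:] + downstream[:arm]
    return reverse_complement(junction)


if __name__ == "__main__":
    # Toy strands, not the sequences of the application.
    print(bridging_sections("ACGTAC", "TTGCAA", arm=3))
```

In a helix bundle each staple typically contains several such sections on different helices, so this sketch covers only the local geometry at one ligation site.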

When the gene strands are linked together the synthesized strand can be separated from the much shorter staple strands. Due to the high yield of the click reaction gene sequences with at least 5, preferably at least 10 linkage reactions can be synthesized.

The term "staple strand" refers to any at least partially single-stranded nucleic acid or nucleic acid binding molecule. Examples of staple strands include DNA, RNA and nucleic acid analogues such as peptide nucleic acids (PNA), morpholino, locked nucleic acids (LNA) as well as glycol nucleic acids (GNA) and threose nucleic acids (TNA).

The term "section" refers to a sequence of at least 2 nucleotides, preferably at least 3, even more preferably at least 4 nucleotides, of a strand capable of selectively binding to a corresponding pair of nucleotides of a different strand. Preferably a section is 2 to 50 nucleotides long, preferably 3 to 40 nucleotides.

Individual staple strands may range in length from 10 to 100, preferably from 18 to 70 nucleotides, more preferably from 18 to 50 nucleotides. The lower limit is justified by considering that the binding of shorter staples may not be stable. Cost considerations regarding a high throughput synthesis of sufficiently pure staples typically set the upper length limit.

In a preferred embodiment of the invention at least one staple strand comprises one section capable of binding selectively to a section of one gene strand and another section capable of binding selectively to a section of a different gene strand. More preferably at least one staple strand comprises at least three sections each capable of binding selectively to a section of a different gene strand. More preferably each staple strand comprises at least two sections each capable of binding selectively to a different section of a different gene strand.

In another embodiment of the invention, more than 50 % of the staple strands in the nanostructure comprise at least three sections each capable of binding selectively to a different section of a different gene strand.

A selective binding is understood as a pairing of the section of the staple strand with the corresponding section of the gene strand under the folding conditions of the nanostructure (1x TE buffer with 20 mM MgCl2, and a temperature of below 20 °C).

In another preferred embodiment at least one gene strand is hybridised to sections of at least three different staple strands in the nanostructure. Since each staple strand is preferably hybridized to sections of at least two, preferably at least three gene strands, a stable nanostructure is formed, preferably a single nanostructure is formed. This nanostructure can be imaged using atomic force microscopy.

In a preferred embodiment of the invention the nanostructure formed is a helix bundle, more preferably a six helix bundle. This structure is known from DNA origami (Dunn, K. E. et al. Guiding the folding pathway of DNA origami. Nature 525, 82-86 (2015)). Preferably the six helices form a hexagonally symmetric arrangement. Strands hybridized to more than one helix form bridges between these helices. In this structure every helix is linked to at least two other helices.

Preferably the nanostructure is designed so that the oligonucleotide to be synthesized is part of all the helices of the nanostructure, while the staple strands link the different helices. The sequence and length are chosen so that a stable nanostructure is formed. This can be done by estimation of the binding strength based on the sequence. Such methods are known to the person skilled in the art.

Preferably the nanostructure is formed by all staple strands and all gene strands used.

By using this nanostructure, wherein each strand binds to more than two other strands, the selectivity of the linkage reaction is dramatically increased, compared with the usual ligation using only a bridging strand. This allows the formation of the oligonucleotide in one single step.

To form the nanostructure it can be necessary to heat the mixture of the strands to a temperature of 80 °C to 96 °C and then cool it down to a temperature below 25 °C with a cooling rate of below 2 °C/min.
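The heating and cooling step can be written down as a stepwise ramp of the kind programmed into a thermocycler. This is a minimal sketch under assumed values: the default start temperature, end temperature, rate and step size are illustrative choices inside the ranges given above, and the function names are hypothetical.

```python
def annealing_ramp(start_c: float = 90.0, end_c: float = 20.0,
                   rate_c_per_min: float = 1.0, step_c: float = 1.0):
    """Return a list of (temperature_c, hold_seconds) steps for a cooling ramp.

    Each step holds one temperature long enough that the average cooling
    rate equals `rate_c_per_min`. The final entry is the end temperature
    with a hold of zero seconds.
    """
    if start_c <= end_c:
        raise ValueError("start_c must be above end_c")
    if rate_c_per_min <= 0 or step_c <= 0:
        raise ValueError("rate and step must be positive")
    hold_seconds = step_c / rate_c_per_min * 60.0
    steps = []
    temperature = start_c
    while temperature > end_c:
        steps.append((temperature, hold_seconds))
        temperature -= step_c
    steps.append((end_c, 0.0))
    return steps


def total_minutes(steps) -> float:
    """Sum the hold times of a ramp in minutes."""
    return sum(hold for _, hold in steps) / 60.0


if __name__ == "__main__":
    ramp = annealing_ramp()
    print(f"{len(ramp)} steps, {total_minutes(ramp):.0f} min in total")
```

With the default values the ramp takes 70 minutes. At the stated upper limit of 2 °C/min, cooling from 96 °C to 25 °C would take at least 35.5 minutes.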

The reaction mixture may further comprise buffers and salt concentrations needed to form the nanostructure. The concentrations depend on the sequence and the amount of charges present on the nanostructures, e.g. if neutral strands like PNA are used the charge is lesser than in the case of DNA. Usually at least 1 to 30 mM of a doubly charged cation like Ca2+ or Mg2+, preferably Mg2+, is present. The buffer present can be usual buffers known for oligonucleotides like TE, TAE or TBE buffers.

According to a preferred aspect of the invention, extended staple strands are used which include a domain having a sequence that does not hybridize to other staple strands or to the gene sequence. Additional elements can be directly or indirectly attached to such staples. As used herein, the term "directly bound" refers to a covalent attachment while the term "indirectly bound" in contrast refers to the attachment to an entity through one or more non-covalent interactions.

The length of the gene strands depends on the oligonucleotide to be produced. Altogether the gene strands comprise the complete sequence of this oligonucleotide cut into the different gene strands. The length of the gene strand is sufficient for a specific binding to a defined position on one or more staple strands. The length is preferably at least 18 nucleotides, more preferably at least 30 nucleotides.

According to a preferred aspect of the invention the individual gene strands may range in length from 4 to 3000 nucleotides, preferably 4 to 300 nucleotides, more preferably from 18 to 80 nucleotides. The lower limit is justified by considering that the binding of shorter gene strands to the staple strands and therefore the nanostructure may not be stable.

Cost considerations regarding a high throughput synthesis of sufficiently pure gene strands typically set the upper length limit, especially when these strands comprise modifications. Longer gene strands are more difficult to synthesize and also the risk of mispairing is increasing.
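Cutting a target sequence into gene strands within a length range can be sketched as follows. This is a simplified illustration that splits the sequence into nearly equal pieces; it ignores sequence-dependent constraints on the cut sites (such as which nucleotides carry the functional groups), and the function name, the default bounds and the placeholder sequence are assumptions rather than part of the application.

```python
def cut_into_gene_strands(sequence: str, n_strands: int,
                          min_len: int = 18, max_len: int = 80):
    """Split `sequence` into `n_strands` consecutive pieces of near-equal length.

    Raises ValueError if any piece would fall outside [min_len, max_len].
    Concatenating the returned pieces in order gives back the input.
    """
    if n_strands < 1:
        raise ValueError("n_strands must be at least 1")
    base, extra = divmod(len(sequence), n_strands)
    strands = []
    position = 0
    for index in range(n_strands):
        size = base + (1 if index < extra else 0)  # spread the remainder
        if not min_len <= size <= max_len:
            raise ValueError(f"piece of {size} nt is outside the allowed range")
        strands.append(sequence[position:position + size])
        position += size
    return strands


if __name__ == "__main__":
    # Placeholder 762 nt sequence; NOT the EGFP sequence of the examples.
    target = "ACGT" * 190 + "AC"
    pieces = cut_into_gene_strands(target, 14)
    print([len(p) for p in pieces], "ligation sites:", len(pieces) - 1)
```

For a 762 nt target and 14 strands this gives pieces of 54 or 55 nt and 13 ligation sites.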

In another embodiment of the invention at least 10, preferably at least 12, more preferably at least 14 gene strands are used.

The sequence to be produced has preferably a length of at least 500 nucleotides, preferably at least 600 nucleotides, especially at least 700 nucleotides. Such long oligonucleotides are also termed genes or gene fragments.

For reacting the functional groups using click chemistry, a source of Copper(I) ions is added to the reaction. This includes that the solution comprising the folded nanostructure is added to a solution comprising Copper(I) ions.

In a preferred embodiment the source for Copper(I) is chosen from Copper(I) salts like CuBr, Cu(II) salts such as CuSO4 or elemental Cu(0). In case of Cu(II) salts a reducing agent like sodium ascorbate can be added to reduce the Cu(II) salt in situ. In a preferred embodiment a Cu(I) stabilizing ligand is added, more preferably a ligand comprising the structure of tris(triazolylmethyl)amine, like TBTA (tris[(1-benzyl-1H-1,2,3-triazol-4-yl)methyl]amine) or THPTA (tris(3-hydroxypropyltriazolylmethyl)amine).

In case of Copper(I) salts an organic solvent like DMSO tert.-Butyl can be necessary to prevent precipitation of salts.

The amount of Cu(I) is sufficient to initiate the reaction between the azide and the alkyne groups.

The reaction conditions can be adapted to the sequences used. Preferred is a reaction temperature of between 10 °C and 40 °C, preferably between 15 °C and 25 °C. The reaction time is preferably from 1 hour to 8 hours. The mixture may be shaken during the reaction.

After the reaction the reaction mixture can be purified using gel electrophoresis.

In another embodiment at least one of the gene strands comprises a selective anchor, like a biotin or a specific sequence, to purify the mixture by affinity of this anchor.

In another embodiment at least one of the staple strands comprises a selective anchor.

In a preferred embodiment of the invention the oligonucleotide produced is amplified by a PCR reaction. By this step the coupled functional groups in the backbone of the oligonucleotide can be replaced by the natural phosphate groups. It may be necessary to use a polymerase which is able to read over the triazole linkages. By this step a natural oligonucleotide can be obtained, which can be manipulated with the usual methods of biochemistry.

For further purification, the oligonucleotide obtained may be inserted into a plasmid, whose sequence can be checked by cloning, sequencing and then selecting the clones carrying the correct sequence.

Another object of the invention is an oligonucleotide comprising at least 11 triazole linkages, preferably at least 12 triazole linkages, even more preferably at least 13 linkages, in the backbone. Preferably the oligonucleotide is a single stranded oligonucleotide.

Backbone is understood as a continuous sequence of covalently connected structures to which the nucleobases are attached. In case of DNA the backbone comprises 2-deoxyriboses linked together by phosphate groups.

Another object of the invention is an oligonucleotide produced by the method of the invention.

In a preferred embodiment of the invention the oligonucleotide is based on DNA, RNA and nucleic acid analogues such as peptide nucleic acids (PNA), morpholino, locked nucleic acids (LNA) as well as glycol nucleic acids (GNA) and threose nucleic acids (TNA). In a preferred embodiment of the invention the oligonucleotide is DNA. The triazole linkages in the backbone preferably substitute the linkage of the DNA, RNA or nucleic acid analogue used, e.g. the phosphate groups in DNA or RNA or the amide group in PNA.

Another object of the invention is the use of the oligonucleotide in PCR, preferably as a template.

Another object of the invention is a kit for producing an oligonucleotide, gene or gene fragment, comprising a plurality of staple strands, coupling reagents and optionally buffers. In this case the gene strands are provided by the user of the kit.

The kit may also comprise a plurality of gene strands, which sequences are parts of the sequence to be produced, wherein the gene strands comprise at least one functional group and where appropriate a second functional group capable to react with the first functional group.

Preferably the functional groups are alkyne and azide groups. In this case the coupling reagents are described reagents for the click chemistry.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

Other objects and advantages of the present invention may be ascertained from a reading of the specification and appended claims in conjunction with the drawings therein. For a more complete understanding of the present invention, reference is made to the following description in connection with the accompanying drawings, in which:

Fig. 1 Schematic representation of the gene assembly process. (a) Chemical ligation mechanism. Colored shapes represent the corresponding molecule to the left; (b) Gene oligonucleotides (GOs) and staples are folded forming the 6HB (c, caDNAno). (d) CuAAC (copper(I)-catalyzed azide-alkyne cycloaddition) of the GOs inside the 6HB (six helix bundle) forms a long linear scaffold (in blue, n°1). (e) 3D views of the designed DNA nanostructure with hexagons highlighting click points. (f) PCR product with canonical phosphate backbone;

Fig. 2 Assembly experiments. (a) Salt test (mM MgCl2): the structure formed at concentrations above 10 mM, but two species were present. (b) The product of the click reaction with the heterogeneous catalyst contained in the vial "reactor M" (BAS) runs like the folded sample. (c) The 6HB after click reaction is stable in absence of Mg ions. The PCR done with Taq polymerase shows a product of the correct length. (d) Comparison between splint assembly without a nanostructure and assembly in the 6HB. The latter shows a product of the same length as the positive control (last lane, PCR on the EGFP gene). (e) AFM of the folded sample: monomers of ~43 nm and dimers of ~82 nm are formed. (f) AFM of the ligated sample: monomers of ~42 nm and dimers of ~78 nm are shown;

Fig. 3 Staple set 1 design based on Douglas et al. 2009;

Fig. 4 Staple set 2 modified design to avoid placing the ligation site close to the crossovers;

Fig. 5 Schematic representation of the 3D nanostructure bearing oligos with alkyne and azide groups;

Fig. 6 Staple set 1 and staple set 2 tested with cut EGFP mix;

Fig. 7 Schematic representation of the click reaction within the 3D nanostructure forming the EGFP gene sequence;

Fig. 8 PCR reaction using HF Taq polymerase on click reaction. The lanes 1 and 2 show the expected length of the product. It is also shown that the PCR reaction does not work when only the folding reaction without the click is used as template (lanes 3 and 4);

Fig. 9 AFM picture of the 3D nanostructure after the click reaction;

Fig. 10 Agarose gel without MgCl2; lanes 1 and 5 show the mixture before the click reaction, the structure is completely unfolded; after the click reactions (lanes 2 and 6) the band corresponding to the gene is at the correct length; lanes 3 and 7 show the resulting PCR product; lanes 4 and 8 show a reference product;

Fig. 11 Agarose gels with MgCl2 (left) and without MgCl2;

Fig. 12 Denaturing PAGE of the reaction mixture at different times; pattern of the EGFP cut (lane 1); mixture before (lane 2) and after folding (lane 3); after the click reaction (lane 4) these strands disappear; the generation of the fully extended ssDNA (triazole-gene) is observed (lane 5) in the gel as well and compared with a reference (lane 6, arrows);

Fig. 13 Schematic representation of a restriction assay of the PCR product; the restriction reaction leads to the production of two bands at the expected length. Lane 2: restriction assay. Lane 3: PCR from the triazole DNA. Lane 4: PCR product cleaned up using purification kit. Lane 5: PCR on commercial EGFP sequence as control.

DETAILED DESCRIPTION OF THE INVENTION

Examples

All reagents for chemical synthesis were purchased from Sigma-Aldrich. Unmodified oligonucleotides were acquired from Metabion International AG. The 3'-alkyne and/or 5'-azide modified gene oligonucleotides carry the modification as shown in fig. 1.

Using the method of the invention a 762 bp gene (Sequence table 3, Seq-ID No. 3) starting from 14 functionalized oligonucleotides was synthesized (sequences are shown in table 2, Seq-ID Nos. 4-17; 3' alkyne-, 5' azide-modification as shown in figure 1a; the first strand EGFP1 and the last strand EGFP14 are only modified on the side of the linkage). This system employs a DNA nanostructure - a 6-helix bundle (6HB) - as vehicle for gene assembly. The DNA nanostructure is formed by the staple strands (table 2, Seq-ID Nos. 18-35) and the GOs. DNA nanostructures are known for their ability to fold in a predesigned manner with their most stable conformation as the fully-assembled nanostructure. Using this technique, a 6HB was assembled where all "gene oligonucleotides" (GOs, 3' alkyne-, 5' azide-modified, sequences in table 2, Seq-ID Nos. 4-17) are brought in close proximity, ordered in a predesigned fashion with an equimolar stoichiometry and ligated through click chemistry. The resulting product is then amplified by PCR to convert the triazole linkages into a canonical phosphodiester backbone.

As a model system the gene coding for the enhanced green fluorescent protein (eGFP) was used. The design is origami-based and the gene constitutes the scaffold of the 6HB nanostructure. The design was supported by the caDNAno software package and consists of 3 steps: (1) the 762 nt long gene is designed to run through the nanostructure forming the scaffold of a 6HB of ~40 nm in length. (2) The gene-scaffold is fragmented into strands of < 60 nt to assure reliable chemical synthesis of the double functionalized GOs. (3) The staples are designed to allow the structure to fold in a hierarchical order. In the case of eGFP, the gene was divided into 14 GOs bearing a terminal 5' azide-modified thymine and a terminal 3' alkyne-modified cytosine; only the first and the last GOs were mono-functionalized.
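The design figures of this example can be cross-checked with a short estimate. The sketch assumes that the scaffold is spread evenly over the six helices, that every scaffold nucleotide is base-paired, and that each base pair adds about 0.34 nm along the helix axis (the usual value for B-form DNA); the function names are illustrative.

```python
RISE_NM_PER_BP = 0.34  # assumed axial rise per base pair of B-form DNA


def bundle_length_nm(scaffold_nt: int, helices: int = 6,
                     rise_nm: float = RISE_NM_PER_BP) -> float:
    """Estimate the length of a helix bundle from the scaffold length.

    Assumption: the scaffold is distributed evenly over all helices and
    is fully base-paired with staples.
    """
    if scaffold_nt < 1 or helices < 1:
        raise ValueError("scaffold_nt and helices must be positive")
    return scaffold_nt / helices * rise_nm


def linkage_count(n_gene_strands: int) -> int:
    """Number of ligation sites needed to join the strands into one chain."""
    if n_gene_strands < 1:
        raise ValueError("n_gene_strands must be at least 1")
    return n_gene_strands - 1


if __name__ == "__main__":
    print(f"estimated 6HB length: {bundle_length_nm(762):.1f} nm")
    print(f"mean gene strand length: {762 / 14:.1f} nt")
    print(f"triazole linkages: {linkage_count(14)}")
```

Under these assumptions the 762 nt scaffold gives 127 bp per helix and an estimated length of about 43 nm, the 14 gene strands average about 54 nt, and joining them requires 13 linkages.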

To test the folding of the 6HB nanostructure, unmodified GOs were initially used. The 6HB used folds in the presence of 20 mM MgCl2, with formation of two species (Figure 2a). The sample was analyzed by AFM and, as expected, the species were found to be monomers and dimers of the designed 6HB, with an average length of 43 nm ± 4.5 nm for the monomers and 82 nm ± 3.4 nm for the dimers (Figure 2e). The dimer formation is probably due to stacking interactions between terminal base pairs of two different 6HBs. The fact that only dimers, but not trimers or larger assemblies, are formed indicates that only one end of the 6HB tends to participate in base stacking. The presence of two species will not interfere with the gene assembly; therefore the experiments were performed without further optimization of the nanostructure design.

The ligation methodology was tested by synthesizing the EGFP gene, folding the 6HB using 14 modified GOs and the staples. The GOs were then ligated, assisted by the close proximity of the fragments pre-organized in the nanostructure. The AGE-Mg in Figure 2b shows that the structure retains its conformation after the click reaction. In an AGE without Mg ions, the structure prior to the click reaction unfolds, whereas the ligated structure ("click BAS" in Figure 2c) entirely retains its conformation. Chemically ligated 6HB constructs were examined by AFM and were shown to have a similar length to the 6HB having a fragmented scaffold (78 nm ± 3.5 nm for the dimer and 42 nm ± 4.2 nm for the monomer) (Figure 2f). However, the denaturing PAGE of the eGFP gene assembly was not very informative: many bands formed after click without any particular predominance. The band corresponding to the full-length gene might be the closest to the well of the denaturing gel, but since triazole-containing oligonucleotides were observed to run slower than their unmodified counterparts, it is not reliable to refer to the ladder in assessing their size.
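As a rough plausibility check of the AFM lengths reported above, the expected bundle length can be estimated from the scaffold size. The sketch below assumes that the 762 nt scaffold is distributed evenly over the six helices and that each base pair contributes the canonical B-form rise of about 0.34 nm; both are simplifying assumptions, and the function name is illustrative.

```python
RISE_PER_BP_NM = 0.34  # assumed B-form DNA rise per base pair


def expected_bundle_length_nm(scaffold_nt, n_helices=6, rise_nm=RISE_PER_BP_NM):
    """Estimate the length of a helix bundle whose scaffold is spread
    evenly over n_helices parallel double helices."""
    return scaffold_nt / n_helices * rise_nm


monomer_nm = expected_bundle_length_nm(762)
dimer_nm = 2 * monomer_nm
print(f"monomer ~{monomer_nm:.1f} nm, end-to-end dimer ~{dimer_nm:.1f} nm")
```

Under these assumptions the monomer estimate is close to the measured 43 nm, and the dimer estimate is in the same range as the measured dimer length.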

Since it was not possible to distinguish the full triazole-linked gene from intermediates on the denaturing gel, the crude chemical ligation mixture was used as a template for PCR amplification. Primers were designed to amplify only the full-length gene (Table 1, Seq-ID Nos. 1 and 2). The PCR led to amplification of the full-length EGFP gene when either Taq polymerase (low fidelity) or Baseclick polymerase (high fidelity) was employed. To assess the accuracy of the gene assembly method, PCR products were cloned and sequenced. In both cases, 5% of the screened clones showed 100% identity with the designed gene sequence. 48 clones were sent for sequencing; 38 clones produced a sequencing result; 33 clones had the PCR product as insert; 7 clones had a 762 bp insert and 2 clones had 100% identity with the designed sequence. Thus ~5% of the clones that produced sequencing results showed 100% identity with the designed sequence. This is an encouraging result if one considers that one of the polymerases tested in the PCR step is Taq polymerase, classified as relatively low-fidelity due to its error rate of 2.3 x 10^-5 (vs. 9.5 x 10^-7 for a high-fidelity polymerase). At this point the error rate of the polymerase was calculated to understand whether the system is prone to mutations, or whether the triazole groups interfere with the correct incorporation of bases during PCR. An estimate of the error rate of the system can be obtained by comparison with published data for the fidelity of Taq polymerase, which is reported to incorporate 1 error every 700 - 1700 bp depending on the source of the mutation data. In the present system Taq polymerase incorporates 1 error every 254 bp. This indicates that the presence of triazole groups along the template may favour misincorporation of bases, or that the crude ligation mixture used as template for PCR contains high concentrations of metal ions that might interfere with the activity of DNA polymerases, or that the original oligonucleotide synthesis produced some of these mutations.
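The figures quoted in this paragraph can be reproduced with a few lines of arithmetic. The sketch below uses only the clone counts and error spacings stated above; the variable names are illustrative, and the "fold" values simply compare the observed spacing of 254 bp per error with the published range of 700 - 1700 bp per error.

```python
clones = {
    "sent": 48,
    "with_result": 38,
    "with_pcr_insert": 33,
    "with_762bp_insert": 7,
    "perfect": 2,
}

# Fraction of clones with a sequencing result that match the design exactly
perfect_fraction = clones["perfect"] / clones["with_result"]

observed_bp_per_error = 254
published_bp_per_error = (700, 1700)

# How many times more frequent errors are than for Taq on ordinary templates
fold_more_errors = tuple(p / observed_bp_per_error for p in published_bp_per_error)

print(f"perfect clones: {perfect_fraction:.1%}")
print(f"errors are {fold_more_errors[0]:.1f}x to {fold_more_errors[1]:.1f}x more frequent")
```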

Finally, the method was compared to splint-assisted ligation in the absence of a nanoconstruct to prove the utility of the DNA nanostructure in assembling multiple gene oligonucleotides in equimolar ratios. The 14 GOs were assembled using 13 complementary splints and chemically ligated with the same procedure used for the 6HB nanostructure. The ligation product was used as template for a PCR in which the KOD XL DNA polymerase was employed, which is expected to easily read through the triazole linkage. Figure 2d shows the PCR products of the splint-mediated assembly, the 6HB assembly and a positive control (PCR of the EGFP gene). PCR of the splint assembly did not produce full-length EGFP gene, but artefacts of higher and lower molecular weight, while PCR of the 6HB assembly showed a product of the same length as the control. However, the PCR products obtained using KOD XL polymerase were not as homogeneous as the ones obtained with Taq polymerase.

In conclusion, the present invention provides a system for gene fragment assembly by chemical ligation promoted by a DNA nanostructure, where gene fragments are part of the scaffold that runs inside the nanostructure. These are assembled in a predefined fashion, so that 3'-alkyne and 5'-azide are in close proximity, forming a 6HB nanostructure. The use of the nanostructure proved to be an efficient method to achieve an equimolar ratio of oligonucleotides, which is otherwise difficult when several fragments have to be ligated together. With this technique it was possible to assemble 14 gene oligonucleotides to create a 762 nt long DNA strand, which after PCR is converted into a canonical double-stranded gene. The method proved to be more efficient than the equivalent ligation performed using splint oligonucleotides in the absence of the nanostructure. Interestingly, this gene is twice the size of the only one previously synthesized by CuAAC-mediated ligation. The chemical ligation method based on the CuAAC reaction is fast and efficient and can be carried out in a variety of biologically compatible buffers. This method provides a new route to the assembly of long DNA strands, genes and genomes for use in DNA nanotechnology and synthetic biology for the construction of complex nanostructures and synthetic organisms. The system also allows the synthesis of modified genes by using modified GOs. This may be used to chemically assemble genes decorated with modifications such as epigenetic bases, fluorophores or haptens, which could have important applications in the fields of DNA nanotechnology and synthetic biology.

The method of the invention was developed starting from the design of the nanostructure for the assembly of 14 fragments of a sequence (gene) encoding the "enhanced green fluorescent protein" (EGFP). To this end, the complete EGFP sequence was used as scaffold in the free software "CADNano", which automatically calculates the staple sequences.

The design of a six-helix bundle (6HB) nanostructure was selected for this purpose. Two different sets of staple strands were designed: the first (Fig. 3) follows a previously published design (S. Douglas et al. Nature 2009, 459, 414-418). The second design (Fig. 4 and 5) takes into account the hierarchical assembly while at the same time avoiding the presence of crossovers in close proximity to the alkyne and azide functional groups. Both designs have been successfully tested using unmodified oligonucleotides and analyzed by agarose gel electrophoresis using previously reported experimental and analytical conditions.

Subsequently, the prepared alkyne/azide oligonucleotides were introduced into the folding mixture. The structure was efficiently folded into the 6HB following the hybridization program (folding program) optimized earlier using the unmodified equivalents of those alkyne/azide oligonucleotides, and analyzed on agarose gel (Fig. 6). The click reaction (Fig. 7) was performed using a Cu(I) source. Other Cu(I) systems such as CuSO4 / ascorbate yielded similar results.

Different conditions have been tested in order to find the optimal folding salt mixture (MgCl2 and TE concentrations) and different conditions for the click reaction (different Cu(I) sources and different reagent concentrations). A first PCR test was done starting from the crude mixture of 6HB after click reaction, which gave a product of the expected size (Fig. 8).

The PCR product was sequenced (Sanger) and the data showed the presence of at least one clone out of 10 in which the sequence had a 100% fit with the EGFP sequence designed for these experiments. The 6HB was further analyzed before and after the click reaction by atomic force microscopy (AFM), measuring the expected nanotube size (Fig. 9).

The next step was to identify the polymerases that are able to read through the gene containing thirteen triazole linkages in a reproducible way. The following polymerases have been tested during this period:

1) Taq DNA polymerase

2) HF Taq DNA polymerase

3) Pfu DNA polymerase

4) Q5 DNA polymerase

5) Phusion DNA polymerase

6) Phi29 DNA polymerase

7) KOD XL DNA polymerase

8) T7 DNA polymerase

So far, the DNA polymerases able to read through the 13 triazole linkages are the Taq DNA polymerase, the HF Taq DNA polymerase and the KOD XL DNA polymerase. The gel in figure 10 shows an example of a successful PCR.

Figure 11 shows the influence of MgCl2 on agarose gels. The DNA nanostructure maintains its structure in gels containing MgCl2, while the structure is disassembled in gels without MgCl2. The formation of the long ssDNA product after the click ligation is confirmed by the presence of the corresponding band in the gel without MgCl2 on the right, as schematically represented in this figure.

The efficacy of the click reaction was confirmed via denaturing PAGE as well (Fig. 12). The gel shows the disappearance of the EGFP cut mix (alkyne/azide oligonucleotides contained in the mixture, red circle) after the click reaction. Moreover, the appearance of a band at ca. 800 nt size confirmed the formation of the full-length EGFP gene (at the arrow height).

Several bands at higher and lower molecular weight are also visible in this gel. They are probably due in part to the presence of the staple strands, which to some extent hold the nanostructure together even under denaturing conditions.

Different methods have been applied to extract the triazole gene from the reaction mixture. A method based on magnetic beads loaded with an ssDNA sequence complementary to the newly formed gene has so far proved the most promising. The products of those extraction experiments were analyzed via UV absorption and subsequently used as template in the PCR, normally with positive outcomes (formation of the amplified gene, even in the absence of a clear UV signal). The PCR products showed that 5 ng to 50 ng of the triazole product can be successfully used as template for PCR.

A restriction assay was used to confirm that the PCR product corresponds to the fully elongated sequence. The PfoI restriction enzyme was used for this purpose, yielding two new fragments at ca. 440 bp and 320 bp as a result of the restriction event (Fig. 13). This result corroborates the already robust DNA sequencing data obtained previously.
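The expected fragment sizes of such a digest can be predicted in silico. The sketch below is a generic helper for a linear template; the recognition pattern `TCC[ACGT]GGA` with a cut after the first base is an assumption about PfoI that should be checked against the supplier's data sheet, and the function name and toy sequence are illustrative only.

```python
import re


def linear_digest(seq, site=r"TCC[ACGT]GGA", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence.

    site       regular expression for the recognition sequence (assumed)
    cut_offset number of bases of the site that stay on the 5' fragment
    """
    seq = seq.upper()
    # The lookahead also finds overlapping occurrences of the site.
    cuts = [m.start() + cut_offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Toy example (not the eGFP gene): one site in a 17 nt sequence
toy_fragments_nt = linear_digest("AAAA" + "TCCAGGA" + "TTTTTT")
print(toy_fragments_nt)
```

Applying the same function to the PCR product sequence would give predicted fragment lengths to compare with the bands in the gel.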

Folding procedure

A fresh "tile set-2 mix" was prepared by taking 5 µl of each set-2 tile (I6M0 - I6M17, table 2) using a 10 µl micropipette and combining and mixing them. The "tile set-1 mix" was prepared accordingly from the 14 set-1 eGFP oligonucleotides (EGFP1 - EGFP14, table 2). Next, 10 µl 10x TE (250 mM Tris, 20 mM EDTA, pH 8.0), 4 µl 0.5 M MgCl2, 14 µl of tile set-1 mix, 18 µl tile set-2 mix and 54 µl of HPLC-grade H2O (Fisher Scientific) were carefully mixed. The tubes were folded in an MJ-Mini PCR-Personal Thermocycler (BioRad) using one of three different folding programs: GS15H, GS16H or GS11H (see table 4). Nanotubes that have undergone the folding process but have not been subjected to a click reaction are denoted by the folding program that was used, without any superscript. For example, "GS15H" refers to a nanotube that has been folded with the GS15H program, but has not yet been ligated in a click reaction.
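The final buffer composition of this folding mix follows from the pipetted volumes. The short calculation below is a sketch that uses only the volumes and stock concentrations stated above; the dictionary keys and the helper name are illustrative.

```python
# Volumes pipetted into one folding reaction, in microlitres
volumes_ul = {
    "10x TE": 10,
    "0.5 M MgCl2": 4,
    "tile set-1 mix": 14,
    "tile set-2 mix": 18,
    "HPLC-grade water": 54,
}
total_ul = sum(volumes_ul.values())


def final_mM(stock_mM, volume_ul):
    """Concentration after dilution into the full reaction volume."""
    return stock_mM * volume_ul / total_ul


tris_mM = final_mM(250, volumes_ul["10x TE"])       # 10x TE contains 250 mM Tris
edta_mM = final_mM(20, volumes_ul["10x TE"])        # 10x TE contains 20 mM EDTA
mgcl2_mM = final_mM(500, volumes_ul["0.5 M MgCl2"])

print(total_ul, tris_mM, edta_mM, mgcl2_mM)
```

The result corresponds to 1x TE with 20 mM MgCl2 in a 100 µl reaction, which agrees with the MgCl2 concentration quoted for folding.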

For an optimized program, the samples were prepared by mixing staples and in-house synthesized GOs in a 1:1 ratio to a final concentration of 500 nM of each oligo in 1X TE buffer with 20 mM MgCl2. The sample was folded in a thermocycler using the following program: from 95 °C to 80 °C with a ramp of 1 °C/min, from 80 °C to 40 °C with a ramp of 0.03 °C/min, from 40 °C to 23 °C with a ramp of 0.1 °C/min and finally 8 °C.
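The run time of this optimized program can be estimated from the three ramps. The sketch below assumes ideal linear ramps with no additional hold times and ignores the final cooling to 8 °C; the names are illustrative.

```python
# (start temperature in °C, end temperature in °C, ramp in °C per minute)
ramps = [
    (95, 80, 1.0),
    (80, 40, 0.03),
    (40, 23, 0.1),
]

segment_minutes = [(start - end) / rate for start, end, rate in ramps]
total_minutes = sum(segment_minutes)

print([round(m) for m in segment_minutes], f"total ~{total_minutes / 60:.1f} h")
```

Most of the time is spent in the slow 80 °C to 40 °C segment, and the whole program takes roughly one day.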

Click reaction

Procedure using heterogeneous catalyst

10 mg THPTA ligand (tris(3-hydroxypropyltriazolylmethyl)amine, baseclick GmbH) were dissolved in 230 µl HPLC-grade H2O (Fisher Scientific) for a 0.1 M THPTA solution. Next, 5-6 mg of Reactor M (catalyst, source of copper(I), baseclick GmbH) was weighed. 10 µl of the 0.1 M THPTA solution were added to the Reactor M first, then 40 µl of folded nanotubes were added carefully to the same vial. The vial was placed in a thermomixer (HLC) and the click reaction was run for 5 hours at 32°C and 200 rpm. Afterwards, the solution was carefully transferred into a new vial, the remaining Reactor M was washed once with 10 µl of HPLC-grade H2O (Fisher Scientific) and the wash solution was added to the CLK-tubes for a total final volume of 60 µl nanotubes after ligation by click reaction. These nanotubes will be denoted as GS15H*, GS16H*, or GS11H*, with the star referring to tubes that underwent a click reaction and the GSxxH denoting the folding program that was used to produce the nanotubes.

As an alternative procedure, a volume of 15 µl of 0.1 M THPTA was added to the "reactor M", a vial containing the heterogeneous catalyst (baseclick), then 20 µl of folding reaction was added to the "reactor". The sample was incubated at 32 °C with gentle shaking (200 rpm) for 5 h.
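The stated THPTA concentration can be checked from the weighed mass and the volume. The sketch below assumes a molar mass of about 434.5 g/mol for THPTA (calculated from the formula C18H30N10O3, which is not given in this document), and it neglects the volume contributed by the dissolved solid.

```python
THPTA_MW_G_PER_MOL = 434.5  # assumed, from the formula C18H30N10O3


def molarity(mass_mg, mw_g_per_mol, volume_ul):
    """Molar concentration of mass_mg of solute dissolved in volume_ul."""
    moles = mass_mg / 1000 / mw_g_per_mol
    litres = volume_ul / 1_000_000
    return moles / litres


stock_M = molarity(10, THPTA_MW_G_PER_MOL, 230)

# 10 µl of stock plus 40 µl of folded nanotubes, before the wash step
reaction_mM = stock_M * 1000 * 10 / (10 + 40)

print(f"stock ~{stock_M:.3f} M, in the click reaction ~{reaction_mM:.0f} mM")
```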

Click reaction (CuAAC) using CuSO4 / sodium ascorbate

The Baseclick EdU kit (reaction buffer, catalyst solution and reducing agent/buffer additive) was used for the experiments with CuSO4 as source of Cu(I). The manufacturer's instructions were followed for the ligation assay using the click reaction. In this case, 40 µl of folding reaction were used in the assays.

Procedure using copper bromide (CuBr)

The exact amounts of all chemicals used for the CuBr click reaction can differ strongly and need to be adjusted for each construct. The reaction mixture must not contain too much water, as this would cause the water-insoluble CuBr to precipitate, making a successful click reaction unlikely. Here, 5 mg of CuBr (Sigma Aldrich) were dissolved in 350 µl of click solution (DMSO : tert.-butanol 3:1 by volume, baseclick). The color of the solution was closely observed, as it needs to be light to medium green; otherwise the Cu(I) has oxidized to Cu(II) and the click reaction might be impaired. Next, 100 µl of the light green CuBr solution were added to 200 µl THPTA. 50 µl of this mixture were then added to 100 µl of folded nanotubes and the click reaction was run at 25°C for 5 hours in a thermomixer (Eppendorf). The sample was gently mixed during the click reaction for 5 seconds every 30 minutes at 300 rpm. These nanotubes will be denoted as GS15H CuBr, GS16H CuBr, or GS11H CuBr, with the superscript "CuBr" referring to nanotubes that underwent a copper bromide click reaction and the GSxxH denoting the folding program that was used to produce the nanotubes.

In an alternative preferred procedure with CuBr as source of Cu(I), 1 mg of CuBr was weighed under inert atmosphere and dissolved in 70 µl of "click solution" (DMSO : tert.-butanol 3:1 by volume). A total of 5 mg of the ligand TBTA were dissolved in 94 µl click solution and the two solutions were combined in a 1:2 ratio (TBTA:CuBr). A volume of 20 µl of the mix was mixed with 4 µl of the folding reaction and incubated at 32°C with gentle shaking.

Gel electrophoresis

Most experiments were analyzed using agarose gel electrophoresis. For this, depending on the experiment, different agarose gel concentrations were used: 2% agarose, 1.6% agarose and 0.8% agarose. As buffers for gel preparation and as running buffers, either 0.5x TBE (Tris-borate-EDTA buffer, 50 mM Tris, 50 mM H3BO3, 1 mM EDTA, pH 8.0) or 0.5x TAE (Tris-acetic acid-EDTA buffer, 50 mM Tris, 1 mM EDTA) were used. Depending on the application, sometimes 0.5x TBE or 0.5x TAE containing 11 mM MgCl2 were used.

AGE/AGE-Mg were prepared by dissolving agarose (Ultra-pure, Thermo Scientific) to achieve a 1% gel in 0.5X TBE buffer (and 11 mM MgCl2 final concentration for AGE-Mg; 1X TAE was used when the bands were extracted from the gel). The gel was cast and left to solidify at RT for 30 min.

The folded CLK-tubes can degrade in solution because of the repelling forces of the negative charges of the DNA's phosphate backbone. This is due to the 3D structure of the nanotube, which brings the negative charges very close together. If intact nanotubes are to be analyzed, Mg2+ in the form of MgCl2 is added to neutralize the negative charges and thus stabilize the tube. If the gel is made using a buffer containing 11 mM MgCl2, it is cooled with frozen cool packs during the electrophoresis to avoid excess heat and thus denaturing of the nanotubes or melting of the gel. If not explicitly stated, no MgCl2 was added.

All gels were stained for 15 to 20 minutes in an ethidium bromide (EtBr, Carl Roth) staining solution. The staining solution was prepared by adding approximately one spatula tip of EtBr to 30-50 ml of buffer. The buffer used was the same as for the particular gel preparation: either 0.5x TBE or 0.5x TAE with or without 11 mM MgCl2. After staining, the gels were destained for 10 minutes in the same buffer as used for gel preparation. On all gels, 2-Log DNA Ladder (0.1-10.0 kb) (New England Biolabs) was used as a marker for fragment size. Gel Loading Dye, Blue (6x, 2.56% Ficoll®-400, 11 mM EDTA, 3.3 mM Tris-HCl, 0.017% SDS and 0.015% bromophenol blue, New England Biolabs) was used as loading dye for all samples. All gels were photographed in a Gel Doc™ EZ Imager (BioRad) and processed with Image Lab™ Software version 5.2.1 (BioRad).

Concentration measurements

All DNA concentration measurements were conducted with a NanoPhotometer (Implen) using either a dsDNA program (PCR products, CLK-tubes) or an ssDNA program (purified CLK-Gene). Each sample was measured three consecutive times, using 1 µl of sample each time. Between each of the three measurements, the nanophotometer was cleaned and the sample carefully mixed again. An average was calculated from the three measurements and given as the best approximation of the actual DNA concentration of the sample.

PCR

All PCRs were done in an MJ-Mini PCR-Personal Thermocycler (BioRad). PCRs were so-called "hot-start" PCRs, with the polymerase only added to the reaction mix at 80°C after an initial denaturation step, to minimize unspecific amplification. Baseclick polymerase (baseclick GmbH) was used in all PCRs if not stated otherwise. For this polymerase the following program was used: 94 °C for 3 min, 80°C for 30 sec (add polymerase); 94 °C for 45 sec, 30°C for 30 sec, 72 °C for 12 min, repeat 4 times; 94 °C for 45 sec, 46 °C for 30 sec, 72 °C for 80 sec, repeat 14 times; 72 °C for 10 min.

Incubation with Phusion polymerase (NEB) proceeded as follows: 98 °C for 90 sec; 98 °C for 10 sec, 58 °C for 20 sec, 72 °C for 15 sec, repeat 20 times; 72 °C for 8 min.

A volume of 1 µl of click reaction was used as template for PCR. The incubation with Taq polymerase (NEB) and KOD XL (Millipore) proceeded as follows: 94 °C for 3 min, 80°C for 30 sec (add polymerase); 94 °C for 45 sec, 30°C for 30 sec, 72 °C for 12 min, repeat 4 times; 94 °C for 45 sec, 46 °C for 30 sec, 72 °C for 72 sec, repeat 10 times; 72 °C for 10 min.

Incubation with Q5 polymerase (NEB) proceeded as follows: 98 °C for 90 sec; 98 °C for 10 sec, 64 °C for 20 sec, 72 °C for 20 sec, repeat 20 times; 72 °C for 2 min.

Sequencing

PCR products were cloned into a plasmid using the TOPO PCR cloning kit by Thermo Fisher Scientific, following the manufacturer's instructions. Plasmids from clones were extracted with Gene Elute HP Miniprep by Sigma-Aldrich and sent for sequencing to GATC Biotech. The results from PCR using Taq polymerase are shown in table 5 (CP = click point = triazole linkage). 48 clones were sent for sequencing; 38 clones produced a sequencing result; 33 clones had the PCR product as insert; 7 clones had a 762 bp insert; 2 clones had 100% identity with the designed sequence. Thus ~5% of the clones that produced sequencing results showed 100% identity with the designed sequence.

Analysis of putative monomer/dimer tube structures

The nanotubes were folded as described before using all three different folding programs for comparison. All samples were ligated in a heterogeneous click reaction. The GS15H nanotubes and GS16H nanotubes were analyzed on a 2% agarose 0.5x TBE-gel (11 mM MgCl2). Next, the GS15H nanotubes and the GS11H nanotubes were separated on a 0.8% agarose 0.5x TAE-gel (11 mM MgCl2). TAE buffer was used for gel extraction experiments, when the samples subsequently underwent enzymatic reactions. Borate, as present in TBE buffer, is well known to inhibit the activity of some enzymes and should thus be avoided for such applications. All resulting bands were visualized on a UV transilluminator (Biostep) and excised with a scalpel sterilized in 70% ethanol (J.T. Baker). Proper personal protection such as a UV face visor was worn at all times when working with an open UV-light source. DNA was extracted from the gel slices using a QIAquick® Gel Extraction Kit (Qiagen) according to the manufacturer's protocol. DNA concentrations of the resulting samples were measured. 570 ng DNA of each sample were used as template for a PCR. 10 µl of each PCR product were analyzed via 2% agarose 0.5x TBE-gel electrophoresis. On a second 2% agarose 0.5x TBE-gel the GS15H/GS11H, GS15H*/GS11H* and gel extraction samples of both nanotubes were separated. Another 2% agarose 0.5x TBE-gel (11 mM MgCl2) was also prepared for the subsequent electrophoresis of the GS15H* and GS11H* nanotubes.

Purification of the CLK-Gene

A fresh GS15H folding reaction was prepared as described before. An aliquot of the folded nanotubes was ligated in a click reaction with CuBr and a second aliquot in a heterogeneous click reaction. The resulting GS15H* and GS15H CuBr samples were concentrated by ethanol precipitation. For this, 146 µl of GS15H CuBr were carefully mixed with 40 µl 3 M sodium acetate (Sigma Aldrich) and 1 ml of ice cold 100% ethanol (J.T. Baker). 90 µl of the GS15H* were carefully mixed with 30 µl 3 M sodium acetate and 1 ml ice cold 100% ethanol. Both reactions were mixed well and placed in a -20°C freezer for 48 hours.

Next, the samples were centrifuged at 15300 rpm for 20 minutes at room temperature. The supernatant was removed and the pellets were resuspended in 1 ml ice cold 70% ethanol. Then both samples were centrifuged for another 20 minutes at 15300 rpm and room temperature. The supernatant was again discarded and the reaction vials were placed under a fume hood for approximately 20 minutes, until all remaining ethanol had evaporated. Both pellets were resuspended in 115 µl HPLC-grade H2O (Fisher Scientific). As the GS15H CuBr pellet did not dissolve completely, one drop of 32% pure ammonia solution (AppliChem) was added, dissolving the pellet immediately. The GS15H CuBr vial was placed in a thermomixer (Eppendorf) with the reaction vial lid open, at 40°C and 300 rpm for approximately 1 hour until all ammonia had evaporated. DNA concentrations were measured for both samples. The GS15H* sample after the first ethanol precipitation was then purified with a QIAquick® PCR Purification Kit (Qiagen) according to the manufacturer's protocol and DNA concentrations were again measured. In a next step, both samples, GS15H* after the PCR Purification Kit and GS15H CuBr after ethanol precipitation, were further purified using Illustra NAP™-5 columns (GE Healthcare), also according to the manufacturer's protocol. Again, DNA concentrations for both samples were measured.

The resulting GS15H* sample was divided into two 240 µl fractions, to each of which 40 µl 3 M sodium acetate and 1 ml ice cold 100% ethanol were added. The samples sat in a -20°C freezer for 48 hours. Then, the ethanol precipitation was continued as described above and the resulting final GS15H* pellet was resuspended in 15 µl HPLC-grade water and DNA concentrations were measured. The GS15H CuBr sample after the NAP column purification was further purified with the QIAquick® PCR Purification Kit according to the manufacturer's protocol and DNA concentrations were measured. After every purification step, samples were taken for agarose gel electrophoresis and PCR amplification for both the GS15H* and the GS15H CuBr.

PCR amplification of the purified CLK-Gene samples

Four different PCR mixtures of each final purification product (GS15H CuBr after QIAquick® PCR Purification Kit and GS15H* after second ethanol precipitation) were prepared, containing either 25 ng, 50 ng, 100 ng or 250 ng of presumably purified CLK-Gene DNA as template. Another PCR mixture containing 500 ng of GS15H DNA but no forward or reverse primers was prepared as a control. After the PCR, DNA concentrations were measured. 2 µl of each PCR product were loaded and analyzed on a 1.6% agarose 0.5x TBE-gel.

7 PCR mixtures were prepared: two GS15H* nanotube samples containing either 25 ng or 50 ng DNA as template, one of every intermediate purification step of the purified GS15H* sample, each containing 25 ng DNA as template, and two of the final GS15H* purification step with either 25 ng or 50 ng DNA as template. After the PCR, all PCR products were analyzed on a 1.6% agarose 0.5x TBE-gel together with the original purification samples before PCR. A control sample containing 25 ng of template DNA in 50 µl HPLC-grade H2O was also prepared and 10 µl were loaded.

PCR amplification of the CLK-Gene directly from the nanotube

For PCR amplification, 5 ng, 10 ng, 15 ng, 20 ng, 25 ng and 50 ng of DNA from GS15H* nanotubes without any prior purification were used as template. The resulting PCR products were analyzed by 1.6% agarose 0.5x TBE gel electrophoresis.

Restriction digestion

The PCR product of the 25 ng template PCR was digested using the PfoI restriction enzyme (Thermo Scientific). For this, the PCR product was purified with the QIAquick® PCR Purification Kit. Then, the DNA concentration of the sample was measured. For the digest, 8 µl HPLC-grade H2O (Fisher Scientific), 2 µl 10x Tango buffer (Thermo Scientific), 20 µl of the purified PCR product and 1 µl PfoI were carefully mixed and incubated at 37°C in a water bath for 1.5 hours. The digestion was stopped by incubating the sample in a water bath at 65°C for 20 minutes. A 1.6% 0.5x TBE-gel was prepared for gel electrophoresis of the digest. Also loaded were the 50 ng template PCR product and the purified 25 ng template PCR product sample before digest.

CLK-Gene isolation

A GS15H* sample was prepared. An aliquot of the sample was treated with an RNA Clean & Concentrator™-5 Kit (Zymo Research) and DNA concentrations of the two resulting fractions (17-200 nt and ≥ 200 nt) were measured. The samples were separated on a 1.6% agarose 0.5x TBE-gel.

Next, a second aliquot of the GS15H* sample was separated by 1.6% agarose 0.5x TBE-gel electrophoresis. On the same gel, a denatured sample of the GS15H* nanotube was also analyzed. For this, 10 µl of GS15H* nanotube were denatured in a thermomixer (Eppendorf) at 95°C for 2 minutes and then loaded onto the gel. The resulting bands were excised from the gel with a clean scalpel. The excised bands were dissolved in the gel dissolving buffer of a Monarch DNA Gel Extraction Kit (New England Biolabs) and the DNA was extracted with the RNA Clean & Concentrator™-5 Kit (Zymo Research) according to the Zymoclean™ Gel RNA Recovery Kit (Zymo Research) protocol. The DNA concentration of the gel extraction sample was measured. A PCR was then prepared with 25 ng DNA of the GS15H* nanotube, both RNA Clean & Concentrator™-5 fractions and the gel extraction sample serving as template for amplification.

25 ng and 50 ng DNA of each of the GS15H* nanotube and the gel extraction sample also served as template in the same PCR but were amplified with high-fidelity Phusion polymerase (New England Biolabs). A 1.6% agarose TBE-gel was prepared and all PCR products were loaded. Another denatured GS15H* sample was also prepared and loaded. For this, the GS15H* nanotube was denatured in the MJ-Mini PCR-Personal Thermocycler (BioRad) at 94°C for 3 minutes with the lid also heated to 94°C.

While the present inventions have been described and illustrated in conjunction with a number of specific embodiments, those skilled in the art will appreciate that variations and modifications may be made without departing from the principles of the inventions as herein illustrated, described and claimed. The present inventions may be embodied in other specific forms without departing from their spirit or essential characteristics. The described embodiments are considered in all respects to be illustrative and not restrictive. The scope of the inventions is, therefore, indicated by the appended claims, rather than by the foregoing description. All changes which come within the meaning and range of equivalence of the claims are to be embraced within their scope.

Table 1

FW EGFP1-14 TATCACTATCGACGGTA Seq-ID No. 1

REV EGFP1-14 ACTTACAGCTTTACTTG Seq-ID No. 2

Table 2

Name  Sequence  Length / SEQ-ID

EGFP1  TCGACGGTACCGCGGGCCCGGGATCCACCGGTCGCCACCATGGTGAGCAAGGGCGAGGAGC  61 / No. 4
EGFP2  TGTTCACCGGGGTGGTGCCCATCCTGGTCGAGC  33 / No. 5
EGFP3  TGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACC  62 / No. 6
EGFP4  TACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCC  54 / No. 7
EGFP5  TGGCCCACCCTCGTGACCACCCTGACCTACGGTGTACAGTGCTTCAGCCGC  51 / No. 8
EGFP6  TACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGC  54 / No. 9
EGFP7  TACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAAC  42 / No. 10
EGFP8  TACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGC  58 / No. 11
EGFP9  TGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAAC  59 / No. 12
EGFP10  TACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAAC  60 / No. 13
EGFP11  TTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCAC  51 / No. 14
EGFP12  TACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACC  58 / No. 15
EGFP13  TGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGC  60 / No. 16
EGFP14  TGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGTAAAGC  59 / No. 17
I6M0  GCCGTAGGTGGCATCAGCTCGACCAGGATCGTTGGGGT  38 / No. 18
I6M1  AAGAAGTCGCTTGTGCCCCAGGAGCCGTCCTCACG  35 / No. 19
I6M2  CTCCTTGAAGTCGATGCCTCCTGGACGTAGCCTTCAGGGCACCC  44 / No. 20
I6M3  GGGCAGCAGCGGGTGCTCAGGTAGTTAACTTCGCTG  36 / No. 21
I6M4  GCAGCTTGCCGGGCCCTTGCTCACCATGGTGGC  33 / No. 22
I6M5  GATCTTGAAGTTCACCTTGATCGTTGTGG  29 / No. 23
I6M6  AACTCCAGCAGGACCAGCGAGCTGCACGCTTGTTGCCGTC  40 / No. 24
I6M7  CGGTGAACAGCTCCTCTGGTGCAGATGAACTTCGGGCATGGCGGACTTG  49 / No. 25
I6M8  CTTTGCTCAGGCGCCGTCCGCCCTCGCCCTCGGCCGTCGTC  41 / No. 26
I6M9  CTGTTGTAGTTGTACTCCAGTGCTGCTTCATGTGGTGGGCCAGGGCACGG  50 / No. 27
I6M10  GACCGGTGGATCCCTCCATGCCG  23 / No. 28
I6M11  TGTACAGCTCGGGGCCCGGGGTGGTCACGAGGGTCGGGGTAGCGGCTGA  49 / No. 29
I6M12  CGTTTACGTGCGGACTACGGGGCCGTCGCCGGTTCACCAGGGTGTCGC  48 / No. 30
I6M13  CCTCGAACTTCGGGTCTTGTAGTTCCGGACAGTGGC  36 / No. 31
I6M14  TCTGCTGGTAGTGGTCGTGTGATCGCGCTTCTGGGCACTCAGCTT  45 / No. 32
I6M15  AGAGTGATCCCGGCGGCGGTCGATGTTGTGGCG  33 / No. 33
I6M16  CGGTACCGTCGATTTTGCTTTACT  24 / No. 34
I6M17  CTTGAAGAAGATGGTGCGCCTTCAGCTCGATGCGATGGGGGTGT  44 / No. 35

Table 3

eGFP (CLK-Gene; sequence Seq-ID No. 3)

TCGACGGTACCGCGGGCCCGGGATCCACCGGT
CGCCACCATGGTGAGCAAGGGCGAGGAGCTGT
TCACCGGGGTGGTGCCCATCCTGGTCGAGCTG
GACGGCGACGTAAACGGCCACAAGTTCAGCGT
GTCCGGCGAGGGCGAGGGCGATGCCACCTACG
GCAAGCTGACCCTGAAGTTCATCTGCACCACC
GGCAAGCTGCCCGTGCCCTGGCCCACCCTCGT
GACCACCCTGACCTACGGTGTACAGTGCTTCA
GCCGCTACCCCGACCACATGAAGCAGCACGAC
TTCTTCAAGTCCGCCATGCCCGAAGGCTACGT
CCAGGAGCGCACCATCTTCTTCAAGGACGACG
GCAACTACAAGACCCGCGCCGAGGTGAAGTTC
GAGGGCGACACCCTGGTGAACCGCATCGAGCT
GAAGGGCATCGACTTCAAGGAGGACGGCAACA
TCCTGGGGCACAAGCTGGAGTACAACTACAAC
AGCCACAACGTCTATATCATGGCCGACAAGCA
GAAGAACGGCATCAAGGTGAACTTCAAGATCC
GCCACAACATCGAGGACGGCAGCGTGCAGCTC
GCCGACCACTACCAGCAGAACACCCCCATCGG
CGACGGCCCCGTGCTGCTGCCCGACAACCACT
ACCTGAGCACCCAGTCCGCCCTGAGCAAAGAC
CCCAACGAGAAGCGCGATCACATGGTCCTGCT
GGAGTTCGTGACCGCCGCCGGGATCACTCTCG
GCATGGACGAGCTGTACAAGTAAAGC
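The gene strands EGFP1 to EGFP14 of Table 2 are intended to be joined end to end into the eGFP sequence of Table 3. The following Python sketch is offered only as an illustrative consistency check and is not part of the described method. It assumes that the strands are joined in numerical order and treats each ligation junction as an ordinary sequence junction, so the chemical modifications at the strand ends are not modeled. The names GENE_STRANDS, EGFP_ROWS and assembles are hypothetical.

```python
# Gene strands EGFP1..EGFP14 (5' -> 3'), transcribed from Table 2.
GENE_STRANDS = [
    "TCGACGGTACCGCGGGCCCGGGATCCACCGGTCGCCACCATGGTGAGCAAGGGCGAGGAGC",
    "TGTTCACCGGGGTGGTGCCCATCCTGGTCGAGC",
    "TGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACC",
    "TACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCC",
    "TGGCCCACCCTCGTGACCACCCTGACCTACGGTGTACAGTGCTTCAGCCGC",
    "TACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGC",
    "TACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAAC",
    "TACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGC",
    "TGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAAC",
    "TACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAAC",
    "TTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCAC",
    "TACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACC",
    "TGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGC",
    "TGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGTAAAGC",
]

# Full eGFP sequence (Seq-ID No. 3), transcribed from Table 3 in rows of 32 nt.
EGFP_ROWS = [
    "TCGACGGTACCGCGGGCCCGGGATCCACCGGT",
    "CGCCACCATGGTGAGCAAGGGCGAGGAGCTGT",
    "TCACCGGGGTGGTGCCCATCCTGGTCGAGCTG",
    "GACGGCGACGTAAACGGCCACAAGTTCAGCGT",
    "GTCCGGCGAGGGCGAGGGCGATGCCACCTACG",
    "GCAAGCTGACCCTGAAGTTCATCTGCACCACC",
    "GGCAAGCTGCCCGTGCCCTGGCCCACCCTCGT",
    "GACCACCCTGACCTACGGTGTACAGTGCTTCA",
    "GCCGCTACCCCGACCACATGAAGCAGCACGAC",
    "TTCTTCAAGTCCGCCATGCCCGAAGGCTACGT",
    "CCAGGAGCGCACCATCTTCTTCAAGGACGACG",
    "GCAACTACAAGACCCGCGCCGAGGTGAAGTTC",
    "GAGGGCGACACCCTGGTGAACCGCATCGAGCT",
    "GAAGGGCATCGACTTCAAGGAGGACGGCAACA",
    "TCCTGGGGCACAAGCTGGAGTACAACTACAAC",
    "AGCCACAACGTCTATATCATGGCCGACAAGCA",
    "GAAGAACGGCATCAAGGTGAACTTCAAGATCC",
    "GCCACAACATCGAGGACGGCAGCGTGCAGCTC",
    "GCCGACCACTACCAGCAGAACACCCCCATCGG",
    "CGACGGCCCCGTGCTGCTGCCCGACAACCACT",
    "ACCTGAGCACCCAGTCCGCCCTGAGCAAAGAC",
    "CCCAACGAGAAGCGCGATCACATGGTCCTGCT",
    "GGAGTTCGTGACCGCCGCCGGGATCACTCTCG",
    "GCATGGACGAGCTGTACAAGTAAAGC",
]
TARGET = "".join(EGFP_ROWS)


def assembles(strands, target):
    """Return True if the strands, joined in the given order, equal the target."""
    return "".join(strands) == target


if __name__ == "__main__":
    joined = "".join(GENE_STRANDS)
    print(f"joined length: {len(joined)} nt, target length: {len(TARGET)} nt")
    print("strands reproduce Seq-ID No. 3:", assembles(GENE_STRANDS, TARGET))
```

Under these assumptions the fourteen strands add up to 762 nucleotides and reproduce Seq-ID No. 3 without gaps or overlaps. The same helper could be applied to any other ordered set of gene strands and its target sequence.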

Table 4

GS15H (14:38 hours)
Lid=100°C, Vol. 100 µl
1=90°C for 1:00
2=80°C for 1:00, -0.1°C/Cycle
3=Goto 2, 100 times
4=70°C for 2:00, -0.1°C/Cycle
5=Goto 4, 300 times
6=40°C for 1:00
7=Goto 6, 170 times
8=18°C for 3:00
9=4°C hold

G11H (10:58 hours)
Lid=100°C, Vol. 100 µl
1=95°C for 1:00, -0.1°C/Cycle
2=Goto 1, 649 times
3=30°C for 1:00, -1.0°C/Cycle
4=Goto 3, 6 times
5=23°C for 1:00
6=8°C hold

GS16H (15:27 hours)
Lid=100°C, Vol. 100 µl
1=95°C for 1:00, -1.0°C/Cycle
2=Goto 1, 19 times
3=75°C for 2:00, -0.1°C/Cycle
4=Goto 3, 449 times
5=30°C for 1:00
6=Goto 5, 6 times
7=23°C hold
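The total durations given in the headings of Table 4 can be cross-checked against the individual steps. The following Python sketch is an illustration under stated assumptions, not a description of any thermocycler's firmware: it assumes that times such as "1:00" are minutes:seconds, that "Goto N, k times" repeats the preceding step k further times (so that the step runs k + 1 times in total), and that ramp times and the final hold do not count towards the total. The names PROGRAMS, total_minutes and as_hhmm are hypothetical.

```python
# Each program is a list of (hold_minutes, goto_repeats) pairs, one per timed step.
# A step followed by "Goto N, k times" has goto_repeats = k and therefore runs k + 1 times
# (assumption). Steps without a Goto have goto_repeats = 0. Final holds are left out.
PROGRAMS = {
    "GS15H": [(1, 0), (1, 100), (2, 300), (1, 170), (3, 0)],
    "G11H": [(1, 649), (1, 6), (1, 0)],
    "GS16H": [(1, 19), (2, 449), (1, 6)],
}


def total_minutes(steps):
    """Sum the hold time over all executions of every timed step."""
    return sum(hold * (repeats + 1) for hold, repeats in steps)


def as_hhmm(minutes):
    """Format a duration in minutes as hours:minutes."""
    hours, mins = divmod(minutes, 60)
    return f"{hours}:{mins:02d}"


if __name__ == "__main__":
    for name, steps in PROGRAMS.items():
        print(name, as_hhmm(total_minutes(steps)))
```

With these assumptions the computed totals agree with the stated durations of 14:38, 10:58 and 15:27 hours for GS15H, G11H and GS16H, respectively.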

Table 5

Clone Name  Substitutions              Deletions            Insertions
A02         2 (9 after)                1 (13 before)        1
A03
A05
B01         2 (12bp, 9bp before CP)    1 (9bp before CP)    1 (8bp before click point)
B02
B03
B05
B06
C01
C02
C03
C04
C05
C06
D02         3 (16bp before CP)         0                    0
D03         0                          0                    0
D05
D06
E01
E02
E03
E05
E06
F01
F02
F03
F04
F05
G01
G02
G03
G06
H01
H02
H03         2 (13bp before CP)         0                    0
H04         1                          0                    0
H05
H06         0                          0                    0