Title:
METHOD FOR ADHERING A POLY(DA/DT) TAIL TO A DNA BACKBONE
Document Type and Number:
WIPO Patent Application WO/2024/003311
Kind Code:
A1
Abstract:
The current invention relates to a method for incorporating a poly(dA/dT) tail to a nucleic acid sequence of interest, the method comprising: providing a backbone DNA sequence comprising a sequence of interest, encoding for a protein or peptide of interest; amplifying at least the sequence of interest by means of a Polymerase Chain Reaction (PCR) using a pair of amplification primers comprising a forward and a reverse primer, said reverse primer having a length of between 30 to 530 nucleotides and comprising a stretch of thymine nucleotides of at least 20 nucleotides long.

Inventors:
RANDRIANJATOVO-GBALOU IRINA (FR)
CHARBONNIER TEDDY (FR)
SAID AHMED (FR)
RAHIER RENAUD (FR)
Application Number:
PCT/EP2023/067925
Publication Date:
January 04, 2024
Filing Date:
June 29, 2023
Assignee:
QUANTOOM BIOSCIENCES FRANCE SAS (FR)
International Classes:
C12Q1/6844; C12Q1/6853
Domestic Patent References:
WO2010085966A2 2010-08-05
WO2013071047A1 2013-05-16
WO2014144583A2 2014-09-18
WO2009017743A2 2009-02-05
Foreign References:
US20120135417A1 2012-05-31
Other References:
ANONYMOUS: "PCR Primer Design Guidelines", INTERNET CITATION, 21 December 2016 (2016-12-21), XP002780636, Retrieved from the Internet [retrieved on 20180427]
Attorney, Agent or Firm:
BRANTSANDPATENTS BV (BE)
Claims:
CLAIMS

1. A method for incorporating a poly(dA/dT) tail to a nucleic acid sequence of interest, the method comprising: providing a backbone DNA sequence comprising a sequence of interest, encoding for a protein or peptide of interest; amplifying at least the sequence of interest by means of a Polymerase Chain Reaction (PCR) using a pair of amplification primers comprising a forward and a reverse primer, said reverse primer having a length of between 30 to 530 nucleotides and comprising a stretch of thymine nucleotides of at least 20 nucleotides long.

2. The method according to claim 1, wherein said stretch of thymine nucleotides in said reverse primer is between 20 to 500 nucleotides long.

3. The method according to claim 1 or 2, wherein said stretch of thymines is present at the 5' end of said reverse primer.

4. The method according to any of the claims 1 to 3, wherein said stretch is consecutive or interrupted by a stabilizer sequence.

5. The method according to any of the claims 1 to 4, wherein the forward and reverse primers, excluding the poly(dT) stretch, have a melting temperature (Tm) of between 50°C and 75°C.

6. The method according to any of the previous claims, wherein the delta Tm of the reverse, excluding the poly(dT) stretch, and forward primer is between 0°C and 10°C.

7. The method according to any of the previous claims, wherein the forward primer has a GC content of between 10% and 90%, preferably between 40% and 60%.

8. The method according to any of the previous claims, wherein the part of the reverse primer excluding the poly(dT) stretch has a GC content of between 10% and 90%, preferably between 40% and 60%.

9. The method according to any of the claims 1 to 8, wherein the forward and/or reverse primer comprises a terminal GC clamp.

10. The method according to any of the previous claims, wherein the forward primer comprises a sequence corresponding to a promoter region.

11. The method according to any of the previous claims, wherein the forward and reverse primers are modified to incorporate fluorophores, biotin, non-natural amino acids and/or abasic sites.

12. The method according to any of the previous claims wherein the forward and reverse primers are modified to comprise phosphorothioate bonds, locked nucleic acids (LNA), and/or peptide nucleic acid (PNA).

13. A nucleotide primer set comprising a forward and reverse primer, wherein said nucleotide primer set is to be used in a polymerase chain reaction for amplifying a sequence of interest, wherein said reverse primer has a nucleotide length of between 30 to 500 nucleotides and comprises a stretch of thymine nucleotides of at least 20 nucleotides long.

14. A kit for obtaining a polynucleotide template for in vitro transcription, comprising: a primer pair according to claim 13; a DNA polymerase; and nucleotides.

15. A method of in vitro transcription, wherein said in vitro transcription occurs on an amplicon that is obtained by means of the method according to any of the previous claims 1 to 14.

Description:
METHOD FOR ADHERING A POLY(DA/DT) TAIL TO A DNA BACKBONE

FIELD OF THE INVENTION

The present invention pertains to the technical field of molecular biology and relates to a method of nucleic acid amplification and incorporation of a poly(dA/dT) tail.

BACKGROUND

During the last years, an increase in demand for mRNA synthesis and manufacturing methods ensued from the need to supply new RNA-based therapeutic applications, such as vaccines. The mRNA is generated in an in vitro transcription reaction (IVT), starting from a DNA template that must be produced in advance, usually by linearization of a purified plasmid or by amplification of the region of interest using PCR. An essential feature of the mRNA is the presence of the poly(A) tail that confers stability to the mRNA, aids in the export of the mRNA to the cytosol, and is involved in the formation of a translation-competent ribonucleoprotein (RNP). Moreover, the length of the poly(A) tail of the mRNA has been linked with the efficiency of expression of said mRNA. Increased efficiency of expression reduces costs by directly reducing the quantity of mRNA necessary for a required level of expression, and can also reduce the potential for the antigenicity of the mRNA due to the reduced exposure of a patient's cells to a foreign nucleic acid.

Commonly the DNA templates used in IVT reactions lack poly(dA/dT) tails and in such cases, they are added post-transcriptionally by enzymatic reactions. A simplified and more efficient manufacturing process requires that the DNA template contains a poly(dA/dT) tail which could be integrated into the newly transcribed mRNA during the IVT process. The latter approach involves the use of an additional restriction enzyme cutting step, which is undesirable.

Besides, such a template needs to be cloned in a plasmid and amplified in E. coli prior to the IVT reaction. However, it has been observed that poly(dA/dT) stretches are prone to shortening when propagated in E. coli, forcing a large number of clones to be screened to obtain a single clone with a poly(dA/dT) stretch of the correct length. Moreover, optimizing the length of the poly(A) tails remains a major drawback of the mRNA production process, as existing technologies do not allow for the production of IVT mRNA with defined length poly(A) tails in the physiological (>200 nucleotides) range.

The methodology disclosed herein aims to solve the above-mentioned drawbacks. A method for length-controlled incorporation of an end-terminating poly(dA/dT) tail to nucleic acid molecules using PCR is described herein. This method allows the obtention of large quantities of tailed DNA fragments that can be used directly for IVT, without the need of performing an additional restriction step.

SUMMARY OF THE INVENTION

The present invention and embodiments thereof serve to provide a solution to one or more of the above-mentioned disadvantages. To this end, the present invention relates to a method for the length-controlled incorporation of an end-terminating poly(dA/dT) tail to nucleic acid molecules using polymerase chain reaction (PCR). A primer set was designed to allow the incorporation of long poly(dA/dT) tails into nucleic acids of interest. The tailed DNA obtained may be used as a template for in vitro transcription (IVT).

Preferred embodiments of the method are shown in any of the claims 2 to 12.

In a second aspect, the present invention relates to a primer set that allows the PCR amplification of a DNA fragment of interest and the introduction of a size-controlled poly(dA/dT) tail, according to claim 13.

In a third aspect, the present invention relates to a kit for PCR amplification of a DNA fragment of interest and the introduction of a size-controlled poly(dA/dT) tail, according to claim 14.

In a final aspect, the present invention relates to a method for IVT transcription, using a DNA template comprising a poly(dA/dT) tail, according to claim 15.

DESCRIPTION OF FIGURES

Figure 1 describes the construct used in Example 1 and the results obtained after PCR. Figure 1A depicts a circular DNA backbone comprising an origin of replication, a T7 polymerase promoter and the 1.2 kbp GOI, which contains a 5' UTR, a coding sequence, a 3' UTR and a 120 bp poly(dA/dT) tail. The GOI is amplified with a forward primer annealing 2.8 kbp upstream to the T7 promoter, and a reverse primer comprising a template-specific sequence along with a poly(dT120) that anneals both to the end of the coding sequence and to the poly(dT/dA) tail of the GOI. After amplification, the 4 kbp amplicon comprises the 1.6 kbp GOI, including the poly(dT/dA120) stretch. Figure 1B depicts the electrophoresis of the PCR product; a 4 kbp band corresponding to the GOI was obtained.

Figure 2 depicts the results of the electrophoresis of the mRNA fragment obtained from the IVT reaction of Example 1. A band corresponding to 1.2 kbp RNA was obtained.

Figure 3 describes the construct used in Example 2 and the results obtained after PCR. Figure 3A depicts a circular DNA backbone comprising the 12 kbp GOI, an origin of replication, and a T7 polymerase promoter. The GOI lacks a poly(dA/dT) tail and is amplified with a forward primer annealing to the T7 promoter, and a reverse primer comprising a template-specific sequence along with a poly(dT40) that anneals to the end of the GOI. After amplification, the 12 kbp GOI comprises a poly(dT/dA40) stretch. Figure 3B: the MidoriGreen fluorescence of the PCR product obtained indicates the presence of a 12 kbp fragment corresponding to the gene of interest.

Figure 4 depicts the results of the Fragment Analyzer with the DNA amplicon of Example 2. Figure 4A: the size of the amplicon is 11827 bp. Figure 4B: at a concentration of 3.77 ng/µl, the purity of the amplicon is 95.1%.

Figure 5 depicts the results of the electrophoresis of the mRNA fragment obtained from the IVT reaction of Example 2. Bands corresponding to 12 kbp RNA were obtained for both of the concentrations used, 400 ng and 800 ng.

Figure 6 describes the construct used in Example 3 and the results obtained after PCR. Figure 6A depicts a circular DNA backbone comprising the 2.3 kbp GOI, an origin of replication, and a T7 polymerase promoter. The GOI has a poly(dA/dT70) stretch with a 10 bp stabilizer sequence inserted in the poly(dA/dT) stretch. Said GOI is amplified with a forward primer annealing upstream to the T7 promoter, and a reverse primer annealing to the stabilizer sequence and to the poly(dA/dT) stretch downstream to the stabilizer sequence. Figure 6B: the MidoriGreen fluorescence of the PCR product obtained indicates the presence of a 2.3 kbp fragment corresponding to the gene of interest.

Figure 7 depicts the results of the Fragment Analyzer with the DNA amplicon of Example 3. Figure 7A: the size of the amplicon is 2609 bp. Figure 7B: at a concentration of 5 ng/µl, the purity of the amplicon is 93.4%.

Figure 8 depicts the results of the electrophoresis of the mRNA fragment obtained from the IVT reaction of Example 3. Bands corresponding to 2.3 kbp RNA were obtained.

Figure 9 depicts the results of the Fragment Analyzer with the mRNA obtained after IVT in Example 3. Figure 9A: the size of the amplicon is 2316 bp. Figure 9B: at a concentration of 187.9 ng/µl, the purity of the mRNA is 95.3%.

Figure 10 shows a schematic representation of the PCR amplification according to embodiments of the present invention. In Figure 10A, the GOI to be amplified contains a poly(dA/dT) stretch. The forward primer is designed to anneal to the backbone DNA, upstream to the T7 promoter. The reverse primer is designed to anneal both to the 3' UTR and to the complete poly(dA) tail. In Figure 10B, the GOI does not contain a poly(dA/dT) stretch. The reverse primer, containing a long poly(dT) stretch, is designed to anneal to the 3'UTR of the GOI. In both Figure 10A and Figure 10B, after the PCR reaction, the multiplied GOI fragments comprise size-controlled poly(dA/dT) tails and are free of backbone DNA.

Figure 11 depicts the forward primer according to embodiments of the present invention. The forward primer may be comprised solely of a sequence that anneals to the sequence of interest (template-specific sequence), in some embodiments as depicted in Figure 11A. In another embodiment, the forward primer comprises in addition to the template-specific sequence, a promoter and/or additional nucleotides, that are not complementary to the sequence of interest, as depicted in Figure 11B. In yet other embodiments, the forward primer may additionally comprise a functional sequence (Figure 11C) or a 5'UTR (Figure 11D). The additional added sequences (nucleotides, promoter, functional sequences and/or 5'UTR) are not complementary to the sequence of interest and are added to the GOI during the PCR reaction.

Figure 12 depicts the reverse primer according to embodiments of the present invention. The reverse primer may be comprised of a sequence that anneals to the sequence of interest (template-specific sequence) and a poly(dT) tail as depicted in Figure 12A. In another embodiment, the reverse primer comprises in addition to the template-specific sequence and the poly(dT) tail, a 3'UTR sequence (Figure 12B). In other embodiments, the reverse primer might comprise a stabilizer sequence after the poly(dT) stretch (Figure 12C). The stabilizer sequence may be positioned between two poly(dT) stretches (Figure 12D) and may be followed by a template-specific sequence (Figure 12E) in some embodiments of the reverse primer. In some embodiments, the poly(dT) of the primer is complementary to a poly(dT/dA) stretch of the sequence of interest. In other embodiments, only a portion of the poly(dT/dA) stretch is complementary to the sequence of interest. In other embodiments the gene of interest does not comprise a poly(dT/dA) stretch. In all embodiments, the amplicon resulting from the PCR has a controlled poly(dT/dA) tail and is free of backbone.

Figure 13 depicts the results of the PCR amplification of Example 4. Figure 13A depicts the results of the electrophoresis of DNA fragments obtained from PCR reactions of Example 4. Bands corresponding to 2 kbp DNA were obtained for all the reaction volumes. 1 represents 100 µl, 2 represents 150 µl, and 3 represents 200 µl. All the reactions were performed in 3 replicates. Figure 13B depicts the results of a capillary electrophoresis of an mRNA fragment obtained from an IVT reaction using a PCR fragment as template (200 µl sample). The size of the mRNA is 2181 bases.

DETAILED DESCRIPTION OF THE INVENTION

The present invention concerns a method for the length-controlled incorporation of an end-terminating poly(dA/dT) tail to nucleic acid molecules using polymerase chain reaction (PCR).

Unless otherwise defined, all terms used in disclosing the invention, including technical and scientific terms, have the meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. By means of further guidance, term definitions are included to better appreciate the teaching of the present invention.

As used herein, the following terms have the following meanings:

"A", "an", and "the" as used herein refer to both singular and plural referents unless the context clearly dictates otherwise. By way of example, "a compartment" refers to one or more than one compartment.

"About" as used herein referring to a measurable value such as a parameter, an amount, a temporal duration, and the like, is meant to encompass variations of +/- 20% or less, preferably +/-10% or less, more preferably +/-5% or less, even more preferably +/-1% or less, and still more preferably +/-0.1% or less of and from the specified value, in so far such variations are appropriate to perform in the disclosed invention. However, it is to be understood that the value to which the modifier "about" refers is itself also specifically disclosed.

"Comprise", "comprising", "comprises" and "comprised of" as used herein are synonymous with "include", "including", "includes" or "contain", "containing", "contains" and are inclusive or open-ended terms that specify the presence of what follows (e.g., a component) and do not exclude or preclude the presence of additional, non-recited components, features, elements, members, or steps known in the art or disclosed therein.

Furthermore, the terms first, second, third and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order, unless specified. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other sequences than described or illustrated herein.

The recitation of numerical ranges by endpoints includes all numbers and fractions subsumed within that range, as well as the recited endpoints.

The expression "% by weight", "weight percent", "%wt" or "wt%", here and throughout the description unless otherwise defined, refers to the relative weight of the respective component based on the overall weight of the formulation. Whereas the terms "one or more" or "at least one", such as one or more or at least one member(s) of a group of members, is clear per se, by means of further exemplification, the term encompasses inter alia a reference to any one of said members, or to any two or more of said members, such as, e.g., any >3, >4, >5, >6 or >7 etc. of said members, and up to all said members.

Unless otherwise defined, all terms used in disclosing the invention, including technical and scientific terms, have the meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. By means of further guidance, definitions for the terms used in the description are included to better appreciate the teaching of the present invention. The terms or definitions used herein are provided solely to aid in the understanding of the invention.

Reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment, but may. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to a person skilled in the art from this disclosure, in one or more embodiments. Furthermore, while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments, as would be understood by those in the art. For example, in the following claims, any of the claimed embodiments can be used in any combination.

The term "backbone DNA" as used in the present disclosure, is understood as being any type of double-strand DNA that comprises the sequence of the nucleic acid of interest.

"A nucleic acid sequence of interest" or "a gene of interest" are used interchangeably in the present disclosure and are to be understood as a double-strand DNA sequence comprising a gene or a gene part translatable into a protein. In preferred embodiments of the invention, the nucleic acid of interest may comprise one or more of the following regions:

- a promoter sequence, positioned at the 5' end, upstream of all the other elements;
- a 5'UTR downstream of the promoter;
- a coding sequence that is translatable to an amino acid sequence;
- a 3'UTR downstream of the coding sequence;
- and optionally a poly(dA/dT) sequence downstream to the 3'UTR.

"A promoter" refers to a region of DNA to which RNA polymerase binds to initiate transcription of the DNA sequence positioned downstream of the promoter, to RNA. Promoters are located near the transcription start sites of genes, upstream on the DNA, towards the 5’ region of the sense strand.

A "sense strand", or coding strand, is the segment within double-stranded DNA that carries the translatable code in the 5' to 3' direction, and which is complementary to the "antisense strand" of DNA, or template strand, which does not carry the translatable code in the 5' to 3' direction. The sense strand is the strand of DNA that has the same sequence as the mRNA, which takes the antisense strand as its template during transcription, and eventually undergoes (typically, not always) translation into a protein. The antisense strand is thus responsible for the RNA that is later translated to protein, while the sense strand possesses a nearly identical makeup to that of the mRNA.
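The relationship between the sense strand, the antisense (template) strand and the mRNA can be illustrated with a short Python sketch. The sequence and the function names below are hypothetical and serve only as an illustration; they are not taken from the present disclosure.

```python
# Illustrative sketch only: the sequence is a made-up example.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def antisense_of(sense: str) -> str:
    """Return the antisense (template) strand, written 5' to 3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(sense))


def transcribe(template: str) -> str:
    """Transcribe a template strand (given 5' to 3') into mRNA (5' to 3')."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(template))


sense = "ATGGCCTAA"                # hypothetical sense (coding) strand
template = antisense_of(sense)     # antisense strand read by the RNA polymerase
mrna = transcribe(template)        # same as the sense strand, with U instead of T
print(template, mrna)
```

The mRNA produced from the antisense strand matches the sense strand except that thymine is replaced by uracil.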

The terms "Polymerase chain reaction" or "PCR" or "amplification", used interchangeably herein, are understood as a thermal cycling-dependent method used to make millions to billions of copies of a specific DNA or RNA sample, allowing the amplification of small sample sizes of nucleic acids. In PCR, several cycles of heating and cooling of the reaction result in melting the template polynucleotide for replication, enzymatic replication, and amplification of copies of the polynucleotide, resulting in thousands or millions of copies of the polynucleotide. Melting, annealing (5°C below the melting temperature of the primer), and amplification or extension are steps each performed in a different temperature range and are repeated 12 to 35 times, depending on the stability of the polynucleotide template and primers. The term "PCR" encompasses derivative forms of the reaction, including but not limited to, RT-PCR, real-time PCR, nested PCR, quantitative PCR, multiplexed PCR, assembly PCR and the like. Reaction volumes range from a few hundred nanoliters, e.g., 200 nl, to a few hundred microliters, e.g., 200 microliters. Reverse transcription PCR (RT-PCR) means a PCR that is preceded by a reverse transcription reaction that converts a target RNA to a complementary single stranded DNA, which is then amplified. Real-time PCR is a PCR for which the amount of reaction product, amplicon, is monitored as the reaction proceeds. Nested PCR means a two-stage PCR wherein the amplicon of a first PCR becomes the sample for a second PCR using a new set of primers, at least one of which binds to an interior location of the first amplicon. Multiplexed PCR means a PCR wherein multiple target sequences (or a single target sequence and one or more reference sequences) are simultaneously amplified in the same reaction mixture. Usually, distinct sets of primers are employed for each sequence being amplified. Quantitative PCR means a PCR designed to measure the abundance of one or more specific target sequences in a sample or specimen.
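By way of illustration, the order of magnitude of the copy numbers mentioned above follows from idealized exponential amplification. The Python sketch below assumes a simple model in which each cycle multiplies the number of copies by (1 + efficiency); the function name and the efficiency parameter are assumptions made for this example and are not part of the disclosure.

```python
def ideal_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Estimate the copy number after a number of PCR cycles.

    efficiency=1.0 is the idealized case of perfect doubling per cycle;
    real reactions are less efficient and eventually plateau.
    """
    return initial_copies * (1.0 + efficiency) ** cycles


# Cycle counts within the 12 to 35 range given in the definition above.
for cycles in (12, 25, 35):
    print(cycles, ideal_copies(1, cycles))
```

Under this idealized model, a single template molecule yields thousands of copies after 12 cycles and billions after 35 cycles.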

The nucleic acid obtained from the PCR reaction in multiplied copies is known as a "PCR product" or "amplicon".

The term "reaction mixture", "amplification mixture" or "PCR mixture" refers to a mixture of components necessary to amplify at least one amplicon from nucleic acid templates. The mixture may comprise nucleotides (dNTPs), a thermostable polymerase, primers, and a plurality of nucleic acid templates. The mixture may further comprise a Tris buffer, a monovalent salt, and Mg2+. The concentration of each component is well known in the art and can be further optimized.

The terms "binds", "anneals", or "hybridizes" are used interchangeably in the present disclosure and relate to the Watson-Crick pairing between a primer or a primer portion and a sequence of interest or a portion of a sequence of interest.

As used herein, a "primer" is an oligo DNA with its 3' terminus extendable by a polymerase after it binds at a primer binding site to a template. A primer binding site is a complete or partial site in a target nucleic acid to which a primer hybridizes. Primers are usually designed in pairs that anneal at unique regions outside of a nucleic acid of interest to be amplified by PCR. Forward primers anneal at the 5' region upstream of the nucleic acid of interest and reverse primers anneal at the 3' region downstream of the nucleic acid of interest. A primer can be linked at its 5' end to another nucleic acid (sometimes referred to as a tail), not found in or complementary to the target nucleic acid. A 5' tail can have an artificial sequence. For a primer exactly complementary to a binding site, the demarcation between primer and tail is readily apparent in that the tail starts with the first noncomplementary nucleotide encountered moving from the 3' end of the primer. For a primer substantially complementary to a primer binding site, the last nucleotide of the primer is the last nucleotide complementary to the primer binding site encountered moving away from the 3' end of the primer that contributes to primer binding to the target nucleic acid (i.e., primer with this 5' nucleotide has higher Tm for the target nucleic acid than a primer without the 5' nucleotide). Complementarity or not between nucleotides in the primer and primer binding site is determined by Watson-Crick pairing or not on the maximum alignment of the respective sequences.

The "melting temperature (Tm)" as used herein is the temperature at which half of the DNA duplex, formed by the nucleic acid of interest and at least one primer, dissociates to become single-stranded.

The "annealing temperature (Ta)" is the temperature at which the primer hybridizes or anneals to the complementary sequence of a separated strand of a nucleic acid template.
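As a rough illustration of how Tm and Ta relate, the Python sketch below estimates Tm with the Wallace rule (2°C per A or T, 4°C per G or C), which is a common approximation for short oligonucleotides and is not stated to be the method used in the present disclosure. The annealing temperature is then taken 5°C below Tm, as mentioned in the PCR definition above. The primer sequences and function names are hypothetical.

```python
def wallace_tm(seq: str) -> int:
    """Approximate Tm in degrees C using the Wallace rule (short oligos only)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def annealing_temperature(tm: float, offset: float = 5.0) -> float:
    """Annealing temperature taken `offset` degrees C below Tm."""
    return tm - offset


# Hypothetical primers; the reverse primer is given without its poly(dT) stretch.
forward = "ATGGCTAGCAAGGGCGAGGA"
reverse_specific = "TTACTTGTACAGCTCGTCCA"

tm_fwd = wallace_tm(forward)
tm_rev = wallace_tm(reverse_specific)
delta_tm = abs(tm_fwd - tm_rev)
ta = annealing_temperature(min(tm_fwd, tm_rev))
print(tm_fwd, tm_rev, delta_tm, ta)
```

Using the lower of the two Tm values to derive Ta is a design choice of this sketch; more accurate nearest-neighbor models are generally preferred for longer primers.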

The "extension temperature" allows the synthesis of a nascent DNA strand of the amplicon.

"A polymerase" is an enzyme that can perform template-directed extension of a primer hybridized to the template. It can be a DNA polymerase, an RNA polymerase, or a reverse transcriptase. Examples of DNA polymerases include: E. coli DNA polymerase I, Taq DNA polymerase, S. pneumoniae DNA polymerase I, Tfl DNA polymerase, D. radiodurans DNA polymerase I, Tth DNA polymerase, Tth XL DNA polymerase, M. tuberculosis DNA polymerase I, M. thermoautotrophicum DNA polymerase I, Herpes simplex-1 DNA polymerase, T4 DNA polymerase, thermosequenase or a wild-type or modified T7 DNA polymerase, Φ29 Polymerase, Bst Polymerase, Vent Polymerase, 9° Nm Polymerase, Klenow fragment of DNA Polymerase I, Pyrococcus furiosus DNA polymerase, Thermococcus kodakarensis DNA polymerase, Q5 DNA polymerase. Examples of reverse transcriptases include: AMV Reverse Transcriptase, MMLV Reverse Transcriptase, HIV Reverse Transcriptase. Examples of RNA polymerases include: T7 RNA polymerase or SP6 RNA polymerase, bacterial RNA polymerases and eukaryotic RNA polymerases.

A "nucleotide" as used herein is composed of a nucleobase, a five-carbon sugar (i.e., ribose or 2-deoxyribose), and one or more than one phosphate group (two or three phosphate groups). Thus, the term "nucleotide" generally refers to a nucleoside monophosphate, but a nucleoside diphosphate or nucleoside triphosphate can be considered a nucleotide as well.

The term "dNTP" generally refers to an individual or combination of deoxynucleotides containing a phosphate, sugar, and organic base in the triphosphate form, that provide precursors required by a DNA polymerase for DNA synthesis. A dNTP mixture may include each of the naturally occurring deoxynucleotides (i.e., adenine (A), guanine (G), cytosine (C), uracil (U), and Thymine (T)).

"Adenine" is a nucleobase, and is a chemical component of DNA and RNA. As used herein, "adenines" can refer to polyadenosine ribonucleotides, polyadenosine monophosphates or polyadenylyls. The shape of adenine is complementary to either thymine in DNA or uracil in RNA.

"Thymine", is one of the four nucleobases in the nucleic acid of DNA. As used herein, "thymines" can refer to polythymidine ribonucleotides, polythymidine monophosphates or polythymidyls.

The terms "poly(dA/dT) tail" or "poly(dA/dT) stretch", interchangeably used herein, refer to a polynucleotide sequence consisting of multiple adenines or thymines positioned at the 3' end of a DNA sequence and that is transcribable to mRNA. In eukaryotic cells, a poly(A) tail is typically added to the mRNA at the 3'-most segment at the end of the transcription. The poly(A) tail is important for the nuclear export, translation, and stability of mRNA. In some embodiments the poly(dA/dT) tail may be interrupted by a non A/T sequence.

The "GC-content" (or guanine-cytosine content) as disclosed herein is the percentage of nitrogenous bases in a DNA molecule that are either guanine (G) or cytosine (C). This measure indicates the proportion of G and C bases out of an implied four total bases, also including adenine and thymine.
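The GC-content definition above translates directly into a short calculation. The following Python sketch is illustrative only; the function name is an assumption of this example.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# Two of the four bases in this made-up sequence are G or C.
print(gc_content("ATGC"))
```

Such a value can be compared against the ranges recited for the primers, for example between 40% and 60%.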

"Oligo" as used herein is defined as short single-stranded DNA fragments.

The term "UTR" as used herein refers to either of two untranslated regions, 5'UTR and 3'UTR, one on each side of a coding sequence on a DNA or RNA strand. The 5’ UTR is upstream from the coding sequence. Within the 5’ UTR there is a sequence that is recognized by the ribosome which allows the ribosome to bind and initiate translation. The 3’ UTR is found immediately following the translation stop codon. The 3' UTR plays a critical role in translation termination as well as post-transcriptional modification.

"In vitro transcription (IVT)" is the procedure that allows for a template-directed synthesis of RNA molecules of any sequence. In vitro transcription requires a purified linear DNA template, ribonucleotide triphosphates, a buffer system that includes DTT and magnesium ions, and an appropriate phage RNA polymerase.

The term "IVT template" as used herein refers to a DNA sequence containing a promoter and a coding region used for synthesizing a mRNA molecule in the in vitro transcription process. In some embodiments, the IVT template may comprise a poly(dA/dT) tail.

In a first aspect, the invention provides a method for incorporating a poly(dA/dT) tail to a nucleic acid sequence of interest, the method comprising: providing a backbone DNA sequence comprising a sequence of interest, encoding for a protein or peptide of interest; amplifying at least the sequence of interest by means of a Polymerase Chain Reaction (PCR) using a pair of amplification primers comprising a forward and a reverse primer, said reverse primer having a length of between 30 to 530 nucleotides and comprising a stretch of thymine nucleotides of at least 20 nucleotides long.
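The structural constraints on the reverse primer recited in this first aspect can be sketched in Python as follows. The template-specific sequence, the function names and the placement of the thymine stretch at the 5' end (as in one of the claimed embodiments) are assumptions of this example. For simplicity the check only considers the longest uninterrupted run of thymines and therefore does not cover stretches interrupted by a stabilizer sequence.

```python
import re


def build_reverse_primer(template_specific: str, t_length: int) -> str:
    """Place a poly(dT) stretch of t_length at the 5' end of the primer."""
    return "T" * t_length + template_specific.upper()


def longest_t_run(primer: str) -> int:
    """Length of the longest uninterrupted run of thymines."""
    runs = re.findall(r"T+", primer.upper())
    return max((len(run) for run in runs), default=0)


def satisfies_first_aspect(primer: str) -> bool:
    """Check primer length (30 to 530 nt) and thymine stretch (at least 20 nt)."""
    return 30 <= len(primer) <= 530 and longest_t_run(primer) >= 20


# Hypothetical 20-nt template-specific part with a 120-nt poly(dT) stretch.
primer = build_reverse_primer("GCAGCTGGATCCTTAGTCAC", 120)
print(len(primer), longest_t_run(primer), satisfies_first_aspect(primer))
```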

Generally, long primers of over 30 nucleotides are not used for PCR amplification as they present drawbacks such as slow hybridizing rates and the propensity to self-hybridization and thus the formation of hairpin structures. Thus, it has not been considered possible to use long PCR primers to introduce into a gene new sequences different from the primer binding site. The present invention circumvents these issues by designing a highly specific and efficient primer pair.

It was surprisingly observed that using the primer pair as disclosed herein allows for both specific amplification of the nucleic acid of interest and the incorporation or the maintenance of a size-controlled poly(dA/dT) tail. The nucleic acid sequence of interest may be any gene or gene fragment encoding a protein or peptide of interest. In an embodiment, said sequence of interest is a naturally occurring sequence derived from a eukaryotic or prokaryotic organism, preferably from a virus or a virion. In another embodiment, said sequence of interest is a synthetic construct. In yet another embodiment said sequence of interest is a mixture between a naturally occurring sequence and a synthetic construct. In some embodiments, the sequence of interest is derived from the nuclear genome of an organism while in other embodiments the sequence of interest is derived from organelle genomes such as mitochondria or chloroplasts.

In some embodiments, the nucleic acid sequence of interest is a DNA, while in other embodiments said nucleic acid sequence of interest is a cDNA obtained by reverse transcription from an RNA.

In an embodiment, the method disclosed herein is the first step in the production of a vaccine and the nucleic acid sequence of interest encodes for a molecule that induces an immunogenic response. In a further embodiment, said nucleic acid sequence of interest may encode any disease or pathogen-related protein, such as an antigen or functional fragments thereof or any pathogen subunit protein. In a further embodiment, said nucleic acid sequence of interest may encode any full version, truncated version, mutated version, or modified version of said antigen. In another embodiment said nucleic acid sequence of interest encodes at least one antigen or functional fragments thereof. In yet another embodiment, said nucleic acid sequence of interest encodes more than one antigen or functional fragments thereof.

The nucleic acid of interest may be of any size known in the art as being suitable as a PCR amplification template. In an embodiment, the sequence of interest is between 80 bp and 20000 bp, preferably between 80 bp and 15000 bp, more preferably between 80 bp and 10000 bp, more preferably between 80 bp and 5000 bp, more preferably between 80 bp and 3000 bp, more preferably between 80 bp and 2500 bp, more preferably between 80 bp and 2000 bp, more preferably between 80 bp and 1500 bp or more preferably between 80 bp and 1000 bp. Alternatively, the sequence of interest is between 1000 bp and 20000 bp, 1500 bp and 20000 bp, 2000 bp and 20000 bp, 2500 bp and 20000 bp, 3000 bp and 20000 bp, 5000 bp and 20000 bp, 10000 bp and 20000 bp or between 15000 bp and 20000 bp. The backbone DNA sequence may be any DNA backbone known in the art such as, but not limited to circular DNA such as plasmid DNA, genomic DNA, or a gene fragment.

A primer has a sequence complementary to its primer binding site. By convention, the forward primer is complementary to the noncoding antisense strand so the extended product is the coding strand, and the reverse primer to the coding sense strand so the extended product is the noncoding strand.

In an embodiment, the reverse primer has a length of between 30 and 530 nucleotides, between 30 and 500 nucleotides, between 30 and 450 nucleotides, between 30 and 400 nucleotides, between 30 and 350 nucleotides, between 30 and 300 nucleotides, between 30 and 250 nucleotides, between 30 and 200 nucleotides, between 30 and 150 nucleotides, between 30 and 100 nucleotides, or between 30 and 50 nucleotides.

Alternatively, the reverse primer has a length of between 50 and 530 nucleotides, between 100 and 530 nucleotides, between 150 and 530 nucleotides, between 200 and 530 nucleotides, between 250 and 530 nucleotides, between 300 and 530 nucleotides, between 350 and 530 nucleotides, between 400 and 530 nucleotides, between 450 and 530 nucleotides, or between 500 and 530 nucleotides.

In another embodiment, the reverse primer has a length of at least 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, or 530 nucleotides.

The reverse primer as disclosed herein contains a poly(dT) tail as well as a specific sequence that is long enough to hybridize or anneal at the primer binding site of the DNA sequence of interest. Such a long primer thus allows for the addition of a poly(dA/dT) tail at the end of the DNA sequence of interest. Such a strategy is ideal for the industrial-scale synthesis of IVT templates. The implementation of PCR already leads to a drastic reduction of synthesis times and volumes when compared to fermentation, and the direct incorporation of an end-terminating poly(dA/dT) tail further improves the efficiency of the process. Direct tailing using the long reverse primer ensures the production of tailed DNA templates at yields and lengths of 100% while removing the need for restriction enzymes, resulting in time and cost reduction. Moreover, the obtained DNA templates have blunt ends with no overhanging portions.

The reverse primer as disclosed herein has a stretch of thymine nucleotides, which is to be understood as a sequence of consecutive thymine nucleotides. Said stretch of thymine nucleotides is at least 20 nucleotides long in a preferred embodiment. In another embodiment, said stretch of thymine nucleotides is between 20 and 500 nucleotides long, between 20 and 450 nucleotides, between 20 and 400 nucleotides, between 20 and 350 nucleotides, between 20 and 300 nucleotides, between 20 and 250 nucleotides, between 20 and 200 nucleotides, between 20 and 150 nucleotides, between 20 and 100 nucleotides, or between 20 and 50 nucleotides.

Alternatively, the stretch of thymine nucleotides has a length of between 20 and 500 nucleotides, between 50 and 500 nucleotides, between 100 and 500 nucleotides, between 150 and 500 nucleotides, between 200 and 500 nucleotides, between 250 and 500 nucleotides, between 300 and 500 nucleotides, between 350 and 500 nucleotides, between 400 and 500 nucleotides, or between 450 and 500 nucleotides.

In another embodiment, the stretch of thymine nucleotides has a length of at least 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 nucleotides.

In an embodiment, the thymine stretch represents at least 16% of the total reverse primer length, preferably at least 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% or even more preferably 98%.

In a preferred embodiment of the method disclosed herein, said stretch of thymines is present at the 5' end of said reverse primer. The 3' end of the reverse primer is complementary to the primer binding site positioned at the 5' end of the sense strand of the nucleic acid of interest and thus the stretch of thymines is positioned downstream of said complementary region.
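The primer layout described above (thymine stretch at the 5' end, template-specific region at the 3' end) can be sketched in a few lines of Python. This is an illustrative sketch only and not part of the disclosed method: the function names and the example binding-site sequence are hypothetical, and the binding site is assumed to be supplied as the sense-strand sequence, written 5' to 3', to which the primer should anneal.

```python
# Illustrative sketch: assemble a reverse primer as 5'-(T)n-[annealing region]-3'.
# The binding-site sequence below is a made-up example, not taken from the disclosure.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_reverse_primer(binding_site_sense: str, t_len: int) -> str:
    """Place t_len thymines at the 5' end, followed by the region that
    anneals to the given sense-strand binding site."""
    if t_len < 20:
        raise ValueError("thymine stretch must be at least 20 nucleotides")
    primer = "T" * t_len + reverse_complement(binding_site_sense)
    if not 30 <= len(primer) <= 530:
        raise ValueError("reverse primer must be between 30 and 530 nucleotides")
    return primer


site = "GGTGTCTAAGCTAGCATCGA"  # hypothetical 20-nt binding site (sense strand)
primer = build_reverse_primer(site, t_len=60)
thymine_fraction = 60 / len(primer)  # share of the primer made up by the T stretch
print(len(primer), thymine_fraction)
```

With a 60-nucleotide thymine stretch and a 20-nucleotide annealing region, the thymine stretch accounts for 75% of the primer length, which falls within the percentage ranges listed above.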

In some embodiments, the stretch of thymine nucleotides is consecutive, while in other embodiments it is interrupted by a stabilizer sequence. By "stabilizer sequence" within the meaning of the invention is meant a sequence of one or more nucleotides that is encompassed within the stretch of thymines and that starts and finishes with a nucleotide that is not thymine. The inside of the stabilizer sequence may be comprised of any nucleotides. Integration of a stabilizer sequence into the stretch of thymine nucleotides may aid the specificity and/or contribute to the reduction of the length of the reverse primer.

In an embodiment, the stabilizer sequence comprises between 5 to 30 nucleotides. In another embodiment, said stabilizer sequence has a length of between 5 and 29 nucleotides, of between 5 and 28 nucleotides, of between 5 and 27 nucleotides, of between 5 and 26 nucleotides, of between 5 and 25 nucleotides, of between 5 and 24 nucleotides, of between 5 and 23 nucleotides, of between 5 and 22 nucleotides, of between 5 and 21 nucleotides, of between 5 and 20 nucleotides, of between 5 and 19 nucleotides, of between 5 and 18 nucleotides, of between 5 and 17 nucleotides, of between 5 and 16 nucleotides, of between 5 and 15 nucleotides, of between 5 and 14 nucleotides, of between 5 and 13 nucleotides, of between 5 and 12 nucleotides, of between 5 and 11 nucleotides, of between 5 and 10 nucleotides, of between 5 and 9 nucleotides, of between 5 and 8 nucleotides, of between 5 and 7 nucleotides, or between 5 and 6 nucleotides.

Alternatively, said stabilizer sequence has a length of between 6 and 30 nucleotides, of between 7 and 30 nucleotides, of between 8 and 30 nucleotides, of between 9 and 30 nucleotides, of between 10 and 30 nucleotides, of between 11 and 30 nucleotides, of between 12 and 30 nucleotides, of between 13 and 30 nucleotides, of between 14 and 30 nucleotides, of between 15 and 30 nucleotides, of between 16 and 30 nucleotides, of between 17 and 30 nucleotides, of between 18 and 30 nucleotides, of between 19 and 30 nucleotides, of between 20 and 30 nucleotides, of between 21 and 30 nucleotides, of between 22 and 30 nucleotides, of between 23 and 30 nucleotides, of between 24 and 30 nucleotides, of between 25 and 30 nucleotides, of between 26 and 30 nucleotides, of between 27 and 30 nucleotides, of between 28 and 30 nucleotides, or between 29 and 30 nucleotides.
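The stabilizer definition and length range given above can be checked programmatically. The following is a minimal sketch, assuming a tail that contains at most one stabilizer flanked by thymines on both sides; the function names and the example sequence are hypothetical.

```python
# Illustrative sketch: locate a single stabilizer inside a thymine tail and
# check it against the 5-30 nucleotide range described in the text.
def find_stabilizer(tail: str) -> str:
    """Strip the flanking thymines; what remains starts and ends with a
    non-thymine nucleotide. Returns '' for an uninterrupted T stretch."""
    return tail.upper().strip("T")


def is_valid_stabilizer(stabilizer: str, min_len: int = 5, max_len: int = 30) -> bool:
    """True if the stabilizer has an allowed length and non-T boundaries."""
    return (
        min_len <= len(stabilizer) <= max_len
        and stabilizer[0] != "T"
        and stabilizer[-1] != "T"
    )


tail = "T" * 40 + "GCATCG" + "T" * 40  # hypothetical interrupted tail
stabilizer = find_stabilizer(tail)
print(stabilizer, is_valid_stabilizer(stabilizer))
```

Thymines inside the stabilizer are left untouched, which matches the statement that the inside of the stabilizer may be comprised of any nucleotides.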

In a further embodiment, the stabilizer is rich in guanine (G) and/or cytosine (C) nucleotides and contains between 40% and 60% of G and/or C nucleotides. In a preferred embodiment, the stabilizer contains the 4 nucleotides (A, T, C and G) at 25% each.

In an embodiment of the method as disclosed herein, the forward primer has a length of between 5 and 50 nucleotides, between 5 and 45 nucleotides, between 5 and 40 nucleotides, between 5 and 35 nucleotides, between 5 and 30 nucleotides, between 5 and 25 nucleotides, between 5 and 20 nucleotides, between 5 and 15 nucleotides, or between 5 and 10 nucleotides.

Alternatively, the forward primer has a length of between 10 and 50 nucleotides, between 15 and 50 nucleotides, between 20 and 50 nucleotides, between 25 and 50 nucleotides, between 30 and 50 nucleotides, between 35 and 50 nucleotides, between 40 and 50 nucleotides or, between 45 and 50 nucleotides.

The forward primer is designed to anneal or hybridize to the primer binding site positioned at the 5' end of the antisense DNA strand of the nucleic acid sequence of interest.

In an embodiment of the method, the nucleic acid of interest is amplified in a PCR reaction with a specific pair of primers, forward and reverse as disclosed herein. Although having different lengths, said forward and reverse primers have similar PCR characteristics that ensure optimal amplification efficiency. Generation of self-dimers and hairpins is avoided by design, using bioinformatic tools. Any bioinformatic tool known in the art may be used for identifying primer binding sites in the nucleic acid of interest and designing the pair of primers as disclosed herein.

Selection of primer binding sites and primers can be performed by computer-implemented analysis of a target nucleic acid in a computer programmed by non-transitory computer-readable storage media. The sequence of a target nucleic acid (one or both strands) is received in a computer. The computer also stores or receives by user input desired nucleotide compositions of primers (e.g., A, T, C). The computer is then programmed to search the target sequence to identify forward and reverse primer binding sites within a distance of one another compatible with amplification that most closely corresponds to the primer composition. If the primer composition is A, T, C, then forward and reverse primer binding sites should most closely correspond to A, T and G. The computer can identify forward and reverse primer binding sites on opposite strands or can identify a complement of the forward primer binding sites and reverse primer binding site on the same strand and calculate the forward primer binding site from its complement. The computer can then provide an output of candidate pairs of primer binding sites, which may differ to varying degrees from the ideal composition sought. The computer can also show primer designs that hybridize to each of the primer binding site pairs. Multiple primer designs can be shown for the same primer binding site pair with different numbers of units of the underrepresented nucleotide and different numbers of mismatches.
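The composition-based search described above can be sketched as a sliding-window scan. This is a simplified illustration under stated assumptions, not the disclosed software: it scans a single strand only, uses a fixed window length, and the function name, parameters and example target are hypothetical.

```python
# Illustrative sketch: rank candidate binding sites on one strand by how
# closely they match the complement of the desired primer composition.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def rank_binding_sites(target: str, primer_composition: str = "ATC",
                       window: int = 20, top: int = 3):
    """Return the `top` windows with the fewest off-composition bases as
    (off_composition_count, start_index, site_sequence) tuples."""
    # A primer made of A, T, C pairs with a binding site made of T, A, G.
    site_alphabet = {COMPLEMENT[base] for base in primer_composition.upper()}
    target = target.upper()
    scored = []
    for start in range(len(target) - window + 1):
        site = target[start:start + window]
        off_composition = sum(base not in site_alphabet for base in site)
        scored.append((off_composition, start, site))
    return sorted(scored)[:top]


target = "CCCCC" + "ATGATGATGA" + "CCCCC"  # hypothetical target strand
print(rank_binding_sites(target, "ATC", window=10, top=1))
```

A fuller implementation would repeat the scan on the opposite strand and pair forward and reverse candidates that lie within an amplifiable distance of one another, as the paragraph above outlines.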

Any software and algorithm known in the art may be used for the identification of primer binding sites and primer design.

The Tm is the temperature at which half of the DNA duplex, formed by the nucleic acid of interest and at least one primer, will dissociate to become single-stranded. The Tm is an indicator of duplex stability.

A predictive Tm of the pair of primers disclosed herein may be calculated using the formula: Tm = 64.9 + 41*(yG + zC - 16.4)/(wA + xT + yG + zC), where w, x, y, and z are the number of the bases A, T, G, and C in the sequence.
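As a worked illustration of the formula above, the following Python sketch counts the bases of a primer and applies the expression directly. The function name and the example sequence are hypothetical; for the reverse primer, the sequence passed in would be the annealing region without the poly(dT) stretch.

```python
# Illustrative sketch: predictive Tm from base counts, using the formula
# Tm = 64.9 + 41*(yG + zC - 16.4)/(wA + xT + yG + zC).
def predicted_tm(seq: str) -> float:
    """Return the predictive Tm in degrees Celsius for a DNA sequence."""
    seq = seq.upper()
    w, x, y, z = (seq.count(base) for base in "ATGC")
    total = w + x + y + z
    if total == 0:
        raise ValueError("sequence contains no A, T, G or C")
    return 64.9 + 41 * (y + z - 16.4) / total


example = "ATGC" * 5  # hypothetical 20-mer with 10 G/C bases
print(round(predicted_tm(example), 2))
```

For this 20-mer, the calculation is 64.9 + 41 × (10 − 16.4) / 20 = 64.9 − 13.12 = 51.78°C.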

The Tm of said forward primer and reverse primer excluding the poly(dT) stretch is preferably between 50°C and 75°C. In some embodiments, the Tm of said forward and reverse primers is between 50°C and 74°C, between 50°C and 73°C, between 50°C and 72°C, between 50°C and 71°C, between 50°C and 70°C, between 50°C and 69°C, between 50°C and 68°C, between 50°C and 67°C, between 50°C and 66°C, between 50°C and 65°C, between 50°C and 64°C, between 50°C and 63°C, between 50°C and 62°C, between 50°C and 61°C, between 50°C and 60°C, between 50°C and 59°C, between 50°C and 58°C, between 50°C and 57°C, between 50°C and 56°C, between 50°C and 55°C, between 50°C and 54°C, between 50°C and 53°C, between 50°C and 52°C, or between 50°C and 51°C.

Alternatively, the Tm of said forward primer and reverse primer excluding the poly(dT) stretch is between 51°C and 75°C, between 52°C and 75°C, between 53°C and 75°C, between 54°C and 75°C, between 55°C and 75°C, between 56°C and 75°C, between 57°C and 75°C, between 58°C and 75°C, between 59°C and 75°C, between 60°C and 75°C, between 61°C and 75°C, between 62°C and 75°C, between 63°C and 75°C, between 64°C and 75°C, between 65°C and 75°C, between 66°C and 75°C, between 67°C and 75°C, between 68°C and 75°C, between 69°C and 75°C, between 70°C and 75°C, between 71°C and 75°C, between 72°C and 75°C, between 73°C and 75°C, or between 74°C and 75°C.

In an embodiment of the method, the difference in Tm between the reverse primer excluding the poly(dT) stretch and the forward primer is between 0°C and 10°C. In an embodiment, said difference in Tm between the reverse primer excluding the poly(dT) stretch and the forward primer is between 0.5°C and 10°C, 1°C and 10°C, 1.5°C and 10°C, 2°C and 10°C, 2.5°C and 10°C, 3°C and 10°C, 3.5°C and 10°C, 4°C and 10°C, 4.5°C and 10°C, 5°C and 10°C, 5.5°C and 10°C, 6°C and 10°C, 6.5°C and 10°C, 7°C and 10°C, 7.5°C and 10°C, 8°C and 10°C, 8.5°C and 10°C, 9°C and 10°C or 9.5°C and 10°C.

Alternatively, said Tm difference is between 0°C and 9.5°C, 0°C and 9°C, 0°C and 8.5°C, 0°C and 8°C, 0°C and 7.5°C, 0°C and 7°C, 0°C and 6.5°C, 0°C and 6°C, 0°C and 5.5°C, 0°C and 5°C, 0°C and 4.5°C, 0°C and 4°C, 0°C and 3.5°C, 0°C and 3°C, 0°C and 2.5°C, 0°C and 2°C, 0°C and 1.5°C, 0°C and 1°C or between 0°C and 0.5°C. The Tm of the forward primer was designed as a function of the Tm of the reverse primer. In this way, both primers anneal at the same time.
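The two Tm conditions described above (each Tm within a window, and a bounded difference between the two) can be combined into a single check. The sketch below is illustrative only; the function name is hypothetical, the default limits are the broadest ranges stated in the text, and the Tm values are assumed to have been computed beforehand, with the poly(dT) stretch excluded for the reverse primer.

```python
# Illustrative sketch: check a primer pair against the Tm window and the
# maximum Tm difference described in the text.
def tm_pair_ok(tm_forward: float, tm_reverse_annealing: float,
               low: float = 50.0, high: float = 75.0,
               max_delta: float = 10.0) -> bool:
    """True if both Tm values lie in [low, high] and differ by at most max_delta."""
    in_window = all(low <= tm <= high for tm in (tm_forward, tm_reverse_annealing))
    return in_window and abs(tm_forward - tm_reverse_annealing) <= max_delta


print(tm_pair_ok(60.0, 64.0))  # both within the window, 4 degrees apart
print(tm_pair_ok(52.0, 74.0))  # within the window but 22 degrees apart
```

Tightening `max_delta` narrows the check toward the smaller difference ranges listed above.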

DNA with high GC content is characterized by higher thermostability than DNA with lower GC content and coding regions of a genome are characterized by having a higher GC content in contrast to the non-coding region. Binding between G and C nucleotides is formed by three hydrogen bonds, compared to only two between A and T pairs. Therefore, G and C pairs are considered to have stronger binding than A and T pairs.

The forward primer and the part of the reverse primer excluding the poly(T) stretch preferably have a GC content of between 10% and 90%, preferably between 40% and 60%, in an embodiment of the method disclosed herein.

In other embodiments, the forward primer and the part of the reverse primer excluding the poly(T) stretch have a GC content of between 20% and 90%, between 30% and 90%, between 40% and 90%, between 50% and 90%, between 60% and 90%, between 70% and 90%, or between 80% and 90%.

Alternatively, said GC content is between 10% and 80%, between 10% and 70%, between 10% and 60%, between 10% and 50%, between 10% and 40%, between 10% and 30%, or between 10% and 20%.

In another embodiment, said GC content is between 40% and 55%, between 40% and 50%, or between 40% and 45%. In yet another embodiment, said GC content is between 45% and 60%, between 50% and 60%, or between 55% and 60%.
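The GC content ranges above apply to the forward primer and to the reverse primer without its thymine stretch. A minimal sketch of that calculation follows; the function names and the example primer are hypothetical, and the length of the 5' thymine stretch is assumed to be known so that thymines belonging to the annealing region are not removed by mistake.

```python
# Illustrative sketch: GC content of a primer, and of a reverse primer's
# annealing region once the 5' thymine stretch has been set aside.
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)


def annealing_gc_percent(reverse_primer: str, t_len: int) -> float:
    """GC content of the reverse primer excluding its first t_len thymines."""
    return gc_percent(reverse_primer[t_len:])


reverse_primer = "T" * 30 + "GCGCATATGC"  # hypothetical primer
print(annealing_gc_percent(reverse_primer, t_len=30))
```

In this example, 6 of the 10 annealing bases are G or C, giving 60%, the upper bound of the preferred 40% to 60% range.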

The high GC content of the forward primer and the part of the reverse primer excluding the poly(T) stretch reflects the GC content of the sequence of interest they anneal to during the PCR amplification.

In some embodiments of the method as disclosed herein, the forward and/or reverse primer comprises a terminal GC clamp. A "GC clamp" as used in the present disclosure is defined as the presence of a G or C nucleotide in the last 5 nucleotides (the 3' end) of a PCR primer. The presence of a GC clamp in a PCR primer can help to improve the specificity of primer binding to the complementary sequence. Since G and C nucleotide pairs have superior binding, placing 1-2 of these nucleotides at the end of the primer will encourage complete primer binding. However, it is not recommended to include more than 2 G or C nucleotides in the last 5 nucleotides of a primer. Doing so can actually have adverse effects by increasing the primer melting temperature and reducing primer specificity.
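The GC clamp guideline above reduces to counting G and C bases in the last five nucleotides of the primer. The following sketch illustrates it; the function names and example primers are hypothetical, and the limit of two follows the recommendation stated in the text.

```python
# Illustrative sketch: count G/C bases in the 3'-terminal five nucleotides
# and test for a clamp of one or two such bases.
def gc_in_last_five(primer: str) -> int:
    """Number of G or C bases among the last five nucleotides (3' end)."""
    return sum(base in "GC" for base in primer.upper()[-5:])


def has_gc_clamp(primer: str, max_gc: int = 2) -> bool:
    """True if the 3' end holds at least one and at most max_gc G/C bases."""
    return 1 <= gc_in_last_five(primer) <= max_gc


for candidate in ("ATATATATAG", "ATATATGCGC", "ATATATATAT"):
    print(candidate, gc_in_last_five(candidate), has_gc_clamp(candidate))
```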

In an embodiment of the method as disclosed herein, the forward primer anneals to at least a portion of said sequence of interest and/or a sequence upstream of said sequence of interest. Said forward primer may anneal to any position of said sequence of interest, preferably immediately downstream to the promoter comprised in the sequence of interest. In other embodiments, the forward primer anneals to a position up to 4000 nucleotides, 3900 nucleotides, 3800 nucleotides, 3700 nucleotides, 3600 nucleotides, 3500 nucleotides, 3400 nucleotides, 3300 nucleotides, 3200 nucleotides, 3100 nucleotides, 3000 nucleotides, 2900 nucleotides, 2800 nucleotides, 2700 nucleotides, 2600 nucleotides, 2500 nucleotides, 2400 nucleotides, 2300 nucleotides, 2200 nucleotides, 2100 nucleotides, 2000 nucleotides, 1900 nucleotides, 1800 nucleotides, 1700 nucleotides, 1600 nucleotides, 1500 nucleotides, 1400 nucleotides, 1300 nucleotides, 1200 nucleotides, 1100 nucleotides, 1000 nucleotides, 900 nucleotides, 800 nucleotides, 700 nucleotides, 600 nucleotides, 500 nucleotides, 400 nucleotides, 300 nucleotides, 200 nucleotides, 100 nucleotides, 90 nucleotides, 80 nucleotides, 70 nucleotides, 60 nucleotides, 50 nucleotides, 40 nucleotides, 30 nucleotides, 20 nucleotides, 10 nucleotides, 9 nucleotides, 8 nucleotides, 7 nucleotides, 6 nucleotides, 5 nucleotides, 4 nucleotides, 3 nucleotides, 2 nucleotides, or 1 nucleotide upstream of the promoter comprised in the sequence of interest. In an embodiment, a portion of the reverse primer anneals to at least a portion of said sequence of interest and/or a sequence downstream of said sequence of interest. In a further embodiment, the portion of the reverse primer that anneals to at least a portion of said sequence of interest and/or a sequence downstream of said sequence of interest excludes the poly(T) stretch. 
In another embodiment, the portion of the reverse primer that anneals to at least a portion of said sequence of interest and/or a sequence downstream of said sequence of interest includes portions of the poly(T) stretch. In yet another embodiment, the entire reverse primer anneals to at least a portion of said sequence of interest and/or a sequence downstream of said sequence of interest. In a preferred embodiment, a portion of the reverse primer anneals to the 3'UTR of the sequence of interest.

In some embodiments of the method as disclosed herein, the forward and/or reverse primer may comprise in addition to the sequences that anneal to the DNA sequence of interest, one or more new sequences. During the PCR amplification, said new sequences are incorporated into the sequence of interest.

In a preferred embodiment, the forward primer comprises a sequence corresponding to a promoter region. Any promoter known in the art can be comprised in the forward primer. Non-limiting examples include T3 and T7 promoters.

In addition, the forward primer may harbor additional sequences, such as an RNA polymerase promoter sequence and/or any other functional sequences, such as transcription and/or translation regulating sequences. Without wishing to be bound to theory, functional sequences are understood as those sequences that (1) are required for replication and structural integrity of the chromosome, (2) encode functional products (RNAs and derived proteins), or (3) are required for the correct four-dimensional expression (regulation or processing) of these products during ontogeny and homeostasis.

In some embodiments of the method, the forward primer comprises a 5'UTR, and/or the reverse primer comprises a 3'UTR sequence, which is incorporated in the sequence of interest after PCR. By incorporating various functional sequences, such as promoters and/or UTRs, the method as disclosed herein allows the provision of IVT templates from DNA of any origin, wherein said templates may be directly used for the synthesis of mRNA.

In an embodiment of the method, the obtained polynucleotide template for in vitro transcription has a longer sequence than the backbone DNA sequence. With the incorporation of various functional sequences via the primers that contain new sequences, the amplicons of the sequence of interest are longer than the original backbone DNA. The added sequence that is not contained in the backbone DNA, may be a functional sequence or a poly(dA/dT) tail.
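The lengthening of the amplicon by primer-borne sequences can be illustrated with a simple in-silico assembly. This sketch rests on several assumptions that are not taken from the disclosure: the amplified region is given as its sense strand, the forward primer carries a 5' extension holding a commonly cited T7 promoter core sequence, and the reverse primer carries a 100-nucleotide thymine stretch as its 5' extension. The function names and the template sequence are hypothetical.

```python
# Illustrative sketch: sense strand of the amplicon after primer extensions
# have been incorporated at both ends.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon_sense(region_sense: str, fwd_extension: str, rev_extension: str) -> str:
    """Forward extension + amplified region + reverse complement of the
    reverse primer's 5' extension (a T stretch becomes an A tail)."""
    return fwd_extension.upper() + region_sense.upper() + reverse_complement(rev_extension)


T7_CORE = "TAATACGACTCACTATAG"  # assumed promoter sequence; verify for a real design
region = "ATGGCTAGCAAGGGCGAGGAGCTGTTCACCTAA"  # hypothetical amplified region
amplicon = amplicon_sense(region, T7_CORE, "T" * 100)
print(len(region), len(amplicon))
```

The amplicon is longer than the amplified region by the combined length of the two extensions and ends in a run of 100 adenines on the sense strand.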

In some embodiments of the method disclosed herein, the forward and reverse primers are modified to incorporate detectable and/or selectable markers. Said markers are incorporated in the sequence of interest during amplification and allow visualization and/or purification of the amplicons carrying it from the reaction pool. More specifically, the forward and reverse primers are modified to incorporate fluorophores, biotin, non-natural amino acids, and/or abasic sites.

Fluorophores are chemical compounds which, when excited by exposure to a particular stimulus such as a defined wavelength of light, emit light. Incorporation of fluorophores into the sequence of interest aids visualization of the amplicons. Any fluorophores known in the art to be able to label DNA may be linked to the primers and incorporated into the sequence of interest. Non-limiting examples include: acetamido-4'-isothiocyanatostilbene-2,2' disulfonic acid; acridine and derivatives such as acridine and acridine isothiocyanate, 5-(2'-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS), 4-amino-N-[3-(vinylsulfonyl)phenyl]naphthalimide-3,5 disulfonate (Lucifer Yellow VS), N-(4-anilino-1-naphthyl)maleimide, anthranilamide; Brilliant Yellow; coumarin and derivatives such as coumarin, 7-amino-4-methylcoumarin (AMC, Coumarin 120), 7-amino-4-trifluoromethylcoumarin (Coumarin 151); cyanosine; 4',6-diamidino-2-phenylindole (DAPI); 5',5"-dibromopyrogallol-sulfonephthalein (Bromopyrogallol Red); 7-diethylamino-3-(4'-isothiocyanatophenyl)-4-methylcoumarin; diethylenetriamine pentaacetate; 4,4'-diisothiocyanatodihydro-stilbene-2,2'-disulfonic acid; 4,4'-diisothiocyanatostilbene-2,2'-disulfonic acid; 5-[dimethylamino]naphthalene-1-sulfonyl chloride (DNS, dansyl chloride); 4-dimethylaminophenylazophenyl-4'-isothiocyanate (DABITC); eosin and derivatives such as eosin and eosin isothiocyanate; erythrosin and derivatives such as erythrosin B and erythrosin isothiocyanate; ethidium; fluorescein and derivatives such as 5-carboxyfluorescein (FAM), 5-(4,6-dichlorotriazin-2-yl)aminofluorescein (DTAF), 2'7'-dimethoxy-4'5'-dichloro-6-carboxyfluorescein (JOE), fluorescein, fluorescein isothiocyanate (FITC), QFITC (XRITC), -6-carboxy-fluorescein (HEX), and TET (Tetramethyl fluorescein); fluorescamine; IR144; IR1446; Malachite Green isothiocyanate; 4-methylumbelliferone; ortho cresolphthalein; nitrotyrosine; pararosaniline; Phenol Red; B-phycoerythrin; o-phthaldialdehyde; pyrene and derivatives such as pyrene, pyrene butyrate and succinimidyl 1-pyrene butyrate; Reactive Red 4 (CIBACRON™ Brilliant Red 3B-A); rhodamine and derivatives such as 6-carboxy-X-rhodamine (ROX), 6-carboxyrhodamine (R6G), lissamine rhodamine B sulfonyl chloride, rhodamine (Rhod), rhodamine B, rhodamine 123, rhodamine X isothiocyanate, N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA), tetramethyl rhodamine, and tetramethyl rhodamine isothiocyanate (TRITC); sulforhodamine B; sulforhodamine 101 and sulfonyl chloride derivative of sulforhodamine 101 (Texas Red); riboflavin; rosolic acid and terbium chelate derivatives; LightCycler Red 640; Cy5.5; and Cy56-carboxyfluorescein; boron dipyrromethene difluoride (BODIPY); acridine; stilbene; 6-carboxy-X-rhodamine (ROX); Texas Red; Cy3; Cy5; LC Red 640; LC Red 705; and Yakima yellow amongst others.

When primers labeled with biotin are used, the biotinylated amplicons obtained after PCR can be purified or detected, preferably in the presence of streptavidin, which has a high affinity for biotin.

Non-natural amino acids are amino acid-like compounds that are similar in structure and/or overall shape to one or more of the twenty L-amino acids commonly found in naturally occurring proteins. In some embodiments of the method as disclosed herein, the primers are linked to one or more non-natural amino acids and thus the amplicons obtained after the PCR reaction are linked to said one or more non-natural amino acids. The non-natural amino acids generally have a reactive moiety with a functional group that can be specifically reacted with a reactive moiety on a reagent, which reagent further comprises a functional moiety, such as a fluorescent moiety, an affinity moiety, etc. Once the reactive moiety on the non-natural amino acid reacts with the reactive moiety on the reagent, the functional group becomes associated with the amplicon linked to said non-natural amino acid, thereby facilitating the labeling, detecting, monitoring, isolating, and/or identifying of said amplicon. Any non-natural amino acids known in the art to be able to label DNA may be linked to the primers and incorporated into the sequence of interest. Non-limiting examples include: azidohomoalanine, homopropargylglycine, p-bromophenylalanine, p-iodophenylalanine, azidophenylalanine, acetylphenylalanine and ethynylphenylalanine.

In some embodiments of the method as disclosed herein, the primers are modified to include abasic sites (AP sites) and thus, the PCR amplicon of the sequence of interest also contains AP sites. AP sites are DNA locations that have neither a purine nor a pyrimidine base when deoxyribose is cleaved. The AP sites are very reactive, fluctuating between a furanose ring and an open-chain free aldehyde and free alcohol conformation. In some embodiments of the method disclosed herein, the AP sites integrated into the amplicons after PCR may be labeled using aldehyde reactive probes. Aldehyde reactive probes are chemical probes containing hydroxylamine, which reacts with the open-chain aldehydic form of the AP site, and mostly use affinity groups for enrichment and detection. Any aldehyde reactive probes known in the art may be used for the detection, visualization and/or capture of the amplicons containing AP sites, such as but not limited to N-aminooxyacetyl-N'-D-biotinoyl hydrazine.

In some embodiments of the method disclosed herein, the forward and reverse primers are modified to comprise phosphorothioate bonds (PS), locked nucleic acids (LNA), and/or peptide nucleic acid (PNA).

In phosphorothioate (PS) bonds, a sulfur atom is substituted for a non-bridging oxygen in the phosphate backbone of an oligonucleotide. This modification renders the internucleotide linkage resistant to nuclease degradation. Phosphorothioate bonds can be introduced between the last 3-5 nucleotides at the 5'- or 3'-end of the primers to inhibit exonuclease degradation.

Locked nucleic acids (LNAs) are modified RNA monomers wherein a methylene bridge links the 2' oxygen to the 4' carbon of the RNA pentose ring. The bridge fixes the pentose ring in the 3'-endo conformation. When incorporated into the primers as disclosed herein, LNAs impart heightened structural stability, resulting in increased hybridization Tm and increased resistance to nucleases.

Peptide nucleic acids (PNAs) are DNA analogs where the phosphate-ribose backbone is substituted with a peptide-like amide backbone (e.g. N-(2-aminoethyl) glycine). When the primers according to the present disclosure comprise PNAs, they display increased affinity and stability of the binding to the sequence of interest.

In a preferred embodiment of the method disclosed herein, the nucleic acid of interest is amplified by PCR. The amplification conditions are usually similar to conventional conditions in terms of buffers, Mg2+, enzymes, temperatures, number of cycles, and so forth. Conventional amplification is performed with all four standard nucleotide types present as dNTP monomers.

In an embodiment of the method as disclosed herein, the DNA polymerase used for the reaction is Q5 DNA polymerase or Platinum SuperFi II DNA polymerase, which present the highest fidelity described so far (10^-7 errors/bp/duplication). Alternatively, KOD DNA polymerase, Pfu DNA polymerase or Thermococcus pacificus DNA polymerase may also be used.

In an embodiment, the primer concentration in the reaction mixture is between 0.05 µM and 10 µM, preferably between 0.4 µM and 1 µM. In a further embodiment, the concentration of the dNTP monomers in the reaction mixture is between 0.02 mM and 1 mM, preferably between 0.2 mM and 0.4 mM. In yet a further embodiment, the concentration of the template in the reaction mixture is between 0.05 ng/µl and 1 ng/µl, preferably between 0.6 ng/µl and 0.25 ng/µl. In some further embodiments, the concentration of the polymerase in the reaction mixture is between 0.001 U/µl and 1 U/µl, preferably between 0.005 U/µl and 0.05 U/µl.

In an embodiment of the method as disclosed herein, the PCR reaction is carried out in a reaction volume of between 50 µl and 300 µl, between 75 µl and 300 µl, between 100 µl and 300 µl, between 125 µl and 300 µl, between 150 µl and 300 µl, between 175 µl and 300 µl, between 200 µl and 300 µl, between 225 µl and 300 µl, between 250 µl and 300 µl or between 275 µl and 300 µl.

Alternatively, the PCR reaction is carried out in a reaction volume of between 50 µl and 275 µl, between 50 µl and 250 µl, between 50 µl and 225 µl, between 50 µl and 200 µl, between 50 µl and 175 µl, between 50 µl and 150 µl, between 50 µl and 125 µl, between 50 µl and 100 µl or between 50 µl and 75 µl.
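For a chosen reaction volume, the amount of each stock solution needed to reach a target final concentration follows from the dilution relation C_stock × V_stock = C_final × V_reaction. The sketch below illustrates the arithmetic. The stock concentrations (10 µM primer, 10 mM dNTPs) and the function name are assumptions chosen for illustration; the disclosure does not specify them, and concentrations are assumed to be in micromolar or millimolar with volumes in microlitres.

```python
# Illustrative sketch: stock volume needed to reach a final concentration,
# from C_stock * V_stock = C_final * V_reaction.
def stock_volume_ul(final_conc: float, stock_conc: float,
                    reaction_volume_ul: float) -> float:
    """Volume of stock (in microlitres) to add; both concentrations must be
    expressed in the same unit."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc * reaction_volume_ul / stock_conc


reaction_ul = 200.0  # within the 50-300 microlitre range described above
primer_ul = stock_volume_ul(0.5, 10.0, reaction_ul)  # 0.5 uM from an assumed 10 uM stock
dntp_ul = stock_volume_ul(0.2, 10.0, reaction_ul)    # 0.2 mM from an assumed 10 mM stock
print(primer_ul, dntp_ul)
```

Under these assumed stocks, a 200 µl reaction takes 10 µl of each primer and 4 µl of dNTP mix.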

Carrying out the PCR reaction in a high volume ensures an increased yield of the PCR product and improved sensitivity of the PCR reaction, is more economical and time-efficient, and reduces labour and waste. Moreover, the large reaction volumes enhance the efficiency of purification and reduce sample handling errors. The method as disclosed herein is thus highly effective in delivering high quantities of tailed DNA that may be used as templates for IVT reactions.

In an embodiment of the method disclosed herein, the PCR reaction comprises an initial denaturation that occurs at between 94°C and 98°C followed by between 20 and 40 amplification cycles. Each amplification cycle comprises a denaturation phase at 98°C, an annealing phase at between 50°C and 72°C, and an extension phase at 72°C. A final extension phase takes place at 72°C. It would be obvious to a person skilled in the art that these examples are non-limiting, as the temperatures of the PCR reaction depend on the design and composition of the primer set used.

In an embodiment of the method as disclosed herein, the amplification cycles of the PCR reaction comprise two phases, a denaturation phase at 98°C and an annealing-extension phase at 72°C. By integrating the annealing and extension phases of PCR into a single step, several advantages are achieved, including streamlined workflow, reduced reaction time, cost-effectiveness, reduced risk of contamination, simplicity and optimal usage of reagents, consumables and resources.
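The difference between the three-phase and two-phase cycles can be made concrete by writing each cycle as a list of (phase, temperature) pairs. This is an illustrative sketch: the function names, the default annealing temperature and the cycle count are hypothetical, and hold times are omitted because the text does not state them.

```python
# Illustrative sketch: per-cycle temperature phases for the three-phase and
# two-phase amplification cycles described in the text.
def cycle_phases(two_step: bool, annealing_c: float = 60.0):
    """Return the ordered (phase, temperature in Celsius) pairs of one cycle."""
    if two_step:
        return [("denaturation", 98.0), ("annealing-extension", 72.0)]
    if not 50.0 <= annealing_c <= 72.0:
        raise ValueError("annealing temperature must be between 50 and 72 Celsius")
    return [("denaturation", 98.0), ("annealing", annealing_c), ("extension", 72.0)]


def total_phase_count(cycles: int, two_step: bool) -> int:
    """Number of temperature phases run across all amplification cycles."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return cycles * len(cycle_phases(two_step))


print(total_phase_count(30, two_step=False), total_phase_count(30, two_step=True))
```

Over 30 cycles, the two-phase program runs 60 temperature phases instead of 90, which is one way to picture the reduced reaction time mentioned above.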

Advantageously, this method does not require the post-PCR digestion of methylated and hemimethylated DNA from the PCR product with restriction enzymes, such as DpnI. The combination of high reaction volume and amplification in only two phases ensures a high amplicon yield after the PCR reaction. Therefore, the PCR amplicon is directly purified after the PCR reaction, reducing the time and cost of the method, simplifying the protocol, reducing the risk of variability between batches and increasing the throughput. The high reaction volume, two-step PCR and bypassing of post-PCR digestion ensure that the method as disclosed herein is significantly faster and more efficient than other methods known in the art.

The method disclosed herein allows the length-controlled incorporation of an end-terminating poly(dA/dT) tail to any DNA sequence of interest using a PCR reaction. Large quantities of tailed DNA fragments can thus be obtained and used directly for In Vitro Transcription (IVT), without the need to perform an additional restriction step. In addition, this method can be applied directly during a gene assembly process and can be automated and parallelized for high-throughput screening of templates. IVT templates, such as single nucleotide polymorphism containing templates and/or promoter mutant templates, and UTR sequences may be screened using the method disclosed herein.

In a second aspect, the present disclosure relates to a kit for obtaining a polynucleotide template for in vitro transcription, comprising:

- a forward primer;

- a reverse primer having a length of between 30 to 530 nucleotides and comprising a stretch of thymine nucleotides of at least 20 nucleotides long;

- a DNA polymerase I; and

- dNTPs.

Said kit comprises a forward and a reverse primer, as disclosed in previous embodiments, a DNA polymerase, and nucleotides and is used for PCR amplification of a sequence of interest. DNA polymerase I catalyzes the polymerization of nucleotides, directed by the sequence of interest, into a double-stranded DNA in a 5' to 3' direction.

By using the kit as disclosed herein it is possible to amplify a DNA sequence of interest and obtain a template for IVT. The templates for IVT obtained using said kit possess poly(dT/dA) tails of controlled size. Using such a kit, it is thus possible to eliminate a step in the IVT process, namely the post-transcriptional tailing of mRNA.

In a third aspect, the present invention relates to a nucleotide primer set comprising a forward and reverse primer, wherein said nucleotide primer set is to be used in a polymerase chain reaction for amplifying a sequence of interest, wherein said reverse primer has a nucleotide length of between 30 to 500 nucleotides and comprises a stretch of thymine nucleotides of at least 30 nucleotides long. The forward and reverse primers of said primer set are according to any of the previously disclosed embodiments.
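The two constraints stated for the reverse primer of this primer set (overall length and a minimum uninterrupted thymine stretch) are easy to check programmatically. The following Python sketch assumes the limits given in the paragraph above as defaults; the function names and the example sequences are hypothetical and are not sequences from the disclosure.

```python
def longest_t_run(seq: str) -> int:
    """Length of the longest uninterrupted run of T in the sequence."""
    best = run = 0
    for base in seq.upper():
        run = run + 1 if base == "T" else 0
        best = max(best, run)
    return best


def check_reverse_primer(seq: str, min_len: int = 30, max_len: int = 500,
                         min_t_run: int = 30) -> list:
    """Return a list of problems; an empty list means the primer passes."""
    problems = []
    if set(seq.upper()) - set("ACGT"):
        problems.append("sequence contains characters other than A, C, G, T")
    if not min_len <= len(seq) <= max_len:
        problems.append(f"length {len(seq)} outside {min_len}-{max_len} nt")
    if longest_t_run(seq) < min_t_run:
        problems.append(f"longest T stretch {longest_t_run(seq)} < {min_t_run} nt")
    return problems


if __name__ == "__main__":
    # Made-up primer: 40 T followed by a 20 nt template-specific part.
    candidate = "T" * 40 + "GCATCGATCGGATCCTAGCA"
    print(check_reverse_primer(candidate) or "primer satisfies both constraints")
```

The defaults can be overridden, for example `check_reverse_primer(seq, max_len=530, min_t_run=20)` for the limits recited for the method and the kit.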

In a final aspect, the disclosure relates to a method of in vitro transcription, wherein said in vitro transcription occurs on an amplicon that is obtained by means of the method according to any of the previous embodiments.

In some embodiments of the method, the sequence of interest resulting from the amplification is directly used as a template for IVT. In other embodiments said sequence of interest is purified before being used as a template for IVT. Any purification method known in the art may be employed to purify the DNA before it is used as an IVT template. Non-limiting examples of purification methods are phenol/chloroform extraction, ethanol precipitation, commercial purification kits, silica-based capture, or any chromatographic method. The purification of the amplicon prior to IVT is required especially when compliance with GMP standards is a prerequisite.

The purified/unpurified DNA template is used in an IVT system. Said system typically comprises a transcription buffer, NTPs, an RNase inhibitor, and an RNA polymerase. Any RNA polymerase known in the art may be used, such as T3, T7, SP6, and the like.

In order to be translatable, the mRNA transcript must be capped. In some embodiments of the method, the mRNA is co-transcriptionally capped. In other embodiments, the mRNA is post-transcriptionally enzymatically capped. Any cap analog known in the art may be added to the mRNA obtained according to the method disclosed herein.

In an embodiment of the method disclosed herein, the DNA template used for IVT already contains a DNA-encoded cap or an exoribonuclease-resistant RNA structure (xrRNA). Preferably said cap was introduced in the template during the PCR reaction of the sequence of interest, using the primer pair as disclosed in previous embodiments. The forward primer may be designed in such a way that in addition to the sequence annealing to the nucleic acid of interest, it comprises a cap sequence at the 5' end.

Forward primers anneal at the 5' region upstream of the nucleic acid of interest.

The capped mRNA obtained by the method disclosed herein is further purified using any method known in the art such as liquid chromatography, ion-exchange chromatography, HPLC, reverse phase HPLC, oligo dT affinity purification, tangential flow filtration, ultrafiltration, isopropanol precipitation, salt precipitation, and/or using commercial purification kits.

The invention is not limited to this application. The method according to the invention can be applied in all sorts of nucleic acids wherein a long and controlled poly (dA/dT) sequence is to be introduced. The invention is further described by the following non-limiting examples which further illustrate the invention, and are not intended to, nor should they be interpreted to, limit the scope of the invention.

EXAMPLES

Example 1. Amplification of a 4 kbp DNA that comprises a 1.2 kbp gene with a long reverse primer comprising a 120 poly(dT) tail

A gene of interest (GOI), 1.2 kbp long and having a poly(dA/dT) tail, was used for the PCR amplification with a primer pair. The forward primer was designed to anneal 2.8 kbp upstream of the T7 promoter, and the reverse primer comprising a poly(dT120) was designed to anneal both to the 3' UTR and to the poly(dA/dT) tail of the GOI to amplify a 4 kbp DNA (Figure 1A). The GOI was contained in a circular DNA backbone along with an origin of replication and a T7 polymerase promoter.

For the PCR amplification of the GOI, 5000 µl of Master Mix was prepared using the composition described in Table 1 and distributed in 50 reaction tubes of 100 µL. The tubes were pre-chilled on ice before the start of thermal cycling and the reaction solutions were gently mixed by pipetting and transferred to a PCR Thermal Cycler.

The PCR reaction was carried out according to the conditions described in Table 2. After the PCR reaction, all tubes were pooled and 1 µl aliquots of the amplification product were mixed with 1 µl of Midori Green direct and resolved on a 1 % agarose gel along with the DNA ladder.

An IVT reaction using 1 µg of PCR products was prepared. Said reactions were performed in a 20 µl final volume using NEB HiScribe™ T7 ARCA mRNA kit (E2060S). After the reactions, 5 µl of mRNA was loaded on a 1.5% agarose gel in the presence of MidoriGreen direct. Gel images were acquired using a Typhoon™ imager. A ssRNA Ladder (NEB N0362S) was used.

Table 1. The PCR mix composition used in Example 1

Table 2. PCR thermocycling conditions used in Example 1

The gene of interest was successfully amplified with a pair of PCR primers, of which the reverse primer contained poly(dT120). The MidoriGreen fluorescence of the PCR product indicated the presence of a 4 kbp fragment corresponding to the gene of interest (Figure 1B). The IVT reaction using the PCR product as a template yielded a 1.2 kb mRNA possessing a 120 bp poly(dT) tail (Figure 2).

Example 2. Incorporation of a 40 poly(dA/dT) tail into a 12 kbp gene of interest

A gene of interest, 12 kbp long and lacking a poly(dA/dT) tail, was used for the PCR amplification with a primer pair. The GOI was contained in a circular DNA backbone along with an origin of replication and a T7 polymerase promoter. The forward primer was designed to anneal with the T7 promoter while the reverse primer comprised a template-specific sequence along with a poly(dT40) and annealed to the end of the gene of interest (Figure 3A).
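The primer layout used in this example, a template-specific part preceded by a 40-nucleotide thymine stretch, can be sketched in Python as follows. The 20-nucleotide template end below is invented for illustration and is not a sequence from the disclosure; the function names are likewise hypothetical. The sketch only shows why a 5' poly(dT) stretch on the reverse primer yields a poly(dA) run at the 3' end of the sense strand of the amplicon.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tailed_reverse_primer(sense_3prime_end: str, tail_len: int) -> str:
    """5' poly(dT) stretch followed by the part annealing to the sense strand."""
    return "T" * tail_len + reverse_complement(sense_3prime_end)


# Hypothetical last 20 nt of the sense strand of a gene of interest.
SENSE_END = "GATTACAGGCTAGCTAGGAC"

if __name__ == "__main__":
    primer = tailed_reverse_primer(SENSE_END, 40)
    print("reverse primer (5'->3'):", primer)
    # The strand synthesized opposite the primer is its reverse complement,
    # so the sense strand of the amplicon ends in 40 adenines.
    print("sense strand 3' end:    ", reverse_complement(primer))
```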

For the PCR amplification of the GOI, 1200 µl of Master Mix was prepared using the composition described in Table 3 and distributed in 24 reaction tubes of 50 µL. The tubes were pre-chilled on ice before the start of thermal cycling and the reaction solutions were gently mixed by pipetting and transferred to a PCR Thermal Cycler.

The PCR reaction was carried out according to the conditions described in Table 4. After the PCR reaction, all tubes were pooled and 1 µl aliquots of the amplification product were mixed with 1 µl of Midori Green direct and resolved on a 1 % agarose gel along with the DNA ladder. The samples were diluted 111 times and loaded on a Fragment Analyzer for quality check.

An IVT reaction using 1 µg of PCR products was prepared. Said reactions were performed in a 20 µl final volume using NEB HiScribe™ T7 ARCA mRNA kit (E2060S). After the reactions, 5 µl of mRNA was loaded on a 1.5% agarose gel in the presence of MidoriGreen direct. Gel images were acquired using a Typhoon™ imager. A ssRNA Ladder (NEB N0362S) was used.

Table 3. The PCR mix composition used in Example 2

Table 4. PCR thermocycling conditions used in Example 2

The gene of interest was successfully amplified with a pair of PCR primers, of which the reverse primer contained a poly(dT40). The MidoriGreen fluorescence of the PCR product indicated the presence of a 12 kbp fragment corresponding to the gene of interest (Figure 3B). This was confirmed by the results obtained from the Fragment Analyzer, where a product of size 11827 bp was detected (Figure 4A). The Fragment Analyzer assay was carried out in triplicate, resulting in a concentration of the PCR product of 5 ng/µl, 2.38 ng/µl, and 3.77 ng/µl DNA at purities of 96.6%, 96.8%, and 95.1%, respectively (Figure 4B).
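The triplicate readings reported above can be summarized with a mean and a sample standard deviation. The short Python sketch below uses only the values quoted in this example; the summary statistics themselves are not reported in the disclosure and are computed here purely as an illustration, and the helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev

# Fragment Analyzer triplicates quoted in Example 2.
CONCENTRATIONS_NG_PER_UL = [5.0, 2.38, 3.77]
PURITIES_PERCENT = [96.6, 96.8, 95.1]


def summarize(values):
    """Return (mean, sample standard deviation) of a list of readings."""
    return mean(values), stdev(values)


if __name__ == "__main__":
    conc_mean, conc_sd = summarize(CONCENTRATIONS_NG_PER_UL)
    pur_mean, pur_sd = summarize(PURITIES_PERCENT)
    print(f"concentration: {conc_mean:.2f} +/- {conc_sd:.2f} ng/ul")
    print(f"purity:        {pur_mean:.1f} +/- {pur_sd:.1f} %")
```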

The IVT reaction using the PCR product as a template yielded a 12 kbp mRNA possessing a poly(dT40) tail (Figure 5).

Example 3. Amplification of a 2.3 kbp gene with a long reverse primer comprising a 70 poly(dT) tail and a tail-stabilizer sequence

A gene of interest, 2.3 kbp long and having a poly(dA/dT70) stretch with a 10 bp stabilizer sequence inserted in the middle of said stretch, was used for the PCR amplification with a primer pair. The GOI was contained in a circular DNA backbone along with an origin of replication and a T7 polymerase promoter. The forward primer was designed to anneal upstream of the T7 promoter while the reverse primer annealed to the stabilizer sequence and the poly(dA/dT) stretch downstream of the stabilizer sequence (Figure 6A).

For the PCR amplification of the GOI, a Master Mix was prepared using the composition described in Table 5 and distributed in 3 reaction tubes of 100 µL. The tubes were pre-chilled on ice before the start of thermal cycling and the reaction solutions were gently mixed by pipetting and transferred to a PCR Thermal Cycler.

The PCR reaction was carried out according to the conditions described in Table 6. After the PCR reaction, all tubes were pooled and 1 µl aliquots of the amplification product were mixed with 1 µl of Midori Green direct and resolved on a 1 % agarose gel along with the DNA ladder. The samples were diluted and loaded on a Fragment Analyzer for quality check.

An IVT reaction using 1 µg of PCR products was prepared. Said reactions were performed in a 20 µl final volume using NEB HiScribe™ T7 ARCA mRNA kit (E2060S). After the reactions, 5 µl of mRNA was loaded on a 1.5% agarose gel in the presence of MidoriGreen direct. Gel images were acquired using a Typhoon™ imager. A ssRNA Ladder (NEB N0362S) was used. An RNA sample was diluted and loaded on the Fragment Analyzer for quality check.

Table 5. The PCR mix composition used in Example 3

Table 6. PCR thermocycling conditions used in Example 3

The gene of interest was successfully amplified with the PCR primers, of which the reverse primer contained a poly(dT) with a stabilizer sequence. The Midori Green fluorescence of the PCR product indicated the presence of a 2.3 kbp fragment corresponding to the gene of interest (Figure 6B). This was confirmed by the results obtained from the Fragment Analyzer, where a product of size 2609 bp was detected (Figure 7A). The Fragment Analyzer assay was carried out in triplicate, resulting in a concentration of the PCR product of 5.1 ng/µl, 4.44 ng/µl, and 5.31 ng/µl DNA at purities of 93.4%, 92.5%, and 94.1%, respectively (Figure 7B).

The IVT reaction using the PCR product as a template yielded a 2.3 kbp mRNA (Figure 8). This was confirmed by the results obtained from the Fragment Analyzer, where a product of size 2316 bp was detected (Figure 9A) with a concentration of 187.9 ng/µl and 95.3% purity (Figure 9B).

The present invention is in no way limited to the embodiments described in the examples and/or shown in the figures. On the contrary, methods according to the present invention may be realized in many different ways without departing from the scope of the invention.

Example 4. Two-phase, high-volume PCR amplification of a 2 kb phage Lambda gene

A gene of interest from phage Lambda, 2 kbp long and lacking a poly(dA/dT) tail, was used for the PCR amplification with a primer pair. The GOI did not contain a T7 polymerase promoter. The forward primer was designed to anneal with a template-specific sequence and contained an additional T7 promoter, while the reverse primer comprised a template-specific sequence, contained an additional poly(dTso) and annealed to the end of the gene of interest.

For the PCR amplification of the GOI, 1450 µl of Master Mix was prepared using the composition described in Table 7 and distributed in 3 types of reaction volumes of 100, 150 or 200 µL, each in 3 replicates. The tubes were pre-chilled on ice before the start of thermal cycling and the reaction solutions were gently mixed by pipetting and transferred to a PCR Thermal Cycler.
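The Master Mix volumes quoted in the examples can be cross-checked with simple arithmetic. The Python sketch below assumes nothing beyond the volumes and tube counts stated in Examples 1, 2 and 4; the function name is hypothetical, and the excess computed for Example 4 is an inference from those numbers rather than a figure given in the text.

```python
def required_volume_ul(tube_volumes_ul, replicates):
    """Total Master Mix needed when each listed volume is set up `replicates` times."""
    return sum(tube_volumes_ul) * replicates


if __name__ == "__main__":
    # Example 1: 50 tubes of 100 ul from 5000 ul of Master Mix.
    print("Example 1 needs", required_volume_ul([100], 50), "ul of 5000 ul")
    # Example 2: 24 tubes of 50 ul from 1200 ul of Master Mix.
    print("Example 2 needs", required_volume_ul([50], 24), "ul of 1200 ul")
    # Example 4: 100, 150 and 200 ul reactions, 3 replicates each, from 1450 ul.
    needed = required_volume_ul([100, 150, 200], 3)
    print("Example 4 needs", needed, "ul of 1450 ul; excess", 1450 - needed, "ul")
```

In Examples 1 and 2 the prepared volume is consumed exactly by the tubes, whereas in Example 4 the stated 1450 µl leaves a small surplus over the nine reactions.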

The PCR reaction was carried out according to the conditions described in Table 8; namely, the initial denaturation was followed by 30 cycles in which annealing and extension were performed in a single step. No post-PCR digestion was performed. After the PCR reaction, all tubes of the same reaction volume were pooled and 1 µl aliquots of the amplification product were mixed with 1 µl of Midori Green direct and resolved on a 1 % agarose gel along with the DNA ladder. Gel images were acquired using a Typhoon™ imager.

An IVT reaction using 1 µg of PCR products was prepared. Said reactions were performed in a 20 µl final volume using NEB HiScribe™ T7 ARCA mRNA kit (E2060S). After the reactions, 5 µl of mRNA was loaded on a Fragment Analyzer for quality check.

Table 7. The PCR mix composition used in Example 4

Table 8. PCR thermocycling conditions used in Example 4

The gene of interest was successfully amplified with a pair of PCR primers, of which the forward primer contained the T7 promoter and the reverse primer contained a poly(dTso). The MidoriGreen fluorescence of the PCR product indicated the presence of a 2 kbp fragment corresponding to the gene of interest in all 3 PCR reaction volumes tested (Figure 13A). High amplicon yields were thus obtained when the PCR reaction was carried out in volumes of 100, 150 and 200 µl using two-phase amplification cycles in which annealing and extension are performed at the same temperature. The amplicon obtained was of high quality and purity, did not require post-PCR digestion and could be used directly for performing IVT reactions. The IVT reaction using the PCR product as a template yielded a 2 kbp mRNA possessing a poly(dTso) tail (Figure 13B).