Title:
THERMOSTABLE PRIMASE/DNA POLYMERASE OF THE THERMOCOCCUS NAUTILUS 30/1 PLASMID PTN2 AND ITS APPLICATIONS
Document Type and Number:
WIPO Patent Application WO/2011/098588
Kind Code:
A1
Abstract:
The present invention is directed to a thermostable primase/DNA polymerase protein of the Thermococcus nautilus 30/1 plasmid pTN2 and the nucleic acid encoding said primase/DNA polymerase. The invention also relates to methods of synthesizing, amplifying or sequencing nucleic acid implementing said primase/DNA polymerase protein and to kits comprising said DNA polymerase protein.

Inventors:
FORTERRE PATRICK (FR)
SEZONOV GUENNADI (FR)
DESNOUES NICOLE (FR)
SOLER NICOLAS (FR)
VAN TILBEURGH HERMAN (FR)
KELLER JENNY (FR)
MARGUET EVELYNE (FR)
KRUPOVIC MART (FR)
Application Number:
PCT/EP2011/052075
Publication Date:
August 18, 2011
Filing Date:
February 11, 2011
Assignee:
PASTEUR INSTITUT (FR)
UNIV PARIS CURIE (FR)
UNIV PARIS SUD 11 (FR)
FORTERRE PATRICK (FR)
SEZONOV GUENNADI (FR)
DESNOUES NICOLE (FR)
SOLER NICOLAS (FR)
VAN TILBEURGH HERMAN (FR)
KELLER JENNY (FR)
MARGUET EVELYNE (FR)
KRUPOVIC MART (FR)
International Classes:
C12N9/12; C12N1/21; C12N15/54; C12N15/63; C12P19/34; C12P21/02; C12Q1/68
Foreign References:
EP0862656A11998-09-09
US5001050A1991-03-19
Other References:
DATABASE EMBL [online] EBI, Hinxton, Cambridgeshire, U.K.; 3 May 2010 (2010-05-03), SOLER, N. ET AL: "Thermococcus nautilus strain 30/1 plasmid pTN2, complete sequence.", XP002630282, Database accession no. GU056177
DATABASE UniProt [online] 24 November 2009 (2009-11-24), "SubName: Full=Putative uncharacterized protein;", XP002579067, retrieved from EBI accession no. UNIPROT:C9RIH3 Database accession no. C9RIH3
SOLER NICOLAS ET AL: "The rolling-circle plasmid pTN1 from the hyperthermophilic archaeon Thermococcus nautilus", MOLECULAR MICROBIOLOGY, vol. 66, no. 2, October 2007 (2007-10-01), pages 357 - 370, XP002579068, ISSN: 0950-382X
DEAN ET AL., GENOME RES., vol. 11, no. 6, June 2001 (2001-06-01), pages 1095 - 9
MENDEZ ET AL., EMBO J., vol. 16, no. 9, 1997, pages 2519 - 27
HUTCHISON ET AL., PROC NATL ACAD SCI U S A., vol. 102, no. 48, 2005, pages 17332 - 6
MAMONE: "Life Sciences News", vol. 14, 2003, AMERSHAM BIOSCIENCES, article "Innovations Forum: GenomiPhi DNA amplification"
ALTSCHUL ET AL., NUC. ACIDS RES., vol. 25, 1997, pages 3389 - 3402
ALTSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 403 - 410
TATUSOVA ET AL.: "Blast 2 sequences - a new tool for comparing protein and nucleotide sequences", FEMS MICROBIOL LETT., vol. 174, 1999, pages 247 - 250
DEAN ET AL., GENOME RES., vol. 11, 2001, pages 1095 - 1099
LARSSON ET AL., NATURE METHODS, vol. 1, 2004, pages 227 - 232
CORTEZ ET AL., GENOME BIOLOGY, 2009
ALBERS SV; JONUSCHEIT M; DINKELAKER S; URICH T; KLETZIN A; TAMPE R; DRIESSEN AJ; SCHLEPER C.: "Production of recombinant and tagged proteins in the hyperthermophilic archaeon Sulfolobus solfataricus", APPL ENVIRON MICROBIOL, vol. 72, 2006, pages 102 - 11
ARNOLD HP; SHE Q; PHAN H; STEDMAN K; PRANGISHVILI D; HOLZ I; KRISTJANSSON JK; GARRETT R; ZILLIG W.: "The genetic element pSSVx of the extremely thermophilic crenarchaeon Sulfolobus is a hybrid between a plasmid and a virus", MOL MICROBIOL, vol. 34, 1999, pages 217 - 26
BALIGA NS; BONNEAU R; FACCIOTTI MT; PAN M; GLUSMAN G; DEUTSCH EW; SHANNON P; CHIU Y; WENG RS; GAN RR; HUNG P: "Genome sequence of Haloarcula marismortui: a halophilic archaeon from the Dead Sea", GENOME RES, vol. 14, 2004, pages 2221 - 34
BASTA T; SMYTH J; FORTERRE P; PRANGISHVILI D; PENG X.: "Novel archaeal plasmid pAH1 and its interactions with the lipothrixvirus AFV1", MOL MICROBIOL, vol. 71, 2009, pages 23 - 34, XP055238036, DOI: doi:10.1111/j.1365-2958.2008.06488.x
BERKNER S; GROGAN D; ALBERS SV; LIPPS G.: "Small multicopy, non-integrative shuttle vectors based on the plasmid pRN1 for Sulfolobus acidocaldarius and Sulfolobus solfataricus, model organisms of the (cren-)archaea", NUCLEIC ACIDS RES, vol. 35, 2007, pages E88
CORTEZ ET AL., GENOME BIOL., vol. 10, no. 6, 16 June 2009 (2009-06-16), pages R65
DEL SOLAR G; GIRALDO R; RUIZ-ECHEVARRIA MJ; ESPINOSA M; DIAZ-OREJAS R.: "Replication and control of circular bacterial plasmids", MICROBIOL MOL BIOL REV, vol. 62, 1998, pages 434 - 64
EDGAR RC.: "MUSCLE: a multiple sequence alignment method with reduced time and space complexity", BMC BIOINFORMATICS, vol. 5, 2004, pages 113, XP021000496, DOI: doi:10.1186/1471-2105-5-113
ERAUSO G; MARSIN S; BENBOUZID-ROLLET N; BAUCHER MF; BARBEYRON T; ZIVANOVIC Y; PRIEUR D; FORTERRE P.: "Sequence of plasmid pGT5 from the archaeon Pyrococcus abyssi: evidence for rolling-circle replication in a hyperthermophile", J BACTERIOL, vol. 178, 1996, pages 3232 - 7, XP000645340
ERAUSO G; STEDMAN KM; VAN DE WERKEN HJ; ZILLIG W; VAN DER OOST J.: "Two novel conjugative plasmids from a single strain of Sulfolobus", MICROBIOLOGY, vol. 152, 2006, pages 1951 - 68
GREVE B; JENSEN S; BRUGGER K; ZILLIG W; GARRETT RA.: "Genomic comparison of archaeal conjugative plasmids from Sulfolobus", ARCHAEA, vol. 1, 2004, pages 231 - 9
GREVE B; JENSEN S; PHAN H; BRUGGER K; ZILLIG W; SHE Q; GARRETT RA.: "Novel RepA-MCM proteins encoded in plasmids pTAU4, pORA1 and pTIK4 from Sulfolobus neozealandicus", ARCHAEA, vol. 1, 2005, pages 319 - 25
HARRIOTT OT; HUBER R; STETTER KO; BETTS PW; NOLL KM.: "A cryptic miniplasmid from the hyperthermophilic bacterium Thermotoga sp. strain RQ7", J BACTERIOL, vol. 176, 1994, pages 2759 - 62
KHAN, S. A.: "Plasmid rolling-circle replication: highlights of two decades of research", PLASMID, vol. 53, no. 2, 2005, pages 126 - 36, XP004762387, DOI: doi:10.1016/j.plasmid.2004.12.008
KRUPOVIC M; BAMFORD DH.: "Archaeal proviruses TKV4 and MVV extend the PRD1-adenovirus lineage to the phylum Euryarchaeota", VIROLOGY, vol. 375, 2008, pages 292 - 300, XP022638402, DOI: doi:10.1016/j.virol.2008.01.043
LEPAGE E; MARGUET E; GESLIN C; MATTE-TAILLIEZ O; ZILLIG W; FORTERRE P; TAILLIEZ P.: "Molecular diversity of new Thermococcales isolates from a single area of hydrothermal deep-sea vents as revealed by randomly amplified polymorphic DNA fingerprinting and 16S rRNA gene sequence analysis", APPL ENVIRON MICROBIOL, vol. 70, 2004, pages 1277 - 86
LI, Y.; H. J. KIM; C. ZHENG; W. H. CHOW; J. LIM; B. KEENAN; X. PAN; B. LEMIEUX; H. KONG: "Primase-based whole genome amplification", NUCLEIC ACIDS RES, vol. 36, no. 13, 2008, pages E79
LIPPS G.: "The replication protein of the Sulfolobus islandicus plasmid pRN1", BIOCHEM SOC TRANS, vol. 32, 2004, pages 240 - 4, XP002450349, DOI: doi:10.1042/BST0320240
LIPPS G.: "Plasmids and viruses of the thermoacidophilic crenarchaeote Sulfolobus", EXTREMOPHILES, vol. 10, 2006, pages 17 - 28, XP019374048, DOI: doi:10.1007/s00792-005-0492-x
LIPPS G; IBANEZ P; STROESSENREUTHER T; HEKIMIAN K; KRAUSS G: "The protein ORF80 from the acidophilic and thermophilic archaeon Sulfolobus islandicus binds highly site-specifically to double-stranded DNA and represents a novel type of basic leucine zipper protein", NUCLEIC ACIDS RES, vol. 29, 2001, pages 4973 - 82
LIPPS G; ROTHER S; HART C; KRAUSS G.: "A novel type of replicative enzyme harbouring ATPase, primase and DNA polymerase activity", EMBO J, vol. 22, 2003, pages 2516 - 25, XP002711343, DOI: doi:10.1093/emboj/cdg246
LIPPS G; STEGERT M; KRAUSS G.: "Thermostable and site-specific DNA binding of the gene product ORF56 from the Sulfolobus islandicus plasmid pRN1, a putative archael plasmid copy control protein", NUCLEIC ACIDS RES, vol. 29, 2001, pages 904 - 13
LIPPS G; WEINZIERL AO; VON SCHEVEN G; BUCHEN C; CRAMER P.: "Structure of a bifunctional DNA primase-polymerase", NAT STRUCT MOL BIOL, vol. 11, 2004, pages 157 - 62
LIPPS G.: "Molecular biology of the pRN1 plasmid from Sulfolobus islandicus", BIOCHEM SOC TRANS., vol. 37, February 2009 (2009-02-01), pages 42 - 5
LUCAS S; TOFFIN L; ZIVANOVIC Y; CHARLIER D; MOUSSARD H; FORTERRE P; PRIEUR D; ERAUSO G.: "Construction of a shuttle vector for, and spheroplast transformation of, the hyperthermophilic archaeon Pyrococcus abyssi", APPL ENVIRON MICROBIOL, vol. 68, 2002, pages 5528 - 36, XP002657755, DOI: doi:10.1128/aem.68.11.5528-5536.2002
MARSIN S; FORTERRE P.: "A rolling circle replication initiator protein with a nucleotidyl-transferase activity encoded by the plasmid pGT5 from the hyperthermophilic archaeon Pyrococcus abyssi", MOL MICROBIOL, vol. 27, 1998, pages 1183 - 92
MARSIN S; FORTERRE P.: "pGT5 replication initiator protein Rep75 from Pyrococcus abyssi", METHODS ENZYMOL, vol. 334, 2001, pages 193 - 204
PAULL TT.: "Saving the ends for last: the role of pol mu in DNA end joining", MOL CELL, vol. 19, no. 3, 2005, pages 294 - 6
PENG X.: "Evidence for the horizontal transfer of an integrase gene from a fusellovirus to a pRN-like plasmid within a single strain of Sulfolobus and the implications for plasmid survival", MICROBIOLOGY, vol. 154, 2008, pages 383 - 91, XP055280237, DOI: doi:10.1099/mic.0.2007/012963-0
PENG X; HOLZ I; ZILLIG W; GARRETT RA; SHE Q.: "Evolution of the family of pRN plasmids and their integrase-mediated insertion into the chromosome of the crenarchaeon Sulfolobus solfataricus", J MOL BIOL, vol. 303, 2000, pages 449 - 54, XP004468982, DOI: doi:10.1006/jmbi.2000.4160
SANTANGELO TJ; CUBONOVA L; REEVE JN.: "Shuttle vector expression in Thermococcus kodakaraensis: contributions of cis elements to protein synthesis in a hyperthermophilic archaeon", APPL ENVIRON MICROBIOL, vol. 74, 2008, pages 3099 - 104, XP002657742, DOI: doi:10.1128/aem.00305-08
SATO T; FUKUI T; ATOMI H & IMANAKA T: "Improved and versatile transformation system allowing multiple genetic manipulations of the hyperthermophilic archaeon Thermococcus kodakaraensis", APPL ENVIRON MICROBIOL, vol. 71, 2005, pages 3889 - 99, XP002606157, DOI: doi:10.1128/AEM.71.7.3889-3899.2005
SHE Q; SINGH RK; CONFALONIERI F; ZIVANOVIC Y; ALLARD G; AWAYEZ MJ; CHAN-WEIHER CC; CLAUSEN IG; CURTIS BA; DE MOORS A: "The complete genome of the crenarchaeon Sulfolobus solfataricus P2", PROC NATL ACAD SCI U S A, vol. 98, 2001, pages 7835 - 40, XP002625002, DOI: doi:10.1073/PNAS.141222098
SOLER N; JUSTOME A; QUEVILLON-CHERUEL S; LORIEUX F; LE CAM E; MARGUET E; FORTERRE P: "The rolling-circle plasmid pTN1 from the hyperthermophilic archaeon Thermococcus nautilus", MOL MICROBIOL, vol. 66, 2007, pages 357 - 70
SOPPA J.: "From genomes to function: haloarchaea as model organisms", MICROBIOLOGY, vol. 152, 2006, pages 585 - 90
STEDMAN KM; SHE Q; PHAN H; ARNOLD HP; HOLZ I; GARRETT RA; ZILLIG W.: "Relationships between fuselloviruses infecting the extremely thermophilic archaeon Sulfolobus: SSV1 and SSV2", RES MICROBIOL, vol. 154, 2003, pages 295 - 302
TATUSOVA TA; MADDEN TL.: "BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences", FEMS MICROBIOL LETT, vol. 174, no. 2, 15 May 1999 (1999-05-15), pages 247 - 50
WARD DE; REVET IM; NANDAKUMAR R; TUTTLE JH; DE VOS WM; VAN DER OOST J; DIRUGGIERO J.: "Characterization of plasmid pRT1 from Pyrococcus sp. strain JT1", J BACTERIOL, vol. 184, 2002, pages 2561 - 6
VINCENT, M.; Y. XU; H. KONG: "Helicase-dependent isothermal DNA amplification", EMBO REP, vol. 5, no. 8, 2004, pages 795 - 800, XP002400120
ZHOU, M. Y.; C. E. GOMEZ-SANCHEZ: "Universal TA cloning", CURR ISSUES MOL BIOL, vol. 2, no. 1, 2000, pages 1 - 7
Attorney, Agent or Firm:
WARCOIN, Jacques (20 rue de Chazelles, Paris Cedex 17, FR)
Claims:
CLAIMS

1. An isolated polypeptide, wherein:

a) the sequence of said polypeptide comprises the four following motifs:

MI: hhhDhD;

MII: Psh-ll-os-Ghphh@-h;

MIII: sD; and

MIV : hhRhsss-p,

wherein

@ - represents an aromatic residue (Y, W, F)

h - represents a hydrophobic residue (W, F, Y, M, L, I, V, A, C, T, P, H)

s - represents a small residue (A, G, C, S, V, N, D, T, P)

o - represents an alcohol (hydroxylic) residue (S, T, Y)

l - represents an aliphatic residue (I, V, L)

p - represents a polar residue (D, E, H, K, N, Q, R, S, T), and

b) said polypeptide harbours DNA polymerase and primase activities.

2. An isolated polypeptide, according to claim 1 wherein:

a) the sequence of said polypeptide comprises the three following motifs:

MI: DhD;

MII: S/TG-GhQ/H; and

MIII-MIV: D— D-RhhRhP— N, and

b) said polypeptide harbours DNA polymerase and primase activities.

3. A polypeptide according to claim 1 or 2, wherein said polypeptide has been isolated from a hyperthermophilic Archaea, preferably from Thermococcus sp, or from hyperthermophilic Bacteria, said polypeptide exhibiting a DNA polymerase activity at a temperature comprised between 60°C and 90°C.

4. A polypeptide according to one of claims 1 to 3, wherein said polypeptide is encoded by a nucleic acid selected from the group consisting of:

- a nucleic acid fragment of the Thermococcus nautilus pTN2 plasmid, said pTN2 plasmid having the sequence SEQ ID NO: 1, or of a variant thereof having at least 70 % identity with the sequence SEQ ID NO: 1; and

- a nucleic acid fragment of the Thermococcus nautilus pTN2 plasmid ORF12 having the sequence SEQ ID NO: 13, or a variant thereof having at least 70 % identity with the sequence SEQ ID NO: 13.

5. A polypeptide according to one of claims 1 to 4, wherein said polypeptide further exhibits a reverse transcriptase activity.

6. A polypeptide according to one of claims 1 to 5, wherein said polypeptide comprises a polypeptide selected from the group of polypeptides consisting of:

a) the polypeptide having the amino acid sequence SEQ ID NO: 14;

b) a fragment of a) having a DNA polymerase activity or/and a DNA primase activity and/or a nucleotidyl transferase activity;

c) a polypeptide having sequence which is at least 70 % identity after optimum alignment with the sequence SEQ ID NO: 14, or with a sequence as defined in b) said polypeptide having a DNA polymerase activity and/or DNA primase activity and/or nucleotidyl transferase activity, preferably a DNA polymerase activity at a temperature comprised between 70°C and 80°C.

7. A nucleic acid encoding a polypeptide according to one of claims 1 to 6.

8. A vector comprising the nucleic acid of claim 7.

9. The vector of claim 8, wherein said nucleic acid is operably linked to a promoter.

10. A host cell comprising the vector of claims 8 and 9.

11. A method of producing a DNA polymerase, said method comprising:

(a) culturing the host cell of claim 10 in conditions suitable for the expression of said nucleic acid; and

(b) isolating said DNA polymerase from said host cell.

12. A method of producing a recombinant DNA polymerase which is soluble in an aqueous solvent, said method comprising:

a) culturing the host cell of claim 10 in conditions suitable for the expression of said nucleic acid;

b) harvesting the cells by centrifugation and resuspending the cells pellet in an aqueous buffer;

c) lysing the cells, preferably by sonication, and, optionally, heating the lysate; and

d) centrifugating and recovering the aqueous soluble fraction containing the recombinant DNA polymerase.

13. The method of claim 12, wherein said host cell is a prokaryotic or an eukaryotic cell.

14. A method of synthesizing a double-stranded DNA molecule comprising:

(a) hybridizing a primer to a first DNA molecule; and

(b) incubating said DNA molecule of step (a) in the presence of one or more deoxyribonucleoside triphosphates or analogs thereof and the polypeptide of one of claims 1 to 6 or obtained by a method of claim 11 or 12, under conditions sufficient to synthesize a second DNA molecule complementary to all or a portion of said first DNA molecule.

15. A method of synthesizing a double-stranded DNA molecule from an unknown DNA template:

(a) incubating said DNA template in the presence of one or more deoxyribonucleoside triphosphates or analogs thereof and the polypeptide of one of claims 1 to 6 or obtained by a method of claim 11 or 12 having a DNA polymerase and primase activities, under conditions sufficient to synthesize a second DNA molecule complementary to all or a portion of said first DNA molecule.

16. A method for production of DNA molecules of greater than 10 kilobases in length comprising the methods of claims 14 or 15, wherein the first DNA molecule which serves as a template is greater than 10 kilobases.

17. The method of claims 14 to 16, wherein said deoxyribonucleoside triphosphates are selected from the group consisting of dATP, dCTP, dGTP and dTTP.

18. A method for amplifying a double stranded DNA molecule, comprising:

(a) providing a first and second primer, wherein said first primer is complementary to a sequence at or near the 3'-termini of the first strand of said DNA molecule and said second primer is complementary to a sequence at or near the 3'-termini of the second strand of said DNA molecule;

(b) hybridizing said first primer to said first strand and said second primer to said second strand in the presence of the polypeptide of claims 1 to 6, under conditions such that a nucleic acid complementary to said first strand and a nucleic acid complementary to said second strand are synthesized;

(c) denaturing

- said first and its complementary strands; and

- said second and its complementary strands; and

(d) repeating steps (a) to (c) one or more times.

19. A method of preparing cDNA from mRNA, comprising:

(a) contacting mRNA with an oligo(dT) primer or other complementary primer to form a hybrid, and

(b) contacting said hybrid formed in step (a) with the DNA polymerase polypeptide of claims 1 to 6 or obtained by a method of claims 10 to 13, dATP, dCTP, dGTP and dTTP, whereby a cDNA-RNA hybrid is obtained.

20. A method of preparing dsDNA from mRNA, comprising:

(a) contacting mRNA with an oligo (dT) primer or other complementary primer to form a hybrid; and

(b) contacting said hybrid formed in step (a) with the polypeptide of claims 1 to 6 or obtained by a method of claims 11 to 13, dATP, dCTP, dGTP and dTTP, and an oligo nucleotide or primer which is complementary to the first strand cDNA;

whereby dsDNA is obtained.

21. A method for determining the nucleotide base sequence of a DNA molecule, comprising the steps of:

(a) contacting said DNA molecule with a primer molecule able to hybridize to said DNA molecule;

(b) incubating said hybrid formed in step (a) in a vessel containing four different deoxynucleoside triphosphates, a DNA polymerase polypeptide of claims 1 to 6 or obtained by a method of claims 11 to 13, and one or more DNA synthesis terminating agents which terminate DNA synthesis at a specific nucleotide base, wherein each said agent terminates DNA synthesis at a different nucleotide base; and

(c) separating the DNA products of the incubating reaction according to size, whereby at least a part of the nucleotide base sequence of said DNA can be determined.

22. The method of claim 21, wherein said terminating agent is a dideoxynucleoside triphosphate.

23. A method for amplification of a DNA molecule comprising the steps of: (a) incubating said DNA molecule in the presence of a DNA polymerase polypeptide of claims 1 to 6 or obtained by a method of claims 11 to 13 and a mixture of different deoxynucleoside triphosphates.

24. A method according to one of claims 14 to 23, wherein an enzyme having a helicase activity is added in order to improve the processivity of the DNA polymerase polypeptide of claims 1 to 6 or obtained by a method of claims 11 to 13.

25. A kit for sequencing a DNA molecule, comprising:

(a) a first container means comprising a DNA polymerase polypeptide of claims 1 to 6 or obtained by a method of claims 11 to 13;

(b) a second container means comprising one or more dideoxyribonucleoside triphosphates; and

(c) a third container means comprising one or more deoxyribonucleoside triphosphates.

26. A kit for amplifying a DNA molecule, comprising:

(a) a first container means comprising DNA polymerase polypeptide of claims 1 to 6 or obtained by a method of claims 11 to 13; and

(b) a second container means comprising one or more deoxyribonucleoside triphosphates.

27. Use of a DNA polymerase polypeptide of claims 1 to 6 or obtained by a method of claims 11 to 13, for rolling circle amplification, multiple displacement amplification or protein-primed amplification.

Description:
THERMOSTABLE PRIMASE/DNA POLYMERASE OF THE THERMOCOCCUS NAUTILUS 30/1 PLASMID pTN2 AND ITS APPLICATIONS

The present invention is directed to a thermostable primase/DNA polymerase protein of the Thermococcus nautilus 30/1 plasmid pTN2 and the nucleic acid encoding said primase/DNA polymerase. The invention also relates to methods of synthesizing, amplifying or sequencing nucleic acid implementing said primase/DNA polymerase protein and to kits comprising said DNA polymerase protein.

Plasmids are present in many archaeal species, including archaea living in geothermally heated aquatic environments (Lipps et al., 2008). Their sizes range from a very small plasmid with a single gene, the plasmid pRQ7 from the hyperthermophilic bacterium Thermotoga (Harriott et al., 1994), to large megaplasmids whose sizes rival those of bacterial or archaeal chromosomes (Baliga et al., 2004; Soppa, 2006). They can use different replication strategies (mostly rolling-circle mode for the smallest ones and theta mode for the others) and a variety of replication origins (del Solar et al., 1998).

In Archaea, plasmids from Sulfolobales (thermoacidophilic members of the phylum Crenarchaeota) have been especially well characterized. Two plasmid families have been identified: the pRN plasmids and relatives (with sizes ranging mainly from 5 to 8 kb, up to 13.6 kb), and a family of rather large conjugative plasmids (CP), the pNOB8 and relatives (with sizes ranging from 24 to 36 kb) (Lipps, 2006). The pRN plasmids harbour three genes which are also more or less conserved in pRN relatives. Two of them encode site-specific DNA binding proteins (CopG and PlrA) that could be involved in regulation of copy number and gene expression (Lipps et al., 2001a; Lipps et al., 2001b). The third gene encodes a large protein, RepA, involved in plasmid replication. The RepA protein of the plasmid pRN1 is a multifunctional enzyme whose N-terminal domain harbours DNA primase/polymerase activities and whose C-terminal domain harbours a DNA helicase activity (Lipps, 2004; Lipps et al., 2003). The structural characterization of the N-terminal domain led to the identification of a new family of DNA polymerase (family E) distantly related to the archaeal/eukaryal DNA primase catalytic domain (Lipps et al., 2004). The larger genomes of the conjugative plasmids (CP) from Sulfolobus species encode around 40-50 proteins each (Erauso et al., 2006; Greve et al., 2004). Ten proteins are conserved in all CP plasmids of Sulfolobales and 80% of the others are common to two or more CP plasmids (Erauso et al., 2006). Most of these proteins are of unknown function. All plasmids isolated up to now from Sulfolobus species are larger than 5 kb. Although their mode of replication has not yet been determined experimentally, they probably replicate via the theta mode, since none of them encodes the Rep protein typical for rolling-circle replication.

In contrast, all plasmids isolated up to now from Thermococcales (hyperthermophilic and anaerobic members of the phylum Euryarchaeota) are small (less than 4 kb) and seem to replicate via the rolling circle (RC) mode. This has been experimentally demonstrated for the plasmid pGT5 from Pyrococcus abyssi (Erauso et al., 1996; Marsin & Forterre, 1998; Marsin & Forterre, 2001) and is most likely for the related plasmid pTN1, from the candidate species Thermococcus nautilus (Soler et al., 2007). Both plasmids encode large homologous proteins related to Rep proteins known to initiate rolling circle replication. A third small plasmid, called pRT1, from Pyrococcus sp. JT1, has been sequenced and tentatively described as a RC plasmid (Ward et al., 2002). However, its putative Rep protein shows no clear sequence similarity with known RC Rep proteins.

A variety of nucleic acid amplification techniques, developed as tools for nucleic acid analysis and manipulation, have been successfully applied for clinical diagnosis of genetic and infectious diseases. Amplification techniques can be grouped into those requiring temperature cycling (PCR and ligase chain reaction) and isothermal systems (transcription-based amplification systems (3SR and NASBA), strand-displacement amplification, and Qβ replicase systems). Two aspects are frequent caveats in these procedures: fidelity of synthesis and length of the amplified product. Development of an amplification system relying on the mechanism of DNA replication using such primase/DNA polymerase has been the object of many publications and patent documents (see for example Dean et al., Genome Res. 2001 Jun;11(6):1095-9; Mendez et al., EMBO J., 1997, 1;16(9):2519-27; Hutchison et al., Proc Natl Acad Sci U S A., 2005, 102(48):17332-6; Mamone, Innovations Forum: GenomiPhi DNA amplification, Life Sciences News 14, 2003 Amersham Biosciences; Blanco et al., 1994; EP 0 862 656 or U.S. 5,001,050).

Currently there is a need for a new thermostable, highly processive DNA polymerase belonging to the protein-primed DNA polymerase family which can work at high temperature and which is able to produce long extension products (more than one kb in length, preferably more than 5 kb) without the need for additional proteins.

There is particularly great interest in providing such a thermostable DNA polymerase that is soluble in aqueous solvent and easy to produce by recombinant methods, preferably in E. coli, a bacterium which is well known to the skilled person and easy to cultivate.

Particularly awaited is such a soluble thermostable DNA polymerase, easy to produce in E. coli, which exhibits a primase activity and is consequently able to synthesize or amplify an unknown DNA template, preferably in the total absence of primer, and which in addition is able to produce long extension products with fidelity (more than one kb in length, preferably more than 5 kb or 10 kb) without the need for additional proteins.

The activity of a nucleotidyl transferase, an important cellular enzyme involved in the DNA processing and repair, is also advantageous for the cloning purposes. For example this activity is a characteristic property of Taq polymerase. Taq polymerase makes DNA products that have A (adenine) overhangs at their 3' ends. This may be useful in TA cloning, whereby a cloning vector (such as a plasmid) is used which has a T (thymine) 3' overhang, which complements with the A overhang of the PCR product, thus enabling ligation of the PCR product into the plasmid vector.

This is the object of the present invention.

The inventors have characterized the plasmid pTN1 and demonstrated that the strain T. nautilus additionally contains a larger plasmid of around 13 kb, which has been called pTN2. After sequencing and annotation of the complete sequence of the Thermococcus nautilus 30/1 plasmid pTN2 (13 kb, Sequence SEQ ID NO: 1), the inventors have demonstrated the presence of a nucleic sequence ORF12 (SEQ ID NO: 13) that encodes a protein of 107 kDa (tn2-12 having the sequence SEQ ID NO: 14). A BLAST search did not allow the function of this protein to be predicted. However, by using PSI-BLAST iterations, the inventors have retrieved many hits between its N-terminal region and the same regions of proteins annotated either as uncharacterized proteins or as putative Rep proteins or DNA primases. Interestingly, it has been demonstrated that all these N-terminal regions exhibit a conserved DhD motif, known to be present in the active sites of many DNA polymerases, primases and/or nucleotidyl transferases. Using PSI-BLAST searches, the inventors have also demonstrated that the middle part of tn2-12 is composed of a short stretch of highly conserved amino acids found in many proteins with the S1 RNA-binding motif usually detected in proteins able to bind nucleic acids.

After testing the DNA polymerase activity of tn2-12, the inventors have demonstrated that the protein tn2-12 is able to extend the primer up to several kilobases in less than 20 min of reaction time, with the highest activity found between 70°C and 80°C, and that the obtained DNA shows the same pattern as that synthesized by the Taq polymerase. In addition to the DNA polymerase activity, the inventors have demonstrated that tn2-12 has a clear primase activity, the protein being able to efficiently initiate second strand synthesis in the presence of an ssDNA template.

These results confirm that tn2-12 is a thermostable primase/processive DNA polymerase able to produce long extension products, its activity being independent of the presence of additional proteins.

As no previously known DNA polymerase was detected by BLAST search, it can be considered that this newly discovered enzyme belongs to a new family of DNA-dependent DNA polymerases. So, in a first aspect, the present invention is directed to an isolated polypeptide, wherein:

a) the sequence of said polypeptide comprises the four following motifs MI, MII, MIII and MIV:

MI: hhhDhD;

MII: Psh-ll-os-Ghphh@-h;

MIII: sD; and

MIV : hhRhsss-p,

wherein:

- "@" represents an aromatic residue (Y, W, F)

- "h" represents a hydrophobic residue (W, F, Y, M, L, I, V, A, C, T, P, H)

- "s" represents a small residue (A, G, C, S, V, N, D, T, P)

- "o" represents an alcohol (hydroxylic) residue (S, T, Y)

- "l" represents an aliphatic residue (I, V, L),

- "p" represents a polar residue (D, E, H, K, N, Q, R, S, T),

- a dash ("-") designates any amino acid residue,

and

b) said polypeptide harbours DNA polymerase and primase activities.
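As an illustration of condition a) only, the motif notation above can be translated mechanically into regular expressions. The following is a minimal sketch, assuming the symbol definitions given above; the function names (`motif_to_regex`, `find_motifs`, `has_all_motifs`) and the toy sequence are hypothetical and are not taken from SEQ ID NO: 14. A sequence match says nothing about condition b), which requires an activity assay.

```python
import re

# Residue classes exactly as listed in the text (single-letter codes).
CLASSES = {
    "@": "YWF",            # aromatic
    "h": "WFYMLIVACTPH",   # hydrophobic
    "s": "AGCSVNDTP",      # small
    "o": "STY",            # alcohol (hydroxylic)
    "l": "IVL",            # aliphatic
    "p": "DEHKNQRST",      # polar
}
ANY_RESIDUE = "ACDEFGHIKLMNPQRSTVWY"  # a dash designates any amino acid

MOTIFS = {
    "MI": "hhhDhD",
    "MII": "Psh-ll-os-Ghphh@-h",
    "MIII": "sD",
    "MIV": "hhRhsss-p",
}


def motif_to_regex(motif):
    """Translate one motif string into a compiled regular expression."""
    parts = []
    for symbol in motif:
        if symbol == "-":
            parts.append("[" + ANY_RESIDUE + "]")
        elif symbol in CLASSES:
            parts.append("[" + CLASSES[symbol] + "]")
        else:
            # Upper-case letters (P, D, G, R) are literal residues.
            parts.append(re.escape(symbol))
    return re.compile("".join(parts))


def find_motifs(sequence):
    """Return {motif name: list of 0-based start positions} for a sequence."""
    return {
        name: [m.start() for m in motif_to_regex(pattern).finditer(sequence)]
        for name, pattern in MOTIFS.items()
    }


def has_all_motifs(sequence):
    """True if every one of the four motifs occurs at least once."""
    return all(find_motifs(sequence).values())


# Invented toy sequence for illustration only (not a real protein).
TOY_SEQUENCE = "MK" + "LLLDLD" + "KK" + "PALKIVKSAKGLKLLFKL" + "KK" + "GD" + "KK" + "LLRLAAAKK"

if __name__ == "__main__":
    print(find_motifs(TOY_SEQUENCE))
    print(has_all_motifs(TOY_SEQUENCE))
```

Short motifs such as MIII (two residues) match very often by chance, so a screen of this kind is only a first filter before alignment and experimental testing.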

In a preferred embodiment, the present invention encompasses an isolated polypeptide according to the invention wherein:

a) the sequence of said polypeptide comprises the three following motifs MI, MII and MIII-MIV and wherein the above motifs III and IV constitute a unique motif named "motif MIII-MIV":

Motif I: DhD;

Motif II: S/TG-GhQ/H; and

Motif MIII-MIV: D— D-RhhRhP— N,

and

b) said polypeptide harbours DNA polymerase and primase activities.

In these motifs (Motif I: DhD; Motif II: S/TG-GhQ/H; and Motif MIII-MIV: D— D— RhhRhP— N), a dash ("-") or "h" designates any amino acid residue.

In a particular preferred embodiment, h designates a hydrophobic residue (W, F, Y, M, L, I, V, A, C, T, P and H).

The letters W, F, Y, M, L, I, V, A, C, T, P, H, D, S, G, Q, R and N designate the specific single-letter amino acid code.

The linear order in which these three motifs are present in the sequences of the isolated polypeptides is immaterial.

In a preferred embodiment said polypeptide further harbours nucleotidyl transferase activity in addition to said DNA polymerase and primase activities. In a preferred embodiment these four motifs MI, MII, MIII and MIV, or the three motifs MI, MII and MIII-MIV, are presented in a linear order I-II-III-IV beginning from the N-terminal end of the polypeptide.
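The preferred linear order I-II-III-IV can be checked with a simple left-to-right scan. This is a minimal sketch for the four-motif definition only, assuming the residue classes defined earlier in the description; the names `REGEXES` and `in_linear_order` and both toy sequences are hypothetical and invented for illustration.

```python
import re

# Residue classes from the description; "-" stands for any amino acid.
CLASSES = {
    "@": "YWF", "h": "WFYMLIVACTPH", "s": "AGCSVNDTP",
    "o": "STY", "l": "IVL", "p": "DEHKNQRST",
    "-": "ACDEFGHIKLMNPQRSTVWY",
}
MOTIFS = {
    "MI": "hhhDhD",
    "MII": "Psh-ll-os-Ghphh@-h",
    "MIII": "sD",
    "MIV": "hhRhsss-p",
}
# Class symbols become character sets; other letters are literal residues.
REGEXES = {
    name: re.compile(
        "".join("[" + CLASSES[c] + "]" if c in CLASSES else c for c in pattern)
    )
    for name, pattern in MOTIFS.items()
}


def in_linear_order(sequence, order=("MI", "MII", "MIII", "MIV")):
    """True if the motifs occur in the given order, N-terminus first.

    Each motif is searched for only downstream of the previous match;
    taking the earliest possible match at every step is sufficient to
    decide whether any ordered, non-overlapping arrangement exists.
    """
    position = 0
    for name in order:
        match = REGEXES[name].search(sequence, position)
        if match is None:
            return False
        position = match.end()
    return True


# Invented toy sequences for illustration only (not real proteins).
ORDERED = "MK" + "LLLDLD" + "KK" + "PALKIVKSAKGLKLLFKL" + "KK" + "GD" + "KK" + "LLRLAAAKK"
SHUFFLED = "MK" + "LLRLAAAKK" + "KK" + "LLLDLD" + "KK" + "PALKIVKSAKGLKLLFKL" + "KK" + "GD"

if __name__ == "__main__":
    print(in_linear_order(ORDERED))
    print(in_linear_order(SHUFFLED))
```

Passing a different `order` tuple tests any other arrangement, which is useful where the linear order of the motifs is not imposed.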

In another preferred embodiment, in the following motifs:

MI: hhhDhD;

MII: Psh-ll-os-Ghphh@-h;

MIII: sD; and

MIV : hhRhsss-p,

or

MI: DhD;

MII: S/TG-GhQ/H; and

MIII-MIV: D— D-RhhRhP— N,

the D, D, D and R residues indicated in bold form part of the enzyme catalytic site and cannot be changed (remain invariant) in order to preserve the activity of the enzyme.

Deoxyribonucleic acid (DNA) polymerase, primase, nucleotidyl transferase and reverse transcriptase enzymatic activities are standard activities well known to the skilled person.

For example, by DNA polymerase activity, it is intended to designate the activity of an enzyme that catalyzes the polymerization of deoxyribonucleotides into a DNA strand. DNA polymerase enzymes are best known for their role in DNA replication, in which the polymerase "reads" an intact DNA strand as a template and uses it to synthesize the new strand. This process copies a piece of DNA. The newly polymerized molecule is complementary to the template strand and identical to the template's original partner strand.

The term "DNA polymerase activity" refers for example to the ability of an enzymatic polypeptide to synthesize new DNA strands by the incorporation of deoxynucleoside triphosphates. Example 4 below provides an example of an assay for the measurement of DNA polymerase activity. Such DNA polymerase activity may be measured using any of the DNA polymerase activity assays well known to the skilled person. A protein which can direct the synthesis of new DNA strands (DNA synthesis) by the incorporation of deoxynucleoside triphosphates in a template-dependent manner is said to be "capable of DNA polymerase activity".

For example, by DNA primase activity, it is intended to designate the activity of an enzyme which is involved in the initiation of DNA replication and which catalyzes the polymerization of short ribonucleic acid (RNA) primers on the template DNA.

For example, by nucleotidyl transferase activity, it is intended to designate the catalysis of the transfer of a nucleotidyl group to a reactant. In mammals the corresponding enzyme is involved in the non-homologous end joining (NHEJ) process that is responsible for repairing double strand breaks in the DNA (Paull, 2005). For example, the nucleotidyl transferase activity of Taq polymerase is widely used in the T/A cloning system.

In a preferred embodiment, the polypeptide according to the present invention has been isolated from hyperthermophilic Archaea or Bacteria.

In a preferred embodiment, the polypeptide according to the present invention exhibits a DNA polymerase activity at a temperature greater than or equal to 65°C, more preferably greater than or equal to 70°C, still more preferably comprised between 70°C and 80°C.

By thermophilic Archaea or Bacteria, it is intended to designate a member of Archaea or Bacteria that is capable of best growth at temperatures comprised between 45°C and 100°C or more.

Thermophiles are classified into:

- moderate thermophiles which are intended to designate members that are capable of best growth in temperatures comprised between 45 °C and 65 °C;

- extreme thermophiles which are intended to designate members that are capable of best growth in temperatures comprised between 65°C and 80°C; and

- hyperthermophiles which are intended to designate members that are capable of best growth in temperatures superior to 80°C,

(members that are capable of best growth in temperatures comprised between 25°C and 45°C being designated as mesophiles and members that are capable of best growth in temperatures comprised between -5°C and 25°C being designated as psychrophiles).

In a more preferred embodiment, the polypeptide according to the present invention has been isolated from a member of the phylum Euryarchaeota.
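
The temperature classes defined above can be summarised, purely for illustration, as a small lookup function. This Python sketch is not part of the invention; the function name is hypothetical, and the handling of shared boundary values is an assumption (a boundary value is assigned to the warmer class, except 80°C, which stays with the extreme thermophiles because hyperthermophiles are defined as growing best above 80°C).

```python
def classify_by_optimal_growth(temp_c):
    """Map an optimal growth temperature in degrees C to the class named in the text."""
    if temp_c > 80:
        return "hyperthermophile"
    if temp_c >= 65:
        return "extreme thermophile"
    if temp_c >= 45:
        return "moderate thermophile"
    if temp_c >= 25:
        return "mesophile"
    if temp_c >= -5:
        return "psychrophile"
    # Below the lowest range given in the text.
    return "unclassified"


for optimum in (10, 37, 50, 70, 85):
    print(optimum, classify_by_optimal_growth(optimum))
```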

In a preferred embodiment, the polypeptide according to the invention or the nucleic acid encoding thereof is isolated from a Thermococcus sp, or plasmid or virus thereof, preferably from a Thermococcus sp plasmid or virus, more preferably from a Thermococcus nautilus variants plasmid or from a pTN2 plasmid variant of Thermococcus nautilus, more preferably from the gene ORF12 having the sequence SEQ ID NO: 13, or variant thereof, encoding said DNA polymerase.

By variant of an archaeon or a bacterium, virus, plasmid or gene, it is intended to designate in the present invention a natural variant of said archaeon, bacterium, virus, plasmid or gene which can be isolated, or a variant of the wild type archaeon, bacterium, virus, plasmid or gene which has been obtained by mutation, for example by a method implementing a site-directed mutagenesis step.

In a more preferred embodiment, the polypeptide according to the present invention is encoded by a nucleic acid isolated from a pTN2 plasmid having the sequence SEQ ID NO: 1 or a variant thereof.

By a variant of a pTN2 plasmid, it is intended to designate a plasmid having a nucleic sequence exhibiting at least 70 % identity, preferably 75 %, 80 %, 85 %, 90 %, 92 %, 95 %, 97,5 %, 98 %, 98,5 %, 99 % or 99,5 % identity, with the sequence SEQ ID NO: 1 depicting the nucleic acid sequence of the Thermococcus nautilus 30/1 plasmid pTN2.

Among the variants of a pTN2 plasmid, the variants exhibiting an ORF sequence having at least 70 % identity, preferably 75 %, 80 %, 85 %, 90 %, 92 %, 95 %, 97,5 %, 98 %, 98,5 %, 99 % or 99,5 % identity, with the sequence SEQ ID NO: 13 are preferred.

In a preferred embodiment, the polypeptide according to the present invention further exhibits a reverse transcriptase activity.

For example, by reverse transcriptase activity, it can be intended to designate the catalysis of RNA-template-directed extension of the 3'-end of a DNA strand by one deoxynucleotide at a time.

The present invention is also directed to an isolated polypeptide or to an isolated polypeptide according to the present invention, wherein said polypeptide comprises a polypeptide selected from the group of polypeptides consisting of:

a) the polypeptide having the amino acid sequence SEQ ID NO: 14;

b) a fragment of a) having a DNA polymerase activity and/or a DNA primase activity and/or a nucleotidyl transferase activity;

c) a polypeptide having a sequence which has at least 70 % identity, preferably 75 %, 80 %, 85 %, 90 %, 92 %, 95 %, 97,5 %, 98 %, 98,5 %, 99 % or 99,5 % identity, after optimum alignment with the sequence SEQ ID NO: 14, or with a sequence as defined in b), said polypeptide having a DNA polymerase activity and/or DNA primase activity and/or nucleotidyl transferase activity, preferably a DNA polymerase activity at a temperature greater than or equal to 65°C, more preferably greater than or equal to 70°C, still more preferably comprised between 70°C and 80°C.

In a preferred embodiment, said polypeptide according to the present invention or one of its fragments is a polypeptide having DNA polymerase, primase and nucleotidyl transferase activities.

In a preferred embodiment, said polypeptide according to the present invention or one of its fragments is a polypeptide further exhibiting a reverse transcriptase activity.

In a preferred embodiment, said fragment having a DNA polymerase and/or primase activity and/or nucleotidyl transferase activity, has at least 50, 100, 150, 200, 250, 300, 350, 400, 450, 500 or 550 amino acids.

In a preferred embodiment, the DNA polymerase according to the invention or the nucleic acid encoding thereof is isolated from a Thermococcus sp, or plasmid or virus thereof, preferably from a Thermococcus sp plasmid or virus, from a Thermococcus nautilus variants plasmid or from a pTN2 plasmid variant of Thermococcus nautilus, more preferably from the gene, or variant thereof, encoding said DNA polymerase. By variant of a bacteria, virus, plasmid or gene, it is intended to designate in the present invention a natural variant of said bacteria, virus, plasmid or gene which can be isolated, or a variant of the wild type bacteria, virus, plasmid or gene which has been obtained by mutation, for example by a method implementing a site-directed mutagenesis step.

In a more preferred embodiment, the polypeptide of the present invention, or fragments thereof, comprises at least one of the three following motifs:

DhD;

S/TG-GhQ/H; or

D— D-RhhRhP— N,

preferably two of these three motifs, more preferably the three motifs.

In the present description, the terms polypeptides, polypeptide sequences, peptides and proteins are interchangeable.

In the present description, the terms nucleic acid, polynucleotide, oligonucleotide, nucleic acid sequence and nucleotide sequence are interchangeable.

The terms "identical" or percent "identity", in the context of two or more polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues that are the same (i.e., about 70 % identity, preferably 75 %, 80 %, 85 %, 90 %, 92 %, 95 %, 97,5 %, 98 %, 98,5 %, 99 % or 99,5 % identity, or higher, over a specified region when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters, or by manual alignment and visual inspection (see, e.g., NCBI web site). The definition also includes sequences that have deletions and/or additions, as well as those that have substitutions. As described below, the preferred algorithms can account for gaps and the like. Preferably, identity exists over a region that is at least about 25 amino acids in length, or more preferably over a region that is 25-75 amino acids in length.

For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Preferably, default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.

Methods of alignment of sequences for comparison are well known in the art. Preferred examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997) and Altschul et al., J. Mol. Biol. 215:403-410 (1990).

For example, it is possible to use the BLAST program, "BLAST 2 sequences" (Tatusova et al., "Blast 2 sequences - a new tool for comparing protein and nucleotide sequences", FEMS Microbiol Lett. 174:247-250, 1999) available on the site http://www.ncbi.nlm.nih.gov/gorf/bl2.html, the parameters used being those given by default (in particular for the parameters "open gap penalty": 5, and "extension gap penalty": 2; the matrix chosen being, for example, the matrix "BLOSUM 62" proposed by the program), the percentage of identity between the two sequences to be compared being calculated directly by the program.
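
To make the notion of percent identity concrete, the following minimal Python sketch counts identical residues in two sequences that have already been aligned (gaps written as "-"). It is not the BLAST algorithm and does not perform the alignment itself. The function names are hypothetical, and taking the alignment length as the denominator is an assumption, since several conventions exist.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Columns where both sequences carry a gap give no information and are dropped.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("empty alignment")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=70.0):
    """True if the pair reaches the stated identity threshold (70 % by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy alignment: 6 identical columns out of 8.
print(percent_identity("MKDLD-AS", "MKELDGAS"))
```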

By amino acid sequence having at least 70 % identity, preferably 75 %, 80 %, 85 %, 90 %, 92 %, 95 %, 97,5 %, 98 %, 98,5 %, 99 % or 99,5 % identity, or higher identity with a reference amino acid sequence, those having, with respect to the reference sequence, certain modifications, in particular a deletion, addition or substitution of at least one amino acid, a truncation or an elongation are preferred. In the case of a substitution of one or more consecutive or non-consecutive amino acid(s), the substitutions are preferred in which the substituted amino acids are replaced by "equivalent" amino acids. The expression "equivalent amino acids" is aimed here at indicating any amino acid capable of being substituted with one of the amino acids of the base structure without, however, essentially modifying the DNA polymerase activity of the reference polypeptide, as will be defined later, especially in the last paragraph of example 4.

These equivalent amino acids can be determined either by relying on their structural homology with the amino acids which they replace, or on results of comparative trials of DNA polymerase activity between the different polypeptides capable of being carried out. By way of example, mention is made of the possibilities of substitution capable of being carried out without resulting in a profound modification of the DNA polymerase activity of the corresponding modified polypeptide. It is thus possible to replace leucine by valine or isoleucine, aspartic acid by glutamic acid, glutamine by asparagine, arginine by lysine, etc., the reverse substitutions being naturally envisageable under the same conditions.
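
By way of illustration, the substitutions named above can be written as a lookup table. The Python sketch below covers only the pairs explicitly listed in the text (leucine/valine, leucine/isoleucine, aspartic acid/glutamic acid, glutamine/asparagine, arginine/lysine); the "etc." in the text means the list is not exhaustive, and the function names are hypothetical.

```python
# Pairs named in the text; each pair is treated as valid in both directions.
EQUIVALENT_PAIRS = {
    frozenset("LV"),
    frozenset("LI"),
    frozenset("DE"),
    frozenset("QN"),
    frozenset("RK"),
}


def is_equivalent_substitution(original, replacement):
    """True if the one-letter residues form one of the listed equivalent pairs."""
    return frozenset((original.upper(), replacement.upper())) in EQUIVALENT_PAIRS


def non_equivalent_changes(reference, variant):
    """List (position, ref, var) for substitutions outside the listed pairs."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    changes = []
    for position, (ref, var) in enumerate(zip(reference.upper(), variant.upper()), start=1):
        if ref != var and not is_equivalent_substitution(ref, var):
            changes.append((position, ref, var))
    return changes


# Toy peptides: L->I, D->E and R->K are listed pairs, Q->W is not.
print(non_equivalent_changes("MLDRQ", "MIEKW"))
```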

In a second aspect, the present invention provides a nucleic acid encoding a polypeptide having a DNA polymerase and/or primase activity and/or nucleotidyl transferase activity according to the invention, particularly the nucleic acid having the sequence SEQ ID NO: 13 or having a sequence which has at least 70 % identity, preferably 75 %, 80 %, 85 %, 90 %, 92 %, 95 %, 97,5 %, 98 %, 98,5 %, 99 % or 99,5 % identity, or higher identity after optimum alignment with the sequence SEQ ID NO: 13, the polypeptide encoded by said nucleic acid having a DNA polymerase activity, preferably at a temperature greater than or equal to 65°C, preferably greater than or equal to 70°C, more preferably comprised between 70°C and 80°C.

In another aspect, the invention encompasses a vector, preferably a cloning or an expression vector, comprising the nucleic acid of the invention.

In a preferred embodiment, the vector according to the invention is characterized in that said nucleic acid is operably linked to a promoter.

The invention aims especially at cloning and/or expression vectors which contain a nucleotide sequence according to the invention. The vectors according to the invention preferably contain elements which allow the expression and/or the secretion of the nucleotide sequences in a determined host cell. The vector must therefore contain a promoter, signals of initiation and termination of translation, as well as appropriate regions of regulation of transcription. It must be able to be maintained in a stable manner in the host cell and can optionally have particular signals which specify the secretion of the translated protein. These different elements are chosen and optimized by the person skilled in the art as a function of the host cell used. To this effect, the nucleotide sequences according to the invention can be inserted into autonomous replication vectors in the chosen host, or be integrative vectors of the chosen host.

Such vectors are prepared by methods currently used by the person skilled in the art, and the resulting clones can be introduced into an appropriate host by standard methods, such as lipofection, electroporation, thermal shock, or chemical methods.

The vectors according to the invention are, for example, vectors of plasmidic or viral origin. They are useful for transforming host cells in order to clone or to express the nucleotide sequences according to the invention.

In a preferred embodiment, the vector of the present invention is the plasmid vector pET21 contained in the bacteria which has been deposited according to the Budapest Treaty at the C.N.C.M. (Collection Nationale de Cultures de Microorganismes, Institut Pasteur, Paris, France) on January 12, 2010 under the number I-4272.

This cloned plasmid vector is the vector pET21 wherein the nucleic sequence of the DNA polymerase of the invention has been inserted between the NdeI and XhoI sites of the pET21 plasmid.

The term "expression vector" refers to a recombinant DNA molecule containing the desired coding nucleic acid sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism. Nucleic acid sequences necessary for expression in prokaryotes usually include a promoter, an operator (optional), and a ribosome binding site, often along with other sequences. Eukaryotic cells are known to utilize promoters, enhancers, and termination and polyadenylation signals.

In another aspect, the present invention relates to a host cell comprising the vector according to the invention.

In another aspect, the present invention is directed to the isolated plasmid pTN2 and the host cell comprising said plasmid pTN2, particularly the recombinant bacteria which has been deposited according to the Budapest Treaty at the C.N.C.M. (Collection Nationale de Cultures de Microorganismes, Institut Pasteur, Paris, France) on February 2, 2010 under the number I-4275.

The DNA polymerase polypeptide of the present invention may be expressed in either prokaryotic or eukaryotic host cells. Nucleic acid encoding the DNA polymerase polypeptide of the present invention may be introduced into bacterial host cells by a number of means including transformation of bacterial cells made competent for transformation by treatment with calcium chloride or by electroporation. If the DNA polymerase polypeptide of the present invention is to be expressed in eukaryotic host cells, nucleic acid encoding the DNA polymerase polypeptide of the present invention may be introduced into eukaryotic host cells by a number of means including calcium phosphate co-precipitation, spheroplast fusion, electroporation and the like. When the eukaryotic host cell is a yeast cell, transformation may be effected by treatment of the host cells with lithium acetate or by electroporation or any other method known in the art. It is contemplated that any host cell will be useful in producing the peptides or proteins or fragments thereof of the invention.

The cells transformed according to the invention can be used in processes for preparation of recombinant polypeptides according to the invention. The processes for preparation of a polypeptide according to the invention in recombinant form, characterized in that they employ a vector and/or a cell transformed by a vector according to the invention, are themselves comprised in the present invention.

Preferably, a cell transformed by a vector according to the invention is cultured under conditions which allow the expression of said polypeptide and said recombinant peptide is recovered.

In another aspect, the present invention relates to a method of producing a DNA polymerase, particularly a recombinant DNA polymerase, said method comprising:

(a) culturing the host cell according to the invention in conditions suitable for the expression of said nucleic acid; and

(b) isolating said DNA polymerase from said host cell.

Preferably, the present invention relates to a method of producing a DNA polymerase, particularly a recombinant DNA polymerase which is soluble in an aqueous solvent, said method comprising:

a) culturing a host cell transformed by a vector according to the invention in conditions suitable for the expression of said nucleic acid;

b) harvesting the cells by centrifugation and resuspending the cell pellet in an aqueous buffer;

c) lysing the cells, preferably by sonication, and, optionally, heating the lysate, preferably at a temperature equal to or greater than 65°C, more preferably at a temperature between 70°C and 80°C; and

d) centrifuging and recovering the aqueous soluble fraction containing the recombinant DNA polymerase.

Said host cell can be a prokaryotic or a eukaryotic cell.

As has been said, the host cell can be chosen from prokaryotic or eukaryotic systems. In particular, it is possible to use nucleotide sequences facilitating secretion in such a prokaryotic or eukaryotic system. A vector according to the invention carrying such a sequence can therefore advantageously be used for the production of recombinant proteins, intended to be secreted. In effect, the purification of these recombinant proteins of interest will be facilitated by the fact that they are present in the supernatant of the cell culture rather than in the interior of the host cells.

In another aspect, the present invention encompasses a method of synthesizing a double-stranded DNA molecule comprising:

(a) hybridizing a primer to a first DNA molecule; and

(b) incubating said DNA molecule of step (a) in the presence of one or more deoxyribonucleoside triphosphates or analogs thereof and the polypeptide according to the invention, under conditions sufficient to synthesize a second DNA molecule complementary to all or a portion of said first DNA molecule.

In another aspect, the present invention encompasses a method of synthesizing or amplifying a single-stranded DNA or a double-stranded DNA from an unknown DNA template, said method comprising: optionally, where necessary, first denaturing the double-stranded DNA template; and

(a) incubating said DNA template in the presence of one or more deoxyribonucleoside triphosphates or analogs thereof and the polypeptide according to the invention, or obtained by a method according to the invention, having DNA polymerase and primase activities, under conditions sufficient to synthesize a second DNA molecule complementary to all or a portion of said DNA template.

In another aspect, the present invention encompasses a method of synthesizing a single-stranded DNA molecule comprising:

(a) synthesis of a double-stranded DNA molecule by a method according to the invention; and

(b) denaturing the double-stranded DNA molecule obtained in step (a); and

(c) recovering the single-stranded DNA molecule obtained in step (b).

In another aspect, the present invention encompasses a method for the production and/or the amplification of DNA molecules of greater than 5 kilobases (Kb) in length, preferably greater than 10, 15, 20 or 25 Kb, said method implementing a polypeptide of the present invention, particularly the method according to the invention, wherein the first DNA molecule which serves as a template in step (a) is greater than 5, 10, 15 or 20 Kb, preferably greater than 25 Kb.

In the method according to the invention, said deoxyribonucleoside triphosphates are selected from the group consisting of dATP, dCTP, dGTP and dTTP.

In another aspect, the present invention encompasses a method for amplifying a double stranded DNA molecule, comprising:

(a) providing a first and second primer, wherein said first primer is complementary to a sequence at or near the 3'-terminus of the first strand of said DNA molecule and said second primer is complementary to a sequence at or near the 3'-terminus of the second strand of said DNA molecule;

(b) hybridizing said first primer to said first strand and said second primer to said second strand in the presence of the polypeptide according to the invention, under conditions such that a nucleic acid complementary to said first strand and a nucleic acid complementary to said second strand are synthesized;

(c) denaturing

- said first and its complementary strands; and

- said second and its complementary strands; and

(d) repeating steps (a) to (c) one or more times.
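
The primer geometry of step (a) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the primers are taken exactly at the 3'-termini (the text allows "at or near"), the default primer length of 20 nt and the function names are hypothetical, and no melting temperature or specificity check is attempted.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the complementary strand, written 5'->3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def primer_pair(first_strand, length=20):
    """Return (first_primer, second_primer), both written 5'->3'."""
    first_strand = first_strand.upper()
    if len(first_strand) < length:
        raise ValueError("strand is shorter than the requested primer length")
    # First primer: complementary to the 3' end of the first strand.
    first_primer = reverse_complement(first_strand[-length:])
    # The second strand is the reverse complement of the first, so a primer
    # complementary to its 3' end is identical to the 5' end of the first strand.
    second_primer = first_strand[:length]
    return first_primer, second_primer


# Toy 12 nt strand with 4 nt primers, for readability only.
print(primer_pair("ATGCCGTTAGGC", length=4))
```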

It is also preferred that the step of amplifying is performed by PCR, a PCR-like method or an RT-PCR reaction implementing the polypeptide having a DNA polymerase activity (DNA polymerase polypeptide).

"PCR" describes a method of gene amplification which involves sequence-based hybridization of primers to specific genes within a DNA sample and subsequent amplification involving multiple rounds of annealing (hybridization), elongation and denaturation using a heat-stable DNA polymerase.

"RT-PCR" is an abbreviation for reverse transcriptase-polymerase chain reaction. Subjecting mRNA to the reverse transcriptase enzyme results in the production of cDNA which is complementary to the base sequences of the mRNA. Large amounts of selected cDNA can then be produced by means of the polymerase chain reaction, which relies on the action of a heat-stable DNA polymerase.

"PCR-like" will be understood to mean all methods using direct or indirect reproductions of nucleic acid sequences, or alternatively in which the labelling systems have been amplified. These techniques are of course known; in general they involve the amplification of DNA by a polymerase, and when the original sample is an RNA, it is advisable to carry out a reverse transcription beforehand. There are currently a great number of methods allowing this amplification, for example the so-called NASBA ("Nucleic Acid Sequence Based Amplification"), TAS ("Transcription based Amplification System"), LCR ("Ligase Chain Reaction"), ERA ("Endo Run Amplification"), CPR ("Cycling Probe Reaction") and SDA ("Strand Displacement Amplification") methods, all well known to persons skilled in the art.

When using mRNA, the method may be carried out by converting the isolated mRNA to cDNA according to standard methods using reverse transcriptase (RT-PCR).

In another aspect, the present invention encompasses a method of preparing cDNA from mRNA, comprising:

(a) contacting mRNA with an oligo(dT) primer or other complementary primer to form a hybrid; and

(b) contacting said hybrid formed in step (a) with the DNA polymerase polypeptide according to the invention and dATP, dCTP, dGTP and dTTP, whereby a cDNA-RNA hybrid is obtained.

The present invention is further directed to a method of preparing dsDNA (double strand DNA) from mRNA, comprising:

(a) contacting mRNA with an oligo(dT) primer or other complementary primer to form a hybrid; and

(b) contacting said hybrid formed in step (a) with the polypeptide according to the invention, dATP, dCTP, dGTP and dTTP, and an oligonucleotide or primer which is complementary to the first strand cDNA;

whereby dsDNA is obtained.
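
The base-pairing relationship between the mRNA, the first-strand cDNA and the second DNA strand can be sketched as follows. This Python snippet models only the sequence relationship, not the enzymatic reaction; the function names and the toy mRNA are hypothetical, and full-length synthesis of both strands is assumed.

```python
RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def first_strand_cdna(mrna):
    """DNA strand complementary to the mRNA, written 5'->3'."""
    return mrna.upper().translate(RNA_TO_CDNA)[::-1]


def second_strand(cdna):
    """DNA strand complementary to the first-strand cDNA, written 5'->3'."""
    return cdna.upper().translate(DNA_COMPLEMENT)[::-1]


# Toy mRNA ending in a short poly(A) tail, where an oligo(dT) primer would anneal.
mrna = "AUGGCCAAAAAA"
cdna = first_strand_cdna(mrna)
print(cdna)                 # begins with the T run contributed by the oligo(dT) primer
print(second_strand(cdna))  # same sequence as the mRNA, with T in place of U
```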

In another aspect, the present invention encompasses a method for determining the nucleotide base sequence of a DNA molecule, comprising the steps of:

(a) contacting said DNA molecule with a primer molecule able to hybridize to said DNA molecule;

(b) incubating said hybrid formed in step (a) in a vessel containing four different deoxynucleoside triphosphates, a DNA polymerase polypeptide according to the invention, and one or more DNA synthesis terminating agents which terminate DNA synthesis at a specific nucleotide base, wherein each said agent terminates DNA synthesis at a different nucleotide base; and

(c) separating the DNA products of the incubating reaction according to size, whereby at least a part of the nucleotide base sequence of said DNA can be determined.

In a preferred embodiment, said terminating agent is a dideoxynucleoside triphosphate.

A DNA synthesis terminating agent which terminates DNA synthesis at a specific nucleotide base refers to compounds, including but not limited to, dideoxynucleosides having a 2', 3' dideoxy structure (e.g., ddATP, ddCTP, ddGTP and ddTTP). Any compound capable of specifically terminating a DNA sequencing reaction at a specific base may be employed as a DNA synthesis terminating agent.
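
The principle of reading a sequence from size-separated, base-specific termination products can be illustrated with a short simulation. In this Python sketch, which is an idealised model and not a description of any instrument, each terminating agent is assumed to stop synthesis at every occurrence of its base, fragment lengths are counted from the first incorporated nucleotide (the primer is ignored), and the function names are hypothetical.

```python
def termination_fragments(new_strand):
    """Fragment lengths produced by each base-specific terminating agent."""
    fragments = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand.upper(), start=1):
        # A fragment of this length ends with the terminating analog of `base`.
        fragments[base].append(position)
    return fragments


def read_sequence(fragments):
    """Rebuild the synthesized sequence by ordering all fragments by size."""
    base_by_length = {
        length: base for base, lengths in fragments.items() for length in lengths
    }
    return "".join(base_by_length[length] for length in sorted(base_by_length))


fragments = termination_fragments("GATTC")
print(fragments)
print(read_sequence(fragments))
```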

In another aspect, the present invention encompasses a method for amplification of a DNA molecule comprising the steps of:

(a) incubating said DNA molecule in the presence of a polypeptide having DNA polymerase activity according to the invention and a mixture of different deoxynucleoside triphosphates.

In a preferred embodiment, the method for synthesizing or amplifying a DNA molecule according to the invention comprises the addition of an enzyme having a helicase activity in order to improve the processivity of the DNA polymerase polypeptide of the present invention.

In a further aspect, the present invention is directed to a kit for sequencing a DNA molecule, comprising:

(a) a first container means comprising the polypeptide according to the invention;

(b) a second container means comprising one or more dideoxyribonucleoside triphosphates; and

(c) a third container means comprising one or more deoxyribonucleoside triphosphates.

The present invention also encompasses a kit for amplifying a DNA molecule, comprising:

(a) a first container means comprising the polypeptide according to the invention; and

(b) a second container means comprising one or more deoxyribonucleoside triphosphates.

The present invention also comprises the use of a polypeptide according to the invention for implementing a rolling circle amplification, multiple displacement amplification or protein-primed amplification method.

These particular methods are well known by the skilled person and are for example described in the documents:

Lizardi et al., 1998; Baner et al., 1998; Dean et al., Genome Res., 11, 1095-1099, 2001;

Larsson et al., Nature methods, 1, 227-232, 2004; for isothermal rolling-circle amplification method;

Dean et al., 2002, for multiple displacement amplification method; and

Blanco et al., 1994, for protein-primed amplification method.

In another aspect, the present invention also provides methods for producing antibodies against the DNA polymerase polypeptide of the invention, comprising exposing an animal having immunocompetent cells to an immunogen comprising a polypeptide of the invention or at least an antigenic portion (determinant) of a polypeptide of the invention under conditions such that the immunocompetent cells produce antibodies directed specifically against the polypeptide of the invention, or an epitopic portion thereof. In one embodiment, the method further comprises the step of harvesting the antibodies. In an alternative embodiment, the method comprises the step of fusing the immunocompetent cells with an immortal cell line under conditions such that a hybridoma is produced.

Such antibodies can be used particularly for purifying the polypeptide of the present invention in a sample where other components are present.

The present invention is finally directed to an apparatus for DNA sequencing or amplification having a reactor comprising a DNA polymerase polypeptide of the present invention.

The following examples and the figures are given for the purpose of illustrating various embodiments of the invention and are not meant to limit the present invention in any fashion.

Legends of the figures

Figures 1A and 1B: Schematic representation of the three new plasmids

- Figure 1A. pTN2 and pP12-1 were drawn at the same scale together with the PAV1 genome. ORFs are numbered and represented as arrows. ORFs encoding homologous proteins have the same color. White ORFs do not have detectable homologues among these three genomes.

- Figure 1B. pT26-2 with ORFs numbered and represented as arrows. Coloured ORFs encode proteins with expected activity or function.

- Figures 1A and 1B. Hatched ORFs harbour putative hydrophobic segments. Circles indicate large intergenic regions including putative replication origins.

Figures 2A and 2B: tn2-12p is a DNA polymerase

- Figure 2A: The recombinant protein tn2-12p (15 µM) (lanes 1 and 2) or Taq polymerase (Promega, 0,02 U/µl) (lane 3) was incubated with a short primer-template substrate (5'-32P-labeled 20 nt oligonucleotide CGAACCCGTTCTCGGAGCAC (SEQ ID NO: 15) hybridized to the 42 nt template TTCTGCACAAAGCGGTTCTGCAGTGCTCCGAGAACGGGTTCG (SEQ ID NO: 16)). Primers were extended in the presence of 0,2 mM dNTPs (lanes 1 and 3) or 0,2 mM NTPs (lane 2) for 20 min at 70°C. In the control reaction with NTPs (lane 2) no primer elongation was observed. The buffer used for the polymerization reaction is described in Materials and Methods. Extension products were analyzed on a 16 % denaturing polyacrylamide gel.

- Figure 2B: The primer extension activity of tn2-12p was assayed at 70°C (lanes 1-3) and 80°C (lanes 4 and 5). A 5'-32P-labelled 18 nt primer (GTAAAACGACGGCCAGTG) (SEQ ID NO: 17) was hybridized with ssDNA of M13. 10 nM of this primer-template substrate was incubated for 20 min either without any protein (lane 1), with 10 µM of tn2-12p (lanes 2 and 4) or with 0,05 U/µl of Taq polymerase (lanes 3 and 5) in the presence of 0,2 mM dNTPs at one of the temperatures indicated. The buffer used for the polymerization reaction is described in Materials and Methods. Extension products were analyzed on a 16 % denaturing polyacrylamide gel.
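
The primer-template substrate of Figure 2A can be checked with a short script. The Python sketch below uses the two oligonucleotide sequences given in the legend; the helper names are hypothetical. It verifies that the 20 nt primer anneals to the 3' end of the 42 nt template and derives the product expected if the primer is extended to the 5' end of the template, without modelling any non-templated addition.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMER = "CGAACCCGTTCTCGGAGCAC"                           # SEQ ID NO: 15
TEMPLATE = "TTCTGCACAAAGCGGTTCTGCAGTGCTCCGAGAACGGGTTCG"   # SEQ ID NO: 16


def reverse_complement(dna):
    """Return the complementary strand, written 5'->3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def fully_extended_product(primer, template):
    """Product of copying the template from the annealed primer to the template 5' end."""
    annealing_site = reverse_complement(primer)
    if not template.endswith(annealing_site):
        raise ValueError("primer does not anneal to the 3' end of the template")
    # Full extension yields the exact reverse complement of the template,
    # which starts with the primer itself.
    return reverse_complement(template)


product = fully_extended_product(PRIMER, TEMPLATE)
print(len(PRIMER), len(TEMPLATE), len(product))
print(product)
```

Under these assumptions the primer gains 22 templated nucleotides, giving a labeled product of the same length as the template.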

Figure 3: Phylogenetic trees of SFI helicase.

Figure 4: CAGs families.

The plasmid pT26-2 (yellow) and related elements of the family, TKV2 (blue), TKV3 (blue), PHV1 (light blue), MMPV1 (orange), MMC7V1 (pink), MMC7V2 (pink) and MMC6V1 (mauve), share a core of six genes and are linked with bold lines. They share four genes with a string of CAGs present in genomes of Thermococcales, Methanococcales and Methanosarcinales ( ), and their edges are colored in dark gray. These elements may show one or several links with integrated elements detected in other archaea (Cortez et al., Genome Biology 2009). mmp is M. maripaludis, mmq is M. maripaludis C5, mmx is M. maripaludis C6, mmz is M. maripaludis C7, mja is M. jannaschii and mvn is M. vannielii. Methanosarcinales: mac is M. acetivorans.

Figure 5: Alignment of DNA polymerases.

Figures 6A and 6B: Cumulative GC (Figure 6A) and AT (Figure 6B) skew graphs for the pTN2, pP12-1 and pT26-2 plasmid genomes.

Figure 7: Domain organization of the tn2-12p primase-polymerase. The Prim-pol domain (residues 1-257) and the domain sharing sequence similarity with the large subunit of archaeo/eukaryotic primases (residues 268-699) are indicated. The four conserved motifs of the Prim-pol domain (see bars in the residues 1-257 domain) and the four Cys residues predicted to be involved in Fe-S cluster formation (see bars in the residues 268-699 domain) are also shown.

Figures 8A and 8B: Protein PolpTN (tn2-12) is a DNA dependent DNA polymerase that can act as a terminal transferase.

(A) DNA polymerase. After hybridization of two complementary oligonucleotides, one of 42 nt and another, 32P-labelled, of 20 nt, a radioactively labeled product of about 42 nt could be obtained at 70°C in the presence of PolpTN (tn2-12). In the control (lane 3) the same result is observed with Taq Pol. The polymerization is dependent on the presence of dNTPs: no polymerization is observed if NTPs are used instead of dNTPs (lane 2).

(B) Terminal transferase activity. More careful analysis of the polymerisation product indicates that at 60°C and using a specific buffer, one extra nucleotide is added to the 3' end of the newly synthesized strand, as with Taq polymerase.

Figures 9A and 9B: PolpTN is a highly efficient primase/polymerase

A. PolpTN could initiate the synthesis of the (-) strand of the M13 circular genome in the presence of dNTPs only, with neither a specific primer nor NTPs added. NTPs are therefore not required for the initiation of DNA synthesis. The DNA polymerase activity of PolpTN (tn2-12) allows the synthesis of several kb of DNA (the M13 genome is 6,4 kb), indicating that PolpTN is a fully processive polymerase.

B. Priming and DNA polymerization using as template a 43 nt oligonucleotide with a blocked 3' end. The only labeled product that can be obtained in these conditions is the result of primase/polymerase activity.

Figure 10: Terminal transferase activity of PolpTN

Two complementary oligonucleotides of 50 nt were hybridized to form a double-stranded DNA fragment. One of the oligonucleotides is 32P-labelled. In the presence of PolpTN (tn2-12) a 51 nt product is observed. The optimal temperature for the terminal transferase activity of PolpTN (tn2-12) is 80°C (compared to 60°C) and any of the four dNTPs can be used as substrate.

Figure 11: PolpTN is a reverse transcriptase

In the presence of the 30 nt RNA template and a complementary 20 nt DNA primer, PolpTN (tn2-12) is able (lanes 1 and 2) to synthesize longer DNA fragments in the same way as commercial RTs (Promega and Fermentas - lanes 3 and 4). Lane 5 is a negative control.

Figure 12: Mutations in motifs I and IV abolish the DNA polymerase activity of PolpTN

1 - no proteins; 2 - WT PolpTN; 3 - MI mutant; 4, 5 - MIV mutant; 6, 7 - 2Cys mutant; 8 - Taq polymerase

Figure 13: Mutations in motifs I and IV probably abolish the primase activity of PolpTN

1 - WT PolpTN; 2 - MI mutant; 3 - MIV mutant; 4 - 2Cys mutant; 5 - Taq polymerase

Figure 14: Unusual catalytic activity of PolpTN: self-sufficient reverse primase/transcriptase

(AMV = Avian Myeloblastosis Virus reverse transcriptase; WT = motifs with wild type conserved amino acids; MI = MI mutant; MIV = MIV mutant; MV = MV mutant, see Table 3)

EXAMPLE 1: Materials and Methods

Strains cultivation

Thermococcus nautilus sp. 30-1, Pyrococcus sp. 12-1, and Thermococcus sp. 26-2 were isolated from single colonies by plating on Gelrite an enrichment culture obtained from fragments of chimneys collected at deep-sea hydrothermal vents (-2630 m) in the East Pacific Ocean (Lepage et al., 2004). Cells were grown in Zillig's broth (ZB) made anaerobic by addition of Na2S (0.1 mg/liter), with sulfur flowers S (Fisher Scientific), as previously described (Lepage et al., 2004). The cultures were incubated in Penny's flasks at 85°C for Thermococcus sp. 26-2 and T. nautilus, or 95°C for Pyrococcus sp. 12-1.

Plasmid isolation

Plasmids were obtained by an alkaline extraction procedure as described previously (Soler et al., 2007) from 50 mL cultures of T. nautilus 30-1, Pyrococcus 12-1 or Thermococcus 26-2 cells grown to stationary phase. Plasmids were sequenced by Fidelity Systems, USA. The complete sequences of the plasmids have been determined and annotated. For pTN2 of T. nautilus 30-1, see SEQ ID NO: 1 and Table 1.

Sequence analysis

BLAST and PSI-BLAST searches were performed in the NCBI non-redundant (nr) databank using the following web sites: http://www.ncbi.nlm.nih.gov/BLAST/ and http://www-archbac.u-psud.fr/genomics/GenomicsToolbox.html/. Hydrophobic regions were detected using the TMpred and TMHMM prediction programs located at the web site http://www.expasy.org/tools/. ORFs were identified with a minimum length of 50 amino acids and with one of the four initiator codons (ATG, GTG, CTG, TTG). The exact position of each initiator codon was then checked individually (and manually modified if required) depending on the position of the putative Shine-Dalgarno sequence located upstream.
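The ORF criteria stated above (at least 50 amino acids, one of the four initiator codons) can be illustrated with a minimal sketch. It is not the procedure actually used for the annotation: the function name `find_orfs`, the restriction to the forward strand of a linear sequence, the use of the three standard stop codons and the choice of the most upstream initiator codon in each open frame are all assumptions made for illustration, and the manual check against the Shine-Dalgarno sequence is not reproduced.

```python
# Hypothetical sketch of the ORF criteria described in the text.
# Assumptions (not from the source): forward strand only, linear sequence,
# standard stop codons, most upstream initiator codon retained.

START_CODONS = {"ATG", "GTG", "CTG", "TTG"}  # initiator codons listed in the text
STOP_CODONS = {"TAA", "TAG", "TGA"}          # assumed standard stop codons
MIN_AA = 50                                  # minimum ORF length from the text


def find_orfs(seq, min_aa=MIN_AA):
    """Return sorted (start, end, frame) tuples for forward-strand ORFs.

    Coordinates are 1-based and inclusive; the stop codon is included in `end`.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # 0-based index of the current initiator codon, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3  # codons before the stop codon
                if n_aa >= min_aa:
                    orfs.append((start + 1, i + 3, frame))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    # Toy sequence: a GTG initiator followed by 49 GCT codons and a TAA stop.
    toy = "CC" + "GTG" + "GCT" * 49 + "TAA"
    print(find_orfs(toy))
```

Candidate start positions returned by such a scan would still have to be inspected individually, as described above, before being retained in the annotation.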

pTN2 sequence annotation.

See the Table 1.

| ORF | Putative protein | Size (kDa) | ORF position (start-stop) | RBS and start codon | Homologue in another plasmid | % identity, length of BLAST alignment (aa) | Putative motives and function |
|---|---|---|---|---|---|---|---|
| 1 | tn2-1p (SEQ ID NO: 2) | 66,1 | 1-1710 | aggagggggtggtcggGTG (SEQ ID NO: 18) | p12-1p | 69%, 569 | UvrD/Rep/PcrA superfamily I type helicase |
| 2 | tn2-2p (SEQ ID NO: 3) | 104,9 | 1785-4622 | ggggggatttgtATG (SEQ ID NO: 19) | | | Putative coiled-coil domains |
| 3 | tn2-3p (SEQ ID NO: 4) | 20,3 | 4841-5365 | tgggggatgacgATG (SEQ ID NO: 20) | p12-14p | 46%, 173 | |
| 4 | tn2-4p (SEQ ID NO: 5) | 15,2 | 5590-5979 | gggtggtgtcgtaattATG (SEQ ID NO: 21) | | | Putative helix-turn-helix domain |
| 5 | tn2-5p (SEQ ID NO: 6) | 11,2 | 5983-6279 | aggggaggtgtaaccgATG (SEQ ID NO: 22) | | | |
| 6 | tn2-6p (SEQ ID NO: 7) | 26,4 | 6285-6959 | agaggtgtaaaacaaATG (SEQ ID NO: 23) | | | |
| 7 | tn2-7p (SEQ ID NO: 8) | 19,5 | 6959-7477 | aggaggggttatgATG (SEQ ID NO: 24) | p12-9p | 41%, 41 | |
| 8 | tn2-8p (SEQ ID NO: 9) | 43 | 7542-8648 | tggaggtgccgagcGTG (SEQ ID NO: 25) | p12-10p | 46%, 367 | ATP binding site that is related to ABC transporters |
| 9 | tn2-9p (SEQ ID NO: 10) | 16 | 9184-9597 | agggggtactcataATG (SEQ ID NO: 26) | | | |
| 10 | tn2-10p (SEQ ID NO: 11) | 6,7 | 9551-9730 | cggaggaggataATG (SEQ ID NO: 27) | | | |
| 11 | tn2-11p (SEQ ID NO: 12) | 18,4 | 9763-10251 | tggaggtgatgtaaATG (SEQ ID NO: 28) | | | |
| 12 | tn2-12p (SEQ ID NO: 14) | 106,8 | 10248-4 (SEQ ID NO: 13) | agggggtgaaagtATG (SEQ ID NO: 29) | p12-17p | 55%, 922 | DNA polymerase/primase activities |

Table 1. Annotation of pTN2 plasmid from the strain Thermococcus nautilus 30/1.
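As a reading aid for the coordinates in Table 1, the short sketch below computes ORF lengths on the circular plasmid, including ORF 12 (10248-4), which runs through the end of the sequence. It assumes that coordinates are 1-based and inclusive of the stop codon, and it takes the pTN2 size of 13015 bp from Example 2; the helper name `orf_length_nt` is hypothetical.

```python
# Hypothetical helper for reading Table 1 coordinates on a circular plasmid.
# Assumptions: 1-based inclusive coordinates, stop codon included,
# pTN2 length of 13015 bp as stated in Example 2.

PTN2_LENGTH = 13015

# (ORF number, start, stop) copied from Table 1.
TABLE1_ORFS = [
    (1, 1, 1710), (2, 1785, 4622), (3, 4841, 5365), (4, 5590, 5979),
    (5, 5983, 6279), (6, 6285, 6959), (7, 6959, 7477), (8, 7542, 8648),
    (9, 9184, 9597), (10, 9551, 9730), (11, 9763, 10251), (12, 10248, 4),
]


def orf_length_nt(start, stop, genome_length=PTN2_LENGTH):
    """Length in nucleotides of an ORF on a circular genome."""
    if stop >= start:
        return stop - start + 1
    # stop < start: the ORF passes the end of the sequence and wraps to position 1.
    return (genome_length - start + 1) + stop


if __name__ == "__main__":
    for number, start, stop in TABLE1_ORFS:
        length = orf_length_nt(start, stop)
        # Print ORF number, length in nt, number of codons, and whether
        # the length is a whole number of codons.
        print(number, length, length // 3, length % 3 == 0)
```

Under these assumptions ORF 12 spans 2772 nt (924 codons including the stop codon), and every ORF length in Table 1 is a multiple of three, which is consistent with reading 10248-4 as a wrap-around ORF.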

Phylogenetic analyses

Homologous sequences were recovered by BLAST searches, and multiple alignments were performed with the selected sequences using the MUSCLE program (Edgar, 2004). Only homologous positions were used to build unrooted maximum likelihood trees using PHYML (Guindon and Gascuel, 2003). The JTT model of amino acid substitution was chosen, and a gamma correction with 4 discrete classes of sites was used. The alpha parameter and the proportion of invariable sites were estimated. The robustness of trees was tested by non-parametric bootstrap analysis using PHYML.

In silico identification of integrated mobile element and CAGs families

Mobile elements integrated in cellular genomes were identified as Clusters of Atypical Genes (CAGs) according to Cortez et al. (2009). Briefly, archaeal genomes were analyzed with a species-specific Markov model in order to obtain the list of atypical genes. We then searched for atypical genes that cluster together, since these may be recently integrated foreign elements. We used a sliding window of ten ORFs that moved along the genome sequence. A CAG was defined when 7 or more ORFs in that window showed an atypical composition. To define CAGs families, BLAST searches were performed with all the ORFs contained in our CAGs with an e-value of 10e-20. We then generated several topological networks of CAGs by drawing a line between pairs of CAGs that share a defined number of ORFs (from two up to six). The graphical representation was obtained with the Cytoscape program (http://www.cytoscape.org).

DNA polymerase assay

Two different 5'-labelled primer-template systems were used (see Figures 2A and 2B legends). The standard polymerase assay contained 10 nM primer-template substrate and 10-15 μM of recombinant protein tn2-12p in 10 mM Tris-HCl pH 9, 1 mM DTT, 50 mM KCl, 0,1% Triton X-100™, 50 mM MgCl2 and 0,2 mM dNTPs. The protein was diluted in 50 mM Tris-HCl pH 8, 1 mM DTT, 100 mM NaCl, 0,1 mM EDTA. The reactions were allowed to proceed for 20 min at 70 or 80°C and were analyzed by denaturing PAGE.

EXAMPLE 2: Isolation of three new plasmids from Thermococcales

Plasmids pTN2, pP12-1 and pT26-2 were isolated from three strains of Thermococcales that were purified from samples collected at three different deep-sea hot vents located on the East Pacific ridge during the AMISTAD cruise (Lepage et al., 2004). The strain 30-1, which harbours the plasmid pTN2, has been tentatively described as the type strain of a new species, Thermococcus nautilus (Soler et al., 2007). The strain 26-2, which harbours the plasmid pT26-2, is closely related to T. nautilus sp. 30-1 by 16S rRNA analysis and could belong to the same species, although it exhibits a very different RAPD profile (Lepage et al., 2004). Finally, the strain 12-1, which harbours the plasmid pP12-1, belongs to a RAPD group that includes a Pyrococcus species (isolate 32-4) and will be hereafter called Pyrococcus sp. 12-1. The three plasmids are circular and their sizes are 13015, 12205, and 21566 bp, respectively. The plasmid sequences have GC contents of around 47.5, 44.6 and 49.6 % for pTN2, pP12-1 and pT26-2, respectively, which are close to the GC content of Thermococcales chromosomal DNA (40-50%). By in silico analysis, we identified 12, 17 and 32 putative ORFs in pTN2, pP12-1 and pT26-2, respectively. Although pTN2 was isolated from a Thermococcus species and pP12-1 from a Pyrococcus species, they turned out to be evolutionarily related, since they share six homologous genes. In contrast, although plasmids pTN2 and pT26-2 are present in two closely related strains (30-1 and 26-2), they turned out to be completely unrelated to each other.

All ORFs of pTN2 and pP12-1 are located on the same DNA strand with only two exceptions detected in pP12-1. Similarly, 30 of the 32 pT26-2 ORFs are transcribed in the same direction (Figure 1B). We sought to predict the position of the replication origin of these three plasmids by performing cumulative GC skew analysis (Grigoriev, 1998). This method is based on the general observation that GC content usually differs between the leading and lagging strands of replication forks (Lobry, 1996). The cumulative GC skew graphs for the three plasmids show a GC frequency inversion producing V-like curves. Strikingly, the minima are located in the larger intergenic regions for each of the three plasmids (circles in Figures 1A and 1B). These intergenic regions are among the most AT-rich of the three plasmids and contain many direct and inverted repeats, features that are general characteristics of plasmid replication origins.

EXAMPLE 3: The pTN2 plasmid family

We identified 12 and 17 putative ORFs in the plasmid pTN2 and pP12-1 sequences, respectively (see Materials and Methods). All ORFs of pTN2 and most ORFs of pP12-1 are located on the same strand, with only two exceptions in the case of pP12-1. The plasmids pTN2 and pP12-1 share six homologous ORFs, including two contiguous large ORFs encoding proteins of around 66 and 105 kDa, respectively (Figures 1A and 1B). The first one (tn2-1p and p12-1p) corresponds to a helicase of the superfamily I (SFI). These helicases include the well characterized bacterial helicases UvrD, PcrA and Rep. They are widespread in the three domains with many representatives in several bacterial and archaeal phyla, and in a few eukaryotic ones. The closest relative of tn2-1p and p12-1p is encoded in the genome of Thermococcus onnurineus. Other close homologues are encoded in the genomes of euryarchaea from various orders (Halobacteriales, Thermoplasmatales, Methanosarcinales, Methanomicrobiales, Methanobacteriales, Methanococcales) and one in the genome of a thaumarchaeon (Cenarchaeum symbiosum) (for the definition of Thaumarchaea, see Brochier-Armanet et al., 2008). We have built a phylogenetic tree of this helicase family with a selection of sequences from the three domains (at least one sequence per represented phylum) (Figure 4). In this tree, the closest relatives of the pTN2 and pP12-1 helicases are encoded in the genomes of Thermococcus gammatolerans and Thermococcus onnurineus. The four helicases from Thermococcales form a monophyletic group with seven helicases from Haloarchaea. Overall, the archaeal SFI helicase tree is not congruent with the archaeal species tree based on ribosomal proteins or RNA polymerase subunits (Brochier-Armanet et al., 2008; Brochier et al., 2005). The archaeal sequences form five monophyletic groups and Archaea of the same orders are often present in different groups (Figure 4).
The SFI helicase of the thaumarchaeon Cenarchaeum symbiosum is grouped with some Methanococcales (phylum Euryarchaea). Furthermore, several groups of eukaryotic helicases (from plants and protists) are interspersed between archaeal groups, whereas helicases from fungi form a monophyletic group with sequences from Methanomicrobiales and Thermoplasmatales. This phylogeny is difficult to interpret because lengths are highly variable, probably inducing phylogenetic artefacts. Interestingly, the "cellular" homologue of pTN2-type helicases in T. gammatolerans is located in an integrated virus-like element, TGV2 (Zivanovic et al., 2009). In Methanococcus maripaludis strain C6 (MMC6V1) and Methanococcus maripaludis strain C7 (MMC7V2) the genes encoding pTN2-type helicases are located in predicted integrated elements of plasmid/virus origin that have been identified in silico, based on their dinucleotide sequence composition (Cortez et al., 2009). These observations suggest that some archaeal, and possibly eukaryotic, SFI helicases originated from viruses and plasmids and were transferred independently into various lineages, explaining the incongruence between their phylogeny and the classical archaeal and eukaryotic phylogenies. The frequent grouping of Archaea from different phyla in the same clade furthermore suggests that horizontal transfers of genes encoding these helicases also sometimes occurred in Archaea. The bacterial helicases of the SFI helicase family are quite well separated from the archaeal and eukaryotic ones, with the exception of those from the bacterium Aquifex aeolicus, which branch with the group mixing some helicases from Archaea and Fungi. Bacterial UvrD-like helicases form a monophyletic group (with two representatives in the eukaryote Ostreococcus, a probable case of gene transfer from a mitochondrial or plastid genome).
However, as in the case of Archaea, Bacteria from the same phylum are often dispersed in different parts of the tree, suggesting that the genes encoding these helicases have also sometimes been transferred between bacterial phyla.

The second large ORF conserved between pTN2 and pP12-1 (tn2-12p and p12-17p, respectively) encodes proteins of 107 and 103 kDa, respectively. PSI-BLAST searches using tn2-12p as query gave no significant result. Interestingly, we noticed among the hits of the first PSI-BLAST iterations the putative Rep protein of the recently described plasmid pXZ1 from Sulfolobus islandicus (Peng, 2008). A BLAST search with the sequence of the pXZ1 Rep protein then retrieved the Rep protein of the plasmid pTIK4 from Sulfolobus neozelandicus (Greve et al., 2005). We could manually align the N-terminal regions of tn2-12p and p12-17p with those of the Rep proteins of pXZ1 and pTIK4 (Figure 5). We then noticed that these proteins exhibit a conserved DhD motif, known to be present in the active sites of many DNA polymerases, primases and/or nucleotidyl transferases (Iyer et al., 2005).

EXAMPLE 4: Cloning, expression, and purification of tn2-12p

The tn2-12p protein has been expressed in E. coli, in particular to test and demonstrate the DNA polymerase, primase and transferase activities exhibited by the protein in vitro. The coding sequence of tn2-12p was amplified by PCR from plasmid DNA isolated from the strain Thermococcus nautilus 30/1, and the amplified fragment was cloned in a pET30a plasmid allowing fusion of a 6His tag at the 3' end. Expression was done at 37°C using the E. coli Rosetta (DE3) pLysS strain and 2xYT medium (BIO 101 Inc.). When the cell culture reached an OD600 nm of 0.8, induction was performed overnight at 15°C with 0.5 mM IPTG (Sigma). Cells were harvested by centrifugation and resuspended in buffer A (20 mM Tris-HCl pH 7.5, 200 mM NaCl, 5 mM beta-mercaptoethanol). Cell lysis was completed by sonication and the lysate was heated for 20 min at 70°C before centrifugation at 20,000 rpm for 20 min. The soluble fraction was loaded on a NiNTA column (Qiagen Inc.) equilibrated with buffer A. The protein was eluted with imidazole and subsequently loaded on a heparin column (GE Healthcare) equilibrated against buffer A' (20 mM Tris-HCl pH 7.5, 50 mM NaCl, 5 mM beta-mercaptoethanol). Elution was performed using a gradient between buffer A' and buffer B (20 mM Tris-HCl pH 7.5, 1 M NaCl, 5 mM beta-mercaptoethanol). The tn2-12p protein was eluted at about 0.9 M NaCl. Eluted fractions were pooled and loaded on a Superdex200 column (Amersham Pharmacia Biotech) equilibrated against buffer A supplemented with 10 mM beta-mercaptoethanol. The homogeneity of the protein sample was checked by SDS-PAGE.

EXAMPLE 5: Demonstration of the thermostable DNA polymerase, primase and transferase activities of the tn2-12p recombinant protein

The purified recombinant tn2-12p protein was first incubated with dNTPs and a 5'-labelled 20 nt DNA primer hybridized to a complementary 42-nucleotide DNA template. As expected if our prediction was correct, the primer was efficiently extended up to the full length of the template (Figure 2A). The DNA polymerizing activity of tn2-12p requires the presence of dNTPs and a DNA template (data not shown) and is ATP-independent. No extension was observed when the dNTPs were replaced by NTPs. The optimum temperature of the DNA polymerase activity was assayed on M13 DNA primed with an 18-nucleotide DNA primer (Figure 2B). At 65°C the activity of tn2-12p was lower (data not shown) than at 70°C and 80°C, which excludes the possibility that the observed activity could be due to the presence of some E. coli proteins co-purified with tn2-12p. The protein tn2-12p is able to extend the primer by up to several kilobases within the 20 min reaction time, and the DNA obtained shows the same pattern as that synthesized by Taq polymerase. This result indicates that tn2-12p can produce long extension products and that its activity is independent of the presence of additional proteins. In addition to its DNA polymerase activity, the tn2-12p protein exhibits primase and nucleotidyl-transferase activities. Preliminary sequence analyses based on hand-made alignments indicate that homologues of these DNA polymerases are widespread in Archaea and Bacteria (always encoded by plasmids, viruses or integrated elements). In particular, we found one copy in T. gammatolerans (TGAM_1629) and one in T. onnurineus (TON_1379).

The DNA polymerase/primase activities found associated with tn2-12p are consistent with the idea that tn2-12p and the homologous protein p12-17p correspond to the Rep proteins of the plasmids pTN2 and pP12-1, respectively. In the case of pRN1, the DNA polymerase domain of RepA is located in the N-terminal part of the protein and is fused to a helicase domain (SFIII) in the C-terminal part. In the case of the pTN2 and pP12-1 Rep proteins, PSI-BLAST analysis suggests that the DNA polymerase domain is also located in their N-terminal part (which contains the two conserved aspartate residues probably present in the active site). It also indicates that, as in the case of pRN1-type RepA, these polymerase domains are fused to large C-terminal domains. However, PSI-BLAST analysis using these C-terminal domains as queries retrieved no hit in databases.

Using the Multalin program (Corpet, 1988), we found that p12-17p and the pXZ1 Rep protein also share a short stretch of highly conserved amino acids located in the middle of these proteins. PSI-BLAST searches using these central conserved regions of the pTN2 and pXZ1 Rep proteins as queries recovered proteins with the S1 RNA binding motif, suggesting that these regions could bind nucleic acid. It is tempting to speculate that the Rep proteins of pTN2 and pP12-1 are formed by the fusion of a novel type of polymerase/primase in the N-terminal part and an origin of replication binding domain (corresponding to the S1 nucleic acid binding motif). As previously mentioned, the genes encoding the SFI helicase and the Rep protein are contiguous in both pTN2 and pP12-1. Interestingly, combining the DNA polymerase/primase and the putative origin binding activity of these Rep proteins with the activity of the neighbouring helicase would reconstitute a fully operational system for plasmid DNA replication by the theta mode.

EXAMPLE 6: Prediction of the position of the replication origin of pTN2 and pP12-1

To predict the position of the replication origin of pTN2 and pP12-1, we performed cumulative GC and AT skew analyses, applying the following formulas: (G-C)/(G+C) and (A-T)/(A+T) (Grigoriev, NAR, 1998). This method is based on the general observation that GC and AT content usually differ between the leading and lagging strands of replication forks. The cumulative GC and AT skew graphs for the pTN2 and pP12-1 genomes show that the two plasmids have a GC frequency inversion exactly where a large intergenic region is found (see Figure 6A). These results indicate that these large intergenic regions likely harbour the origin of replication.
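The two formulas above can be turned into a cumulative skew curve as in the following minimal sketch. The window size, the use of non-overlapping windows and the function names `cumulative_skew` and `skew_minimum` are assumptions made for illustration; the source does not state which window parameters were used for Figures 6A and 6B.

```python
# Hypothetical sketch of a cumulative skew calculation.
# Assumptions (not from the source): non-overlapping windows of fixed size,
# incomplete trailing window ignored, linearised plasmid sequence.


def cumulative_skew(seq, pair=("G", "C"), window=100):
    """Return [(window_end_position, cumulative_skew), ...].

    For pair=("G", "C") the per-window skew is (G-C)/(G+C);
    for pair=("A", "T") it is (A-T)/(A+T).
    """
    x, y = pair
    seq = seq.upper()
    total = 0.0
    curve = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        nx, ny = chunk.count(x), chunk.count(y)
        skew = (nx - ny) / (nx + ny) if (nx + ny) else 0.0
        total += skew
        curve.append((start + window, total))
    return curve


def skew_minimum(curve):
    """Position at which the cumulative skew reaches its minimum."""
    return min(curve, key=lambda point: point[1])[0]


if __name__ == "__main__":
    # Artificial sequence with a C-rich half followed by a G-rich half:
    # the cumulative GC skew forms a V-like curve with its minimum at the switch.
    toy = "C" * 300 + "G" * 300
    gc_curve = cumulative_skew(toy, pair=("G", "C"), window=100)
    print(gc_curve)
    print(skew_minimum(gc_curve))
```

On a real plasmid sequence, the position returned by `skew_minimum` would then be compared with the location of the large intergenic regions, as done in the text.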

EXAMPLE 7: In silico and in vitro analysis of PolpTN.

Thermococcus nautilus is an archaeon thriving at 90-95°C. Sequence analysis of the pTN2 plasmid revealed that it codes for at least two proteins involved in the replication of this plasmid: a very long multi-domain DNA polymerase, PolpTN, with multiple additional biochemical activities (described below), and a putative superfamily I helicase. Sequence analysis revealed that PolpTN is not closely related to any known DNA polymerase, but shares certain characteristic features with the primase-polymerase encoded by the Sulfolobus plasmid pRN1 (Lipps, Rother et al. 2003; Lipps 2009).

1) A multidomain protein PolpTN, a prototype of a new family of DNA polymerases.

The protein PolpTN is composed of at least two distinct domains. The N-terminus of the protein bears a primase/polymerase (prim-pol) domain, which is characterized by the presence of several conserved motifs. While the current version of our patent mentions three motifs, we now distinguish four conserved motifs (I-IV) in the N-terminal part of PolpTN (Figure 7). This domain is followed by a C-terminal part which shares considerable primary and secondary structural similarity with the large subunit of archaeal and eukaryotic DNA primases (PriL).

The PolpTN prim-pol domain is characteristic of a new family of DNA polymerases identified by our laboratory. This family includes at least 200 related proteins encoded by a variety of bacterial and archaeal viruses and plasmids. Despite their abundance, none of these proteins has been extensively studied experimentally. The identified protein sequences were found to share the four conserved motifs (see Table 2), suggesting that the proteins in this new primase/polymerase family are also likely to possess the DNA polymerization activity characteristic of the type member protein, PolpTN.

See Table 2.

Table 2: Multiple sequence alignment of the four motifs conserved in the prim-pol domains of tn2-12p-like proteins.

Legend of Table 2

Multiple sequence alignment of the four motifs conserved in the prim-pol domains of tn2-12p-like proteins. The dataset was collected by running a PSI-BLAST search against the nr protein database at NCBI, using the prim-pol domain sequence of tn2-12p as a seed. 200 homologues were retrieved (only a selection of these is shown in Fig. 2). The consensus sequences (calculated from the entire dataset) are shown below the alignment. Key: h (hydrophobic residues), l (aliphatic residues), s (small residues), o (alcohol residues), p (polar residues), @ (aromatic residues). The names of the organisms encoding tn2-12p homologues are coded (see the legend (**)). Motifs 1 and 3 are identical to those of the members of the archaeo-eukaryotic primase (AEP) superfamily [Lipps et al. 2009]. However, the catalytic His residue found in Motif 2 of AEP proteins (arrowhead) is not conserved in the tn2-12p family and Gln is usually (47 %) found instead. tn2-12p homologues invariably possess an Arg residue in Motif 4, which was not found to be conserved in the AEP superfamily [Lipps et al. 2009].

(*) Numerical expression of sequence conservation. It has been calculated from the alignment of 200 homologues (only a selection is shown in Table 2). The higher the number, the more conserved a position is. + denotes absolutely conserved positions.

(**) 1 Cellular chromosome; 2 Plasmid; 3 Integrative plasmid; 4 Virus; 5 Provirus

2) Catalytic activities of PolpTN.

A. PolpTN is a highly efficient thermostable DNA-dependent DNA polymerase. We observe a fast synthesis of more than 6 kb of DNA even in experimental conditions that were not specifically optimized (Figures 8A and 8B, 9A and 9B).

B. The same enzyme is also an efficient, condition-dependent terminal nucleotidyltransferase using double-stranded DNA as substrate (Figures 8A-8B and 10). It is able to add, depending on the buffer composition, from 1 to 10-20 nucleotides to the 3' end of the nascent DNA strand once the specific template-dependent polymerization is achieved (data not shown). Any nucleotide could be added to the 3' end of the de novo synthesized DNA strand, thereby overcoming the limitation of conventional Taq (as well as Tfl and Tth) polymerase, which adds in these conditions only one A at the 3' end. This activity is widely used in the TA cloning vectors (Zhou and Gomez-Sanchez 2000). Notably, a USA-based biotechnological company, "Lucigen", has recently proposed a similar GC cloning kit (patent pending; see http://www.lucigen.com/catalog/index.php?cPath=15_43). The terminal transferase activity of PolpTN should also allow adding any nucleotides to any double-stranded DNA, creating, for example, specific homopolymer nucleotide tails at the end of any dsDNA fragment. This modification might be very useful for genomic DNA and cDNA cloning.

C. PolpTN is a highly efficient thermostable DNA-dependent (and probably RNA-dependent) DNA primase (Figures 9A and 9B). The enzyme does not need any oligonucleotide primers to initiate replication. It is one of the rare examples of DNA polymerases able to initiate DNA synthesis on "naked" DNA without the need for any of the additional factors (priming proteins, short oligonucleotide primers, etc.) routinely used for this purpose. For example, the PolpTN protein, when added to denatured genomic DNA of any origin, will initiate DNA polymerisation from multiple random sites on the template.

D. Additionally, PolpTN has a strong specific reverse transcriptase activity (Figure 11). To the best of our knowledge, it is one of the rare examples of highly thermostable reverse transcriptases that are currently available.

E. Activities under study

Very often functionally linked proteins (especially in viral and plasmid genomes) are encoded close to each other in the genome. A gene situated next to polpTN encodes a DNA helicase (Soler, Marguet et al. 2010). It is highly likely that this protein controls the topology of the DNA during replication. Hidden activities of PolpTN might potentially be revealed in the presence of this helicase. Usually, small plasmids replicate by a rolling circle mechanism and their replication machineries have a strand displacement activity (Khan 2005). It is reasonable to expect that in the presence of both enzymes, PolpTN and the helicase, strand displacement activity, similar to that of Phi29, will be achieved.

3) Mutational analysis of the conserved motifs of PolpTN

To confirm the catalytic importance of the four motifs of the N-terminal part of PolpTN, we have performed a systematic mutational analysis of PolpTN. Table 3 presents the mutations inserted or planned to be inserted. To better understand the role of the C-terminal domain (PriL-like part of PolpTN), we also mutated two of the four conserved cysteines present in this portion of PolpTN.

| motif | WT conserved amino acids | mutated allele | realisation |
|---|---|---|---|
| MI | DiD | LiL | obtained |
| MII | Ghpphh | Xhpphh | planned |
| MIII | sD | sA | obtained |
| MIV | hhR | hhA | obtained |
| PriL-like | 4 cysteines | 2 cysteines | obtained |

Table 3. Mutational analysis of the conserved motifs of PolpTN (X - amino acid to be defined later)

Two activities characteristic of PolpTN were tested: the DNA-dependent DNA polymerase activity and the primase activity.

Figure 12 clearly demonstrates that if either of the two motifs (MI or MIV) is mutated, the polymerisation activity is completely abolished. This result demonstrates the essential role of the DD and R amino acids in the PolpTN polymerisation activity. Interestingly, the 2Cys mutant (mutation in the C-terminal part) retained this activity intact.

The result presented in Figure 13 indicates that both motifs, MI and MIV, could be essential for the primase activity of PolpTN. No second-strand synthesis was detected in the presence of dNTPs and a single-stranded template (no primer added). It is still possible, however, that the primase activity of at least one of the PolpTN mutants is intact. To answer this question and to distinguish the two activities, polymerase and primase, the DNA polymerization activity of PolpTN will be complemented in the in vitro assays by the conventional Taq polymerase activity (Taq cannot initiate second-strand synthesis but will synthesise the second strand if the primase activity of PolpTN is intact). In the same experiment the 2Cys mutant shows normal primase activity.

Other mutants are in the process of construction and all activities of WT PolpTN will be tested for all mutants obtained to assign a specific function to each conserved motif.

4) Biotechnological perspectives of PolpTN

PolpTN has many enzymatic activities. For some of them (e.g., DNA pol activity) many types of functionally equivalent enzymes (e.g., Taq polymerases) are commercially available, but there is still a clear demand for faster, less error-prone and more stable DNA polymerases. Isolated from an extremely thermophilic host, the PolpTN polymerase could possess some of these properties and be of interest for PCR applications. Because all current PCR enzymes are derived from bacterial or archaeal DNA repair polymerases rather than true DNA replicases, they have inherent limitations in DNA replication fidelity, robustness, and processivity. The PolpTN polymerase is a bona fide replicase and as such could eventually overcome these limitations and prove to be much more effective in DNA amplification.

More unusually, this protein is a rare, extremely thermostable reverse transcriptase and terminal nucleotidyltransferase, as well as a DNA (and RNA) primase. Many enzymes with these activities are on the market; nevertheless, very few thermostable versions are commercially available. Thermostable versions of conventional enzymes are desired for many applications involving the following enzymatic activities:

a) DNA Pols. A number of thermostable DNA polymerases that are generally utilized in various PCR applications are available. Nevertheless, there is still a strong demand for faster, less error-prone and more stable DNA polymerases.

b) Terminal nucleotidyltransferases. Terminal transferases catalyze the template-independent addition of deoxy- and dideoxynucleoside triphosphates to the 3'-OH ends of double- and single-stranded DNA fragments and oligonucleotides. Terminal transferases can also incorporate digoxigenin-, biotin-, and fluorochrome-labeled deoxy- and dideoxynucleoside triphosphates as well as radioactively labeled deoxy- and dideoxynucleoside triphosphates. They are widely used for:

- "Tailing" with dNTPs:

• Addition of homopolymeric tails to DNA fragments;

• Labeling of double- and single-stranded DNA and oligonucleotides with either radioactive or chemically modified nucleotides (e.g., DIG-dUTP).

- 3' end labeling with ddNTPs:

• Labeling of double- and single-stranded DNA and oligonucleotides with either radioactive or chemically modified dideoxynucleotides (e.g., DIG-ddUTP).

The tail length ranges between 1 and 100 nt. Many leading biotechnology companies offer corresponding kits with non-thermostable enzymes (e.g., Roche). Only a few companies have developed thermostable versions of these enzymes (e.g., Lucigen). The enzyme of the latter company is described as having increased enzymatic efficiency.

The terminal transferase activity of PolpTN could be useful for the controlled addition of a polynucleotide tail (for example, a homopolymer tail) to the newly synthesized DNA strand. As this activity appears to be strictly dependent on the buffer used, 3 of the 4 dNTPs that are used for the synthesis of the complementary strand could be removed from the solution, leading to preferential synthesis of homopolymer tails (e.g., polyA, polyC, etc.). The presence of these tails would then allow amplification of DNA fragments of unknown sequence using conventional PCR without prior ligation of short oligonucleotides.

c) Primases. Kits for genomic DNA amplification that use primase activity instead of random priming are already on the market (see the Biohelix Rapisome™ pWGA Kit, or Li, Kim et al. 2008). Unlike other WGA products, the Rapisome™ pWGA kit is based upon the in vitro reconstitution of a naturally existing cellular DNA replication system, which performs fast isothermal DNA amplification without the need for thermocycling or added primers. Interestingly, all enzymatic activities present in the protein mixture used in this kit could be replaced by PolpTN+helicase.

The primase/polymerase activity of PolpTN allows random initiation of DNA synthesis, since the protein does not require short random oligonucleotides to prime DNA replication. In this way, any genomic DNA could be amplified by producing complementary fragments without any bias linked to the specificity of the primers used in conventional methods. The displacement activity of PolpTN (or of the associated helicase protein) will afterwards allow isothermal amplification of these fragments.

d) Reverse transcriptases. A number of known limitations of reverse transcribing enzymes (RTs) compromise current RT-PCR and RNA sequence analyses. Most common RT protocols rely on retroviral RTs that impair analysis due to low accuracy, strand switching and bias. Since none of these RT enzymes is truly thermostable, a second DNA polymerase is needed to perform PCR amplification, which creates additional handling steps and necessitates a compromise in reaction conditions between the optima of the two enzymes. Very few thermostable RTs are present on the biotechnology market (Pyrophage RT of Lucigen, the GeneAmp® Thermostable rTth RT).

PolpTN RT is the first viral reverse transcriptase that, owing to its stability at extremely high temperatures, can also be employed in PCR. PolpTN and the reaction conditions can therefore be optimized for single-enzyme, one-step RT-PCR with sensitivity, specificity and accuracy equivalent to or better than multienzyme RT-PCR systems, with the added benefit of reduced hands-on time and downstream error probability. Extreme thermostability expedites sample preparation by allowing complete denaturation prior to reverse transcription and cDNA synthesis at 70°C, which improves analysis of highly structured RNA templates. A "one tube" reaction kit for production of dsDNA from a ssRNA template can be developed on the basis of the functional versatility of the PolpTN polymerase. For example, mRNA could be used as an initial template for priming and synthesis of the cDNA using the RNA-dependent primase and RT activities of PolpTN, respectively. The newly synthesized DNA strand would subsequently be amplified using the DNA-dependent primase and DNA polymerase activities. The product obtained could be used for sequencing or for any other application.

Additionally, PolpTN was found to be encoded just upstream of the gene for a superfamily I helicase. It is therefore likely that the two proteins constitute the entire replication complex of the pTN2 plasmid, where PolpTN provides the primase and polymerase activities and the helicase performs the strand displacement. The pTN2 helicase is now being expressed in our laboratory and will be studied as an adjunct to further improve the maximal performance of PolpTN. A helicase-dependent isothermal DNA amplification performed in the presence of an adequate DNA polymerase was recently described (Vincent, Xu et al. 2004); it allows 10^6-fold amplification of any DNA sample used.
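As a rough illustration of what an amplification factor of one million implies, assume (an idealization that is not stated in the cited work or in this description) that every round of synthesis exactly doubles the number of copies. The number of doublings n needed then follows from:

```latex
\[
2^{n} \ge 10^{6}
\quad\Longrightarrow\quad
n \ge \frac{6}{\log_{10} 2} \approx 19.93 ,
\qquad\text{so } n = 20 \text{ and } 2^{20} = 1\,048\,576 .
\]
```

Under this assumption about 20 doublings suffice; a real isothermal reaction with less than perfect efficiency per round would need more.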

5) Conclusions

The results obtained clearly indicate the originality and strong biotechnological potential of PolpTN and, more generally, of the corresponding new primase/polymerase family of proteins.

EXAMPLE 8: PolpTN is a self-sufficient reverse primase/transcriptase

The inventors have found that PolpTN is a self-sufficient reverse primase/transcriptase, i.e., it is capable of initiating (primase activity) and extending (transcriptase activity) the synthesis of a complementary DNA strand on a single-stranded RNA template without using a specific primer (Figure 14). To the best of our knowledge, such a biochemical activity has not been previously described. Notably, the widely used, commercially available reverse transcriptase from AMV (Avian Myeloblastosis Virus) did not display the equivalent enzymatic activity and was not capable of initiating DNA synthesis without a specific primer. The discovery of the self-sufficient reverse primase/transcriptase activity holds great promise for designing novel molecular biology applications based on PolpTN. For example, the inventors envision that the enzyme is sufficient on its own for converting any kind of RNA molecule (e.g., mRNA) into dsDNA products that can be used in various downstream applications, such as sequencing, cloning, etc. The advantage of PolpTN over currently commercially available proteins is that no specific primers are required and all enzymatic reactions can be performed by a single protein: RNA-dependent DNA priming, RNA-dependent DNA polymerization, DNA-dependent DNA priming and DNA-dependent DNA polymerization.
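The RNA-to-dsDNA conversion described above can be sketched at the level of sequences. The following is a minimal illustration only: it models base pairing, not enzyme behaviour, priming sites or reaction conditions, and the function names (first_strand_cdna, second_strand, rna_to_dsdna) and the example sequence are hypothetical and do not come from this description.

```python
from typing import Tuple

# Watson-Crick pairing tables (sequence-level model only).
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna: str) -> str:
    """Return the cDNA (written 5'->3') complementary to an RNA template (5'->3').

    Stands in for the RNA-dependent priming and polymerization steps.
    """
    return "".join(RNA_TO_DNA[base] for base in reversed(rna.upper()))


def second_strand(cdna: str) -> str:
    """Return the DNA strand (5'->3') complementary to the first-strand cDNA.

    Stands in for the DNA-dependent priming and polymerization steps.
    """
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna.upper()))


def rna_to_dsdna(rna: str) -> Tuple[str, str]:
    """Return (sense strand, antisense strand) of the dsDNA product, both 5'->3'."""
    antisense = first_strand_cdna(rna)
    sense = second_strand(antisense)
    return sense, antisense


if __name__ == "__main__":
    sense, antisense = rna_to_dsdna("AUGGCCUAA")  # hypothetical template
    print(sense)      # ATGGCCTAA
    print(antisense)  # TTAGGCCAT
```

The sense strand returned is simply the RNA template with U replaced by T, which is the expected outcome when both strands are copied faithfully and in full length.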

All plasmids previously isolated from Thermococcales were small rolling-circle plasmids (Erauso et al., 1996; Marsin & Forterre, 1998; Marsin & Forterre, 1999; Soler et al., 2008). Here, we report the isolation, sequencing and characterization of three new plasmids from Thermococcales that are larger and probably replicate via the theta mode. These plasmids do not encode a homologue of the Rep endonucleases known to initiate rolling-circle replication. The plasmids pTN2 and pP12-1 encode a DNA polymerase fused to a domain of unknown function, forming a new type of plasmidic Rep protein. These proteins are homologous to the RepA protein of the plasmids pTIK4 from S. neozealandicus and pXZ1 from a S. solfataricus species (Peng, 2008) and to proteins encoded by integrated elements present in the genomes of Methanococcales. They form a new family of archaeal RepA proteins common to mobile elements present in both Euryarchaea and Crenarchaea. The DNA polymerase domain of these RepA proteins has no detectable sequence similarity with those of previously known DNA polymerases, including the DNA polymerase discovered by Georg Lipps in pRN1 (Lipps et al., 2003), indicating that these proteins could be the prototypes of a new DNA polymerase family. As in the case of the pRN1 polymerase, the DNA polymerase of pTN2 exhibits a primase activity and could be used for the initiation as well as for the elongation step of plasmid DNA replication. The genes encoding the Rep-DNA polymerase in pTN2 and pP12-1 are contiguous to the genes encoding the SFI helicases. It is tempting to speculate that these large Rep proteins also bear an origin-binding activity. The combination of the DNA polymerase/primase activity of these Rep proteins with the activity of the SFI helicase and an origin recognition activity would reconstitute a fully operational system for plasmid DNA replication by the theta mode.

The plasmids pTN2 and pP12-1 can be considered as the prototypes of a new plasmid family, presently specific to Thermococcales (hereafter dubbed the pTN2 family). The gene encoding the new type of DNA primase/polymerase of these two plasmids is located directly downstream of a gene encoding a helicase of the SFI family. In addition, both plasmids encode a probable 3' to 5' exonuclease domain of the DnaQ family (which could be involved in proof-reading) and a regulator of the CopG family. These plasmids could become interesting models to study plasmid replication and expression both in vitro and in vivo in hyperthermophilic archaea, much like the pRN1 plasmid from Sulfolobus islandicus has been successfully used to study plasmid replication in hyperthermophilic crenarchaea (Lipps, 2009).

References

Albers SV, Jonuscheit M, Dinkelaker S, Urich T, Kletzin A, Tampe R, Driessen AJ & Schleper C. (2006). Production of recombinant and tagged proteins in the hyperthermophilic archaeon Sulfolobus solfataricus, Appl Environ Microbiol, 72, 102-11.

Arnold HP, She Q, Phan H, Stedman K, Prangishvili D, Holz I, Kristjansson JK, Garrett R & Zillig W. (1999). The genetic element pSSVx of the extremely thermophilic crenarchaeon Sulfolobus is a hybrid between a plasmid and a virus, Mol Microbiol, 34, 217-26.

Baliga NS, Bonneau R, Facciotti MT, Pan M, Glusman G, Deutsch EW, Shannon P, Chiu Y, Weng RS, Gan RR, Hung P, Date SV, Marcotte E, Hood L & Ng WV. (2004). Genome sequence of Haloarcula marismortui: a halophilic archaeon from the Dead Sea, Genome Res, 14, 2221-34.

Basta T, Smyth J, Forterre P, Prangishvili D & Peng X. (2009). Novel archaeal plasmid pAH1 and its interactions with the lipothrixvirus AFV1, Mol Microbiol, 71, 23-34.

Berkner S, Grogan D, Albers SV & Lipps G. (2007). Small multicopy, non-integrative shuttle vectors based on the plasmid pRN1 for Sulfolobus acidocaldarius and Sulfolobus solfataricus, model organisms of the (cren-)archaea, Nucleic Acids Res, 35, e88.

Cortez et al. Genome Biol. 2009 Jun 16; 10(6):R65.

del Solar G, Giraldo R, Ruiz-Echevarria MJ, Espinosa M & Diaz-Orejas R. (1998). Replication and control of circular bacterial plasmids, Microbiol Mol Biol Rev, 62, 434-64.

Edgar RC. (2004). MUSCLE: a multiple sequence alignment method with reduced time and space complexity, BMC Bioinformatics, 5, 113.

Erauso G, Marsin S, Benbouzid-Rollet N, Baucher MF, Barbeyron T, Zivanovic Y, Prieur D & Forterre P. (1996). Sequence of plasmid pGT5 from the archaeon Pyrococcus abyssi: evidence for rolling-circle replication in a hyperthermophile, J Bacteriol, 178, 3232-7.

Erauso G, Stedman KM, van de Werken HJ, Zillig W & van der Oost J. (2006). Two novel conjugative plasmids from a single strain of Sulfolobus, Microbiology, 152, 1951-68.

Greve B, Jensen S, Brugger K, Zillig W & Garrett RA. (2004). Genomic comparison of archaeal conjugative plasmids from Sulfolobus, Archaea, 1, 231-9.

Greve B, Jensen S, Phan H, Brugger K, Zillig W, She Q & Garrett RA. (2005). Novel RepA-MCM proteins encoded in plasmids pTAU4, pORA1 and pTIK4 from Sulfolobus neozealandicus, Archaea, 1, 319-25.

Harriott OT, Huber R, Stetter KO, Betts PW & Noll KM. (1994). A cryptic miniplasmid from the hyperthermophilic bacterium Thermotoga sp. strain RQ7, J Bacteriol, 176, 2759-62.

Khan, S. A. (2005) "Plasmid rolling-circle replication: highlights of two decades of research." Plasmid 53(2): 126-36.

Krupovic M & Bamford DH. (2008). Archaeal proviruses TKV4 and MVV extend the PRD1-adenovirus lineage to the phylum Euryarchaeota, Virology, 375, 292-300.

Lepage E, Marguet E, Geslin C, Matte-Tailliez O, Zillig W, Forterre P & Tailliez P. (2004). Molecular diversity of new Thermococcales isolates from a single area of hydrothermal deep-sea vents as revealed by randomly amplified polymorphic DNA fingerprinting and 16S rRNA gene sequence analysis, Appl Environ Microbiol, 70, 1277-86.

Li, Y., H. J. Kim, C. Zheng, W. H. Chow, J. Lim, B. Keenan, X. Pan, B. Lemieux and H. Kong (2008). "Primase-based whole genome amplification" Nucleic Acids Res 36(13): e79.

Lipps G. (2004). The replication protein of the Sulfolobus islandicus plasmid pRN1, Biochem Soc Trans, 32, 240-4.

Lipps G. (2006). Plasmids and viruses of the thermoacidophilic crenarchaeote Sulfolobus, Extremophiles, 10, 17-28.

Lipps G, Ibanez P, Stroessenreuther T, Hekimian K & Krauss G. (2001a). The protein ORF80 from the acidophilic and thermophilic archaeon Sulfolobus islandicus binds highly site-specifically to double-stranded DNA and represents a novel type of basic leucine zipper protein, Nucleic Acids Res, 29, 4973-82.

Lipps G, Rother S, Hart C & Krauss G. (2003). A novel type of replicative enzyme harbouring ATPase, primase and DNA polymerase activity, Embo J, 22, 2516-25.

Lipps G, Stegert M & Krauss G. (2001b). Thermostable and site-specific DNA binding of the gene product ORF56 from the Sulfolobus islandicus plasmid pRN1, a putative archael plasmid copy control protein, Nucleic Acids Res, 29, 904-13.

Lipps G, Weinzierl AO, von Scheven G, Buchen C & Cramer P. (2004). Structure of a bifunctional DNA primase-polymerase, Nat Struct Mol Biol, 11, 157-62.

Lipps G. Molecular biology of the pRN1 plasmid from Sulfolobus islandicus. Biochem Soc Trans. 2009 Feb;37(Pt 1):42-5.

Lucas S, Toffin L, Zivanovic Y, Charlier D, Moussard H, Forterre P, Prieur D & Erauso G. (2002). Construction of a shuttle vector for, and spheroplast transformation of, the hyperthermophilic archaeon Pyrococcus abyssi, Appl Environ Microbiol, 68, 5528-36.

Marsin S & Forterre P. (1998). A rolling circle replication initiator protein with a nucleotidyl-transferase activity encoded by the plasmid pGT5 from the hyperthermophilic archaeon Pyrococcus abyssi, Mol Microbiol, 27, 1183-92.

Marsin S & Forterre P. (2001). pGT5 replication initiator protein Rep75 from Pyrococcus abyssi, Methods Enzymol, 334, 193-204.

Paull TT. Saving the ends for last: the role of pol mu in DNA end joining. Mol Cell., 2005; 19(3):294-6.

Peng X. (2008). Evidence for the horizontal transfer of an integrase gene from a fusellovirus to a pRN-like plasmid within a single strain of Sulfolobus and the implications for plasmid survival, Microbiology, 154, 383-91.

Peng X, Holz I, Zillig W, Garrett RA & She Q. (2000). Evolution of the family of pRN plasmids and their integrase-mediated insertion into the chromosome of the crenarchaeon Sulfolobus solfataricus, J Mol Biol, 303, 449-54.

Santangelo TJ, Cubonova L & Reeve JN. (2008). Shuttle vector expression in Thermococcus kodakaraensis: contributions of cis elements to protein synthesis in a hyperthermophilic archaeon, Appl Environ Microbiol, 74, 3099-104.

Sato T, Fukui T, Atomi H & Imanaka T. (2005). Improved and versatile transformation system allowing multiple genetic manipulations of the hyperthermophilic archaeon Thermococcus kodakaraensis, Appl Environ Microbiol, 71, 3889-99.

She Q, Singh RK, Confalonieri F, Zivanovic Y, Allard G, Awayez MJ, Chan-Weiher CC, Clausen IG, Curtis BA, De Moors A, Erauso G, Fletcher C, Gordon PM, Heikamp-de Jong I, Jeffries AC, Kozera CJ, Medina N, Peng X, Thi-Ngoc HP, Redder P, Schenk ME, Theriault C, Tolstrup N, Charlebois RL, Doolittle WF, Duguet M, Gaasterland T, Garrett RA, Ragan MA, Sensen CW & Van der Oost J. (2001). The complete genome of the crenarchaeon Sulfolobus solfataricus P2, Proc Natl Acad Sci U S A, 98, 7835-40.

Soler N, Justome A, Quevillon-Cheruel S, Lorieux F, Le Cam E, Marguet E & Forterre P. (2007). The rolling-circle plasmid pTN1 from the hyperthermophilic archaeon Thermococcus nautilus, Mol Microbiol, 66, 357-70.

Soppa J. (2006). From genomes to function: haloarchaea as model organisms, Microbiology, 152, 585-90.

Stedman KM, She Q, Phan H, Arnold HP, Holz I, Garrett RA & Zillig W. (2003). Relationships between fuselloviruses infecting the extremely thermophilic archaeon Sulfolobus: SSV1 and SSV2, Res Microbiol, 154, 295-302.

Tatusova TA, Madden TL. BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. FEMS Microbiol Lett. 1999 May 15; 174(2):247-50.

Ward DE, Revet IM, Nandakumar R, Tuttle JH, de Vos WM, van der Oost J & DiRuggiero J. (2002). Characterization of plasmid pRT1 from Pyrococcus sp. strain JT1, J Bacteriol, 184, 2561-6.

Vincent, M., Y. Xu and H. Kong (2004). "Helicase-dependent isothermal DNA amplification." EMBO Rep 5(8): 795-800.

Zhou, M. Y. and C. E. Gomez-Sanchez (2000). "Universal TA cloning." Curr Issues Mol Biol 2(1): 1-7