Title:
MODIFIED PEPTIDE NUCLEIC ACIDS AND THEIR USE
Document Type and Number:
WIPO Patent Application WO/2016/043661
Kind Code:
A1
Abstract:
This invention relates to a peptide nucleic acid comprising the base N4-(2-guanidoethyl)-5-methylcytosine and method for their manufacture. Further, the invention relates to the use of said compounds as medicaments and to methods of targeting the structure of RNAs, particularly in the treatment of viral diseases and cancer.

Inventors:
DEVI GITALI (SG)
TOH DESIREE-FAYE KAIXIN (SG)
PATIL KIRAN M (SG)
QU QIUYU (SG)
MARASWAMI MANIKANTHA (SG)
KIERZEK ELZBIETA (PL)
XIAO YUNYUN (SG)
LOH TECK PENG (SG)
ZHAO YANLI (SG)
CHEN GANG (SG)
Application Number:
PCT/SG2015/050319
Publication Date:
March 24, 2016
Filing Date:
September 15, 2015
Assignee:
UNIV NANYANG TECH (SG)
INST OF BIOORG CHEMISTRY POLISH ACADEMY OF SCIENCES (PL)
International Classes:
C07K5/078; A61K38/05; A61K48/00; A61P31/14; A61P31/18; A61P35/00; C12Q1/68
Foreign References:
US20030229201A1 2003-12-11
Other References:
GUPTA P ET AL.: "Recognition of double-stranded RNA by guanidine-modified peptide nucleic acids.", BIOCHEMISTRY, vol. 51, no. 1, 7 December 2011 (2011-12-07), pages 63 - 73, [retrieved on 20150921]
DEVI G ET AL.: "Incorporation of thio-pseudoisocytosine into triplex-forming peptide nucleic acids for enhanced recognition of RNA duplexes.", NUCLEIC ACIDS RESEARCH, vol. 42, no. 6, 13 January 2014 (2014-01-13), pages 4008 - 4018, [retrieved on 20150921]
SEMENYUK A ET AL.: "Targeting of an interrupted polypurine:polypyrimidine sequence in mammalian cells by a triplex-forming oligonucleotide containing a novel base analogue.", BIOCHEMISTRY, vol. 49, no. 36, 20 August 2010 (2010-08-20), pages 7867 - 7878, [retrieved on 20150921]
AUSIN C ET AL.: "Synthesis of amino- and guanidino-G-clamp PNA monomers.", ORG. LETT., vol. 4, no. 23, 16 October 2002 (2002-10-16), pages 4073 - 4075, XP002515121, [retrieved on 20150921], DOI: 10.1021/OL026815P
Attorney, Agent or Firm:
VIERING, JENTSCHURA & PARTNER LLP (Rochor Post Office, Rochor Road, Singapore 3, SG)
Claims:
CLAIMS

1. Compound having the structure of formula (I):

formula (I)

wherein R1, R2 and R3 are independently selected from the group consisting of H and amine protecting groups.

2. Compound according to claim 1, wherein the amine protecting group R1 is different from the amine protecting group R2 and R3.

3. Compound according to claims 1 or 2, wherein the amine protecting groups R1, R2 and R3 are selected from benzyloxy carbamate (Cbz), p-methoxybenzyl carbonyl (Moz or MeOZ), tert-butyloxycarbonyl (BOC), 9-fluorenylmethyloxycarbonyl (FMOC), allyloxycarbonyl, acetyl (Ac), benzoyl (Bz), benzyl (Bn), carbamate, p-methoxybenzyl (PMB), 3,4-dimethoxybenzyl (DMPM), p-methoxyphenyl (PMP), tosyl (Ts) or sulfonamides (nosyl).

4. Compound according to any one of claims 1 to 3, wherein the amine protecting groups R1, R2 and R3 are selected from the group consisting of benzyloxy carbamate (Cbz), tert-butyloxycarbonyl (BOC), 9-fluorenylmethyloxycarbonyl (FMOC) and allyloxycarbonyl.

5. Compound according to any one of claims 1 to 4, wherein the compound has the structure of formula (II):

formula (II);

6. Method of manufacturing a compound according to formula (I), comprising:

(a) reacting thymine with ethyl bromoacetate to form the compound of formula (III)

formula (III);

(b) reacting the compound of formula (III) with 1,2,4-triazole to form the compound of formula (IV)

formula (IV);

(c) reacting the compound of formula (IV) with the compound of formula (V)

formula (V);

to form the compound of formula (VI)

formula (VI);

(d) reacting the compound of formula (VI) with the compound of formula (VII)

formula (VII);

to form the compound of formula (VIII)

formula (VIII);

(e) reacting the compound of formula (VIII) with base and neutralizing by acid to form the compound of formula (IX)

formula (IX);

(f) reacting the compound of formula (IX) with the compound of formula (X)

formula (X);

to form the compound of formula (XI)

formula (XI);

(g) reacting the compound of formula (XI) with base and neutralizing by acid to form the compound of formula (I),

wherein Et is ethyl, R1, R2, R3 and R4 are independently selected from amine protecting groups and R5 is an organic moiety.

7. Peptide nucleic acid of formula (XII)

formula (XII)

wherein B is thymine, adenine, uracil, cytosine, guanine, thio-pseudoisocytosine, N4-(2-guanidoethyl)-5-methylcytosine, 4-acetylcytosine, 5-(carboxyhydroxymethyl)uracil, 5-carboxymethylaminomethyl-2-thiouracil, 5-carboxymethylaminomethyluracil, dihydrouracil, pseudouracil, hypoxanthine, N6-isopentenyladenine, 1-methyladenine, 1-methylpseudouracil, 1-methylguanine, 1-methylhypoxanthine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-methyladenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, 5-methoxycarbonylmethyl-2-thiouracil, 5-methoxycarbonylmethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, 2-methylthio-N6-threonylcarbamoyladenine, uracil-5-oxyacetic acid, wybutoxine, queuine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, 6-carbamoylthreonyladenine, wybutine, 3-(3-amino-3-carboxy-propyl)uracil and derivatives thereof,

p is at least one,

Y is selected from the group consisting of H and lysine,

X is selected from the group consisting of OH and NH2,

wherein at least one B is N4-(2-guanidoethyl)-5-methylcytosine.

8. Peptide nucleic acid according to claim 7, wherein the peptide nucleic acid comprises a sequence that is complementary to a given RNA target sequence, wherein in the peptide nucleic acid sequence that is complementary to the given RNA target sequence the base pairing with adenine in the RNA is selected from the group consisting of thymine, uracil and derivatives thereof, the base pairing with uracil in the RNA is selected from the group of adenine and derivatives thereof, the base pairing with guanine is selected from the group consisting of thio-pseudoisocytosine, cytosine and derivatives thereof and the base pairing with cytosine is selected from the group consisting of N4-(2-guanidoethyl)-5-methylcytosine, guanine and derivatives thereof.

9. Peptide nucleic acid according to claim 8, wherein the given RNA target sequence is a first strand of a double-stranded RNA region, wherein the double-stranded RNA region comprises a first and a second sequence that are complementary to each other, wherein the first sequence consists of purine bases and one cytosine base.

10. Peptide nucleic acid according to any one of claims 7-9, wherein p is 5 to 15.

11. Peptide nucleic acid according to claim 10, wherein p is 8.

12. Peptide nucleic acid according to any one of claims 7-11, wherein the monomeric unit comprising the at least one N4-(2-guanidoethyl)-5-methylcytosine is not terminally positioned in the peptide nucleic acid chain.

13. Peptide nucleic acid according to any one of claims 7-12, wherein Y is lysine.

14. Peptide nucleic acid according to any one of claims 7-13, wherein X is NH2.

15. Peptide nucleic acid according to any one of claims 7-14, wherein the peptide nucleic acid comprises at least three N4-(2-guanidoethyl)-5-methylcytosine bases.

16. Peptide nucleic acid according to claim 7 consisting of the sequence:

NH2-Lys-TLTQTTTL-CONH2, wherein Lys is lysine, T is thymine, L is thio-pseudoisocytosine and Q is N4-(2-guanidoethyl)-5-methylcytosine.

17. Peptide nucleic acid according to any one of claims 7-16 for use as a medicament.

18. Peptide nucleic acid according to any one of claims 7-16 for use in the treatment of a viral disease or cancer.

19. Peptide nucleic acid for use according to claim 18, wherein the viral disease is caused by an RNA virus.

20. Peptide nucleic acid for use according to claim 19, wherein the RNA virus is selected from influenza, picornavirus, hepevirus, reovirus, coronavirus, togavirus, flavivirus, arenavirus, filovirus or retrovirus.

21. Peptide nucleic acid for use according to claim 18, wherein the cancer is a carcinoma or a sarcoma.

22. Method of targeting the structure of an at least partially double-stranded RNA molecule, wherein the double-stranded region comprises a first and a second sequence that are complementary to each other, wherein the first sequence consists of purine bases and one cytosine base, comprising:

contacting a peptide nucleic acid according to any one of claims 7-16 with the RNA to form a triple helix structure.

Description:
MODIFIED PEPTIDE NUCLEIC ACIDS AND THEIR USE

FIELD OF THE INVENTION

[0001] The present invention lies in the field of biochemistry and relates to compounds comprising the base N4-(2-guanidoethyl)-5-methylcytosine and methods of their manufacture. Further, the present invention relates to the use of said compounds as medicaments and to methods of targeting the structure of RNAs.

BACKGROUND OF THE INVENTION

[0002] Ribonucleic acids (RNAs) are one of the three major macromolecules that are crucial to all living life forms. RNA plays an active role in numerous biological functions, which include catalysis, coding, decoding and regulation of genes (Leontis, N. B., Westhof, E. Curr. Opin. Struct. Biol. 2003, 13, 300-308; Cech, T. R., Steitz, J. A. Cell, 2014, 157, 77-94). RNA consists of chains of nucleobases (guanine, cytosine, adenine and uracil) on a ribose-phosphate backbone. The nucleobases in Watson-Crick stems form hydrogen bonding interactions with each other, guanine with cytosine and adenine with uracil. Non-Watson-Crick structures can also form between all four bases. The phosphate groups that are present in the backbone each constitute a negative charge, making RNA a charged molecule at physiological pH.

[0003] RNA exists in complicated and diverse structures, affecting its function. Secondary structures of RNA consist of single-stranded loops and double-stranded duplex stem regions. On the other hand, tertiary structures of RNA are complicated three-dimensional structures formed through complementary base pairing and base stacking interactions between secondary structures. RNA is also a major component in protein synthesis. Unlike in deoxyribonucleic acid (DNA), the presence of a hydroxyl group on the 2' position allows RNA helixes to adopt the A-form geometry. This results in a deep and narrow major groove, as well as a shallow yet wide minor groove, distinguishing RNA from DNA (Salazar, M. et al. Biochemistry, 1992, 32, 4207-4215; Zengeya, T. et al. Bioorg. Med. Chem. Lett. 2011, 21, 2121-2124; Chenoweth, D. M. et al. Angew. Chem. Int. Ed. 2013, 52, 415-418).

[0004] In RNA-based therapeutics various approaches have been investigated to regulate gene expression. To this end, many attempts are reported with small molecule ligands for targeting RNA. However, due to their nonspecific interactions, providing a small molecule that specifically targets a given RNA molecule is a challenging approach. Instead, natural or synthetic DNA/RNA or any other artificial nucleic acid sequences might have better recognition ability towards RNA, due to the Watson-Crick and Hoogsteen hydrogen bonding interactions between the strands. The Triplex-Forming Oligonucleotide (TFO) approach, where the third strand of oligonucleotide recognizes double helical RNA, is the most promising and effective strategy for this type of RNA regulation. Although the RNA double helix has a deep and narrow major groove, single stranded RNA could form a modestly stable triplex via parallel orientation. The pyrimidine bases in the third strand bind to the purine strand of the double helical RNA by Hoogsteen hydrogen bonding, recognizing A-U and G-C base pairs to form U-A-U and C+-G-C base triples, respectively.

[0005] One group of synthetic nucleic acid molecules are peptide nucleic acids (PNAs), which have been reported to be chemically stable and are able to mimic the behavior of DNA. Thus, they are used in a wide range of applications. PNA was first introduced by Nielsen et al. in 1991 (Nielsen, P. E. et al. Science 1991, 254, 1497-1500). PNA consists of repeating units of monomers, linked together by amide bonds. Compared to DNA and RNA, the sugar-phosphate backbone is replaced with a neutral N-(2-aminoethyl)-glycine backbone. Purine (adenine and guanine) and pyrimidine (cytosine and thymine) bases are attached to the backbone through a methylene carbonyl linker (Figure 1). PNA can bind to complementary bases in DNA/RNA to form duplexes through Watson-Crick hydrogen bonds. As PNAs are uncharged, the lack of electrostatic repulsions allows them to bind more tightly than their natural counterparts.

[0006] Aside from binding to DNA or RNA through Watson-Crick hydrogen bonds, PNAs can also form triplexes with these molecules. Pyrimidine-rich Triplex-Forming PNAs (TFPNAs) can bind to the purine strand of an RNA duplex, recognizing A-U and G-C base pairs. As a result, U-A-U and C+-G-C base triples (Figure 2A and B) are formed. However, the formation of C+-G-C base triples is favored only at relatively low pH (pH<6.0) as the N3 position of cytosine needs to be protonated in order to form stable Hoogsteen base pairs (Lee, J. S. Nucleic Acids Res. 1979, 6, 3073-3091). Thus, under physiological conditions there is no stable formation of C+-G-C base triples.

[0007] Moreover, in natural RNA duplexes, the purine tract in one of the two strands is often interrupted by pyrimidine residues, resulting in the formation of inverted Watson-Crick C-G and U-A pairs (pyrimidine inversions). Pyrimidine-purine inversions are present in many functional structured RNAs such as pre-microRNAs and the HIV-1 -1 ribosomal frameshift-inducing hairpin, which are important therapeutic targets (Velagapudi, S. P. et al. Nat. Chem. Biol. 2014, 10, 291-297; Brakier-Gingras, L. et al. Expert Opin. Ther. Targets 2012, 16, 249-258).

[0008] However, previously designed PNAs with modified nucleobases, such as 5-methylisocytosine (iC), 2-pyrimidinone (P), and 3-oxo-2,3-dihydropyridazine (E) show only modest improvement in the binding affinity and selectivity towards the RNA duplexes with pyrimidine-purine inversions (Gupta, P. et al. Biochemistry 2012, 51, 63-73; Gupta, P. et al. Chem. Commun. 2011, 47, 11125-11127; Zengeya, T. et al. Bioorg. Med. Chem. Lett. 2011, 21, 2121-2124).

[0009] Semenyuk et al. (Semenyuk, A. et al. Biochemistry 2010, 49, 7867-7878) incorporated a guanidine functionalized C base, N4-(2-guanidoethyl)-5-methylcytosine, into relatively long TFOs (17-mer) with 2'-O-methyl (2'-OMe) and 2'-O-aminoethoxy (2'-AE) RNA backbones and have been able to demonstrate C-G inversion-dependent TFO-DNA2 triplex formation. However, no triplex formation of N4-(2-guanidoethyl)-5-methylcytosine containing molecules and RNA target molecules has been demonstrated.

[00010] Hence, there is need in the art to identify molecules that bind specifically and with high affinity to RNA duplexes containing pyrimidine-purine inversions.

SUMMARY OF THE INVENTION

[00011] It is an object of the present invention to meet the above need by providing compounds comprising N4-(2-guanidoethyl)-5-methylcytosine. Surprisingly, the present inventors have found that peptide nucleic acids (PNAs) comprising thio-pseudoisocytosine (Devi, G. et al. Nucleic Acids Res. 2014, 42) and N4-(2-guanidoethyl)-5-methylcytosine are able to bind specifically and with high affinity to RNA duplexes with C-G (pyrimidine-purine) inversions, but very weakly to DNA duplexes with the same sequence or single-stranded RNA or DNA. Further, the present inventors have found a fast and efficient method to synthesize a monomeric PNA unit comprising N4-(2-guanidoethyl)-5-methylcytosine. This monomeric unit can be used for further synthesis of PNA molecules. In addition, to bind specifically and with high affinity to RNA the PNAs of the present invention can be of small size (not more than 10 bases) compared to, for example, siRNA molecules, which usually consist of 21 bases. Based on this reduced length, the uncharged backbone of PNAs and the guanidine group in the N4-(2-guanidoethyl)-5-methylcytosine, the present inventors have found that PNAs can enter cells without applying any further transfection assistance, making them interesting molecules for therapeutic applications.

[00012] In a first aspect, the present invention is thus directed to compounds having the structure of formula (I):

formula (I); wherein R1, R2 and R3 are independently selected from the group consisting of H and amine protecting groups.

[00013] In a further aspect, the invention is directed to methods of manufacturing a compound according to formula (I), comprising:

(a) reacting thymine with ethyl bromoacetate to form the compound of formula (III)

formula (III);

(b) reacting the compound of formula (III) with 1,2,4-triazole to form the compound of formula (IV)

formula (IV);

(c) reacting the compound of formula (IV) with the compound of formula (V)

formula (V)

to form the compound of formula (VI)

formula (VI);

(d) reacting the compound of formula (VI) with the compound of formula (VII)

formula (VII);

to form the compound of formula (VIII)

formula (VIII);

(e) reacting the compound of formula (VIII) with base and neutralizing by acid to form the compound of formula (IX)

formula (IX);

(f) reacting the compound of formula (IX) with the compound of formula (X)

formula (X)

to form the compound of formula (XI)

formula (XI);

(g) reacting the compound of formula (XI) with base and neutralizing by acid to form the compound of formula (I),

wherein Et is ethyl, R1, R2, R3 and R4 are independently selected from amine protecting groups and R5 is an organic moiety.

[00014] In a third aspect, the invention is directed to the peptide nucleic acid of formula (XII)

formula (XII)

wherein B is selected from the group consisting of thymine, adenine, uracil, cytosine, guanine, thio-pseudoisocytosine, N4-(2-guanidoethyl)-5-methylcytosine, 4-acetylcytosine, 5-(carboxyhydroxymethyl)uracil, 5-carboxymethylaminomethyl-2-thiouracil, 5-carboxymethylaminomethyluracil, dihydrouracil, pseudouracil, hypoxanthine, N6-isopentenyladenine, 1-methyladenine, 1-methylpseudouracil, 1-methylguanine, 1-methylhypoxanthine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-methyladenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, 5-methoxycarbonylmethyl-2-thiouracil, 5-methoxycarbonylmethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, 2-methylthio-N6-threonylcarbamoyladenine, uracil-5-oxyacetic acid, wybutoxine, queuine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, 6-carbamoylthreonyladenine, wybutine, 3-(3-amino-3-carboxy-propyl)uracil and derivatives thereof,

p is at least one,

Y is selected from the group consisting of H and lysine,

X is selected from the group consisting of OH and NH2,

wherein at least one B is N4-(2-guanidoethyl)-5-methylcytosine.

[00015] In still further aspects, the invention is directed to the peptide nucleic acid of the present invention for use as a medicament and for use in the treatment of a viral disease, neurodegenerative disease, or cancer.

[00016] In another aspect, the invention is directed to methods of treating a viral disease, neurodegenerative disease, or cancer, comprising: administering to a patient in need thereof an effective amount of the peptide nucleic acid of the present invention.

[00017] In addition, another aspect of the invention is directed to methods of targeting the structure of an at least partially double-stranded RNA molecule, wherein the double-stranded region comprises a first and a second sequence that are complementary to each other, wherein the first sequence consists of purine bases and one cytosine base, comprising: contacting a peptide nucleic acid according to any one of claims 9-18 with the RNA to form a triple helix structure.

BRIEF DESCRIPTION OF THE DRAWINGS

[00018] The invention will be better understood with reference to the detailed description when considered in conjunction with the non-limiting examples and the accompanying drawings.

[00019] Figure 1 shows the structures of DNA, RNA and PNA, wherein B are nucleobases (adenine, cytosine, guanine, thymine/uracil).

[00020] Figure 2 shows the chemical structures of (A) T-A-U, (B) C+-G-C, (C) Q-C-G and (D) L-G-C major-groove base triples. The letter R represents the sugar-phosphate backbone of RNA. The hydrogen bonds are indicated by black dashed lines. A relatively low pH is needed to protonate C to form the C+-G-C (panel B). The enhanced van der Waals interaction between the sulfur atom in L base and H8 in G base (panel D) is indicated by a dashed line. A Watson-Crick-like L-G pair is destabilized due to the steric clash between the sulfur atom in L and amine in G base. The 5-methyl group in Q base (panel C) may repel the two-carbon linker away and favor the Q-C-G base triple formation, but destabilizes a Watson-Crick-like Q-G pair due to the steric clash caused by the linker.
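The base triples listed for Figure 2 suggest a simple lookup from each residue of the targeted RNA strand to a PNA base. The following Python sketch illustrates that lookup under stated assumptions: the function name `design_tfpna`, the rule table and the example target `AGACAAAG` are hypothetical, the C+-G-C triple is deliberately left out in favor of L-G-C, and the sketch assumes a parallel alignment in which the PNA is written from N- to C-terminus against the target strand written 5' to 3'. It is not a statement of how the inventors selected their sequences.

```python
# Hypothetical lookup built from the base triples named for Figure 2.
# Key: residue in the targeted RNA strand. Value: PNA base placed opposite it.
TRIPLE_RULES = {
    "A": "T",  # T-A-U triple
    "G": "L",  # L-G-C triple (L = thio-pseudoisocytosine)
    "C": "Q",  # Q-C-G triple (Q = N4-(2-guanidoethyl)-5-methylcytosine)
}


def design_tfpna(target_strand: str) -> str:
    """Return PNA bases (assumed N- to C-terminus) for a target strand (assumed 5' to 3')."""
    pna = []
    for position, base in enumerate(target_strand.upper(), start=1):
        if base not in TRIPLE_RULES:
            # Figure 2 names no triple for U in the targeted strand.
            raise ValueError(f"no base-triple rule for {base!r} at position {position}")
        pna.append(TRIPLE_RULES[base])
    return "".join(pna)


if __name__ == "__main__":
    # Hypothetical purine tract interrupted by a single cytosine.
    print(design_tfpna("AGACAAAG"))
```

For the hypothetical target above, the lookup yields the base order T, L, T, Q, T, T, T, L, the same composition as the eight-base sequence recited in claim 16.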

[00021] Figure 3 shows the structures of RNA/DNA target molecules, PNAs and complexes formed by PNAs of the present invention. (A-D) Model RNA hairpins rHP1, rHP2, rHP3 and rHP4. (E-F) Model DNA hairpins dHP1 and dHP2. (G) A model PNA-RNA2 triplex formed between PNA3 and rHP2. (H-L) PNAs studied in this paper. (M) A PNA-RNA2 triplex formed between a PNA (NH2-Lys-LLLTTLLQ-CONH2) and HIV-1 frameshift-inducing hairpin.

[00022] Figure 4 shows non-denaturing PAGE (12%) with running buffer of 1x TBE, pH 8.3 for 5 hours at 250 V. The incubation buffer is 200 mM NaCl, 0.5 mM EDTA, 20 mM HEPES, pH 7.5. The loaded RNA hairpins (rHP1, rHP2, rHP3, rHP4, and HIV-HP) and DNA hairpins (dHP1 and dHP2) are at 1 μM in 20 and 10 μL, respectively. The PNA concentrations in lanes from left to right are 0, 0.2, 0.4, 1, 1.6, 2, 4, 10, 16, 20, 28 and 50 μM, respectively. (A-D) PNA P3 only binds to rHP2 with a Kd value of (4.4 ± 0.5) μM. (E-F) PNA P3 shows no binding to dHP1 or dHP2. (G-H) P6 (NH2-Lys-LLTTLLQ-CONH2) and P7 (NH2-Lys-LLLTTLLQ-CONH2) bind to HIV-HP with Kd values of (1.7 ± 0.7) and (2.4 ± 1.0) μM, respectively.

[00023] Figure 5 shows confocal microscope images of HeLa cells after being treated with 0.5 μM of Cy3-labeled Gua-1 (Cy3-Lys-TCTQTTTC-CONH2), Gua-2 (Cy3-Lys-TQTQTTTQ-CONH2) and Con-1 (Cy3-Lys-TCTCTTTC-CONH2) for 24 hours. Images were captured under the same laser intensity (Ex.: 543 nm, Em: 570-630 nm). Scale bar: 20 μm.

[00024] Figure 6 shows non-denaturing PAGE for various PNAs binding to rHP1. All incubation buffers contain 200 mM NaCl, 0.5 mM EDTA. The incubation buffers at pH 7.5 contain 20 mM HEPES, while the incubation buffer at pH 6.0 contains 20 mM MES. The loaded samples contain 1 μM of rHP1 in 20 μL. The PNA concentrations in lanes from left to right are (a-k) 0, 0.2, 0.4, 1, 1.6, 2, 4, 10, 16, 20, 28 and 50 μM, respectively, or (l-m) 0, 0.05, 0.1, 0.2, 0.4, 1, 1.6, 2, 4, 10, 16 and 20 μM, respectively. PNAs containing Q base (P2 and P3) showed no binding to rHP1.

[00025] Figure 7 shows Kd determination by non-denaturing PAGE for (a-c) P1, (d) P4 and (e-g) P5 binding to rHP1. The fraction of triplex formation was calculated according to $Y = I_{\mathrm{triplex}} / (I_{\mathrm{triplex}} + a I_{\mathrm{duplex}})$. The band intensities were normalized according to $a = I_{\mathrm{triplex\,max}} / I_{\mathrm{duplex\,max}}$. $I_{\mathrm{duplex\,max}}$ is the band intensity for hairpin alone without the addition of PNA. $I_{\mathrm{triplex\,max}}$ is the triplex band intensity with the highest concentration of PNA added. The data were fit to the equation: $Y = Y_0 + \frac{B}{2R_0}\left(R_0 + X + K_d - \sqrt{(R_0 + X + K_d)^2 - 4R_0X}\right)$ where $R_0$ is the RNA hairpin concentration (1 μM). $Y_0$ and $B$ are the minimum and maximum fraction of triplex formation, respectively. $X$ is the total PNA concentration and $K_d$ is the dissociation constant.

[00026] Figure 8 shows non-denaturing PAGE for various PNAs binding to rHP2. All incubation buffers contain 200 mM NaCl, 0.5 mM EDTA. The incubation buffers at pH 7.5 and 8.0 contain 20 mM HEPES, while the incubation buffer at pH 6.0 contains 20 mM MES. The loaded samples contain 1 μM of rHP2 in 20 μL. The PNA concentrations in lanes from left to right are 0, 0.2, 0.4, 1, 1.6, 2, 4, 10, 16, 20, 28 and 50 μM, respectively.

[00027] Figure 9 shows Kd determination by non-denaturing PAGE for (a-d) P3 binding to rHP2 and (e-f) P6 and P7 binding to HIV-1 frameshift-inducing RNA hairpin. The fraction of triplex formation was calculated according to $Y = I_{\mathrm{triplex}} / (I_{\mathrm{triplex}} + a I_{\mathrm{duplex}})$. The band intensities were normalized according to $a = I_{\mathrm{triplex\,max}} / I_{\mathrm{duplex\,max}}$. $I_{\mathrm{duplex\,max}}$ is the band intensity for hairpin alone without the addition of PNA. $I_{\mathrm{triplex\,max}}$ is the triplex band intensity with the highest concentration of PNA added. The data were fit to the equation: $Y = Y_0 + \frac{B}{2R_0}\left(R_0 + X + K_d - \sqrt{(R_0 + X + K_d)^2 - 4R_0X}\right)$ where $R_0$ is the RNA hairpin concentration (1 μM). $Y_0$ and $B$ are the minimum and maximum fraction of triplex formation, respectively. $X$ is the total PNA concentration and $K_d$ is the dissociation constant.
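The normalization and binding equation described for Figures 7 and 9 can be written out as a short Python sketch. Everything beyond the two formulas is an assumption made for illustration: the function names, the default values Y0 = 0 and B = 1, and the coarse grid search standing in for whatever fitting procedure was actually used. The synthetic data in the usage section are generated from the equation itself and are not measured values.

```python
import math


def triplex_fraction(i_triplex: float, i_duplex: float, a: float) -> float:
    """Y = I_triplex / (I_triplex + a * I_duplex), with a = I_triplex_max / I_duplex_max."""
    return i_triplex / (i_triplex + a * i_duplex)


def binding_curve(x: float, kd: float, r0: float = 1.0,
                  y0: float = 0.0, b: float = 1.0) -> float:
    """Y = Y0 + (B / (2 R0)) * (R0 + X + Kd - sqrt((R0 + X + Kd)^2 - 4 R0 X))."""
    s = r0 + x + kd
    return y0 + (b / (2.0 * r0)) * (s - math.sqrt(s * s - 4.0 * r0 * x))


def fit_kd(concentrations, fractions, r0: float = 1.0,
           y0: float = 0.0, b: float = 1.0) -> float:
    """Illustrative least-squares grid search for Kd (in the same units as X and R0)."""
    # Log-spaced candidates from 0.01 to 1000; a real fit would use a proper optimizer.
    candidates = [10 ** (k / 100.0) for k in range(-200, 301)]

    def squared_error(kd: float) -> float:
        return sum((binding_curve(x, kd, r0, y0, b) - y) ** 2
                   for x, y in zip(concentrations, fractions))

    return min(candidates, key=squared_error)


if __name__ == "__main__":
    # PNA lane concentrations (uM) as listed for the gels; R0 = 1 uM hairpin.
    lanes = [0, 0.2, 0.4, 1, 1.6, 2, 4, 10, 16, 20, 28, 50]
    synthetic = [binding_curve(x, kd=4.4) for x in lanes]  # not experimental data
    print(round(fit_kd(lanes, synthetic), 2))
```

The quadratic form is used instead of a simple hyperbola because the hairpin concentration (1 μM) is comparable to the reported Kd values, so the amount of PNA consumed by binding cannot be neglected.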

[00028] Figure 10 shows non-denaturing PAGE for various PNAs binding to rHP3, rHP4, dHP1 and dHP2. The incubation buffer is 200 mM NaCl, 0.5 mM EDTA, 20 mM HEPES, pH 7.5. The loaded samples contain 1 μM of rHP and dHP in 20 μL and 10 μL, respectively. The PNA concentrations in lanes from left to right are 0, 0.2, 0.4, 1, 1.6, 2, 4, 10, 16, 20, 28 and 50 μM, respectively.

[00029] Figure 11 shows non-denaturing PAGE for various PNAs binding to RNA duplexes found in influenza RNA. The PNAs are shown in blue. The RNA hairpins contain the RNA duplex regions found in the A/California/04/2009 (H1N1) virus. The stems of HA1 and HA3 are present in vRNA segment 8. The stem of HA1 is present in segment 8 mRNA. (a-c) Secondary structures of the RNA duplex regions found in influenza RNA. The GCAA tetraloop was used to cap one end of the RNA duplex regions. (d-f) 12% non-denaturing gels for hairpins HA1 (d), HA2 (e), and HA3 (f) binding to the corresponding PNAs. The concentration of all the RNA hairpins is kept at 1 μM. All the RNA hairpins were labeled with 32P at the 5' end. The incubation buffer contains 200 mM NaCl, 0.5 mM EDTA, 20 mM MES (pH 5.5) or HEPES (pH 7.0). The numbers on the top of the lanes are the corresponding PNA concentrations in μM.

[00030] Figure 12 shows the synthesis of the PNA Q monomer: i) Ethyl bromoacetate, K2CO3, anhydrous N,N-dimethylformamide, 23 °C, overnight. ii) 1,2,4-triazole, POCl3, triethylamine and anhydrous acetonitrile/dichloromethane, 0 °C to 23 °C, 21 h. iii) t-butyl (2-aminoethyl)carbamate, K2CO3, acetonitrile, 23 °C, 20 h. iv) 50% trifluoroacetic acid in dichloromethane, 1 h, triethylamine, diCbz-protected 1-guanyl pyrazole. v) aq. 1 M LiOH, tetrahydrofuran, 2 M HCl, 0 °C to 23 °C, 0.5 h. vi) Ethyl N-(2-Boc-aminoethyl)glycinate, N-(3-dimethylaminopropyl)-N'-ethylcarbodiimide hydrochloride, N,N-diisopropylethylamine, 23 °C, 12 h. vii) aq. 1 M LiOH, tetrahydrofuran, 2 M HCl, 0 °C to 23 °C, 0.5 h.

DETAILED DESCRIPTION OF THE INVENTION

[00031] The present inventors surprisingly found that PNAs comprising N4-(2-guanidoethyl)-5-methylcytosine are able to bind specifically and with high affinity to RNA duplexes with C-G inversions, but very weakly to DNA duplexes with the same sequence or single-stranded RNA or DNA. Furthermore, the present inventors have found a fast and efficient method to synthesize a monomeric PNA unit comprising N4-(2-guanidoethyl)-5-methylcytosine.

[00032] Thus, in a first aspect, the present invention is directed to the compound having the structure of formula (I):

formula (I) wherein R1, R2 and R3 are independently selected from the group consisting of H and amine protecting groups.

[00033] In various embodiments of the invention the amine protecting group R1 is different from the amine protecting group R2 and R3. The term "amine protecting group", as used herein, relates to a chemical group that is introduced into a molecule by chemical modification of a functional group of said molecule to obtain chemoselectivity in a subsequent chemical reaction. The protected functional group is an amine. The amine can be a primary, secondary or tertiary amine. Preferably, in various embodiments the amine protecting groups R1, R2, R3 and R4 are independently selected from benzyloxy carbamate (Cbz), p-methoxybenzyl carbonyl (Moz or MeOZ), tert-butyloxycarbonyl (BOC), 9-fluorenylmethyloxycarbonyl (FMOC), allyloxycarbonyl, acetyl (Ac), benzoyl (Bz), benzyl (Bn), carbamate, p-methoxybenzyl (PMB), 3,4-dimethoxybenzyl (DMPM), p-methoxyphenyl (PMP), tosyl (Ts) or sulfonamides (nosyl). In more preferred embodiments, the amine protecting groups R1, R2, R3 and R4 are independently selected from the group consisting of benzyloxy carbamate (Cbz), tert-butyloxycarbonyl (BOC), 9-fluorenylmethyloxycarbonyl (FMOC) and allyloxycarbonyl.

[00034] In even more preferred embodiments the compound of the invention has the structure of formula (II):

formula (II)

[00035] In a second aspect the invention is directed to methods of manufacturing a compound according to formula (I), comprising:

(a) reacting thymine with ethyl bromoacetate to form the compound of formula (III)

formula (III);

(b) reacting the compound of formula (III) with 1,2,4-triazole to form the compound of formula (IV)

formula (IV);

(c) reacting the compound of formula (IV) with the compound of formula (V)

formula (V);

to form the compound of formula (VI)

formula (VI);

(d) reacting the compound of formula (VI) with the compound of formula (VII)

formula (VII);

to form the compound of formula (VIII)

formula (VIII);

(e) reacting the compound of formula (VIII) with base and neutralizing by acid to form the compound of formula (IX)

formula (IX);

(f) reacting the compound of formula (IX) with the compound of formula (X)

formula (X);

to form the compound of formula (XI)

formula (XI);

(g) reacting the compound of formula (XI) with base and neutralizing by acid to form the compound of formula (I),

wherein Et is ethyl, R1, R2, R3 and R4 are independently selected from amine protecting groups and R5 is an organic moiety.

[00036] In various embodiments of the invention concerning methods of manufacturing a compound according to formula (I) in step (a) the temperature is 20-26 °C and/or the reaction time is 10-18 hours; in step (b) the temperature is 0-23 °C and/or the reaction time is 17-25 hours; in step (c) the temperature is 20-26 °C and/or the reaction time is 15-25 hours; in step (d) the reaction time is 10-16 hours; in step (e) the temperature is 0-23 °C and/or the reaction time is 15-45 minutes; in step (f) the temperature is 20-26 °C and/or the reaction time is 8-16 hours; and/or in step (g) the temperature is 0-23 °C and/or the reaction time is 15-45 minutes.

[00037] In more preferred embodiments, in step (a) the temperature is 23 °C and/or the reaction time is 14 hours; in step (b) the reaction time is 21 hours; in step (c) the temperature is 23 °C and/or the reaction time is 20 hours; in step (d) the reaction time is 13 hours; in step (e) the reaction time is 30 minutes; in step (f) the temperature is 23 °C and/or the reaction time is 12 hours; and/or in step (g) the reaction time is 30 minutes.
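As a reading aid, the windows from the broader embodiment and the values from the more preferred embodiment in the two preceding paragraphs can be tabulated and cross-checked. The Python sketch below is only a bookkeeping illustration: the dictionary layout, the names `WINDOWS`, `PREFERRED`, `in_window` and `check_all`, and the conversion of all times to minutes are assumptions, and `None` marks a value the text does not state.

```python
# Values transcribed from the two preceding paragraphs; times are in minutes.
# None marks a temperature that the text does not give for that step.
H = 60  # minutes per hour

WINDOWS = {
    "a": {"temp_c": (20, 26), "time_min": (10 * H, 18 * H)},
    "b": {"temp_c": (0, 23), "time_min": (17 * H, 25 * H)},
    "c": {"temp_c": (20, 26), "time_min": (15 * H, 25 * H)},
    "d": {"temp_c": None, "time_min": (10 * H, 16 * H)},
    "e": {"temp_c": (0, 23), "time_min": (15, 45)},
    "f": {"temp_c": (20, 26), "time_min": (8 * H, 16 * H)},
    "g": {"temp_c": (0, 23), "time_min": (15, 45)},
}

PREFERRED = {
    "a": {"temp_c": 23, "time_min": 14 * H},
    "b": {"temp_c": None, "time_min": 21 * H},
    "c": {"temp_c": 23, "time_min": 20 * H},
    "d": {"temp_c": None, "time_min": 13 * H},
    "e": {"temp_c": None, "time_min": 30},
    "f": {"temp_c": 23, "time_min": 12 * H},
    "g": {"temp_c": None, "time_min": 30},
}


def in_window(step: str, key: str) -> bool:
    """True if the preferred value lies inside the broader window, or if either is unstated."""
    window = WINDOWS[step][key]
    value = PREFERRED[step][key]
    if window is None or value is None:
        return True
    low, high = window
    return low <= value <= high


def check_all() -> dict:
    """Map each step letter to whether both its preferred values fit the broader windows."""
    return {step: all(in_window(step, key) for key in ("temp_c", "time_min"))
            for step in WINDOWS}


if __name__ == "__main__":
    for step, ok in check_all().items():
        print(step, "consistent" if ok else "outside window")
```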

[00038] In various embodiments of the invention concerning methods of manufacturing a compound according to formula (I) in step (a) K2CO3 and anhydrous N,N-dimethylformamide are additionally used for the reaction; in step (b) POCl3, triethylamine and anhydrous acetonitrile/dichloromethane are additionally used for the reaction; in step (c) the compound of formula (V) is t-butyl(2-aminoethyl)carbamate and K2CO3 and acetonitrile are additionally used for the reaction; in step (d) the compound of formula (VII) is diCbz-protected 1-guanyl pyrazole and 50% trifluoroacetic acid in dichloromethane and triethylamine are additionally used for the reaction; in step (e) 2M HCl and 1M LiOH and tetrahydrofuran are additionally used for the reaction; in step (f) the compound of formula (X) is ethyl-N-(2-Boc-aminoethyl)glycinate and N-(3-dimethylaminopropyl)-N'-ethylcarbodiimide hydrochloride and N,N-diisopropylethylamine are additionally used for the reaction; in step (g) 2M HCl and 1M LiOH and tetrahydrofuran are additionally used for the reaction.

[00039] "Reacting" as used with regard to the method of manufacturing the compound of formula (I) refers to contacting the educts under conditions that allow formation of the product. Exemplary reaction conditions are described above.

[00040] The term "any organic moiety", as used herein, refers to carbon-containing moieties. These moieties can be linear or branched, substituted or unsubstituted, and are preferably derived from hydrocarbons, including those obtained by substitution of one or more hydrogen or carbon atoms by other atoms, such as oxygen, nitrogen, sulfur, phosphorus, or functional groups that contain oxygen, nitrogen, sulfur, phosphorus. The organic moiety can comprise any number of carbon atoms, for example up to 5000 or more (typically in case of polymeric moieties), but preferably it is a low molecular weight organic moiety with up to 100, or more preferably up to 40 carbon atoms and, optionally, a molecular weight Mw of 1000 or less. It is preferred that the organic moiety is compatible with the activation/deprotection reaction described herein and does not adversely affect the described reactions. Suitable groups and moieties are well known to those skilled in the art or can be readily identified by routine experimentation.

[00041] In a preferred embodiment, the organic moiety can be a linear or branched, substituted or unsubstituted C1-Cx alkyl; linear or branched, substituted or unsubstituted C2-Cx alkenyl; linear or branched, substituted or unsubstituted C2-Cx alkynyl; linear or branched, substituted or unsubstituted C1-Cx alkoxy; substituted or unsubstituted C3-Cx cycloalkyl; substituted or unsubstituted C3-Cx cycloalkenyl; substituted or unsubstituted C6-Cx aryl; and substituted or unsubstituted C3-Cx heteroaryl; with x being any integer of 2 or more, preferably up to 50, more preferably up to 30.

[00042] In a further embodiment of the present invention, the organic moiety can be a linear or branched, substituted or unsubstituted alkyl with 1 to 40 carbon atoms; linear or branched, substituted or unsubstituted alkenyl with 3 to 40 carbon atoms; linear or branched, substituted or unsubstituted alkoxy with 1 to 40 carbon atoms, substituted or unsubstituted cycloalkyl with 5 to 40 carbon atoms; substituted or unsubstituted cycloalkenyl with 5 to 40 carbon atoms; substituted or unsubstituted aryl with 5 to 40 carbon atoms; and substituted or unsubstituted heteroaryl with 5 to 40 carbon atoms.

[00043] In another embodiment, the organic moiety can be a linear or branched, substituted or unsubstituted alkyl with 1 to 20 carbon atoms; linear or branched, substituted or unsubstituted alkenyl with 3 to 20 carbon atoms; linear or branched, substituted or unsubstituted alkoxy with 1 to 20 carbon atoms, substituted or unsubstituted cycloalkyl with 5 to 20 carbon atoms; substituted or unsubstituted cycloalkenyl with 5 to 20 carbon atoms; substituted or unsubstituted aryl with 5 to 14 carbon atoms; and substituted or unsubstituted heteroaryl with 5 to 14 carbon atoms.

[00044] The organic moiety can also be a combination of any of the above-defined groups, including but not limited to alkylaryl, arylalkyl, alkylheteroaryl and the like, to name only a few, all of which may be substituted or unsubstituted.

[00045] The term "substituted" as used herein in relation to the above moieties refers to a substituent other than hydrogen. Such a substituent is preferably selected from the group consisting of halogen, -OH, -OOH, -NH2, -NO2, -ONO2, -CHO, -CN, -CNOH, -COOH, -SH, -OSH, -CSSH, -SCN, -SO2OH, -CONH2, -NH-NH2, -NC, -CSH, -OR, -NRR', -C(O)R, -C(O)OR, -(CO)NRR', -NR'C(O)R, -OC(O)R, aryl with 5 to 20 carbon atoms, cycloalk(en)yl with 3 to 20 carbon atoms, 3- to 8-membered heterocycloalk(en)yl, and 5- to 20-membered heteroaryl, wherein R and R' are independently selected from hydrogen, alkyl with 1 to 10 carbon atoms, alkenyl with 2 to 10 carbon atoms, alkynyl with 2 to 10 carbon atoms, aryl with 5 to 14 carbon atoms, cycloalk(en)yl with 3 to 20 carbon atoms, 5- to 14-membered heteroaryl comprising 1 to 4 heteroatoms selected from nitrogen, oxygen, and sulfur, and 5- to 14-membered heterocycloalk(en)yl comprising 1 to 4 heteroatoms selected from nitrogen, oxygen, and sulfur. Any of these substituents may again be substituted; it is, however, preferred that these substituents are unsubstituted.

[00046] Alkenyl and alkynyl comprise at least one carbon-carbon double bond or triple bond, respectively, and are otherwise defined as alkyl above.

[00047] Cycloalkyl refers to a non-aromatic carbocyclic moiety, such as cyclopentanyl, cyclohexanyl and the like.

[00048] Cycloalkenyl refers to non-aromatic carbocyclic compounds that comprise at least one C-C double bond.

[00049] Similarly, heterocycloalk(en)yl relates to cycloalk(en)yl groups wherein 1 or more ring carbon atoms are replaced by heteroatoms, preferably selected from nitrogen, oxygen, and sulfur.

[00050] Aryl relates to an aromatic ring that is preferably monocyclic or consists of condensed aromatic rings. Preferred aryl substituents are moieties with 6 to 14 carbon atoms, such as phenyl, naphthyl, anthracenyl and phenanthrenyl.

[00051] Heteroaryl refers to aromatic moieties that correspond to the respective aryl moiety wherein one or more ring carbon atoms have been replaced by heteroatoms, such as nitrogen, oxygen, and sulfur.

[00052] All of the afore-mentioned groups can be substituted or unsubstituted. When substituted the substituent can be selected from the above list of substituents.

In a third aspect, the present invention is directed to peptide nucleic acids of formula (XII)

formula (XII) wherein B is selected from the group consisting of thymine, adenine, uracil, cytosine, guanine, thio-pseudoisocytosine, N4-(2-guanidoethyl)-5-methylcytosine, 4-acetylcytosine, 5-(carboxyhydroxymethyl)uracil, 5-carboxymethylaminomethyl-2-thiouracil, 5-carboxymethylaminomethyluracil, dihydrouracil, pseudouracil, hypoxanthine, N6-isopentenyladenine, 1-methyladenine, 1-methylpseudouracil, 1-methylguanine, 1-methylhypoxanthine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-methyladenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, 5-methoxycarbonylmethyl-2-thiouracil, 5-methoxycarbonylmethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, 2-methylthio-N6-threonylcarbamoyladenine, uracil-5-oxyacetic acid, wybutoxine, queuine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, 6-carbamoylthreonyladenine, wybutine, 3-(3-amino-3-carboxy-propyl)uracil and derivatives thereof,

p is at least one,

Y is selected from the group consisting of H and lysine,

X is selected from the group consisting of OH and NH2,

wherein at least one B is N4-(2-guanidoethyl)-5-methylcytosine.

[00053] "p", as used herein, is a placeholder for the number of monomeric units that form the peptide nucleic acid of the present invention. In various embodiments, "p" is 5 to 15, in more preferred embodiments "p" is 8. "At least one", as used herein, relates to one or more, in particular 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more. Therefore, in various embodiments of the invention, at least 2, 3, 4, 5, 6, 7, 8, 9, 10 or more different monomeric PNA units may be attached to each other to form the PNA of the present invention.

[00054] The term "derivatives", as used herein, relates to compounds that are derived from a similar compound by some chemical or physical process. Usually, the derived compounds have similar chemical properties, especially binding properties. In various embodiments, the derivative is derived from one of the nucleic bases guanine, cytosine, thymine, adenine or uracil. This means that an organic moiety, as defined above, replaces hydrogen or other functional group in one or more of positions 1-9 of the purines or in one or more of positions 1-6 of the pyrimidines.

[00055] "PNA", "peptide nucleic acid" or "PNA chain", as interchangeably used herein, refers to an artificially synthesized polymer similar to DNA or RNA. DNA and RNA have a deoxyribose and ribose sugar backbone, respectively, whereas the PNA backbone is composed of repeating N-(2-aminoethyl)-glycine units linked by peptide bonds. The various purine and pyrimidine bases are linked to the backbone by a methylene bridge (-CH2-) and a carbonyl group (-(C=O)-). PNAs have a structure as depicted by Figure 1 (right side), wherein the substituent "B" represents a base and the curled lines indicate the binding to the adjacent monomeric units. PNAs are depicted like peptides, with the N-terminus at the first (left) position and the C-terminus at the last (right) position. "Monomeric PNA unit" or "PNA monomer", as interchangeably used herein, refers to a building block of a polymer PNA chain. It encompasses both forms of the unit, namely before and after being incorporated into the PNA chain. Before the monomeric units are reacted to form the PNA, they comprise a secondary amine and a carbonyl group. A representative unreacted, "free" monomeric PNA unit is the compound of formula (I). After being reacted to form a PNA, the carbonyl group and the amine of another PNA monomer form a so-called "peptide (amide) bond".

[00056] The term "nucleic acid molecule" or "nucleic acid sequence", as used herein, relates to DNA (deoxyribonucleic acid) or RNA (ribonucleic acid) molecules. Said molecules may appear independent of their natural genetic context and/or background. The term "nucleic acid molecule/sequence" further refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; "RNA molecules") or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; "DNA molecules"), or any phosphoester analogs thereof, such as phosphorothioates and thioesters, in either single-stranded form or a double-stranded helix. Double-stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible. The term nucleic acid molecule, and in particular DNA or RNA molecule, refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms.

[00057] "RNA" or "ribonucleic acid", as interchangeably used herein, relates to a chain of nucleotides wherein the nucleotides contain the sugar ribose and bases selected from the group of adenine (A), cytosine (C), guanine (G), or uracil (U). "DNA" or "deoxyribonucleic acid", as interchangeably used herein, relates to a chain of nucleotides wherein the nucleotides contain the sugar 2'-deoxyribose and bases selected from adenine (A), guanine (G), cytosine (C) and thymine (T). The term "mRNA" refers to messenger RNA.

[00058] The term "base" or "nucleobase", as interchangeably used herein, relates to nitrogen-containing biological compounds (nitrogenous bases) found linked to the methylene carbonyl linker of PNAs or to a sugar within nucleosides, the basic building blocks of deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). Their ability to form base pairs and to stack upon one another leads directly to the helical structure of DNA and RNA. The primary, or canonical, nucleobases are cytosine (DNA and RNA), guanine (DNA and RNA), adenine (DNA and RNA), thymine (DNA) and uracil (RNA), abbreviated as C, G, A, T, and U, respectively. Because A, G, C, and T appear in the DNA, these molecules are called DNA-bases; A, G, C, and U are called RNA-bases. Uracil and thymine are identical except that uracil lacks the 5-methyl group. Adenine and guanine belong to the double-ringed class of molecules called purines (abbreviated as R). Cytosine, thymine, and uracil are all pyrimidines (abbreviated as Y). Other bases that do not function as normal parts of the genetic code are termed non-canonical. Nucleobases that can be used to form the PNAs of the present invention are thymine, adenine, uracil, cytosine, guanine, thio-pseudoisocytosine, N4-(2-guanidoethyl)-5-methylcytosine, 4-acetylcytosine, 5-(carboxyhydroxymethyl)uracil, 5-carboxymethylaminomethyl-2-thiouracil, 5-carboxymethylaminomethyluracil, dihydrouracil, pseudouracil, hypoxanthine, N6-isopentenyladenine, 1-methyladenine, 1-methylpseudouracil, 1-methylguanine, 1-methylhypoxanthine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-methyladenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, 5-methoxycarbonylmethyl-2-thiouracil, 5-methoxycarbonylmethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, 2-methylthio-N6-threonylcarbamoyladenine, uracil-5-oxyacetic acid, wybutoxine, queuine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, 6-carbamoylthreonyladenine, wybutine, 3-(3-amino-3-carboxy-propyl)uracil and derivatives thereof.

[00059] The variable "Y", as used in Formula XII, represents chemical groups that form the N-terminal end of the PNA of the present invention. Y is selected from the group consisting of H, lysine, arginine, cell-penetrating oligomer, small molecular drug, any delivery vector, fluorescently labelled molecule such as carboxyfluorescein, any other planar ring compounds, any amino sugar derivative such as neamine, or combination thereof. In more preferred embodiments, Y is lysine, neamine, carboxyfluorescein or combination thereof. More preferably Y is lysine. The variable "X", as used in Formula XII, represents chemical groups that form the C-terminal end of the PNA of the present invention. X is selected from the group consisting of OH, NH2, H, lysine, arginine, cell-penetrating oligomer, small molecular drug, any delivery vector, fluorescently labelled molecule such as carboxyfluorescein, any other planar ring compounds, any amino sugar derivative such as neamine, or combination thereof. In more preferred embodiments, X is NH2.

[00060] In various embodiments, the invention is directed to peptide nucleic acids of the present invention, wherein the peptide nucleic acid comprises a sequence that is complementary to a given RNA target sequence, wherein in the peptide nucleic acid sequence that is complementary to the given RNA target sequence the base pairing with adenine in the RNA is selected from the group consisting of thymine, uracil and derivatives thereof, the base pairing with uracil in the RNA is selected from the group consisting of adenine and derivatives thereof, the base pairing with guanine is selected from the group consisting of thio-pseudoisocytosine, cytosine and derivatives thereof and the base pairing with cytosine is selected from the group consisting of N4-(2-guanidoethyl)-5-methylcytosine, guanine and derivatives thereof.
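The pairing scheme of paragraph [00060] can be expressed as a simple lookup. The Python sketch below is an illustration under stated assumptions, not the method of the invention: it picks a single PNA base per RNA base (thymine for A, thio-pseudoisocytosine "L" for G, N4-(2-guanidoethyl)-5-methylcytosine "Q" for C, adenine for U) although the paragraph permits alternatives, it emits the PNA bases in the same order as the target is written without addressing strand orientation, and the function name and example target are hypothetical.

```python
# One PNA base chosen per RNA target base; the alternatives allowed in
# paragraph [00060] (uracil, cytosine, guanine, derivatives) are not modelled.
# T = thymine, L = thio-pseudoisocytosine,
# Q = N4-(2-guanidoethyl)-5-methylcytosine, A = adenine.
PNA_BASE_FOR_RNA = {"A": "T", "G": "L", "C": "Q", "U": "A"}


def design_pna(rna_target):
    """Map each base of an RNA target strand to its pairing PNA base.

    The output is written position by position in the same order as the
    input; strand orientation is an assumption left to the user.
    """
    pna = []
    for position, base in enumerate(rna_target.upper()):
        if base not in PNA_BASE_FOR_RNA:
            raise ValueError(f"unsupported RNA base {base!r} at position {position}")
        pna.append(PNA_BASE_FOR_RNA[base])
    return "".join(pna)


# Hypothetical purine-rich target containing a single cytosine.
example_pna = design_pna("AGCAAAG")
```

For the hypothetical target AGCAAAG the lookup yields TLQTTTL, with the single cytosine of the target addressed by Q.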

[00061] The term "complementary", as used herein, refers to the hybridization or base pairing between nucleobases, nucleotides or nucleic acids, such as, for instance, between the two strands of a double stranded DNA molecule or between RNA and a PNA. Two PNA, RNA or DNA molecules are said to be complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand, usually at least about 90% to 95%, and more preferably from about 98 to 100%. Alternatively, complementarity exists when a PNA, RNA or DNA strand will hybridize under selective hybridization conditions to its complement. Typically, selective hybridization will occur when there is at least about 65% complementary over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementarity. The term "pairing", as used herein, means a pair of nucleobases or their derivatives binding to each other by Watson-Crick or Hoogsteen base pairing.
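The percentage criterion in the definition of "complementary" can be illustrated with a short calculation. The Python sketch below is a simplification under stated assumptions: it handles only pre-aligned strands of equal length written so that facing bases share an index, it ignores the insertions and deletions mentioned in the definition, and it models Watson-Crick pairs only. The function names and the default 80% threshold parameter are chosen for this example.

```python
# Watson-Crick pairs between RNA, DNA and PNA bases.
# Hoogsteen pairs are deliberately not modelled in this sketch.
WATSON_CRICK = {
    ("A", "U"), ("U", "A"),
    ("A", "T"), ("T", "A"),
    ("G", "C"), ("C", "G"),
}


def percent_paired(strand, partner):
    """Percentage of positions at which strand[i] pairs with partner[i]."""
    if not strand or len(strand) != len(partner):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        (a, b) in WATSON_CRICK
        for a, b in zip(strand.upper(), partner.upper())
    )
    return 100.0 * paired / len(strand)


def is_complementary(strand, partner, threshold=80.0):
    """True if at least `threshold` percent of the facing bases pair."""
    return percent_paired(strand, partner) >= threshold
```

With this convention, five pairs out of six positions (about 83%) satisfy the 80% criterion but not the 90% one.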

[00062] "Pyrimidine-purine inversion" or "C-G inversion", as used herein, refers to an RNA strand in which one pyrimidine (including C and U) or one C base, respectively, is located in a purine-rich region. An inverted base pair therefore interrupts the continuity of a homopurine strand sequence. "Purine-rich", in this context, means that at least 75%, 80%, 85%, 90%, 95% or 98% of the nucleotides in said region of the RNA molecule are purines. In preferred embodiments, 100% of the nucleotides of said region are purines. The purine-rich region has a length of at least 5, 8, 10, 13, 15, 20, 25 or 30 nucleotides.
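The "purine-rich" threshold and the position of an inversion can be checked with a few lines of code. The following Python sketch is illustrative; the function names and the example region are assumptions made for this example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def purine_fraction(region):
    """Fraction of nucleotides in an RNA region that are purines (A or G)."""
    region = region.upper()
    if not region:
        raise ValueError("region must not be empty")
    return sum(base in PURINES for base in region) / len(region)


def pyrimidine_inversions(region):
    """0-based positions of pyrimidines (C or U) interrupting the purine tract."""
    return [i for i, base in enumerate(region.upper()) if base in PYRIMIDINES]
```

For the hypothetical seven-nucleotide region AGCAAAG, six of seven nucleotides are purines (about 86%), and the single inversion sits at index 2.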

[00063] The term "target sequence", as used herein, refers to any polynucleotide sequence in an RNA or DNA molecule. In various embodiments, the target sequence encodes a peptide or protein. In other various embodiments of the invention, the target sequence has a length of at least 5, 7, 10, 15, 20 or 25 nucleotides.

[00064] In various embodiments, the invention relates to the peptide nucleic acid of the present invention, wherein the given RNA target sequence is a first strand of a double-stranded RNA region, wherein the double-stranded RNA region comprises a first and a second sequence that are complementary to each other, wherein the first sequence consists of purine bases and one cytosine base. In this context, the term "double-stranded RNA region" refers to a) one RNA molecule that has two complementary sequence regions that allow the formation of an internal loop and the formation of an internal double-stranded region or b) two RNA molecules that bind to each other based on at least partially complementary sequences.

[00065] In still further various embodiments, the monomeric unit comprising the at least one N4-(2-guanidoethyl)-5-methylcytosine is not terminally positioned in the peptide nucleic acid chain. "Terminal position" or "terminal nucleobase", as used herein, refers to the C-terminal or N-terminal nucleobase in a peptide nucleic acid molecule.

[00066] In various embodiments of the invention, the peptide nucleic acid comprises at least three N4-(2-guanidoethyl)-5-methylcytosine bases. In further embodiments, the PNA of the present invention consists of the sequence: NH2-Lys-TLQTTTL-CONH2, wherein Lys is lysine, T is thymine, L is thio-pseudoisocytosine and Q is N4-(2-guanidoethyl)-5-methylcytosine.

[00067] The term "thio-pseudoisocytosine", as used herein, refers to a nucleobase of formula (XIII):

formula (XIII) wherein P represents a peptide nucleic acid monomeric unit backbone or a peptide nucleic acid molecule, a nucleotide backbone or an RNA or DNA. The nucleobase thio-pseudoisocytosine or its monomeric unit incorporated into a PNA, RNA or DNA is herein abbreviated by "L".

[00068] The term "N4-(2-guanidoethyl)-5-methylcytosine", as used herein, refers to a nucleobase of formula (XIV):

formula (XIV) wherein P represents a peptide nucleic acid monomeric unit backbone or a peptide nucleic acid molecule, a nucleotide backbone or an RNA or DNA. The nucleobase N4-(2-guanidoethyl)-5-methylcytosine or its monomeric unit incorporated into a PNA, RNA or DNA is herein abbreviated by "Q".

[00069] In another aspect, the invention is directed to the peptide nucleic acid of the present invention for use as a medicament. The term "medicament", as used herein, means and includes any substance (i.e., compound or composition of matter) which, when administered to an organism (human or animal), induces a desired pharmacologic and/or physiologic effect by local and/or systemic action. The term therefore encompasses substances traditionally regarded as actives, drugs and bioactive agents, as well as biopharmaceuticals (e.g., peptides, hormones, nucleic acids, gene constructs, etc.) typically employed to treat a number of conditions, which is defined broadly to encompass diseases, disorders, infections, and the like. Exemplary medicaments include, without limitation, antibiotics, antivirals, H2-receptor antagonists, 5HT1 agonists, 5HT3 antagonists, COX2-inhibitors, medicaments used in treating psychiatric conditions such as depression, anxiety, bipolar condition, tranquilizers, medicaments used in treating metabolic conditions, anticancer medicaments, medicaments used in treating neurological conditions such as epilepsy and Parkinson's disease, medicaments used in treating cardiovascular conditions, non-steroidal anti-inflammatory medicaments, medicaments used in treating Central Nervous System conditions, and medicaments employed in treating hepatitis. The term medicament also encompasses pharmaceutically acceptable salts, esters, solvates, and/or hydrates of the pharmaceutically active substances referred to hereinabove. Various combinations of any of the above medicaments may also be employed. In accordance with the present invention, the medicament is typically employed in an oral pharmaceutical formulation. An oral pharmaceutical formulation typically refers to the combination of at least one medicament and one or more added components or elements, such as an "excipient" or "carrier."
As will be appreciated by one having ordinary skill in the art, the terms "excipient" and "carrier" generally refer to substantially inert materials that are nontoxic and do not interact with other components of the composition in a deleterious manner. Examples of normally employed "excipients" include pharmaceutical grades of carbohydrates, including monosaccharides, disaccharides, cyclodextrins and polysaccharides (e.g., dextrose, sucrose, lactose, raffinose, mannitol, sorbitol, inositol, dextrins and maltodextrins); starch; cellulose; salts (e.g., sodium or calcium phosphates, calcium sulfate, magnesium sulfate); citric acid; tartaric acid; glycine; leucine; high molecular weight polyethylene glycols (PEG); pluronics; surfactants; lubricants; stearates and their salts or esters (e.g., magnesium stearate); amino acids; fatty acids; and combinations thereof. The oral pharmaceutical formulation may be utilized in a variety of unit dosage forms including, without limitation, a tablet, a pill, a capsule, a lozenge, and combinations thereof. The unit dosage forms may encompass hospital unit dosage forms, as well as others.

[00070] In another aspect, the invention is directed to the peptide nucleic acid of the present invention for use in the treatment of viral diseases or cancer. In various embodiments, the viral disease is caused by an RNA virus. In still further embodiments, the RNA virus is selected from influenza, picornavirus, hepevirus, reovirus, coronavirus, togavirus, flavivirus, arenavirus, filovirus or retrovirus. An RNA virus is a virus that has RNA (ribonucleic acid) as its genetic material. This nucleic acid is usually single-stranded RNA (ssRNA), but may be double-stranded RNA (dsRNA). The peptide nucleic acid of the present invention may be used, without limitation, for the treatment of the following diseases caused by RNA viruses: Ebola hemorrhagic fever, SARS, MERS, influenza, hepatitis C and E, West Nile fever, polio, rotavirus gastroenteritis, rubella, Dengue fever, tick-borne encephalitis (TBE), Lassa fever or Lassa hemorrhagic fever (LHF), HIV/AIDS and measles. Additional diseases caused by malfunction of RNA include Duchenne Muscular Dystrophy (DMD), dementia, and cancers.

[00071] In various embodiments of the invention, the cancer is a carcinoma or sarcoma. The term "cancer", as used herein, refers to diseases caused by uncontrolled cell division and the ability of cells to metastasize, or to establish new growth in additional sites. The terms "malignant", "malignancy", "neoplasm", "tumor" and variations thereof refer to cancerous cells or groups of cancerous cells. Exemplary cancers include: carcinoma, melanoma, sarcoma, lymphoma, leukemia, germ cell tumor, and blastoma. More particular examples of such cancers include squamous cell cancer (e.g., epithelial squamous cell cancer), lung cancer including small-cell lung cancer, non-small cell lung cancer, adenocarcinoma of the lung and squamous carcinoma of the lung, cancer of the peritoneum, hepatocellular cancer, gastric or stomach cancer including gastrointestinal cancer, pancreatic cancer, glioblastoma, cervical cancer, ovarian cancer, liver cancer, bladder cancer, cancer of the urinary tract, hepatoma, breast cancer, colon cancer, rectal cancer, colorectal cancer, endometrial or uterine carcinoma, salivary gland carcinoma, kidney or renal cancer, prostate cancer, vulval cancer, thyroid cancer, hepatic carcinoma, anal carcinoma, penile carcinoma, melanoma, multiple myeloma and B-cell lymphoma, brain, as well as head and neck cancer, and associated metastases. "Carcinoma", as used herein, includes all carcinomas and adenocarcinomas present in humans. Carcinoma is a type of cancer that develops from epithelial cells. Specifically, a carcinoma is a cancer that begins in a tissue that lines the inner or outer surfaces of the body, and that generally arises from cells originating in the endodermal or ectodermal germ layer during embryogenesis. The term "sarcoma", as used herein, is a cancer that arises from transformed cells in one of a number of tissues that develop from embryonic mesoderm. 
Thus, sarcomas include tumors of bone, cartilage, fat, muscle, vascular, and hematopoietic tissues. For example, osteosarcoma arises from bone, chondrosarcoma arises from cartilage, liposarcoma arises from fat, and leiomyosarcoma arises from smooth muscle.

[00072] In a sixth aspect, the invention is directed to methods of targeting the structure of an at least partially double-stranded RNA molecule, wherein the double-stranded region comprises a first and a second sequence that are complementary to each other, wherein the first sequence consists of purine bases and one cytosine base, comprising: contacting a peptide nucleic acid according to any one of claims 9-18 with the RNA to form a triple helix structure.

[00073] In a still further aspect, the invention relates to methods of purifying RNA, wherein the RNA comprises a double-stranded region, wherein the double-stranded region comprises a first and a second sequence that are complementary to each other, wherein the first sequence consists of purine bases and one cytosine base, comprising: binding a peptide nucleic acid of the present invention to an insoluble material and contacting the bound peptide nucleic acid with the RNA.

EXAMPLES

Materials and Methods

General Methods

[00074] Reagents and solvents used were obtained from commercial sources and used without further purification. All organic reactions were monitored by Thin-Layer Chromatography (TLC) using aluminium sheet silica gel 60 F254 (Merck). Compounds were purified by flash column chromatography using silica gel with ethyl acetate/petroleum ether mixture as the eluting solvent. Mass spectra of the compounds were obtained via Liquid Chromatography-Mass Spectroscopy fitted with Electrospray Ionization Source (ESI) (LCMS-ESI). NMR spectra were obtained at room temperature on 300 MHz Bruker (AV300), 400 MHz Bruker (AV400) and 500 MHz Bruker (AV500) spectrometers and the chemical shifts (δ) are described in ppm. Reversed phase high performance liquid chromatography (RP-HPLC) purified RNA oligonucleotides were purchased from Sigma Aldrich Singapore.

Non-denaturing polyacrylamide gel electrophoresis

[00075] Non-denaturing (12%) polyacrylamide gel electrophoresis was conducted with incubation buffer containing 200 mM NaCl, 0.5 mM EDTA, 20 mM MES (pH 6.0), or 200 mM NaCl, 0.5 mM EDTA, 20 mM HEPES (pH 7.5), with and without 2 mM MgCl2. The loading volume for each sample was 20 μL. Samples were prepared by snap cooling of the RNA hairpin, followed by annealing of PNA oligomers by slow cooling from 65 °C to room temperature and incubation at 4 °C overnight. Before loading the samples into the wells, 35% glycerol (20% of the total volume) was added to the sample mixture. 1x TBE (Tris-Borate-EDTA) buffer, pH 8.3, was used as the running buffer for all gel experiments. The gel was run at 4 °C at 250 V for 5 h. The gels were then stained with ethidium bromide and imaged by the Typhoon Trio Variable Mode Imager.

Example 1: Synthesis of the compound of formula (I)

[00076] The compound of formula (I), herein also called PNA monomeric unit Q, was synthesized as follows and shown in Figure 12.

[00077] Ethyl 2-(thyminyl)acetate (2). Thymine (10 g, 79.2 mmol) and potassium carbonate (10.9 g, 79.2 mmol) were dissolved in anhydrous DMF (60 mL) under N2. To this mixture, ethyl bromoacetate (8.7 mL, 79.2 mmol) was added dropwise. The reaction mixture was stirred overnight at room temperature and dissolved in water. The product was extracted with ethyl acetate (150 mL x 3). The combined organic phase was washed with saturated aqueous KHSO4 solution (70 mL x 2) and brine (50 mL x 2), and dried over MgSO4. The solvent was removed under reduced pressure to give 13.1 g of a white solid (78% isolated yield). 1H NMR (300 MHz, CDCl3): δH 8.92 (s, 1H), 6.91 (s, 1H), 4.41 (s, 2H), 4.26-4.19 (q, 2H), 1.91 (s, 3H), 1.30-1.25 (t, 3H); 13C NMR (100 MHz, CDCl3): δC 167.75, 164.65, 151.13, 140.34, 110.64, 61.81, 48.49, 13.99, 12.18; ESI-MS: m/z: calculated for C9H12N2O4 [M+H]+ 213.08, found: 213.10.
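The isolated yield quoted for compound 2 can be reproduced from the stated quantities. The Python sketch below is a worked check, not part of the experimental procedure; it assumes standard average atomic masses (rounded values supplied here, not taken from the source) and treats thymine (79.2 mmol) as the limiting reagent. The function names are chosen for this example.

```python
# Rounded standard average atomic masses in g/mol (assumed, not from the source).
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(composition):
    """Molar mass in g/mol for a composition such as {"C": 9, "H": 12, ...}."""
    return sum(AVERAGE_MASS[element] * count for element, count in composition.items())


def percent_yield(isolated_g, limiting_mmol, product_molar_mass):
    """Isolated yield as a percentage of the theoretical mass of product."""
    theoretical_g = limiting_mmol / 1000.0 * product_molar_mass
    return 100.0 * isolated_g / theoretical_g


# Compound 2, C9H12N2O4: 13.1 g isolated from 79.2 mmol of thymine.
compound_2 = {"C": 9, "H": 12, "N": 2, "O": 4}
yield_2 = percent_yield(13.1, 79.2, molar_mass(compound_2))
```

Under these assumptions the theoretical mass is about 16.8 g, so 13.1 g corresponds to roughly 78%, in agreement with the reported value.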

[00078] Ethyl 2-(4-(1,2,4 triazole)thyminyl)acetate (3). Under N2 atmosphere, POCl3 (8.8 mL, 94.2 mmol) and TEA (98.6 mL, 1.1 mol) were added into an ice-cold stirring solution of 1,2,4-triazole (29.3 g, 0.4 mol) in anhydrous ACN (200 mL). A solution of 2 (10 g, 47.1 mmol) in dry ACN/DCM mixture (10 mL, 1:1) was added dropwise to the above reaction mixture. Upon completion of the addition, the reaction mixture was stirred at room temperature for 21 h. It was then extracted with ethyl acetate (100 mL x 2). The combined organic phase was washed with saturated aqueous NaHCO3 solution (50 mL) and brine (50 mL), and dried over MgSO4. After removal of the organic solvent on a rotavapor, 11.2 g of a white solid was obtained with an isolated yield of 89%. 1H NMR (300 MHz, CDCl3): δH 9.25 (s, 1H), 8.08 (s, 1H), 7.59 (d, 1H), 4.66 (s, 2H), 4.28-4.21 (q, 2H), 2.43 (d, 3H), 1.30-1.26 (t, 3H); 13C NMR (100 MHz, CDCl3): δC 166.92, 158.74, 154.54, 153.46, 151.55, 145.00, 106.12, 62.33, 51.07, 16.85, 14.02. ESI-MS: m/z: calculated for C11H13N5O3 [M+H]+ 264.11, found: 264.11.

[00079] Ethyl 2-(4-((2-t Boc aminoethyl)amino)thyminyl)acetate (4). Triazole derivative 3 (7.1 g, 26.0 mmol), t-butyl (2-aminoethyl)carbamate (8.4 g, 52.0 mmol) and K2CO3 (10.8 g, 78.0 mmol) were mixed in ACN (100 mL). The reaction mixture was stirred for 20 h at room temperature followed by extraction with 2% methanol in ethyl acetate (100 mL x 3). The combined organic phase was washed with brine (50 mL). The crude product was purified by flash column chromatography to obtain 7.9 g of compound 4 with an isolated yield of 87%. 1H NMR (500 MHz, CDCl3): δH 6.93 (s, 1H), 6.57 (s, 1H), 5.38 (s, 1H), 4.50 (s, 2H), 4.23-4.19 (q, 2H), 3.59-3.58 (m, 2H), 3.40-3.39 (m, 2H), 1.91 (s, 3H), 1.43 (s, 9H), 1.29-1.26 (t, 3H); 13C NMR (100 MHz, CDCl3): δC 168.59, 163.92, 157.95, 156.78, 141.27, 102.67, 79.64, 61.57, 49.81, 43.80, 39.43, 28.26, 14.01, 12.72; HRMS (EI): m/z: calculated for C16H26N4O5 [M+H]+ 355.1981, found: 355.1983.
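The calculated [M+H]+ value reported for compound 4 (C16H26N4O5) can be reproduced by summing monoisotopic masses. The Python sketch below is a worked check under stated assumptions: the monoisotopic masses are standard values supplied here rather than taken from the source, the parser handles only simple formulas without brackets, and the mass of one hydrogen atom is added without subtracting the electron mass (about 0.0005 u), a convention that appears to match the calculated values given in these examples. The function names are chosen for this example.

```python
import re

# Monoisotopic masses in u of the most abundant isotopes (assumed standard values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def parse_formula(formula):
    """Parse a simple molecular formula such as 'C16H26N4O5' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def mz_m_plus_h(formula):
    """m/z of the singly protonated species, adding one H atom mass.

    The electron mass is not subtracted in this sketch.
    """
    counts = parse_formula(formula)
    counts["H"] = counts.get("H", 0) + 1
    return sum(MONOISOTOPIC[element] * n for element, n in counts.items())


mz_compound_4 = mz_m_plus_h("C16H26N4O5")
```

Under these assumptions the result for C16H26N4O5 rounds to 355.1981, matching the calculated value listed for compound 4.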

[00080] Ethyl 2-(4-((2-diCbz guanidino)ethyl amino)thyminyl)acetate (5). A mixture of 50% TFA in DCM (40 mL) was added dropwise to a stirred solution of 4 (4.7 g, 13.4 mmol) dissolved in DCM (30 mL) at 0 °C. After complete Boc deprotection, as monitored by TLC, the reaction mixture was neutralized with TEA followed by removal of DCM under reduced pressure. This crude product was used directly for the next step without further purification. The compound N,N'-Bis(benzyloxycarbonyl)-1H-pyrazole-1-carboxamidine (5.6 g, 15 mmol) was then added to the above-mentioned reaction mixture dissolved in DCM (200 mL). The reaction was left to stir for 12 h at room temperature. Subsequently, 100 mL DCM was added to the reaction mixture, which was washed with water (100 mL x 2), half-saturated NaHCO3 solution (50 mL) and brine (50 mL), and dried over MgSO4. DCM was removed under reduced pressure. The product was purified by flash column chromatography (5% MeOH in EtOAc) to afford 5.7 g of a white solid with 75% isolated yield. 1H NMR (300 MHz, CDCl3): δH 11.70 (s, 1H), 8.69 (s, 1H), 7.35-7.29 (m, 10H), 7.11 (s, 1H), 6.89 (s, 1H), 5.17 (s, 2H), 5.09 (s, 2H), 4.45 (s, 2H), 4.21-4.14 (q, 2H), 3.67 (m, 4H), 1.71 (s, 3H), 1.27-1.22 (t, 3H); 13C NMR (100 MHz, CDCl3): δC 168.42, 163.37, 163.15, 157.67, 156.11, 153.41, 141.52, 136.09, 134.34, 128.76, 128.60, 128.50, 128.39, 128.12, 102.58, 68.37, 67.21, 61.59, 49.72, 43.23, 39.57, 14.00, 12.77; HRMS (EI): m/z: calculated for C28H32N6O7 [M+H]+ 565.2411, found: 565.2422.

[00081] 2-(4-((2-diCbz-guanidino)ethylamino)thyminyl)acetic acid (6). Compound 5 (4.0 g, 7.0 mmol) was dissolved in THF (30 mL) and stirred in an ice bath. Aqueous 1 M LiOH was added dropwise. The reaction was stirred for 0.5 h at room temperature, and the completion of the reaction was checked by TLC. The reaction mixture was then neutralized with 2 M aqueous HCl. The product was obtained as a white precipitate after filtration using a Büchner funnel. The product was washed with water and dried in a vacuum desiccator to give 2.6 g of product as a white solid, yield 70%. 1H NMR (500 MHz, DMSO-d6): δH 12.84 (s, 1H), 11.60 (s, 1H), 8.58-8.56 (s, 1H), 7.43-7.33 (m, 10H), 7.18-7.17 (s, 1H), 5.21 (s, 2H), 5.02 (s, 2H), 4.30 (s, 2H), 3.57-3.54 (m, 2H), 3.48-3.45 (m, 2H), 1.77 (s, 3H); 13C NMR (100 MHz, DMSO-d6): δC 170.27, 163.60, 162.96, 155.71, 152.52, 142.97, 136.82, 135.16, 128.54, 128.49, 128.35, 128.11, 127.84, 100.92, 67.59, 66.42, 49.63 (two methylene carbons are merged into the DMSO-d6 residual peak), 12.68; HRMS (EI): m/z: calculated for C26H28N6O7 [M+H]+ 537.2098, found: 537.2102.

[00082] Ethyl 2-(2-(4-((2-diCbz-guanidino)ethylamino)thyminyl)-N-(2-t-Boc-aminoethyl)acetamido)acetate (7). DIPEA (2.8 mL, 16.2 mmol) and EDC·HCl (1.2 g, 6.1 mmol) were added to an ice-cold solution of compound 6 (2.2 g, 4.1 mmol) dissolved in DMF (50 mL) under N2. The acid was activated by stirring at room temperature for 15 min. Ethyl N-(2-Boc-aminoethyl)glycinate (1.7 g, 4.1 mmol) was added to the above reaction mixture. The reaction was allowed to stir for another 12 h at room temperature. The reaction mixture was extracted with 2% MeOH in ethyl acetate (100 mL x 2). The combined organic phase was washed with saturated aqueous KHSO4 (50 mL) and brine (50 mL), and dried over MgSO4. The product was purified by flash chromatography to afford 2.1 g of product 7 as a yellowish solid, yield 70%. 1H NMR (500 MHz, CDCl3; this compound exists as two rotamers; chemical shifts for the minor rotamer are given in brackets): δH 11.72 (s, 1H), 8.70-8.67 (m, 1H), 7.36-7.32 (m, 10H), 6.93 (6.95) (s, 1H), 6.75 (6.79) (s, 1H), 5.74-5.72 (m, 1H), 5.18 (s, 2H), 5.09 (s, 2H), 4.60 (s, 1H), 4.31 (4.42) (s, 1H), 4.19-4.14 (4.24-4.20) (q, 2H), 4.02 (s, 1H), 3.69 (m, 2H), 3.62 (m, 2H), 3.56-3.53 (3.50-3.48) (t, 2H), 3.32-3.31 (3.24-3.23) (m, 2H), 1.64 (1.66) (s, 3H), 1.41 (1.42) (s, 9H), 1.26-1.24 (1.31-1.28) (t, 3H); 13C NMR (100 MHz, CDCl3): δC 169.54 (168.18), 163.05 (163.54), 157.43, 156.73, 155.83, 153.21, 141.89, 135.94, 134.24, 128.56, 128.42, 128.30, 128.23, 127.97, 102.27, 79.27 (78.96), 68.13, 67.02, 61.16 (61.71), 53.29, 50.59, 48.78, 48.48, 48.30, 42.92, 39.42 (38.41), 28.15 (28.13), 13.83, 12.53; HRMS (EI): m/z: calculated for C37H48N8O10 [M+H]+ 765.3572, found: 765.3569.

[00083] 2-(2-(4-(2-diCbz-guanidino)ethylamino)thyminyl-N-(2-t-Boc-aminoethyl)acetamido)acetic acid (8). Compound 7 (1.5 g, 1.9 mmol) was dissolved in THF (10 mL) and stirred in an ice bath. Aqueous LiOH (1 M, 15 mL) was added dropwise and the mixture was stirred at room temperature for 0.5 h. The crude mixture was neutralized with 2 M aqueous HCl. The white precipitate obtained was filtered using a Büchner funnel, washed with water and left in a desiccator under vacuum to give 0.87 g of product as a white powder, yield 65%. 1H NMR (300 MHz, MeOH-d4; this compound exists as two rotamers; chemical shifts for the well-separated minor rotamer are given in brackets): δH 7.42-7.29 (m, 10H), 7.20 (7.16) (s, 1H), 5.22 (s, 1H), 5.10-5.07 (m, 2H), 4.70 (4.47) (s, 2H), 4.09 (4.26) (s, 2H), 3.80-3.16 (m, 8H), 1.82 (1.90) (s, 3H), 1.44 (1.43) (s, 9H); 13C NMR (100 MHz, MeOH-d4): δC 170.66 (170.15), 165.73 (164.91), 159.29, (158.54) 158.34, 154.69, (144.66) 144.56, 138.26, 136.72, 129.83, 129.72, 129.64, 129.45, 129.25, 104.74, 80.62 (80.29), 69.48 (68.55), (50.84) 50.74, 42.00, 40.91, 39.68 (39.32), (30.87) 28.90, (14.58) 13.15; HRMS (EI): m/z: calculated for C35H44N8O10 [M+H]+ 737.3259, found: 737.3277.
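The calculated HRMS values reported for compounds 4 to 8 can be reproduced from the molecular formulae. The sketch below is illustrative only: it sums standard monoisotopic atomic masses for the neutral formula plus one hydrogen atom, without subtracting the electron mass, because that convention is what matches the values quoted in the text. This is an inference from the numbers, not a statement of how the values were originally generated. The helper names `mh_mass` and `ppm_error` are hypothetical.

```python
# Monoisotopic masses (u) of the most abundant isotopes: 12C, 1H, 14N, 16O.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def mh_mass(formula):
    """Monoisotopic mass of the neutral formula plus one hydrogen atom.

    The electron mass is NOT subtracted (assumption chosen to match the text).
    """
    neutral = sum(MONOISOTOPIC[el] * n for el, n in formula.items())
    return neutral + MONOISOTOPIC["H"]


def ppm_error(found, calculated):
    """Mass error of the measured value relative to the calculated value, in ppm."""
    return (found - calculated) / calculated * 1e6


# label: (formula, calculated [M+H]+ as reported, found [M+H]+ as reported)
HRMS = {
    "4": ({"C": 16, "H": 26, "N": 4, "O": 5}, 355.1981, 355.1983),
    "5": ({"C": 28, "H": 32, "N": 6, "O": 7}, 565.2411, 565.2422),
    "6": ({"C": 26, "H": 28, "N": 6, "O": 7}, 537.2098, 537.2102),
    "7": ({"C": 37, "H": 48, "N": 8, "O": 10}, 765.3572, 765.3569),
    "8": ({"C": 35, "H": 44, "N": 8, "O": 10}, 737.3259, 737.3277),
}

for label, (formula, reported, found) in HRMS.items():
    calc = mh_mass(formula)
    print(f"compound {label}: calc {calc:.4f} (reported {reported:.4f}), "
          f"found {found:.4f}, error {ppm_error(found, calc):+.1f} ppm")
```

A check of this kind is also a convenient way to confirm a molecular formula that has been damaged in transcription, since only the correct element counts reproduce the reported calculated mass.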

Example 2: Synthesis of PNAs

[00084] The PNA monomers were purchased from ASM Research Chemicals. PNA oligomers were synthesized manually using Boc chemistry via a solid-phase peptide synthesis protocol (Dueholm, K. L. et al. J. Org. Chem. 1994, 59, 5767-5773). 4-Methylbenzhydrylamine hydrochloride (MBHA-HCl) polystyrene resins were used, and the oligomerization of PNA was monitored by the Kaiser test (Kaiser, E. et al. Anal. Biochem. 1970, 34, 595-599). Cleavage of the PNA oligomers was done using the trifluoroacetic acid (TFA) and trifluoromethanesulfonic acid (TFMSA) method, after which the oligomers were precipitated with diethyl ether, dissolved in water and purified by RP-HPLC. Matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) analysis was used to characterize the oligomers.

Example 3: PNAs comprising N4-(2-guanidoethyl)-5-methylcytosine and thio-pseudoisocytosine bases are able to bind specifically to RNA duplexes with C-G inversions

[00085] PNAs comprising N4-(2-guanidoethyl)-5-methylcytosine and thio-pseudoisocytosine bases were tested for binding to different RNAs. N4-(2-guanidoethyl)-5-methylcytosine is the base of the monomeric unit Q and thio-pseudoisocytosine is the base of the monomeric unit L. The non-denaturing PAGE results indicate that PNA P3 (NH2-Lys-TLTQTTTL-CONH2), which has both Q and L modifications, binds to rHP2, an RNA forming a duplex and comprising a C-G inversion, with high affinity (Kd = 4.4 ± 0.5 μM) to form a PNA-RNA2 triplex in a near-physiological buffer (200 mM NaCl, pH 7.5) (Figures 3G and 4B). Remarkably, PNA P3 shows no binding to other RNA hairpins with one C-G pair replaced by G-C (rHP1), A-U (rHP3), or U-A (rHP4) (Figure 4). In addition, PNA P3 does not show binding to the DNA hairpins with the sequence homologous to rHP2 (dHP2) or rHP1 (dHP1) (Figure 4), presumably because the major groove of a DNA duplex is not structurally compatible with short PNA binding, which is consistent with previous studies (Devi, G. et al. Nucleic Acids Res. 2014, 42, 4008-4018; Li, M. et al. J. Am. Chem. Soc. 2010, 132, 8676-8681; Zengeya, T. et al. Angew. Chem. Int. Ed. Engl. 2012, 51, 12593-12596). Thus, the Q residue in a PNA is highly sequence specific in forming a Q-C-G base triple in a PNA-RNA2 triplex. Substitution of the single Q residue in P3 with a C residue (P4, NH2-Lys-TLTCTTTL-CONH2) results in weak binding (Kd > 50 μM) to rHP2 (Figure 8h-j), suggesting that the guanidine group in the Q residue is critical for the recognition of the C-G pair in rHP2. It is noted that the guanidine group of arginine is often utilized in proteins for the sequence-specific recognition of the Hoogsteen edge of a G base with two hydrogen bond acceptors (carbonyl and N7).
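To give a sense of scale for the dissociation constants quoted above, the fraction of RNA hairpin bound at a given PNA concentration can be estimated with a simple 1:1 binding isotherm. This is a minimal sketch under stated assumptions: the fitting model used for the PAGE data is not given in this passage, the PNA is assumed to be in large excess over the RNA so that free and total PNA concentrations are approximately equal, and the concentrations and the function name `fraction_bound` are purely illustrative.

```python
def fraction_bound(pna_uM, kd_uM):
    """Fraction of RNA in the triplex for a 1:1 isotherm: [PNA] / (Kd + [PNA]).

    Assumes PNA is in large excess over RNA (free PNA ~ total PNA).
    """
    if pna_uM < 0 or kd_uM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return pna_uM / (kd_uM + pna_uM)


KD_P3_RHP2 = 4.4        # uM, P3 binding to rHP2 (value from the text)
KD_P4_RHP2_MIN = 50.0   # uM, lower bound for P4 binding to rHP2 (text: Kd > 50 uM)

# Illustrative PNA concentrations (uM); these are not taken from the specification.
for conc in (1.0, 4.4, 10.0, 50.0):
    p3 = fraction_bound(conc, KD_P3_RHP2)
    p4_max = fraction_bound(conc, KD_P4_RHP2_MIN)
    print(f"[PNA] = {conc:5.1f} uM: P3 bound = {p3:.2f}, P4 bound < {p4_max:.2f}")
```

Because the Kd for P4 is only bounded from below, the value computed for P4 is an upper limit on the bound fraction at each concentration.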

[00086] Previous computational modeling studies have indicated that the two-carbon linker and the guanidine group are orientated to form a total of four hydrogen bonds with the major-groove edge of a C-G pair (Figure 2C) (Semenyuk, A. et al. Biochemistry 2010, 49, 7867-7878). Consistently, previous experimental studies show that alkylation of the amine of a C base causes destabilization of a Watson-Crick C-G pair, due to the blocking of its own Watson-Crick edge by the alkyl group (Engel, J. D. and von Hippel, P. H. Biochemistry 1974, 13, 4143-4158; Micura, R. et al. Nucleic Acids Res. 2001, 29, 3997-4005). Presumably, the 5-methyl group in the Q base repels the carbon linker and favors the orientation with the linker towards its own Watson-Crick edge. This would destabilize a Watson-Crick-like Q-G pair, owing to the steric clash caused by the linker, but favor Q-C-G base triple formation (Figure 2C). Consistently, the UV-absorbance-detected thermal melting results reveal no appreciable binding between the PNAs containing L and/or Q residues and single-stranded RNAs. It has been demonstrated previously (Devi, G. et al. Nucleic Acids Res. 2014, 42, 4008-4018) that short PNAs incorporating L residues show no significant binding to single-stranded RNAs or DNAs.

Example 4: Variation of pH condition and magnesium ion concentration has only mild effects on the binding of Q comprising PNAs to RNA duplexes with C-G inversions

[00087] Varying the pH and magnesium ion concentration causes no significant changes in the binding of P5 (NH2-Lys-TLTLTTTL-CONH2) to rHP1 (Figures 6k-m and 7e-g), consistent with the fact that the L base (with an increased pKa compared to the C base) can form a stable L-G-C base triple with minimal pH dependence (Devi, G. et al. Nucleic Acids Res. 2014, 42, 4008-4018) and that P5 has only two positive charges due to the presence of an N-terminal lysine residue. Consistently, upon lowering the pH to 6.0, the binding affinity of P3 to rHP2 was moderately enhanced (Kd = 1.2 ± 0.2 μM, Figures 8e and 9b). The addition of 2 mM magnesium ion slightly weakens the binding affinity of P3 to rHP2 (Kd = 7.1 ± 2.2 μM, Figures 8g and 9d). Thus, the attached guanidine group in P3 stabilizes the PNA-RNA2 triplex formation mainly through sequence-specific Q-C-G base triple formation, and not through non-specific ionic interaction, as is the case when the guanidine group is attached to the PNA backbone (Gupta, P. et al. Biochemistry 2012, 51, 63-73; Lu, X.W. et al. Org. Lett. 2009, 11, 2329-2332). Substitution of the two L residues in P3 with two C residues (P2, NH2-Lys-TCTQTTTC-CONH2) results in no binding to rHP2 at pH 7.5 or 6.0 (Figure 8a-b), further underscoring the stabilizing effect of the L modification on the PNA-RNA2 triplex formation. P1 (NH2-Lys-TCTCTTTC-CONH2) and P5 show binding to rHP1, but no binding to rHP2, consistent with previous results. Taken together, the results show that L and Q modifications in PNAs facilitate enhanced sequence-specific recognition of G-C and C-G pairs, respectively, in RNA duplexes with minimal binding to single-stranded RNAs or DNA duplexes, at near-physiological conditions. It is noted that the RNA duplex-binding triplex-forming PNAs are complementary to the pyrrole-imidazole polyamides, which selectively and sequence specifically bind to the minor groove of DNA duplexes (Chenoweth, D.M. et al. Angew. Chem. Int. Ed. Engl. 2013, 52, 415-418).
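The description of these effects as moderate or slight can be put on an energy scale by converting the ratio of two dissociation constants into a difference in binding free energy, using ΔΔG = RT ln(Kd,2 / Kd,1). The sketch below is illustrative and assumes a temperature of 298.15 K, which is not stated in this passage; the function name `ddg_kj_per_mol` is hypothetical, and the quoted uncertainties on the Kd values are not propagated.

```python
import math

R_KJ = 8.314462618e-3   # gas constant, kJ/(mol K)
T_ASSUMED = 298.15      # K; assumption, the temperature is not given in the text


def ddg_kj_per_mol(kd_ref_uM, kd_new_uM, temperature=T_ASSUMED):
    """Change in binding free energy (kJ/mol) on going from kd_ref to kd_new.

    Negative values mean tighter binding under the new condition.
    """
    return R_KJ * temperature * math.log(kd_new_uM / kd_ref_uM)


KD_PH75 = 4.4   # uM, P3 with rHP2 at pH 7.5, 200 mM NaCl
KD_PH60 = 1.2   # uM, P3 with rHP2 at pH 6.0
KD_MG = 7.1     # uM, P3 with rHP2 with 2 mM magnesium ion

ddg_ph = ddg_kj_per_mol(KD_PH75, KD_PH60)
ddg_mg = ddg_kj_per_mol(KD_PH75, KD_MG)

print(f"pH 7.5 -> 6.0: ddG = {ddg_ph:+.1f} kJ/mol")
print(f"+ 2 mM Mg2+  : ddG = {ddg_mg:+.1f} kJ/mol")
```

Under these assumptions both shifts amount to only a few kJ/mol, which is in line with the text's characterization of the pH and magnesium effects on P3 binding as mild.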
Example 5: The stabilization effect of Q is affected by its position in the PNA

[00088] The binding of the HIV-1 frameshift-inducing RNA hairpin structure to PNAs with the Q residue incorporated at the terminal end was also tested (Figure 3M). At 200 mM NaCl, pH 7.5, PNA P6 (NH2-Lys-LLTTLLQ-CONH2) binds to the HIV-1 frameshift-inducing RNA hairpin (Kd = 1.7 ± 0.7 μM, Figure 4G, Figure 9e), which is comparable to the PNA without the Q residue (NH2-Lys-LLTTLL-CONH2, Kd = 1.1 ± 0.3 μM), and PNA P7 (NH2-Lys-LLLTTLLQ-CONH2, Kd = 2.4 ± 1.0 μM) (Figure 4H, Figure 9f). Thus, the stabilization effect of a Q residue at the terminal end is not as significant as when it is in the middle of a PNA, which is similar to the L residue (Devi, G. et al. Nucleic Acids Res. 2014, 42, 4008-4018).

Example 6: PNAs containing Q enter HeLa cells without the application of transfection assistance

[00089] Without the aid of transfection agents, oligonucleotides and PNAs are not cell-permeable. The cellular uptake of a DNA duplex-binding TFO with a 2'-OMe or 2'-AE RNA backbone containing a Q base is facilitated by electroporation (Semenyuk, A. et al. Biochemistry 2010, 49, 7867-7878). It was tested whether combining the guanidine group and the relatively hydrophobic regions of the PNA bases and backbones may facilitate the penetration (Zhou, P. et al. J. Am. Chem. Soc. 2003, 125, 6878-6879; Jain, D. R. et al. J. Org. Chem. 2014, 79, 9567-9577; Patil, K. M. et al. J. Am. Chem. Soc. 2012, 134, 7196-7199) of the Q-modified PNAs through the cell membrane. Confocal microscopy studies show that a PNA incorporating three Q residues and labeled with Cy3 dye is taken up into the cytoplasm of HeLa cells without applying any transfection agent or other transfection methods (Figure 5).

[00090] The invention has been described broadly and generically herein. Each of the narrower species and sub-generic groupings falling within the generic disclosure also form part of the invention. This includes the generic description of the invention with a proviso or negative limitation removing any subject-matter from the genus, regardless of whether or not the excised material is specifically recited herein. Other embodiments are within the following claims. In addition, where features or aspects of the invention are described in terms of Markush groups, those skilled in the art will recognize that the invention is also thereby described in terms of any individual member or subgroup of members of the Markush group.

[00091] One skilled in the art would readily appreciate that the present invention is well adapted to carry out the objects and obtain the ends and advantages mentioned, as well as those inherent therein. Further, it will be readily apparent to one skilled in the art that varying substitutions and modifications may be made to the invention disclosed herein without departing from the scope and spirit of the invention. The compositions, methods, procedures, treatments, molecules and specific compounds described herein are presently representative of preferred embodiments, are exemplary, and are not intended as limitations on the scope of the invention. Changes therein and other uses will occur to those skilled in the art which are encompassed within the spirit of the invention as defined by the scope of the claims. The listing or discussion of a previously published document in this specification should not necessarily be taken as an acknowledgement that the document is part of the state of the art or is common general knowledge.

[00092] The invention illustratively described herein may suitably be practiced in the absence of any element or elements, limitation or limitations, not specifically disclosed herein. Thus, for example, the terms "comprising", "including", "containing", etc. shall be read expansively and without limitation. The word "comprise" or variations such as "comprises" or "comprising" will accordingly be understood to imply the inclusion of a stated integer or groups of integers but not the exclusion of any other integer or group of integers. Additionally, the terms and expressions employed herein have been used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed. Thus, it should be understood that although the present invention has been specifically disclosed by exemplary embodiments and optional features, modification and variation of the inventions embodied herein may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this invention.

[00093] The content of all documents and patent documents cited herein is incorporated by reference in its entirety.