Title:
MEANS AND METHODS FOR THE SELECTIVE STABILIZATION OF RNA VIA CAS5 AND CAS7 PROTEINS
Document Type and Number:
WIPO Patent Application WO/2019/025444
Kind Code:
A1
Abstract:
The present invention relates to a method for producing an encased ribonucleic acid molecule, comprising (a) introducing a heterologous fusion construct into a host cell, said fusion construct encoding in expressible form a ribonucleic acid molecule comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein, (b) introducing one or more nucleic acid molecules into said host cell, said one or more nucleic acid molecules encoding in expressible form a Cas5 protein and a Cas7 protein, and (c) culturing the host cell obtained in step (a) and (b) under conditions wherein the expressed Cas5 binds to the tag and the expressed Cas7 encases the ribonucleic acid molecule; or (a') introducing a heterologous fusion construct into a host cell, said fusion construct encoding in expressible form a ribonucleic acid molecule comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein, said host cell comprising one or more nucleic acid molecules, said one or more nucleic acid molecules encoding in expressible form a Cas5 protein and a Cas7 protein, and (b') culturing the host cell obtained in step (a') under conditions wherein the expressed Cas5 binds to the tag and the expressed Cas7 encases the ribonucleic acid molecule; or (a") introducing one or more nucleic acid molecules into a host cell, said one or more nucleic acid molecules encoding in expressible form a Cas5 protein and a Cas7 protein, said host cell comprising a heterologous fusion construct, said fusion construct encoding in expressible form a ribonucleic acid molecule comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein, and (b") culturing the host cell obtained in step (a") under conditions wherein the expressed Cas5 binds to the tag and the expressed Cas7 encases the ribonucleic acid molecule.

Inventors:
GLEDITZSCH DANIEL (DE)
RANDAU LENNART (DE)
Application Number:
PCT/EP2018/070739
Publication Date:
February 07, 2019
Filing Date:
July 31, 2018
Assignee:
MAX PLANCK GESELLSCHAFT (DE)
International Classes:
C12N15/10
Other References:
DWARAKANATH SRIVATSA ET AL: "Interference activity of a minimal Type I CRISPR-Cas system from Shewanella putrefaciens", October 2015, NUCLEIC ACIDS RESEARCH, VOL. 43, NR. 18, PAGE(S) 8913-8923, ISSN: 0305-1048(print), XP002776612
MAIER LISA-KATHARINA ET AL: "The Adaptive Immune System of Haloferax volcanii", March 2015, LIFE-BASEL, VOL. 5, NR. 1, PAGE(S) 521-537, ISSN: 2075-1729(print), XP002776613
BRENDEL JUTTA ET AL: "A Complex of Cas Proteins 5, 6, and 7 Is Required for the Biogenesis and Stability of Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-derived RNAs (crRNAs) in Haloferax volcanii", March 2014, JOURNAL OF BIOLOGICAL CHEMISTRY, VOL. 289, NR. 10, PAGE(S) 7164-7177, ISSN: 0021-9258(print), XP002776614
LI MING ET AL: "Characterization of CRISPR RNA Biogenesis and Cas6 Cleavage-Mediated Inhibition of a Provirus in the Haloarchaeon Haloferax mediterranei", February 2013, JOURNAL OF BACTERIOLOGY, VOL. 195, NR. 4, PAGE(S) 867-875, ISSN: 0021-9193(print), XP002776615
PENG WENFANG ET AL: "Genetic determinants of PAM-dependent DNA targeting and pre-crRNA processing in Sulfolobus islandicus", May 2013, RNA BIOLOGY, VOL. 10, NR. 5, SP. ISS. SI, PAGE(S) 738-748, ISSN: 1547-6286(print), XP002776616
REEKS JUDITH ET AL: "Structure of the archaeal Cascade subunit Csa5: Relating the small subunits of CRISPR effector complexes", May 2013, RNA BIOLOGY, VOL. 10, NR. 5, SP. ISS. SI, PAGE(S) 762-769, ISSN: 1547-6286(print), XP002776617
CASS SIMON D B ET AL: "The role of Cas8 in type I CRISPR interference", June 2015, BIOSCIENCE REPORTS, VOL. 35, NR. PART 3, PAGE(S) ARTICLE NO.: E00197, ISSN: 0144-8463(print), XP002776618
REEKS JUDITH ET AL: "CRISPR interference: a structural perspective", July 2013, BIOCHEMICAL JOURNAL, VOL. 453, NR. PART 2, PAGE(S) 155-166, ISSN: 0264-6021(print), XP002776619
FABRE ET AL., EUROPEAN JOURNAL OF HUMAN GENETICS, vol. 22, 2013, pages 379 - 385
MAKAROVA ET AL., BIOL DIRECT., vol. 6, 2011, pages 38
MAKAROVA ET AL., NATURE REVIEWS MICROBIOLOGY, vol. 13, 2015, pages 722 - 736
SODING ET AL., NUCLEIC ACIDS RESEARCH, vol. 33, 2005, pages W244 - W248
STEPHEN F. ALTSCHUL; THOMAS L. MADDEN; ALEJANDRO A. SCHAFFER; JINGHUI ZHANG; ZHENG ZHANG; WEBB MILLER; DAVID J. LIPMAN: "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", NUCLEIC ACIDS RES., vol. 25, 1997, pages 3389 - 3402, XP002905950, DOI: doi:10.1093/nar/25.17.3389
SAMBROOK; RUSSELL: "Molecular Cloning: A Laboratory Manual", 2001, COLD SPRING HARBOR LABORATORY PRESS
Attorney, Agent or Firm:
VOSSIUS & PARTNER PATENTANWÄLTE RECHTSANWÄLTE MBB (DE)
Claims:
CLAIMS

1. A method for producing an encased ribonucleic acid molecule, comprising

(a) introducing a heterologous fusion construct into a host cell, said fusion construct encoding in expressible form a ribonucleic acid molecule comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein,

(b) introducing one or more nucleic acid molecules into said host cell, said one or more nucleic acid molecules encoding in expressible form a Cas5 protein and a Cas7 protein, and

(c) culturing the host cell obtained in step (a) and (b) under conditions wherein the expressed Cas5 binds to the tag and the expressed Cas7 encases the ribonucleic acid molecule; or

(a') introducing a heterologous fusion construct into a host cell, said fusion construct encoding in expressible form a ribonucleic acid molecule comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein, said host cell comprising one or more nucleic acid molecules, said one or more nucleic acid molecules encoding in expressible form a Cas5 protein and a Cas7 protein, and

(b') culturing the host cell obtained in step (a') under conditions wherein the expressed Cas5 binds to the tag and the expressed Cas7 encases the ribonucleic acid molecule; or

(a") introducing one or more nucleic acid molecules into a host cell, said one or more nucleic acid molecules encoding in expressible form a Cas5 protein and a Cas7 protein, said host cell comprising a heterologous fusion construct, said fusion construct encoding in expressible form a ribonucleic acid molecule comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein, and

(b") culturing the host cell obtained in step (a") under conditions wherein the expressed Cas5 binds to the tag and the expressed Cas7 encases the ribonucleic acid molecule.

2. The method of claim 1, wherein

the Cas5 protein comprises an amino acid sequence being selected from any one of SEQ ID NOs 1 to 73 or a sequence being at least 70%, preferably at least 80%, and most preferably at least 90% identical thereto, and/or

the Cas7 protein comprises an amino acid sequence being selected from any one of SEQ ID NOs 74 to 148 or a sequence being at least 70%, preferably at least 80%, and most preferably at least 90% identical thereto.

3. The method of claim 2, wherein

the Cas5 protein comprises the amino acid sequence of SEQ ID NO: 1 or a sequence being at least 70%, preferably at least 80%, and most preferably at least 90% identical thereto,

the Cas7 protein comprises the amino acid sequence of SEQ ID NO: 74 or a sequence being at least 70%, preferably at least 80%, and most preferably at least 90% identical thereto, and

the heterologous tag has the nucleotide sequence of 5'-CUUAGAAA-3'.

4. The method of any one of claims 1 to 3, wherein the tag further comprises at its 5'-end a ribonucleic acid sequence that can be cleaved off by an enzyme and/or ribozyme, and the one or more nucleic acid molecules further encode in expressible form the enzyme and/or ribozyme being capable of cleaving off the ribonucleic acid sequence from the 5'-end of the tag.

5. The method of claim 4, wherein the ribonucleic acid sequence that can be cleaved off by an enzyme and/or ribozyme is a repeat sequence, and wherein the enzyme and/or ribozyme is a Cas6 protein.

6. The method of any one of claims 1 to 5, wherein the ribonucleic acid molecule is expressed from a vector and the one or more nucleic acid molecules are expressed from one or more different vectors.

7. The method of any one of claims 1 to 6, wherein the expression from the heterologous fusion construct and/or the expression from the one or more nucleic acid molecules is/are under the control of the same or different inducible promoter(s).

8. The method of claim 7, wherein the expression from the heterologous fusion construct is initiated before the expression from the one or more nucleic acid molecules, or vice versa.

9. The method of any one of claims 1 to 8, wherein the expression from the heterologous fusion construct and/or the expression from the one or more nucleic acid molecules is/are under the control of two or more promoters giving rise to different expression yields.

10. The method of any one of claims 1 to 9, wherein the protein Cas5, Cas7 and/or Cas6 comprise(s) an affinity purification tag or a fluorescent tag.

11. The method of any one of claims 1 to 10, wherein at least two distinct types of Cas7 proteins are expressed, said at least two distinct types of Cas7 proteins are distinguished in that they are fused to different proteins.

12. The fusion construct or the heterologous ribonucleic acid molecule as defined in any one of the preceding claims.

13. A vector comprising the fusion construct as defined in any one of the preceding claims.

14. A host cell comprising the fusion construct or the heterologous ribonucleic acid molecule of claim 12 or the vector of claim 13.

15. A kit comprising the fusion construct or the heterologous ribonucleic acid molecule of claim 12, the vector of claim 13, and/or the host cell of claim 14.

Description:
Means and methods for the selective stabilization of RNA via Cas5 and Cas7 proteins

The present invention relates to a method for producing an encased ribonucleic acid molecule, comprising (a) introducing a heterologous fusion construct into a host cell, said fusion construct encoding in expressible form a ribonucleic acid molecule comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein, (b) introducing one or more nucleic acid molecules into said host cell, said one or more nucleic acid molecules encoding in expressible form a Cas5 protein and a Cas7 protein, and (c) culturing the host cell obtained in step (a) and (b) under conditions wherein the expressed Cas5 binds to the tag and the expressed Cas7 encases the ribonucleic acid molecule; or (a') introducing a heterologous fusion construct into a host cell, said fusion construct encoding in expressible form a ribonucleic acid molecule comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein, said host cell comprising one or more nucleic acid molecules, said one or more nucleic acid molecules encoding in expressible form a Cas5 protein and a Cas7 protein, and (b') culturing the host cell obtained in step (a') under conditions wherein the expressed Cas5 binds to the tag and the expressed Cas7 encases the ribonucleic acid molecule; or (a") introducing one or more nucleic acid molecules into a host cell, said one or more nucleic acid molecules encoding in expressible form a Cas5 protein and a Cas7 protein, said host cell comprising a heterologous fusion construct, said fusion construct encoding in expressible form a ribonucleic acid molecule comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein, and (b") culturing the host cell obtained in step (a") under conditions wherein the expressed Cas5 binds to the tag and the expressed Cas7 encases the ribonucleic acid molecule.

In this specification, a number of documents including patent applications and manufacturer's manuals are cited. The disclosure of these documents, while not considered relevant for the patentability of this invention, is herewith incorporated by reference in its entirety. More specifically, all referenced documents are incorporated by reference to the same extent as if each individual document was specifically and individually indicated to be incorporated by reference.

RNA is a tool used in many fields, from molecular and cellular biology to medicine and nanotechnology. For instance, RNA populations (whole or subsets) are quantitatively analyzed for studying cellular functions, elucidating normal or pathological mechanisms or seeking molecular signatures for diagnostic purposes (microarrays, RNAseq (also called whole transcriptome shotgun sequencing)). In addition, specific RNA species are used as clinical standards or as agents for controlling gene expression (siRNA, ribozymes) (Fabre et al. (2013), European Journal of Human Genetics (2014) 22, 379-385).

For most of these uses, integrity of RNA is required and must be maintained during storage. However, RNA molecules can be affected by multiple degradation reactions. First, RNA molecules are very sensitive to oxidation by reactive oxygen species. In vivo, they are produced by respiration. Outside the cell, they can be generated by mechanisms generally involving metallic ions. Oxidation could also result from attacks by ozone, an atmospheric pollutant that rapidly reacts with RNA either in solution or in the solid state. Degradation can also occur through the activity of some metallic complexes catalyzing the hydrolytic cleavage of the phosphodiester bond or by contaminating nucleases.

The main degradative event is the spontaneous cleavage of the phosphodiester linkage through transesterification resulting from a nucleophilic attack of the phosphorus atom by the neighboring 2'-OH. A large variety of agents such as specific acids and bases as well as Brønsted acids and bases acting as catalysts can be involved. RNase A and some ribozymes share this mechanism. Water is involved, for instance, by providing hydroxyl or hydronium ions or by allowing proton transfer. As expected, dehydration of RNA strongly inhibits its degradation. However, partial rehydration by atmospheric water restores the initial instability while still in the solid state.

Another characteristic of the reaction is that it is highly dependent on the geometry of the molecule. Indeed, in the transition state, the 2' oxygen, the phosphorus and a negatively charged oxygen of the phosphate group must be 'in line'. This structural requirement leads up to 10000-fold rate variations depending on the local secondary and tertiary structures of the molecule.

In order to prevent degradation, RNA samples are generally stored frozen at -20 °C or -80 °C or under liquid nitrogen. However, even at a low temperature, RNA retains some reactivity. It has been shown, for instance, that ribonucleases are still active at -20 °C on frozen RNA. In addition, the activity of some ribozymes is still significant at -70 °C and can even be enhanced by freezing. More importantly, the increase in the number of samples that need to be stored by up to hundreds of thousands in biorepositories, biobanks and biological resource centers leads to problems of space, costs, maintenance and security. Shipping of RNA samples, usually done in dry ice, is costly and can be challenging for air transportation (e.g. in view of regulations, limited weight, or long distance travel).

Hence, there is an ongoing need for means and methods allowing an effective stabilization of RNA molecules against degradation, in particular means and methods allowing RNA molecules to be stored at room temperature and shipped.

The present application therefore relates in a first aspect to a method for producing an encased ribonucleic acid molecule, comprising (a) introducing a heterologous fusion construct into a host cell, said fusion construct encoding in expressible form a ribonucleic acid molecule comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein, (b) introducing one or more nucleic acid molecules into said host cell, said one or more nucleic acid molecules encoding in expressible form a Cas5 protein and a Cas7 protein, and (c) culturing the host cell obtained in step (a) and (b) under conditions wherein the expressed Cas5 binds to the tag and the expressed Cas7 encases the ribonucleic acid molecule; or (a') introducing a heterologous fusion construct into a host cell, said fusion construct encoding in expressible form a ribonucleic acid molecule comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein, said host cell comprising one or more nucleic acid molecules, said one or more nucleic acid molecules encoding in expressible form a Cas5 protein and a Cas7 protein, and (b') culturing the host cell obtained in step (a') under conditions wherein the expressed Cas5 binds to the tag and the expressed Cas7 encases the ribonucleic acid molecule; or (a") introducing one or more nucleic acid molecules into a host cell, said one or more nucleic acid molecules encoding in expressible form a Cas5 protein and a Cas7 protein, said host cell comprising a heterologous fusion construct, said fusion construct encoding in expressible form a ribonucleic acid molecule comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein, and (b") culturing the host cell obtained in step (a") under conditions wherein the expressed Cas5 binds to the tag and the expressed Cas7 encases the ribonucleic acid molecule.

The term "RNA molecule" designates a linear molecule composed of four types of smaller molecules called ribonucleotide bases: adenine (A), cytosine (C), guanine (G), and uracil (U). It is understood that the term "RNA" as used herein comprises all forms of RNA including mRNA, ncRNA (non-coding RNA), tRNA and rRNA. The term "non-coding RNA" includes siRNA (small interfering RNA), miRNA (micro RNA), rasiRNA (repeat associated RNA), snoRNA (small nucleolar RNA), and snRNA (small nuclear RNA). The RNA is preferably an mRNA. Organisms use mRNA to convey genetic information that directs synthesis of specific proteins.

As will be further detailed herein below, an encased ribonucleic acid molecule is in accordance with the present invention a ribonucleic acid molecule having a Cas5 protein bound on its 5' end and having bound one or more Cas7 proteins downstream (i.e. in direction of the RNA 3' terminus) of the Cas5 protein along the entire length of the ribonucleic acid molecule. Thereby the Cas5 protein and the one or more Cas7 proteins cover the ribonucleic acid molecule resulting in a ribonucleoprotein (RNP).

A "heterologous fusion construct" as used herein defines the fusion of the DNA molecule encoding the ribonucleic acid molecule of the invention to a DNA molecule encoding a heterologous ribonucleic acid tag consisting of about 8 nucleotides and being recognized by a Cas5 protein. Hence, the about 8 nucleotides of the tag are also ribonucleic acid nucleotides. The tag-encoding DNA molecule is fused directly to the 5'-end of the ribonucleic acid molecule-encoding DNA molecule. Hence, the heterologous fusion construct is a DNA molecule. A DNA molecule designates a linear molecule composed of four types of smaller molecules called nucleotide bases: adenine (A), cytosine (C), guanine (G), and thymine (T). The 5' carbon of ribonucleic acid molecules and nucleic acid molecules has a phosphate group attached to it and the 3' carbon a hydroxyl (-OH) group.
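The relationship between the tag, the DNA fusion construct and the transcribed RNA can be illustrated with a short string-level sketch. It is not a cloning protocol. It assumes the tag sequence 5'-CUUAGAAA-3' recited in claim 3 and the length window of 7 to 10 nucleotides given for "about 8 nucleotides"; the payload sequence and all function names are hypothetical and serve only as illustration.

```python
# Illustrative model only: sequences are plain strings, not a cloning protocol.
TAG_RNA = "CUUAGAAA"  # tag recited in claim 3 (5'-CUUAGAAA-3')


def rna_to_dna(rna: str) -> str:
    """Return the DNA coding-strand sequence for an RNA (U -> T)."""
    return rna.upper().replace("U", "T")


def transcribe(dna: str) -> str:
    """Return the RNA read from a DNA coding strand (T -> U)."""
    return dna.upper().replace("T", "U")


def build_fusion_construct(payload_dna: str, tag_rna: str = TAG_RNA) -> str:
    """Fuse the tag-encoding DNA directly to the 5'-end of the payload DNA."""
    if not 7 <= len(tag_rna) <= 10:  # "about 8 nucleotides" = 7 to 10
        raise ValueError("tag must be 7 to 10 nucleotides long")
    return rna_to_dna(tag_rna) + payload_dna.upper()


payload = "ATGGCTAGCAAAGGA"  # hypothetical payload, for illustration only
construct = build_fusion_construct(payload)
transcript = transcribe(construct)
print(construct)
print(transcript)
```

In this model the construct is DNA and the tag only becomes a ribonucleic acid sequence after transcription, which mirrors the distinction drawn in the definition.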

The term "heterologous" with respect to the fusion construct means that the fusion construct is experimentally put into a host cell that does not normally produce (i.e., express) the ribonucleic acid molecule comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein being encoded by the fusion construct. The term "heterologous" with respect to the tag consisting of about 8 nucleotides and being recognized by a Cas5 protein means that within the heterologous fusion construct the tag-encoding DNA molecule is not fused to a DNA molecule consisting of or comprising the DNA sequence encoding the spacer sequence of a naturally occurring crRNA (CRISPR RNA) or a fragment of the spacer sequence of a naturally occurring crRNA (CRISPR RNA) that can be bound by one or more Cas7 proteins. For instance, within the naturally occurring type I-Fv Cascade from S. putrefaciens the spacer sequence of the crRNA is 32 nucleotides long and is bound by 6 copies of Cas7fv. A shortened version of the spacer sequence of 14 nucleotides is bound by 3 copies of Cas7fv. One Cas7fv subunit covers a 6 nt segment of the crRNA molecule, so that up to five nucleotides at the 3' end might stay uncased.

The term "about 8 nucleotides" preferably means 7 to 10 nucleotides. The tag most preferably consists of 8 nucleotides.

The CRISPR-Cas system is an adaptive immunity system present in most archaea and many bacteria. For this reason the Cas proteins (including Cas5, Cas6 and Cas7), although sharing structural similarities allowing their unambiguous identification, and the tag being recognized by Cas5 are fast evolving. The fast evolution of the CRISPR-Cas system is essential for its function as an adaptive immunity system. Immune systems need to adapt quickly and efficiently to new viral challenges. For this reason the exact sequence of the about 8 nucleotides of the tag is not fixed and differs among the CRISPR-Cas systems from species to species and may even differ within a species from strain to strain. For this reason it is preferred that the Cas proteins (Cas5, Cas7 and optionally Cas6) and the about 8 nucleotides from the crRNA to be used in accordance with the invention are derived from the same species and more preferably from the same strain.
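The Cas7fv figures quoted above for the type I-Fv Cascade can be checked with simple arithmetic. The sketch below assumes nothing beyond a fixed footprint of 6 nt per Cas7 subunit; the function name and the returned keys are hypothetical. Rounding up reproduces the reported copy numbers (6 copies for 32 nt, 3 copies for 14 nt), whereas counting only whole footprints leaves a remainder of at most five nucleotides at the 3' end.

```python
import math


def cas7_coverage(rna_length: int, footprint: int = 6) -> dict:
    """Estimate Cas7 occupancy for an RNA segment of the given length.

    Assumption: every Cas7 subunit covers exactly `footprint` nucleotides.
    """
    if rna_length < 0 or footprint <= 0:
        raise ValueError("rna_length must be >= 0 and footprint > 0")
    whole = rna_length // footprint  # subunits that sit on a full footprint
    return {
        "whole_footprints": whole,
        # nucleotides left over at the 3' end (always < footprint)
        "uncovered_3prime_nt": rna_length - whole * footprint,
        # copies needed if the last partial segment is also bound
        "copies_rounded_up": math.ceil(rna_length / footprint),
    }


for length in (32, 14):
    print(length, cas7_coverage(length))
```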

The "one or more nucleic acid molecules encoding in expressible form a Cas5 protein and a Cas7 protein" as used herein defines one or more molecules comprising the DNA sequences encoding a Cas5 protein and a Cas7 protein in expressible form. It is preferred that one nucleic acid molecule encodes in expressible form a Cas5 protein and a Cas7 protein, since this requires only introducing one nucleic acid molecule into the host. In case one nucleic acid molecule is used the expression of Cas5 and Cas7 may be under the control of different promoters or under the control of one promoter. In the latter case a polycistronic mRNA might be expressed from the nucleic acid molecule that is processed and translated into Cas5 and Cas7 proteins.

Moreover, the one or more nucleic acid molecules and the heterologous fusion construct (e.g. a vector) may also be comprised in one or more expression constructs. For instance, one nucleic acid molecule encoding in expressible form a Cas5 protein and a Cas7 protein and the heterologous fusion construct may be within one expression construct. Also, for example, one nucleic acid molecule encoding in expressible form a Cas5 protein may be within a first expression construct, one nucleic acid molecule encoding in expressible form a Cas7 protein may be within a second expression construct and the heterologous fusion construct may be within a third expression construct. However, it is preferred that the nucleic acid molecule encoding in expressible form a Cas5 protein and a Cas7 protein is within a first expression construct and the heterologous fusion construct is within a second expression construct. Within all described options it is preferred that the expression of the Cas5 and Cas7 proteins is under the control of a different promoter than the expression of the RNA molecule being encoded by the heterologous fusion construct. It is more preferred that the expression of Cas5 protein is under the control of a first promoter, the expression of Cas7 protein is under the control of a second promoter and the expression of the RNA molecule is under the control of a third promoter, said promoters being distinct from each other. As will be further detailed herein below, the use of different inducible promoters and/or different promoters giving rise to different expression yields allows for separately orchestrating the timing and/or amounts of the Cas proteins, e.g., by selecting inducible promoters and promoters providing different expression yields (i.e. weak or strong promoters).

The term "expressible form" with respect to the heterologous fusion construct requires that the fusion construct harbours the nucleic acid molecule encoding the ribonucleic acid molecule comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides in a form that is transcribed into RNA in the host cell. The term "expressible form" with respect to the one or more nucleic acid molecules encoding a Cas5 protein and a Cas7 protein requires that the fusion construct harbours the nucleic acid molecule encoding the Cas5 and Cas7 proteins in a form that is transcribed into RNA and translated into the Cas5 and Cas7 proteins in the host cell.

The host cells are preferably isolated host cells, meaning that the cells are not within the context of a living multicellular organism. The host may be any prokaryotic or eukaryotic cell. A suitable eukaryotic host may be a mammalian cell, an amphibian cell, a fish cell, an insect cell, a fungal cell or a plant cell. A eukaryotic cell may be an insect cell such as a Spodoptera frugiperda cell, a yeast cell such as a Saccharomyces cerevisiae or Pichia pastoris cell, a fungal cell such as an Aspergillus cell or a vertebrate cell. In the latter regard, it is preferred that the cell is a mammalian cell such as for instance a human cell, a hamster cell or a monkey cell. The cell may be a part of a cell line. The host cell is preferably a bacterial host cell, more preferably a Gram-negative bacterium, even more preferably a proteobacterium and most preferably an E. coli cell (e.g., E. coli strains HB101, DH5α, XL1 Blue, Y1090 and JM101).

Means and methods for introducing the heterologous fusion construct and/or the one or more nucleic acid molecules into the host cell are known in the art. Non-limiting examples are viral transfection, electroporation, lipofection and microinjection. Suitable conditions for culturing a host cell are well known to the person skilled in the art. For example, suitable conditions for culturing bacteria include growth under aeration in Lysogenic Broth (LB) medium. To increase the yield and the solubility of the expression product, the medium can be buffered or supplemented with suitable additives known to enhance or facilitate growth. For example, Escherichia coli can be cultured from 4 to about 37 °C, the exact temperature or sequence of temperatures depending on the exact molecule to be overexpressed. The skilled person is also aware of all these conditions and may further adapt these conditions to the needs of a particular host species and the requirements of the RNA molecules and Cas proteins to be produced.

The term "under conditions wherein the expressed Cas5 binds to the tag" requires that the tag forms the 5'-end of the RNA molecule, so that it is free to be bound by Cas5. This may be achieved either by placing no sequence 5' of the tag so that the tag forms the 5'-end of the RNA molecule, or by processing the RNA molecule so that the tag becomes the 5'-end of the RNA molecule. Means and methods for processing the RNA molecule (e.g. by enzymes and ribozymes) accordingly will be discussed herein below.
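The two routes to a free tag at the 5'-end can be pictured with a string-level sketch. It only models the outcome (the tag ends up as the first nucleotides of the RNA) and says nothing about how an enzyme or ribozyme recognizes its cleavage site. The tag is the one recited in claim 3; the leader sequence, the payload and the function names are hypothetical.

```python
TAG = "CUUAGAAA"  # tag recited in claim 3


def tag_is_five_prime_end(rna: str, tag: str = TAG) -> bool:
    """True if the tag already forms the 5'-end of the RNA."""
    return rna.upper().startswith(tag)


def trim_to_tag(rna: str, tag: str = TAG) -> str:
    """Model 5' processing: drop everything upstream of the first tag."""
    rna = rna.upper()
    start = rna.find(tag)
    if start < 0:
        raise ValueError("tag not found in RNA")
    return rna[start:]


direct = TAG + "AUGGCU"                 # route 1: nothing placed 5' of the tag
precursor = "GGGAUCC" + TAG + "AUGGCU"  # route 2: hypothetical 5' leader
processed = trim_to_tag(precursor)
print(tag_is_five_prime_end(direct), tag_is_five_prime_end(precursor))
print(processed)
```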

The CRISPR-Cas adaptive immunity systems are present in most archaea and many bacteria. They function by incorporating fragments of alien genomes into specific genomic loci, transcribing the inserts and using the transcripts as guide RNAs to destroy the genome of the cognate virus or plasmid. This RNA interference-like immune response is mediated by numerous, diverse and rapidly evolving Cas (CRISPR-associated) proteins, several of which form the Cascade complex involved in the processing of CRISPR transcripts and cleavage of the target DNA (Makarova et al. (2011), Biol Direct.; 6:38).

The evolution of CRISPR-cas loci involves rapid changes, in particular numerous rearrangements of the locus architecture and horizontal transfer of complete loci or individual modules. Despite these dynamics, the analysis of signature protein families and features of the architecture of cas loci allows CRISPR-cas loci to be unambiguously partitioned into two distinct classes, five types and 16 subtypes (Makarova et al. (2015), Nature Reviews Microbiology 13, 722-736).

Class 1 CRISPR-Cas systems are defined by the presence of a multisubunit crRNA-effector complex whereas Class 2 CRISPR-Cas systems are defined by the presence of a single subunit crRNA-effector module. Class 1 includes type I, type III and type IV. All type I loci contain the signature gene cas3 (or its variant cas3'), which encodes a single-stranded DNA (ssDNA)-stimulated superfamily 2 helicase with a demonstrated capacity to unwind double-stranded DNA (dsDNA) and RNA-DNA duplexes. Often, the helicase domain is fused to a HD family endonuclease domain that is involved in the cleavage of the target DNA. The HD domain is typically located at the amino terminus of Cas3 proteins (with the exception of subtype I-U and several subtype I-A systems, in which the HD domain is at the carboxyl terminus of Cas3) or is encoded by a separate gene (cas3'') that is usually adjacent to cas3'. Type I systems employ a multi-subunit Cascade (CRISPR-associated complex for antiviral defence) complex to facilitate duplex formation between a CRISPR RNA (crRNA) spacer and the complementary target DNA strand protospacer. The Cas proteins to be used in accordance with the present invention, i.e. Cas5, Cas7 and optionally also Cas6, are class 1, type I Cas proteins.

For the identification of distinct Cas5, Cas6 and Cas7 proteins, respectively, HHpred may be used, for example. HHpred is an interactive server for protein homology detection and structure prediction (Soding et al. (2005), Nucleic Acids Research, 33:W244-W248). The Cas5 proteins are unified by sequence similarity and the presence of a C-terminal domain downstream of a G-rich loop. Cas5 proteins consist of two distinct subgroups one of which contains two RNA Recognition Motif (RRM) domains and the other one contains only one RRM domain. Most Cas6 proteins encompass two well-defined RRM domains which are connected by a "flange" in the extended conformation and have a glycine-rich loop upstream of the last strand of the second RRM fold domain. The Cas6 proteins contain a typical N-terminal RRM domain and a distinct C-terminal domain that displays certain topological features reminiscent of the RRM and contains a C-terminal G-rich loop. All Cas7 proteins contain a single RRM domain with additional elaborations (Makarova et al. (2011), Biol Direct.; 6:38).

As mentioned, a common structural feature among the Cas proteins found in crRNA effector complexes is the RNA recognition motif (RRM), a nucleic acid-binding domain that is the core fold of the extremely diverse RAMP protein superfamily. The RAMPs Cas5 and Cas7 comprise the skeleton of the crRNA effector complexes. In type I systems, Cas6 is generally the active endonuclease that is responsible for crRNA processing, and Cas5 and Cas7 are non-catalytic RNA-binding proteins; however, in type I-C systems, crRNA processing is also catalysed by Cas5. In addition to common structural features, certain consensus sequences shared by Cas proteins can also be established. Accordingly, the Cas5 protein as used herein preferably comprises or consists of one of the consensus sequences of formula (I) including sequences sharing at least 85% identity, formula (II) including sequences sharing at least 85% identity, and formula (III) including sequences sharing at least 70% identity.

pXo-ThlllXo-iplplpsANAhssXo^hohGhPuhsthhGh.aulpRplttsXo^hXo- ihpuhhXo-ilhhHtXo-iphphXo- !ttXo-nhXo^hXo-aptsXo^pXo^ptXo-gSshXo-ipcschHhploLllthXo-spX o-ispXo-ihXo-shXo-itXo- 5+IAGGplhXo-22tXo-8tXo-ihh+hhPuaslhttptXo-ihXo-ithXo-iottphh sAhhphX 0- itlpXo-33ualsshXo.

!hGattlpXo-ihXo-sGphXo-!ptcpXo-ithshtasEslhslscahXo^hclps Xo-iPphhWXo^pXo^tXo^lhXo-ihXo-

(Formula I) hXo^lXo-iGphAhFTpPXo-ihKhE+Xo-iSYXo-ilXo-iTXo-ipAh+Glh-ulaaK PthhXo-zlptltVhpXo.

^phhXo^thssXo-shhhhLpDVtYXo-ilcAchXo-ihsXo-iptpcXo-atKaXo -ithhpRphcpGtXo-ihptXo.

^LGsREhXo-iuXo-ihtXo^pXo-ittXo-esXo-sphplGhMhashsaXo-iPtX o-itXo^shhapshhppGhlthsXo- 2 pttthht

(Formula II) htttlphclhG-hAhFTcPthKIXo- 2 ERISYsVXo-iTXo-ipAh+Glh-AlaWKPslpWhlcclpVL+PlphpoXo- 13 psXo-itshptstsssLhhhXo. 12 hhL+DVuYhlcA+hX 0 . 1 hst+tXo- 24 DcpssKahshhpRphc+GtX 0 _ ihppshLGsREhXo-iuhhtXo-ihsXo-ispXo-sshttphcLGhMlashcassshsXo -igstFa+AhhcsGhlphPtXo- !psptlhtXo-se

(Formula III)

Similarly, the Cas6 protein as used herein preferably comprises or consists of one of the consensus sequences of formula (IV) including sequences sharing at least 90% identity, formula (V) including sequences sharing at least 85% identity, formula (VI) including sequences sharing at least 90% identity, formula (VII), formula (VIII) including sequences sharing at least 90% identity, formula (IX) including sequences sharing at least 85% identity, and formula (X) including sequences sharing at least 85% identity.

X 0 - 2 h Xo^hsushssuhsHhh X 0 -ihGLu X 0 _ihhpp X 0 - 3 tt X 0 - 2 hhha X 0 - 2 t X 0 - 2 sh X 0 - 3 p X 0 -i-t X 0 - 4 phut X 0 -il X 0 - 2 hsppht X 0 - 5 hp Χ 0-1 ρ X 0 - 2 hs X 0 - 9 t X 0 - 2 hu X 0 -ihSPRht X 0 - 2 ss X 0 . 6 W X 0 -ithp X 0 . 2 RpthlDth X 0 .ip Xo- 2 D X 0 - 5 hh Xo-isLGpsuYW X 0 - 2 pttt Xo- ! h X 0 -iP X 0 - 5 uAShWEMhsRstGpEFI X 0 - 2 +Lh X 0 _ ! hst X 0 -ils X 0 -ihpstthhsGI X 0 -iG X 0 - 4 D X 0 -ihstp X 0 - 2 p X 0 -ihssoGhpsPt X 0 -iTD X 0 -iuhAhhuhhGhu X 0 -ihs X 0 - 3 h X 0 - 6 s x 0 -issuuh X 0 . 2 t X 0-6 s X 0 - 2 hllPI X 0 _ hp X 0 - 2 hhpslhhS X 0 . 2 h Χ 0 8 ρ Χ0-9Ρ X0-2L X 0 - ! t X 0 -iGh X 0 -ihhh X 0 -iF X 0 - 3 hststo X 0 _iS X 0 -i X 0-1 h X 0 _ipGphh X 0-1 h X 0 _ 7

(Formula IV) Xo-4hXo-i!p hphphXo-iSstXo-3tshXo-iphRGahAt+aspXo. 3 hHNHX 0 .2PS+Xo-2YtYP lQYKslpthsX 0 . !llulsEuXo-ipllhtlhXo^pXo-icXo-ilphXo-!ttXo-ihplXo-iptthhhcp pXo-ihthocphXo-aYXo^

!FhoPWLslppENXo-ipKaXo-!tXo-apXo^EpXo-rhLp+hLhGNILSMuKhhs apVXo-iCXo-iicsXo-ilslcXo- !hpXo-ipaKshXo-!hhuFhGtFhsNFXo-!lPcahGIGKusuhGhGolpclXo-is

(Formula V)

Xo^hXo-ihXo-!hhXo-ihXo-ihXo^tXo-i Xo-iLsXo-sGXo-sShhRGhhXo-sthXo-iphhCXo^tXo-ststXo-iCXo- !hXo-iCXo-ihhXo-ilhtXo^ttXo-iShPahhcPPXo^stXo-shXo^upthphthh LhGXo-iuXo-!tXo^hhXo-ishXo^ 2 hX 0-1 tXo-4GIGXo-8ttuXo-ihXo-ilXo-ithX 0 _ 8 tX 0 .3lhXo-2 6 tXo-3StXo-ihtlphXo-ioPhRhhXo. 2 tXo-3tXo-33hXo- 4lhXo- 2 hhXo-i+hXo-2lXo-3ahtXo.7pXo-itXo-ithXo-3sXo-ithphXo -1 ttphtXo-ihphX 0 _ 2 hpXo-itptpXo. 1 hXo- !GXo-ihXo-ihtGXo-asXo-rhhXo-ihhhhsthXo-!HhG+tsshGXo-iGhhphXo -^

(Formula VI)

Xo-4 Xo-i hXo- 2 hXo-i hXo-6tXo-i hXo-i IXo-4UXo-8Sh h RXo- 2 hXo- 2 thX 0 -i tXo- 2 sXo-7sXo- 2 CXo-5CXo-i Xo- 2ihtXo- 2 5ttXo-iShPaXo-ihiX 0 - 5 4ttXo-i Xo_ihthhlhGXo-isXo- 6 hXo- 1 shXo- 20 iXo-itXo-ihXo-ilXo-2hXo^ 3lXo-32tXo-i Xo-ilXo-i Xo-isPhphXo-3tX 0 .3tXo-38hhXo- 2 hXo- 2 +hX 0 . 2 hXo-3aXo-i3thXo-3sXo-it pXo- 2 ttphX 0 -i 1 ptX 0 _369hXo-i GXo-1 hXo-i httXo-io Xo-shXo-1 hsXo-sp G+tsshGXo^ GhhphX 0 - 26

(Formula VII)

Xo-sRhXo-ilXo-ihXo-iolshpatXo^ltuXo-ilhpthXo-spXo-iPhttXo ^hHpXo-sspXo^+hasaSXo-ilXo-gtXo- 43tXo-i hhhhsoXo-4hhtX 0 _ 1 hhXo- 1 thXo. 2 pXo- 2 hXo- 1 iXo-itXo-6hXo-ihtthphXo. 2 tXo- 4 tXo-49hXo- 1 hhSPIX^ gtsXo-sohXo-isXo-!ppXo-iaXo-zhlhXo-ishXo-it+YXo^hhttXo-iop o^hXo^hpXo^tXo-ip o-ipXo-ih o- ! hXo-iXo-sTstXo-ihXo-zsXo-shXo^hphXo-ihpXo^thhphhhXo-!s GhGtpssXo-!GaGhhtXo^

(Formula VIII)

XMhahohlXo^l Xo-ittXo-iettXo-shHphlhp^

^Xo-^tXo^hXo^P+ o-ih o-^Xo-ihtXo-^^

1 tpXo-ittXo-iUhXo-27hXo- 8 hX 0 . 1 thX 0 _3thX 0 - 8 tttpXo-3hthXo- 2 sX 0 . 1 hpGXo- 1 IXo- ltcXo- 2 hhhtXo. 1 X^ !pGhGXo-iU+uhGhGhhXo-iltXo-i Xo-7

(Formula IX) hsaYh-lphhPXo^pXo-rhXo^lhtthhtXo-iLHphLhthXo-!ttplGISFPthsXo -itXo-^LGthlR!aupttthptLXo !tXo^hhtthXo-asYhXo-ihsXo-!htsVPtXo^spahXo-!hpRhpXo^pXo^sXo- it+hcXo-i+hlp+tXo-!hXo-it- ph+tXo-ihXo-shtXo-ipthtXo-iPahplhSXo-iSstQXo-ihahhhlpXo^tXo- ihXo^pshXo-iGXo- !FstaGLStssslPhaXo-4

(Formula X)

The Cas7 protein as used herein preferably comprises or consists of one of the consensus sequences of formula (XI) including sequences sharing at least 85% identity, formula (XII) including sequences sharing at least 85% identity, formula (XIII) including sequences sharing at least 70% identity, and formula (XIV) including sequences sharing at least 85% identity.

Xo^phsusLuap+plXo^SDuhhhttshtpppXo-stXo^slhlpEKuVRGTISNRh KXo.

sShttDPstLssplppsNLXo^QpVDsssLsXo-!StDTLchXo-iFol+VLushtp PssCNsXo- ^aptphhthlptYhpppuhpXo^LAtRYAtNIANuRaLWRNRhuAEtlplplpXo-spXo - shXatFsuXthshppFXo-etpsstclptlutXdltpuLsupXo-itXo-ihhhplpshs phGXo-ituQ-VaPSpELILDpXo- spspKS+hLYtlXo-uPthAuhHSQKIGNAIRTIDsWYP-Xo-!tXo-etPIAIEPYGuV TsXo-iGhuaRpPpXo- 8 pKXo-iDFYohhDsWhhtsXo-ihsX 0 . 3 IEppHalhusLIRGGVFGpts

(Formula XI)

X 0 .iphsusLuap+plXo-2SDuhhhttshtpppXo- 3 tXo-2slhlpEKuVRGTISNRhKXo.

5ShttDPstLssplppsNLX 0 . 4 QpVDsssLsXo-2StDTLchX 0 _i Fol+VLushtpPssCNsX 0 .

itaptphhthlptYhpppuhpXo-TLAtRYAtNIANuRaLWRNRhuAEtlplplpXo -spXo^hXo-iatFsuXo- !thshppFXo-etpsstclptlutXo-i ltpuLsupXo^tXo-!hhhplpshsphGXo-ituQ-VaPSpELILDpXo- 5 pspKS+hLYtlXo-i4PthAuhHSQKIGNAIRTIDsWYP-Xo-itXo- 6 tPIAIEPYGuVTsXo-iGhuaRpPpXo. 8 pKX 0 -iDFYohhDsWhhtsXo-ihsXo- 3 IEppHalhusLIRGGVFGpts

(Formula XII) h-hhhhhpVpXo- ! uNsNGDPXo-iStNhPRXo^sXo-ipXo-iSXo^sXo-!hoDVtlKRplRshhX o^sSlahptX, !thXo-epXo-stXo^p o-ittXo^tXo-i htphhDIRXo-!FGtlhsXo-!tXo^htXo-rthtlpGPIphthupSlpXo-!lXo- 3 phphTtXo-ihssXo-2oPPohtXo-icahlsaulhhhhuXo-ihsXo-iphAt pTGhXo- 2 opcDhphhhpslXo- phhcpX 0 -itosu+Xo-4sphXo- 1 hp X 0 -ilhhh

(Formula XIII) hhX0-1 hhhX0-1hXo-1htsXo-1 NX0-2ttX0.1sX0-12NhsthpXo-5hX0-1ttX0-1o3hshhSspXo.1 htahhXo.3hXo. i27tXo-5PXo-3t-lhGahXo-isXo-itXo-3ipRXo-iuXo-ihtXo-isXo-ihls hXo-i23tXo-5 Xo-4ittXo- 3 athEhX 0 .

itshhX 0 -i hXo-i hX 0 -i lsX 0 -itX 0 -i luX 0 -i 2 /pXo-i pRhX 0 .3lXo-iSlXo-i 5 spXo-i hX 0 - 2 htX 0 -i

(Formula XIV)

Within the above consensus sequences for Cas5, Cas6 and Cas7 proteins, the recited symbols have the following meanings:

X => any amino acid { A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y }

A => A { A }

C => C { C }

D => D { D }

E => E { E }

F => F { F }

G => G { G }

H => H { H }

I => I { I }

K => K { K }

L => L { L }

M => M { M }

N => N { N }

P => P { P }

Q => Q { Q }

R => R { R }

S => S { S }

T => T { T }

V => V { V }

W => W { W }

Y => Y { Y }

alcohol => o { S, T }

aliphatic => l { I, L, V }

aromatic => a { F, H, W, Y }

charged => c { D, E, H, K, R }

hydrophobic => h { A, C, F, G, H, I, K, L, M, R, T, V, W, Y }

negative => - { D, E }

polar => p { C, D, E, H, K, N, Q, R, S, T }

positive => + { H, K, R }

small => s { A, C, D, G, N, P, S, T, V }

tiny => u { A, G, S }

turnlike => t { A, C, D, E, G, H, K, N, Q, R, S, T }
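The symbol key can be illustrated with a minimal Python sketch that tests whether a peptide is compatible, position by position, with a short consensus string. The sketch assumes that the aliphatic class is written as lowercase "l" and the negative class as "-", and it does not handle the variable-length "X0-n" spacers of the formulas. The names `CLASSES`, `allowed` and `matches` are hypothetical and serve illustration only.

```python
# Residue classes taken from the symbol key (assumed symbols: "l" = aliphatic, "-" = negative).
ALL_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

CLASSES = {
    "X": ALL_RESIDUES,           # any amino acid
    "o": set("ST"),              # alcohol
    "l": set("ILV"),             # aliphatic
    "a": set("FHWY"),            # aromatic
    "c": set("DEHKR"),           # charged
    "h": set("ACFGHIKLMRTVWY"),  # hydrophobic
    "-": set("DE"),              # negative
    "p": set("CDEHKNQRST"),      # polar
    "+": set("HKR"),             # positive
    "s": set("ACDGNPSTV"),       # small
    "u": set("AGS"),             # tiny
    "t": set("ACDEGHKNQRST"),    # turnlike
}


def allowed(symbol):
    """Return the set of residues permitted by one consensus symbol."""
    if symbol in CLASSES:
        return CLASSES[symbol]
    if symbol in ALL_RESIDUES:
        # An uppercase one-letter code (other than X) stands for itself.
        return {symbol}
    raise ValueError(f"unknown consensus symbol: {symbol!r}")


def matches(consensus, peptide):
    """True if every residue of the peptide is permitted at its position."""
    if len(consensus) != len(peptide):
        return False
    return all(res in allowed(sym) for sym, res in zip(consensus, peptide))


if __name__ == "__main__":
    print(matches("+lAGG", "KIAGG"))  # positive, aliphatic, then A, G, G
    print(matches("-", "K"))          # lysine is not a negative residue
```

A full comparison against formulas (I) to (XIV) would additionally have to expand the spacers and tolerate the stated identity thresholds, which this sketch deliberately leaves out.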

Means and methods for determining sequence identity will be discussed herein below. With regard to the above sequence identities of the consensus sequences, it is preferred with increasing preference that the sequence identity is at least 80%, at least 85%, at least 90%, at least 95%, at least 98% and at least 99%.

Type I systems are currently divided into seven subtypes, I-A to I-F and I-U. The type I-C, I-D, I-E and I-F CRISPR-Cas systems are typically encoded by a single operon that encompasses the cas1, cas2 and cas3 genes together with the genes for the subunits of the Cascade complex, i.e. inter alia Cas5, Cas6 and Cas7. Type I-E and I-F compare well in their structure and composition. With respect to class 1, type I-F it is of note that type I-F encompasses the variant subtype I-Fv. This subtype was initially identified in several beta- and gamma-proteobacteria and generally relies on a minimal set of five Cas proteins. Type I-Fv systems contain Cas1, the integrase that mediates spacer acquisition, Cas3, the target nuclease, and Cas6, the crRNA endonuclease. The type I-Fv Cascade of S. putrefaciens consists of Cas5fv, the crRNA endonuclease Cas6 and the backbone protein Cas7fv. The Cas proteins from S. putrefaciens used in the examples herein below are class 1, type I-Fv Cas proteins. The Cas proteins to be used in accordance with the present invention are therefore preferably class 1, type I-C, I-D, I-E or I-F Cas proteins, more preferably class 1, type I-E or I-F Cas proteins and most preferably class 1, type I-Fv Cas proteins. Class 1, type I-Fv Cas5, Cas6 and Cas7 proteins are also referred to herein as Cas5fv, Cas6f and Cas7fv, respectively.

As is shown in the examples herein below, it was surprisingly found that the two Cas proteins Cas5 and Cas7 alone are sufficient to specifically encase any desired RNA molecule. In more detail, it was found that fusing the tag of about 8 nucleotides, which is present at the 5'-end of naturally occurring crRNA upstream of the spacer sequence, to an RNA molecule is sufficient to confer specific binding of Cas5 to any RNA molecule tagged accordingly. Hence, Cas5 supplies specificity for RNA target selection. Cas5fv interacts with Cas7fv and initiates the formation of a Cas7fv backbone that spans the entire length of the RNA molecule downstream of the tag. It was also unexpectedly found that Cas7fv binds the RNA sequence downstream of the about 8 nt tag, which is specifically recognized by Cas5, in a sequence- and structure-independent fashion. Cas5 interacts with Cas7 and initiates the formation of a backbone consisting of one or more Cas7 proteins that span the entire length of the RNA sequence downstream (i.e. 3') of the tag of about 8 nt. Recombinant Cas5 and Cas7 form stable dimers in solution, which prevents unspecific filament formation by Cas7. Hence, Cas5 can specifically recognize an RNA tag sequence and Cas7 can form a protective filament around the RNA molecule fused downstream of the tag. In summary, it was found that any desired RNA sequence can be fused at its 5'-end to a tag being specifically recognized by Cas5, and upon contacting said fusion construct with Cas5 and Cas7 the specific binding of Cas5 to the tag initiates the encasing of the RNA molecule by Cas7. Cas5 and Cas7 form a filament around the RNA molecule. This process is also referred to herein as RNA-CASing.

Thus, Cas5, Cas7 and the tag being recognized by Cas5 provide all the components necessary for a novel specific RNA stabilization process. The present invention advantageously allows for the generation of stabilized RNA molecules of flexible length and sequence that can be handled, stored and shipped at ambient room temperatures. As shown in the examples herein below, RNA molecules being encased by Cas5 and Cas7 are protected from chemical and enzymatic ribolysis.

Accordingly, Cas5 proteins are functionally characterized herein by their capability of specifically recognizing the 5'-terminal tag of about 8 nucleotides of a crRNA. Cas7 proteins are functionally characterized by their capability of forming extended filaments with RNA molecules, each of said RNA molecules comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein.

The encased RNA molecules as produced in accordance with the present invention can inter alia be used in the following applications: (i) mRNA can be specifically silenced by Cas protein induction. The timing of Cas protein production determines the timing of mRNA silencing. This process is stringent and not reversible. (ii) RNAs being toxic for the host cell can be produced in RNA-CASing complexes. The RNA transcripts can be generated in vivo and are shielded by RNA-CASing from contacts with other proteins within the host cell. (iii) Selective purification of tagged RNA molecules is possible. Transcripts are produced in vivo and bound by Cas5fv. Cas5fv can be subjected to affinity purification and co-purified RNAs are isolated. (iv) Purified RNA molecules can be handled, stored and shipped in RNA-CASing at ambient temperatures. The RNA molecules are protected from enzymatic and/or chemical degradation. (v) Cas5fv can be fused with fluorescent proteins and tagged RNAs can be localized in the cell. (vi) The specificity of the targeting reaction can be adjusted by modulating the ratios of RNA:Cas5fv:Cas7fv concentrations and by coordinated timing of RNA and Cas protein production. (vii) Proteins can be fused with Cas7fv and lined along the Cas7fv filament. This set-up could be used for compartmentalization of proteins that are required to interact in metabolic pathways. (viii) The stabilized RNA might not be fully encased and segments of up to five nt are available for interactions with RNA or ssDNA targets. Thus, regulatory RNA functions that rely on base pairing can be maintained in the RNA-CASing.

In accordance with a preferred embodiment of the invention, the Cas5 protein comprises an amino acid sequence being selected from any one of SEQ ID NOs 1 to 73 or a sequence being at least 70%, preferably at least 80%, and most preferably at least 90% identical thereto, and/or the Cas7 protein comprises an amino acid sequence being selected from any one of SEQ ID NOs 74 to 148 or a sequence being at least 70%, preferably at least 80%, and most preferably at least 90% identical thereto. SEQ ID NOs 1 to 73 are Cas5fv proteins and SEQ ID NOs 74 to 148 are Cas7fv proteins. Hence, all of SEQ ID NOs 1 to 148 are class 1, type I-Fv proteins. It is preferred that the Cas5fv protein or a sequence being at least 70% identical thereto and the Cas7fv protein or a sequence being at least 70% identical thereto are selected from the same species. As is evident from the appended sequence listing, for example, SEQ ID NOs 1 and 74 are from Shewanella putrefaciens, so that one preferred option is that the Cas5 protein comprises an amino acid sequence of SEQ ID NO: 1 or a sequence being at least 70% identical thereto and the Cas7 protein comprises an amino acid sequence of SEQ ID NO: 74 or a sequence being at least 70% identical thereto.

Also encompassed by the embodiments of the present invention are sequences being at least 70% identical to one or more of SEQ ID NOs 1 to 148. In accordance with the present invention, the term "percent (%) sequence identity" describes the number of matches ("hits") of identical nucleotides/amino acids of two or more aligned nucleic acid or amino acid sequences as compared to the number of nucleotides or amino acid residues making up the overall length of the template nucleic acid or amino acid sequences. In other terms, using an alignment, for two or more sequences or subsequences the percentage of amino acid residues or nucleotides that are the same (e.g. 70%, 80%, 85%, 90% or 95% identity) may be determined, when the (sub)sequences are compared and aligned for maximum correspondence over a window of comparison, or over a designated region as measured using a sequence comparison algorithm as known in the art, or when manually aligned and visually inspected. The sequences which are compared to determine sequence identity may thus differ by substitution(s), addition(s) or deletion(s) of nucleotides or amino acids. This definition also applies to the complement of a test sequence.
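The definition of percent sequence identity given above can be sketched in a few lines of Python. The sketch assumes that the two sequences have already been aligned by a separate program and are supplied as equal-length strings with "-" marking gaps; the function name `percent_identity` is hypothetical, and the alignment step itself (e.g. by BLAST) is not reproduced.

```python
def percent_identity(aligned_query, aligned_template, gap="-"):
    """Identical positions as a percentage of the template's ungapped length.

    Both inputs are rows of one pairwise alignment (equal length, gaps as "-").
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")

    # Count columns where both rows carry the same residue (gaps never match).
    hits = sum(
        1
        for q, t in zip(aligned_query, aligned_template)
        if q == t and q != gap
    )

    # The denominator is the overall length of the template without gap symbols.
    template_length = sum(1 for t in aligned_template if t != gap)
    if template_length == 0:
        raise ValueError("template contains no residues")

    return 100.0 * hits / template_length


if __name__ == "__main__":
    # One mismatch in an 8-residue template: 7 of 8 positions are identical.
    print(percent_identity("CUUAGAAG", "CUUAGAAA"))
    # One deletion in the query relative to a 5-residue template.
    print(percent_identity("ACD-F", "ACDEF"))
```

Other conventions for the denominator exist (for example the alignment length or the shorter sequence); the template length is used here because it is the one named in the definition.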

The skilled person is also aware of suitable programs to align nucleic acid sequences. The percentage sequence identity of polypeptide sequences can, for example, be determined with programmes such as the above-explained programmes CLUSTALW, FASTA and BLAST. Preferably the BLAST programme is used, namely the NCBI BLAST algorithm (Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402). With regard to the sequence identity as recited herein, it is preferred with increasing preference that the sequence identity is at least 80%, at least 85%, at least 90%, at least 95%, at least 98% and at least 99%. With regard to the molecules sharing at least 70% sequence identity with a Cas5 protein it is to be understood that these molecules retain the capability of Cas5 of specifically recognizing the 5'-terminal tag of about 8 nucleotides of a crRNA. Similarly, the molecules sharing at least 70% sequence identity with a Cas7 protein retain the capability of Cas7 of forming extended filaments with RNA molecules, each of said RNA molecules comprising at its 5'-end a heterologous tag consisting of about 8 nucleotides and being recognized by a Cas5 protein.

In accordance with a more preferred embodiment of the invention, the Cas5 protein comprises the amino acid sequence of SEQ ID NO: 1 or a sequence being at least 70%, preferably at least 80%, and most preferably at least 90% identical thereto, the Cas7 protein comprises the amino acid sequence of SEQ ID NO: 74 or a sequence being at least 70%, preferably at least 80%, and most preferably at least 90% identical thereto, and the heterologous tag has the nucleotide sequence of 5'-CUUAGAAA-3'.

SEQ ID NOs 1 and 74 are the Cas5fv and Cas7fv proteins from S. putrefaciens that are used in the examples herein below. 5'-CUUAGAAA-3' is the sequence of the 8-nucleotide tag that is recognized by a Cas5 protein. With regard to the molecules sharing at least 90% identity with the tag sequence of 5'-CUUAGAAA-3' it is to be understood that these tag sequences are still specifically recognized by a Cas5 protein.

In accordance with a preferred embodiment of the invention, the tag further comprises at its 5'-end a ribonucleic acid sequence that can be cleaved off by an enzyme and/or ribozyme, and the one or more nucleic acid molecules further encode in expressible form the enzyme and/or ribozyme being capable of cleaving off the ribonucleic acid sequence from the 5'-end of the tag. Enzymes accelerate chemical reactions. The molecules upon which enzymes may act are called substrates and the enzyme converts the substrates into different molecules known as products. Enzymes are generally proteins. Ribozymes (or ribonucleic acid enzymes) are RNA molecules that are capable of catalyzing specific biochemical reactions, similar to the action of protein enzymes.

A tag further comprising at its 5'-end a ribonucleic acid sequence that can be cleaved off by an enzyme and/or ribozyme allows further control of the process of producing an encased ribonucleic acid molecule. This is because the enzyme and/or ribozyme being capable of cleaving off the ribonucleic acid sequence from the 5'-end of the tag can be placed, for example, under the control of an inducible promoter. Only once the RNA encoding the enzyme and/or ribozyme is expressed and, in the case of an enzyme, translated is the ribonucleic acid sequence cleaved off, thereby generating a 5'-end that can be bound by Cas5.

In accordance with a more preferred embodiment of the invention, the ribonucleic acid sequence that can be cleaved off by an enzyme and/or ribozyme is a repeat sequence, and the enzyme and/or ribozyme is a Cas6 protein.

As mentioned herein above, Cas6 is a crRNA endonuclease. In more detail, Cas6 is an endonuclease being capable of processing a "pre-crRNA" into the mature crRNA by cleaving off a ribonucleic acid sequence being located upstream of the about 8 nt tag being recognized by Cas5. The sequence being cleaved off by Cas6 is called repeat sequence in the art. Hence, the enzymatic cleavage by Cas6 generates the binding site for Cas5.
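The outcome of this processing step can be sketched in Python as a plain string operation, using the S. putrefaciens repeat sequence (SEQ ID NO: 249) and the 8 nt tag recited in this description. This is only a sequence-level illustration under the assumption that cleavage removes everything upstream of the tag; it does not model enzyme kinetics or RNA structure, and the function name `process_pre_rna` is hypothetical.

```python
# Sequences recited in the description for S. putrefaciens.
REPEAT_WITH_TAG = "GUUCACCGCCGCACAGGCGGCUUAGAAA"  # SEQ ID NO: 249
TAG = "CUUAGAAA"                                  # 8 nt tag recognized by Cas5

# The tag forms the 3'-terminal part of the repeat sequence.
assert REPEAT_WITH_TAG.endswith(TAG)


def process_pre_rna(pre_rna):
    """Return the RNA left after removal of the repeat portion upstream of the tag.

    Returns None if the repeat sequence is not found, i.e. there is no
    cleavage site in this simplified model.
    """
    start = pre_rna.find(REPEAT_WITH_TAG)
    if start == -1:
        return None
    # Everything upstream of the tag is removed; the tag becomes the 5'-end.
    tag_start = start + len(REPEAT_WITH_TAG) - len(TAG)
    return pre_rna[tag_start:]


if __name__ == "__main__":
    payload = "AUGGCUAGCAAAGGA"  # arbitrary example RNA, for illustration only
    mature = process_pre_rna(REPEAT_WITH_TAG + payload)
    print(mature)
    print(mature.startswith(TAG))
```

In this model the 20 nucleotides upstream of the tag are removed, so that the processed molecule starts with "CUUAGAAA" followed by the RNA to be encased.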

The Cas6 protein is preferably a Cas6 protein selected from SEQ ID NOs 149 to 248. SEQ ID NOs 149 to 248 are Cas6f. Hence, all of SEQ ID NOs 149 to 248 are class 1, type I-Fv proteins. Also encompassed by the embodiments of the present invention are sequences being at least 70% identical to one or more of SEQ ID NOs 149 to 248. It is preferred with increasing preference that the sequence identity is at least 80%, at least 85%, at least 90%, at least 95%, at least 98% and at least 99%. It is preferred that the Cas5fv protein or a sequence being at least 70% identical thereto, the Cas6f protein or a sequence being at least 70% identical thereto and the Cas7fv protein or a sequence being at least 70% identical thereto are selected from the same species. As is evident from the appended sequence listing, for example, SEQ ID NOs 1, 74 and 149 are from Shewanella putrefaciens, so that one preferred option is that the Cas5 protein comprises an amino acid sequence of SEQ ID NO: 1 or a sequence being at least 70% identical thereto, the Cas6 protein comprises an amino acid sequence of SEQ ID NO: 149 or a sequence being at least 70% identical thereto and the Cas7 protein comprises an amino acid sequence of SEQ ID NO: 74 or a sequence being at least 70% identical thereto.

With regard to the molecules sharing at least 70% sequence identity with a Cas6 protein it is to be understood that these molecules retain the capability of Cas6 of acting as an endonuclease being capable of processing a "pre-crRNA" into the mature crRNA by cleaving off a ribonucleic acid sequence being located upstream of the about 8 nt tag being recognized by Cas5. For instance, in S. putrefaciens the repeat sequence including the 8 nt tag sequence has the following sequence "GUUCACCGCCGCACAGGCGGCUUAGAAA" (SEQ ID NO: 249) and after the cleavage by Cas6f of S. putrefaciens the 8 nt tag "CUUAGAAA" forms the 5'-end. It is accordingly preferred that in case Cas6f of S. putrefaciens is used the 5'-end of the RNA molecule to be encased comprises or consists of "GUUCACCGCCGCACAGGCGGCUUAGAAA" (SEQ ID NO: 249), so that after the cleavage by Cas6f an RNA molecule having the 5'-end "CUUAGAAA" is obtained.

In accordance with another preferred embodiment of the invention, the ribonucleic acid molecule is expressed from a vector and the one or more nucleic acid molecules are expressed from one or more different vectors.

In accordance with this preferred embodiment it is possible to introduce the vectors encoding the RNA molecule and the vectors encoding the Cas proteins separately into a host cell. The vector backbones may be the same or different. It is preferred in this connection that the one or more nucleic acid molecules are expressed from one vector and the RNA molecule is expressed from another vector. The vectors may be of any conventional type that suits the expression of the molecules to be expressed, e.g., the Cas proteins and/or the RNA molecule to be encased, in the desired host cell. Hence, the vectors are expression vectors. Preferably, the vector is a plasmid, cosmid, virus, bacteriophage or another vector used conventionally e.g. in genetic engineering. The skilled person will be able to select those vectors from the art and confirm their particular suitability for the desired purpose by routine methods and without an undue burden.

The one or more vectors of the invention comprise one or more nucleic acids of the invention and express the proteins and/or RNA molecules of the invention. In certain embodiments, such vectors are selected from the group consisting of pQE vectors, pET vectors, pFUSE vectors, pUC vectors, YAC vectors, phagemid vectors, phage vectors, and vectors used for gene therapy such as retroviruses, adenoviruses, or adeno-associated viruses. For vector modification techniques, see Sambrook and Russell, Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, 2001. Generally, vectors can contain one or more origins of replication (ori) and inheritance systems for cloning or expression, one or more markers for selection in the host, e.g., for antibiotic resistance, and one or more expression cassettes. Suitable origins of replication (ori) include, for example, the ColE1, the SV40 viral and the M13 origins of replication.

The coding sequences inserted in the vector can e.g. be synthesized by standard methods, or isolated from natural sources. Ligation of the coding sequences to transcriptional regulatory elements and/or to other amino acid encoding sequences can be carried out using established methods. Transcriptional regulatory elements (parts of an expression cassette) ensuring expression in host cells are well-known to those skilled in the art. These elements comprise regulatory sequences ensuring the initiation of transcription (e.g., translation initiation codon, promoters, such as naturally-associated or heterologous promoters and/or insulators). Additional regulatory elements may include transcriptional as well as translational enhancers. Preferably, the fusion construct of the invention is operatively linked to such expression control sequences allowing expression in bacterial host cells.

As discussed herein above, the host cell is preferably a bacterial host cell. Suitable bacterial expression vectors that can be used in connection with the present invention are known in the art. Non-limiting examples of bacterial expression vectors that can be used are pACYC177, pASK75, pBAD/His A, pBAD/His B, pBAD/His C, pBAD/MCS, pBADM-11, pBADM-20, pBADM-20(+), pBADM-30, pBADM-30(+), pBADM-41(+), pBADM-52, pBADM-52(+), pBADM-60, pBADM-60(+), pBAT4, pBAT5, pCal-n, pET-3a, pET-3b, pET-3c, pET-3d, pET-12a, pET-14b, pET-15b, pET-16b, pET-19b, pET-20b(+), pET-21d(+), pET-22b(+), pET-24d(+), pET-28a, pET-28c, pET-32a(+), pET-32b(+), pET-32c(+), pET-39b(+), pET-40b(+), pETM-10, pETM-11, pETM11-SUMO3GFP, pETM-12, pETM-13, pETM-14, pETM-14_ccdB, pETM-20, pETM-21, pETM-22, pETM-22_ccdB, pETM-30, pETM-33, pETM-33_ccdB, pETM-40, pETM-41, pETM-43, pETM-44, pETM-44_ccdB, pETM-50, pETM-51, pETM-52, pETM-54, pETM-55, pETM-60, pETM-66, pETM-70, pETM-80, pETM-82, pGAT, pGAT2, pGEX-3X, pGEX-4T-1, pGEX-4T-2, pGEX-4T-3, pGEX-5X-1, pGEX-5X-2, pGEX-5X-3, pGEX-6P-1, pGEX-6P-2, pGEX-6P-3, pHAT, pHAT2, pKK223-3, pKK223-2, pMal-c2, pMal-p2, pProEx HTa, pProEx HTb, pProEx HTc, pQE-16, pQE-30, pQE-31, pQE-32, pQE-60, pQE-70, pQE-80L, pQE-81L, pQE-82L, pRSET A, pRSET B, pRSET C, pTrcHis2 A, pTrcHis2 B, pTrcHis2 C, pTrcHis2LacZ, pZA31-Luc, pZE12-Luc, pZE21-MCS-1, and pZS*24-MCS-1.

The co-transformation with a selectable marker such as kanamycin or ampicillin resistance genes for culturing in Escherichia coli and other bacteria allows the identification and isolation of the transformed cells. Selectable marker genes for mammalian cell culture include the dhfr, gpt, neomycin and hygromycin resistance genes. The transfected nucleic acid can also be amplified to express large amounts of the encoded (poly)peptide. The DHFR (dihydrofolate reductase) marker is useful to develop cell lines that carry several hundred or even several thousand copies of the gene of interest. Another useful selection marker is the enzyme glutamine synthetase (GS) (Murphy et al. 1991; Bebbington et al. 1992). Using such markers, the cells are grown in selective medium and the cells with the highest resistance are selected.

In accordance with a further preferred embodiment of the invention the expression from the heterologous fusion construct and/or the expression from the one or more nucleic acid molecules is/are under the control of the same or different inducible promoter(s).

In this connection it is preferred that the expression from the heterologous fusion construct and/or the expression from the one or more nucleic acid molecules is/are under the control of different inducible promoter(s). The activity of inducible promoters is induced by the presence or absence of biotic or abiotic factors. Inducible promoters are a very powerful tool in genetic engineering because the expression of genes operably linked to them can be turned on or off at a certain time or at a certain stage, e.g. a stage of development of an organism or in a particular tissue/cell-type. Inducible promoters can be grouped into chemically-regulated promoters and physically-regulated promoters. Chemically-regulated promoters are promoters whose transcriptional activity is regulated by the presence or absence of a chemical compound, e.g., alcohol, tetracycline, steroids, or a metal. Physically-regulated promoters are promoters whose transcriptional activity is regulated by the presence or absence of a physical parameter, e.g., light and low or high temperatures. A vast number of inducible promoters are available in the art.

Preferred examples of chemically-regulated promoters are tetracycline-responsive promoters, which can function either to activate or to repress gene expression in the presence of tetracycline. Elements of these systems include a tetracycline repressor protein (TetR), a tetracycline operator sequence (tetO) and a tetracycline transactivator fusion protein (tTA), which is the fusion of TetR and a herpes simplex virus protein 16 (VP16) activation sequence. Eukaryotic cells transformed with the promoter systems including animal cells are claimed. Another example is the arabinose-inducible promoter PBAD.

In accordance with another preferred embodiment of the invention the expression from the heterologous fusion construct is initiated before the expression from the one or more nucleic acid molecules, or vice versa. This is because it is generally advantageous to induce the expression of the Cas proteins along with and more preferably even before the expression of the RNA molecule, so that the RNA molecule becomes quickly encased within the host cell and does not stay unprotected within the host cell. Expressing the Cas proteins before the RNA molecule to be encased may be achieved by placing the expression from the heterologous fusion construct and the expression from the one or more nucleic acid molecules under the control of different inducible promoters, and inducing the expression from the one or more nucleic acid molecules first. This may also be achieved by placing the expression from the heterologous fusion construct under the control of an inducible promoter and placing the expression from the one or more nucleic acid molecules under the control of a constitutively active promoter, and inducing the expression from the inducible promoter at a desired time. Alternatively, the expression from the heterologous fusion construct may be placed under the control of a first inducible promoter and the expression from the one or more nucleic acid molecules may be placed under the control of a second inducible promoter, wherein the expression from the first and the second promoter can be induced by different means and hence independently of each other at a desired time.

In accordance with a further preferred embodiment of the invention the expression from the heterologous fusion construct and/or the expression from the one or more nucleic acid molecules is/are under the control of two or more promoters giving rise to different expression yields.

Also a vast number of promoters having various strengths are available in the art. These promoters may be inducible or constitutively active and are preferably inducible.

A suitable strength of the promoters can be selected, for example, on the basis of the RNA molecule to be encased. As explained above, each 6-nucleotide subsequence of the RNA molecule downstream of the tag is bound by one Cas7fv protein. The expression yields of the Cas7 protein may be adjusted, for example, on the basis of the length of the RNA molecule. The longer the RNA molecule, the more copies of Cas7 protein are required to encase the RNA molecule. Irrespective of the length of the RNA molecule, only one copy of the Cas5 protein is required per copy of RNA molecule, and it is in general advantageous to have a slight excess of Cas5 copies in order to ensure efficient and fast encasing. The expression of Cas5 protein may therefore be placed under the control of a (slightly) stronger promoter than the expression of the RNA molecule. Generally the expression of Cas7 protein is placed under the control of the strongest promoter, in particular in case long RNA molecules are to be encased.
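The stoichiometry described above can be turned into a rough estimate with a short Python sketch. It assumes one Cas5 per RNA molecule, one Cas7 per 6 nucleotides downstream of an 8 nt tag, and rounding up for a partial last segment; the rounding rule and the function name `estimate_copies` are assumptions made for illustration, not values or names taken from the examples.

```python
import math

NT_PER_CAS7 = 6  # nucleotides bound by one Cas7fv, as stated in the description
TAG_LENGTH = 8   # length of the tag bound by Cas5 (about 8 nt)


def estimate_copies(tagged_rna_length, tag_length=TAG_LENGTH):
    """Estimate protein copies needed to encase one tagged RNA molecule.

    tagged_rna_length is the total length in nucleotides including the tag.
    Rounding up for an incomplete final 6 nt segment is an assumption.
    """
    downstream = tagged_rna_length - tag_length
    if downstream < 0:
        raise ValueError("RNA is shorter than the tag")
    return {
        "Cas5": 1,
        "Cas7": math.ceil(downstream / NT_PER_CAS7),
    }


if __name__ == "__main__":
    # An 8 nt tag followed by 60 nt of RNA.
    print(estimate_copies(68))
    # An 8 nt tag followed by 100 nt of RNA.
    print(estimate_copies(108))
```

Such an estimate only gives the minimum ratio per RNA molecule; the slight excess of Cas5 and the promoter strengths discussed above would still have to be chosen empirically.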

In accordance with a yet further preferred embodiment of the invention, the encased ribonucleic acid molecule may be purified from the host cell.

Since the encased ribonucleic acid is a ribonucleoprotein (RNP), protein purification methods may be used. Protein purification, in accordance with the invention, specifies a process or a series of processes intended to further isolate the RNP of the invention from a complex mixture, preferably to homogeneity. Purification steps, for example, exploit differences in protein size, physico-chemical properties and binding affinity. Methods of purification of the ribonucleoprotein produced are well-known in the art and comprise without limitation method steps such as selective centrifugation, ion exchange chromatography, gel filtration chromatography (size exclusion chromatography), affinity chromatography, high pressure liquid chromatography (HPLC), reversed phase HPLC, disc gel electrophoresis or immunoprecipitation.

Ribonucleoproteins may be purified according to their isoelectric points by running them through a pH graded gel or an ion exchange column. Further, ribonucleoproteins may be separated according to their size or molecular weight via size exclusion chromatography or by SDS-PAGE (sodium dodecyl sulfate-polyacrylamide gel electrophoresis) analysis. In the art, proteins are often purified by using 2D-PAGE and are then further analysed by peptide mass fingerprinting to establish the protein identity. This is very useful because the detection limits for proteins are very low and nanogram amounts of protein are in general sufficient for their analysis. Proteins may also be separated by polarity/hydrophobicity via high performance liquid chromatography or reversed-phase chromatography. Most preferably, the resulting purity of the ribonucleoprotein is more than 98%, such as 99% or 100% purity, i.e. they are free or essentially free of contaminants.

According to a preferred embodiment of the invention the protein Cas5, Cas7 and/or Cas6 comprise(s) an affinity purification tag or a fluorescent tag. Affinity purification tags can be used to purify Cas5, Cas7 and/or Cas6 from the host cell. In case Cas5 and/or Cas7 encase the RNA molecule, the encased RNA molecule can be purified along with Cas5 and/or Cas7. Commonly used purification tags are, for example, polyhistidine, glutathione S-transferase, maltose binding protein, calmodulin binding peptide, intein-chitin binding domain, and streptavidin/biotin-based tags.

Fluorescent tags can be used to visualize Cas5, Cas7 and/or Cas6 in the host cell. Fluorescent tags include fluorescent proteins and fluorescent dyes. Examples of fluorescent dyes are Alexa Fluor or Cy dyes. Examples of fluorescent proteins are GFP, CFP, RFP, YFP and Cherry.

In accordance with a yet further preferred embodiment of the invention at least two distinct types of Cas7 proteins are expressed, said at least two distinct types of Cas7 proteins being distinguished in that they are fused to different proteins.

In this connection it is preferred that the RNA molecule downstream (i.e. 3') of the about 8 nt tag has a length, wherein at least two copies of Cas7 proteins can bind. This ensures that at least two distinct types of Cas7 proteins can bind to the same RNA molecule, so that different proteins being fused to the distinct types of Cas7 proteins are brought into close proximity. This in turn can be used, for example, to trigger cellular reactions requiring that two proteins come into close proximity or contact.

The present application relates in a second aspect to the fusion construct or the heterologous ribonucleic acid molecule as defined in connection with the first aspect of the invention.

The present application relates in a third aspect to a vector comprising the fusion construct as defined in connection with the preceding aspects of the invention. The present application relates in a fourth aspect to a host cell comprising the fusion construct or the heterologous ribonucleic acid molecule or the vector as defined in connection with the preceding aspects of the invention.

The present application relates in a fifth aspect to a kit comprising the fusion construct or the heterologous ribonucleic acid molecule, the vector, and/or the host cell as defined in connection with the preceding aspects of the invention. The preferred embodiments, definitions and explanations described herein above in connection with the first aspect of the invention as far as being applicable to the above embodiment of the invention apply mutatis mutandis to the second to fifth aspects of the invention.

The various components of the kit may be packaged into one or more containers such as one or more vials. The vials may, in addition to the components, comprise preservatives or buffers for storage. The kit may comprise instructions on how to use the kit, which preferably explain how to use the components of the kit to encase an RNA molecule.

As regards the embodiments characterized in this specification, in particular in the claims, it is intended that each embodiment mentioned in a dependent claim is combined with each embodiment of each claim (independent or dependent) said dependent claim depends from. For example, in case of an independent claim 1 reciting 3 alternatives A, B and C, a dependent claim 2 reciting 3 alternatives D, E and F and a claim 3 depending from claims 1 and 2 and reciting 3 alternatives G, H and I, it is to be understood that the specification unambiguously discloses embodiments corresponding to combinations A, D, G; A, D, H; A, D, I; A, E, G; A, E, H; A, E, I; A, F, G; A, F, H; A, F, I; B, D, G; B, D, H; B, D, I; B, E, G; B, E, H; B, E, I; B, F, G; B, F, H; B, F, I; C, D, G; C, D, H; C, D, I; C, E, G; C, E, H; C, E, I; C, F, G; C, F, H; C, F, I, unless specifically mentioned otherwise.
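The enumeration in the preceding paragraph is the Cartesian product of the three sets of alternatives. The following Python sketch, which is purely illustrative and uses hypothetical variable names, reproduces the 27 combinations in the order listed above.

```python
from itertools import product

# Alternatives recited in the three claims of the example above
claim_1 = ["A", "B", "C"]
claim_2 = ["D", "E", "F"]
claim_3 = ["G", "H", "I"]

# One alternative from each claim, with the last claim varying fastest
combinations = [", ".join(combo) for combo in product(claim_1, claim_2, claim_3)]

if __name__ == "__main__":
    print(len(combinations))
    print("; ".join(combinations))
```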

Similarly, and also in those cases where independent and/or dependent claims do not recite alternatives, it is understood that if dependent claims refer back to a plurality of preceding claims, any combination of subject-matter covered thereby is considered to be explicitly disclosed. For example, in case of an independent claim 1, a dependent claim 2 referring back to claim 1, and a dependent claim 3 referring back to both claims 2 and 1, it follows that the combination of the subject-matter of claims 3 and 1 is clearly and unambiguously disclosed as is the combination of the subject-matter of claims 3, 2 and 1. In case a further dependent claim 4 is present which refers to any one of claims 1 to 3, it follows that the combination of the subject-matter of claims 4 and 1, of claims 4, 2 and 1, of claims 4, 3 and 1, as well as of claims 4, 3, 2 and 1 is clearly and unambiguously disclosed.

The figures show.

Figure 1 - Silencing of the GFP signal by Cas proteins. (A) Differential interference contrast (DIC) and GFP fluorescence (GFP) imaging of E. coli cells that produced gfp mRNA transcript with (+repeat) or without (-repeat) added tag sequences. GFP fluorescence is only abolished in the +repeat constructs when Cas5fv, Cas7fv and Cas6f (+Cas) are produced. (B) Northern blot analysis with a probe against gfp mRNA verified that the presence of a repeat tag yields gfp mRNAs in purified "RNA-CASing" complexes.

Figure 2 - Filamentation of mRNA molecules. (A) Schematic model of filament formation. Cas6f cleavage within a repeat sequence upstream of a mRNA sequence releases a 5'-terminal 8 nt repeat tag. The tag is recognized by Cas5fv which initiates Cas7fv filament formation. (B) Crystal structure of Cas7fv filaments and transmission electron microscopy images of filaments in complex with silenced gfp mRNA.

Figure 3 - Stability of RNA molecules in "RNA-CASing" complexes. The Cas5fv/Cas7fv filaments were purified via Ni-NTA chromatography. The purified ribonucleoprotein complexes were incubated for the indicated time in the presence of added RNase.

Figure 4 - Specificity of filament formation. RNA was extracted from filaments produced in E. coli cells containing either a plasmid with a repeat sequence upstream of the sfGFP gene (Repeat) or a plasmid without the repeat sequence upstream of the sfGFP gene (Control). RNA-seq (Illumina HiSeq3000) of the RNA component of the filaments was performed. The obtained reads were mapped to the E. coli host genome and the sequence of a pBAD plasmid containing the sfGFP gene sequence (indicated by an arrow). Reads of the control mapped to the genome and represent highly abundant RNAs (including ribosomal RNAs with a maximum number of ~25,000 mapped reads). In contrast, the vast majority of reads from the specific Repeat filaments mapped to the sequence of the sfGFP gene on the plasmid (maximum number of reads ~800,000). These assays demonstrate that the addition of the repeat sequence specifically guides filament formation.

Figure 5 - Helical artifact formation of Thermoproteus tenax Cas7 (TTX_1251, belonging to a Type-IA CRISPR-Cas system) with E. coli RNA. A typical TEM analysis of a Cas7 purification resulted in cross-contamination with E. coli RNA and the formation of helical (black arrows) and double-helical filaments (white arrows) of up to 50 nm length. Scale bar: 50 nm.

Figure 6 - Helical artifact formation of Aromatoleum aromaticum EbN1 Cas7 (WP_011254653, belonging to a Type-IU CRISPR-Cas system) with E. coli RNA. A typical TEM analysis of a Cas7 purification resulted in cross-contamination with E. coli RNA and the formation of helical filaments (white arrows) of up to 200 nm length. Scale bar: 100 nm.

Figure 7 - Direct RNA sequencing of a repeat-tagged non-coding RNA isolated by RNA-CASing and subjected to Nanopore sequencing.

The examples illustrate the invention.

Example 1 - Material and Methods

Production and purification of RNA-CASing filaments

Filament formation along the sfGFP transcript was investigated in the presence of a repeat sequence upstream of the sfGFP gene (Repeat) and without the repeat sequence upstream of the sfGFP gene (Control). In both cases, filaments were observed in E. coli and purified as detailed below. The N-terminally hexa-his tagged Cas5fv was co-produced together with Cas7fv and Cas6f from a pRSF-Duet vector. In addition, sfGFP transcripts (with or without 5' terminal repeat tags) were produced from a pBAD vector. Expression cultures of E. coli BL21 (DE3) were grown to an OD600nm of 0.6-0.8 in 1 l LB medium supplemented with the appropriate antibiotics. The cultures were induced with 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) and 0.2% arabinose for cas gene and sfGFP gene expression, respectively. The induced cultures were then incubated overnight at 18 °C and harvested by centrifugation (8000 rpm, 30 min, 4 °C) the next day. Cell pellets were resuspended in 5x volume of wash buffer (50 mM Tris-HCl pH 7.0, 300 mM NaCl, 10 mM MgCl2, 10% glycerol, 20 mM imidazole, 1 mM dithiothreitol (DTT)) and lysed by addition of lysozyme (from chicken egg white, Sigma-Aldrich) and subsequent loading on a LM10 Microfluidizer (Microfluidics). The lysate was cleared by additional centrifugation (18000 rpm, 45 min, 4 °C) and loaded on a 5 ml HisTrap HP column (GE Healthcare) for Ni-NTA affinity purification. After a wash step to remove unbound proteins, filaments were purified utilizing the His-tag on Cas5fv to co-purify Cas7fv subunits and the complexed RNA. The elution buffer contained wash buffer and additionally 200 mM imidazole.
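For bench preparation of the wash buffer described above, the volume of each stock solution follows from the dilution relation C(stock) x V(stock) = C(final) x V(total). The following Python sketch uses the final concentrations given above. The stock concentrations and the function name are hypothetical values chosen for illustration only and are not taken from this specification.

```python
# Final wash buffer composition from the text (mM, or % v/v for glycerol)
WASH_BUFFER = {
    "Tris-HCl pH 7.0": 50,
    "NaCl": 300,
    "MgCl2": 10,
    "glycerol": 10,
    "imidazole": 20,
    "DTT": 1,
}

# Hypothetical stock solutions in the same units (assumption, not from the text)
STOCKS = {
    "Tris-HCl pH 7.0": 1000,
    "NaCl": 5000,
    "MgCl2": 1000,
    "glycerol": 50,
    "imidazole": 2000,
    "DTT": 1000,
}


def stock_volumes_ml(final: dict, stocks: dict, total_ml: float) -> dict:
    """Return the volume (ml) of each stock needed for total_ml of buffer."""
    volumes = {
        name: conc * total_ml / stocks[name] for name, conc in final.items()
    }
    # The remainder of the volume is made up with water
    volumes["water"] = total_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for name, vol in stock_volumes_ml(WASH_BUFFER, STOCKS, 1000).items():
        print(f"{name}: {vol:.1f} ml")
```

DTT is commonly added fresh shortly before use, which the sketch does not model.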

RNA extraction and RNA-sequencing

RNA was isolated from affinity purified filaments via acidic phenol/chloroform extraction (pH 4.5) and subsequent ethanol precipitation. Stable gfp mRNA molecules were obtained and visualized. To facilitate RNA-seq analysis of the filaments' RNA component, 5 pg of the obtained RNA was treated with DNase I (2 U for 2 h, 37 °C) and fragmented by ZnCl2 treatment (10 mM Tris pH 6.8, 10 mM ZnCl2) and heating (5 min, 95 °C). Fragmentation was confirmed on a denaturing polyacrylamide gel (10 %, 8 M urea) and the digested RNA was extracted from the gel. Fragmented RNA was dephosphorylated (20 U T4 PNK in 100 mM Tris/HCl pH 6.5, 100 mM magnesium acetate, 5 mM β-mercaptoethanol for 6 h at 37 °C) and subsequently phosphorylated (with 20 U T4 PNK and 1 mM ATP for 1 h at 37 °C). These RNA preparations (from control and repeat filaments) were used as templates for cDNA library construction using the NEBNext® Small RNA Library Kit, according to the manufacturer's instructions. Illumina sequencing (HiSeq3000) was performed by the Max Planck-Genome-centre Cologne. The obtained reads were trimmed for adaptor sequences and mapped against the E. coli BL21 (DE3) host genome and the sequence of the pBAD plasmid including the sfGFP gene.

Northern Blot analysis

RNA was extracted from filaments as described above and separated on a denaturing polyacrylamide gel (6 %, 8 M urea). The RNA was then transferred onto a positively charged nylon membrane (Roti®-Nylon plus, pore size 0.45 μm) using a semi-dry electrophoretic transfer system (2 h at 20 V) and UV-crosslinked. Pre-hybridization was performed in ULTRAhyb-Oligo Hybridization Buffer (1 ml/10 cm2) for 30 min at 42 °C to block non-specific binding sites. A 5'-radiolabeled probe specific for sfGFP (5'-GAAGCTAATGGTACGTTCCTGCACATAGCCTTCCG-3') (SEQ ID NO: 250) was added to the hybridization buffer (10^5 cpm/μl hybridization buffer) after incubation at 95 °C for 5 min. Hybridization was performed overnight at 50 °C. The following day, the blot was washed twice, with 15 ml low stringency buffer (2x SSC, 0.1 % SDS) and with 15 ml high stringency buffer (1x SSC, 0.1 % SDS) for 30 min at 50 °C to remove unbound probe. The membrane was then visualized by phosphoimaging.
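Because a hybridization probe anneals to its target by base pairing, the transcript region detected by the probe of SEQ ID NO: 250 is its reverse complement. The following Python sketch derives that target sequence. It assumes that the probe is fully complementary to the sfGFP transcript; the function name is illustrative.

```python
# Translation table for the DNA complement
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Northern blot probe as given in the text (SEQ ID NO: 250), written 5' to 3'
PROBE = "GAAGCTAATGGTACGTTCCTGCACATAGCCTTCCG"


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


# Region of the sfGFP transcript expected to pair with the probe
target_dna = reverse_complement(PROBE)
target_rna = target_dna.replace("T", "U")

if __name__ == "__main__":
    print(len(PROBE), "nt probe")
    print("target (RNA, 5'->3'):", target_rna)
```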

Fluorescence Microscopy

E. coli cultures for fluorescence microscopy analysis were grown for 1 h after induction of Cas protein and sfGFP production. Subsequently, 4 μl of the cell culture was fixated on a microscope slide covered with 2 % agarose. Samples were analyzed with the Axioplan 2 fluorescence microscope (Carl Zeiss Microscopy GmbH) using an ocular ( 0X) and a Plan-Apochromat (100X), 1.4 oil DIC immersion objective. Images were taken with the GFP- and DIC-protocol from Metamorph (version 62r6). The exposure time was 60 ms for DIC and 800 ms for GFP.

Example 2 - RNA-CASing by Cas5fv and Cas7fv

It is demonstrated herein that the 8 nt RNA tag being recognized by Cas5fv can be fused with any RNA molecule, which results in specific recognition by Cas5fv and filament protection of the RNA molecule by Cas7fv. This allows for the isolation of stabilized RNA molecules of flexible length and sequence that can be handled, stored and shipped at ambient room temperatures. These filaments are termed "RNA-CASing" herein as they provide protection from chemical and enzymatic ribolysis. The setup of experiments was as follows: Activity of the green fluorescent protein (GFP) was utilized to follow translation of gfp messenger RNA (mRNA). This mRNA was fused with a CRISPR repeat sequence at its transcription start site and the construct was cloned into a pBAD plasmid. All experiments were performed in Escherichia coli. Transcription was induced by addition of 0.2% arabinose and green fluorescence was observed as expected. Next, a second plasmid, pRSF-Duet, was co-transformed that contains all elements necessary for the IPTG-inducible production of S. putrefaciens Cas5fv, Cas7fv and Cas6. The presence of these Cas proteins resulted in reduction of the green fluorescence signal to near background levels. As a control, it was verified that the omission of the repeat tag upstream of the gfp messenger RNA resulted in green fluorescent cells even in the presence of the three Cas proteins (Figure 1).

Cas6 is expected to cleave the repeat sequence, yielding the 5'-terminal 8 nt tag (5'-CUUAGAAA-3') which is recognized by the Cas5fv protein, inducing Cas7fv filament formation along the gfp mRNA spine (Figure 2A). To verify this process, a Cas5fv variant containing an N-terminal His-tag, which allows affinity purification of the "RNA-CASing" complexes, was used. These structures were purified via Ni-NTA chromatography, and transmission electron microscopy revealed intact helical complexes (Figure 2B).
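The position of the tag within a repeat-tagged transcript can be located by a simple sequence search. The following Python sketch uses the 8 nt tag stated above and the first line of the repeat-tagged non-coding RNA sequence given in Example 6 (SEQ ID NO: 251). It assumes that Cas6 cleaves immediately 5' of the tag and that the first occurrence of the tag sequence marks the cleavage site; the function name is illustrative.

```python
TAG_RNA = "CUUAGAAA"  # 8 nt tag recognized by Cas5fv, as stated in the text

# 5' portion of the repeat-tagged sequence from Example 6 (DNA alphabet)
TRANSCRIPT_5P = "GTTCACCGCCGCACAGGCGGCTTAGAAAGCAAAAAGCAAAGCACCGGAAGAAGCCAA"


def split_at_tag(dna: str, tag_rna: str) -> tuple:
    """Split a sequence into (released 5' part, tag, downstream part).

    Assumption: cleavage occurs directly 5' of the first tag occurrence.
    """
    tag_dna = tag_rna.replace("U", "T")
    start = dna.find(tag_dna)
    if start < 0:
        raise ValueError("tag not found in sequence")
    end = start + len(tag_dna)
    return dna[:start], dna[start:end], dna[end:]


released, tag, downstream = split_at_tag(TRANSCRIPT_5P, TAG_RNA)

if __name__ == "__main__":
    print("released 5' part:", released, f"({len(released)} nt)")
    print("tag:", tag.replace("T", "U"))
    print("downstream starts with:", downstream[:12])
```

In this sequence the tag corresponds to the last 8 nucleotides of the 28 nt stretch at the 5' end, so the part expected to be encased by Cas7fv begins directly after it.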

Example 3 - RNA stabilization via RNA-CASing

It was assayed how stable the RNA molecules within the RNA-CASing complexes are. The Cas5fv/Cas7fv filaments were purified via Ni-NTA chromatography. The purified ribonucleoprotein complexes were incubated for the indicated time in the presence of added RNase.

A band corresponding to the size of the gfp mRNA was detected and the intensity of the band remained largely unchanged even after incubation of the "RNA-CASing" filaments for a week in the presence of added ribonucleases (Figure 3). Hence, RNA-CASing stabilizes RNA molecules and protects them from degradation.

Example 4 - Specificity of filament formation

RNA was extracted from filaments produced in E. coli cells containing either a plasmid with a repeat sequence upstream of the sfGFP gene (Repeat) or a plasmid without the repeat sequence upstream of the sfGFP gene (Control). RNA-seq (Illumina HiSeq3000) of the RNA component of the filaments was performed. The obtained reads were mapped to the E. coli host genome and the sequence of a pBAD plasmid containing the sfGFP gene sequence (indicated by an arrow in Figure 4). Reads of the control mapped to the genome and represent highly abundant RNAs (including ribosomal RNAs with a maximum number of ~25,000 mapped reads). In contrast, the vast majority of reads from the specific Repeat filaments mapped to the sequence of the sfGFP gene on the plasmid (maximum number of reads ~800,000). These assays demonstrate that the addition of the repeat sequence specifically guides filament formation.

Example 5 - Filamentation by Cas5 and Cas7 family proteins

Type I-A CRISPR-Cas, Thermoproteus tenax Cas7 (TTX_1251)

Figure 5 shows helical artifact formation of Thermoproteus tenax Cas7 (TTX_1251, belonging to a Type-IA CRISPR-Cas system) with E. coli RNA. A typical TEM analysis of a Cas7 purification resulted in cross-contamination with E. coli RNA and the formation of helical (black arrows) and double-helical filaments (white arrows) of up to 50 nm length. Scale bar: 50 nm.

Type-IU CRISPR-Cas system, Aromatoleum aromaticum EbN1 Cas7 (WP_011254653)

Figure 6 shows helical artifact formation of Aromatoleum aromaticum EbN1 Cas7 (WP_011254653, belonging to a Type-IU CRISPR-Cas system) with E. coli RNA. A typical TEM analysis of a Cas7 purification resulted in cross-contamination with E. coli RNA and the formation of helical filaments (white arrows) of up to 200 nm length. Scale bar: 100 nm.

Hence, not only Cas5fv and Cas7fv from S. putrefaciens but also Cas5 and Cas7 proteins from other bacteria having a class I, type I CRISPR-Cas system are capable of forming filaments on crRNA. It is thus expected that not only Cas5fv and Cas7fv from S. putrefaciens can be used in the RNA-CASing process being described herein but also Cas5 and Cas7 proteins from other class I, type I CRISPR-Cas systems.

Example 6 - Direct RNA sequencing of a non-coding RNA isolated by RNA-CASing

A ~640 nt pRSFDuet vector backbone sequence that did not contain a coding sequence or ribosome binding sites was cloned into a pBAD vector. The fragment was inserted downstream of the repeat sequence of the S. putrefaciens CN-32 Type I-Fv CRISPR array. The fusion of these sequences created an engineered repeat-tagged non-coding RNA that was produced in E. coli BL21 (DE3) together with the Type I-Fv Cas proteins Cas6f, Cas5fv and Cas7fv. Specific filaments were purified via Ni-NTA column purification via an N-terminal His-tag on Cas5fv. The encased RNA molecules were extracted by phenol/chloroform treatment and recovered in water after ethanol precipitation. The extracted RNA was used as template for the Direct RNA Sequencing Kit from Oxford Nanopore Technologies, according to the manufacturer's instructions. The obtained sequencing reads were mapped against the genome of the expression strain and the expression plasmid. A maximum of ~100 reads matching the sequence of the entire non-coding RNA can be observed, which highlights specific recovery of the encased RNA; see Figure 7.

The sequence of the plasmid pBAD with the non-coding RNA-repeat fusion is shown in SEQ ID NO: 252. The sequence of the non-coding RNA in RNA-CASing is shown in SEQ ID NO: 251 and is also provided in the following (single underlining highlights the full repeat, double underlining highlights the 5' handle, bold letters highlight the non-coding sequence, italics highlights the transcription terminator):

GTTCACCGCCGCACAGGCGGCTTAGAAAGCAAAAAGCAAAGCACCGGAAGAAGCCAA
CGCCGCAGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGAC
GCTCAAGCCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCC
CTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTC
CGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTTGGTATCTC
AGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAG
CCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACAC
GACTTATCGCCACTGGCAGCAGCCATTGGTAACTGATTTAGAGGACTTTGTCTTGAAG
TTATGCACCTGTTAAGGCTAAACTGAAAGAACAGATTTTGGTGAGTGCGGTCCTCCAA
CCCACTTACCTTGGTTCAAAGAGTTGGATCTGCAGCTGGTACCATATGGGAATTCGAA
GCTTGGCTGTTTTGGCGGATGAGAGAAGATTTTCAGCCTGATACAGATTAAATCAGAA
CGCAGAAGCGGTCTGATAAAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCA
CCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGG
TCTCCCCATGCGAGAGTAGGGAACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTC
GAAAGACTGGGCCTT