

Title:
PLANT U14 NUCLEIC ACID SEQUENCES AND DERIVATIVES THEREOF
Document Type and Number:
WIPO Patent Application WO/1995/030747
Kind Code:
A1
Abstract:
The invention concerns a nucleic acid sequence comprising: [Fa-Ub-Fc]n wherein a + b + c ≥ 1 and n ≥ 1; and U represents a sequence, "U14", present in a plant genome as a non-translated sequence, said non-translated sequence comprising: i) at least one of the following phylogenetically conserved sequences, wherein N represents any one of the nucleotides A, C, G or T: I: TGATGA or TGATGT; II: CATTCGCAGTNNCCNCCTAAGA; or III: CCTTCCTNGGATGTCTGA or a sequence having at least 90 % homology with one of sequences I, II or III, and ii) at least one plant-specific region and, iii) a pair of inverted repeats, and F represents a non-translated flanking sequence which, in a plant genome, is immediately adjacent to the said "U14" sequence.

Inventors:
BROWN JOHN WILLIAM SLESSOR (GB)
LEADER DAVID JOHN (GB)
WAUGH ROBBIE (GB)
Application Number:
PCT/EP1994/001409
Publication Date:
November 16, 1995
Filing Date:
May 04, 1994
Assignee:
GENE SHEARS PTY LTD (AU)
BROWN JOHN WILLIAM SLESSOR (GB)
LEADER DAVID JOHN (GB)
WAUGH ROBBIE (GB)
International Classes:
C12N15/11; C12N15/82; C12Q1/68; C12Q1/6895; (IPC1-7): C12N15/11; A01H5/00; C12N15/82
Other References:
LEADER, D.J., ET AL.: "Novel genomic organization of plant U14 small nucleolar RNAs", J. EXPER. BOT., vol. 45, 1994, pages 11
CHEMICAL ABSTRACTS, vol. 120, no. 3, 17 January 1994, Columbus, Ohio, US; abstract no. 27400, SOLYMOSY, F., ET AL.: "Uridylate-rich small nuclear RNAs (UsnRNAs), their genes and pseudogenes, and UsnRNPs in plants: structure and function. A comparative approach"
Claims:
CLAIMS
1. A nucleic acid sequence comprising : [Fa-Ub-Fc]n wherein a + b + c ≥ 1 and n ≥ 1 ; and U represents a sequence, "U14", present in a plant genome as a non-translated sequence, said non-translated sequence comprising : i) at least one of the following phylogenetically conserved sequences, wherein N represents any one of the nucleotides A, C, G or T : I : TGATGA or TGATGT, or II : CATTCGCAGTNNCCNCCTAAGA, or III : CCTTCCTNGGATGTCTGA or a sequence having at least 90% homology with one of sequences I, II or III, and ii) at least one plant-specific region and, iii) a pair of inverted repeats, and F represents a non-translated flanking sequence which, in a plant genome, is immediately adjacent to the said "U14" sequence.
2. Nucleic acid sequence derived from the sequence of claim 1 comprising the plant-specific region of the said U14 sequence or a fragment comprising at least 6 consecutive nucleotides of this region.
3. Nucleic acid sequence according to claims 1 or 2 wherein the plant-specific region comprises one of the following sequences : YRNARNN or GCCNNGCCAGGCTNGAGAGNTNNTGCTGNNNNAT wherein N represents any nucleotide, Y represents a pyrimidine and R represents a purine, or a sequence exhibiting at least 70 % homology with one of these sequences, or a fragment of any of these sequences having at least 6 nucleotides.
4. Nucleic acid sequence derived from the sequence of claim 1, comprising the flanking sequence "F", or a sequence having at least 70%, and preferably at least 80% homology with the flanking sequence "F", or a fragment comprising at least 6 consecutive nucleotides of either of these sequences.
5. Nucleic acid sequence according to claims 1 or 4, wherein the flanking sequence F comprises : TTCTTGTCCAGCTC or CCTATGTTTGATACTTGT or a sequence having at least 70% homology with one of these sequences, or a sequence comprising at least 6 consecutive nucleotides of one of these sequences.
6. Nucleic acid sequence derived from the sequence of claim 1 comprising one or both of the inverted repeats of the U14 sequence.
7. Nucleic acid sequence according to claims 1 or 6, wherein the inverted repeats are chosen from : TATGGC and GCCATA, TXYZTTGC wherein X = G or C, Y = C or T, Z = A or T, with its complement, TTGGGGGATT and AATCCCCCAA.
8. Nucleic acid sequence comprising in association at least two of the sequences according to claims 2 to 7, or one of the sequences according to claims 2 to 7 with a phylogenetically conserved region according to claim 1, or a fragment thereof comprising at least 4 consecutive nucleotides.
9. Nucleic acid sequence according to claim 8 wherein the fragment of the conserved region comprises at least one of the regions chosen from Box C, Box D, 18SA and 18SB regions.
10. Nucleic acid sequence according to claim 9 comprising an inverted repeat and a Box C and Box D region.
11. Nucleic acid sequence according to claim 1, comprising the maize U14 sequence : TATGGCAATGATGTTGAAGTTAAAGGCTTGTTTCTCAACAT TCGCAGTAGCCGCCTAAGAGCTTTCGCCCTGCCAGGCTTGA GAGCTTGTGCTGTTTAATCCTTCCTTGGATGTCTGAGCCATA or a sequence having at least 70% homology with this sequence.
12. Nucleic acid sequence comprising the complementary sequence of any of the sequences according to claims 1 to 11.
13. Sequence according to any one of the preceding claims, comprising RNA or DNA.
14. Sequence according to any one of the preceding claims, optionally labelled with a detectable marker, for use as a probe for plant U14 sequences.
15. Sequence according to any one of the preceding claims for use as a primer in a nucleic acid amplification reaction.
16. Transformation vector containing a nucleotide sequence according to any one of claims 1 to 12.
17. Plant cells stably transformed by at least one of the sequences according to claims 1 to 12.
18. Transgenic plants stably transformed by one of the sequences according to claims 1 to 12.
19. Method for the regulation of rRNA production and accumulation in plants by transforming the plant with a sequence according to any one of claims 1 to 12.
20. Use of U14 sequences of eukaryotic origin in plant cells.
21. Use according to claim 20 wherein the plant cell is stably transformed by a eukaryotic U14 sequence or fragment or complementary sequence thereof.
22. Use of a plant U14 sequence according to claims 1 to 13 in eukaryotic cells, particularly mammalian or yeast cells.
23. Use according to claim 22 wherein the eukaryotic cell is stably transformed by a sequence according to claims 1 to 13.
24. Use according to any one of claims 20 to 23 wherein the U14 sequence, fragment or derivative is a hybrid U14 comprising parts of a plant U14 sequence and parts of a U14 sequence of eukaryotic origin other than plant.
25. Nucleic acid sequence according to any one of claims 4 to 13, further comprising a U14 sequence of eukaryotic origin other than plant, or fragments or derivatives thereof.
Description:
PLANT U14 NUCLEIC ACID SEQUENCES AND DERIVATIVES THEREOF

The invention concerns plant U14 sequences and fragments and derivatives thereof, as well as the use of the sequences in eukaryotic cells.

The two major classes of eukaryotic small nuclear RNAs are the spliceosomal snRNAs, involved in pre-messenger RNA splicing, and the small nucleolar RNAs (snoRNAs), involved in processing of pre-ribosomal RNA (pre-rRNA) transcripts and ribosome formation. At least 12 snoRNAs have been identified in yeast (reviewed by Fournier & Maxwell, 1993) and may be involved in pre-rRNA processing or ribosome formation, as suggested by their nucleolar localisation, their propensity for base pairing with rRNA transcripts and their association with nucleolar proteins.

The best characterised animal snoRNAs are U3, U8, U13 and U14. The properties of these snoRNAs, and the role of U3 in particular in pre-rRNA processing, have recently been reviewed (Filipowicz & Kiss, 1993).

U14 snoRNA is required for a normal growth phenotype and for processing of pre-rRNA transcripts in yeast (Jarmolowski et al., 1990). U14 contains the conserved Box C and D sequences, which are found in a number of nucleolar snoRNAs and of which Box C is required for binding of the nucleolar protein fibrillarin (Baserga et al., 1991), and two conserved sequence elements which are complementary to the 18S rRNA (Trinh-Rolik and Maxwell, 1986 ; Li and Fournier, 1992). The genomic organisation of animal U14s is of particular significance in that in mouse, rat, Xenopus and human, three copies of U14 are located in introns 5, 6 and 8 of the constitutively expressed cognate hsc70 heat shock gene (Liu & Maxwell, 1990). The orientation of the U14 sequences in the coding strand of the hsc70 introns, the lack of typical UsnRNA promoter sequences, and the lack of a cap structure at the 5' end suggest that these U14 sequences are transcribed as part of the hsc70 pre-mRNA and processed from the intron sequences containing them.

Recently, a number of other snoRNAs have been shown to be encoded within introns of genes coding, for the most part, for nucleolar proteins. U15, a single copy gene, was found in the first intron of the S3 ribosomal protein gene in human (Tycowski et al., 1993) and U16 and U18 in introns of the L1 ribosomal protein gene of Xenopus species and human (Fragapane et al., 1993 ; Prislei et al., 1993). Two U17 genes lie in introns 1 and 2 of a human cell cycle regulatory protein gene, RCC1. Examples of other snoRNAs, U19-U21, Y, E2 and E3, which appear to be or are likely to be encoded in introns of ribosomal or nucleolar protein genes, have been recently reviewed (Filipowicz & Kiss, supra).

The evolutionary conservation of the U14 genomic organisation among different animal species and the now widespread occurrence of intron-encoded snoRNA genes suggest that this kind of gene arrangement may function to ensure co-ordination of expression of nucleolar or ribosomal components.

In yeast, U14 snoRNA is not intron-encoded but is likely to be processed from a precursor containing both U14 and a second nucleolar snRNA, snR190 (Zagorski et al., 1988 ; Balakin et al., 1994).

The common features of the termini of the yeast and mammalian U14 and many of the other intron-encoded snoRNAs, and the successful processing of U14 from transcripts of the mouse hsc70 intron 5 in Xenopus nuclei, suggest conservation of the processing machinery. Although this mechanism remains to be elucidated, processing of U15 and U17 in human in vitro extracts suggested that 5' and 3' end processing were independent of one another, and the discrete products detected were indicative of initial endonucleolytic cleavage steps, with maturation, particularly at the 3' end, due to exonucleolytic digestion of bordering sequences (Kiss & Filipowicz, 1993 ; Tycowski et al., 1993).

The difference in U14 genomic organisation between yeast and metazoans, and the lack of U17s in introns of Drosophila and Schizosaccharomyces pombe RCC1 analogues, raise the question of whether or not such gene arrangements are only found in higher eukaryotes and vertebrates.

To date, the only information available on plant snoRNA genes concerns U3 (Kiss & Solymosy, 1990 ; Marshallsay et al., 1990, 1992 ; Kiss et al., 1991) and MRP/7-2 snRNA (Kiss et al., 1992). No data on plant U14 genes is available. The technical problem underlying the present invention was to identify whether or not U14 genes are present in plants, and if so, to elucidate their structure and function.

The inventors have succeeded in characterising plant U14s using a PCR approach. They have isolated genomic clones of maize and potato and analysed their genomic organisation. Two genomic clones, one from maize and one from potato, contained single U14 genes, while a second maize genomic clone contained a cluster of four closely linked U14 sequences. Clustering of plant U14 genes was confirmed by intergenic PCR with maize, barley and potato DNA, and intergenic RT-PCR on maize and potato RNA demonstrated expression of polycistronic U14 transcripts. No eukaryotic RNA polymerase II or III promoter elements were recognisable and the immediate flanking sequences were AU-rich, a property of plant introns (Goodall & Filipowicz, 1989), suggesting that the plant U14s may be intron encoded. The lack of homology of the flanking sequences with sequences in the databases has not allowed identification of the host gene(s).

The invention concerns a nucleic acid sequence comprising :

[Fa-Ub-Fc]n wherein a + b + c ≥ 1 and n ≥ 1; and U represents a sequence, "U14", present in a plant genome as a non-translated sequence, said non-translated sequence comprising : i) at least one of the following phylogenetically conserved sequences wherein N represents any one of the nucleotides A, C, G or T :

I : TGATGA or TGATGT, or

II : CATTCGCAGTNNCCNCCTAAGA, or

III : CCTTCCTNGGATGTCTGA or a sequence having at least 90% homology with one of sequences I, II or III, and ii) at least one plant-specific region and, iii) a pair of inverted repeats, and F represents a non-translated "flanking" sequence which, in a plant genome, is immediately adjacent to the said "U14" sequence.

In the context of the present invention, a "phylogenetically conserved region" signifies a nucleic acid sequence having a length of at least 6 nucleotides and being present either identically or with at least 90 % homology, and preferably at least 95 % homology, in U14 sequences of eukaryotes, including yeast.

In the context of the invention "non-translated" signifies a sequence which is present in the genome and in transcribed RNA, but which is not translated into a protein sequence. The U14 gene coding sequence is processed from an RNA precursor and is addressed as a stable molecule to the nucleolus.

In the context of the present invention, % homologies are calculated as the number of identical bases in the alignment of 2 sequences, divided by the length of the longest sequence if the 2 sequences are not of the same length.
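
As an illustration only, the definition above can be sketched in Python. The function name `percent_homology`, the use of "-" as a gap character and the requirement that the two inputs are already aligned are assumptions made for this sketch; the document does not prescribe an alignment method.

```python
def percent_homology(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identical bases in the alignment divided by the length of the longer sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same number of columns")
    a, b = aligned_a.upper(), aligned_b.upper()
    # Count alignment columns in which both sequences carry the same base.
    identical = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    # Sequence lengths exclude the gap characters introduced by the alignment.
    longest = max(len(a.replace(gap, "")), len(b.replace(gap, "")))
    if longest == 0:
        raise ValueError("both sequences are empty")
    return 100.0 * identical / longest


# The two Box C variants named in this document differ at one of six positions.
print(round(percent_homology("TGATGA", "TGATGT"), 1))
```

With this definition, a 6-nucleotide sequence aligned against an 8-nucleotide sequence can reach at most 75 % homology, because the divisor is always the longer length.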

Thus, the invention firstly concerns isolated nucleic acid sequences having the same sequence and structure as the U14 genes (small nucleolar RNA's) occurring naturally in the plant genome and in transcribed RNA, and/or the flanking intergenic sequences of the U14 genes.

According to this embodiment, sequences of the invention, as well as those illustrated in the figures, can be isolated using the above phylogenetically conserved sequences I, II or III, preferably II and III, or a sequence having about 90%, for example 95%, homology with these sequences, as primers in a nucleic acid amplification reaction, e.g. Polymerase Chain Reaction (PCR), preferably reverse-transcriptase PCR as described by Simpson et al., 1992, from plant total RNA preparations (see examples below). For example, RT-PCR is carried out, using conserved sequences II and III as primers, on RNA from various plant tissues. This normally gives products of around 70 to 100 base pairs, corresponding to an internal fragment of the U14 coding sequence. Next, primers can be designed from the products of the RT-PCR, to allow amplification between closely linked U14 genes. This normally gives products in the 150 to 300 bp size range. Alternatively, hybridisation probes corresponding to the conserved regions can be used to screen genomic plant DNA libraries.
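
The expected size of the internal RT-PCR fragment can be checked in silico. The following Python sketch is not part of the described method: it simply locates conserved sequences II and III on the maize U14 sequence given in this document and returns the segment they span. The helper names `site_pattern` and `expected_product` are hypothetical, and both sites are searched on the coding strand (in practice the downstream primer would be the reverse complement of sequence III).

```python
import re

# IUPAC codes used in sequences II and III, written as regular-expression classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

SEQ_II = "CATTCGCAGTNNCCNCCTAAGA"  # conserved sequence II (upstream site)
SEQ_III = "CCTTCCTNGGATGTCTGA"     # conserved sequence III (downstream site)

# Maize U14 sequence as given in this document.
MAIZE_U14 = (
    "TATGGCAATGATGTTGAAGTTAAAGGCTTGTTTCTCAACAT"
    "TCGCAGTAGCCGCCTAAGAGCTTTCGCCCTGCCAGGCTTGA"
    "GAGCTTGTGCTGTTTAATCCTTCCTTGGATGTCTGAGCCATA"
)


def site_pattern(degenerate: str) -> "re.Pattern[str]":
    """Compile a degenerate sequence into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in degenerate))


def expected_product(template: str, upstream: str, downstream: str):
    """Return the template segment spanning both sites, or None if one is absent."""
    up = site_pattern(upstream).search(template)
    if up is None:
        return None
    # Look for the downstream site only after the end of the upstream site.
    down = site_pattern(downstream).search(template, up.end())
    if down is None:
        return None
    return template[up.start():down.end()]


product = expected_product(MAIZE_U14, SEQ_II, SEQ_III)
print(len(product))  # 80 for this maize sequence
```

The 80-nucleotide result falls within the range of around 70 to 100 base pairs stated above.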

The U14 sequences of the invention (also referred to as "U14 coding sequences") are easily identifiable in the amplification products, since they contain at least one, and usually all three, of the above conserved sequences and are bordered at each extremity by a pair of inverted repeats. They are also characterised by the absence of known eukaryotic promoters, and their potential to bind the nucleolar protein fibrillarin.

The total length of the U14 sequence of the invention including inverted repeats, varies according to the plant species from which it originates, and according to the particular U14 gene, but is generally between 50 and 150 nucleotides long. Typical lengths are between 100 and 150 nucleotides. The inverted repeats are usually 4 to 20 nucleotides long.

The conserved sequence I defined above is also known as Box C, and is most commonly TGATGA but can also be TGATGT. It occurs at the 5' end of the U14 sequence, separated from the 5' inverted repeat sequence by 2 or 3 nucleotides.

The conserved sequence II is roughly central in the U14 sequence and contains the universal 18-SA element CATTCGCATNNC (Li and Fournier, 1992) complementary to the 18-S rRNA.

The conserved sequence III is at the 3' end of the U14 sequence, containing the "Box D" element, GTCTGA, an element universal in all U14 sequences identified to date. The conserved sequence III also contains part of the putative 18-SB element CCTTCCT (Li and Fournier, 1992).
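
The layout of these elements can be illustrated on the maize U14 sequence given in this document. The Python sketch below is an assumption-laden illustration: the regular expressions are hand-written from sequences I to III, the function name `locate_elements` is hypothetical, and the inverted repeat length of 6 refers to the maize TATGGC / GCCATA pair.

```python
import re

# Maize U14 sequence as given in this document.
MAIZE_U14 = (
    "TATGGCAATGATGTTGAAGTTAAAGGCTTGTTTCTCAACAT"
    "TCGCAGTAGCCGCCTAAGAGCTTTCGCCCTGCCAGGCTTGA"
    "GAGCTTGTGCTGTTTAATCCTTCCTTGGATGTCTGAGCCATA"
)
INVERTED_REPEAT_LENGTH = 6  # TATGGC at the 5' end, GCCATA at the 3' end

# Conserved elements written as regular expressions (N becomes [ACGT]).
ELEMENTS = {
    "Box C": r"TGATG[AT]",
    "conserved sequence II": r"CATTCGCAGT[ACGT]{2}CC[ACGT]CCTAAGA",
    "conserved sequence III": r"CCTTCCT[ACGT]GGATGTCTGA",
    "Box D": r"GTCTGA",
}


def locate_elements(seq: str) -> dict:
    """Map each element found to its (start, end) span, 0-based and end-exclusive."""
    spans = {}
    for name, pattern in ELEMENTS.items():
        match = re.search(pattern, seq)
        if match:
            spans[name] = match.span()
    return spans


spans = locate_elements(MAIZE_U14)
# Number of nucleotides between the 5' inverted repeat and Box C.
gap_after_5prime_repeat = spans["Box C"][0] - INVERTED_REPEAT_LENGTH
for name, span in spans.items():
    print(name, span)
print("gap before Box C:", gap_after_5prime_repeat)
```

In this sequence Box C follows the 5' inverted repeat after 2 nucleotides, and Box D ends directly before the 3' inverted repeat.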

The U14 sequences of the invention contain, in addition to the phylogenetically conserved sequences defined above, one or more plant specific regions. In the context of the present invention "plant-specific" signifies that the region has less than 40% and preferably less than 30 % or 20% homology with U14 sequences from other eukaryotes, particularly metazoans and yeast. The plant specific regions identified so far have no homologous counterparts in any known U14 sequences. They can vary in length from about 5 or 6 nucleotides to 50 or 60 or more. Typical lengths are between 20 to 40 nucleotides, for example 30 to 35.

The U14 sequences typically contain two plant-specific regions, one major one having a length of about 35 nucleotides positioned between the 3' and central conserved sequences, and a minor one, having 7 or 8 nucleotides, immediately 3' to the Box C conserved sequence. Within the plant kingdom, these sequences are generally conserved with a homology of at least 70%. The major and minor plant specific regions often have the following sequences, respectively :

GCCNNGCCAGGCTNGAGAGNTNNTGCTGNNNNAT and

YRNARNN wherein N represents any one of the nucleotides A, C, G or T, Y is a pyrimidine and R is a purine. Examples of the major plant specific sequences are :

GCCNYGCCAGGCTYGAGAGNTNRTGCTGYNNNAT. Typically, the major and minor plant specific regions have the following sequences, respectively :

GCCNTGCCAGGCTTGAGAGNTNNTGCTGNNNAAT and

TGAAGTT or alternatively they have a sequence having at least 70%, and preferably 80 or 90% homology with one of these sequences. Within a species, homology of the plant-specific region can be as high as 90 % or 95 %. The minor plant specific sequence has no counterpart in yeast or vertebrate U14 genes. The major conserved sequence has a counterpart in yeast although the degree of homology between these two regions in plants and yeast is very low. In vertebrates, this region is missing completely.
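
The relationship between the general and the more specific plant-specific sequences listed above can be checked position by position. The Python sketch below treats each degenerate code as a set of allowed bases; the function name `is_instance_of` is hypothetical, and `MAIZE_MAJOR` is the 34-nucleotide stretch read off the maize U14 sequence given in this document.

```python
# Degenerate codes used in this document, as sets of allowed nucleotides.
IUPAC_SETS = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "Y": {"C", "T"},            # pyrimidine
    "R": {"A", "G"},            # purine
    "N": {"A", "C", "G", "T"},  # any nucleotide
}

MAJOR_GENERAL = "GCCNNGCCAGGCTNGAGAGNTNNTGCTGNNNNAT"
MAJOR_EXAMPLE = "GCCNYGCCAGGCTYGAGAGNTNRTGCTGYNNNAT"
MAJOR_TYPICAL = "GCCNTGCCAGGCTTGAGAGNTNNTGCTGNNNAAT"
MAIZE_MAJOR = "GCCCTGCCAGGCTTGAGAGCTTGTGCTGTTTAAT"
MINOR_GENERAL = "YRNARNN"
MINOR_TYPICAL = "TGAAGTT"


def is_instance_of(candidate: str, consensus: str) -> bool:
    """True if every position of `candidate` is permitted by `consensus`."""
    if len(candidate) != len(consensus):
        return False
    # A position fits when its allowed bases are a subset of the consensus set.
    return all(
        IUPAC_SETS[c] <= IUPAC_SETS[k] for c, k in zip(candidate, consensus)
    )


for name, seq in [("example", MAJOR_EXAMPLE), ("typical", MAJOR_TYPICAL),
                  ("maize", MAIZE_MAJOR)]:
    print(name, is_instance_of(seq, MAJOR_GENERAL))
print("minor", is_instance_of(MINOR_TYPICAL, MINOR_GENERAL))
```

Because the check works on sets, it accepts both fully specified sequences (such as the maize region) and partially degenerate ones (such as the typical sequence) as instances of the general formula.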

The U14 sequences of the invention further contain, at the extremities of the coding sequence, a pair of inverted repeats, which usually have a length of between 4 and 20, for example 6 to 12 nucleotides, and which are complementary to each other to allow the formation of a stem by base-pairing. The stem structure is thought to protect the U14 gene from digestion by RNase. Examples of such inverted repeats are :

TATGGC and GCCATA,

TXYZTTGC, wherein X = G or C, Y = C or T and Z = A or T, with its complement,

TTGGGGGATT and AATCCCCCAA.

In the above sequences, XYZ is often GCA or CTT.
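
That each pair listed above can form a base-paired stem is easy to verify computationally. The Python sketch below is an illustration under stated assumptions: it considers only Watson-Crick pairs written in DNA letters (G-U wobble pairs in the RNA are ignored), and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Maize U14 sequence as given in this document.
MAIZE_U14 = (
    "TATGGCAATGATGTTGAAGTTAAAGGCTTGTTTCTCAACAT"
    "TCGCAGTAGCCGCCTAAGAGCTTTCGCCCTGCCAGGCTTGA"
    "GAGCTTGTGCTGTTTAATCCTTCCTTGGATGTCTGAGCCATA"
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def terminal_stem_length(seq: str) -> int:
    """Count consecutive pairs formed between the 5' end and the 3' end."""
    length = 0
    # Pair the first base with the last, the second with the second-last, etc.
    for five_prime, three_prime in zip(seq, reversed(seq)):
        if COMPLEMENT[five_prime] != three_prime:
            break
        length += 1
    # A stem cannot use more than half of the molecule.
    return min(length, len(seq) // 2)


print(reverse_complement("TATGGC"))      # GCCATA
print(reverse_complement("TTGGGGGATT"))  # AATCCCCCAA
print(terminal_stem_length(MAIZE_U14))   # 6
```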

The structure of a typical plant U14 sequence is shown in figures 1 and 2.

The sequences of the invention may comprise, in addition to the U14 sequence, or instead of the U14 sequence, a sequence "F" which corresponds in the plant genome to the non-translated flanking sequence which is immediately adjacent to the U14 sequence.

The sequence F is also referred to as an intergenic region, in the case of polycistronic U14 genes, because it separates one U14 gene from the next. If plant U14 genes are intron encoded, the sequence F will separate the U14 sequence from the adjacent exons. Normally, therefore, each U14 gene is immediately flanked, on the 5' and 3' sides, by a flanking sequence. The 5' and 3' "F" sequences may be identical to each other or different.

Where the sequences of the invention are obtained by amplification of genomic DNA, or by RT-PCR of total plant RNA, the sequences "F" can easily be identified among the amplification products by their position immediately 5' and 3' to the U14 sequence, more particularly, immediately 5' and 3' to the inverted repeats which delimit the U14 sequences. The "F" sequence extends from the inverted repeats up to either the adjacent U14 sequence, in the case of polycistronic organisation, or potentially the adjacent exon, which can be identified by comparison with the mRNA or cDNA of the same genomic sequence, the U14 and flanking sequences not being present in the mRNA.

The flanking sequences "F" vary in length, for example from about 20 to 140 nucleotides, typically 20 to 50, in the case of polycistronic U14 genes, and from 20 to 500 or more in the case of monocistronic U14 genes.

Since the plant U14 genes are most probably in introns, the intergenic sequences F are commonly AU rich. Although there is, in general, no noticeable conservation between flanking regions of different species, it has been shown, by the inventors, that within a species, particularly for polycistronic U14 genes, a high degree of homology, for example 70% or more, is exhibited between flanking sequences. For example, in maize, the following two sequences occur in a plurality of U14 flanking sequences :

TTCTTGTCCAGCTC and

CCTATGTTTGATACTTGT

The sequences of the invention may be composed of a U14 sequence alone, an intergenic or flanking sequence "F" alone, or a combination of U14 and F sequences. Particularly preferred sequences have the structure [Fa-Ub-Fc]n, in which a, b and c each have the value 1, and n is an integer from 1 to 200, preferably 1 to 10, for example 3, 4 or 5. This type of structure corresponds to the polycistronic genomic organisation naturally occurring in plant genomes. Also preferred are variants wherein only one F sequence is present, i.e. a or c in the above general formula is 0.
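
The general formula can be pictured as simple string assembly. The Python sketch below is illustrative only: the function name `build_construct` is hypothetical, the restriction of a, b and c to the values 0 or 1 is an assumption made for the sketch, and the single letters used in the usage lines are placeholders rather than real sequences.

```python
def build_construct(f5: str, u14: str, f3: str,
                    a: int = 1, b: int = 1, c: int = 1, n: int = 1) -> str:
    """Assemble [Fa-Ub-Fc]n as a plain string.

    f5 and f3 are the 5' and 3' flanking sequences, u14 the U14 sequence.
    """
    for name, value in (("a", a), ("b", b), ("c", c)):
        if value not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1 in this sketch")
    if a + b + c < 1:
        raise ValueError("a + b + c must be at least 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    # One repeating unit: optional 5' flank, optional U14, optional 3' flank.
    unit = (f5 if a else "") + (u14 if b else "") + (f3 if c else "")
    return unit * n


# Placeholder letters: 'f' and 'g' stand for flanking sequences, 'U' for U14.
print(build_construct("f", "U", "g", n=3))  # fUgfUgfUg
print(build_construct("f", "U", "g", c=0))  # fU
```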

A typical example of a U14 sequence of the invention is the maize U14 sequence :

TATGGCAATGATGTTGAAGTTAAAGGCTTGTTTCTCAACAT TCGCAGTAGCCGCCTAAGAGCTTTCGCCCTGCCAGGCTTGA GAGCTTGTGCTGTTTAATCCTTCCTTGGATGTCTGAGCCATA, or a sequence exhibiting at least 70 % homology, preferably at least 80 %, with this sequence. Other examples are shown in the figures. U14 sequences of monocotyledonous and dicotyledonous plants, for example the cereals, e.g. maize, wheat, barley, oats, rye, etc., and of the Solanaceae (e.g. potato), the Cruciferae and the Cucurbitaceae are particularly preferred.

In addition to the U14 and "F" sequences defined above, the sequences of the invention may also include sequences other than F or U14 sequences, for example, vector sequences, linkers, adaptors, restriction enzyme sites, etc.

In a preferred embodiment, the invention also relates to derivatives of the genomic-type sequences defined above, said derivatives comprising fragments of either or both of the F and U14 sequences, or of sequences showing at least 70% and preferably at least 80% or 85% homology with the F or U14 sequences. The fragments may be combined with other types of nucleic acid sequence. Advantageously the fragments have a length of between 6 and 200 nucleotides, for example 6 to 50.

According to a preferred variant, the U14 derivative comprises or consists of the plant-specific region of the U14 sequence as defined above, or a fragment comprising at least 6, particularly at least 10 or 20, consecutive nucleotides of this region. Preferably, the plant-specific fragment has a length of 6 to 25, for example 6 to 10 nucleotides.

For example, according to this variant, the plant-specific fragments may comprise at least 6 consecutive nucleotides of one of the following sequences :

YRNARNN or

GCCNNGCCAGGCTNGAGAGNTNNTGCTGNNNNAT wherein N represents any nucleotide, Y represents a pyrimidine and R represents a purine, or of a sequence exhibiting at least 70% and preferably at least 80% homology with one of these sequences.

The U14 sequence derivative may also be a sequence comprising or consisting of one or both of the inverted repeats found at the 5' and 3' extremities of the U14 sequence in the plant genome, as described above.

According to this variant, the length of the inverted repeat fragment may correspond to that naturally occurring in the plant genome, for example between 4 and 20 nucleotides, or may be lengthened or shortened by addition or deletion of nucleotides. If the inverted repeats are lengthened by addition of nucleotides, base-pairing between the two halves of the repeat must be conserved.

The U14 sequence derivative may also be a fragment of the flanking or intergenic sequence "F" comprising or consisting of at least 6, and preferably at least 10 or 20, consecutive nucleotides of the sequence F, or of a sequence having at least 70%, and preferably at least 80% homology with the F sequence occurring in the plant genome. For example, the F sequence fragment may comprise or consist of at least 6 consecutive nucleotides of one of the following sequences :

TTCTTGTCCAGCTC or

CCTATGTTTGATACTTGT

According to a most preferred embodiment, the sequence of the invention comprises a nucleic acid sequence containing or consisting of an association of fragments as defined above, i.e. containing or consisting of at least two of the sequences derived from the F-U14 sequences, as defined above, for example a plant specific sequence or fragment thereof in association with an inverted repeat or a fragment of the flanking or intergenic sequence "F" with an inverted repeat. In this case, the inverted repeat may correspond to an inverted repeat naturally occurring in the plant U14 gene, or alternatively, may be any inverted repeat capable of forming a base-paired stem. The sequence comprising or consisting of an association of fragments of the invention may also contain one or more of the sequence derivatives as defined above, in association with all or part of one of the phylogenetically conserved regions. According to this latter variant, particularly preferred associations are the Box C and Box D regions in association with an inverted repeat, or with a flanking sequence fragment. Another example is the 18-SA or 18-SB regions in association with an inverted repeat, optionally also with parts of the flanking sequence. It is possible, according to this embodiment of the invention, to associate in one sequence fragments originating from different plants, for example inverted repeats from maize with a plant specific sequence from barley or another cereal, etc., giving rise to hybrid U14 genes and fragments, or even to create hybrids originating from different classes of organisms, e.g. plant/human or plant/yeast. According to this latter variant, the hybrids preferably contain at least part of the plant specific region defined above, or its complementary sequence, and/or a plant U14 inverted repeat, in association with a U14 sequence, fragment or derivative from a eukaryotic organism other than a plant.

The invention also concerns sequences which are complementary to any of the above sequences, derivatives and fragments. By "complementary" is to be understood a degree of complementarity sufficiently high, for example at least 80 or 90%, to allow hybridisation of the sequence to its complement in stringent conditions by standard techniques (Sambrook et al., 1989). For example :

- hybridisation on nylon membrane filters in 20 ml of hybridisation solution (%xSSPE, 0.5 % SDS, 5xDenhardt's, 20 μg/ml sonicated salmon sperm DNA),

- a pre-hybridisation for 1 hour at 65°C is followed by a hybridisation of 12-16 hours (solution containing labelled probe) at 65°C. The filters are washed as follows :

* 1 wash for 30 min in 2xSSC, 0.5%SDS at 65°C ;

* 1 wash for 20 min in 2xSSC, 0.1%SDS at 65°C ;

* 1 wash for 10 min in 0.5xSSC, 0.1%SDS at 45°C

Most preferably, the sequences are 100% complementary, possibly with 3 or 4 mismatches, at most.

This embodiment of the invention includes antisense sequences to plant whole U14 genes and flanking or intergenic sequences, as well as antisense to all of the derivatives and fragments defined above.

Also included in this embodiment of the invention are ribozymes, preferably of the hammerhead or hairpin type, whose hybridising arms are complementary to the U14 sequences and derivatives as defined above. For details of hammerhead and hairpin ribozymes see European and International patent applications EP-A-321201 and WO-A-9119789 respectively. The ribozymes are preferably capable of cleaving the U14 coding sequences or intergenic regions.

The sequences of the invention may be either RNA or DNA, or a mixture of the two.

The sequences of the invention have a number of applications and uses. The identification, by the inventors, of the existence of plant U14 sequences showing regions of homology with vertebrate and yeast U14 sequences signifies that U14 sequences of any eukaryotic origin, and fragments and derivatives thereof, can be used in plants, e.g. in transformation. Furthermore, plant U14 sequences and fragments and derivatives thereof as defined above can be used in other eukaryotes, e.g. yeast or mammals, particularly in transformation.

Any of the above sequences, derivatives and fragments can be used, optionally labelled with a detectable marker, as a probe to detect plant U14 sequences, or as a primer in a nucleic acid amplification reaction for amplification of sequences within and adjacent to plant U14 genes. As labels, all the conventional systems such as radio-labelling, enzyme-labelling, fluorescein and biotin, etc., can be used.

Use can be made of the particular functions of certain parts of the U14 genes and flanking sequences, for example the inverted repeats can be used as stabilisers to protect heterologous RNA from digestion by RNase.

The anti-sense and ribozyme variants of the invention, since they are capable of inactivating or cleaving, respectively, the U14 genes and flanking sequences, are useful in the regulation of the processing of pre-rRNA transcripts and ribosome formation. Indeed, it has been shown in yeast (reviewed by Filipowicz and Kiss, 1993) that impairment of U14 results in impairment of 18S rRNA production, and accumulation of 23S rRNA. Inactivation or cleavage of U14 genes by ribozymes therefore results in cell death due to impairment of these essential functions.

The invention further relates to transformation vectors containing any of the sequences of the invention, suitable for stably transforming plant cells, and to the transformed cells thus obtained. Conventional transformation techniques are applied. Transgenic plants, obtained by regeneration of the transformed cells, also fall within the scope of the invention, particularly plants capable of producing the U14 anti-sense and ribozyme sequences discussed above. As examples of plants capable of being transformed by the sequences of the invention, mention can be made of monocotyledons and dicotyledons such as the cereals e.g. wheat, maize, barley, rice, etc. ; Brassicae e.g. oilseed rape, cabbage, broccoli, etc. ; Cucurbitaceae e.g. melon, cucumber ; and also sunflower, soya, etc.

Different aspects of the invention are illustrated in the figures :

Figure 1 is a schematic representation of the different regions of mouse, yeast and plant U14 snoRNAs. Thick boxes represent conserved regions, as do the Boxes C and D.

Figure 2 shows the aligned U14 coding sequences of Yeast, Mouse, Maize (MU14.1a → MU14.4) and Potato (PU14.1), the conserved and specific regions of the sequences being illustrated. The consensus sequences I, II and III are shown. Conserved region II can be extended both in the 5' and 3' direction as shown. Putative 18S-A and 18-SB regions are also marked. PL.sp. = Plant specific. X = region highly conserved between plants and yeast but absent in vertebrates. The minor plant specific region immediately 3' to Box C is absent in all other classes of organisms so far tested and shows around 60 % homology between plant species. The major plant specific region is conserved with at least 70 % homology between species and about 90 to 95 % within a species. Phylogenetically conserved regions II and III have at least 70 % homology (often at least 90 % homology in the case of region III) between classes of organisms. Capital letters indicate the presence of the base in all positions ; lower-case letters indicate the presence of the base in all but one position.

Figure 3 shows a strategy for cloning U14 snRNA genes from plants. In the first stage, primers 1 and 2, specific to conserved sequences in animal and yeast U14, were used in RT-PCR reactions with plant RNA. The sequences of the generated products (approximately 90 bp) were used to design primers for amplifying between closely linked genes using total plant DNA in stage 2.

Figure 4 shows the alignment of Plant (PL), Yeast (YST) and vertebrate (VERT) U14 snoRNA consensus sequences.

Figure 5 shows the alignment of plant (MU = Maize, PU = potato), yeast and vertebrate U14 snoRNA sequences, with the consensus sequence.

Figure 6 shows the polycistronic Maize U14.1 sequences, including flanking intergenic sequences and inverted repeats, as obtained by sequencing of fragments of isolated genomic clones. This transcript contains four full U14 sequences MU14.1a, MU14.1b, MU14.1c and MU14.1d. Inverted repeats and Box C and D elements are marked.

Figure 7 shows the maize single U14 sequence U14.4, with flanking intergenic sequences, as obtained by sequencing of fragments of isolated genomic clones. Inverted repeats and Box C and D elements are marked.

Figure 8 shows the potato single U14 sequence PU14.1 with flanking intergenic sequences, as obtained by sequencing of fragments of isolated genomic clones. Inverted repeats and Box C and D elements are marked.

Figure 9 shows approximately 80 nucleotides of the 3' end of a U14 gene from Arabidopsis thaliana, in cDNA. Box D element is marked.

EXAMPLES

Details of method :

Total RNA preparations of potato "Cara" and maize "Kelvedon Glory" were prepared by standard procedures. For reverse transcriptase-polymerase chain reaction (RT-PCR), the RNA preparations were extensively treated with RNase-free DNase RQ1 prior to first strand cDNA synthesis with primer II, 5'-TCAGACATCCAAGGAAGG-3', complementary to a conserved region at the 3' end of mouse and yeast U14. PCR reactions were carried out following addition of a second oligonucleotide, primer I, 5'-CCATTCGNNGTTTCCAC-3', homologous to mouse and yeast U14. RT-PCR reactions were carried out by standard procedures (Simpson et al., 1992). Amplified products were isolated following agarose gel electrophoresis, subcloned into pGEM 3zf(+) (Promega) and sequenced.
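The relationship between primers I and II and the conserved regions II and III given in the Abstract can be checked with a short script. The following is only an illustrative sketch, not part of the described method: it assumes that N acts as a wildcard on either side of the comparison, and the function names are hypothetical.

```python
# Sketch: compare primers I and II with conserved regions II and III.
# Assumption: "N" is treated as matching any base, in primer or consensus.

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (N stays N)."""
    return seq.translate(COMPLEMENT)[::-1]


def compatible(a: str, b: str) -> bool:
    """Two bases are compatible if they are equal or either one is N."""
    return a == b or a == "N" or b == "N"


def find_match(query: str, target: str) -> int:
    """Return the first ungapped offset of query in target, or -1."""
    for offset in range(len(target) - len(query) + 1):
        window = target[offset:offset + len(query)]
        if all(compatible(q, t) for q, t in zip(query, window)):
            return offset
    return -1


PRIMER_I = "CCATTCGNNGTTTCCAC"           # homologous to U14
PRIMER_II = "TCAGACATCCAAGGAAGG"          # complementary to the U14 3' end
CONSENSUS_II = "CATTCGCAGTNNCCNCCTAAGA"   # conserved region II
CONSENSUS_III = "CCTTCCTNGGATGTCTGA"      # conserved region III

# Primer II is complementary to the RNA, so compare its reverse complement.
primer_ii_target = reverse_complement(PRIMER_II)
offset_iii = find_match(primer_ii_target, CONSENSUS_III)

# Primer I carries one extra 5' C relative to region II as written,
# so the comparison starts from its second base.
offset_ii = find_match(PRIMER_I[1:], CONSENSUS_II)

print(primer_ii_target, offset_iii, offset_ii)
```

Under these assumptions the reverse complement of primer II coincides with region III over its full length, and primer I (minus its first base) aligns with the start of region II.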

Primer III, 5'-CCCGGGAGCTCAGGCTTGAGAGCTAGTGC-3', and primer IV, 5'-GAATTCTGCAGCANGGCGAAAGCTCTTAG-3' (or primers corresponding to III and IV but lacking the first 10 5' nucleotides), were designed on the basis of the maize RT-PCR amplified sequences and were used to amplify intergenic sequences between adjacent, closely linked U14 genes in both PCR and RT-PCR reactions.

Primers V, 5'-CCCGGGAGCTCAGGCTTGAGAGCATATGC-3', and VI, 5'-GAATTCTGCAGCGAAGGCGAAAGCACTTAG-3', were used in PCR and RT-PCR reactions with potato DNA and RNA.
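The text does not say why primers III to VI carry about 10 extra nucleotides at the 5' end. As a hedged illustration only, the sketch below scans the primers for four common restriction enzyme recognition sequences and shows the primer cores left after removing the first 10 nucleotides. The choice of enzymes is an assumption of this sketch, not a statement from the text.

```python
# Sketch: look for restriction recognition sequences in primers III to VI.
# Assumption: the enzymes listed here are chosen for illustration only.

SITES = {
    "SmaI": "CCCGGG",
    "SacI": "GAGCTC",
    "EcoRI": "GAATTC",
    "PstI": "CTGCAG",
}

PRIMERS = {
    "III": "CCCGGGAGCTCAGGCTTGAGAGCTAGTGC",
    "IV": "GAATTCTGCAGCANGGCGAAAGCTCTTAG",
    "V": "CCCGGGAGCTCAGGCTTGAGAGCATATGC",
    "VI": "GAATTCTGCAGCGAAGGCGAAAGCACTTAG",
}


def sites_in(seq: str) -> list[tuple[str, int]]:
    """Return (enzyme, 0-based position) for each site found, by position."""
    hits = []
    for enzyme, site in SITES.items():
        pos = seq.find(site)
        if pos != -1:
            hits.append((enzyme, pos))
    return sorted(hits, key=lambda hit: hit[1])


def trimmed(seq: str, n: int = 10) -> str:
    """Return the primer lacking its first n 5' nucleotides."""
    return seq[n:]


for name, seq in PRIMERS.items():
    print(name, sites_in(seq), trimmed(seq))
```

In this sketch all recognition sequences found lie within the first 11 nucleotides of each primer, which is consistent with (but does not prove) the 5' extensions serving as cloning tails.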

Materials :

Restriction enzymes and T4 DNA ligase were purchased from Boehringer (Mannheim), Pharmacia and Promega. SP6 RNA polymerase, T7 RNA polymerase and RNase inhibitor were from Pharmacia. Radionucleotides, T4 polynucleotide kinase and Hybond N+ filters were obtained from Amersham. RNase A, RNase T1 and Taq DNA polymerase were purchased from Boehringer-Mannheim. MMLV reverse transcriptase and RNase-free DNase RQ1 were obtained from Gibco-BRL and Sequenase from United States Biochemicals.

RNA and DNA extraction :

RNA was isolated from leaf tissue of the potato (Solanum tuberosum L.) cultivars Cara and Desiree, and from leaf, ovule and silk material of the maize (Zea mays L.) variety Kelvedon Glory. DNA was extracted from leaf tissue of Cara and Kelvedon Glory. For Northern analysis, total RNA was isolated from leaf tissue of maize, wheat, barley, asparagus, potato, bean, cucumber and bird's nest fern (Asplenium nidulans). All techniques were standard (Sambrook et al., 1989).

Primer extension :

Mapping of the 5' end of U14 transcripts by primer extension was carried out on DNase-treated total RNA preparations using [32P]-end-labelled oligonucleotide primers :

PotPex 1 : 5'-TTAGGCGGCCACTGCGAATG-3' ;

MzPex 1 : 5'-TTAGGCGGCAACTGCGAATG-3' ;

MzPex 2 : 5'-TTAGGCGGCTACTGCGAATG-3'.

The two maize primers differed in a single nucleotide to reflect sequence variation seen in the cloned maize U14 genes.
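The single-nucleotide difference between the primer extension primers can be located with a simple position-by-position comparison. This is a minimal sketch for illustration; the helper name is hypothetical and positions are counted from the 5' end, starting at 1.

```python
# Sketch: list the positions at which two equal-length primers differ.

PEX_PRIMERS = {
    "PotPex 1": "TTAGGCGGCCACTGCGAATG",
    "MzPex 1": "TTAGGCGGCAACTGCGAATG",
    "MzPex 2": "TTAGGCGGCTACTGCGAATG",
}


def mismatches(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in a, base in b) for each mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [
        (i + 1, x, y)
        for i, (x, y) in enumerate(zip(a, b))
        if x != y
    ]


maize_diff = mismatches(PEX_PRIMERS["MzPex 1"], PEX_PRIMERS["MzPex 2"])
potato_diff = mismatches(PEX_PRIMERS["PotPex 1"], PEX_PRIMERS["MzPex 1"])

print(maize_diff)   # difference between the two maize primers
print(potato_diff)  # difference between the potato primer and MzPex 1
```

With the sequences as listed, the two maize primers differ only at position 10, and the potato primer differs from each maize primer at that same position.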

Primers were end-labelled with polynucleotide kinase and gel purified. Primer extension was carried out by standard procedures (Sambrook et al., 1989) and products were separated on DNA sequencing gels.

RNase A/T1 protection analysis :

[32P]-labelled RNA transcripts complementary to maize and potato U14 genes were prepared from linearised plasmids as described previously (Waugh et al., 1991). RNase A/T1 protection analyses were performed on 5 μg of total RNA as described by Goodall and Filipowicz, 1989, and products were separated on DNA sequencing gels.

Nucleolar localisation of U14 snRNAs in plant tissue :

Anti-sense U14 probes corresponding to maize and potato U14 coding/flanking sequences and incorporating digoxigenin-UTP were generated by in vitro transcription, and used as in situ probes on pea root cells as described by Highett et al., 1993. This showed a clear localization to the nucleoli with little or no labelling of the rest of the nucleus or the cytoplasm, thus confirming that this sequence is a nucleolar snRNA.

RESULTS :

Isolation of U14 sequences from maize and potato by PCR :

Expressed plant U14 sequences were amplified by RT-PCR from total RNA preparations of maize and potato using primers complementary to highly conserved regions of mouse and yeast U14s (primers I and II). Amplified products of the expected size of approximately 80 bp were subcloned and sequenced. Four maize and two potato sequences showed homology to mouse and yeast U14 sequences, suggesting that putative plant U14 sequences had been generated. To amplify more of the U14 sequence in the 5' and 3' direction from U14 coding sequences, two further primers (primers III and IV) were designed using the above plant RT-PCR sequences. PCR reactions with both potato and maize DNA generated amplified DNA products of only 50-350 bp, which represented the amplification of sequences between adjacent, closely linked genes. The PCR products were subcloned and the sequences of three maize and five potato clones were obtained. Each of the clones contained approximately 45 bp of the 3' end of the upstream gene and approximately 70 bp of the 5' end of the downstream U14 gene. The PCR reaction amplified across the regions containing the initial pair of primers, confirming their presence and similarity to mouse and yeast U14s.

Isolation of maize and potato U14 genomic clones :

Screening of maize and potato genomic libraries with a maize U14 and potato U14 probe generated by RT-PCR identified a number of positive clones. Two maize and one potato genomic clone were purified and hybridising fragments subcloned into pUC19. Maize U14.1 gave an extremely strong hybridising signal and was subcloned on four consecutive PstI fragments of 1.1, 1.0, 0.7 and 0.065 kb, which were sequenced. Four complete U14 coding sequences and a fragment of a fifth coding sequence were identified in a 760 bp region. The four apparently intact genes were called MzU14.1a-d. In order to isolate gene specific clones, sub-fragments were subcloned into pGEM3zf+. i) Gene (a) was cloned as a 432 bp RsaI fragment (pgMU14.1a) and contained 276 bp of upstream flanking sequence ; ii) Gene (b) was isolated as a 266 bp RsaI fragment (pgMU14.1b) and contained the U14.1b coding sequence fragment and 35 bp of upstream and 101 bp of downstream flanking sequence ; iii) The whole of gene (d), the 3'-most two thirds of gene (c) and 327 bp of downstream flanking sequence were isolated on a 581 bp TaqI/BglII fragment (pgMU14.1d).

The second maize genomic clone, MzU14.4, was subcloned as a 2.2 kb EcoRI fragment. This clone contained a single U14 coding sequence which was subcloned as a 750 bp PstI/EcoRI fragment and a 254 bp NsiI/BfrI fragment. Similarly, the potato genomic clone, StU14.1, contained a single U14 coding sequence which was subcloned as an 873 bp EcoRI/PstI fragment and a 438 bp EcoRI/HindIII fragment. The coding sequences and extensive flanking sequences of all of the genomic U14 genes were generated by standard procedures.

Putative plant U14 genomic sequences show regions of homology with vertebrate and yeast U14s :

The sequences of the four apparently full length U14 coding regions of maize U14.1, and the single genes of maize U14.4 and potato U14.1, can be aligned with yeast and mouse U14 sequences. The putative coding regions have been defined by the inverted repeat sequences adjacent to the Box C and Box D sequences. All six plant sequences show high homology with extensive regions of identity to one another. While sequence variation exists among the plant, yeast and animal U14 sequences, there are regions of virtual identity : the Box C sequence and a 19 nt region, including Box D, lying directly upstream of the 3' inverted repeat. These two regions were called regions 1 and 3 in the comparison of yeast and mouse U14 (Li and Fournier, 1992). In addition, a region of extensive similarity was identified in the 5' halves of the molecules and includes region 2 of Li and Fournier, 1992. However, the plant U14 sequence information allows a realignment of the mouse U14 central region which extends region 2 in the 3' direction for all of the sequences. Region 2 is also extended in the 5' direction in the comparison of the plant and yeast sequences. The new alignment gives the functionally essential yeast-specific sequence 1 (Li and Fournier, 1992) counterparts in both mouse and plant U14s, while the essential yeast-specific region 2 is still absent in mouse but overlaps the plant specific sequence. In addition, the plant sequences contain an additional 7 or 8 nucleotides between the 5' extended region 2 sequence and Box C, which are found only in plant U14s. The much longer central yeast-specific region, which is absent in mouse U14, has a counterpart in plant U14s, but the plant sequences are shorter and show little homology to the yeast sequence. The sequences and lengths of the inverted repeats in the plant genes vary ; there are 2-3 nt between the 5' IR and the Box C sequence, and the 3' IR is usually directly adjacent to the Box D sequence. The predicted sizes of the plant U14s are intermediate between those of mouse and yeast, with maize and potato U14s being approximately 117-120 nt long.

Northern analysis and primer extension mapping of 5' end of U14 transcripts :

From the sequence alignment, plant U14 snoRNAs were expected to be intermediate in size between yeast and mouse U14s. The sizes of maize, potato and a number of other plant U14s were estimated from Northern analysis of total RNA hybridised with the 316 bp EcoRI/HindIII fragment of pgMU14.1b containing the maize U14.1b gene. All of the plant species showed a hybridising RNA band of approximately 120 nt, consistent with the sequence data, with the exception of the fern, Asplenium nidulans, which showed a considerably larger U14 hybridising band of approximately 160 nt.

Primer extension analysis of total maize ovule RNA with primers complementary to the maize and potato sequences generated a number of major products. Two maize primers (differing by a single nucleotide) were used to reflect a single nucleotide difference between U14.1b, c and d (primer MzPex 1) and U14.1a and U14.4 (primer MzPex 2). The MzPex 1 primer produced a number of primer extension products in the size range of 54 to 59 nt, of which those of 56 nt were the most intense. All of the products mapped within the 5' inverted repeat. The MzPex 2 primer gave a narrower range of products (54 to 57 nt), of which the 56 nt band was again most intense, and all mapped to the 5' IR. Primer extension with total RNA from potato leaf using primer PotPex 1 gave a range of products from 56 to 60 nt which again mapped to the 5' IR. In addition to the major primer extension products described above, other products in the size range of approximately 80-350 nt were observed and were indicative of 5' extensions to at least some U14 transcripts. A control primer extension reaction with total maize ovule RNA and a primer complementary to maize U6 snRNA gave a single primer extension product of expected size.

Detection of polycistronic U14 transcripts by RT-PCR :

Intergenic PCR and the gene arrangement of MzU14.1 suggested that some U14 genes are clustered in both potato and maize, raising the possibility that these U14 genes are transcribed together as a polycistronic transcript and processed to individual U14 snoRNAs. RT-PCR on maize RNA using primer III for initial cDNA production and primers III and IV in the PCR reaction produced a number of labelled products of 135, 152, 180, 261, 266 and 272 nt. Similar reactions with potato RNA using primers V and VI gave products of 82, 118, 258, 277, 278 and approximately 360 nt. Control reactions without reverse transcriptase gave no products, showing no residual DNA in the RNA preparations and confirming that the RT-PCR products in lanes 1-4 were derived from RNA. Cloning and sequencing of some of the products from these intergenic (Ig) RT-PCR reactions confirmed the presence of 3' and 5' fragments of adjacent genes divided by an intergenic sequence. From the positions of the primers, the RT-PCR products would contain approximately 120 nt of coding region from the two adjacent genes, leaving intergenic regions in the range of 15-150 nt for maize and 140-240 nt for potato.
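The intergenic lengths quoted above follow from subtracting the approximately 120 nt of coding sequence from each RT-PCR product size. The arithmetic can be sketched as follows. This is an illustration only: the 120 nt figure is approximate, the 360 nt potato product is itself an estimate, and the treatment of products shorter than 120 nt (reported separately here) is an assumption of this sketch, since the text does not discuss them.

```python
# Sketch: estimate intergenic spacer lengths from RT-PCR product sizes.
# Assumption: each product carries about 120 nt of U14 coding sequence.

CODING_NT = 120

MAIZE_PRODUCTS = [135, 152, 180, 261, 266, 272]
POTATO_PRODUCTS = [82, 118, 258, 277, 278, 360]  # 360 is approximate


def spacer_lengths(products: list[int], coding: int = CODING_NT) -> list[int]:
    """Spacer estimate for every product longer than the coding portion."""
    return [size - coding for size in products if size > coding]


def short_products(products: list[int], coding: int = CODING_NT) -> list[int]:
    """Products too short for the subtraction to be meaningful."""
    return [size for size in products if size <= coding]


maize_spacers = spacer_lengths(MAIZE_PRODUCTS)
potato_spacers = spacer_lengths(POTATO_PRODUCTS)
potato_short = short_products(POTATO_PRODUCTS)

print(maize_spacers)
print(potato_spacers, potato_short)
```

The maize estimates span roughly 15 to 150 nt and the potato estimates for the four longer products span roughly 140 to 240 nt, in line with the ranges given in the text.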

RNase A/T1 protection mapping :

RNase A/T1 protection mapping using antisense RNA probes specific for MU14.1a, MU14.1b, MU14.1d, MU14.4 and PU14.1 detected full length products of 118-126 nt, indicating the presence of mature U14 transcripts from these genes (or genes virtually identical to them within the limitations of the RNase A/T1 technique) in total RNA. The production of full length products from those genes in the MU14.1 gene cluster, taken together with the detection of polycistronic U14 transcripts, would suggest that the individual U14s are processed from these larger RNA precursors.

References

1. Fournier M. J. and Maxwell E. S., 1993, Trends Biochem. Sci., 18, 131-135 ;

2. Zagorski J., Tollervey D. and Fournier M. J., 1988, Mol. Cell Biol., 8, 3282-3290 ;

3. Filipowicz W. and Kiss T., 1993, Mol. Biol. Rep., 149-156 ;

4. Jarmolowski A., Zagorski J., Li H. V. and Fournier M. J., 1990, Embo J., 9, 4503-4509 ;

5. Baserga S. J., Yang W. and Steitz J. A., 1991, Embo J., 10, 2645-2651 ;

6. Trinh-Rohlik Q. and Maxwell E. S., 1988, Nucl. Acids Res., 16, 6041-6056 ;

7. Liu J. and Maxwell E. S., 1990, Nucl. Acids Res., 18, 6565-6571 ;

8. Tycowski K. T., Shu M. D. and Steitz J. A., 1993, Genes Dev., 7, 1176-1190 ;

9. Fragapane P., Prislei S., Michienzi A., Caffarelli E. and Bozzoni I., 1993, Embo J., 12, 2921-2928 ;

10. Kiss T. and Filipowicz W., 1993, Embo J., 12, 2913-2920 ;

11. Kiss T. and Solymosy F., 1990, Nucl. Acids Res., 18, 1941-1949 ;

12. Marshallsay C., Kiss T. and Filipowicz W., 1990, Nucl. Acids Res., 18, 3459-3466 ;

13. Kiss T., Marshallsay C. and Filipowicz W., 1991, Cell, 65, 517-526 ;

14. Goodall G. J. and Filipowicz W., 1989, Cell, 58, 473-483 ;

15. Simpson C. G., Sinibaldi R. and Brown J. W. S., 1992, Plant J., 2, 835-836 ;

16. Highett M. I., Beven A. F. and Shaw P. J., 1993, J. Cell Sci., 105, 1151-1158 ;

17. Waugh R., Clark G., Vaux P. and Brown J. W. S., 1991, Nucl. Acids Res., 19, 249-256 ;

18. Prislei S., Michienzi A., Presutti C., Fragapane P. and Bozzoni I., 1993, Nucl. Acids Res., 21, 5824-5830 ;

19. Kiss T., Marshallsay C. and Filipowicz W., 1992, Embo J., 11, 3737-3746 ;

20. Marshallsay C., Connelly S. and Filipowicz W., 1992, Plant Mol. Biol., 19, 973-983 ;

21. Balakin A. G., Lempicki R. A., Huang G. M. and Fournier M. J., 1994, J. Biol. Chem., 269, 739-746 ;

22. Li H. V. and Fournier M. J., 1992, Embo J., 11, 683-689 ;

23. Sambrook J., Fritsch E. F. and Maniatis T., 1989, Molecular Cloning - A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.