Title:
CHIMERIC GENES FOR PLASTID EXPRESSION
Document Type and Number:
WIPO Patent Application WO/2001/007590
Kind Code:
A2
Abstract:
Novel chimeric genes and methods for the control of expression of nucleic acid sequences in plant plastids are described in the instant invention. Multiple genes or polycistronic units are expressed using upstream elements of plastid genes naturally found in divergent orientation. These upstream elements may be derived from sequences found in the native plastid genome of the same plant, or preferably from the plastid genome of a different plant species having sequence identity of less than 90%.

Inventors:
HEIFETZ PETER BERNARD (US)
COX KEVIN (US)
Application Number:
PCT/EP2000/007118
Publication Date:
February 01, 2001
Filing Date:
July 25, 2000
Assignee:
SYNGENTA PARTICIPATIONS AG (CH)
HEIFETZ PETER BERNARD (US)
COX KEVIN (US)
International Classes:
C12N15/82; (IPC1-7): C12N15/00
Domestic Patent References:
WO1997006250A1 1997-02-20
WO1995016783A1 1995-06-22
WO1998011235A2 1998-03-19
WO1997032011A1 1997-09-04
WO1995034659A1 1995-12-21
WO1998033927A1 1998-08-06
WO1998055595A1 1998-12-10
WO1999010513A1 1999-03-04
WO1999046394A1 1999-09-16
WO1999046370A2 1999-09-16
WO2000020612A2 2000-04-13
WO2000003022A2 2000-01-20
WO2000003017A2 2000-01-20
WO1999005265A2 1999-02-04
Foreign References:
US5693507A 1997-12-02
EP0770682A2 1997-05-02
Other References:
ROGERS S A ET AL: "ANALYSIS OF CHLOROPLAST PROMOTERS USING BIDIRECTIONAL TRANSCRIPTION VECTORS" PLANT MOLECULAR BIOLOGY, vol. 15, no. 3, 1990, pages 421-435, XP002159238 ISSN: 0167-4412
ZOUBENKO ET AL: "efficient targeting of foreign genes into the tobacco plastid genome" NUCLEIC ACIDS RESEARCH,GB,OXFORD UNIVERSITY PRESS, SURREY, vol. 22, no. 19, 1994, pages 3819-3824, XP002090405 ISSN: 0305-1048
SRIRAMAN ET AL: "transcription from heterologous rRNA operon promoters in chloroplasts reveals requirement for specific activating factors" PLANT PHYSIOLOGY,US,AMERICAN SOCIETY OF PLANT PHYSIOLOGISTS, ROCKVILLE, MD, vol. 117, no. 4, August 1998 (1998-08), pages 1495-1499, XP002106114 ISSN: 0032-0889
SRIRAMAN ET AL: "the phage-type PclpP-53 promoter comprises sequences downstream of the transcription initiation site" NUCLEIC ACIDS RESEARCH,GB,OXFORD UNIVERSITY PRESS, SURREY, vol. 26, no. 21, November 1998 (1998-11), pages 4874-4879, XP002106113 ISSN: 0305-1048 cited in the application -& DATABASE EMBL [Online] ACCESSION NO: AF090188, 19 November 1998 (1998-11-19) SRIRAMAN P., ET AL.: "Arabidopsis thaliana ATP-dependent protease proteolytic subunit (clpP) gene, promoter region and partial cds." XP002159239
Attorney, Agent or Firm:
Becker, Konrad (Patent and Trademark Dept. Site Rosental, Basel, CH)
Claims:
What is claimed is:
1. A chimeric gene comprising: (a) a plastid nucleic acid sequence natively comprised in a plastid genome between two nucleic acid sequences that are transcribed in divergent orientation with respect to each other in a plastid, (b) a first nucleic acid sequence operatively linked to the 3' end of said plastid nucleic acid sequence, and (c) a second nucleic acid sequence operatively linked to the 5' end of said plastid nucleic acid sequence, wherein at least one of said first or second nucleic acid sequence is heterologous to said plastid nucleic acid sequence, and wherein said first and second nucleic acid sequences are linked to said plastid nucleic acid sequence in divergent orientation with respect to each other.
2. The chimeric gene of claim 1, wherein one of the nucleic acid sequences of (a) comprises an open reading frame.
3. The chimeric gene of claim 1, wherein both of the nucleic acid sequences of (a) comprise an open reading frame.
4. The chimeric gene of claim 1, wherein the nucleic acid sequences of (a) are transcribed by different RNA polymerases.
5. The chimeric gene of claim 1, wherein one nucleic acid sequence of (a) is transcribed by a nuclear-encoded RNA polymerase and the other nucleic acid sequence of (a) is transcribed by a plastid-encoded RNA polymerase.
6. The chimeric gene of claim 1, wherein one nucleic acid sequence of (a) is transcribed by a nuclear-encoded RNA polymerase and by a plastid-encoded RNA polymerase.
7. The chimeric gene of claim 6, wherein the other nucleic acid sequence of (a) is transcribed by a plastid-encoded RNA polymerase.
8. The chimeric gene of claim 1, wherein the nucleic acid sequences of (a) are adjacent to each other.
9. The chimeric gene of claim 1, wherein said plastid nucleic acid sequence comprises the promoter of a clpP gene and the promoter of a psbB gene.
10. The chimeric gene of claim 1, wherein said plastid nucleic acid sequence is derived from Arabidopsis.
11. The chimeric gene of claim 1, wherein said plastid nucleic acid sequence comprises a nucleotide sequence substantially similar to the nucleotide set forth in SEQ ID NO: 1.
12. The chimeric gene of claim 11, wherein said plastid nucleic acid sequence comprises the nucleotide sequence set forth in SEQ ID NO: 1.
13. The chimeric gene of claim 1, wherein said first or second nucleic acid sequence comprises a selectable marker gene for plastid transformation.
14. The chimeric gene of claim 1, wherein said first or said second nucleic acid sequence comprises a gene conferring upon a plant tolerance to a herbicide.
15. The chimeric gene of claim 14, wherein said gene conferring upon a plant tolerance to a herbicide encodes a PPO and said herbicide is a PPO inhibitor.
16. The chimeric gene of claim 14, wherein said gene conferring upon a plant tolerance to a herbicide is derived from a plant or from a microorganism.
17. The chimeric gene of claim 14, wherein said gene conferring upon a plant tolerance to a herbicide is an E. coli hemG gene or a B. subtilis hemY gene.
18. A plant transformation vector comprising the chimeric gene of any one of claims 1 to 17.
19. A plastid comprising the chimeric gene of any one of claims 1 to 17.
20. A plant comprising the plastid of claim 19.
21. Seeds of the plant according to claim 20.
22. A method comprising transforming a plastid genome of a plant with a chimeric gene according to any one of claims 1 to 17, and expressing said first or said second nucleic acid sequence in said plant.
Description:
NOVEL CHIMERIC GENES

The present invention generally pertains to plant molecular biology and more particularly pertains to novel chimeric genes for expression of nucleic acid sequences in plastids and methods of use therefor.

Plastid transformation, in which genes are inserted by homologous recombination into all of the several thousand copies of the circular plastid genome present in each plant cell, exploits the enormous copy number advantage over nuclear-expressed genes to permit expression levels that may exceed 10% of the total soluble plant protein. In addition, plastid transformation is desirable because plastid-encoded traits are not pollen transmissible in most commercially important crops; hence, potential risks of inadvertent transgene escape to wild relatives of transgenic plants are greatly reduced. Other advantages of plastid transformation include the feasibility of simultaneous expression of multiple genes as a polycistronic unit and the elimination of positional effects and gene silencing that may result following nuclear transformation.

Plastid transformation technology is extensively described in U.S. Patent Nos. 5,451,513, 5,545,817, 5,545,818, and 5,576,198; in Intl. Application No. WO 95/16783; and in Boynton et al., Methods in Enzymology 217: 510-536 (1993), Svab et al., Proc. Natl. Acad. Sci. USA 90: 913-917 (1993), Golds et al., Bio/Technology 11: 95-97 (1993) and McBride et al., Proc. Natl. Acad. Sci. USA 91: 7301-7305 (1994); all of which are incorporated herein by reference. The basic technique for tobacco plastid transformation involves the particle bombardment of leaf tissue with regions of cloned plastid DNA flanking a selectable marker, such as an antibiotic resistance gene. The 1 to 1.5 kb flanking regions, termed targeting sequences, facilitate homologous recombination with the plastid genome and thus allow the replacement or modification of specific regions of the 156 kb tobacco plastome. Techniques have also been described for the transfection of plastids in plant protoplasts (O'Neill et al., Plant Journal 3 (5): 729-738 (1993) and Koop et al., Planta 199: 193-201 (1996), both of which are incorporated herein by reference).

In many genetic engineering applications, more than one gene or polycistronic unit (also called operon-like polycistronic gene) needs to be expressed from a plant's plastid genome, and it is highly desirable to have a means of controlling the expression of such genes or polycistronic units. In particular, the use of promoter and 5' regulatory sequences from genes naturally expressed by either or both of the RNA polymerase types active in plastids (e.g., those encoded by the plastid or nuclear genomes) is preferred, since these polymerases may confer differential expression in different tissues, plastid types, or at different times in plant development. Having the means to control one gene with one set of regulatory sequences and a second gene with a different set of regulatory sequences thus lends greater flexibility to the types of expression one can achieve in plastids.

However, when such regulatory sequences are removed from their native context and combined in non-natural ways to direct the expression of heterologous genes, unexpected results have often been found to follow. For example, an upstream fragment of the plastid psbA gene from tobacco was surprisingly found to be able to direct transcription of a promoterless reporter gene in opposite orientation to the one expected, although at extremely low levels only detectable by RNase protection assays (L. A. Allison and P. Maliga, EMBO J 14 (15): 3721-3730, 1995). When a similar psbA promoter::selectable marker construct was placed in opposite orientation to a series of gene constructs comprising various deletion variants of the upstream sequences of the tobacco plastid clpP gene fused to a GUS reporter, read-through divergent transcription from the chimeric psbA promoter altered the clpP-driven expression of GUS (P. Sriraman, D. Silhavy and P. Maliga, Nucleic Acids Research 26 (21): 4874-4879, 1998).

Additionally, due to high rates of homologous recombination in plastid genomes, the presence of several plastid gene regulatory sequences in a transgene dramatically increases the risk of recombination between endogenous plastid genome sequences and the transgene, leading to instability of transformed plastid genomes. To alleviate this problem, heterologous regulatory sequences, such as those recognized by a phage T7 RNA polymerase, have been used, but the choice of heterologous regulatory sequences is limited and they do not have the temporal and developmental characteristics of endogenous ones.

Consequently, there is a long felt but unfulfilled need for strategies allowing the predictable expression of multiple genes or polycistronic units in plastids while repressing instability of transgenic plastid genomes.

In view of the above, one object of the invention is to provide novel chimeric genes allowing the predictable expression of multiple genes or polycistronic units in plant plastids and securing high stability of transgenic plastid genomes. The present invention also provides novel methods for expression of nucleic acid sequences in plastids.

Chimeric genes of the present invention comprise a plastid nucleic acid sequence capable of directing the expression of two nucleic acid sequences in divergent orientation. Using the present invention, a single plastid nucleic acid sequence can be used for the expression of two genes or polycistronic units, reducing the homology between the transgene and the plastid genome and, thus, the frequency of somatic recombination. Also, using chimeric genes of the present invention, more predictable expression of a plastidic transgene is achieved, and different temporal and developmental expression specificities can be obtained for the two nucleic acid sequences.

The present invention also provides a novel plastid promoter from Arabidopsis thaliana that is functional in all plastid types. Another object of the invention is to provide a method for utilizing protein-coding regions of plastid genes to isolate novel intervening regulatory sequences, such as novel promoter sequences or untranslated 3' or 5' RNA sequences.

Still another object of the invention is to use novel plastid promoter sequences to improve plastid transformation efficiency by reducing undesired homologous recombination between native DNA sequences in the plastid genome and exogenous DNA sequences contained in chimeric DNA fragments incorporated into plastid transformation vectors.

In furtherance of these and other objects, the present invention provides: A chimeric gene comprising: (a) a plastid nucleic acid sequence natively comprised in a plastid genome between two nucleic acid sequences that are transcribed in divergent orientation with respect to each other in a plastid, (b) a first nucleic acid sequence operatively linked to the 3' end of said plastid nucleic acid sequence, and (c) a second nucleic acid sequence operatively linked to the 5' end of said plastid nucleic acid sequence, wherein at least one of said first or second nucleic acid sequence is heterologous to said plastid nucleic acid sequence, and wherein said first and second nucleic acid sequences are linked to said plastid nucleic acid sequence in divergent orientation with respect to each other in a plastid. Preferably, the first and second nucleic acid sequences are transcribed in divergent orientation with respect to each other in a plastid. Preferably, one of said nucleic acid sequences of (a) comprises an open reading frame or both of the nucleic acid sequences of (a) comprise an open reading frame. Preferably, the nucleic acid sequences of (a) are transcribed by different RNA polymerases. Preferably, one nucleic acid sequence of (a) is transcribed by a nuclear-encoded RNA polymerase and the other nucleic acid sequence of (a) is transcribed by a plastid-encoded RNA polymerase. In another preferred embodiment, one nucleic acid sequence of (a) is transcribed by both a nuclear-encoded RNA polymerase and a plastid-encoded RNA polymerase, wherein the other nucleic acid sequence of (a) is preferably transcribed by a plastid-encoded RNA polymerase. Preferably, the nucleic acid sequences of (a) are adjacent to each other in a plastid genome.
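As an illustration of the arrangement described in (a) to (c), the following minimal Python sketch assembles a top-strand representation of such a construct. The function names and the toy sequences are hypothetical and are not taken from the specification. The sketch assumes that the second sequence, linked to the 5' end, is placed as its reverse complement so that it is read away from the plastid sequence on the opposite strand.

```python
# Illustrative sketch only; function names and toy sequences are hypothetical.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def assemble_divergent_construct(plastid_seq: str, first_seq: str, second_seq: str) -> str:
    """Return the top strand of a construct with two divergently oriented sequences.

    first_seq is joined to the 3' end of plastid_seq and reads left to right
    on the top strand. second_seq is joined to the 5' end as its reverse
    complement, so it reads right to left on the bottom strand.
    """
    return reverse_complement(second_seq) + plastid_seq + first_seq


plastid = "TTGACATATAAT"  # toy stand-in for the intervening plastid sequence
first = "ATGGCC"          # toy sequence linked to the 3' end
second = "ATGAAA"         # toy sequence linked to the 5' end
construct = assemble_divergent_construct(plastid, first, second)
```

Reading the bottom strand of `construct` from its 5' end passes through the plastid sequence before reaching `second` in its sense orientation, which is the sense in which the two sequences point away from each other.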

In another preferred embodiment, the plastid nucleic acid sequence comprises the promoter of a clpP gene and the promoter of a psbB gene and is preferably derived from Arabidopsis. Preferably, the plastid nucleic acid sequence comprises a nucleotide sequence substantially similar to the nucleotide sequence set forth in SEQ ID NO: 1 or preferably comprises the nucleotide sequence set forth in SEQ ID NO: 1.

In another preferred embodiment, the first or second nucleic acid sequence comprises a selectable marker gene for plastid transformation. Preferably, the first or second nucleic acid sequence comprises a gene conferring upon a plant tolerance to a herbicide. Preferably, the gene conferring upon a plant tolerance to a herbicide encodes a PPO, said herbicide being a PPO inhibitor, and the gene is preferably derived from a plant or from a microorganism. Preferably, the gene conferring upon a plant tolerance to a herbicide is an E. coli hemG gene or a B. subtilis hemY gene.

The present invention further provides: A plant transformation vector comprising a chimeric gene of the present invention, a plastid comprising a chimeric gene of the present invention, a plant comprising a plastid of the present invention and the progeny thereof, and seeds of such plants.

The present invention further provides: A method comprising transforming a plastid genome of a plant with a chimeric gene of the present invention and expressing said first or said second nucleic acid sequence in said plant; and a method comprising transforming a plastid genome of a plant with a chimeric gene of the present invention and expressing said first and said second nucleic acid sequence in said plant.

The present invention further provides a nucleic acid promoter isolated from the 5' flanking region upstream of the coding sequence of the Arabidopsis plastid clpP gene. In a preferred embodiment, the nucleic acid promoter of the invention is substantially similar to a promoter sequence downstream of nucleotide number 263 of SEQ ID NO: 1. In a more preferred embodiment, the nucleic acid promoter of the invention has sequence identity with a promoter sequence downstream of nucleotide number 263 of SEQ ID NO: 1. In still another embodiment, the nucleic acid promoter of the invention is substantially similar to SEQ ID NO: 1. In yet another embodiment, the nucleic acid promoter of the invention is comprised within SEQ ID NO: 1. In still another embodiment, the nucleic acid promoter of the invention comprises a 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair nucleotide portion of SEQ ID NO: 1. The present invention also encompasses a chimeric gene comprising the nucleic acid promoter of the invention operatively linked to the coding sequence of a gene of interest; a plant transformation vector comprising such a chimeric gene; and a transgenic plant, plant cell, plant seed, plant tissue, or plant plastid, each comprising such a chimeric gene.
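The 20 base pair criterion above can be checked computationally. The following Python sketch is illustrative only: the function name is hypothetical, the sequences are toy strings rather than SEQ ID NO: 1, and the comparison is made on one strand only, which is an assumption of the sketch.

```python
def shares_consecutive_portion(candidate: str, reference: str, length: int = 20) -> bool:
    """Return True if candidate contains a run of `length` bases that is
    identical to a consecutive run of `length` bases in reference.

    Only the given strand of each sequence is compared (assumption).
    """
    candidate, reference = candidate.upper(), reference.upper()
    if len(candidate) < length or len(reference) < length:
        return False
    # Collect every window of the requested length found in the reference.
    reference_windows = {
        reference[i:i + length] for i in range(len(reference) - length + 1)
    }
    return any(
        candidate[i:i + length] in reference_windows
        for i in range(len(candidate) - length + 1)
    )


reference = "ACGT" * 10                        # toy 40-nt reference sequence
candidate = "GG" + reference[5:25] + "CC"      # carries a 20-nt portion of the reference
unrelated = "A" * 30                           # shares no 20-nt portion with the reference
```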

In another aspect, the present invention provides a novel method for isolating intervening regulatory DNA sequences from between the protein-coding regions of two plastid genes, comprising the steps of: (a) determining the relative orientation and either a degenerate or a specific nucleotide sequence of protein-coding regions of two plastid genes; (b) designing a first degenerate or specific PCR primer based on the determined sequence of the protein-coding region of one of the two plastid genes; (c) designing a second degenerate or specific PCR primer based on the determined sequence of the protein-coding region of the other of the two plastid genes; (d) amplifying a DNA fragment using the primers of steps (b) and (c), whereby the amplified DNA fragment comprises an intervening regulatory DNA sequence from between the protein-coding regions of the two plastid genes.
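Steps (b) to (d) can be pictured with a simple in silico amplification. The Python sketch below is a minimal illustration under stated assumptions: it handles specific (non-degenerate) primers only, requires exact matches, and uses toy sequences and hypothetical names that do not appear in the specification.

```python
# Illustrative sketch only; names and toy sequences are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-cased DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def in_silico_amplify(template, primer_one, primer_two):
    """Return the top-strand amplicon bounded by two specific primers, or None.

    primer_one must match the top strand; primer_two must match the bottom
    strand downstream of primer_one, so the primers face each other.
    """
    template = template.upper()
    start = template.find(primer_one.upper())
    if start == -1:
        return None
    site_two = reverse_complement(primer_two)  # primer_two site as seen on the top strand
    end = template.find(site_two, start + len(primer_one))
    if end == -1:
        return None
    return template[start:end + len(site_two)]


left_coding = "GGCATTACCGGA"      # toy coding region of the first gene (top strand)
intergenic = "TTGACATTTATAATAGG"  # toy intervening regulatory sequence
right_coding = "ATGACCGGTTCA"     # toy coding region of the second gene
template = "CC" + left_coding + intergenic + right_coding + "GG"

primer_one = "CATTACC"            # designed within the first coding region
primer_two = "GAACCGG"            # designed within the second coding region, bottom strand
amplicon = in_silico_amplify(template, primer_one, primer_two)
```

Because both primers sit inside coding regions and face each other, the amplified fragment necessarily spans the sequence lying between the two genes.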

In a preferred embodiment of this method, the two plastid genes are a clpP gene and a psbB gene. According to this embodiment, the intervening regulatory DNA sequence comprises a clpP promoter. In another preferred embodiment of this method, the two plastid genes are a 16S rRNA gene and a valine tRNA gene. According to this embodiment, the intervening regulatory DNA sequence comprises a 16S rRNA promoter.

In yet another aspect, the present invention provides an improved plastid transformation method, comprising transforming a plastid of a host plant species with a chimeric gene comprising a plastid-active regulatory sequence operatively linked to a coding sequence of interest, wherein the regulatory sequence has a nucleotide sequence that is less than approximately 90% identical to a corresponding native regulatory sequence in the host plant plastid, whereby undesired somatic recombination between the regulatory sequence in the chimeric gene and the corresponding native regulatory sequence in the host plant plastid is reduced. In a preferred embodiment of this method, the chimeric gene is isolated from the plastid genome of the host plant species and at least approximately 10% of the nucleotides of the regulatory sequence have been mutated. In another preferred embodiment of this method, the regulatory sequence in the chimeric gene is isolated from the plastid genome of a different plant species than the host plant species. For example, the regulatory sequence in the chimeric gene may be isolated from the plastid genome of Arabidopsis. In one especially preferred embodiment, the regulatory sequence in the chimeric gene is a nucleic acid promoter isolated from the 5' flanking region upstream of the coding sequence of the Arabidopsis clpP gene. In another especially preferred embodiment, the regulatory sequence in the chimeric gene is a nucleic acid promoter isolated from the 5' flanking region upstream of the coding sequence of the Arabidopsis 16S rRNA gene.
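To make the 90% identity criterion concrete, the following Python sketch computes percent identity for two sequences that have already been aligned to equal length, with "-" marking gaps. It is a sketch under stated assumptions rather than a prescribed procedure: the specification does not say how identity is to be calculated, so counting matches over all alignment columns is an assumption here, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must be pre-aligned to the same length, with '-' for gaps.
    Columns in which both sequences have a gap are ignored (assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def below_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if the two aligned sequences are less than `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) < threshold


host_native = "ACGTACGTACGTACGTACGT"    # toy native regulatory sequence of the host
donor_seq = "ACGTTCGTACGAACGTACCT"      # toy sequence with 3 mismatches out of 20
near_identical = "ACGTACGTACGTACGTACGA" # toy sequence with 1 mismatch out of 20
```

With these toy inputs, `donor_seq` is 85% identical to `host_native` and falls below the threshold, whereas `near_identical` is 95% identical and does not.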

Other objects and advantages of the invention will become apparent to those skilled in the art from a study of the following description of the invention and non-limiting examples.

DESCRIPTION OF THE SEQUENCES IN THE SEQUENCE LISTING

SEQ ID NO: 1 is the nucleotide sequence of the Arabidopsis clpP gene promoter region
SEQ ID NO: 2 is primer A_clpP used in Example 1
SEQ ID NO: 3 is primer A_psbB used in Example 1
SEQ ID NO: 4 is primer AclpP1a used in Example 5
SEQ ID NO: 5 is primer AclpP2b used in Example 5
SEQ ID NO: 6 is primer rps16P_1a used in Example 5
SEQ ID NO: 7 is primer rps16P_1b used in Example 5
SEQ ID NO: 8 is the top-strand primer used in Example 6
SEQ ID NO: 9 is a bottom-strand primer used in Example 6
SEQ ID NO: 10 is the nucleotide sequence of the Arabidopsis 16S rRNA gene promoter region
SEQ ID NO: 11 is a top-strand primer used in Example 7
SEQ ID NO: 12 is a bottom-strand primer used in Example 7
SEQ ID NO: 13 is primer M13 (-20)_ used in Example 1
SEQ ID NO: 14 is primer A-clpP-2 used in Example 1
SEQ ID NO: 15 is primer HemGP1a used in Example 2
SEQ ID NO: 16 is primer HemGP1b used in Example 2
SEQ ID NO: 17 is a primer used in Example 3.1
SEQ ID NO: 18 is a primer used in Example 3.1
SEQ ID NO: 19 is a primer used in Example 3.1
SEQ ID NO: 20 is a primer used in Example 3.1
SEQ ID NO: 21 is a primer used in Example 3.11
SEQ ID NO: 22 is a primer used in Example 3.11
SEQ ID NO: 23 is a primer used in Example 3.11
SEQ ID NO: 24 is a primer used in Example 3.11
SEQ ID NO: 25 is a primer used in Example 3.111
SEQ ID NO: 26 is a primer used in Example 3.111

For clarity, certain terms used in the specification are defined and presented as follows:

Associated With/Operatively Linked: refers to two nucleic acid sequences that are related physically or functionally. For example, a promoter or regulatory DNA sequence is said to be "associated with" a DNA sequence that codes for an RNA or a protein if the two sequences are operatively linked, or situated such that the regulatory DNA sequence will affect the expression level of the coding or structural DNA sequence.

Chimeric Gene/Fusion Sequence: a recombinant nucleic acid sequence in which a promoter or regulatory sequence is operatively linked to, or associated with, one or more nucleic acid sequences, e.g., a nucleic acid sequence that codes for an mRNA or which is expressed as a protein, such that the regulatory sequence is able to regulate transcription or expression of the associated nucleic acid sequence. The regulatory sequence of the chimeric gene is not normally operatively linked to the associated nucleic acid sequence as found in nature.

Coding Sequence: nucleic acid sequence that is transcribed into RNA such as mRNA. Preferably, the RNA is then translated in an organism to produce a protein. Such nucleic acid sequence is also called an open reading frame (ORF).

Derived: in the context of the present invention, a nucleic acid molecule or an enzyme derived from, for example, a cell or an organism, is a nucleic acid molecule or enzyme that is typically present in the cell or organism and that has been originally obtained or isolated from the cell or organism. The derived nucleic acid molecule or enzyme is then either directly used after isolation from the cell or organism, or is first subjected to various intermediate steps or modifications such as, for example, cloning in a recombinant vector or in a cDNA or genomic library, or expression in a heterologous system.

Divergent orientation: referring to the direction of transcription of two nucleic acid sequences with respect to each other, means that the direction of transcription of the first nucleic acid sequence is opposite to the direction of transcription of the second nucleic acid sequence.
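The definition can be expressed as a small predicate over annotated features. The Python sketch below is illustrative only, and the class, field names and coordinates are hypothetical. The definition above requires only opposite directions of transcription; the further check that the two sequences point away from each other, which separates a divergent from a convergent arrangement, is an assumption of this sketch based on the usual meaning of the term.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    start: int   # 0-based, inclusive, top-strand coordinate
    end: int     # exclusive
    strand: str  # "+" (left to right) or "-" (right to left)


def opposite_direction(a: Feature, b: Feature) -> bool:
    """True if the two features are transcribed in opposite directions."""
    return {a.strand, b.strand} == {"+", "-"}


def is_divergent(a: Feature, b: Feature) -> bool:
    """True if the features are on opposite strands and point away from each other."""
    if not opposite_direction(a, b):
        return False
    minus, plus = (a, b) if a.strand == "-" else (b, a)
    # Divergent: the minus-strand feature lies entirely to the left of the plus-strand one.
    return minus.end <= plus.start


gene_a = Feature("geneA", 0, 600, "-")        # toy coordinates
gene_b = Feature("geneB", 900, 2400, "+")
convergent_a = Feature("geneA", 0, 600, "+")
convergent_b = Feature("geneB", 900, 2400, "-")
```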

Gene: as used herein typically comprises a nucleic acid sequence optionally operatively linked to DNA sequences preceding or following the nucleic acid sequence. The nucleic acid sequence can typically be transcribed into RNA, such as, e.g., mRNA (sense RNA or antisense RNA), rRNA, tRNA or snRNA. The nucleic acid sequence may comprise an open reading frame or coding sequence, which can be translated into a polypeptide. Examples of DNA sequences preceding or following the nucleotide sequence are 5' and 3' untranslated sequences, termination signals and ribosome binding sites (rbs), or portions thereof. Further elements that may also be present in a gene are, for example, introns.

Gene of Interest: any gene that, when transferred to a plant, confers upon the plant a desired characteristic such as antibiotic resistance, virus resistance, insect resistance, disease resistance, or resistance to other pests, herbicide tolerance, improved nutritional value, improved performance in an industrial process or altered reproductive capability. The "gene of interest" may also be one that is transferred to plants for the production of commercially valuable enzymes or metabolites in the plant.

Heterologous: as used herein means "of different natural origin" or represents a non-natural state. For example, if a host cell is transformed with a nucleotide sequence derived from another organism, particularly from another species, that nucleotide sequence is heterologous with respect to that host cell and also with respect to descendants of the host cell which carry that gene. Similarly, heterologous refers to a nucleotide sequence derived from and inserted into the same natural, original cell type, but which is present in a non-natural state, e.g., a different copy number, or under the control of different regulatory sequences. A transforming nucleotide sequence may comprise a heterologous coding sequence, or heterologous regulatory sequences. Alternatively, the transforming nucleotide sequence may be completely heterologous or may comprise any possible combination of heterologous and endogenous nucleic acid sequences.

Homologous Nucleic Acid Sequence: a nucleic acid sequence naturally associated with a host genome into which it is introduced.

Homologous Recombination: the reciprocal exchange of nucleic acid fragments between homologous nucleic acid molecules.

Homoplasmic: refers to a plant, plant tissue or plant cell wherein all of the plastids are genetically identical. This is the normal state in a plant when the plastids have not been transformed, mutated, or otherwise genetically altered. In different tissues or stages of development, the plastids may take different forms, e.g., chloroplasts, proplastids, etioplasts, amyloplasts, chromoplasts, and so forth.

Isolated: in the context of the present invention, an isolated nucleic acid molecule or an isolated enzyme is a nucleic acid molecule or enzyme that, by the hand of man, exists apart from its native environment and is therefore not a product of nature. An isolated nucleic acid molecule or enzyme may exist in a purified form or may exist in a non-native environment such as, for example, a transgenic host cell.

Minimal Promoter: promoter elements that are inactive or that have greatly reduced promoter activity in the absence of upstream activation. In the presence of a suitable transcription factor, the minimal promoter functions to permit transcription.

Nucleic Acid Molecule/Nucleic Acid Sequence: a linear segment of single- or double-stranded DNA or RNA that can be isolated from any source. In the context of the present invention, the nucleic acid molecule is preferably a segment of DNA.

Operon-like Polycistronic Gene: comprises two or more genes of interest under control of a single promoter capable of directing the expression of such operon-like polycistronic gene in plant plastids. Every gene in an operon-like polycistronic gene optionally comprises a ribosome binding site (rbs) operatively linked to the 5' end of the nucleotide sequence. Preferably each rbs in the operon-like polycistronic gene is different. The operon-like polycistronic gene also typically comprises a 5' UTR operatively linked to the 5' end of the rbs of the first gene in the operon-like polycistronic gene and a 3' UTR operatively linked to the 3' end of the last gene in the operon-like polycistronic gene. Two genes in an operon-like polycistronic gene may also comprise several nucleic acids which overlap between the two genes.
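The element order described in this definition can be modelled as a simple data structure. The Python sketch below is a minimal illustration: the class names, gene names and sequences are hypothetical, and the possibility of overlapping nucleotides between adjacent genes is not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cistron:
    name: str
    coding_seq: str
    rbs: Optional[str] = None  # a ribosome binding site is optional per the definition


@dataclass
class PolycistronicUnit:
    promoter: str
    five_utr: str
    cistrons: List[Cistron]
    three_utr: str

    def has_enough_genes(self) -> bool:
        """The definition calls for two or more genes under a single promoter."""
        return len(self.cistrons) >= 2

    def rbs_all_different(self) -> bool:
        """Check the stated preference that each rbs in the unit is different."""
        sites = [c.rbs for c in self.cistrons if c.rbs is not None]
        return len(sites) == len(set(sites))

    def layout(self) -> List[str]:
        """Return the 5' to 3' order of elements as labels."""
        parts = ["promoter", "5'UTR"]
        for c in self.cistrons:
            if c.rbs is not None:
                parts.append(f"rbs({c.name})")
            parts.append(c.name)
        parts.append("3'UTR")
        return parts


unit = PolycistronicUnit(
    promoter="TTGACA",    # toy sequences throughout
    five_utr="AAATTT",
    cistrons=[
        Cistron("geneA", "ATGAAATAA", rbs="GGAGG"),
        Cistron("geneB", "ATGCCCTAA", rbs="AGGAGA"),
    ],
    three_utr="TTTAAA",
)
```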

Plant: any plant at any stage of development, particularly a seed plant.

Plant cell: a structural and physiological unit of a plant, comprising a protoplast and a cell wall. The plant cell may be in the form of an isolated single cell or a cultured cell, or may be part of a more highly organized unit such as, for example, plant tissue, a plant organ, or a whole plant.

Plant Cell Culture: cultures of plant units such as, for example, protoplasts, cell culture cells, cells in plant tissues, pollen, pollen tubes, ovules, embryo sacs, zygotes and embryos at various stages of development.

Plant material: leaves, stems, roots, flowers or flower parts, fruits, pollen, egg cells, zygotes, seeds, cuttings, cell or tissue cultures, or any other part or product of a plant.

Plant Organ: a distinct and visibly structured and differentiated part of a plant such as a root, stem, leaf, flower bud, or embryo.

Plant tissue: as used herein means a group of plant cells organized into a structural and functional unit. Any tissue of a plant in planta or in culture is included. This term includes, but is not limited to, whole plants, plant organs, plant seeds, tissue culture and any groups of plant cells organized into structural and/or functional units. The use of this term in conjunction with, or in the absence of, any specific type of plant tissue as listed above or otherwise embraced by this definition is not intended to be exclusive of any other type of plant tissue.

Plastid nucleic acid sequence: nucleic acid sequence derived from a plastid genome.

Progeny: as used herein comprises all the subsequent generations obtained by self-pollination or out-crossing of a plant of the present invention.

Promoter: an untranscribed DNA sequence upstream of the coding region or transcribed nucleic acid sequence that contains the binding site for an RNA polymerase, for example an RNA polymerase active in plastids such as a nuclear-encoded "NEP" RNA polymerase or a plastid-encoded "PEP" RNA polymerase, or RNA polymerase II, and initiates transcription of the DNA. The promoter region may also include other elements that act as regulators of gene expression.

Protoplast: an isolated plant cell without a cell wall or with only parts of the cell wall.

Regulatory Sequence: an untranslated nucleic acid sequence that assists in, enhances, or otherwise affects the transcription, translation or expression of an associated structural nucleic acid sequence that codes for a protein or other gene product. Regulatory sequences include promoters. A promoter sequence is usually located at the 5' end of a transcribed or translated nucleic acid sequence, typically between 20 and 100 nucleotides from the 5' end of the translation start site. Regulatory sequences may also include transcribed but untranslated nucleic acid sequences located 5' and 3' to coding sequences. These untranslated RNAs are typically involved in post-transcriptional regulation of gene expression. Regulatory sequences also typically encompass sequences required for proper translation of the nucleotide sequence, such as, in the case of expression in plastids, ribosome binding sites (rbs).

Substantially similar: with respect to nucleic acids, a nucleic acid molecule that has at least 60 percent sequence identity with a reference nucleic acid molecule. In a preferred embodiment, a substantially similar DNA sequence is at least 80% identical to a reference DNA sequence; in a more preferred embodiment, a substantially similar DNA sequence is at least 90% identical to a reference DNA sequence; and in a most preferred embodiment, a substantially similar DNA sequence is at least 95% identical to a reference DNA sequence.

A substantially similar nucleotide sequence typically hybridizes to a reference nucleic acid molecule, or fragments thereof, under the following conditions: hybridization at 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4 pH 7.0, 1 mM EDTA at 50°C; wash with 2X SSC, 1% SDS, at 50°C. With respect to proteins or peptides, a substantially similar amino acid sequence is an amino acid sequence that is at least 90% identical to the amino acid sequence of a reference protein or peptide and has substantially the same activity as the reference protein or peptide.
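The identity thresholds in this definition can be illustrated with a short calculation. The following is a minimal sketch only: it assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it counts identity over columns in which neither sequence has a gap. Other conventions for treating gaps exist, and the function names `percent_identity` and `similarity_tier` are hypothetical, chosen here for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: inputs are pre-aligned, equal-length strings using '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def similarity_tier(identity_percent: float) -> str:
    """Map a percent identity onto the tiers named in the definition above."""
    if identity_percent >= 95:
        return "most preferred (at least 95%)"
    if identity_percent >= 90:
        return "more preferred (at least 90%)"
    if identity_percent >= 80:
        return "preferred (at least 80%)"
    if identity_percent >= 60:
        return "substantially similar (at least 60%)"
    return "not substantially similar"


if __name__ == "__main__":
    pid = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(pid, similarity_tier(pid))
```

The tiers are checked from the highest threshold downward so that each sequence is reported under the most stringent category it satisfies.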

Tolerance: the ability to continue normal growth or function when exposed to an inhibitor or herbicide.

Transformation: a process for introducing heterologous DNA into a cell, tissue, or plant, including a plant plastid. Transformed cells, tissues, or plants are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof.

Transformed/Transgenic/Recombinant: refer to a host organism such as a bacterium or a plant into which a heterologous nucleic acid molecule has been introduced. The nucleic acid molecule can be stably integrated into the genome of the host or the nucleic acid molecule can also be present as an extrachromosomal molecule. Such an extrachromosomal molecule can be auto-replicating. Transformed cells, tissues, or plants are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof. A "non-transformed", "non-transgenic", or "non-recombinant" host refers to a wild-type organism, e.g., a bacterium or plant, which does not contain the heterologous nucleic acid molecule.

Nucleotides are indicated by their bases by the following standard abbreviations: adenine (A), cytosine (C), thymine (T), and guanine (G). Amino acids are likewise indicated by the following standard abbreviations: alanine (Ala; A), arginine (Arg; R), asparagine (Asn; N), aspartic acid (Asp; D), cysteine (Cys; C), glutamine (Gln; Q), glutamic acid (Glu; E), glycine (Gly; G), histidine (His; H), isoleucine (Ile; I), leucine (Leu; L), lysine (Lys; K), methionine (Met; M), phenylalanine (Phe; F), proline (Pro; P), serine (Ser; S), threonine (Thr; T), tryptophan (Trp; W), tyrosine (Tyr; Y), and valine (Val; V). Furthermore, (Xaa; X) represents any amino acid.

The present invention provides chimeric genes for expression of nucleic acid sequences in plastids and methods therefor. In particular, the present invention provides novel chimeric genes and methods for controlling the expression of multiple genes or polycistronic units in plastids that take advantage of naturally evolved regulatory sequences adapted for the transcription of genes in divergent orientation. It is therefore an advantage of the present invention over the prior art to use divergent activities of promoters to regulate the transcription of genes or polycistronic units. A more predictable expression of a nucleic acid sequence operatively linked to such a promoter is expected. The present invention also allows one to use limited lengths of plastid nucleic acid sequences in a plastid transgene and thus avoid or greatly diminish instability of transgenic plastids due to somatic recombination between heterologous and endogenous sequences.

A chimeric gene of the present invention comprises a plastid nucleic acid sequence natively located in a plastid genome between two nucleic acid sequences that are expressed in a plastid. Preferably, the two nucleic acid sequences are transcribed in a plastid in divergent orientation in respect to each other. The chimeric gene further comprises a first nucleic acid sequence operatively linked to the 3'end of said plastid nucleic acid sequence. Preferably, the chimeric gene also further comprises a second nucleic acid sequence operatively linked to the 5'end of said plastid nucleic acid sequence. The plastid nucleic acid sequence is capable of directing the transcription of the first and the second nucleotide sequences in a plastid. Preferably, when included in a chimeric gene of the present invention, the first and second nucleotide sequences are transcribed in a plastid in divergent orientation with respect to each other.

The invention therefore discloses plastid nucleic acid sequences comprising intergenic regions located natively in a plastid genome between two nucleic acid sequences expressed in a plastid. The two nucleic acid sequences may be expressed in the same plastid at the same time or may be expressed at different times, for example in different types of plastids. Preferably, the two nucleic acid sequences are transcribed divergently with respect to each other. Desirably, the two nucleic acid sequences are expressed as mRNA (sense RNA or antisense RNA), rRNA or tRNA. Desirably, one or both nucleic acid sequences comprise an open reading frame and, therefore, the intergenic regions are preferably located between two open reading frames. More desirably, the two nucleic acid sequences are two nearby nucleic acid sequences, and, preferably, they are adjacent, in that there are no other transcribed sequences between the two nearby plastid nucleic acid sequences; however, a small gene, such as a gene encoding a tRNA, may be present in the intergenic regions between the two nearby nucleic acid sequences, and such an intergenic region would still be within the scope of the present invention.

The intergenic regions described in the present invention comprise regulatory sequences, preferably promoters, for the expression of the two nucleic acid sequences in a plastid. In a preferred embodiment, the entire intergenic region located between the two different nucleic acid sequences is used in a chimeric gene of the present invention. In an alternate embodiment, only sequences necessary for transcription of the nucleic acid sequences are used. If one or both nucleic acid sequences contain open reading frames, 5' untranslated regions (5' UTRs) of one or both ORFs may be used in the chimeric gene. Alternatively, they may be replaced with heterologous 5' UTRs. Examples of heterologous 5' UTRs are the T7 RNA polymerase 5' UTR or 5' UTRs derived from other plastid genes. In yet another embodiment, a portion of the 5' UTR may be retained in the chimeric gene while another portion (e.g. a ribosome binding site) may be replaced by heterologous sequences.

In yet another preferred embodiment, the two different nucleic acid sequences are transcribed primarily by different RNA polymerases, such as a plastid-encoded "PEP" RNA polymerase and a nuclear-encoded "NEP" RNA polymerase. In another preferred embodiment, one of the nucleic acid sequences is transcribed by both a plastid-encoded "PEP" RNA polymerase and a nuclear-encoded "NEP" RNA polymerase and the other nucleic acid sequence is preferably transcribed by a plastid-encoded "PEP" RNA polymerase. In another preferred embodiment, one or more than one transcript may initiate from one or from more than one regulatory sequence or promoter in the intergenic region.

For example, some plastid genes, such as the psbB gene, encoding a component of the Photosystem II core complex, and the clpP gene, encoding the catalytic subunit of a plastid ATP-dependent protease, are located only a short distance apart in most plastid genomes (ca. 500 bp) but are transcribed in divergent orientations. These two genes are also transcribed primarily by different RNA polymerases, the plastid-encoded PEP RNA polymerase in the case of psbB and the nuclear-encoded NEP RNA polymerase in the case of clpP. Thus, the intergenic region between the clpP gene and the psbB gene is an example of a plastid nucleic acid sequence used in a chimeric gene of the present invention, and the corresponding region derived from Arabidopsis is described in further detail in the Examples below.

Other examples of plastid intergenic regions used in the present invention are found, for example, in the tobacco plastid genome (GenBank ACCESSION Z00044 S54304), in the maize plastid genome (GenBank ACCESSION X86563) and in the rice plastid genome (GenBank ACCESSION X15901). For example, the intergenic region between the tobacco clpP and psbB genes is used. The clpP gene spans nucleotides 74437 to 74507 on the non-coding strand of the tobacco plastid genome and the psbB gene spans nucleotides 74953 to 76479 on the coding strand. Therefore, the intergenic region of interest in this case comprises nucleotides 74507 to 74953 on the tobacco plastid genome.

The intergenic region between the tobacco rpL23 and ycf2 genes is also used. In the inverted repeat of the tobacco plastid genome, the rpL23 gene spans nucleotides 88252 to 88533 on the non-coding strand, and the ycf2 gene spans nucleotides 88885 to 95727 on the coding strand. Therefore, the intergenic region of interest in this case comprises nucleotides 88533 to 88885 in the tobacco plastid genome. The other copies of the rpL23 and ycf2 genes in the tobacco plastid genome span nucleotides 154093 to 154374 on the coding strand for the rpL23 gene and nucleotides 146899 to 153741 on the non-coding strand for the ycf2 gene. Therefore, in this case, the intergenic region of interest comprises nucleotides 153741 to 154093 on the tobacco plastid genome.

The intergenic region between the tobacco plastid genes encoding the ATPase Beta subunit (ATPase) and the Rubisco large subunit (rbcL) is also used. The ATPase gene spans nucleotides 55281 to 56777 on the non-coding strand, and the rbcL gene spans nucleotides 57595 to 59028 on the coding strand. Therefore, the intergenic region of interest in this case comprises nucleotides 56777 to 57595 on the tobacco plastid genome.

The intergenic region in the maize plastid genome between the genes encoding the ATPase and the rbcL is also used. The maize ATPase gene spans nucleotides 54621 to 56117 on the non-coding strand, and the maize rbcL gene spans nucleotides 56877 to 58307 on the coding strand. Therefore, the intergenic region of interest in this case comprises nucleotides 56117 to 56877 on the maize plastid genome.

The intergenic region in the maize plastid genome between the clpP and psbB genes is also used. The maize clpP gene spans nucleotides 69557 to 70207 on the non-coding strand, and the maize psbB gene spans nucleotides 70709 to 72235 on the coding strand. Therefore, the intergenic region of interest in this case comprises nucleotides 70207 to 70709 on the maize plastid genome.

The intergenic region in the rice plastid genome between the genes encoding the ATPase and the rbcL is also used. The rice ATPase gene spans nucleotides 51814 to 53310 on the non-coding strand, and the rice rbcL gene spans nucleotides 54095 to 55528 on the coding strand. Therefore, the intergenic region of interest in this case comprises nucleotides 53310 to 54095 on the rice plastid genome.
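The coordinates listed above can be tabulated to compare the sizes of the intergenic regions. The sketch below uses only the boundary positions quoted in the preceding paragraphs; the `Region` type and the `spacer_length` helper are hypothetical names introduced for illustration. It assumes the two quoted boundary nucleotides belong to the flanking genes, so the number of bases strictly between them is the difference minus one.

```python
from typing import NamedTuple


class Region(NamedTuple):
    genome: str
    left_gene: str
    left_boundary: int   # last nucleotide of the left-hand gene, as quoted in the text
    right_gene: str
    right_boundary: int  # first nucleotide of the right-hand gene, as quoted in the text


# Boundary positions copied from the description above.
REGIONS = [
    Region("tobacco", "clpP", 74507, "psbB", 74953),
    Region("tobacco", "rpL23", 88533, "ycf2", 88885),
    Region("tobacco", "ycf2", 153741, "rpL23", 154093),
    Region("tobacco", "ATPase", 56777, "rbcL", 57595),
    Region("maize", "ATPase", 56117, "rbcL", 56877),
    Region("maize", "clpP", 70207, "psbB", 70709),
    Region("rice", "ATPase", 53310, "rbcL", 54095),
]


def spacer_length(region: Region) -> int:
    """Bases strictly between the two boundary nucleotides (an assumed convention)."""
    return region.right_boundary - region.left_boundary - 1


if __name__ == "__main__":
    for r in REGIONS:
        print(f"{r.genome:8s} {r.left_gene}/{r.right_gene}: {spacer_length(r)} bp")
```

Under this convention the tobacco clpP/psbB spacer comes out at 445 bp, and the two inverted-repeat copies of the rpL23/ycf2 spacer have the same length, as expected for duplicated sequence.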

Additional intergenic nucleic acid sequences useful for the instant invention are, for example, nucleic acid sequences located upstream of the clpP genes of various other species, such as barley (Westhoff (1985) Mol. Gen. Genet. 201: 115-123), or those described in Sriraman et al. (1998) NAR 26: 4879-4879, both incorporated herein by reference in their entirety.

In another preferred embodiment, a chimeric gene of the present invention further comprises a nucleic acid sequence operatively linked to the plastid nucleic acid sequence described above. Such a nucleic acid sequence is typically transcribed into RNA, e.g. mRNA (sense RNA or antisense RNA), rRNA or tRNA. The nucleic acid sequence preferably comprises one or more open reading frames or coding sequences, which can be translated into one or more polypeptides. Preferably, heterologous nucleic acid sequences are used and any gene of interest or polycistronic unit, or combination thereof, is used. In a preferred embodiment, one of the nucleic acid sequences encodes a selectable marker gene for plastid transformation. A selectable marker for plastid transformation is for example an aadA gene conferring resistance to spectinomycin or streptomycin, the nptII gene conferring resistance to kanamycin, or a nucleic acid sequence that encodes a protein conferring tolerance to a herbicide. A nucleic acid sequence that encodes a protein conferring tolerance to a herbicide encodes, for example, a protoporphyrinogen oxidase (PPO) that confers tolerance to a PPO inhibitor (see for example US patent 5,767,373 and WO 97/32011, both incorporated herein by reference in their entirety). The nucleic acid sequence encodes for example a mutant plant PPO. Alternatively, the nucleic acid sequence comprises an E. coli hemG gene or a B. subtilis hemY gene, preferably an E. coli hemG gene as described in further detail in the Examples.

A chimeric gene of the present invention can be incorporated into plastid transformation vectors and transformed into plastids according to methods known in the art, particularly those described in the following: U.S. Patent Nos. 5,451,513, 5,545,817, 5,545,818, and 5,576,198; Intl. Application Nos. WO 95/16783, WO 97/32011, and WO 97/32977; and Svab et al. (1993), Golds et al. (1993) and McBride et al. (1994), all incorporated herein by reference in their entirety. The present invention therefore also describes methods of producing a transgenic plant comprising in its plastid genome a chimeric gene of the present invention.

The present invention also discloses methods for expressing two nucleic acid sequences in plastids using a chimeric gene of the present invention. In a preferred embodiment, such a method comprises transforming a plastid genome of a plant with a chimeric gene of the present invention. Preferably, the method further comprises recovering a plant comprising the chimeric gene in its plastid genome, the plant being preferably homoplasmic for transgenic plastids.

The present invention also provides the promoter region for the clpP gene from the Arabidopsis thaliana plastid genome that encodes a plant homologue of the Clp ATP-dependent protease (also described in International Application No. WO 99/46394, incorporated herein by reference in its entirety). The disclosed promoter can be used to drive expression of coding sequences for selectable marker genes or any other genes of interest in the plastids of transgenic plants. The promoter of the present invention is useful for constitutive expression of transgenes in both green and non-green plastids and is therefore particularly useful for plastid transformation in plants such as maize, in which preferential selection of regenerable transformants requires selection in non-green tissues.

The Arabidopsis clpP promoter of the present invention can be incorporated into plastid transformation vectors and transformed into plastids according to methods known in the art, particularly those described in the following: U.S. Patent Nos. 5,451,513, 5,545,817, 5,545,818, and 5,576,198; Intl. Application Nos. WO 95/16783, WO 97/32011, and WO 97/32977; and Svab et al. (1993), Golds et al. (1993) and McBride et al. (1994).

The present invention also provides a novel method for utilizing regions of expressed nucleic acid sequences of plastid genes, preferably protein-coding regions of plastid genes, to isolate novel intervening regulatory sequences, such as novel promoters or 3' or 5' UTRs.

Such a method is exemplified by Applicant's technique for isolating the Arabidopsis plastid clpP promoter region and the Arabidopsis plastid 16S rRNA promoter region, as set forth in detail in the Examples below. Briefly, isolation of these promoter regions is facilitated by the likelihood that gene order in the Arabidopsis plastid genome is conserved relative to that of Nicotiana tabacum, for which the entire plastid genome sequence is known. In tobacco, clpP is present in divergent orientation from the psbB gene, the coding sequence of which is conserved among a number of plant species. Because only 445 base pairs separate the psbB start codon from the divergently oriented start codon of clpP in tobacco, the sequences of the protein-coding regions of the divergent clpP and psbB genes are used to design primers for PCR that amplify the noncoding intergenic region between these genes. This region includes the promoters for psbB in one orientation and clpP in the other. An expressed sequence tag (EST) sequence from Arabidopsis is found in an EST database that appears to include a portion of the clpP coding sequence and 5' untranslated RNA (5' UTR).

The sequence of this EST is used to design primers for PCR amplification of the clpP promoter based on the Arabidopsis DNA sequence encoding the putative start of the clpP protein. These primers are paired with ones designed to match the highly conserved DNA sequences around the psbB start codon. Using these primers, a DNA fragment of approximately 500 nucleotides, which includes the Arabidopsis plastid clpP promoter region, is amplified from total DNA of Arabidopsis. A DNA fragment that includes the Arabidopsis plastid 16S rRNA promoter region is amplified in a like manner.

Using the above method, one of ordinary skill in the art can use the protein-coding regions of two nearby plastid genes to isolate intervening untranslated sequences such as promoters and other regulatory sequences from the plastid genome of any plant. Preferably the two plastid genes are adjacent, in that there are no other transcribed sequences between the two nearby plastid genes; however, it is foreseeable that this method will work even if there is a small gene, such as a gene encoding a tRNA, in the amplified region between the two nearby plastid genes. In a preferred embodiment, one of ordinary skill in the art can use the above method to isolate a plastid clpP promoter from the plastid genome of any plant. In another preferred embodiment, one of ordinary skill in the art can use the above method to isolate a plastid 16S rRNA promoter from the plastid genome of any plant.

The present invention further provides a method of using novel plastid promoters, such as the Arabidopsis plastid clpP or 16S rRNA promoters, to improve plastid transformation efficiency by reducing undesired recombination between native DNA sequences in the plastid genome and exogenous DNA sequences contained in chimeric DNA fragments that are incorporated into plastid transformation vectors. It is known that even relatively short regions of homology between native DNA sequences in the plastid genome and exogenous DNA sequences will ultimately cause somatic recombination in plastid transformants. This biological property has even been used as a means for eliminating selectable markers from plastid transformants in chloroplasts of the green alga Chlamydomonas by flanking the selectable marker with identical repeated heterologous DNA sequences (Goldschmidt-Clermont, Nucl. Acids Res. 19: 4083-4089 (1991)). Although neither the minimum size of the homology tract required nor the precise degree of sequence identity within a particular homology tract sufficient for recombination has been identified, as little as 50 bp of homology to the plastid genome may be enough to induce recombination. These recombination events are visible in transgenic plants as pale sectors in leaves resulting from division of cells in which plastid genome rearrangements have occurred. In extreme cases the result is nearly white leaves with small patches of green, indicating recombination occurring in the majority of somatic cells and their lineage.

The essential features of non-recombinogenic regulatory sequences (such as promoters and 5' and 3' UTRs) include both the ability to function correctly to control heterologous gene expression in the plastids of a plant species of interest, as well as the lack of sufficient sequence identity to promote homologous plastid recombination. The latter property may be achieved either by using a heterologous regulatory sequence derived from the plastid genome of a different plant species, which has diverged in sequence to less than 85-90% identity, or by sufficiently mutating a native regulatory sequence derived from the plastid genome of the same plant species. In one embodiment this method involves using the Arabidopsis clpP promoter of the present invention to direct transcription of genes of interest in the plastids of heterologous plant species such as tobacco, maize, rice, soybean, tomato, potato, or others. In another embodiment this method involves using the Arabidopsis 16S rRNA promoter described in the Examples to direct transcription of genes of interest in the plastids of heterologous plant species such as tobacco, maize, rice, soybean, tomato, potato, or others. In addition to higher plant plastid genes, useful heterologous promoters or 5' and 3' UTRs for non-recombinogenic regulation of plastid transgenes may also be derived from plastid genes of lower plants or algae, chromosomal genes of cyanobacteria, or genomes of viruses that infect plant or algal chloroplasts or cyanobacterial cells.

Selection of mutated native genes from the same plant, which are incapable of undesired recombination, is facilitated by random mutagenesis of regulatory sequences such that the sequence identity is reduced to at most 90% relative to the starting sequence. The pool of randomly mutated regulatory sequences is then screened for the subset that is still plastid-active (capable of normal functioning in plant plastids) by cloning each mutant upstream of a selectable marker gene that operates in the plastid, then transforming the entire pool of chimeric DNAs into the plastids of wild-type plants. Only those mutated sequences still capable of functioning in plastids will result in expression of the selectable marker in the transgenic plants. Transgenic plants expressing the selectable marker are also assessed for somatic recombination by observing the frequency of leaf sectoring. The targeted region of the plastid genome of a transformed plant expressing the selectable marker and having a desirable frequency of leaf sectoring is then sequenced to determine which mutated regulatory sequence is present. This mutated sequence thus meets the criteria of controlling expression in a plastid of a gene of interest and having sufficient sequence divergence relative to native plastid DNA sequences to reduce the frequency of undesired recombination.
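The sequence-divergence criterion of the mutagenesis step can be illustrated in silico. The following is a hypothetical sketch, not the laboratory procedure: it substitutes bases at randomly chosen positions until identity to the starting sequence is at most a chosen ceiling (90% here). The function names and the example sequence are assumptions made for illustration, and nothing in the code predicts whether a mutated sequence remains active in plastids, which the text determines experimentally by selection.

```python
import math
import random

BASES = "ACGT"


def mutate_to_max_identity(seq: str, max_identity: float = 0.90, rng=None) -> str:
    """Return a copy of seq with enough random substitutions that its
    identity to seq is at most max_identity (substitutions only, no indels)."""
    rng = rng or random.Random()
    seq = seq.upper()
    # Smallest number of substituted positions that reaches the ceiling.
    n_substitutions = math.ceil(len(seq) * (1.0 - max_identity))
    positions = rng.sample(range(len(seq)), n_substitutions)
    mutated = list(seq)
    for i in positions:
        # Always pick a base different from the original at this position.
        mutated[i] = rng.choice([b for b in BASES if b != mutated[i]])
    return "".join(mutated)


def fraction_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


if __name__ == "__main__":
    starting = "ACGT" * 25  # placeholder 100-nt sequence, not a real promoter
    variant = mutate_to_max_identity(starting, 0.90, random.Random(1))
    print(fraction_identity(starting, variant))
```

Because each chosen position is always changed to a different base, the number of substitutions directly fixes the resulting identity, which keeps the sketch deterministic with respect to the divergence threshold.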

EXAMPLES

The invention will be further described by reference to the following detailed examples. These examples are provided for purposes of illustration only, and are not intended to be limiting unless otherwise specified. Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Ausubel (ed.), Current Protocols in Molecular Biology, John Wiley and Sons, Inc. (1994); T. Maniatis, E. F. Fritsch and J. Sambrook, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989); and by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1984).

Example 1: Isolation of a divergent Arabidopsis psbB/clpP Promoter Region

Isolation of a divergent Arabidopsis psbB/clpP promoter region is facilitated by the likelihood that gene order in the Arabidopsis plastid genome is conserved relative to that of Nicotiana tabacum, a plant for which the entire plastid genome sequence is known. In tobacco, clpP is present in divergent orientation from the psbB gene, which has been sequenced from a number of plant species and shown to be conserved in sequence. An alignment of the psbB sequences of tobacco, maize, wheat, and Nicotiana acumina indicates that the first eight amino acids are identically conserved, as are their DNA coding sequences. In tobacco only 445 base pairs separate the psbB start codon from the divergently oriented start codon of clpP.

In view of the above, the Applicants postulate that the sequences of the protein-coding regions of the divergent clpP and psbB genes can be used to design primers for PCR that could amplify the noncoding intergenic region between these genes. This region will, in theory, include the promoters for psbB in one orientation and clpP in the other. An expressed sequence tag (EST) sequence from Arabidopsis is found in the TIGR NHC AtEST database (http://www.tigr.org) that appears to include a portion of the clpP coding sequence and 5' untranslated RNA (5' UTR). Because the putative translation of this sequence is similar to the mature clpP of E. coli, and hence does not appear to include a plastid transit peptide, it is postulated that this EST (Seq ID# P 3982 from the TIGR NHC AtEST database) represents a portion of the plastid clpP message. However, because Shanklin et al., 1995, Plant Cell 7: 1713-22, have suggested that nuclear-encoded clpP homologs might exist in Arabidopsis, Applicants are wary of finding these instead of the genuine plastid-encoded gene. Because ESTs by definition come from expressed messages, ESTP 3982 is not expected to include any of the untranscribed clpP promoter region.

The nucleotide sequence of ESTP 3982 is used to design primers for PCR amplification of the clpP promoter based on the Arabidopsis DNA sequence encoding the start of the clpP protein in this plant. These primers are paired with ones designed to match the highly conserved DNA sequences around the psbB start codon, which Applicants postulate are similarly conserved in Arabidopsis. The primers used are: A_clpP: 5'-AAGGGACTTTTGGAACGCCAATAGGCAT-3' (SEQ ID NO: 2) and A_psbB: 5'-CACGATACCAAGGCAAACCCATGGA-3' (SEQ ID NO: 3).
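Because both primers anneal within coding sequence and point outward into the intergenic region, each is the reverse complement of the start of its gene. A minimal sketch of that check follows, using the two primer sequences exactly as printed above; the helper name `reverse_complement` and the constant names are illustrative assumptions.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Primer sequences as given in the text (SEQ ID NO: 2 and SEQ ID NO: 3).
A_CLPP = "AAGGGACTTTTGGAACGCCAATAGGCAT"
A_PSBB = "CACGATACCAAGGCAAACCCATGGA"

if __name__ == "__main__":
    for name, primer in (("A_clpP", A_CLPP), ("A_psbB", A_PSBB)):
        sense = reverse_complement(primer)
        print(f"{name}: {len(primer)} nt, coding-strand view {sense}, "
              f"ATG at offset {sense.find('ATG')}")
```

Read on the coding strand, the clpP primer begins directly with an ATG and the psbB primer contains an ATG three bases in, consistent with primers anchored at the two start codons.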

These successfully amplify a DNA fragment of approximately 500 nucleotides from total DNA of A. thaliana (cv "Landsberg erecta") using Pfu thermostable DNA polymerase. The blunt-ended DNA fragment is sequenced both directly (using the cloning primers) and subsequent to cloning into the EcoRV site of vector pGEM5Zf(-) to construct plasmid pPH146b. The nucleotide sequence of this approximately 500-bp PCR fragment is given in SEQ ID NO: 1. Sequence analysis reveals 86% sequence identity to the tobacco clpP promoter region over a 200-bp region extending upstream of the clpP start codon. Thus, SEQ ID NO: 1 includes the Arabidopsis clpP promoter region.

DNA from plasmid pPH146b is used for PCR amplification of the divergent psbB/clpP promoter with primers M13 (-20): 5'-GTA AAA CGA CGG CCA GT-3' (SEQ ID NO: 13) (Stratagene, La Jolla, CA) and A-clpP-2: 5'-GCC ATG GAA TGG AAA AAA AAA GAG-3' (SEQ ID NO: 14). Upon amplification, A-clpP-2 incorporates an NcoI site (underlined above) on the clpP side of the divergent promoter. This facilitates cloning of the PCR product since there is already an NcoI site on the psbB side of the promoter. PCR is performed in a Perkin Elmer Thermal Cycler 480 (Perkin Elmer/Roche, Branchburg, NJ) using the thermostable Pfu DNA polymerase (Stratagene, La Jolla, CA) according to the manufacturer's recommendations as follows: 35 cycles of 1 min 95°C/1 min 55°C/2 min 72°C followed by 10 min at 72°C after the 35 cycles. The resulting PCR product (606 bp) is digested with NcoI, separated on a 1.0% low melt agarose gel, and isolated. The PCR fragment is ligated with NcoI-digested pLitmus 28 cloning vector (New England Biolabs) forming plasmid pPB41b.

Example 2: Isolation of the E. coli hemG Gene.

The E. coli hemG gene is amplified with primers HemGP1a (sense) CGC CCA TGG CAA AAA CAT TAA TTC TTT TCT C (SEQ ID NO: 15) and HemGP1b (antisense) GCG TCT AGA TTA TTT CAG CGT CGG TTT GT (SEQ ID NO: 16). Upon amplification, HemGP1a incorporates an NcoI site (underlined above) and two additional nucleotides (CA) after the NcoI site at the gene's 5' end, while HemGP1b incorporates an XbaI site (underlined above) at the gene's 3' end. These alterations facilitate cloning and, in the case of HemGP1a, provide an ATG start codon and ensure that the gene is translated in the correct reading frame.
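The engineered sites in the two hemG primers can be located with a simple string search. This sketch assumes the standard recognition sequences CCATGG for NcoI and TCTAGA for XbaI and uses the primer sequences as printed above with spaces removed; `find_sites` and the dictionary names are hypothetical helpers.

```python
# Recognition sequences for the two enzymes named in the text.
SITES = {"NcoI": "CCATGG", "XbaI": "TCTAGA"}

# Primers from the text (SEQ ID NO: 15 and 16), spaces removed.
PRIMERS = {
    "HemGP1a": "CGCCCATGGCAAAAACATTAATTCTTTTCTC",
    "HemGP1b": "GCGTCTAGATTATTTCAGCGTCGGTTTGT",
}


def find_sites(seq: str) -> dict:
    """Map enzyme name to the 0-based offset of its first site in seq."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
    sense = PRIMERS["HemGP1a"]
    atg = sense.find("ATG")
    # Codons read from the ATG embedded in the NcoI site.
    codons = [sense[i:i + 3] for i in range(atg, len(sense) - 2, 3)]
    print("HemGP1a reading frame:", " ".join(codons))
```

In HemGP1a the ATG lies inside the NcoI site, and the two added nucleotides complete the second codon so that the following triplets stay in frame. In the antisense primer HemGP1b, the TTA immediately after the XbaI site is the complement of a TAA stop codon.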

PCR is performed in a Perkin Elmer Thermal Cycler 480 (Perkin Elmer/Roche, Branchburg, NJ) using the thermostable Pfu DNA polymerase (Stratagene, La Jolla, CA) according to the manufacturer's recommendations as follows: 25 cycles of 1 min 95°C/1 min 60°C/2 min 72°C followed by 10 min at 72°C after the 25 cycles. The resulting PCR product (hemG, 563 bp) is digested with NcoI and XbaI, separated on a 1.0% low melt agarose gel, and isolated. The PCR fragment is ligated with XbaI/NcoI-digested pLitmus 28 cloning vector forming plasmid pPB27.

Example 3: Preparation of Tobacco Chloroplast Transformation Vectors Containing a E. coli hemG Gene or Modified Arabidopsis PPO Genes Driven By the clpP Promoter and the aadA Gene Driven by the psbB Promoter.

I. Construction of vector for homologous recombination into the tobacco plastid genome

The trnV and rps12/7 intergenic region of the tobacco plastid genome is modified for insertion of chimeric genes by homologous recombination. A 1.78 kb region (positions 139255 to 141036, Shinozaki et al., (1986) EMBO J 5: 2043-2049) is PCR amplified from the tobacco plastid genome and a PstI site is inserted after position 140169, yielding 915 bp and 867 bp of flanking plastid DNA 5' and 3' of the PstI insertion site. PCR amplification (PfuTurbo DNA Polymerase, Stratagene, La Jolla, CA) is performed with a primer pair inserting a BsiEI site before position 139255 (5'-TAA CGG CCG CGC CCA ATC ATT CCG GAT A-3', SEQ ID NO: 17) and a PstI site after position 140169 (5'-TAA CTG CAG AAA GAA GGC CCG GCT CCA A-3', SEQ ID NO: 18). PCR amplification is also performed with a primer pair inserting a PstI site before position 140170 (5'-CGC CTG CAG TCG CAC TAT TAC GGA TAT G-3', SEQ ID NO: 19) and a BsiWI site after position 141036 (5'-CGC CGT ACG AAA TCC TTC CCG ATA CCT C-3', SEQ ID NO: 20). The PstI-BsiEI fragment is inserted into the PstI-SacII sites of pBluescript SK+ (Stratagene), yielding pAT216, and the PstI-BsiWI fragment is inserted into the PstI-Acc65I sites of pBluescript SK+, yielding pAT215. Plasmid pAT218 contains the 1.78 kb of plastid DNA with a PstI site for insertion of chimeric genes and selectable markers and is constructed by ligation of the 2.0 kb PstI-ScaI fragment of pAT215 and the 2.7 kb PstI-ScaI band of pAT216.
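The flank sizes quoted above follow directly from the genome coordinates. A short worked check, using only the positions given in the text (the variable names are illustrative):

```python
# Coordinates on the tobacco plastid genome, as quoted in the text.
REGION_START = 139255   # first nucleotide of the amplified region
REGION_END = 141036     # last nucleotide of the amplified region
INSERT_AFTER = 140169   # the PstI site is inserted after this position

# Positions are inclusive, so the 5' flank runs REGION_START..INSERT_AFTER
# and the 3' flank runs INSERT_AFTER + 1..REGION_END.
left_flank = INSERT_AFTER - REGION_START + 1
right_flank = REGION_END - INSERT_AFTER
total = REGION_END - REGION_START + 1

if __name__ == "__main__":
    print(f"5' flank: {left_flank} bp")
    print(f"3' flank: {right_flank} bp")
    print(f"total:    {total} bp (~{total / 1000:.2f} kb)")
```

The two flanks sum to 1782 bp, which matches the 1.78 kb region described.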

II. Amplification of the tobacco plastid rps16 gene 3' untranslated RNA sequence (3' UTR) and of the Arabidopsis thaliana plastid psbA 3' UTR, and ligation into pAT218

The tobacco plastid rps16 3' UTR is PCR amplified from tobacco DNA (N. tabacum cv. Xanthi) using the following oligonucleotide pair: a SpeI site is added immediately after the stop codon of the plastid rps16 gene encoding ribosomal protein S16 with the "top strand" primer (5'-CGC GAC TAG TTC AAC CGA AAT TCA AT-3', SEQ ID NO: 21) and a PstI site is added at the 3' end of the rps16 3' UTR with the "bottom strand" primer (5'-CGC TCT GCA GTT CAA TGG AAG CAA TG-3', SEQ ID NO: 22). The amplification product is gel purified and digested with SpeI and PstI, yielding a 163 bp fragment containing the tobacco rps16 3' UTR (positions 4941 to 5093 of the tobacco plastid genome, Shinozaki et al. (1986) EMBO J 5: 2043-2049) flanked 5' by a SpeI site and 3' by a PstI site.

The A. thaliana plastid psbA 3' UTR is PCR amplified from A. thaliana DNA (ecotype Landsberg erecta) using the following oligonucleotide pair: the "top strand" primer adds a SpeI site to the 5' end of the 3' UTR and eliminates an XbaI site in the native sequence by mutating a G to an A (underlined) (5'-GCG ACT AGT TAG TGT TAG TCT AAA TCT AGT T-3', SEQ ID NO: 23) and the "bottom strand" primer adds a HindIII site to the 3' end of the UTR (5'-CCG CAA GCT TCT AAT AAA AAA TAT ATA GTA-3', SEQ ID NO: 24). The amplified region extends from position 1350 to 1552 of GenBank accession number X79898.

The 218 bp PCR product is gel purified, digested with SpeI and HindIII and ligated with the HindIII-PstI cut PCR fragment carrying the T7 terminator into the SpeI-PstI sites of pBluescript SK- (Stratagene), yielding pPH171. Sequence analysis of the psbA 3' UTR region of pPH171 compared to GenBank accession number X79898 reveals deletion of an A at positions 1440 and 1452.

The SpeI/PstI fragment containing the tobacco plastid rps16 3' UTR and the SpeI/PstI fragment containing the A. thaliana plastid psbA 3' UTR are ligated into the PstI site of pAT218 to form pPB3.
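The primer descriptions in this section can be checked mechanically. The sketch below confirms that each of SEQ ID NOs: 21 to 24 carries the restriction site it is said to add, and that the psbA "top strand" primer no longer contains an XbaI site. The recognition sequences are the standard ones for these enzymes; the dictionary layout and names are illustrative assumptions, not part of the original disclosure.

```python
# Check that each primer contains the restriction site it is described as adding.
SITES = {
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
    "HindIII": "AAGCTT",
    "XbaI": "TCTAGA",
}

# Primer sequences copied from the text with spaces removed, paired with the added site.
PRIMERS = {
    "SEQ ID NO: 21": ("CGCGACTAGTTCAACCGAAATTCAAT", "SpeI"),
    "SEQ ID NO: 22": ("CGCTCTGCAGTTCAATGGAAGCAATG", "PstI"),
    "SEQ ID NO: 23": ("GCGACTAGTTAGTGTTAGTCTAAATCTAGTT", "SpeI"),
    "SEQ ID NO: 24": ("CCGCAAGCTTCTAATAAAAAATATATAGTA", "HindIII"),
}

def site_position(primer: str, enzyme: str) -> int:
    """0-based index of the enzyme's recognition site in the primer, or -1 if absent."""
    return primer.find(SITES[enzyme])

positions = {name: site_position(seq, enz) for name, (seq, enz) in PRIMERS.items()}
xbai_removed = site_position(PRIMERS["SEQ ID NO: 23"][0], "XbaI") == -1

for name, pos in positions.items():
    print(name, PRIMERS[name][1], "site at index", pos)
print("XbaI site absent from SEQ ID NO: 23:", xbai_removed)
```

Each site sits three or four bases from the 5' end of its primer, leaving a short clamp for the enzyme, and the TCTAAA stretch in SEQ ID NO: 23 is consistent with the described G-to-A change in a native TCTAGA.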

III. Ligation of an E. coli hemG Gene or Modified Arabidopsis PPO Genes Driven by the clpP Promoter and the aadA Gene Driven by the psbB Promoter into a Chloroplast Transformation Vector

The coding sequence of the aadA gene, a bacterial gene encoding the enzyme aminoglycoside 3''-adenyltransferase that confers resistance to spectinomycin and streptomycin, is isolated from pRL277 (Black et al. (1993) Molecular Microbiology 9: 77-84 and Prentki et al. (1991) Gene 103: 17-23). The 5' major portion of the aadA coding sequence is isolated as a 724 bp BspHI-BssHII fragment from pRL277 (the start codon is at the BspHI site) and the 3' remainder of the aadA gene is modified by adding a SpeI site 20 bp after the stop codon by PCR amplification using pRL277 as template and the following oligonucleotide pair: the "top strand" primer (5'-ACC GTA AGG CTT GAT GAA-3', SEQ ID NO: 25) and the "bottom strand" primer, which adds a SpeI site (5'-CCC ACT AGT TTG AAC GAA TTG TTA GAC-3', SEQ ID NO: 26). The 658 bp amplification product is gel purified and digested with BssHII and SpeI, and the resulting 89 bp fragment is ligated to the 5' portion of the aadA gene carried on the 724 bp BspHI-BssHII fragment. The aadA gene is fused to the psbB promoter by ligating pPB41b NcoI/SpeI (3369 bp) and a BspHI/SpeI fragment containing the aadA gene (813 bp, described above) to form plasmid pPB42. In order to fuse the E. coli hemG gene to the clpP promoter, pPB42 KpnI/NcoI (4159 bp) is ligated with pPB27 KpnI/NcoI (577 bp), forming plasmid pPB43. A modified Arabidopsis PPO gene is attached to the clpP promoter in a similar manner.
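The fragment sizes in this step can be tallied as a quick consistency check. The sketch below simply adds the quoted fragment lengths; it assumes the stated sizes can be summed directly and ignores any overhang bookkeeping, so the plasmid totals are estimates rather than figures from the original text.

```python
# Naive size bookkeeping for the aadA reconstruction and the two ligations.
# Assumption: quoted fragment sizes add directly; overhangs are not modeled.

def ligation_size(*fragments_bp: int) -> int:
    """Approximate size of a ligation product as the sum of its fragments."""
    return sum(fragments_bp)

aada_5prime = 724   # BspHI-BssHII fragment from pRL277
aada_3prime = 89    # BssHII-SpeI fragment from the 658 bp PCR product
aada_full = ligation_size(aada_5prime, aada_3prime)

ppb42_estimate = ligation_size(3369, aada_full)  # pPB41b NcoI/SpeI + aadA
ppb43_estimate = ligation_size(4159, 577)        # pPB42 KpnI/NcoI + pPB27 KpnI/NcoI

print(aada_full, ppb42_estimate, ppb43_estimate)  # 813 4182 4736
```

The 813 bp total agrees with the size given for the BspHI/SpeI aadA fragment. Joining that fragment to the NcoI end of pPB41b works because BspHI and NcoI both leave a 5'-CATG overhang.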

Final tobacco plastid transformation vectors are made by ligating pPB43 SpeI/XbaI (aadA-psbB::clpP-hemG, 1854 bp) or pPB63 SpeI (aadA-psbB::clpP-Arab PPO, 2843 bp) with pPB3 SpeI (5033 bp), which contains the psbA 3' UTR, the rps16 3' UTR, and tobacco homologous flanking regions, to produce plasmids pPB45a and pPB64a having the psbA 3' UTR fused to the aadA gene and the rps16 3' UTR fused to the hemG gene or Arab PPO gene.

Example 4: Preparation of Tobacco Chloroplast Transformation Vectors Containing the GUS Gene Driven by the clpP Promoter and the aadA Gene Driven by the psbB Promoter

An 1864 bp β-glucuronidase (GUS) reporter gene fragment derived from plasmid pRAJ275 (Clontech) containing an NcoI restriction site at the ATG start codon and an XbaI site following the native 3' UTR is produced by digestion with NcoI and XbaI. The GUS gene is fused to the clpP promoter by ligating pPB43 XbaI/HindIII (2846 bp), pPB43 NcoI/HindIII (1339 bp), and the 1864 bp GUS reporter gene fragment, forming plasmid pPB47. A final tobacco plastid transformation vector is made by ligating pPB47 SpeI/XbaI (aadA-psbB::clpP-GUS, 3167 bp) and a plastid transformation vector as described above to produce plasmid pPB48a.

Example 5: Preparation of a Chimeric Gene Containing the Arabidopsis clpP Promoter and Native clpP 5' Untranslated Sequence Fused to a GUS Reporter Gene and Tobacco Plastid rps16 Gene 3' Untranslated Sequence in a Plastid Transformation Vector

I. Amplification of the Arabidopsis Plastid clpP Gene Promoter and Complete 5' Untranslated RNA (5' UTR).

DNA from plasmid pPH146b is used as the template for PCR with a left-to-right "top strand" primer comprising an introduced EcoRI restriction site at position -234 relative to the ATG start codon of the Arabidopsis plastid clpP gene (nucleotide no. 263 of SEQ ID NO: 1) (primer AclpP1a: 5'-GCGGAATTCATCATTCAGAAGCCCGTTCGT-3' (SEQ ID NO: 4; EcoRI restriction site underlined)) and a right-to-left "bottom strand" primer homologous to the region from -21 to -1 relative to the ATG start codon of the clpP promoter that incorporates an introduced BspHI restriction site at the start of translation (primer AclpP2b: 5'-GCGTCATGAAATGAAAGAAAAAGAGAAT-3' (SEQ ID NO: 5; BspHI restriction site underlined)). This PCR reaction is undertaken with Pfu thermostable DNA polymerase (Stratagene, La Jolla, CA) in a Perkin Elmer Thermal Cycler 480 according to the manufacturer's recommendations (Perkin Elmer/Roche, Branchburg, NJ) as follows: 7 min 95°C, followed by 4 cycles of 1 min 95°C/2 min 43°C/1 min 72°C, then 25 cycles of 1 min 95°C/2 min 55°C/1 min 72°C. A 250 bp amplification product comprising the promoter and 5' untranslated region of the Arabidopsis clpP gene, containing an EcoRI site at its left end and a BspHI site at its right end with two modifications near the ATG to correspond with the tobacco clpP 5' UTR sequence, is gel purified using standard procedures and digested with EcoRI and BspHI (all restriction enzymes may be purchased from New England Biolabs, Beverly, MA).
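The two-stage cycling program described above (a few low-stringency cycles at 43°C followed by cycles at 55°C) can be written down as a small data structure. The sketch below is one illustrative way to encode it and to total the programmed hold time; the class names are hypothetical, and ramp times between temperatures are not included.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    minutes: int
    celsius: int

@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: tuple

# Program as stated in the text: initial denaturation, 4 cycles annealing at 43 C,
# then 25 cycles annealing at 55 C.
PROGRAM = (
    Stage(1, (Step(7, 95),)),
    Stage(4, (Step(1, 95), Step(2, 43), Step(1, 72))),
    Stage(25, (Step(1, 95), Step(2, 55), Step(1, 72))),
)

def hold_minutes(program) -> int:
    """Total programmed hold time in minutes, excluding ramp times."""
    return sum(stage.cycles * sum(step.minutes for step in stage.steps)
               for stage in program)

total_minutes = hold_minutes(PROGRAM)
total_cycles = sum(stage.cycles for stage in PROGRAM[1:])

print(total_cycles, total_minutes)  # 29 123
```

The lower annealing temperature in the first four cycles is consistent with primers whose 5' ends (the added restriction sites) do not match the template until the first copies have been made.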

II. Amplification of the Tobacco Plastid rps16 Gene 3' Untranslated RNA Sequence (3' UTR).

Total DNA from N. tabacum cv. "Xanthi NC" is used as the template for PCR as described above with a left-to-right "top strand" primer comprising an introduced XbaI restriction site immediately following the TAA stop codon of the plastid rps16 gene encoding ribosomal protein S16 (primer rps16P_1a: 5'-GCGTCTAGATCAACCGAAATTCAATTAAGG-3' (SEQ ID NO: 6; XbaI restriction site underlined)) and a right-to-left "bottom strand" primer homologous to the region from +134 to +151 relative to the TAA stop codon of rps16 that incorporates an introduced HindIII restriction site at the 3' end of the rps16 3' UTR (primer rps16P_1b: 5'-CGCAAGCTTCAATGGAAGCAATGATAA-3' (SEQ ID NO: 7; HindIII restriction site underlined)). The amplification product comprising the 3' untranslated region of the rps16 gene, containing an XbaI site at its left end and a HindIII site at its right end and containing the region corresponding to nucleotides 4943 to 5093 of the N. tabacum plastid DNA sequence (Shinozaki et al., 1986, EMBO J 5: 2043-2049), is gel purified and digested with XbaI and HindIII.
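The primer description above can be cross-checked against the primer sequences themselves. The sketch below confirms that each primer contains its stated restriction site and that the template-matching tail of rps16P_1b has the length implied by the +134 to +151 interval. It assumes the annealing portion is everything 3' of the added site, which is an interpretation rather than something the text states.

```python
# Cross-check the rps16 3' UTR primers against their description.
XBAI = "TCTAGA"
HINDIII = "AAGCTT"

RPS16P_1A = "GCGTCTAGATCAACCGAAATTCAATTAAGG"  # SEQ ID NO: 6
RPS16P_1B = "CGCAAGCTTCAATGGAAGCAATGATAA"     # SEQ ID NO: 7

xbai_index = RPS16P_1A.find(XBAI)
hindiii_index = RPS16P_1B.find(HINDIII)

# Assumed annealing region: bases 3' of the HindIII site in the bottom-strand primer.
annealing_tail = RPS16P_1B[hindiii_index + len(HINDIII):]

# Length of the inclusive interval +134..+151 relative to the stop codon.
expected_tail_length = 151 - 134 + 1

print(xbai_index, hindiii_index, len(annealing_tail), expected_tail_length)
```

Both sites start at index 3, after a three-base 5' clamp, and the 18-base tail matches the 18 positions spanned by +134 to +151.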

III. Ligation of a GUS Reporter Gene Fragment to the clpP Gene Promoter and 5' and 3' UTRs.

An 1864 bp β-glucuronidase (GUS) reporter gene fragment derived from plasmid pRAJ275 (Clontech) containing an NcoI restriction site at the ATG start codon and an XbaI site following the native 3' UTR is produced by digestion with NcoI and XbaI. This fragment is ligated in a four-way reaction to the 250 bp EcoRI/BspHI Arabidopsis clpP promoter fragment, the 157 bp XbaI/HindIII tobacco rps16 3' UTR fragment, and a 3148 bp EcoRI/HindIII fragment from cloning vector pGEM3Zf(-) (Promega, Madison, WI) to construct plasmid pPH165. Plastid transformation vector pPH166 is constructed by digesting plasmid pPRV111a (Zoubenko et al. 1994, Nucleic Acids Res. 22 (19): 3819-3824) with EcoRI and HindIII and ligating the resulting 7287 bp fragment to a 2222 bp EcoRI/HindIII fragment of pPH165.

Example 6: Isolation of the Arabidopsis 16S rRNA Gene Promoter Region

Isolation of the Arabidopsis 16S rRNA gene promoter region is facilitated by the likelihood that gene order in the Arabidopsis plastid genome is conserved relative to that of Nicotiana tabacum, a plant whose entire plastid genome sequence is known. In Sinapis alba, a species closely related to Arabidopsis, the 16S rRNA gene and valine tRNA are oriented as in tobacco (GenBank accession number CHSARRN1). The Arabidopsis 16S rRNA gene promoter region is isolated by PCR amplification (PfuTurbo DNA Polymerase, Stratagene, La Jolla, CA) using total A. thaliana DNA (cv. "Landsberg erecta") as template and the following primers, which are conserved in both Nicotiana and Sinapis alba: the "top strand" primer (5'-CAGTTCGAGCCTGATTATCC-3' (SEQ ID NO: 8)) and the "bottom strand" primer (5'-GTTCTTACGCGTTACTCACC-3' (SEQ ID NO: 9)). The predicted 379 bp (based on the Sinapis alba sequence) amplification product comprising the Arabidopsis 16S rRNA gene promoter region, corresponding to nucleotides 102508 to 102872 of the tobacco plastid genome (Shinozaki et al., 1986, EMBO J 5: 2043-2049), is blunt-end ligated into the EcoRV site of pGEM5Zf(-) (Promega) to construct pArab16S, and sequence analysis and comparison to the tobacco 16S rRNA promoter are performed. The Arabidopsis 16S rRNA gene promoter region product is 369 bp and is set forth as SEQ ID NO: 10.

Example 7: Preparation of a Chimeric Gene Containing the Arabidopsis 16S rRNA Gene Promoter and Native 5' Untranslated Sequence Fused to the Ribosome Binding Site of the Tobacco rbcL Gene, a GUS Reporter Gene and the Tobacco Plastid rps16 Gene 3' Untranslated Sequence in a Plastid Transformation Vector

I. Amplification of the Arabidopsis Plastid 16S rRNA Gene Promoter and Native 5' Untranslated Sequence (5' UTR) and Fusion to the Ribosome Binding Site of the Tobacco rbcL Gene.

DNA from plasmid pArab16S is used as the template for PCR with a "top strand" primer comprising an introduced EcoRI restriction site at the 5' end of the 16S rRNA gene promoter region (position 63 of SEQ ID NO: 10) (5'-GCCGGAATTCTCGCTGTGATCGAATAAGAATG-3' (SEQ ID NO: 11; EcoRI restriction site underlined)). The "bottom strand" primer extends to position 172 (SEQ ID NO: 10) of the 16S rRNA gene promoter 5' untranslated region, mutates three ATGs downstream of the transcription start site by changing position 151 (T to G) (SEQ ID NO: 10), position 158 (A to C) (SEQ ID NO: 10) and position 167 (A to C) (SEQ ID NO: 10), fuses the ribosome binding site of the tobacco rbcL gene (positions 57569 to 57585) (Shinozaki et al., 1986, EMBO J 5: 2043-2049) as a 5' extension to the 3' end of the 16S rRNA gene 5' UTR and introduces a BspHI site at the 3' end of the ribosome binding site (5'-GCCTTCATGAATCCCTCCCTACAACTATCCAGGCGCTTCAGATTCGCCTGGAGTT-3' (SEQ ID NO: 12; BspHI restriction site underlined)). PCR amplification is performed with the PfuTurbo DNA Polymerase kit (Stratagene). The 145 bp amplification product comprising the Arabidopsis 16S rRNA gene promoter and 5' untranslated region with three ATGs mutated and the ribosome binding site of the tobacco rbcL gene is gel purified and digested with EcoRI and BspHI, yielding a 131 bp product.
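The drop from 145 bp to 131 bp on digestion can be reproduced from the two primer sequences alone. The sketch below is one way to reconcile the figures. It assumes that the quoted size counts the length of each strand including its 4-nt 5' overhang, that the amplified region contains no internal EcoRI or BspHI site, and that both enzymes cut after the first base of their palindromic recognition sequence (G^AATTC and T^CATGA). The function names are illustrative.

```python
# Recognition sequence and number of bases 5' of the cut on the strand read 5'->3'.
ENZYMES = {"EcoRI": ("GAATTC", 1), "BspHI": ("TCATGA", 1)}

TOP_PRIMER = "GCCGGAATTCTCGCTGTGATCGAATAAGAATG"  # SEQ ID NO: 11
BOTTOM_PRIMER = "GCCTTCATGAATCCCTCCCTACAACTATCCAGGCGCTTCAGATTCGCCTGGAGTT"  # SEQ ID NO: 12

def lost_from_5prime(primer: str, enzyme: str) -> int:
    """Bases removed from the 5' end of the strand that begins with this primer."""
    site, cut = ENZYMES[enzyme]
    return primer.index(site) + cut

def lost_from_3prime(primer: str, enzyme: str) -> int:
    """Bases removed from the 3' end of the strand complementary to this primer.

    Valid for palindromic sites: the complementary strand carries the same site,
    followed by the complement of the primer's 5' clamp.
    """
    site, cut = ENZYMES[enzyme]
    return primer.index(site) + len(site) - cut

PRODUCT_NT = 145  # undigested amplification product

top_strand = (PRODUCT_NT
              - lost_from_5prime(TOP_PRIMER, "EcoRI")
              - lost_from_3prime(BOTTOM_PRIMER, "BspHI"))
bottom_strand = (PRODUCT_NT
                 - lost_from_5prime(BOTTOM_PRIMER, "BspHI")
                 - lost_from_3prime(TOP_PRIMER, "EcoRI"))

print(top_strand, bottom_strand)  # 131 131
```

Under these assumptions each strand loses 5 bases at its own 5' end and 9 bases at its 3' end, which accounts for the 14-base difference between the 145 bp amplification product and the 131 bp digested fragment.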

II. Ligation of the Arabidopsis 16S rRNA Gene Promoter, 5'UTR and Ribosome Binding Site of the Tobacco rbcL gene to the GUS Reporter Gene and Tobacco Plastid rps16 Gene 3'Untranslated Region (3'UTR) in a Plastid Transformation Vector.

An 1864 bp β-glucuronidase (GUS) reporter gene fragment derived from plasmid pRAJ275 (Clontech) containing an NcoI restriction site at the ATG start codon and an XbaI site following the stop codon is produced by digestion with NcoI and XbaI. This fragment is ligated in a four-way reaction to the 131 bp EcoRI/BspHI fragment carrying the Arabidopsis 16S rRNA gene promoter, 5' UTR and tobacco rbcL ribosome binding site, the XbaI/HindIII tobacco rps16 3' UTR fragment described in Example 2, and a 3148 bp EcoRI/HindIII fragment from cloning vector pGEM3Zf(-) (Promega, Madison, WI). A plastid transformation vector is constructed by digesting the previous construct with EcoRI and HindIII and ligating the resulting 2.1 kb fragment to a 7.3 kb EcoRI/HindIII fragment from plasmid pPRV111a (Zoubenko et al. 1994, Nucleic Acids Res. 22 (19): 3819-3824).

Example 8: Biolistic Transformation of the Tobacco Plastid Genome

Seeds of Nicotiana tabacum cv. 'Xanthi nc' are germinated seven per plate in a 1-inch circular array on T agar medium and bombarded 12-14 days after sowing with 1 µm tungsten particles (M10, Biorad, Hercules, CA) coated with DNA from the plasmids described above in the Examples, essentially as described in Svab, Z. and Maliga, P. ((1993) PNAS 90, 913-917). Bombarded seedlings are incubated on T medium for two days, after which leaves are excised and placed abaxial side up in bright light (350-500 µmol photons/m²/s) on plates of RMOP medium (Svab, Z., Hajdukiewicz, P. and Maliga, P. (1990) PNAS 87, 8526-8530) containing 500 µg/ml spectinomycin dihydrochloride (Sigma, St. Louis, MO). Resistant shoots appearing underneath the bleached leaves three to eight weeks after bombardment are subcloned onto the same selective medium and allowed to form callus, and secondary shoots are isolated and subcloned. Complete segregation of transformed plastid genome copies (homoplasmicity) in independent subclones is assessed by standard techniques of Southern blotting (Sambrook et al., (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor). BamHI/EcoRI-digested total cellular DNA (Mettler, I. J. (1987) Plant Mol Biol Reporter 5, 346-349) is separated on 1% Tris-borate (TBE) agarose gels, transferred to nylon membranes (Amersham) and probed with 32P-labeled random primed DNA sequences corresponding to a 0.7 kb BamHI/HindIII DNA fragment from a plastid transformation vector containing a portion of the rps7/12 plastid targeting sequence. Homoplasmic shoots are rooted aseptically on spectinomycin-containing MS/IBA medium (McBride, K. E. et al. (1994) PNAS 91, 7301-7305) and transferred to the greenhouse.

Various modifications of the invention described herein will become apparent to those skilled in the art. Such modifications are intended to fall within the scope of the appended claims.