

Title:
IDENTIFYING AGENTS THAT CONFER DESIRED TRAITS USING THE CONSTANS RESPONSE ELEMENT
Document Type and Number:
WIPO Patent Application WO/2011/043868
Kind Code:
A1
Abstract:
Methods for identifying compounds that can enhance plant performance are disclosed. Further disclosed are methods for conferring altered traits to plants by incorporation of expression vectors operably linking specified promoter sequences to reporter genes which signal changes in plant phenotypes in response to application of compounds.

Inventors:
ARMSTRONG JOSHUA I (US)
TIWARI SHIV B (US)
RATCLIFFE OLIVER J (US)
Application Number:
PCT/US2010/045941
Publication Date:
April 14, 2011
Filing Date:
August 18, 2010
Assignee:
MENDEL BIOTECHNOLOGY INC (US)
ARMSTRONG JOSHUA I (US)
TIWARI SHIV B (US)
RATCLIFFE OLIVER J (US)
International Classes:
G01N33/48
Foreign References:
US6495737B1, 2002-12-17
US20020111471A1, 2002-08-15
US20050019927A1, 2005-01-27
US20070226839A1, 2007-09-27
US20080301836A1, 2008-12-04
Other References:
SCHWARTZ ET AL.: "Cis-regulatory Changes at Flowering Locus T Mediate Natural Variation in Flowering Responses of Arabidopsis thaliana", GENETICS 183, vol. 183, 3 August 2009 (2009-08-03), pages 723 - 732
IZAWA ET AL.: "Comparative biology comes into bloom: genomic and genetic comparison of flowering pathways in rice and Arabidopsis", CURRENT OPINION IN PLANT BIOLOGY, vol. 6, 2003, pages 113 - 120, XP009084520, DOI: doi:10.1016/S1369-5266(03)00014-1
DATABASE GENEBANK [online] 9 February 2011 (2011-02-09), SCHWARTZ ET AL., retrieved from http://www.ncbi.nlm.nih.gov/nuccore/256427029?sat=OLD&satkey=5341381 Database accession no. GQ395478
"Cis-regulatory Changes at Flowering Locus T Mediate Natural Variation in Flowering Responses of Arabidopsis thaliana", GENETICS 183, vol. 183, 3 August 2009 (2009-08-03), pages 723 - 732
Attorney, Agent or Firm:
MAO, Yifan et al. (Inc.3935 Point Eden Wa, Hayward California, US)
Claims:
What is claimed is:

1. A method of selecting a compound of interest comprising the steps of:

a) applying a test compound to a plant cell comprising an expression vector that comprises a promoter sequence operably linked to a reporter gene;

wherein said promoter sequence comprises SEQ ID NO: 16 or SEQ ID NO: 44; and

b) selecting the compound of interest that modifies the reporter gene activity or expression level relative to a control plant;

wherein said modification of reporter gene activity or expression level is an increase or decrease in reporter gene activity or expression level.

2. The method of claim 1, wherein the promoter sequence has one or more subsequences, any of which has a percentage identity with any of SEQ ID NOs: 1-4, 7-10, 13-21, 55-56 or a functional part thereof;

wherein the percentage identity is selected from the group consisting of at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% and 100%; wherein the promoter sequence can be bound by a COL polypeptide and as a result of said binding, the promoter sequence regulates transcription of the reporter gene.

3. The method of claim 1, wherein the promoter sequence comprises a continuous region of at least 8, 9, 10, 13, 17, 20, 30, 35, 40, 50, 75, 100, 125, 150, 191, 200, 251, 311, 491, 496, 525, 550, 575, 600, 650, 700, 750, 800, 900, 950, 1000, 1050, 1100, 1150, 1200, 1500, 1633, 1639, 1800, 2465, 2500, or 5700 base pairs of SEQ ID NO: 1, 14, 15, 21, 55 or 56;

wherein the promoter sequence can be bound by a COL polypeptide and as a result of said binding the promoter sequence regulates transcription of the reporter gene.

4. The method of claim 2, wherein the plant cell is comprised within a plant, and the method further comprises a validation assay comprising:

contacting the plant cell with the compound of interest, and detecting an altered trait in the plant relative to a control plant; wherein the altered trait includes accelerated onset of flowering, delayed onset of flowering, enhanced tolerance to biotic or abiotic stress, increased yield, enhanced disease resistance, altered sterility, reduced sensitivity to light, greater early season growth, greater height, greater stem diameter, increased biomass, increased photosynthetic rate, increased resistance to lodging, increased internode length, increased secondary rooting, greater cold tolerance, greater tolerance to water deprivation, greater tolerance to salt, greater tolerance to heat, altered sugar sensing, reduced stomatal conductance, altered C/N sensing, increased low nitrogen tolerance, increased low phosphorus tolerance, increased tolerance to hyperosmotic stress, greater late season growth and vigor, increased number of mainstem nodes, and/or greater canopy coverage.

5. The method of claim 2, wherein one or more additional compounds of interest are applied to the plant cell.

6. The method of claim 2, wherein the promoter sequence is operably linked to SEQ ID NO: 22 that encodes SEQ ID NO: 23, and wherein SEQ ID NO: 23 regulates transcription of a reporter gene operably linked to SEQ ID NO: 24.

7. The method of claim 2, wherein the altered reporter gene activity is indicated by changes in colorimetric, fluorescent, or luminescence signals.

8. The method of claim 2, wherein the reporter gene is any of SEQ ID NO: 49-51.

9. A method of selecting a compound of interest, the method comprising the steps of:

a) applying the compound of interest to a plant cell comprising an expression vector comprising a promoter sequence operably linked to a reporter gene, wherein the promoter sequence comprises one or multiple copies of SEQ ID NO: 7-10, 13, 16-20, or any combinations thereof; wherein said promoter sequence regulates expression of the reporter gene in a plant cell; and

b) selecting the compound of interest on the basis of an alteration of activity of the reporter gene relative to a control plant.

10. The method of claim 9, wherein the plant cell is comprised within a plant, and the method steps further comprise:

contacting the plant cell with the compound of interest and detecting an altered trait of the plant relative to a control plant;

wherein the altered traits include accelerated onset of flowering, delayed onset of flowering, enhanced tolerance to biotic or abiotic stress, increased yield, enhanced disease resistance, altered sterility, reduced sensitivity to light, greater early season growth, greater height, greater stem diameter, increased biomass, increased photosynthetic rate, increased resistance to lodging, increased internode length, increased secondary rooting, greater cold tolerance, greater tolerance to water deprivation, greater tolerance to salt, greater tolerance to heat, altered sugar sensing, reduced stomatal conductance, altered C/N sensing, increased low nitrogen tolerance, increased low phosphorus tolerance, increased tolerance to hyperosmotic stress, greater late season growth and vigor, increased number of mainstem nodes, and/or greater canopy coverage.

11. The method of claim 9, wherein one or more additional compounds of interest are applied to a plant cell.

12. The method of claim 9, wherein the promoter sequence is operably linked to SEQ ID NO: 22 that encodes SEQ ID NO: 23, and wherein SEQ ID NO: 23 regulates the transcription of a reporter gene operably linked to SEQ ID NO: 24.

13. The method of claim 9, wherein the promoter sequence is operably linked to a reporter gene; and wherein the promoter can be activated by a translational fusion of a glucocorticoid receptor and at least one COL polypeptide.

14. A method of conferring an altered trait to a plant relative to a control plant, the method comprising the steps of:

a) applying a compound of interest to a plant cell, wherein the plant cell comprises an expression vector comprising a promoter sequence operably linked to a reporter gene, wherein the promoter sequence has one or more subsequences, any of which has a percentage identity with any of SEQ ID NOs: 1-4, 7-10, 13-21, 55-56 or a functional part thereof; wherein the percentage identity is selected from the group consisting of at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% and 100%; and

wherein said promoter sequence regulates expression of the reporter gene in a plant cell;

b) selecting the compound of interest on the basis of an alteration of activity in a plant cell of the reporter gene or an alteration of expression level in a plant cell of the reporter gene relative to activity or expression level in a control plant cell;

c) applying the selected compound of step (b) to a plant; and

d) detecting an altered trait of the plant relative to a control plant.

15. The method of conferring accelerated or delayed onset of flowering to plants, the method comprising the steps of the method in claim 14, wherein the plants have accelerated or delayed onset of flowering.

16. The method of conferring altered abiotic stress tolerance, the method comprising the steps of the method in claim 14, wherein the plants have increased abiotic stress tolerance.

17. A method of selecting a compound of interest comprising the steps of:

a) contacting at least one candidate compound with a plant cell comprising a promoter operably linked to a polynucleotide encoding a translational fusion of a reporter gene molecule and a polypeptide of interest,

b) selecting a compound that alters the reporter gene activity or expression level relative to controls.

18. The method of claim 17, wherein the polypeptide of interest is SEQ ID NO: 57.

19. The method of claim 17, wherein the promoter is a CORE promoter.

20. The method of claim 17, further comprising the step of:

c) contacting a plant with the selected compound and detecting a modified trait in the plant relative to controls.

21. The method of claim 20, wherein the modified trait is increased yield.

Description:
IDENTIFYING AGENTS THAT CONFER DESIRED TRAITS USING THE CONSTANS RESPONSE ELEMENT

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims benefit of U.S. provisional application no. 61/235,271, filed on August 19, 2009. The application also claims priority to the PCT application PCT/US 10/37391, filed on June 4, 2010, which claims priority to the provisional application 61/184,588, filed on June 5, 2009. The entire content of each of these applications is hereby incorporated by reference.

FIELD OF THE INVENTION

The present invention relates to the enhancement of plant performance by manipulating genetic regulatory pathways, particularly CONSTANS regulated flowering pathways, through application of chemical compounds.

BACKGROUND OF THE INVENTION

The transition of plants from a vegetative state to a reproductive stage is complex, involving alterations on multiple levels. In many plant species, this transition is controlled by day length, which is perceived in leaves and induces a systemic signal, called florigen, which moves through the phloem to the shoot apex (Turck et al, 2008). Flowering is integrated with several physiological pathways including nutrient sensing (sugar signaling), development and the perception of environmental stress. The ability to control precisely the timing of flowering would be a significant benefit in many types of crops including food crops, ornamentals, and biomass crops. The timing of the floral transition often has a significant impact on both biomass production and grain or fruit yield, with early flowering sometimes being associated with reduced grain or fruit production and reduced plant biomass. In contrast, a delay or prevention of flowering can enhance vegetative biomass accumulation, which is a desirable trait for forage and biomass crops. Non-synchronous flowering of parental varieties can also pose a barrier to the development of hybrid lines, impeding breeding of new out-crossed plant varieties of agricultural interest. Additionally, many crop species show an increased sensitivity to environmental stresses such as water deficit during the reproductive versus the vegetative phase of the life cycle (Edmeades et al, 2000). The ability to manipulate directly the timing of the floral transition or to activate gene networks that confer stress tolerance during particular developmental stages could therefore have direct implications on the yield of bio-energy and food crops.

Manipulation of the floral transition can be achieved via the genetic regulation of critical components of the flowering pathway. Current transgenic technologies typically rely on constitutive promoters or tissue-specific promoters driving ectopic expression of a gene (or genes). For instance, there are numerous examples in the art whereby constitutive, high-level expression or knock-out or knock-down of CONSTANS (CO), FLOWERING LOCUS C (FLC), SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1) and other genes result in dramatic changes in the timing of the floral transition (Ratcliffe and Riechmann, 2002). Of the aforementioned genes, the transcription factor CONSTANS has a critical role in the triggering of flowering in response to photoperiod (Putterill et al, 2004). CO is a member of a family of related transcription factors, which are conserved across plant species, known as CONSTANS LIKE PROTEINS (COL); in Arabidopsis the COL family comprises around 17 members (Robson et al, 2001).

However, genetic engineering of plants to manipulate floral transition can be costly and time consuming. An alternative approach is to develop methods to identify additional molecules that are capable of modulating these genetic pathways. The present application provides novel and efficient methods that can be rapidly deployed to alter floral transition. These methods comprise treating plants with chemical compounds to induce specific molecular signaling pathways resulting in altered expression or activity of the COL family of transcription factors. These chemical compounds are identified through screening assays using promoter elements that are responsive to key genetic components involved in the signaling cascade that culminates with flowering. The present invention provides advantages relative to other approaches in that compounds identified in accordance with the invention can be easily and quickly deployed across multiple plant species. Chemical compounds can be manufactured in a cost effective way and applied to plants to induce a desired trait, for example, accelerated or delayed flowering, which is of tremendous value in breeding of new out-crossed plant varieties of agricultural interest or in controlling the growth and development of bio-energy crops. Examples of how to identify and employ chemical compounds to regulate plant traits are provided.

The present invention thus relates to novel methods for selecting a useful chemical compound and use of the compound to induce an altered or a modified trait in a plant. The other aspects and embodiments are described below and can be derived from the teachings of this disclosure as a whole.

SUMMARY OF THE INVENTION

The present invention provides methods and compositions for identifying chemicals that can be used to modulate flowering pathways. Compounds identified in accordance with the methods can be used to treat plants quickly and efficiently to confer an altered or a modified trait, such as accelerated or delayed flowering, at a time when the altered or modified trait is desired.

The invention provides a method of identifying a compound that can be used to enhance plant performance. In one aspect, the method comprises the application of a candidate compound to a plant cell comprising an expression vector that comprises a promoter sequence comprising the CO-Response Element motif (CORE motif): SEQ ID NO: 16 or SEQ ID NO: 44 operably linked to a reporter gene and selecting the compound that alters reporter gene activity or expression level relative to a control compound (CORE reporter system). The promoter sequences that comprise one or more copies of the polynucleotide SEQ ID NO: 16 or SEQ ID NO: 44 and can be regulated by at least one COL polypeptide are referred to as CORE promoters (CO-Response Element Promoters) in the present application.
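As a rough illustration of what "comprising one or more copies" of a motif means in practice, the sketch below scans a promoter string for a motif on both strands. It is a minimal sketch under stated assumptions: the function names are hypothetical, the toy promoter is invented, and the four-base TGTG core (taken from the Figure 3 description of the mutated motif) stands in for SEQ ID NO: 16, whose full sequence is given only in the Sequence Listing. It also assumes that the "complementary" sequence SEQ ID NO: 44 corresponds to the reverse complement read on the opposite strand.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif_sites(promoter: str, motif: str) -> list[tuple[int, str]]:
    """List (0-based position, strand) for every occurrence of the motif.

    '+' means the motif itself appears in the given strand; '-' means its
    reverse complement appears there. Overlapping sites are reported.
    """
    promoter = promoter.upper()
    motif = motif.upper()
    rc = reverse_complement(motif)
    sites = []
    for i in range(len(promoter) - len(motif) + 1):
        window = promoter[i:i + len(motif)]
        if window == motif:
            sites.append((i, "+"))
        elif window == rc:
            sites.append((i, "-"))
    return sites


# Placeholder values for illustration only (not sequences from the listing).
PLACEHOLDER_MOTIF = "TGTG"
TOY_PROMOTER = "AATGTGCCCACATT"

sites = find_motif_sites(TOY_PROMOTER, PLACEHOLDER_MOTIF)
print(sites)
```

A promoter with at least one reported site would contain a copy of the motif in this simplified sense; whether it is a CORE promoter also depends on regulation by a COL polypeptide, which a sequence scan cannot establish.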

In one embodiment, the promoter includes any of the promoter sequences provided by SEQ ID NOs: 1-4, 7-10, 13-21, 55-56. The functional part of the promoter that is capable of regulating transcription when operably linked to a transcribable polynucleotide may have about 8, 9, 10, 13, 17, 20, 30, 35, 40, 50, 75, 100, 125, 150, 191, 200, 251, 311, 491, 496, 525, 550, 575, 600, 650, 700, 750, 800, 900, 950, 1000, 1050, 1100, 1150, 1200, 1500, 1633, 1639, 1800, 2465, 2500, or 5700 contiguous nucleotides of the nucleic acid sequences of SEQ ID NOs: 1, 14, 15, 21, 55, or 56, as well as all lengths of contiguous nucleotides within such sizes, or have multimeric sequences selected from any of SEQ ID NO: 7-10, 13, 16-20 or from any combinations thereof. The promoter sequence comprises a transcription initiation domain having an RNA polymerase binding site. The promoter is located 5' relative to and operably linked to a polynucleotide encoding a reporter molecule. The reporter molecule can be any reporter gene molecule, for example, reporter gene molecules whose activity can be measured by colorimetric, fluorescent or luminescence signals.
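One way to check a "contiguous nucleotides" criterion computationally is to find the longest run of bases that a candidate promoter shares with a reference sequence. The following is a minimal sketch using a standard longest-common-substring dynamic program; the function names and the toy sequences are hypothetical, and exact matching on a single strand is an assumption made for simplicity.

```python
def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest contiguous stretch present in both sequences."""
    candidate, reference = candidate.upper(), reference.upper()
    best = 0
    # prev[j] holds the length of the shared run ending at the previous
    # candidate base and reference[j - 1].
    prev = [0] * (len(reference) + 1)
    for c in candidate:
        curr = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if c == r:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def has_contiguous_region(candidate: str, reference: str, min_bp: int) -> bool:
    """True if the candidate shares at least min_bp contiguous bases."""
    return longest_shared_run(candidate, reference) >= min_bp


# Toy sequences for illustration only (not sequences from the listing).
candidate = "GGACGTACGTTT"
reference = "TTACGTACGAA"
print(longest_shared_run(candidate, reference))
print(has_contiguous_region(candidate, reference, 8))
```

The quadratic running time is acceptable for promoters of a few thousand bases; longer references would usually call for an indexed search instead.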

In another embodiment, the present invention provides methods of screening chemical compounds that can modulate the stability of FLOWERING LOCUS T (FT) or its homologs. The method involves using a reporter gene construct that comprises a promoter that regulates the expression of the FT:REPORTER fusion polypeptide directly or indirectly, i.e., mediated via a DNA sequence-specific transactivator, such as LEXA:GAL4. A change in expression level and/or activity of the reporter following the compound treatment indicates a change in the stability of FT.

In yet another embodiment, the method of the invention further comprises a validation assay comprising contacting a plant cell with a candidate compound identified through the reporter gene assay, and detecting an altered or a modified trait relative to control plants in one or more phenotypic assays to validate the candidate compound for the ability to confer a desired trait.

In yet another embodiment, the plant cell is contacted with a plurality of candidate compounds. A compound that alters the reporter gene expression or activity can then be identified from the pool of candidates by further testing using the methods described herein.
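The selection step, in which a compound is chosen because it alters reporter activity relative to a control, can be sketched as a simple fold-change comparison. This is an illustrative sketch only: the function name, the compound names, the readings, and the 2-fold and 0.5-fold cutoffs are all assumptions and are not taken from the application, which does not specify a numerical threshold in this section.

```python
from statistics import mean


def classify_compounds(readings, control, up=2.0, down=0.5):
    """Label each compound by its mean reporter signal relative to control.

    readings: dict mapping compound name -> list of replicate signals
    control:  list of replicate signals from control-treated cells
    up/down:  hypothetical fold-change cutoffs for calling a hit
    """
    baseline = mean(control)
    if baseline <= 0:
        raise ValueError("control signal must be positive")
    results = {}
    for name, values in readings.items():
        fold = mean(values) / baseline
        if fold >= up:
            call = "increase"
        elif fold <= down:
            call = "decrease"
        else:
            call = "no change"
        results[name] = (fold, call)
    return results


# Invented luminescence readings for illustration only.
control = [100, 110, 90]
readings = {
    "compound_A": [250, 230, 240],
    "compound_B": [40, 50, 45],
    "compound_C": [105, 95, 100],
}
results = classify_compounds(readings, control)
for name, (fold, call) in results.items():
    print(f"{name}: fold change {fold:.2f} ({call})")
```

In a real screen, replicate variability and a statistical test would normally supplement a fixed cutoff; the fixed thresholds here keep the example short.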

In yet another embodiment, the promoter sequence or promoter control element has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and/or less than 100%, or 100% sequence identity to SEQ ID NOs: 1-4, 7-10, 13-21, 55-56 or a functional part thereof, provided the functional part can be bound by a COL polypeptide and as a result of said binding, regulates transcription of the reporter gene.
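A percentage-identity threshold such as "at least 80%" can be checked once two sequences have been aligned. The sketch below assumes the alignment has already been produced by some external tool and is supplied as two equal-length strings with "-" for gaps; it also assumes one common convention, identical positions divided by total alignment columns. Other conventions exist, and this section does not state which one is intended. The function name and sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over all columns of a pairwise alignment.

    Both inputs must be the same length; '-' marks a gap. A column counts
    as identical only when both sequences carry the same base.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment for illustration only: one mismatch in ten columns.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(identity, identity >= 80.0)
```

Counting gap columns in the denominator makes the measure stricter than conventions that ignore gaps, which matters most for short elements.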

The invention also provides compounds identified in accordance with the methods. The chemical compounds identified through the methods described above can be used to regulate the transcription of genes whose native promoters have the CORE motif SEQ ID NO: 16 or SEQ ID NO: 44, for example, the native FT promoter, which controls FT expression to directly impact flowering, and the native ALTERNATIVE OXIDASE 1A (AOX1a) and PRODUCTION OF ANTHOCYANIN PIGMENT 2 PROTEIN (PAP2) promoters, which control the expression of target genes that have important roles in abiotic stress responses, including a plant's photorespiration and response to sucrose level, respectively. These promoter elements, when bound by active COLs, enable regulated target gene expression. Regulation of such genes by such selected chemical compounds will have a profound impact on COL-regulated signaling pathways, in particular, the photoperiodic control of floral transition and environmental stress tolerance. Thus, the invention provides methods to confer an enhanced response to abiotic stress in plants by applying a selected chemical compound identified through the CORE reporter system.

In another embodiment, the present invention provides methods to deliver a desired trait to a plant at a time when such treatment is effective. Desired traits include accelerated onset of flowering, delayed onset of flowering, enhanced tolerance to biotic or abiotic stress, increased yield, enhanced disease resistance, altered sterility, reduced sensitivity to light, greater early season growth, greater height, greater stem diameter, increased biomass, increased photosynthetic rate, increased resistance to lodging, increased internode length, increased secondary rooting, greater cold tolerance, greater tolerance to water deprivation, greater tolerance to salt, greater tolerance to heat, altered sugar sensing, reduced stomatal conductance, altered C/N sensing, increased low nitrogen tolerance, increased low phosphorus tolerance, increased tolerance to hyperosmotic stress, greater late season growth and vigor, increased number of mainstem nodes, and/or greater canopy coverage. The identification of compounds through the methods as described allows efficient and convenient delivery of the desired traits during a critical stage of plant life cycle.

Brief Description of the Sequence Listing and Drawings

The Sequence Listing provides exemplary polynucleotide and polypeptide sequences. The traits associated with the use of the sequences are included in the Examples.

Incorporation of the Sequence Listing. The copy of the Sequence Listing, being submitted electronically with this patent application, provided under 37 CFR § 1.821-1.825, is a read-only memory computer-readable file in ASCII text format. The Sequence Listing is named "MBI-0093PCT_ST25.txt". The electronic file of the Sequence Listing was created on August 18, 2010, and is 107,388 bytes in size (measured in MS-WINDOWS). The Sequence Listing is herein incorporated by reference in its entirety.

Figure 1A illustrates the promoter analysis strategy leading to the identification of the CORE motif. Progressively shorter fragments of the FLOWERING LOCUS T (FT) promoter region (FT1, FT2, FT3, FT4, FT5 and FT6) were used to identify candidate CO binding sequences CORE1 and CORE2; comparison with the SOC1 promoter enabled the identification of a conserved CORE motif, indicated in boxes. Polynucleotide start and end positions relative to the start codons are indicated in parentheses after the sequence name. SEQ ID NOs of sequences are also found within the parentheses. Fig. 1B shows a conserved sequence, SEQ ID NO: 44, which is complementary to the CORE motif and is present in the PAP2 and AOX1a promoters. The conserved sequences are found within the boxes.

Figures 2A, 2B and 2C show a comparison of synthetic promoter elements that comprise multiple copies of the CORE1 or CORE2 sequences (or mutated versions thereof) and a hybrid promoter sequence comprised of multimeric copies of both the CORE1 and CORE2 motifs (CORE3). Asterisks indicate the positions where the nucleotides differ among the aligned sequences. The broken lines in Figs. 2B and 2C represent the regions where the nucleotide sequence is absent. These multimeric sequences were used to characterize the CORE motif. SEQ ID NOs of sequences are found within the parentheses after the sequence names.

Figure 3 shows direct binding of the CONSTANS (CO) protein with the CORE motif in an electromobility shift assay (EMSA). The EMSA experiment used the 4XCORE2B sequence and an epitope tagged variant of the CO protein. CO protein was bound specifically to the 4XCORE2B sequence (middle lane). The CO protein failed to bind to the 4XCORE2BM1 sequence with a mutated CORE motif (TGTG to TATA) (the last lane). SEQ ID NOs of sequences are found within the parentheses next to the sequence names. This result confirms that the CORE motif is a direct DNA target of CO.
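The relationship between a multimeric element such as 4XCORE2B and its mutated control can be sketched in a few lines. This is only an illustration: the eight-base repeat unit below is invented and is not the CORE2B sequence, the function names are hypothetical, and the only detail taken from the text is the TGTG to TATA substitution described for the mutated motif.

```python
def build_multimer(unit: str, copies: int, spacer: str = "") -> str:
    """Concatenate the given number of copies of a repeat unit."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return spacer.join([unit.upper()] * copies)


def mutate_core(seq: str, core: str = "TGTG", replacement: str = "TATA") -> str:
    """Replace each non-overlapping occurrence of the core, left to right."""
    return seq.upper().replace(core, replacement)


# Hypothetical repeat unit containing one TGTG core (not from the listing).
UNIT = "AATGTGGT"

wild_type = build_multimer(UNIT, 4)
mutant = mutate_core(wild_type)
print(wild_type)
print(mutant)
```

Because the substitution keeps the length unchanged, the wild-type and mutant elements differ only at the core positions, which is the property that makes the mutant useful as a binding-specificity control.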

Figures 4A and 4B schematically represent the methods used to identify chemistry that regulates the COL-mediated activation of the CORE promoter. pCORE refers to a promoter sequence containing one or more copies of the CORE motif. Reporter refers to a gene encoding a protein or enzyme that can be detected using colorimetric, luminescent or fluorescent detection systems (for example, GUS, luciferase or GFP). COL refers to a member of the family of transcription factors that are homologous and phylogenetically related to CONSTANS (CO), SEQ ID NO: 25. LEXA:GAL4 is a translational fusion of the LEXA DNA binding and GAL4 transcriptional activation protein domains. COL:GR represents a fusion of at least one COL polypeptide with the glucocorticoid receptor protein (GR) or the ligand-binding domain of GR, which sequesters the protein complex in the cytosol until the addition of dexamethasone (dex) that drives nuclear translocation. OpLexA is a promoter sequence which is bound by the LEXA protein. Rectangles represent polynucleotides, ovals represent polypeptides, and hexagons represent chemical compounds. Fig. 4A depicts methods that can be employed to identify chemistry that induces expression or activity of the COL proteins leading to an increase in the transcription and translation of a reporter gene and higher reporter activity. Roman numerals i, ii, iii, iv and v represent the serial events that may occur upon treatment with a chemical compound:

i - compound-mediated induction of COL expression or protein activation;

ii - binding to and activation of CORE promoter (pCORE);

iii - transcription/translation of REPORTER or LEXA:GAL4;

iv - binding to and activation of opLexA promoter;

v - transcription/translation of REPORTER.

Fig. 4B illustrates identifying chemistry that inhibits expression or activity of the COL proteins leading to a decrease in the transcription and translation of a reporter gene and lower reporter activity. Roman numerals represent the following:

i - transcription/translation of COL:GR;

ii - nuclear translocation of COL:GR upon the addition of dexamethasone (dex);

iii - binding and activation of the CORE promoter;

iv - transcription/translation of a REPORTER.

The asterisks represent the events where one or multiple compounds inhibit the stability and/or activity of COL proteins.

Figures 5A and 5B schematically represent the methods used to identify chemistry that modulates the stability of FT or its homologs. A change in expression level or activity of the reporter gene after the compound treatment indicates the increase or decrease of the stability of FT or its homologs affected by the compound. Roman numerals in Figure 5A represent:

i - transcription/translation of the fusion protein FT:REPORTER;

ii - stabilization or degradation of the fusion protein upon the treatment of chemical compounds.

Roman numerals in Fig. 5B represent:

i - transcription/translation of LEXA:GAL4 fusion protein;

ii - LEXA:GAL4 binding to and activating opLexA promoter;

iii - transcription/translation of FT:REPORTER fusion protein;

iv - stabilization or degradation of the fusion protein upon the treatment of the chemical compounds.

DETAILED DESCRIPTION

The invention relates to methods for identification of chemicals that can be applied to plants to improve performance, particularly through targeting components of COL signaling pathways. The methods employ promoter sequences that can be recognized by transcription factors of the CONSTANS-like (COL) family of proteins. COL family proteins have been identified in a wide range of plants (Robson et al, 2001; Miller et al, 2008; Chia et al, 2008; Nemoto et al, 2003). Studies of the model dicot Arabidopsis have shown that the CO (CONSTANS) gene has an important role in the photoperiod pathway, which is one of four regulatory pathways controlling the timing of flowering (Martinez-Zapater et al, 1994; Putterill et al, 1995; Mouradov et al, 2002; Simpson and Dean, 2002). In Arabidopsis, CONSTANS (CO) promotes flowering in response to a lengthening photoperiod via a mechanism involving tight regulation at both the DNA and protein level (see Turck et al, 2008). Ectopic, constitutive expression of CO prematurely triggers flowering independent of day length. CO protein has two tandem B-box zinc finger domains at the amino terminus and contains a conserved region at the carboxy-terminus known as the CCT motif. Arabidopsis has 16 additional proteins with either one or two B-boxes at the amino terminus and one CCT motif at the carboxy-terminus. These proteins are named CONSTANS-like or COL proteins (COLs): COL1 to COL16 (Robson et al, 2001). A subset of the COL proteins has been demonstrated to participate in the regulation of the floral transition while other family members may have no apparent role in the process. There are numerous other Arabidopsis proteins in the B-box zinc finger family. These proteins contain one or two B-boxes at their amino-terminus but lack the conserved CCT motif at their carboxy-terminus. Both classes of B-box proteins (with or without CCT motif) regulate flowering, circadian rhythm and light mediated growth and development processes in plants. The function of the conserved B-box and CCT domains has not been fully elucidated; however, the existence of proteins containing only one of these domains suggests that these motifs do not require each other for their function and act independently. In animal systems, the B-box is present alone or as a part of a tripartite motif comprised of a zinc-binding RING finger, one or two B-boxes followed by a coiled-coil domain (RBCC). The RBCC motif is implicated in protein-protein interactions and is present in various transcription factors, ubiquitin ligases, receptor proteins and other structural protein classes. It is unclear if the B-box proteins in plants, with a distinct composition of protein structural motifs, function similarly to the animal proteins. COL proteins, including CO, have been shown to function as transcriptional regulators that control the expression of various genes including the flowering modulator FLOWERING LOCUS T (FT).

In Arabidopsis, the CO protein is a positive regulator of the FT gene. Rice is a facultative short day plant and CO acts as a floral repressor. In rice, under long days (for example, 16 h of light) the CO protein is stabilized and negatively regulates the expression of FT resulting in a repression of flowering (Kojima et al, 2002; Tamaki et al, 2007). Another key transcription factor that negatively regulates flowering time is the repressor FLOWERING LOCUS C (FLC). However, unlike CO, FLC regulates flowering as part of the "autonomous" pathway of flowering control, and in Arabidopsis the protein has a native function which is largely independent of light period duration (Martinez-Zapater et al, 1994). FLC binds to the CArG box within the FT promoter and negatively regulates the expression of the FT gene. In addition to CO and FLC, numerous other transcription factors directly or indirectly affect the expression of FT, but CO is likely to represent the major component in regulating the floral transition in response to photoperiod.

The invention relates to identifying chemical compounds that can be used to confer a desired trait to plants and enhance plant performance with the identified compound. Throughout this disclosure, various information sources are referred to and/or are specifically incorporated. The information sources include scientific journal articles, patent documents, textbooks, and world wide web browser-inactive page addresses. While the reference to these information sources clearly indicates that they can be used by one of skill in the art, each and every one of the information sources cited herein is specifically incorporated in its entirety, whether or not a specific mention of "incorporated by reference" is noted. The contents and teachings of each and every one of the information sources can be relied on and used to make and use embodiments of the invention.

As used herein and in the appended claims of the invention, the singular forms "a", "an", and "the" include the plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to "a host cell" includes a plurality of such host cells, and a reference to "a stress" is a reference to one or more stresses and equivalents thereof known to those skilled in the art, and so forth.

DEFINITIONS

"Nucleic acid molecule" refers to an oligonucleotide, polynucleotide or any fragment thereof. It may be DNA or RNA of genomic or synthetic origin, double-stranded or single-stranded, and combined with carbohydrate, lipids, protein, or other materials to perform a particular activity such as transformation or form a useful composition such as a peptide nucleic acid (PNA).

"Polynucleotide" is a nucleic acid molecule comprising a plurality of polymerized nucleotides, e.g., at least about 15 consecutive polymerized nucleotides. A polynucleotide may be a nucleic acid, oligonucleotide, nucleotide, or any fragment thereof. In many instances, a polynucleotide comprises a nucleotide sequence encoding a polypeptide (or protein) or a domain or fragment thereof. Additionally, the polynucleotide may comprise a promoter, an intron, an enhancer region, a polyadenylation site, a translation initiation site, 5' or 3' untranslated regions, a reporter gene, a selectable marker, or the like. The polynucleotide can be single-stranded or double-stranded DNA or RNA. The polynucleotide optionally comprises modified bases or a modified backbone. The polynucleotide can be, e.g., genomic DNA or RNA, a transcript (such as an mRNA), a cDNA, a PCR product, a cloned DNA, a synthetic DNA or RNA, or the like. The polynucleotide can be combined with carbohydrate, lipids, protein, or other materials to perform a particular activity such as transformation or form a useful composition such as a peptide nucleic acid (PNA). The polynucleotide can comprise a sequence in either sense or antisense orientations. "Oligonucleotide" is substantially equivalent to the terms amplimer, primer, oligomer, element, target, and probe and is preferably single-stranded.

A "recombinant polynucleotide" is a polynucleotide that is not in its native state, e.g., the polynucleotide comprises a nucleotide sequence not found in nature, or the polynucleotide is in a context other than that in which it is naturally found, e.g., separated from nucleotide sequences with which it typically is in proximity in nature, or adjacent (or contiguous with) nucleotide sequences with which it typically is not in proximity. For example, the sequence at issue can be cloned into a vector, or otherwise recombined with one or more additional nucleic acid.

An "isolated polynucleotide" is a polynucleotide, whether naturally occurring or recombinant, that is present outside the cell in which it is typically found in nature, whether purified or not. Optionally, an isolated polynucleotide is subject to one or more enrichment or purification procedures, e.g., cell lysis, extraction, centrifugation, precipitation, or the like.

"Gene" or "gene sequence" refers to the partial or complete coding sequence of a gene, its complement, and its 5' or 3' untranslated regions. A gene is also a functional unit of inheritance, and in physical terms is a particular segment or sequence of nucleotides along a molecule of DNA (or RNA, in the case of RNA viruses) involved in producing a polypeptide chain. The latter may be subjected to subsequent processing such as chemical modification or folding to obtain a functional protein or polypeptide. A gene may be isolated, partially isolated, or found within an organism's genome. By way of example, a transcription factor gene encodes a transcription factor polypeptide, which may be functional or require processing to function as an initiator of transcription.

Operationally, genes may be defined by the cis-trans test, a genetic test that determines whether two mutations occur in the same gene and that may be used to determine the limits of the genetically active unit (Rieger et al. (1976)). A gene generally includes regions preceding ("leaders"; upstream) and following ("trailers"; downstream) the coding region. A gene may also include intervening, non-coding sequences, referred to as "introns", located between individual coding segments, referred to as "exons". Most genes have an associated promoter region, a regulatory sequence 5' of the transcription initiation codon (there are some genes that do not have an identifiable promoter). The function of a gene may also be regulated by enhancers, operators, and other regulatory elements.

A "promoter" or "promoter region" or "promoter sequence" refers to a segment of DNA containing an RNA polymerase binding site, generally found upstream or 5' relative to a coding sequence under the regulatory control of the promoter. The promoter will generally comprise response elements that are recognized by transcription factors. Transcription factors bind to the promoter sequences, recruiting RNA polymerase, which synthesizes RNA from the coding region. Dissimilarities in promoter sequences account for different efficiencies of transcription initiation and hence different relative expression levels of different genes.

A promoter described in the invention can be native or heterologous to a nucleic acid sequence of interest whose transcription is regulated by said promoter. A heterologous promoter as used herein is defined as a promoter which is not the natural promoter of said sequence of interest. In other words, some form of human intervention, e.g. molecular cloning, has been used at any point in time to make the functional combination of a heterologous promoter with a nucleic acid of interest, and it is readily understood in this context that a heterologous promoter can be derived from the same or from a different organism as the sequence of interest.

A "COL polypeptide" or "COL protein" refers to a member of the family of transcriptional regulators with homology and phylogenetic relatedness to the CONSTANS protein, as exemplified in Robson et al, 2001.

A "COL" refers to a polynucleotide that encodes a COL polypeptide.

"CORE promoter" herein refers to a promoter sequence that comprises at least one CORE sequence (SEQ ID NO: 16 or the complement thereof) and can be regulated by at least one COL polypeptide.

"COL regulated signaling pathways" refers to signaling pathways in which one or multiple COLs act as transcriptional regulators.

"Promoter function" includes regulating expression of the coding sequences under a promoter's control by providing a recognition site for RNA polymerase and/or other factors, such as transcription factors, all of which are necessary for the start of transcription at a transcription initiation site. A "promoter function" may also include the extent to which a gene coding sequence is transcribed to the extent determined by a promoter sequence.

"CORE promoter function" herein refers to the regulation of transcription of a transcribable DNA sequence by any of the COL polypeptides through binding to its cognate site in a promoter.

The term "inductive day length" or "inductive photoperiod" refers to a light period of a duration that induces flowering in a plant. For example, so-called "long day plants" are typically induced to flower by day lengths of 16 hours or more duration whereas "short day plants" are typically induced to flower by day lengths of 8-10 hours in duration.

A promoter or promoter region may include variations of promoters found in the present Sequence Listing, which may be derived by ligation to other regulatory sequences, random mutagenesis, controlled mutagenesis, and/or by the addition or duplication of enhancer sequences. Promoters disclosed in the present Sequence Listing and biologically functional equivalents or variations thereof may drive the transcription of operably-linked coding sequences when comprised within an expression vector and introduced into a host plant.

Promoters such as those found in the Sequence Listing (i.e., SEQ ID NOs: 1-4, 7, 8, 9, 10, 13, 14, 15, 18, 20, 21, 55, and 56) may be used to generate similarly functional promoters containing essential promoter elements. Functional promoters may also include a functional part or fragment of any of SEQ ID NO: 1 - 4, 7-10, 13-15, 18, 20, 21, 55 or 56, provided the functional part also includes a CORE promoter function.

A "polypeptide" is an amino acid sequence comprising a plurality of consecutive polymerized amino acid residues, e.g., at least about 15 consecutive polymerized amino acid residues. In some of the instances referred to in this application, a polypeptide comprises a polymerized amino acid residue sequence that is a transcription factor or a domain or portion or fragment thereof. Additionally, the transcription factor may comprise: (i) a localization domain; (ii) an activation domain; (iii) a repression domain; (iv) an oligomerization domain; (v) a DNA-binding domain; or the like. The polypeptide optionally comprises modified amino acid residues, naturally occurring amino acid residues not encoded by a codon, or non-naturally occurring amino acid residues.

"Protein" refers to an amino acid sequence, oligopeptide, peptide, polypeptide or portions thereof whether naturally occurring or synthetic.

A "recombinant polypeptide" is a polypeptide produced by translation of a recombinant polynucleotide. A "synthetic polypeptide" is a polypeptide created by consecutive polymerization of isolated amino acid residues using methods well known in the art. An "isolated polypeptide," whether a naturally occurring or a recombinant polypeptide, is more enriched in (or out of) a cell than the polypeptide in its natural state in a wild-type cell, e.g., more than about 5% enriched, more than about 10% enriched, or more than about 20%, or more than about 50%, or more, enriched, i.e., alternatively denoted: 105%, 110%, 120%, 150% or more, enriched relative to wild type standardized at 100%. Such enrichment is not the result of a natural response of a wild-type plant. Alternatively, or additionally, the isolated polypeptide is separated from other cellular components with which it is typically associated, e.g., by any of the various protein purification methods herein.

"Homology" refers to sequence similarity between a reference sequence and at least a fragment of a newly sequenced clone insert or its encoded amino acid sequence.

"Identity" or "similarity" refers to sequence similarity between two polynucleotide sequences or between two polypeptide sequences, with identity being a more strict comparison. The phrases "percent identity" and "% identity" refer to the percentage of sequence similarity found in a comparison of two or more polynucleotide sequences or two or more polypeptide sequences.

A "comparison window", as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, 1981, by the homology alignment algorithm of Needleman & Wunsch, 1970, by the search for similarity method of Pearson & Lipman, 1988, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al, eds. 1994-1999)).

A preferred example of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al, 1977, and Altschul et al, 1990, respectively. BLAST and BLAST 2.0 are used, with the parameters described herein, to determine percent sequence identity for nucleic acids and proteins described herein. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al, 1977, and Altschul et al, 1990, supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, 1989) alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
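By way of illustration only, the extension of a word hit and its three halting conditions can be sketched as follows. This is a simplified, ungapped, one-directional sketch and not the BLAST implementation itself. The function name extend_right, the default X value of 20, and the short word length used in the usage note are assumptions introduced for the example; M=5 and N=-4 are the nucleotide values stated above.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend an exact word hit to the right, without gaps.

    Returns (best_score, best_length), where best_length counts the seed
    word plus the extension that gave the maximum cumulative score.
    x_drop is a hypothetical value chosen for illustration.
    """
    score = match * word_len          # the seed word matches exactly
    best_score = score
    best_len = word_len
    i = q_start + word_len
    j = s_start + word_len
    # Halting condition 3: the end of either sequence is reached.
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        if score > best_score:
            best_score = score
            best_len = i - q_start + 1
        # Halting condition 1: score fell off by X from its maximum.
        # Halting condition 2: cumulative score went to zero or below.
        if best_score - score >= x_drop or score <= 0:
            break
        i += 1
        j += 1
    return best_score, best_len
```

For example, extending a 4-base seed at position 0 of "ACGTACGTTT" against "ACGTACGTAA" yields a best score of 40 over 8 positions; the two trailing mismatches lower the running score but do not change the recorded maximum.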

The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, 1993). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably less than about 0.01, and most preferably less than about 0.001.

An indication that two nucleic acid sequences are substantially identical is that the two molecules or their complements hybridize to each other under stringent conditions, as described below. Yet another indication that two nucleic acid sequences are substantially identical is that the same primers can be used to amplify the sequence.

"Sequence similarity" refers to the percent similarity in base pair sequence (as determined by any suitable method) between two or more polynucleotide sequences. Two or more sequences can be anywhere from 0-100% similar, or any integer value there between. Identity or similarity can be determined by comparing a position in each sequence that may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same nucleotide base or amino acid, then the molecules are identical at that position. A degree of similarity or identity between polynucleotide sequences is a function of the number of identical, matching or corresponding nucleotides at positions shared by the polynucleotide sequences. A degree of identity of polypeptide sequences is a function of the number of identical amino acids at corresponding positions shared by the polypeptide sequences. A degree of homology or similarity of polypeptide sequences is a function of the number of amino acids at corresponding positions shared by the polypeptide sequences.
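As a non-limiting illustration, the position-by-position comparison described above can be sketched for two sequences that have already been aligned. The function name percent_identity, the use of "-" as the gap character, and the choice to exclude gapped columns from the denominator are assumptions made for this example; other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of compared positions occupied by the same residue.

    Both inputs must already be aligned and therefore of equal length.
    Columns containing a gap in either sequence are skipped (assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared
```

Under these assumptions, the aligned pair "ACGT" and "ACGA" is identical at three of four positions, that is, 75% identical.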

"Complementary" refers to the natural hydrogen bonding by base pairing between purines and pyrimidines. For example, the sequence A-C-G-T (5' -> 3') forms hydrogen bonds with its complements A-C-G-T (5' -> 3') or A-C-G-U (5' -> 3'). Two single-stranded molecules may be considered partially complementary, if only some of the nucleotides bond, or "completely complementary" if all of the nucleotides bond. The degree of complementarity between nucleic acid strands affects the efficiency and strength of hybridization and amplification reactions. "Fully complementary" refers to the case where bonding occurs between every base pair and its complement in a pair of sequences, and the two sequences have the same number of nucleotides.
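The A-C-G-T example can be checked with a short sketch. Because the two strands of a duplex are antiparallel, the complement of a sequence written 5' -> 3' is its reverse complement, and A-C-G-T happens to be its own reverse complement. The function names below are hypothetical and introduced only for illustration; RNA input is handled by treating U as T.

```python
_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' -> 3'."""
    dna = seq.upper().replace("U", "T")   # treat RNA input as DNA
    return "".join(_DNA_COMPLEMENT[base] for base in reversed(dna))


def is_fully_complementary(a: str, b: str) -> bool:
    """True if every base of a pairs with b and both have equal length."""
    b_dna = b.upper().replace("U", "T")
    return len(a) == len(b) and reverse_complement(a) == b_dna
```

Here reverse_complement("ACGT") returns "ACGT", so the sequence pairs both with the DNA strand A-C-G-T and with the RNA strand A-C-G-U, as stated in the definition.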

The terms "paralog" and "ortholog" are defined as evolutionarily related genes that have similar sequences and functions. Orthologs are structurally related genes in different species that are derived by a speciation event. Paralogs are structurally related genes within a single species that are derived by a duplication event.

The term "equivalog" describes members of a set of homologous proteins that are conserved with respect to function since their last common ancestor (Haft et al, 2003). Related proteins are grouped into equivalog families, and otherwise into protein families with other hierarchically defined homology types.

In general, the term "variant" refers to molecules with some differences, generated synthetically or naturally, in their base or amino acid sequences as compared to a reference (native) polynucleotide or polypeptide, respectively. These differences include substitutions, insertions, deletions or any desired combinations of such changes in a native polynucleotide or amino acid sequence.

Promoters that are similar to those listed in the Sequence Listing (SEQ ID NOs: 1-4, 7-10, 13-15, 18, 20, 21, 55, or 56) may be made that have some alterations in the nucleotide sequence and yet retain the function of the listed sequences. One preferred method of alteration of a polynucleotide sequence is to use PCR to modify selected nucleotides or regions of sequences. These methods are well known to those of skill in the art. Sequences can be modified, for example by insertion, deletion, or replacement of template sequences in a PCR-based DNA modification approach. A "promoter variant" or "variant promoter" is a promoter containing changes in which one or more nucleotides of an original promoter are deleted, added, and/or substituted, preferably while substantially maintaining promoter function. For example, one or more base pairs may be deleted from the 5' or 3' end of a promoter to produce a "truncated" promoter. One or more base pairs can also be inserted, deleted, or substituted internally to a promoter. In the case of a promoter fragment, variant promoters can include changes affecting the transcription of a minimal promoter to which it is operably linked. Variant promoters can be produced, for example, by standard DNA mutagenesis techniques or by chemically synthesizing the variant promoter or a portion thereof.

Also within the claimed scope is a variant of a gene promoter listed in the Sequence Listing, that is, one having a sequence that differs from one of the polynucleotide sequences in the Sequence Listing, or a complementary sequence.

With regard to polynucleotide variants of coding sequences that encode polypeptides, differences between presently disclosed polynucleotides and polynucleotide variants are limited so that the nucleotide sequences of the former and the latter are closely similar overall and, in many regions, identical. Due to the degeneracy of the genetic code, differences between the former and latter nucleotide sequences may be silent (i.e., the amino acids encoded by the polynucleotide are the same, and the variant polynucleotide sequence encodes the same amino acid sequence as the presently disclosed polynucleotide). Variant nucleotide sequences of coding sequences may encode different amino acid sequences, in which case such nucleotide differences will result in amino acid substitutions, additions, deletions, insertions, truncations or fusions with respect to the similar disclosed polynucleotide sequences. These variations may result in polynucleotide variants encoding polypeptides that share at least one functional characteristic. The degeneracy of the genetic code also dictates that many different variant polynucleotides can encode identical and/or substantially similar polypeptides in addition to those sequences illustrated in the Sequence Listing.
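The notion of a silent difference can be illustrated with a minimal sketch that translates two coding sequences using the standard genetic code and compares the results. The function names translate and is_silent_variant are hypothetical, and the short sequences in the usage note are invented for the example; they are not sequences from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(coding_seq: str) -> str:
    """Translate a DNA coding sequence; '*' marks a stop codon."""
    seq = coding_seq.upper()
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3)
    )


def is_silent_variant(reference: str, variant: str) -> bool:
    """True if the sequences differ but encode the same amino acids."""
    return (reference.upper() != variant.upper()
            and translate(reference) == translate(variant))
```

For instance, ATGCTTTAA and ATGCTCTAA differ at one nucleotide, yet both translate to M-L-stop because CTT and CTC both encode leucine, so the difference is silent; ATGCCTTAA encodes proline at the second codon and is not.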

The term "plant" includes whole plants, shoot vegetative organs/structures (for example, leaves, stems and tubers), roots, flowers and floral organs/structures (for example, bracts, sepals, petals, stamens, carpels, anthers and ovules), seed (including embryo, endosperm, and seed coat) and fruit (the mature ovary), plant tissue (for example, vascular tissue, ground tissue, and the like) and cells (for example, guard cells, egg cells, and the like), and progeny of same. The class of plants that can be used in the instant method is generally as broad as the class of higher and lower plants amenable to transformation techniques, including angiosperms (monocotyledonous and dicotyledonous plants), gymnosperms, ferns, horsetails, psilophytes, lycophytes, bryophytes, and multicellular algae.

A "transgenic plant" refers to a plant that contains genetic material not found in a wild-type plant of the same species, variety or cultivar. The genetic material may include a transgene, an insertional mutagenesis event (such as by transposon or T-DNA insertional mutagenesis), an activation tagging sequence, a mutated sequence, a homologous recombination event or a sequence modified by chimeraplasty. Typically, the foreign genetic material has been introduced into the plant by human manipulation, but any method can be used as one of skill in the art recognizes.

A transgenic plant may contain a nucleic acid construct (e.g., an expression vector or cassette). The nucleic acid construct typically comprises a polypeptide-encoding sequence operably linked (i.e., under regulatory control of) to an inducible regulatory sequence, such as a promoter, that allows for the controlled expression of polypeptide. The nucleic acid construct can be introduced into a plant by transformation or by breeding after transformation of a parent plant. A plant refers to a whole plant as well as to a plant part, such as seed, fruit, leaf, or root, plant tissue, plant cells or any other plant material, e.g., a plant explant, as well as to progeny thereof, and to in vitro systems that mimic biochemical or cellular components or processes in a cell.

"Wild type" or "wild-type", as used herein, refers to a plant cell, seed, plant component, plant tissue, plant organ or whole plant that has not been genetically modified or treated in an experimental sense. Wild-type cells, seed, components, tissue, organs or whole plants may be used as controls to compare levels of expression and the extent and nature of trait modification with cells, tissue or plants of the same species in which expression of a polypeptide, such as a transcription factor polypeptide, is altered or modified, e.g., in that it has been overexpressed or ectopically expressed.

A "control plant" as used in the present invention refers to a plant cell, seed, plant component, plant tissue, plant organ or whole plant used to compare against a plant that is treated with one or more compounds of interest. A control plant may in some cases be a plant that is treated with a control compound or a carrier solvent or no treatment. In general, a control plant is a plant of the same line or variety as the plant being treated with one or more compounds of interest.

A "trait" refers to a physiological, morphological, biochemical, or physical characteristic of a plant or particular plant material or cell. In some instances, this characteristic is visible to the human eye, such as seed or plant size, or can be measured by biochemical techniques, such as detecting the protein, starch, or oil content of seed or leaves, or by observation of a metabolic or physiological process, e.g., by measuring tolerance to a form of stress, such as water deficit or water deprivation, or particular salt or sugar concentrations, or by the observation of the expression level of a gene or genes, e.g., by employing Northern analysis, RT-PCR, microarray gene expression assays, or reporter gene expression systems, or by agricultural observations such as extent of wilting, turgor, hyperosmotic stress tolerance or in a preferred embodiment, yield. Any technique can be used to measure the amount of, comparative level of, or difference in any selected chemical compound or macromolecule in the transgenic plants, however.

"Trait modification" refers to a detectable difference in a characteristic in a plant ectopically expressing a polynucleotide or polypeptide of the present invention relative to a plant not doing so, such as a wild-type plant. In some cases, the trait modification can be evaluated quantitatively. For example, the trait modification can entail at least about a 2% increase or decrease, or an even greater difference, in an observed trait as compared with a control or wild-type plant. It is known that there can be a natural variation in the modified trait. Therefore, the trait modification observed entails a change of the normal distribution and magnitude of the trait in the plants as compared to control or wild-type plants.

"Ectopic expression" or "altered expression" in reference to a polynucleotide indicates that the pattern of expression in, e.g., a transgenic plant or plant tissue, is different from the expression pattern in a wild-type plant or a reference plant of the same species. The pattern of expression may also be compared with a reference expression pattern in a wild-type plant of the same species. For example, the polynucleotide or polypeptide is expressed in a cell or tissue type other than a cell or tissue type in which the sequence is expressed in the wild-type plant, or by expression at a time other than at the time the sequence is expressed in the wild-type plant, or by a response to different inducible agents, such as hormones or environmental signals, or at different expression levels (either higher or lower) compared with those found in a wild-type plant. The term also refers to altered or modified expression patterns that are produced by lowering the levels of expression to below the detection level or completely abolishing expression. The resulting expression pattern can be transient or stable, constitutive or inducible. In reference to a polypeptide, the term "ectopic expression or altered expression" further may relate to altered or modified activity levels resulting from the interactions of the polypeptides with exogenous or endogenous modulators or from interactions with factors or as a result of the chemical modification of the polypeptides.

The term "overexpression" as used herein refers to a greater expression level of a gene in a plant, plant cell or plant tissue, compared to expression in a wild-type plant, cell or tissue, at any developmental or temporal stage for the gene. Overexpression can occur when, for example, the genes encoding one or more proteins are under the control of a strong promoter (e.g., the cauliflower mosaic virus 35S transcription initiation region). Overexpression may also occur under the control of an inducible promoter such as a CORE promoter. Thus, overexpression may occur throughout a plant or in the presence of particular environmental signals, depending on the promoter used.

Overexpression may take place in plant cells normally lacking expression of polypeptides functionally equivalent or identical to a polypeptide that can confer an improved trait, for example, increased stress tolerance or improved yield. Overexpression may also occur in plant cells where endogenous expression of the present proteins that confer an improved trait, for example, improved stress tolerance, or functionally equivalent molecules, normally occurs, but such normal expression is at a lower level. Overexpression thus results in a greater than normal production, or "overproduction" of the protein that confers the improved trait in the plant, cell or tissue.

The term "transcription regulating region" refers to a DNA regulatory sequence that regulates expression of one or more genes in a plant when a polypeptide having one or more specific binding domains binds to the DNA regulatory sequence. Polypeptides, for example, transcription factors, may possess a conserved domain. Transcription factors may also comprise an amino acid subsequence that forms a transcription activation domain that regulates expression of one or more stress resistance genes in a plant when the transcription factor binds to the regulating region. The term "regulator" herein refers to a polynucleotide or polypeptide sequence that regulates expression of one or more genes.

The phrases "coding sequence," "structural sequence," and "transcribable polynucleotide sequence" refer to a physical structure comprising an orderly arrangement of nucleic acids. The nucleic acids are arranged in a series of nucleic acid triplets that each form a codon. Each codon encodes for a specific amino acid. Thus the coding sequence, structural sequence, and transcribable polynucleotide sequence encode a series of amino acids forming a protein, polypeptide, or peptide sequence. The coding sequence, structural sequence, and transcribable polynucleotide sequence may be contained, without limitation, within a larger nucleic acid molecule, vector, etc. In addition, the orderly arrangement of nucleic acids in these sequences may be depicted, without limitation, in the form of a sequence listing, figure, table, electronic medium, etc.

The term "isolated", indicates that the molecule referenced is not in its native environment, that is, not normally found in the genome of a particular host cell, or a DNA not normally found in the host genome in an identical context, or any two sequences adjacent to each other that are not normally or naturally adjacent to each other.

The term "operably linked" refers to a first polynucleotide molecule, such as a promoter, connected with a second transcribable polynucleotide molecule, such as a gene of interest, where the polynucleotide molecules are so arranged that the first polynucleotide molecule affects the function of the second polynucleotide molecule. The two polynucleotide molecules may be part of a single contiguous polynucleotide molecule and may be adjacent. For example, a promoter is operably linked to a gene of interest if the promoter modulates transcription of the gene of interest in a cell.

"Activation" of a CORE promoter-reporter construct is considered to be achieved when the activity value relative to control, e.g., a sample that is not treated with a candidate compound, is 120%, 130%, 140%, 150%, usually 200%, 250%, 300%, 400%, 500%, or 1000-3000% or more higher.
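One possible reading of this definition can be sketched as a simple calculation. The sketch assumes that the activity value is expressed as a percentage of the untreated control, with the control standardized at 100%, and that 120% is the lowest qualifying value; the wording "or more higher" could also be read as an increase over control, so this interpretation is an assumption. The function names and the example readings are hypothetical.

```python
def relative_activity(treated: float, control: float) -> float:
    """Reporter activity of a treated sample as a percentage of control."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * treated / control


def is_activated(treated: float, control: float,
                 threshold_pct: float = 120.0) -> bool:
    """True if relative activity meets the chosen activation threshold.

    threshold_pct defaults to 120, the lowest value listed in the
    definition; a stricter screen might use 200 or more.
    """
    return relative_activity(treated, control) >= threshold_pct
```

Under this reading, a treated sample with a reporter signal of 300 units against a control signal of 150 units has a relative activity of 200% and would be scored as activated, whereas 110 units against 100 units would not.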

DESCRIPTION OF THE SPECIFIC EMBODIMENTS

The present invention teaches a process for discovering chemistry that can impart an altered or a modified trait to plants through the modulation of COL signaling pathways. The invention may use a new composition of matter comprised of plant cells transformed with a novel DNA construct containing a reporter gene or other transcribable polynucleotide under the control of a CORE promoter. This invention provides methods of screening chemical libraries to identify compounds that regulate genes involved in COL regulated signaling pathways. The screening steps comprise evaluating candidate compounds for the ability to activate or repress a CORE reporter system as indicated by the relative reporter gene activity. The methods of the invention may further comprise validation assays to confirm the activity of the compounds in plants. For example, to validate that a chemical compound can be used to enhance floral induction, the candidate compound is contacted with a plant, and the plant is examined for accelerated flowering relative to a control plant treated with a control compound.

Alternatively, the invention involves using a two-component system to deliver higher levels of reporter gene expression following the activation of the CORE motif. For example, the CORE promoter could drive expression of a translational fusion of the LexA DNA binding domain and the GAL4 transcriptional activation domain. A reporter gene would be expressed from the opLexA promoter such that activation of the CORE promoter would result in LEXA:GAL4-mediated expression of the reporter gene (Fig. 4A). This alternative activation system thus enables a more sensitive detection method to identify the chemical compounds that can affect the COL regulated signaling pathways.

A variant of the screen for useful chemical compounds involves using an inducible system. In this system, a transcriptional fusion of the CORE promoter and the reporter gene would be introduced into the plant in cis with a dexamethasone-inducible transgene system that expresses a translational fusion of GR and COL peptide. The COL:GR fusion protein translocates into the nucleus upon addition of dexamethasone and activates the CORE promoter, resulting in high reporter expression (Fig. 4B). The benefit of this multi-component system is the ease of identifying a candidate line with strong induction characteristics, eliminating lines with silencing or poor expression due to the chromosomal integration site. Furthermore, this approach conveniently enables screening for chemistries that can repress (in the presence of dexamethasone) or activate (in the absence of dexamethasone) COL signaling pathways with a single construct and selected transgenic line.

Additionally, the present invention provides methods of screening chemical compounds that can modulate the stability of FT or its homologs. The method involves using a reporter gene construct that comprises a promoter that regulates the expression of a translational fusion of the REPORTER and FT or one of its homologs directly or indirectly, i.e., mediated via a DNA sequence-specific transactivator, such as LEXA:GAL4. The promoter used in this method can be any promoter, for example, a heterologous promoter or a native promoter, which can be any of the CORE promoters as described above or the Cauliflower Mosaic Virus promoter (CaMV 35S). Test compounds and control compounds are applied to the cells transformed with the translational fusion constructs. Test compounds that change the stability of FT or one of its homologs can be identified based on the altered reporter gene activity levels relative to controls. Hit compounds can be applied to the plants of interest, and further validated for the ability to change the stability of the FT or its homologs using biochemical approaches that are known in the art.

The chemical compounds that are selected and validated in assays using CORE promoters can be used to regulate expression of useful proteins and may be of significant value for a number of reasons, including, but not limited to, the following:

The selected and validated chemical compounds can be used to regulate floral transition timing. The timing of floral transition can have a significant impact on both biomass production and grain yields, with early flowering sometimes being associated with reduced grain production and plant biomass. Conversely, delayed flowering can result in the accumulation of vegetative biomass and an increase in photosynthetic capacity, which can lead to enhanced yield. However, the absence of flowering, or photoperiod dependent flowering, or late flowering can pose barriers in breeding programs. For example, if two parental lines have widely differing flowering times, this can impede the development of hybrid lines, which restricts the ability to breed new out-crossed plant varieties of agricultural interest. By regulating the expression of flowering genes, flowering could be triggered or repressed by application of the chemical compound identified using the method described in this application. This could also allow flowering to be synchronized across a crop and facilitate more efficient harvesting. For example, the identified chemical compounds can be used to trigger flowering in SD (short-day) grasses, which usually do not flower under LD (long day) conditions due to repression of the floral transition by high levels of COLs. In one embodiment of the invention, the reporter system described herein can be used to identify chemical compounds that inhibit COL or its pathway components and can therefore be applied to such SD crops in LD conditions to induce flowering. Conversely, identified and validated compounds that activate COL can be used to enable LD plants to flower under SD conditions. This would be exceptionally valuable for breeding programs which depend on parental plants that flower normally under different photoperiodic conditions since no such compounds are currently available. 
Chemical compounds provided by the invention could also be used to tune the flowering of crop varieties to different latitudes. At present, species such as soybean and cotton are available as a series of maturity groups that are suitable for different latitudes on the basis of their flowering time (which is governed by day-length). A system in which flowering could be chemically controlled would allow a single high-yielding northern maturity group to be grown at any latitude. In southern regions such plants could be grown for longer, thereby increasing yields, before flowering was induced. In more northern areas, the induction of flowering by the compound would be used to ensure that the crop flowers prior to the first winter frosts.

This invention also teaches enhancing the performance of a plant by treating the plant with selected chemical compounds as described above. The chemical compounds identified through the reporter system described herein can be used to modulate the expression of a polypeptide of interest to confer an altered or a modified trait in an efficient and convenient way. It has been reported that heterozygosity for tomato loss-of-function alleles of SINGLE FLOWER TRUSS (SFT, a tomato homolog of the Arabidopsis FT) increases yield by up to 60% (Krieger et al, 2010) relative to the inbred control plants, and the transcription levels of SFT in these SFT heterozygous tomato plants increased by 2-3 fold relative to control plants (Krieger et al, 2010, the entire content of this publication is hereby incorporated by reference). As described above, chemical compounds identified via the methods of the present invention can result in modification of the transcriptional and/or translational level of FT or its homologs, and therefore these compounds could be used to increase yield of higher plants.
Altered or modified traits induced by the chemical compounds selected using the invention can also include, but are not limited to, enhanced tolerance to biotic or abiotic stress, altered or modified floral induction, increased yield, enhanced disease resistance, altered or modified sterility, reduced sensitivity to light, greater early season growth, greater height, greater stem diameter, increased biomass, increased photosynthetic rate, increased resistance to lodging, increased internode length, increased secondary rooting, greater cold tolerance, greater tolerance to water deprivation, greater tolerance to salt, greater tolerance to heat, altered or modified sugar sensing, reduced stomatal conductance, altered or modified C/N sensing, increased low nitrogen tolerance, increased low phosphorus tolerance, increased tolerance to hyperosmotic stress, greater late season growth and vigor, increased number of mainstem nodes, or greater canopy coverage.

Selection of promoters for use in the invention

The present application provides methods that employ promoter sequences that COL proteins can recognize.

Exemplary promoter sequences are provided as SEQ ID NO: 1 - 4, 7-10, 13-15, 18, 20, 21, 55, or 56, and expression vectors comprising these promoters operably linked to a polynucleotide encoding a reporter molecule may be constructed to evaluate chemical compounds that can be used to confer desired traits to a plant. The promoter sequences in the invention also encompass a CORE promoter that comprises a functional part of any of SEQ ID NOs: 1 - 4, 7-10, 13- 21, 55-56, provided that the functional part of the promoter also includes a CORE promoter function. Such a functional part may have about 8, 9, 10, 13, 17, 20, 30, 35, 40, 50, 75, 100, 125, 150, 191, 200, 251, 311, 491, 496, 525, 550, 575, 600, 650, 700, 750, 800, 900, 950, 1000, 1050, 1100, 1150, 1200, 1500, 1633, 1639, 1800, 2465, 2500, or 5700 contiguous nucleotides of the nucleic acid sequences of SEQ ID NOs: 1, 14, 15, 21, 55, or 56, as well as all lengths of contiguous nucleotides within such sizes, or have multimeric copies of sequences selected from any of SEQ ID NO: 7-10, 13, 16-20 or from any combinations thereof, provided that the functional part of the promoter includes a CORE promoter function. These promoters contain the consensus sequence TGTGN(1-3)ATG (SEQ ID NO: 16) (Fig. 1A) or a complement thereof (Fig. 1B).

The invention employs a screening assay using an expression vector construct comprising a CORE promoter operably linked to a reporter gene polynucleotide. To select a CORE promoter, a system that displays changes in amplitude of a quantifiable marker signal was used. For example, co-transfection into plant protoplasts of (1) a reporter construct comprised of the FT promoter sequences (FT1-FT6, SEQ ID NOs 1-6) fused with the GUS reporter gene and (2) a construct encoding the CO protein enabled the direct measurement of CO transcriptional activity. Activation is then achieved when the reporter activity value relative to the control is at least 110%, optionally 150%, 200-500%, or 1000-2000%.

Critical regions required for the CORE promoter function have been identified through this system. For example, in the FT promoter a key activation motif must exist between -146 and -190 (relative to the start codon), with additional elements likely to be present in the region between -190 and -251. The two related motifs, CORE1 and CORE2, were identified in these regions (Fig. 1A). A similar element GATTGTGCATG (SEQ ID NO: 47) is present in the promoter of another downstream target of CO, the SOC1 gene (Fig. 1A). A sequence motif CATN(1-3)CACA (SEQ ID NO: 44), which is complementary to TGTGN(1-3)ATG (SEQ ID NO: 16), is present in AOX1a, PAP2, and TERMINAL FLOWER 1 (TFL1) (a homolog of FT) (Fig. 1B). It has been reported that a 5.7 kb sequence element upstream of the FT translation start site FT0 (SEQ ID NO: 55) is sufficient to mediate day length response in COL (Adrian et al, 2010). Promoters that are similar to those listed in the Sequence Listing: SEQ ID NOs 1 - 4, 7-10, 13-15, 18, 20, 21, 55, or 56 may be made that have some alterations in the nucleotide sequence and yet retain the function of the listed sequences. At the nucleotide level, the promoter sequences will typically share at least about 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and/or less than 100%, or 100% nucleotide sequence identity with any of SEQ ID NOs: 1 - 4, 7-10, 13- 21, 55-56.
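As an illustration only, the consensus motif TGTGN(1-3)ATG and its complementary form CATN(1-3)CACA can be located in a candidate promoter sequence with a simple pattern search. The following Python sketch is not part of the disclosed methods; the function name find_core_motifs and the reporting format are assumptions made for the example, and it reports at most one spacer length per start position.

```python
import re

# Consensus TGTGN(1-3)ATG and its complementary form CATN(1-3)CACA.
# Lookahead groups are used so that overlapping occurrences are all reported.
CORE_FORWARD = re.compile(r"(?=(TGTG[ACGT]{1,3}ATG))")
CORE_COMPLEMENT = re.compile(r"(?=(CAT[ACGT]{1,3}CACA))")


def find_core_motifs(sequence):
    """Return sorted (0-based position, strand, matched text) tuples.

    Strand "+" marks TGTGN(1-3)ATG; strand "-" marks CATN(1-3)CACA.
    Hypothetical helper for illustration only.
    """
    seq = sequence.upper()
    hits = []
    for strand, pattern in (("+", CORE_FORWARD), ("-", CORE_COMPLEMENT)):
        for match in pattern.finditer(seq):
            hits.append((match.start(1), strand, match.group(1)))
    return sorted(hits)


# The SOC1 element quoted in the text contains one forward-strand motif.
print(find_core_motifs("GATTGTGCATG"))
```

Scanning for both patterns on a single strand avoids having to reverse-complement the input before searching.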

To determine the percent identity of two amino acid sequences or of two nucleic acids, the sequences are aligned for optimal comparison purposes. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., percent identity = (number of identical positions / total number of positions (e.g., overlapping positions)) × 100). In one embodiment, the total number of positions is the total number of nucleotides or amino acid residues contained in the entire length of one of the optimally aligned sequences. The percent identity between two sequences can be determined using techniques similar to those described below, with or without allowing gaps.
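The percent identity calculation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the two sequences are already aligned to equal length with "-" marking gaps, and the function name percent_identity and the count_gaps switch are hypothetical choices for the example, not terms used in this application.

```python
def percent_identity(aligned_a, aligned_b, count_gaps=True):
    """Percent identity of two pre-aligned, equal-length sequences.

    Columns where both sequences have a gap are ignored. When count_gaps is
    False, columns with a gap in either sequence are also left out of the
    total, so only overlapping positions are compared.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    total = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        if (a == "-" or b == "-") and not count_gaps:
            continue
        total += 1
        if a == b:
            identical += 1
    if total == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / total


print(percent_identity("TGTGCATG", "TGTGAATG"))  # 7 of 8 positions identical
```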

Percent identity can be determined electronically, e.g., by using the MEGALIGN program (DNASTAR, Inc. Madison, Wis.). The MEGALIGN program can create alignments between two or more sequences according to different methods, for example, the clustal method (see, for example, Higgins and Sharp (1988)). The clustal algorithm groups sequences into clusters by examining the distances between all pairs. The clusters are aligned pairwise and then in groups. Other alignment algorithms or programs may be used, including Accelrys Gene, FASTA, BLAST, or ENTREZ, which may be used to calculate percent similarity. These are available as a part of the GCG sequence analysis package (University of Wisconsin, Madison, WI), and can be used with or without default settings. ENTREZ is available through the National Center for Biotechnology Information. In one embodiment, the percent identity of two sequences can be determined by the GCG program with a gap weight of 1 (see USPN 6,262,333).

The percent identity between two polypeptide sequences can also be determined using Accelrys Gene v2.5 (2006) with default parameters: Pairwise Matrix: GONNET; Align Speed: Slow; Open Gap Penalty: 10.000; Extended Gap Penalty: 0.100; Multiple Matrix: GONNET; Multiple Open Gap Penalty: 10.000; Multiple Extended Gap Penalty: 0.05; Delay Divergent: 30; Gap Separation Distance: 8; End Gap Separation: false; Residue Specific Penalties: false; Hydrophilic Penalties: false; Hydrophilic Residues: G, P, S, N, D, Q, E, K, and R. The default parameters for determining percent identity between two polynucleotide sequences using Accelrys Gene are: Align Speed: Slow; Open Gap Penalty: 10.000; Extended Gap Penalty: 5.000; Multiple Open Gap Penalty: 10.000; Multiple Extended Gap Penalty: 5.000; Delay Divergent: 40; Transition: Weighted.

Software for performing BLAST analyses is publicly available, e.g., through the National Center for Biotechnology Information (see internet website at www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul, 1990; Altschul, 1993). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1989)).

Unless otherwise indicated for comparisons of predicted polynucleotides, "sequence identity" refers to the % sequence identity generated from a tblastx using the NCBI version of the algorithm at the default settings using gapped alignments with the filter "off" (see, for example, internet website at www.ncbi.nlm.nih.gov/).
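The word-hit extension rule described above for nucleotide sequences can be illustrated with a simplified, ungapped Python sketch. It is not the NCBI implementation. The reward M=5 and penalty N=-4 follow the BLASTN defaults quoted in the text, while the drop-off value x_drop=20, the function names, and the choice to extend each direction independently from the seed score are assumptions made for the example.

```python
def _extend(pairs, start_score, match, mismatch, x_drop):
    """Extend away from the seed over (query_base, subject_base) pairs.

    Returns (score gained over start_score, number of positions kept).
    Stops when the score falls x_drop below its maximum, when it reaches
    zero or below, or when either sequence ends.
    """
    score = best = start_score
    kept = 0
    for n, (q, s) in enumerate(pairs, start=1):
        score += match if q == s else mismatch
        if score > best:
            best, kept = score, n
        if best - score >= x_drop or score <= 0:
            break
    return best - start_score, kept


def extend_seed(query, subject, q_pos, s_pos, word_len,
                match=5, mismatch=-4, x_drop=20):
    """Ungapped extension of a word hit in both directions.

    Returns (total score, (query start, query end)) with a half-open range.
    """
    seed_score = sum(
        match if query[q_pos + i] == subject[s_pos + i] else mismatch
        for i in range(word_len)
    )
    right = zip(query[q_pos + word_len:], subject[s_pos + word_len:])
    gain_r, kept_r = _extend(right, seed_score, match, mismatch, x_drop)
    left = zip(reversed(query[:q_pos]), reversed(subject[:s_pos]))
    gain_l, kept_l = _extend(left, seed_score, match, mismatch, x_drop)
    total = seed_score + gain_r + gain_l
    return total, (q_pos - kept_l, q_pos + word_len + kept_r)


query = "AAATGTGCATGCCC"
subject = "GGGTGTGCATGTTT"
print(extend_seed(query, subject, q_pos=3, s_pos=3, word_len=4))
```

In this toy case the four-base seed TGTG grows rightward over the four matching bases that follow it and does not grow leftward, because every flanking base on that side is a mismatch.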

Novel chimeric promoters can be designed or engineered based on the promoters disclosed in the present invention by a number of methods. Many promoters contain cis-elements that activate, enhance or define the strength and/or specificity of the promoter. For example, promoters may contain "TATA" boxes defining the site of transcription initiation and other cis-elements located upstream of the transcription initiation site that modulate transcription levels. For example, a chimeric promoter may be produced by fusing a first promoter fragment containing the activator cis-element from one promoter to a second promoter fragment containing the activator cis-element from another promoter; the resultant chimeric promoter may cause an increase in expression of an operably linked transcribable polynucleotide molecule. Promoters can be constructed such that promoter fragments or elements are operably linked, for example, by placing such a fragment upstream of a minimal promoter. The cis-elements and fragments of the present invention can be used for the construction of such chimeric promoters. Methods for construction of chimeric and variant promoters of the present invention include, but are not limited to, combining control elements of different promoters or duplicating portions or regions of a promoter (see for example, U.S. Pat. Nos. 4,990,607; 5,110,732; and 5,097,025, all of which are herein incorporated by reference). Those of skill in the art are familiar with the standard resource materials that describe specific conditions and procedures for the construction, manipulation, and isolation of macromolecules (e.g., polynucleotide molecules, plasmids, etc.), as well as the generation of recombinant organisms and the screening and isolation of polynucleotide molecules.

A possible promoter variant that is capable of regulating transcription when operably linked to a transcribable polynucleotide may have about 8, 9, 10, 13, 17, 20, 30, 35, 40, 50, 75, 100, 125, 150, 191, 200, 251, 311, 491, 496, 525, 550, 575, 600, 650, 700, 750, 800, 900, 950, 1000, 1050, 1100, 1150, 1200, 1500, 1633, 1639, 1800, 2465, 2500, or 5700 contiguous nucleotides of the nucleic acid sequences of SEQ ID NOs: 1, 14, 15, 21, 55, or 56, as well as all lengths of contiguous nucleotides within such sizes, or have multimeric sequences selected from SEQ ID NO: 7-10, 13, 16-20 or from any combinations thereof.

Related promoters corresponding to the promoters of SEQ ID NO: 1, 14, 15, 21, 55 and 56 can be employed in the invention. Related promoters can be identified, for example, by analyzing the regulatory regions of corresponding orthologous or paralogous genes. Orthologous or paralogous genes can readily be identified by those skilled in the art through sequence comparisons using tools such as BLAST and phylogenetic analysis. Such variants or related promoters can be identified, for example, by inspection of sequences upstream of homologous genes and testing the upstream sequences for CORE promoter function as outlined above. For example, the promoter of the rice ortholog of FT (HD3a; SEQ ID NO: 52) can be employed in the invention. Related promoter sequences can also be identified by isolating a corresponding gene, e.g., by screening a library or PCR, from another plant species and testing the upstream sequence for CORE promoter function.

Synthetic promoters

In some embodiments, the methods of the invention can employ artificial promoters that have CORE promoter function. Such artificial promoter constructs typically comprise at least one CORE motif (SEQ ID NO: 16) and a minimal promoter for activating transcription. In some embodiments, the synthetic promoter has four CORE motifs, such as SEQ ID NO: 18 and 20, or even eight CORE motifs, such as SEQ ID NO: 13.

The minimal promoter for use in synthetic promoters can be from any promoter. The minimal promoter supports basal transcription and typically comprises regulatory elements such as TATAA sequences. Exemplary minimal promoter regions can be from promoters such as the cauliflower mosaic virus (CaMV) 35S transcription initiation region, and other transcription initiation regions from various plant genes known to those of skill.

Synthetic promoters are tested to verify whether they possess CORE promoter function, typically using reporter constructs, supra. The evaluation of promoter activity is typically performed in comparison to a known active promoter, e.g., SEQ ID NO: 1, 55, 56, 14, 15 and 21. Synthetic promoters for use in the invention generally have at least 50%, more often 70%, 80%, 90%, 100% or greater of the reporter activity of the promoter construct that has the known CORE promoter function.

Reporter genes

Reporter genes suitable for use in the invention are known to those of skill in the art.

Reporters include, but are not limited to, fluorescent proteins, such as green or red fluorescent proteins, or variants that produce a fluorescent color; β-glucuronidase (GUS); luciferase; chloramphenicol acetyltransferase; β-galactosidase; and alkaline phosphatase. Transcription of the sequences encoding the reporter gene can be determined using any method known in the art.

In some embodiments, protein activity of the reporter gene is measured, e.g., using a fluorescent reader or other instrumentation appropriate to the reporter system. Products to assist in determination of reporter activity are commercially available.

Samples that are treated with a candidate compound, or pool of candidate compounds, are compared to control samples without the test compound to examine the extent of modulation. Control samples (untreated with activators) are assigned a relative activity value. Activation is then achieved when the reporter activity value relative to the control is 110%, optionally 150%, 200-500%, or 1000-2000%.
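The comparison of treated samples to controls can be sketched as a short calculation. The Python below is an illustrative sketch only: the function names, the use of replicate means, and the default cutoff of 110% (taken from the lowest activation value stated in this description) are assumptions for the example rather than a prescribed protocol.

```python
from statistics import mean


def relative_activity(treated_readings, control_readings):
    """Reporter activity of treated samples as a percentage of the control mean."""
    control_mean = mean(control_readings)
    if control_mean <= 0:
        raise ValueError("control readings must have a positive mean")
    return 100.0 * mean(treated_readings) / control_mean


def is_activated(treated_readings, control_readings, threshold_pct=110.0):
    """True when relative activity reaches the chosen activation cutoff."""
    return relative_activity(treated_readings, control_readings) >= threshold_pct


# Hypothetical reporter readings in arbitrary units.
print(relative_activity([300.0], [100.0]))
print(is_activated([105.0], [100.0]))
```

A stricter screen could pass threshold_pct=150.0 or 200.0, matching the optional higher values listed above.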

In other embodiments endpoints other than reporter activity are assayed. For example, expression levels of the mRNA or protein can be measured to assess the effects of a test compound on reporter activation. In this instance, the amount of transcription of the reporter construct is measured by assessing the level of mRNA that encodes the reporter gene, or alternatively of the protein product. These assays can be performed using any methods known by those of skill in the art to be suitable. For example, mRNA expression can be detected using amplification-based methodologies, northern or dot blots, nuclease protection and the like.

Polypeptide products can be identified using immunoassays.

Introduction of reporter constructs into plants

Reporter constructs can be introduced into the desired plant host by a variety of conventional techniques. For example, the vector can be introduced directly into the plant cell using techniques such as electroporation, microinjection, and biolistic methods, such as particle bombardment. Alternatively, the constructs may be combined with suitable T-DNA flanking regions and introduced into a conventional Agrobacterium tumefaciens host vector. The virulence functions of the Agrobacterium tumefaciens host will direct the insertion of the construct and adjacent marker into the plant cell DNA when the cell is infected by the bacteria.

Microinjection techniques are known in the art and well described in the scientific and patent literature. The introduction of DNA constructs using polyethylene glycol precipitation is described, e.g., in Paszkowski et al. (1984). Electroporation techniques are described in Fromm et al. (1985). Biolistic transformation techniques are described in Klein et al. (1987).

Agrobacterium tumefaciens-mediated transformation techniques, including disarming and use of binary vectors, are well described in the scientific literature. See, for example Horsch et al. (1984), and Fraley et al. (1983).

The host plant cells for screening reporter constructs can be from any plant, including both dicots and monocots. Typically, plant cells are from Nicotiana benthamiana or Arabidopsis thaliana or another plant that is routinely transformed and assayed in the art.

Other plants also can be used in the screening methods taught herein. These include cereals, for example, maize, sorghum, rice, wheat, barley, oats, rye, milo, flax, or grama grass. Other plant genera include, but are not limited to, Cucurbita, Rosa, Vitis, Juglans, Fragaria, Lotus, Medicago, Onobrychis, Trigonella, Vigna, Citrus, Linum, Geranium, Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum, Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana, Cichorium, Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hemerocallis, Nemesia, Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis, Cucumis, Browallia, Glycine, Pisum, Phaseolus, Lolium, Oryza, Avena, Hordeum, Secale, Allium, and Triticum.

Following transformation of the reporter constructs into the plant cell, the transformed cell or plant tissue is selected or screened by conventional techniques. The transformed cell or plant tissue containing the reporter construct can then be regenerated, if desired, by known procedures. Additional methodology for the generation of plants comprising expression constructs for screening chemicals can be found in the art (see, e.g., US Patent No. 5,614,395).

Chemical libraries

The compounds tested as modulators of COL regulated signaling pathways are typically chemical compounds. Essentially any chemical compound of interest can be used as a CORE promoter activator in the assays of the invention. Most often, compounds can be dissolved in aqueous or organic (e.g., DMSO-based) solutions. The assays are designed to screen large chemical libraries and usually include automating the assay steps, which are typically run in parallel (e.g., in microtiter formats on microtiter plates in robotic assays). It will be appreciated that there are many suppliers of chemical compounds, including Sigma-Aldrich (St. Louis, MO), Fluka Chemika-Biochemica Analytika (Buchs, Switzerland) and the like.

In one preferred embodiment, high throughput screening methods involve providing a combinatorial chemical library containing a large number of candidate compounds. Such "combinatorial chemical libraries" are then screened in one or more assays, as described herein, to identify those library members (particular chemical species or subclasses) that activate the CORE promoter. The compounds thus identified serve as conventional "lead compounds" or can themselves be used as potential or actual agents for treating plants.

A combinatorial chemical library is a collection of diverse chemical compounds generated by either chemical synthesis or biological synthesis, by combining a number of chemical "building blocks" such as reagents. For example, a linear combinatorial chemical library such as a polypeptide library is formed by combining a set of chemical building blocks (amino acids) in every possible way for a given compound length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical compounds can be synthesized through such combinatorial mixing of chemical building blocks.

Preparation and screening of combinatorial chemical libraries is well known to those of skill in the art. Such combinatorial chemical libraries include, but are not limited to, small organic molecule libraries (see, e.g., U.S. Patent 5,569,588; thiazolidinones and metathiazanones, U.S. Patent 5,549,974; pyrrolidines, U.S. Patents 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent 5,506,337; and the like). Other chemistries for generating chemical diversity libraries can also be used. Chemical diversity libraries are also commercially available, e.g., from such companies as 3-Dimensional Pharmaceuticals Inc., Albany Molecular Research Inc., Alchemia Pty. Ltd., Argonaut Technologies Inc., ArQule Inc., Biofocus plc, Array Biopharma Inc., Axys Pharmaceutical Inc., Cambridge Combinatorial Ltd., Charybdis Technologies Inc., ChemBridge Corp., CombiChem Inc., ComGenex Inc., Discovery Partners International Inc., Diversa Corp., EnzyMed Inc. Versicor, Gryphon Sciences Inc., Ixsys Inc., Kosan Biosciences Inc., Maxygen Inc., Molecumetics Ltd., Nanoscale Combinatorial Synthesis Inc., Ontogen Corp., Orchid Biocomputer Inc., Oxford Asymmetry Ltd., Oxford Molecular Group plc, Panlabs Inc., Pharmacopeia Inc., Phytera Inc., Proto Gene Inc., Sphere Biosystems Inc., Symyx Technologies Inc., and Systems Integration Drug Discovery Co.

Often, chemical libraries that are screened in the methods of the invention comprise compounds with molecular weights between 150 and 600, an average cLogP value of 3 10 (range 0-9), an average number of H-bonding acceptors ofi.5 (range 0-9), an average number of H-bonding donors of 1 (range 0-4) and an average of 3 rotatable bonds (range 0-9). Such characteristics are typical of agrichemicals known in the art.

Devices for the preparation of combinatorial libraries are commercially available (see, e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY, Symphony, Rainin, Woburn, MA, 433A Applied Biosystems, Foster City, CA, 9050 Plus, Millipore, Bedford, MA). In addition, numerous combinatorial libraries are themselves commercially available (see, e.g., Chembridge, Inc., San Diego, CA; ComGenex, Princeton, N.J., Tripos, Inc., St. Louis, MO, 3D Pharmaceuticals, Exton, PA, Martek Biosciences, Columbia, MD, etc.).

High throughput assays

In the high throughput assays, it is possible to screen up to several thousand different candidate compounds in a single day. For example, each well of a microtiter plate can be used to run a separate assay against a selected candidate compound, or, if concentration or incubation time effects are to be observed, every 5-10 wells can test a single candidate compound. Further, pools of candidate compounds can also be tested where multiple compounds are included in a single test sample. If an activator is then identified, the chemicals included in the pool can be individually tested to identify an active compound.
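The pooled testing strategy described above can be illustrated with a short Python sketch. It is a hypothetical outline, not a laboratory protocol: the assay callable stands in for a real reporter measurement returning relative activity in percent, and the names make_pools and deconvolute, the pool size, and the 110% cutoff are assumptions for the example.

```python
def make_pools(compounds, pool_size):
    """Split the compound list into consecutive pools of at most pool_size."""
    return [list(compounds[i:i + pool_size])
            for i in range(0, len(compounds), pool_size)]


def deconvolute(compounds, pool_size, assay, threshold_pct=110.0):
    """Screen pools first, then retest members of each positive pool singly.

    assay takes a list of compounds and returns relative reporter activity
    in percent of control (hypothetical stand-in for the real measurement).
    """
    hits = []
    for pool in make_pools(compounds, pool_size):
        if assay(pool) < threshold_pct:
            continue  # no activator detected in this pool
        for compound in pool:
            if assay([compound]) >= threshold_pct:
                hits.append(compound)
    return hits


# Simulated assay in which only compound "C7" activates the reporter.
def fake_assay(sample):
    return 300.0 if "C7" in sample else 100.0


library = [f"C{i}" for i in range(1, 11)]
print(deconvolute(library, pool_size=5, assay=fake_assay))
```

With ten compounds in pools of five, this simulated run uses two pooled assays plus five follow-up assays instead of ten individual ones.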

Phenotypic analysis of candidate compounds

The invention includes methods that comprise further evaluating compounds identified using the CORE promoter-reporter assay by means of phenotype assays as a secondary assay. Selected candidates may also be evaluated for the ability to confer an altered or a modified trait to a plant by regulating its target gene expression. For example, to evaluate the ability of the compounds to cause altered or modified floral induction, treated plants are examined to determine whether they produce visible bolts earlier or later than control plants, with statistical significance. Flowering time is typically quantified in Arabidopsis by the number of days prior to the development of a primary bolt of a minimum length (e.g., 1 cm) or the number of rosette leaves at the appearance of a primary bolt of a minimum length. These phenotypic assays can be performed using any number of methods known in the art.

Treatment of plants

Once chemical compounds are identified and further validated in accordance with the methods of the invention, they can be used to treat any plant, for example, vegetable, fruit, and orchard crops, to improve yield, enhance resistance to abiotic stress such as cold and drought, or to regulate flowering. Plants that can be treated include both monocots and dicots and in particular, agriculturally important plant species, including but not limited to, switchgrass, miscanthus, crops such as soybean, wheat, corn, potato, cotton, rice, oilseed rape (including canola), sunflower, alfalfa, sugarcane and turf; or fruits and vegetables, such as banana, blackberry, blueberry, strawberry, and raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits (such as apple, peach, pear, cherry and plum) and vegetable brassicas (such as broccoli, cabbage, cauliflower, Brussels sprouts and kohlrabi).

Other crops, fruits and vegetables whose phenotype may be changed include barley, currant, avocado, citrus fruits such as oranges, lemons, grapefruit and tangerines, artichoke, cherries, nuts such as the walnut and peanut, endive, leek, roots, such as arrowroot, beet, cassava, turnip, radish, yam, sweet potato and beans.

The selected chemicals can be formulated for treating plants in a liquid or a solid form. For example, in liquid formulations, the plants can be treated with a spray, in a drench application, a drip application, or through irrigation. Formulations are prepared using known methodology and may comprise other reagents conventionally employed in formulation of agricultural chemicals, e.g., emulsifying agents, surfactants, etc. Examples of formulations include emulsifiable concentrates, directly sprayable or dilutable solutions, dilute emulsions, wettable powders, soluble powders, dusts, granules or microcapsules. The methods of application, such as spraying, atomising, dusting, wetting, scattering or pouring, are selected in accordance with the desired application. For example, a slow-release formulation can be applied as a soil treatment so that a plant is exposed frequently to an isolated chemical (e.g., turf grass). In other instances, it may be desirable to incorporate a chemical compound selected in accordance with the method of the invention into irrigation water for plants that experience frequent droughts (e.g., cotton).

Novel approach for regulating COL signaling pathway

The CO (CONSTANS) gene has an important role in the photoperiod pathway, which is one of four regulatory pathways controlling the timing of flowering (Martinez-Zapater et al., 1994; Putterill et al., 1995; Mouradov et al., 2002; Simpson et al., 2002). COL genes have been widely identified across the plant kingdom. In two cases (Brassica napus BnCOa1, Robert et al., 1998; and Pharbitis nil PnCO, Liu et al., 2001), COL genes have been shown to complement a constans mutant in Arabidopsis, demonstrating functional equivalence. The Hd1 (Heading date 1) gene of rice (Oryza sativa) is also homologous to CO (Yano et al., 2000). Conservation between short-day (SD) plants (rice and P. nil) and long-day (LD) plants (Arabidopsis and B. napus) suggests that CO is involved in a conserved pathway regulating flowering in response to inductive day length, and CO-like genes are likely to be involved in flowering time control in other cereals (Griffiths et al., 2003). These orthologous CONSTANS proteins are likely to function through binding their cognate CORE motif, as does the Arabidopsis CONSTANS. In addition, two Arabidopsis CONSTANS homologs, COL9 and COL15, demonstrated the capability of activating the AOX1a and PAP2 promoters and synthetic promoters to varying levels (Table 2) using the reporter gene analysis described above, indicating possible roles in tissue-specific, developmentally-specific or environmentally-specific (e.g., stress response) downstream signaling pathways.

The reporter systems described herein offer a novel approach for identifying chemical compounds that can activate or inhibit COL pathway components and that can therefore be applied to induce an altered or a modified trait, such as to promote or repress flowering of crops under different day length conditions.

An alternative embodiment of this invention involves a screen for chemical compounds that can repress gene expression of either COL or a target gene regulated by COL or modulate the stability of a protein in the COL signaling network. A variant of the assay involves an expression vector comprising a promoter operably linked to a reporter gene where the promoter enables sufficiently high transcriptional levels of the reporter gene molecule. Plant cells that are transformed with these constructs or manipulated as such, having a defined reporter gene activity as required for an effective implementation of the invention, are distributed to multi-well plates and are treated with test compounds and control compounds. Hit compounds that repress the reporter gene activity are identified and are subjected to a phenotypical validation assay on plants.

Phenotypical validation assays can be used to confirm the activity of a hit compound.

These assays typically involve morphological or physiological assays on a plant treated with the test compound, for altered or modified onset of flowering or for any trait of interest, for example enhanced tolerance to abiotic stress. Such transgenic plants are contacted with the hit compounds and then subjected to an external stress (e.g. heat, drought, UV). Hit compounds that confer stress tolerance relative to mock-treated control plants are further validated in additional rounds of the experiment and considered lead chemistry for additional development in target crops and under field conditions.

Sequence Variations

It will readily be appreciated by those of skill in the art, that the invention includes any of a variety of polynucleotide sequences provided in the Sequence Listing or capable of encoding polypeptides that function similarly to those provided in the Sequence Listing. Due to the degeneracy of the genetic code, many different polynucleotides can encode identical and/or substantially similar polypeptides in addition to those sequences illustrated in the Sequence Listing. Nucleic acids having a sequence that differs from the sequences shown in the Sequence Listing, or complementary sequences, that encode functionally equivalent peptides (that is, peptides having some degree of equivalent or similar biological activity) but differ in sequence from the sequence shown in the sequence listing due to degeneracy in the genetic code, are also within the scope of the invention.

Altered polynucleotide sequences encoding polypeptides include those sequences with deletions, insertions, or substitutions of different nucleotides, resulting in a polynucleotide encoding a polypeptide with at least one functional characteristic of the instant polypeptides. Included within this definition are polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe of the polynucleotide encoding the instant polypeptides, and improper or unexpected hybridization to allelic variants, with a locus other than the normal chromosomal locus for the polynucleotide sequence encoding the instant polypeptides.

Sequence alterations that do not change the amino acid sequence encoded by the polynucleotide are termed "silent" variations. With the exception of the codons ATG and TGG, encoding methionine and tryptophan, respectively, any of the possible codons for the same amino acid can be substituted by a variety of techniques, for example, site-directed mutagenesis, available in the art. Accordingly, any and all such variations of a sequence selected from the above table are a feature of the invention.

In addition to silent variations, other conservative variations that alter one, or a few amino acids in the encoded polypeptide, can be made without altering the function of the polypeptide. For example, substitutions, deletions and insertions introduced into the sequences provided in the Sequence Listing are also envisioned. Such sequence modifications can be engineered into a sequence by site-directed mutagenesis (for example, Olson et al., Smith et al., Zhao et al., and other articles in Wu (ed.) Meth. Enzymol. (1993) vol. 217, Academic Press) or the other methods known in the art or noted herein. Amino acid substitutions are typically of single residues; insertions usually will be on the order of about from 1 to 10 amino acid residues; and deletions will range about from 1 to 30 residues. In preferred embodiments, deletions or insertions are made in adjacent pairs, for example, a deletion of two residues or insertion of two residues. Substitutions, deletions, insertions or any combination thereof can be combined to arrive at a sequence. The mutations that are made in the polynucleotide encoding the transcription factor should not place the sequence out of reading frame and should not create complementary regions that could produce secondary mRNA structure. Preferably, the polypeptide encoded by the DNA performs the desired function.

Conservative substitutions are those in which at least one residue in the amino acid sequence has been removed and a different residue inserted in its place. Such substitutions generally are made in accordance with Table 1 when it is desired to maintain the activity of the protein. Table 1 shows amino acids which can be substituted for an amino acid in a protein and which are typically regarded as conservative substitutions.

Table 1. Possible conservative amino acid substitutions

The polypeptides provided in the Sequence Listing have a novel activity, such as, for example, regulatory activity. Although all conservative amino acid substitutions (for example, one basic amino acid substituted for another basic amino acid) in a polypeptide will not necessarily result in the polypeptide retaining its activity, it is expected that many of these conservative mutations would result in the polypeptide retaining its activity. Most mutations, conservative or non-conservative, made to a protein but outside of a conserved domain required for function and protein activity will not affect the activity of the protein to any great extent.

The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples that follow represent techniques discovered by the inventors to function well in the practice of the invention. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments that are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention; therefore all matter set forth or shown in the accompanying drawings is to be interpreted as illustrative and not in a limiting sense.

EXAMPLES

Example 1. Identification of CORE promoters

COL proteins have been shown to function as transcriptional regulators that control the expression of various genes including the flowering modulator FLOWERING LOCUS T (FT). In Arabidopsis, CONSTANS is a positive regulator of the FT gene; however, the binding sites of the CO protein within the FT promoter had not been publicly reported. A series of transcriptional activation experiments were performed in Arabidopsis protoplasts using progressively shorter sequences derived from the FT promoter (Fig. 1A and Table 2). Co-transfection of (1) a reporter construct comprised of the FT promoter sequences (FT1-FT6, SEQ ID NOs: 1-6) fused with the GUS reporter gene and (2) an expression construct encoding the CO protein enabled the direct measurement of transcriptional activity conferred by CO in the plant cells. FT1-FT4, which contain one CORE motif between -146 and -190 (relative to the start codon) and a second CORE motif between -190 and -251, conferred normal CORE function as indicated by the reporter activity, while FT5 and FT6 showed greatly reduced activity or a loss of activity.

AOX1a protein plays a role in photorespiration, and PAP2 protein is responsive to sugar levels in the plant cell. Both the AOX1a promoter, SEQ ID NO: 15, and the PAP2 promoter, SEQ ID NO: 14, contain the CORE motif in the complementary strand. Reporter constructs comprised of transcriptional fusions of the promoters and the GUS gene were generated and used in transient transfection assays with plant protoplasts. Co-transfection of the reporter construct and a construct driving constitutive CO expression resulted in the activation of the GUS reporter gene from both the AOX1a and PAP2 promoters, demonstrating that these promoters contain CORE activity. In particular, this result indicates that the CORE motif is active within alternative promoter sequence environments in addition to the FT promoter (Table 2). The values in Table 2 represent relative promoter activity, i.e., fold induction of GUS activity for the indicated combination of protein and promoter sequence relative to the reporter activity caused by a control protein on the same promoter. It has also been reported that a 5.7 kb sequence element upstream of the FT translation start site, FTO (SEQ ID NO: 55), is sufficient to mediate day length response in COL (Adrian et al., 2010).
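The definition of relative promoter activity given above can be sketched as a short calculation. The function name `fold_induction` and the GUS readings below are hypothetical and are not values from Table 2; averaging replicate readings before taking the ratio is also an assumption.

```python
def fold_induction(effector_readings, control_readings):
    """Mean reporter activity with the test protein divided by the mean
    reporter activity with the control protein on the same promoter."""
    control_mean = sum(control_readings) / len(control_readings)
    if control_mean <= 0:
        raise ValueError("control reporter activity must be positive")
    effector_mean = sum(effector_readings) / len(effector_readings)
    return effector_mean / control_mean

# Hypothetical GUS readings (arbitrary units) for one promoter construct.
gus_with_co = [1200.0, 1100.0, 1300.0]       # co-transfected with the CO construct
gus_with_control = [100.0, 120.0, 80.0]      # co-transfected with a control protein

relative_activity = fold_induction(gus_with_co, gus_with_control)
print(f"relative promoter activity: {relative_activity:.1f}-fold")
```

A value near 1 would indicate no induction by the tested protein, while larger values indicate activation of that promoter.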

Table 2. Relative promoter activity

Abbreviations:

NT: not tested

Sequence alignment of the promoters with CORE function revealed the presence of one or more copies of the conserved CORE motif (SEQ ID NO: 16); FT1-FT4, SEQ ID NOs: 1-4, contain the CORE1 motif (SEQ ID NO: 17) between -146 and -190 (relative to the start codon), and the CORE2 motif (SEQ ID NO: 19) between -190 and -251 (Fig. 1A). The AOX1a promoter, SEQ ID NO: 15, the PAP2 promoter, SEQ ID NO: 14, and the TFL1 promoter, SEQ ID NO: 56, contain the complementary sequence of the CORE motif, SEQ ID NO: 45 (Fig. 1B). To analyze the function of the CORE motif, synthetic promoter elements comprised of multiple copies of the CORE1 or CORE2 sites (or mutated versions thereof) and a hybrid promoter sequence comprised of multimeric copies of both the CORE1 and CORE2 motifs (CORE3) upstream of the minimal Cauliflower Mosaic Virus (-46S TATA) promoter (SEQ ID NO: 48) (Figs. 2A, 2B and 2C) were also tested for CORE activity. These promoters were used to generate transcriptional fusions to the GUS reporter gene. The synthetic reporter constructs demonstrated high levels of CO-dependent induction in plant protoplast-based studies, as shown in Table 2. The CORE3 reporter demonstrated higher levels of reporter induction than constructs containing multiple copies of either single motif.

Specific binding assays were also performed to address whether CO is capable of directly binding the CORE motif. The EMSA experiment used the 4XCORE2B sequence and an epitope tagged variant of the CO protein. It was found that CO protein was bound specifically to the 4XCORE2B sequence (Fig. 3). The CO protein failed to bind to the 4XCORE2BM1 sequence with a mutated CORE motif (TGTG to TATA). This result confirmed that the CORE motif is a direct target of CO and therefore likely plays a critical role in the regulation of the FT promoter and flowering.

Both the TGTG and ATG components of the CORE motif are necessary for the CORE function of the promoter. Mutant variants of the synthetic promoters where these individual components are disrupted (CORE2BM1 and CORE2BM2) lost their CORE activity (Table 2).

Example 2. Screening of a chemical library using a screening assay in a high throughput format

CORE promoters such as SEQ ID NOs: 1-4, 7-10, 13-21, or 55-56, fused to a GFP reporter gene, are transformed into plants. 1 μl of each chemical from a library purchased from a commercial source (such as ChemBridge™, Inc., San Diego, CA) is added to a 96-well plate containing in each well 5-10 CORE::GFP or FT:GFP Arabidopsis seeds. The volume of the media in each well is 250 μl and the final concentration of the chemical in each well is 28 μM. The seeds are allowed to germinate and grow in the medium for one week, and the GFP signal is quantified in a 96-well fluorescence plate reader (DTX800, Beckman Coulter, Fullerton, CA). The data are normalized based on negative controls in the same plate that are not treated with the chemical.
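The dilution arithmetic and the plate normalization in this example can be sketched as follows. This is an illustrative sketch only: the stock concentration is not stated in the text and is inferred here by assuming that 250 μl is the total well volume, and the well labels, readings and function names are hypothetical.

```python
def stock_concentration_um(final_um, final_volume_ul, added_volume_ul):
    """C1 = C2 * V2 / V1, treating final_volume_ul as the total well volume
    (an assumption; the added 1 ul is taken to be included in the 250 ul)."""
    return final_um * final_volume_ul / added_volume_ul

def normalize_to_plate_controls(signals, control_wells):
    """Divide each well's GFP signal by the mean signal of the untreated
    negative-control wells on the same plate."""
    control_mean = sum(signals[w] for w in control_wells) / len(control_wells)
    if control_mean <= 0:
        raise ValueError("control signal must be positive")
    return {well: value / control_mean for well, value in signals.items()}

# 1 ul of stock giving 28 uM in a 250 ul well implies a stock of about 7 mM.
stock_um = stock_concentration_um(28.0, 250.0, 1.0)

# Hypothetical raw fluorescence readings; A1 and A2 are untreated controls.
plate = {"A1": 500.0, "A2": 520.0, "B1": 1530.0, "B2": 255.0}
normalized = normalize_to_plate_controls(plate, control_wells=["A1", "A2"])
print(stock_um, normalized)
```

Normalized values above 1 would point to candidate activators of the reporter and values below 1 to candidate repressors; any hit threshold would need to be set from the observed variability of the control wells.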

An alternative screening method involves the germination and growth of the CORE::GFP or the FT:GFP Arabidopsis seedlings in 96-well plates for 5 days prior to the addition of the compound stock solutions. The seedlings are exposed to the compound solution for an additional 1-3 days and the GFP signal is quantified in a 96-well fluorescence plate reader (DTX800, Beckman Coulter).

Example 3. Marker screening assay: RT-PCR of genetic markers

A compound identified in the screen analysis can also be evaluated for its effects on the COL target genes. In this assay, Arabidopsis seedlings are grown on solid media (50% MS/B5, 0.05% MES (pH 5.7), 0.5% sucrose, 0.8% agar) in a growth chamber at 22°C with continuous light (95 μmol/m2/s) for 9 days. The seedlings are then transplanted onto media containing various chemicals (typically at 20 μM) or DMSO controls and returned to identical growth conditions for 6 or 24 h. At the indicated time, the seedlings are removed from the media and immediately frozen in liquid nitrogen. RNA is extracted and cDNA is prepared using standard procedures known in the art. RT-PCR analysis is performed using primers for the indicated genes, such as, for example, FT, PAP2 or AOX1a.

Example 4. Seed preparation

Prior to plating, seeds for all experiments are surface sterilized in the following manner:

(1) 5 minute incubation with mixing in 70% ethanol; (2) 20 minute incubation with mixing in 30% bleach, 0.01% Triton® X-100; (3) five rinses with sterile water. Seeds are resuspended in 0.1% sterile agarose and stratified at 4°C for 2-4 days.

Example 5. Transplant compound treatment procedure

Sterile seeds (50 per plate) are sown on standard Petri dishes containing the following medium: 80% Murashige & Skoog (MS) solution (Murashige & Skoog, 1962), 1% sucrose, 0.05% 2-(N-morpholino)ethanesulfonic acid (MES), and 0.65% Phytagar®. Plates are incubated at 22°C under 24-hour light (95 μE m-2 s-1) in a germination growth chamber.

Typically, on day 9, the seedlings are transferred to 6-well assay plates at a density of 10 seedlings per well. The assay plates contain growth medium spiked with a unique test compound or dimethyl sulfoxide (DMSO; carrier solvent, 0.4% v/v) per well. The plates are re-sealed and returned to the growth chamber (horizontal orientation). After an additional (assay-dependent) number of days in a growth chamber, the seedlings are then subjected to any of the plate-based abiotic or biotic stress tolerance assays detailed below. Alternatively, seeds can be sown directly on plates containing media spiked with the test compound to assess the effect of the compound on seedling growth in any of the germination assays described below.

Example 6. Spray compound treatment procedure

Sterile seeds (50 per plate) are sown on standard Petri dishes containing the following medium: 80% MS solution, 1% sucrose, 0.05% MES, and 0.65% Phytagar. Plates are incubated at 22°C under 24-hour light (95 μE m-2 s-1) in a germination growth chamber. On day 8, the seedlings are transferred to square growth plates containing fresh medium (15-25 seedlings per plate) and arranged such that their primary roots are exposed and aligned in parallel along the surface of the plate. The plates are sealed with venting tape and returned to the growth chamber, oriented for vertical growth. Typically, on day 9, the plates are sprayed with a 0.01% Spreader Sticker surfactant solution containing the test compound or DMSO (carrier solvent, 0.4% v/v) using a Preval® aerosol sprayer (1.5 mL/plate). The plates are re-sealed and returned to the growth chamber (horizontal orientation). After an additional (assay-dependent) number of days in a growth chamber, the seedlings are then subjected to any of the plate-based abiotic or biotic stress resistance assays detailed below. Alternatively, the plants may be treated by spraying on soil either once or multiple times during growth using a formulated solution of the test compound (e.g. 0.01% Spreader Sticker); control plants are mock treated. The plants are then subjected to phenotypic validation analysis by means of morphological, developmental or abiotic/biotic stress resistance assays, such as in the example described below.

Example 7. Phenotypic validation analysis

In these Examples, unless otherwise indicated, morphological and physiological traits are disclosed for plants that are treated with a test compound in comparison to those treated with a control compound or a carrier solvent under identical environmental conditions. Thus, a plant treated with a test compound that is described as large and/or drought tolerant is large and more tolerant to drought with respect to a control plant, the latter including plants treated with a control compound or a carrier solvent or given no treatment. When a plant is said to have a better performance than controls, it generally is larger, has greater yield, and/or shows fewer stress symptoms than control plants. The better performing lines may, for example, have produced less anthocyanin, or be larger, greener, more turgid, or more vigorous when challenged with a particular stress, compared to controls as noted below. Better performance generally implies greater size or yield, or tolerance to a particular biotic or abiotic stress, less sensitivity to ABA, or better recovery from a stress (as in the case of a soil-based drought treatment) than controls. When plants are said to have accelerated or delayed onset of flowering, it implies that the plants switched to reproductive development at an earlier or later time than control plants, which is typically recognized based on the appearance of floral buds or a primary inflorescence stem earlier or later than in control plants. Flowering time is typically quantified in Arabidopsis by the number of days prior to the appearance of visible flower buds, and/or the development of a primary inflorescence (sometimes based on a minimum length, e.g., 1 cm), and/or the number of rosette leaves produced by the primary apical meristem, and/or measured by a modification in the expression or activity of a marker of reproductive development such as LFY, AP1 or homologs thereof.

Morphological analysis

Morphological analysis is performed to determine whether changes in transcription factor levels or compound treatment affect plant growth and development. Arabidopsis seeds are cold-treated (stratified) on plates for 3 days in the dark (in order to increase germination efficiency) prior to transfer to growth cabinets. Initially, plates are incubated at 22°C under a light intensity of approximately 100 microEinsteins for 7 days. Seedlings (treated or untreated as described in Examples 5 or 6) are then transferred onto soil (Sunshine potting mix). Following transfer to soil, trays of seedlings are covered with plastic lids for 2-3 days to maintain humidity while they become established. Plants are grown on soil under fluorescent light at an intensity of 70-95 microEinsteins and a temperature of 18-23°C and are optionally subjected to chemical treatments (or mock) as described in Example 6 above. Light conditions consist of a 24-hour photoperiod unless otherwise stated. In instances where alterations in flowering time are apparent, flowering may be re-examined under 8-hour, 12-hour and 24-hour light to assess whether the phenotype is photoperiod dependent. Under typical 24-hour light growth conditions, the typical generation time (seed to seed) for Arabidopsis is approximately 14 weeks.

Because many aspects of Arabidopsis development are dependent on localized environmental conditions, in all cases plants are evaluated in comparison to controls (i.e. plants that are untreated or treated with a control compound or a solvent carrier and are otherwise identical to the plants treated with the test compounds) in the same flat. Careful examination is made at the following stages: seedling (1 week), rosette (2-3 weeks), flowering (4-7 weeks), and late seed set (8-12 weeks). Seed is also inspected. Plants having no or few seeds are considered partially or totally sterile. Seedling morphology is assessed on selection plates. At all other stages, plants are macroscopically evaluated while growing on soil or another suitable growth medium. All significant differences (including alterations in growth rate, size, leaf and flower morphology, coloration and flowering time) are recorded, but routine measurements are not taken if no differences are apparent. In certain cases, stem sections are stained to reveal lignin distribution. In these instances, hand-sectioned stems are mounted in phloroglucinol saturated 2 M HCl (which stains lignin pink) and viewed immediately under a dissection microscope.

Ten lines are typically examined in subsequent plate based physiology assays. A similar number of compound-treated plants are compared to controls when testing the effects of compound treatments.

Plate Assays.

Different plate-based physiological assays (shown below), representing a variety of abiotic and water-deprivation-stress related conditions, are used as a pre-screen to identify top performing lines (i.e. lines treated with a particular compound), that are generally then tested in subsequent soil based assays. Typically, ten lines are subjected to plate assays, from which the best three lines are selected for subsequent soil based assays.

In addition, a nutrient limitation assay can be used to find compounds that allow more plant growth upon deprivation of nitrogen. Nitrogen is a major nutrient affecting plant growth and development that ultimately impacts yield and stress tolerance. These assays monitor primarily root but also rosette growth on nitrogen deficient media. In all higher plants, inorganic nitrogen is first assimilated into glutamate, glutamine, aspartate and asparagine, the four amino acids used to transport assimilated nitrogen from sources (e.g. leaves) to sinks (e.g. developing seeds). This process may be regulated by light, as well as by C/N metabolic status of the plant. A C/N sensing assay is thus used to look for alterations in the mechanisms plants use to sense internal levels of carbon and nitrogen metabolites which could activate signal transduction cascades that regulate the transcription of N-assimilatory genes. To determine whether these mechanisms are altered or modified, we exploit the observation that control plants grown on media containing high levels of sucrose (3%) without a nitrogen source accumulate high levels of anthocyanins. This sucrose-induced anthocyanin accumulation can be relieved by the addition of either inorganic or organic nitrogen. Glutamine is used as a nitrogen source since it also serves as a compound used to transport N in plants.

Growth assays.

Unless otherwise stated, experiments are typically performed with the Arabidopsis thaliana ecotype Columbia (Col-0), soybean or maize plants.

Growth assays may be conducted with Arabidopsis or other plant species (e.g., soy, maize, etc.) that are treated or untreated with test compounds or control as described in Examples 5 or 6. For example, Arabidopsis seedlings are grown on solid media (50% MS/B5, 0.05% MES (pH 5.7), 0.5% sucrose, 0.8% agar) in a growth chamber at 22°C with continuous light (95 μmol/m2/s) for 9 days. The seedlings are then transplanted onto media containing various chemicals (typically at 20 μM) or DMSO controls and returned to identical growth conditions for 3 additional days. Growth assays may assess tolerance to severe desiccation (a type of water deprivation assay), growth in cold conditions at 8°C, root development (visual assessment of lateral and primary roots, root hairs and overall growth), and phosphate limitation.

For the nitrogen limitation assay, plants are grown in 80% Murashige and Skoog (MS) medium in which the nitrogen source is reduced to 20 mg/L of NH4NO3. Note that 80% MS normally has 1.32 g/L NH4NO3 and 1.52 g/L KNO3.
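The salt amounts quoted for 80% MS can be checked with a short calculation. The sketch below assumes the standard full-strength MS amounts of 1650 mg/L NH4NO3 and 1900 mg/L KNO3 (Murashige & Skoog, 1962); the function and variable names are illustrative. The text does not say whether KNO3 is retained in the nitrogen limitation medium, so only the NH4NO3 reduction is computed.

```python
# Full-strength MS nitrogen salts in mg/L (assumed standard MS composition).
FULL_MS_MG_PER_L = {"NH4NO3": 1650.0, "KNO3": 1900.0}

def scaled_salts(strength):
    """Salt amounts in mg/L for a medium prepared at the given fraction of full MS."""
    return {salt: amount * strength for salt, amount in FULL_MS_MG_PER_L.items()}

standard_80 = scaled_salts(0.8)   # about 1320 mg/L NH4NO3 and 1520 mg/L KNO3

# NH4NO3 level used in the nitrogen limitation assay, from the text.
limited_nh4no3_mg_per_l = 20.0
fraction_remaining = limited_nh4no3_mg_per_l / standard_80["NH4NO3"]

print(standard_80)
print(f"NH4NO3 remaining relative to 80% MS: {fraction_remaining:.2%}")
```

Under these assumptions, the limitation medium supplies roughly 1.5% of the NH4NO3 present in standard 80% MS.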

For phosphate limitation assays, seven-day-old seedlings are germinated on phosphate-free MS medium in which KH2PO4 is replaced by K2SO4.

For chilling growth assays, seeds are germinated and grown for seven days on MS + Vitamins + 1% sucrose at 22°C and are then transferred to chilling conditions at 8°C and evaluated after another 10 days and 17 days.

For severe desiccation (plate-based water deprivation) assays, seedlings are grown for 14 days on MS + Vitamins + 1% sucrose at 22°C. After treatment with test compounds or controls, plates are opened in the sterile hood for 3 h for hardening, and the seedlings are then removed from the media and left to dry for two hours in the hood. After this time the plants are transferred back to plates and incubated at 22°C for recovery. The plants are then evaluated after five days.

For the polyethylene glycol (PEG) hyperosmotic stress tolerance screen, plant seeds are gas sterilized with chlorine gas for 2 h. The seeds are plated on each plate containing 3% PEG, 1/2X MS salts, 1% phytagel, and 10 μg/ml glufosinate-ammonium (BASTA). Two replicate plates per seed line are planted. The plates are placed at 4°C for 3 days to stratify seeds. The plates are held vertically for 11 additional days at temperatures of 22°C (day) and 20°C (night). The photoperiod is 16 h with an average light intensity of about 120 μmol/m2/s. The racks holding the plates are rotated daily within the shelves of the growth chamber carts. At 11 days, root length measurements are made. At 14 days, seedling status is determined, root length is measured, growth stage is recorded, the visual color is assessed, pooled seedling fresh weight is measured, and a whole plate photograph is taken.

Germination assays may also be carried out with NaCl (150 mM, to measure tolerance to salt), sucrose (9.4%, to measure altered or modified sugar sensing), cold (8°C) or heat (32°C). All germination assays are performed in aseptic conditions. Growing the plants under controlled temperature and humidity on sterile medium produces uniform plant material that has not been exposed to additional stresses (such as water stress) which could cause variability in the results obtained. Prior to plating, seeds for all experiments are surface sterilized in the following manner: (1) 5 minute incubation with mixing in 70% ethanol, (2) 20 minute incubation with mixing in 30% bleach, 0.01% Triton X-100, (3) five rinses with sterile water, (4) seeds are re-suspended in 0.1% sterile agarose and stratified at 4°C for 3-4 days. All germination assays follow modifications of the same basic protocol. Sterile seeds may be sown on conditional media that has a basal composition of 80% MS + Vitamins, or media containing test compounds as described in Example 5 above. Plates may be incubated at 22°C under 24-hour light (120-130 μE m-2 s-1) in a growth chamber. Evaluation of germination and seedling vigor may be performed five days after planting.

Chlorophyll content, an indicator of photosynthetic capacity, may be measured with a SPAD meter.

Wilt screen assay. Soybean plants treated with test compounds or DMSO are grown in 5" pots in growth chambers. After the seedlings reach the V1 stage (the V1 stage occurs when the plants have one trifoliolate, and the unifoliolate and first trifoliolate leaves are unrolled), water is withheld and the drought treatment thus started. A drought injury phenotype score is recorded, in increasing severity of effect, as 1 to 4, with 1 designating no obvious effect and 4 indicating a dead plant. Drought scoring is initiated as soon as one plant in one growth chamber has a drought score of 1.5. Scoring continues every day until at least 90% of the wild type plants achieve scores of 3.5 or more. At the end of the experiment the scores for both test compound treated and control soybean seedlings are statistically analyzed using Risk Score and Survival analysis methods (Glantz (2001); Hosmer and Lemeshow (1999)).
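The start and stop rules for drought scoring described above can be written as two small checks. This is a sketch under stated assumptions: the function names and example scores are hypothetical, and a score "of 1.5" is interpreted here as 1.5 or higher.

```python
def scoring_should_start(scores, trigger=1.5):
    """Scoring begins once any plant in the chamber reaches the trigger score
    (interpreted as a score of 1.5 or higher, which is an assumption)."""
    return any(score >= trigger for score in scores)

def scoring_should_stop(wild_type_scores, threshold=3.5, fraction=0.9):
    """Scoring ends once at least `fraction` of the wild-type plants
    have a drought score of `threshold` or more."""
    if not wild_type_scores:
        return False
    reached = sum(1 for score in wild_type_scores if score >= threshold)
    return reached / len(wild_type_scores) >= fraction

# Hypothetical daily scores on the 1 (no obvious effect) to 4 (dead plant) scale.
day_3 = [1.0, 1.0, 1.5, 1.0]
day_9_wild_type = [3.5, 4.0, 3.5, 3.5, 4.0, 3.5, 3.5, 4.0, 3.5, 3.0]

print(scoring_should_start(day_3))            # one plant has reached 1.5
print(scoring_should_stop(day_9_wild_type))   # 9 of 10 wild-type plants at 3.5 or more
```

The daily scores collected between these two points are what would then be passed to the risk score and survival analyses cited in the text.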

Water use efficiency (WUE). WUE is estimated by exploiting the observation that elements can exist in both stable and unstable (radioactive) forms. Most elements of biological interest (including C, H, O, N, and S) have two or more stable isotopes, with the lightest of these being present in much greater abundance than the others. For example, 12 C is more abundant than 13 C in nature ( 12 C = 98.89%, 13 C =1.11%, 14 C = <10-10%). Because 13 C is slightly larger than 12 C, fractionation of CO 2 during photosynthesis occurs at two steps:

1. ¹²CO₂ diffuses through air and into the leaf more easily;

2. ¹²CO₂ is preferred by the enzyme in the first step of photosynthesis, ribulose bisphosphate carboxylase/oxygenase.

WUE has been shown to be negatively correlated with carbon isotope discrimination during photosynthesis in several C3 crop species. Carbon isotope discrimination has also been linked to drought tolerance and yield stability in drought-prone environments and has been successfully used to identify genotypes with better drought tolerance. ¹³C/¹²C content is measured after combustion of plant material, conversion to CO₂, and analysis by mass spectrometry. Compared with a known standard, ¹³C content is altered in such a way as to suggest that treatment with test compounds improves water use efficiency.
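The comparison to a standard described above is conventionally expressed as a delta value in per mil, from which a discrimination value can be derived. The sketch below illustrates that arithmetic only. The reference ratio for the VPDB standard and the example air value of about -8 per mil are assumptions that do not appear in the text, and the function names are hypothetical.

```python
# Assumed 13C/12C ratio of the VPDB reference standard (a commonly cited value,
# not given in the text).
R_VPDB = 0.0112372


def delta13c(r_sample, r_standard=R_VPDB):
    """Return delta-13C in per mil for a measured 13C/12C ratio."""
    return (r_sample / r_standard - 1.0) * 1000.0


def discrimination(delta_air, delta_plant):
    """Return carbon isotope discrimination in per mil.

    Both inputs are delta-13C values in per mil. A lower discrimination is
    expected to accompany a higher WUE, given the negative correlation
    described in the text.
    """
    return (delta_air - delta_plant) / (1.0 + delta_plant / 1000.0)
```

For example, under the assumed air value of -8 per mil, plant material measured at -28 per mil gives a discrimination of roughly 20.6 per mil.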

Another potential indicator of WUE is stomatal conductance, that is, the extent to which stomata are open.

Data interpretation

At the time of evaluation, plants are typically given one of the following qualitative scores:

(++) Substantially enhanced performance compared to controls. The phenotype is very consistent and growth is significantly above the normal levels of variability observed for that assay.

(+) Enhanced performance compared to controls. The response is consistent but is only moderately above the normal levels of variability observed for that assay.

(wt) No detectable difference from wild-type controls.

(-) Impaired performance compared to controls. The response is consistent, but the reduction is only moderately greater than the normal levels of variability observed for that assay.

(- -) Substantially impaired performance compared to controls. The phenotype is consistent and growth is reduced significantly beyond the normal levels of variability observed for that assay.

(n/d) Experiment failed, data not obtained, or assay not performed.

Soil Drought (Clay Pot)

The soil drought assay (typically performed on Arabidopsis in clay pots) is based on that described by Haake et al. (2002).

Procedure. Sterile seeds (50 per plate) are sown on standard Petri dishes containing the following medium: 80% MS solution, 1% sucrose, 0.05% MES, and 0.65% Phytagar. Plates are incubated at 22°C under 24-hour light (95 μE m-2 s-1) in a germination growth chamber. After 7 days of growth the seedlings are transplanted to 3.5 inch diameter clay pots containing 80 g of a 50:50 mix of vermiculite:perlite topped with 80 g of ProMix. Typically, each pot contains 14 evenly spaced seedlings. The pots are maintained in a growth room under 24-hour light conditions (18 - 23°C, and 90 - 100 μE m-2 s-1) and watered for a period of 14 days.

Compounds (or DMSO) are applied as a 0.01% Spreader Sticker solution (or similar formulation) using a Preval aerosol sprayer (ca. 2 mL/pot or 100 g/ha) no more than three times during days 7-13 post-transplant. Water is then withheld and pots are placed on absorbent diaper paper for a period of 8-10 days to apply a drought treatment. At the end of the drought period, pots are re-watered and then scored after 5-6 additional days. The number of surviving plants in each pot is counted, and the survival percentage calculated.

Analysis of results. In a given experiment, 6 or more pots of plants treated with test compounds are typically compared with 6 or more pots of the appropriate control. The mean drought score and mean proportion of plants surviving (survival rate) are calculated for both the transgenic line and the wild-type pots. In each case a p-value is calculated, which indicates the significance of the difference between the two mean values.

Calculation of p-values. For the assays where control and experimental plants are in separate pots, survival is analyzed with a logistic regression to account for the fact that the random variable is a proportion between 0 and 1. The reported p-value is the significance of the experimental proportion contrasted to the control, based upon regressing the logit-transformed data.
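A minimal sketch of this kind of comparison follows, assuming survivors are pooled across the pots of each group, which ignores pot-to-pot variation and is a simplification not stated in the text. With a single two-level predictor (treated versus control), the logistic regression coefficient is the log odds ratio, so a Wald test on it can be computed directly. The 0.5 added to each count is an assumed correction to avoid division by zero, and the function name is hypothetical.

```python
import math


def survival_wald_test(surv_treated, n_treated, surv_control, n_control):
    """Wald test on the log odds ratio of survival, treated versus control.

    Returns (log_odds_ratio, z, two_sided_p).
    """
    # Assumed 0.5 correction so that cells with zero counts do not fail.
    a = surv_treated + 0.5
    b = n_treated - surv_treated + 0.5
    c = surv_control + 0.5
    d = n_control - surv_control + 0.5

    # Difference of logits = coefficient of the treatment indicator.
    log_odds_ratio = math.log(a / b) - math.log(c / d)
    std_err = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    z = log_odds_ratio / std_err

    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return log_odds_ratio, z, p
```

As a usage example with made-up counts, 6 pots of 14 seedlings give 84 plants per group, and `survival_wald_test(60, 84, 30, 84)` returns a positive log odds ratio with a small p-value.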

Drought score, being an ordered factor with no real numeric meaning, is analyzed with a non-parametric test between the experimental and control groups. The p-value is calculated with a Mann-Whitney rank-sum test.
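The rank-sum test can be sketched as follows, using the normal approximation with a tie correction, since drought scores on a 1 to 4 scale will contain many ties. This is an illustrative implementation under those assumptions, not necessarily the software used in the text. No continuity correction is applied, and the function name is hypothetical.

```python
import math


def mann_whitney(x, y):
    """Mann-Whitney rank-sum test with midranks and a tie-corrected variance.

    Returns (U statistic for x, two-sided p-value from the normal approximation).
    """
    n1, n2 = len(x), len(y)
    values = sorted(list(x) + list(y))

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    tie_term = 0.0
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        rank_of[values[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(rank_of[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0

    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all observations identical

    z = (u_x - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p
```

For small groups an exact test is generally preferable to the normal approximation used here.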

Disease Resistance

Resistance to pathogens, such as Sclerotinia sclerotiorum and Botrytis cinerea, can be assessed in plate-based assays. Unless otherwise stated, all experiments are performed with the Arabidopsis thaliana ecotype Columbia (Col-0). Control plants for assays on lines containing direct promoter-fusion constructs are wild-type plants or Col-0 plants transformed with an empty transformation vector (pMEN65).

Prior to plating, seeds for all experiments are surface sterilized in the following manner: (1) 5 minute incubation with mixing in 70% ethanol; (2) 20 minute incubation with mixing in 30% bleach, 0.01% Triton X-100™; (3) five rinses with sterile water. Seeds are resuspended in 0.1% sterile agarose and stratified at 4 °C for 2-4 days.

Sterile seeds are sown on starter plates (15 mm deep) containing 50% MS solution, 1% sucrose, 0.05% MES, and 1% Bacto™-Agar. 40 to 50 seeds are sown on each plate. Seedlings are grown on solid media (50% MS/B5, 0.05% MES (pH 5.7), 0.5% sucrose, 0.8% agar) in a growth chamber at 22°C with continuous light (95 μmol/m2/s) for 9 days. The seedlings are then transplanted onto media containing various chemicals (typically at 20 μM) or DMSO controls and returned to identical growth conditions for 3 additional days. Seedlings are then transferred to assay plates (25 mm deep plates with medium minus sucrose). On day 14, seedlings are inoculated (specific method below). After inoculation, plates are put in a growth chamber under a 12-hour light/12-hour dark schedule. Light intensity is lowered to 70-80 μE m-2 s-1 for the disease assay.

Sclerotinia inoculum preparation. A Sclerotinia liquid culture is started three days prior to plant inoculation by cutting a small agar plug (1/4 sq. inch) from a 14- to 21-day old Sclerotinia plate (on Potato Dextrose Agar; PDA) and placing it into 100 ml of half-strength Potato Dextrose Broth. The culture is allowed to grow in the Potato Dextrose Broth at room temperature under 24-hour light for three days. On the day of seedling inoculation, the hyphal ball is retrieved from the medium, weighed, and ground in a blender with water (50 ml/g tissue). After grinding, the mycelial suspension is filtered through two layers of cheesecloth and the resulting suspension is diluted 1:5 in water. Plants are inoculated by spraying to run-off with the mycelial suspension using a Preval aerosol sprayer.

Botrytis inoculum preparation. Botrytis inoculum is prepared on the day of inoculation. Spores from a 14- to 21-day old plate (on PDA) are resuspended in a solution of 0.05% glucose, 0.03 M KH₂PO₄ to a final concentration of 10⁴ spores/ml. Seedlings are inoculated with a Preval aerosol sprayer, as with Sclerotinia inoculation.
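Reaching the stated final concentration from a counted spore stock is a simple dilution calculation (C1·V1 = C2·V2). The sketch below assumes the stock concentration has already been determined by some counting method, which the text does not describe. The function name and the example stock concentration are hypothetical.

```python
def dilution_volumes(stock_conc, target_conc, final_volume_ml):
    """Return (ml of stock, ml of diluent) needed to reach target_conc.

    Concentrations are in spores/ml. The diluent here would be the
    glucose/phosphate solution named in the text.
    """
    if stock_conc < target_conc:
        raise ValueError("stock is more dilute than the target concentration")
    stock_ml = target_conc * final_volume_ml / stock_conc
    return stock_ml, final_volume_ml - stock_ml
```

For instance, with an assumed stock of 2 x 10⁵ spores/ml, `dilution_volumes(2e5, 1e4, 100)` gives 5 ml of stock made up with 95 ml of diluent.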

Resistance to Erysiphe cichoracearum is assessed in a soil-based assay. Erysiphe cichoracearum is propagated on a pad4 mutant line in the Col-0 background, which is highly susceptible to Erysiphe (Reuber et al. (1998)), or on squash plants, since this particular strain also parasitizes squash. Inocula are maintained by using a small paintbrush to dust conidia from a 2-3 week old culture onto 4-week old plants. For the assay, seedlings are grown on plates for one week under 24-hour light in a germination chamber, then transplanted to soil and grown in a walk-in growth chamber under a 12-hour light/12-hour dark light regimen, 70% humidity. Each line is transplanted to two 13 cm square pots, nine plants per pot. In addition, three control plants are transplanted to each pot, for direct comparison with the test line. Approximately 3.5 weeks after transplanting, plants are inoculated using settling towers, as described by Reuber et al. (1998). Generally, three to four heavily infested leaves are used per pot for the disease assay. Level of fungal growth is evaluated eight to ten days after inoculation.

It is expected that the same methods may be applied to identify other useful and valuable promoter sequences, and the sequences may be derived from a diverse range of species.

All publications and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.

References

Adrian et al. (2010) The Plant Cell (Preview)

Altschul et al. (1997) Nuc. Acids Res. 25: 3389-3402

Altschul et al. (1990) J. Mol. Biol. 215: 403-410

Altschul (1993) J. Mol. Evol. 36: 290-300

Ausubel et al., eds., 1994-1999, Current Protocols in Molecular Biology , John Wiley & Sons, New York

Chia et al. (2008) J. Exp. Bot. 59: 2735-2748.

Edmeades et al. (2000). In Physiology and Modeling Kernel Set in Maize., M.E. Westgate and K.J.

Fraley et al. (1983) Proc. Natl. Acad. Sci. USA 80:4803

Fromm et al. (1985) Proc. Natl. Acad. Sci. USA, 82: 5824-5828

Griffiths et al. (2003) Plant Physiol. 131: 1855-1867

Haake et al. (2002) Plant Physiol. 130: 639-648

Haft et al. (2003) Nucleic Acids Res. 31: 371-373

Henikoff & Henikoff (1992) Proc. Natl. Acad. Sci. USA 89: 10915

Higgins and Sharp (1988) Gene 73: 237-244

Horsch et al. (1984) Science 223: 496-498

Hosmer and Lemeshow (1999) Applied Survival Analysis: Regression Modeling of Time to Event Data. John Wiley & Sons, Inc., Publisher.

Karlin & Altschul (1993) Proc. Nat'l. Acad. Sci. USA 90: 5873-5877

Klein et al. (1987) Nature 327: 70-73

Kojima et al. (2002) Plant Cell Physiol. 43: 1096-1105

Krieger et al. (2010) Nature Genetics 42: 459-463

Liu et al. (2001) Plant Physiol 125: 1821-1830

Martinez-Zapater et al. (1994) In: Meyerowitz EM, Somerville CR (eds) Arabidopsis. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, pp 403-433

Miller et al. (2008). Planta 227: 1377-1388.

Mouradov et al. (2002) Plant Cell 14: S111-S130

Murashige and Skoog (1962) Physiol Plant 15: 473-497

Needleman & Wunsch (1970) J. Mol. Biol. 48:443

Nemoto et al. (2003) Plant J. 36: 82-93

Olson et al., in Wu (ed.) Meth. Enzymol. (1993) vol. 217, Academic Press, Orlando, FL

Paszkowski et al. (1984) EMBO J. 3: 2717-2722

Pearson & Lipman (1988) Proc. Nat'l Acad. Sci. USA 85: 2444

Putterill et al. (1995) Cell 80: 847-857.

Putterill et al. (2004) Bioessays 26: 363-73.

Ratcliffe and Riechmann (2002) Curr. Issue Mol. Biol. 4: 77-91

Reuber et al. (1998) Plant J. 16: 473-485

Rieger et al. (1976) Glossary of Genetics and Cytogenetics: Classical and Molecular, 4th ed., Springer-Verlag, New York

Robert et al. (1998) Plant Mol Biol 37: 763-772

Robson et al. (2001) Plant Journal 28: 619-631

Tiwari et al. (2010) New Phytologist 187: 57-66

Simpson and Dean (2002) Science 296: 285-289

Smith et al., in Wu (ed.) Meth. Enzymol. (1993) vol. 217, Academic Press, Orlando, FL

Smith and Waterman (1981) Adv. Appl. Math. 2: 482

Tamaki et al. (2007) Science 316: 1033 - 1036.

Turck et al. (2008) Annu. Rev. Plant Biol. 59: 573-594

Zhao et al., in Wu (ed.) Meth. Enzymol. (1993) vol. 217, Academic Press, Orlando, FL

The present invention is not limited by the specific embodiments described herein. The invention now being fully described, it will be apparent to one of ordinary skill in the art that many changes and modifications can be made thereto without departing from the spirit or scope of the appended claims. Modifications that become apparent from the foregoing description and accompanying figures fall within the scope of the claims.