Title:
EFFICIENT INDUCTION OF PARTHENOGENESIS IN CROP PLANTS
Document Type and Number:
WIPO Patent Application WO/2024/076507
Kind Code:
A1
Abstract:
Methods for improving parthenogenesis efficiency by DWT1 and BABY BOOM transcription factors in plants are provided. A rice embryo trigger transcription factor BABY BOOM1 can initiate embryogenesis when expressed in the unfertilized egg cell through a process called parthenogenesis (Khanday et al., 2019, Nature 565: 91–95). The parthenogenesis efficiency of BABY BOOM1 by itself is 10-29%. This invention describes methods for achieving a high frequency of parthenogenesis by simultaneous expression of BABY BOOM and DWT1 transcription factors. When BABY BOOM1 and DWT1 are expressed together through egg cell-specific promoters, parthenogenesis efficiencies of up to 90% are achieved. These high parthenogenesis efficiencies are a prerequisite for field applications of synthetic apomixis in crop plants.

Inventors:
KHANDAY IMTIYAZ (US)
REN HUI (US)
SUNDARESAN VENKATESAN (US)
Application Number:
PCT/US2023/034142
Publication Date:
April 11, 2024
Filing Date:
September 29, 2023
Assignee:
UNIV CALIFORNIA (US)
International Classes:
C12N15/82; C12N5/04; C12N15/113; C07K14/415; C12N15/10; C12N15/11
Attorney, Agent or Firm:
HINSCH, Matthew E. et al. (US)
Claims:
WHAT IS CLAIMED IS:

1. A plant comprising: a first expression cassette comprising a first plant egg-specific promoter operably linked to a polynucleotide encoding a Dwarf Tiller 1 (DWT1) polypeptide; and a second expression cassette comprising a second plant egg-specific promoter operably linked to polynucleotide encoding a Babyboom polypeptide, wherein the plant has more efficient parthenogenesis than a control plant lacking at least the first, and optionally the second, expression cassette.

2. The plant of claim 1, wherein the plant is diploid and progeny from the plant resulting from parthenogenesis are haploid.

3. The plant of claim 1, wherein the plant further comprises sufficient mitosis instead of meiosis (MiME) expression cassettes comprising a promoter operably linked to gRNAs to induce a MiME phenotype such that the plant produces clonal seed.

4. The plant of claim 3, wherein the MiMe expression cassettes comprise: an expression cassette comprising a promoter operably linked to a gRNA that targets OSD1 or an ortholog thereof; an expression cassette comprising a promoter operably linked to a gRNA that targets ATREC8 or an ortholog thereof; an expression cassette comprising a promoter operably linked to a gRNA that targets SPO11, or PRD1, or PRD2 or PRD3/PAIR1 or an ortholog thereof.

5. The plant of any one of claims 1-4, wherein the DWT1 polypeptide comprises an amino acid sequence at least 80%, 85%, 90%, 95%, or 98% identical to SEQ ID NO:33.

6. The plant of any one of claims 1-4, wherein the DWT1 polypeptide comprises an amino acid sequence at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 98% identical to SEQ ID NO: 1, 2, 3, 4, 5, 6 or 72.

7. The plant of any one of claims 1-6, wherein the Babyboom polypeptide comprises an amino acid sequence at least 80%, 85%, 90%, 95%, or 98% identical to SEQ ID NO: 10-29.

8. The plant of any one of claims 1-7, wherein the first egg-specific promoter and the second egg-specific promoter are the same.

9. The plant of any one of claims 1-7, wherein the first egg-specific promoter and the second egg-specific promoter are different.

10. The plant of any one of claims 1-7, wherein the first egg-specific promoter and the second egg-specific promoter or both comprise SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:32.

11. The plant of any one of claims 1-7, wherein the plant is a rice plant.

12. A method of making the plant of any one of claims 1-11, the method comprising, introducing the first expression cassette and the second expression cassette into the plant.

13. The method of claim 12, wherein the introducing comprises transformation of the plant with the first or second or both expression cassettes, introducing the first or second or both expression cassettes into the plant with a sexual cross, or introducing one of the first and second expression cassettes into the plant via transformation and introducing one of the first and second expression cassettes into the plant via a sexual cross.

14. A method of generating haploid progeny, the method comprising cultivating a plant of any one of claims 1-2 or 6-11; and collecting haploid seed from the plant.

15. A method of generating clonal progeny, the method comprising growing a plant of any one of claims 3-4, and collecting clonal seed from the plant.

16. A nucleic acid comprising an expression cassette comprising a plant egg-specific promoter operably linked to a polynucleotide encoding a DWT1 polypeptide.

17. The nucleic acid of claim 16, wherein the promoter comprises SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:32.

18. The nucleic acid of claim 16 or 17, wherein the DWT1 polypeptide comprises an amino acid sequence at least 80%, 85%, 90%, 95%, or 98% identical to SEQ ID NO:33.

19. The nucleic acid of claim 16 or 17, wherein the DWT1 polypeptide comprises an amino acid sequence at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 98% identical to SEQ ID NO: 1, 2, 3, 4, 5, 6 or 72.

Description:
EFFICIENT INDUCTION OF PARTHENOGENESIS IN CROP PLANTS

CROSS-REFERENCE TO RELATED PATENT APPLICATIONS

[0001] The present patent application claims benefit of priority to U.S. Provisional Patent Application No. 63/412,666, filed October 3, 2022, which is incorporated by reference for all purposes.

BACKGROUND OF THE INVENTION

[0002] Fusion of haploid gametes (an egg and a sperm) during fertilization initiates embryogenesis in sexually reproducing plants. The molecular basis of embryo initiation after fertilization is still obscure. Previous transcriptome analysis of rice gametes and zygotes identified two transcription factors, OsBBM1 and OsDWT1, belonging to the PLETHORA/BABY BOOM clade of the APETALA2 family and the WUSCHEL-Homeobox (WOX) family, respectively, that are only expressed from the male alleles in the zygote after fertilization (Anderson et al., Developmental Cell 43:349-358.e4 (2017)). It was further shown that OsBBM1 functions to initiate embryogenesis after fertilization in rice plants. In transgenic rice with OsBBM1 expressed under an egg cell-specific promoter, the result is parthenogenesis (embryo development without fertilization) and the production of haploid progeny (Khanday et al., Nature 565: 91-95 (2019)). The parthenogenesis frequencies arising from ectopic expression of OsBBM1 in the egg cell were found to be in the range of 10-29% (Khanday et al., Nature 565: 91-95 (2019)).

BRIEF SUMMARY OF THE INVENTION

[0003] In some embodiments, a plant is provided comprising: a first expression cassette comprising a first plant egg-specific promoter operably linked to a polynucleotide encoding a Dwarf Tiller 1 (DWT1) polypeptide; and a second expression cassette comprising a second plant egg-specific promoter operably linked to polynucleotide encoding a Babyboom polypeptide, wherein the plant has more efficient parthenogenesis than a control plant lacking at least the first, and optionally the second, expression cassette.

[0004] In some embodiments, the plant is diploid and progeny from the plant resulting from parthenogenesis are haploid.

[0005] In some embodiments, the plant further comprises sufficient mitosis instead of meiosis (MiMe) expression cassettes comprising a promoter operably linked to gRNAs to induce a MiMe phenotype such that the plant produces clonal seed. In some embodiments, the MiMe expression cassettes comprise: an expression cassette comprising a promoter operably linked to a gRNA that targets OSD1 or an ortholog thereof; an expression cassette comprising a promoter operably linked to a gRNA that targets ATREC8 or an ortholog thereof; an expression cassette comprising a promoter operably linked to a gRNA that targets SPO11, or PRD1, or PRD2 or PRD3/PAIR1 or an ortholog thereof.

[0006] In some embodiments, the DWT1 polypeptide comprises an amino acid sequence at least 80%, 85%, 90%, 95%, or 98% identical to SEQ ID NO:33. In some embodiments, the DWT1 polypeptide comprises an amino acid sequence at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 98% identical to SEQ ID NO: 1, 2, 3, 4, 5, 6 or 72.

[0007] In some embodiments, the Babyboom polypeptide comprises an amino acid sequence at least 80%, 85%, 90%, 95%, or 98% identical to SEQ ID NO: 10-29.

[0008] In some embodiments, the first egg-specific promoter and the second egg-specific promoter are the same. In some embodiments, the first egg-specific promoter and the second egg-specific promoter are different. In some embodiments, the first egg-specific promoter and the second egg-specific promoter or both comprise SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:32.

[0009] In some embodiments, the plant is a rice plant.

[0010] Also provided is a method of making the plant as described above or elsewhere herein. In some embodiments, the method comprises, introducing the first expression cassette and the second expression cassette into the plant. In some embodiments, the method further comprises selecting plant cells, tissues, or plants with the introduced expression cassettes, and optionally regenerating plants from the selected plant cells, tissues, or plants.

[0011] In some embodiments, the introducing comprises transformation of the plant with the first or second or both expression cassettes, introducing the first or second or both expression cassettes into the plant with a sexual cross, or introducing one of the first and second expression cassettes into the plant via transformation and introducing one of the first and second expression cassettes into the plant via a sexual cross.

[0012] Also provided is a method of generating haploid progeny (or progeny having half the ploidy of the parent plant(s)). In some embodiments, the method comprises cultivating a plant as described above or elsewhere herein (e.g., but not having the MiMe phenotype); and collecting haploid (or progeny having half the ploidy of the parent plant(s)) seed from the plant.

[0013] Also provided is a method of generating clonal progeny. In some embodiments, the method comprises growing a plant as described above or elsewhere herein and having the MiMe phenotype, and collecting clonal seed from the plant.

[0014] Also provided is a nucleic acid comprising an expression cassette comprising a plant egg-specific promoter operably linked to a polynucleotide encoding a DWT1 polypeptide. In some embodiments, the promoter comprises SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:32. In some embodiments, the DWT1 polypeptide comprises an amino acid sequence at least 80%, 85%, 90%, 95%, or 98% identical to SEQ ID NO:33. In some embodiments, the DWT1 polypeptide comprises an amino acid sequence at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 98% identical to SEQ ID NO: 1, 2, 3, 4, 5, 6 or 72.

DEFINITIONS

[0015] An "endogenous" or "native" gene or protein sequence, as used with reference to an organism, refers to a gene or protein sequence that is naturally occurring in the genome of the organism.

[0016] A polynucleotide or polypeptide sequence is "heterologous" to an organism or a second polynucleotide sequence if it originates from a foreign species, or, if from the same species, is modified from its original form. For example, when a promoter is said to be operably linked to a heterologous coding sequence, it means that the coding sequence is derived from one species whereas the promoter sequence is derived from another, different species; or, if both are derived from the same species, the coding sequence is not naturally associated with the promoter (e.g., is a genetically engineered coding sequence, e.g., from a different gene in the same species, or an allele from a different ecotype or variety).

[0017] The term "promoter," as used herein, refers to a polynucleotide sequence capable of driving transcription of a coding sequence in a cell. Thus, promoters can include cis-acting transcriptional control elements and regulatory sequences that are involved in regulating or modulating the timing and/or rate of transcription of a gene. For example, a promoter can be a cis-acting transcriptional control element, including an enhancer, a promoter, a transcription terminator, an origin of replication, a chromosomal integration sequence, 5' and 3' untranslated regions, or an intronic sequence, which are involved in transcriptional regulation. These cis-acting sequences typically interact with proteins or other biomolecules to carry out (turn on/off, regulate, modulate, etc.) gene transcription. A "plant promoter" is a promoter capable of initiating transcription in plant cells. A "constitutive promoter" is one that is capable of initiating transcription in nearly all tissue types, whereas a "tissue-specific promoter" initiates transcription only in one or a few particular tissue types.

[0018] The term "operably linked" refers to a functional linkage between a nucleic acid expression control sequence (such as a promoter, or array of transcription factor binding sites) and a second nucleic acid sequence, wherein the expression control sequence directs transcription of the nucleic acid corresponding to the second sequence.

[0019] The term "plant" includes whole plants, shoot vegetative organs and/or structures (e.g., leaves, stems and tubers), roots, flowers and floral organs (e.g., bracts, sepals, petals, stamens, carpels, anthers), ovules (including egg and central cells), seed (including zygote, embryo, endosperm, and seed coat), fruit (e.g., the mature ovary), seedlings, plant tissue (e.g., vascular tissue, ground tissue, and the like), cells (e.g., guard cells, egg cells, trichomes and the like), and progeny of same. The class of plants that can be used in the method of the invention is generally as broad as the class of higher and lower plants amenable to transformation techniques, including angiosperms (monocotyledonous and dicotyledonous plants), gymnosperms, ferns, and multicellular algae. It includes plants of a variety of ploidy levels, including aneuploid, polyploid, diploid, haploid, and hemizygous.

[0020] A "transgene" is used as the term is understood in the art and refers to a heterologous nucleic acid introduced into a cell by human molecular manipulation of the cell's genome (e.g., by molecular transformation). Thus, a "transgenic plant" is a plant that carries a transgene, i.e., is a genetically-modified plant. The transgenic plant can be the initial plant into which the transgene was introduced as well as progeny thereof whose genomes contain the transgene.

[0021] The phrase "nucleic acid" or "polynucleotide sequence" refers to a single or double-stranded polymer of deoxyribonucleotide or ribonucleotide bases read from the 5' to the 3' end. Nucleic acids may also include modified nucleotides that permit correct read through by a polymerase, and/or formation of double-stranded duplexes, and do not significantly alter expression of a polypeptide encoded by that nucleic acid.

[0022] The phrase "nucleic acid sequence encoding" refers to a nucleic acid which directs the expression of a specific protein or peptide. The nucleic acid sequences include both the DNA strand sequence that is transcribed into RNA and the RNA sequence that is translated into protein. The nucleic acid sequences include both the full-length nucleic acid sequences as well as non-full length sequences derived from the full length sequences. It should be further understood that the sequence includes the degenerate codons of the native sequence or sequences which may be introduced to provide codon preference in a specific host cell.

[0023] The terms "identical" or percent "identity," in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of nucleotides or amino acid residues that are the same, when compared and aligned for maximum correspondence over a comparison window, as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. Two nucleic acid sequences or polypeptides are said to be "identical" if the sequence of nucleotides or amino acid residues, respectively, in the two sequences is the same when aligned for maximum correspondence as described below. When percentage of sequence identity is used in reference to proteins or peptides, it is recognized that residue positions that are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated according to, e.g., the algorithm of Meyers & Miller, Computer Applic. Biol. Sci. 4: 11-17 (1988), e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, California, USA).
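
By way of illustration only, the scoring described in this paragraph can be sketched in Python as follows. The sketch assumes two sequences that have already been aligned to the same length. The residue groupings in CONSERVATIVE_GROUPS, the function name percent_identity, and the partial score of 0.5 used in the usage note are assumptions made for this example; they are not taken from Meyers & Miller or from PC/GENE.

```python
# Hypothetical groups of residues with similar chemical properties.
# This grouping is an assumption for illustration, not that of any program.
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FWY"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("STNQ"),  # polar, uncharged
]


def percent_identity(seq_a, seq_b, conservative_score=0.0):
    """Return the percent identity of two pre-aligned, equal-length sequences.

    Identical residues score 1, non-conservative substitutions score 0, and
    conservative substitutions score `conservative_score` (0 to 1).
    Columns in which both sequences carry a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not 0.0 <= conservative_score <= 1.0:
        raise ValueError("conservative_score must lie between 0 and 1")

    total = 0.0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            total += 1.0
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            total += conservative_score
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * total / compared
```

For example, comparing "ACDE" with "ACDD" gives 75% under strict identity, and 87.5% when the D/E substitution is scored as a half match (conservative_score=0.5).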

[0024] The phrase "substantially identical," used in the context of two nucleic acids or polypeptides, refers to a sequence that has at least 50% sequence identity with a reference sequence (e.g., any of SEQ ID NOs: 1-69). Alternatively, percent identity can be any integer from 50% to 100%. Some embodiments include at least: 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%, compared to a reference sequence using the programs described herein; preferably BLAST using standard parameters, as described below.

[0025] For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.

[0026] A "comparison window", as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and visual inspection.

[0027] Algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1990) J. Mol. Biol. 215: 403-410 and Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (NCBI) web site. The algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word size (W) of 28, an expectation (E) of 10, M=1, N=-2, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a word size (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).
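
The word-hit extension and its three halting conditions can be illustrated with the following minimal Python sketch for nucleotide sequences. It is a simplified, ungapped illustration of the rules stated in this paragraph, not the NCBI implementation. The function names, the assumption that the seed word matches exactly, and the default x_drop value of 20 are all choices made for this example.

```python
def _extend(pairs, start_score, match, mismatch, x_drop):
    """Extend in one direction over an iterable of (query, subject) residues.

    Returns (best_score, best_length), where best_length is the number of
    residue pairs included when the maximum cumulative score was reached.
    """
    score = best = start_score
    best_len = 0
    for n, (q, s) in enumerate(pairs, start=1):
        score += match if q == s else mismatch
        if score > best:
            best, best_len = score, n
        elif best - score >= x_drop or score <= 0:
            # Halt: score fell off by X from its maximum, or reached zero.
            break
    # Falling out of the loop means the end of either sequence was reached.
    return best, best_len


def extend_hit(query, subject, q_pos, s_pos, word_size,
               match=1, mismatch=-2, x_drop=20):
    """Extend an exact word hit in both directions without gaps.

    Assumes query[q_pos:q_pos + word_size] equals the subject word at s_pos.
    Returns (score, q_start, q_end) with q_end exclusive.
    """
    seed_score = word_size * match
    right = zip(query[q_pos + word_size:], subject[s_pos + word_size:])
    score, right_len = _extend(right, seed_score, match, mismatch, x_drop)
    left = zip(reversed(query[:q_pos]), reversed(subject[:s_pos]))
    score, left_len = _extend(left, score, match, mismatch, x_drop)
    return score, q_pos - left_len, q_pos + word_size + right_len
```

In this sketch, M and N correspond to the match and mismatch arguments and X to x_drop. Calling extend_hit("TTACGTACGTAA", "GGACGTACGTCC", 4, 4, 4, x_drop=5) extends the four-letter seed "GTAC" to the eight-letter segment "ACGTACGT" with a cumulative score of 8.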

[0028] The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.01, more preferably less than about 10⁻⁵, and most preferably less than about 10⁻²⁰.
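
As a simple illustration, the three probability thresholds recited in this paragraph can be applied to a smallest sum probability as in the Python sketch below. The function name and the tier labels are hypothetical; the sketch also treats the "about" thresholds as exact cut-offs, which is a simplification.

```python
def similarity_tier(p_n):
    """Classify a smallest sum probability P(N) against the stated thresholds.

    The labels are illustrative only; the thresholds (0.01, 1e-5, 1e-20)
    are those recited in the paragraph above, treated here as exact.
    """
    if not 0.0 <= p_n <= 1.0:
        raise ValueError("P(N) must be a probability between 0 and 1")
    if p_n < 1e-20:
        return "similar (most preferred)"
    if p_n < 1e-5:
        return "similar (more preferred)"
    if p_n < 0.01:
        return "similar"
    return "not considered similar"
```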

[0029] An "expression cassette" refers to a nucleic acid construct that, when introduced into a host cell, results in transcription and/or translation of an RNA or polypeptide, respectively.

BRIEF DESCRIPTION OF THE DRAWINGS

[0030] Figure 1A-B: Schematic drawings of T-DNA vectors used for inducing parthenogenesis in rice. Figure 1A: pEC1.2::OsDWT1 transgene in a binary vector (pCAMBIA2300) for egg cell expression of OsDWT1. Figure 1B: pEC1.2::OsBBM1 in a binary vector (pCAMBIA1300) for egg cell expression of OsBBM1.

[0031] Figure 2A-B: Characterization of parthenogenetic haploids. Figure 2A. Left, a haploid plant from transgenic line #12b from T2 generation. Right, T2 diploid progeny sibling plant from the same line #12b. Haploids are dwarf with narrow leaves and sterile due to meiotic defects. Figure 2B. Left, a haploid panicle showing complete sterility due to meiotic defects. Right, a fertile control diploid panicle from the same line #12b.

[0032] Figure 3A-C: Flow-cytometric DNA histograms for ploidy determination. Figure 3A. Sorted nuclei from leaves of a diploid plant showing a 2n peak at 160. Figure 3B. Parthenogenetic haploid showing a 1n peak at 80. Figure 3C. A mixed sample of haploid and diploid nuclei showing 1n and 2n peaks.

DETAILED DESCRIPTION OF THE INVENTION

[0033] The inventors have discovered that expressing a Dwarf Tiller 1 (DWT1) polypeptide and Babyboom polypeptide in the egg of a plant greatly improves efficiency of parthenogenesis from the resulting plant compared to expression of a Babyboom polypeptide alone. Expression of DWT1 with Babyboom in egg cells (i.e., plant egg cells) allows for agronomically useful levels of parthenogenesis.

[0034] Thus in some embodiments, targeting expression of BABYBOOM and DWT1 to egg cells of a plant will result in production of progeny that have half the number of chromosomes compared to the parent. In addition, in embodiments in which the Meiosis-to-mitosis (MiMe) phenotype has been induced, synthetic apomixis will be induced in the plant at high efficiencies, resulting in clonal seed production.

[0035] Accordingly, the disclosure provides for egg-expression of DWT1 and Babyboom polypeptides in plants and optionally further genetic manipulation to result in the MiMe phenotype.

[0036] DWT1 polypeptides are homeobox transcription factors naturally expressed in plant reproductive structures (see, e.g., Wang W, et al. (2014) PLoS Genet 10(3): e1004154; Fang et al., (2020) New Phytologist 225: 1234-1246; Anderson et al., 2017, Developmental Cell 43:349-358) and are characterized by a highly conserved (>85% identity) portion of 67 amino acids (SEQ ID NO:33). See, for example, the following alignment with the 67 amino acid portion bolded:

[Alignment of DWT1 and DWL protein sequences from soybean (DWT1, DWL1), Amborella (DWT1), tomato (DWT1), maize (DWT1, DWL1, DWL2) and rice (OsDWT1, OsDWL1, OsDWL2); the residue-level alignment text is not legible in this copy.]

[0037] DWT1 protein sequences include the Amborella trichopoda DWT1 protein sequence (SEQ ID NO:72). Even though Amborella is a basal angiosperm, i.e., its lineage diverged from those of monocots and dicots early in the evolution of flowering plants, the corresponding portion of the Amborella DWT1 protein (SEQ ID NO: 73) has 88% identity to rice DWT1 in the highly conserved 67 amino acid domain.

[0038] Moreover, the 67 amino acid sequence has comparatively low conservation (~40%-70% identity) with WUS, WOX2 and other Wuschel-related protein families that are not DWT orthologs. The sequence of the 67 amino acid portion is therefore DWT/DWL-specific.

[0039] Accordingly, DWT1 polypeptides can comprise an amino acid sequence at least 80, 85, 90, 95, 98, 99 or 100% identical to PDPKPRWNPRPEQIRILEAIFNSGMVNPPRDEIPRIRMQLQEYGQVGDANVFYWFQNRKSRSKNKLR (SEQ ID NO:33). In some embodiments, the DWT1 polypeptide is from a species of plant of the genus Abelmoschus, Allium, Apium, Amaranthus, Arachis, Arabidopsis, Asparagus, Atropa, Avena, Benincasa, Beta, Brassica, Cannabis, Capsella, Cica, Cichorium, Citrus, Citrullus, Capsicum, Carthamus, Cocos, Coffea, Cucumis, Cucurbita, Cynasa, Daucus, Diplotaxis, Dioscorea, Elais, Eruca, Foeniculum, Fragaria, Glycine, Gossypium, Helianthus, Heterocallis, Hordeum, Hyoscyamus, Ipomea, Lactuca, Lagenaria, Lepidium, Linum, Lolium, Luffa, Luzula, Lycopersicon, Malus, Manihot, Majorana, Medicago, Momodica, Musa, Nicotiana, Olea, Oryza, Panicum, Pastinaca, Pennisetum, Persea, Petroselinium, Phaseolus, Physalis, Pinus, Pisum, Populus, Pyrus, Prunus, Raphanus, Saccharum, Secale, Senecio, Sesamum, Sinapis, Solanum, Sorghum, Spinacia, Theobroma, Trichosantes, Trigonella, Triticum, Turritis, Valerianelle, Vitis, Vigna, or Zea. Exemplary DWT1 polypeptides can comprise, for example, an amino acid sequence at least 65, 70, 75, 80, 85, 90, 95, 98, 99 or 100% identical to SEQ ID NO: 1, 2, 3, 4, 5, 6 or 72.
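
As a rough illustration of how a candidate protein might be screened against the 67 amino acid portion of SEQ ID NO:33, the Python sketch below slides an ungapped 67-residue window along the candidate and reports the best percent identity. This is a simplification made for the example: it does not allow insertions or deletions and is not a substitute for the alignment programs described in the definitions above. The function names and the default threshold of 80% are assumptions.

```python
# The 67 amino acid portion recited above as SEQ ID NO:33.
DWT1_DOMAIN = (
    "PDPKPRWNPRPEQIRILEAIFNSGMVNPPRDEIPRIRMQLQEYGQVGDANVFYWFQNRKSRSKNKLR"
)


def best_window_identity(protein, reference=DWT1_DOMAIN):
    """Return (percent_identity, offset) of the best ungapped window.

    Every window of len(reference) residues in `protein` is compared
    position by position with `reference`; the first best window wins.
    """
    n = len(reference)
    if len(protein) < n:
        raise ValueError("protein is shorter than the reference portion")
    best_pct, best_offset = -1.0, -1
    for offset in range(len(protein) - n + 1):
        window = protein[offset:offset + n]
        matches = sum(a == b for a, b in zip(window, reference))
        pct = 100.0 * matches / n
        if pct > best_pct:
            best_pct, best_offset = pct, offset
    return best_pct, best_offset


def has_dwt1_like_portion(protein, threshold=80.0):
    """True if some ungapped window is at least `threshold` percent identical."""
    pct, _ = best_window_identity(protein)
    return pct >= threshold
```

A single substitution within the 67-residue portion leaves 66 of 67 positions identical (about 98.5%), so such a sequence would still satisfy the 98% level under this simplified measure.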

[0040] Any naturally- or non-naturally-occurring active BABYBOOM polypeptide from a sexually reproducing plant can be expressed as described herein so long as the polypeptide (and/or RNA encoding the polypeptide) is expressed in egg cells in the plant. BABY BOOM polypeptides contain two conserved AP2 domains. The corresponding transcripts lack a miR172 binding site, thereby distinguishing BABY BOOM polypeptides from many other AP2 domain proteins that contain a miR172 binding site. In some embodiments, the BABYBOOM polypeptide is from a species of plant of the genus Abelmoschus, Allium, Apium, Amaranthus, Arachis, Arabidopsis, Asparagus, Atropa, Avena, Benincasa, Beta, Brassica, Cannabis, Capsella, Cica, Cichorium, Citrus, Citrullus, Capsicum, Carthamus, Cocos, Coffea, Cucumis, Cucurbita, Cynasa, Daucus, Diplotaxis, Dioscorea, Elais, Eruca, Foeniculum, Fragaria, Glycine, Gossypium, Helianthus, Heterocallis, Hordeum, Hyoscyamus, Ipomea, Lactuca, Lagenaria, Lepidium, Linum, Lolium, Luffa, Luzula, Lycopersicon, Malus, Manihot, Majorana, Medicago, Momodica, Musa, Nicotiana, Olea, Oryza, Panicum, Pastinaca, Pennisetum, Persea, Petroselinium, Phaseolus, Physalis, Pinus, Pisum, Populus, Pyrus, Prunus, Raphanus, Saccharum, Secale, Senecio, Sesamum, Sinapis, Solanum, Sorghum, Spinacia, Theobroma, Trichosantes, Trigonella, Triticum, Turritis, Valerianelle, Vitis, Vigna, or Zea. In some embodiments the BABYBOOM polypeptide is identical or substantially identical to any of SEQ ID NOs: 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, or 29. See, also, Chahal, et al., Front. Plant Sci., 14 July 2022.

[0041] As noted above, both the DWT1 polypeptide and the Babyboom polypeptide will be expressed in the egg cell of the plant. In some embodiments, the plant comprises (i) a heterologous expression cassette comprising a promoter that at least directs expression to egg cells operably linked to a DWT1 polypeptide as described herein and (ii) a heterologous expression cassette comprising a promoter that at least directs expression to egg cells operably linked to a BABYBOOM polypeptide as described herein. In some embodiments, the promoter is egg cell-specific, meaning the promoter drives expression only or primarily in egg cells. "Primarily" means that if there is expression in other tissue the levels are no more than 1/10 of the expression levels in egg cells as measured by quantitative RT-PCR.
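
The "primarily" criterion stated above amounts to a simple ratio test, which can be sketched in Python as follows. The function name, the dictionary of tissue names, and the assumption that expression levels have already been normalized to a common scale (for example, relative quantities from quantitative RT-PCR) are illustrative only.

```python
def is_primarily_egg_cell(egg_level, other_levels, max_ratio=0.1):
    """Apply the 1/10 criterion for "primarily" egg-cell expression.

    egg_level: normalized expression level measured in egg cells.
    other_levels: mapping of tissue name -> normalized expression level.
    Returns True when no other tissue exceeds max_ratio * egg_level.
    """
    if egg_level <= 0:
        return False  # no detectable egg-cell expression to compare against
    limit = max_ratio * egg_level
    return all(level <= limit for level in other_levels.values())
```

For instance, with an egg-cell level of 100 and levels of 5 in leaf and 9 in root, the criterion is met; a leaf level of 20 would fail it.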

[0042] Exemplary promoters that drive expression in at least egg cells of a plant include, but are not limited to, the promoter of the egg cell-specific gene EC1.1 (e.g., SEQ ID NO:32), EC1.2, EC1.3, EC1.4, or EC1.5. See, e.g., Sprunck et al., Science 338: 1093-1097 (2012); AT2G21740; Steffen et al., Plant Journal 51: 281-292 (2007). In some embodiments, the rice-specific promoter comprises SEQ ID NO:31, i.e., the rice egg cell-specific promoter sequence from the LOC_Os03g18530 OsECA1 gene. In some embodiments, the Arabidopsis DD45 promoter is used to express in rice egg cell (Ohnishi et al., Plant Physiology 165: 1533-1543 (2014)). An exemplary DD45 promoter sequence can comprise, for example, SEQ ID NO:30. Other promoters that can be used for egg cell expression include promoters of the egg cell-specific ECS1 (SEQ ID NO:69) and ECS2 (SEQ ID NO:70) genes (Yu et al. 2021, Nature 592:433-437) and the RWD2 gene (Koszegi et al. 2011, The Plant Journal 67:280-291).

[0043] Other promoters that are expressed in egg cells, but are not necessarily egg cell-specific, are described in, e.g., Anderson et al., The Plant Journal 76: 729-741 (2013). In some embodiments, the expression cassette further comprises a transcriptional terminator. Exemplary terminators can include, but are not limited to, the rbcS E9 or nos terminators. In some embodiments, the expression cassette will include an egg cell enhancer. Exemplary egg cell enhancers include, but are not limited to, the EC1.2 enhancer or EASE enhancer (Yang et al., Plant Physiol. 139: 1421-32 (2005)). In some embodiments, a different egg-specific promoter is operably linked to the coding sequence for DWT1 and Babyboom to avoid possible recombination events.

[0044] In other embodiments, mutations can be introduced into one or both of the native BABYBOOM promoter and the native DWT1 promoter such that BABYBOOM and/or DWT1 is expressed in egg cells from the modified native promoter. In such embodiments, one or more nucleotides of the BABYBOOM and/or DWT1 promoter are modified by non-natural substitution, deletion or insertion.

[0045] Manipulation of the native promoter can be achieved via site-directed or random mutagenesis. Methods for introducing genetic mutations into plant genes and selecting plants with desired traits are well known and can be used to introduce mutations into the BABYBOOM and/or DWT1 promoter to cause the promoter to drive expression in plant egg cells. For instance, seeds or other plant material can be treated with a mutagenic insertional polynucleotide (e.g., transposon, T-DNA, etc.) or chemical substance, according to standard techniques. Such chemical substances include, but are not limited to, the following: diethyl sulfate, ethylene imine, ethyl methanesulfonate and N-nitroso-N-ethylurea. Alternatively, ionizing radiation from sources such as X-rays or gamma rays can be used. Plants having a mutated BABYBOOM promoter can then be identified, for example, by phenotype or by molecular techniques, including but not limited to TILLING methods. See, e.g., Comai, L. & Henikoff, S. The Plant Journal 45, 684-694 (2006).

[0046] Other mutation induction systems, such as genome editing methods, can be used to target mutations in the BABYBOOM and/or DWT1 promoter, having the advantages of increasing the frequency of single and multiple mutations at a defined target site (Lozano-Juste, J., and Cutler, S.R. (2014) Trends in Plant Science 19, 284-287). The sequence-specific introduction of a double-stranded DNA break (DSB) in a genome leads to the recruitment of DNA repair factors at the breakage site, which then repair the lesion by either the error-prone non-homologous end joining (NHEJ) or homologous recombination (HR) pathways. NHEJ repairs the breaks, but is imprecise and often creates diverse mutations at and around the DSB. In cells in which the HR machinery repairs the DSB, sequences with homology flanking the DSB, including exogenously supplied sequences, can be incorporated at the region of the DSB. DSBs can therefore be leveraged by geneticists to increase the frequency of mutations at defined sites; however, intrinsic differences between the relative roles of HR and NHEJ can affect the mutation types at a target locus. A number of technologies have been developed to create DSBs at specific sites including synthetic zinc finger nucleases (ZFNs), transcription activator-like endonucleases (TALENs) and most recently the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) system. This system is based on a bacterial immune system against invading bacteriophages in which a complex of two small RNAs, the CRISPR-RNA (crRNA) and the trans-activating crRNA (tracrRNA), directs a nuclease (Cas9) to a specific DNA sequence complementary to the crRNA. Using any of these systems, one can create DSBs at pre-determined sites in cells expressing the genome editing constructs. In order for homologous recombination to occur, a DNA cassette homologous to the targeted site must be provided, preferably at a high concentration so that HR is favored over NHEJ.

[0047] The present disclosure also provides for nucleic acids, including isolated nucleic acids, nucleic acid expression cassettes, and expression vectors, that encode a heterologous egg cell-expressing promoter operably linked to a DWT1 polypeptide coding sequence, and optionally further comprising an expression cassette comprising a heterologous egg cell-expressing promoter operably linked to a Babyboom polypeptide coding sequence as described herein. Also provided are host cells comprising the nucleic acids.

[0048] In some embodiments, recombinant DNA vectors suitable for transformation of plant cells and comprising the expression cassette are prepared. Techniques for transforming a wide variety of higher plant species are well known and described in the technical and scientific literature. See, e.g., Weising et al. Ann. Rev. Genet. 22:421-477 (1988). In some embodiments, the vector comprising the sequences (e.g., promoters or CENH3 coding regions) comprises a marker gene that confers a selectable phenotype on plant cells. For example, the marker may encode biocide resistance, particularly antibiotic resistance, such as resistance to kanamycin, G418, bleomycin, or hygromycin, or herbicide resistance, such as resistance to chlorsulfuron or Basta.

[0049] In some embodiments, any of a variety of different expression constructs, such as expression cassettes and vectors suitable for transformation of plant cells, can be prepared. Techniques for transforming a wide variety of higher plant species are well known and described in the technical and scientific literature. See, e.g., Weising et al. Ann. Rev. Genet. 22:421-477 (1988). A DNA sequence coding for a protein can be combined with cis-acting (promoter) and trans-acting (enhancer) transcriptional regulatory sequences to direct the timing, tissue type and levels of transcription in the intended tissues of the transformed plant. Translational control elements can also be used. In some embodiments, a terminator sequence is included in the expression construct. An exemplary NOS terminator sequence is CCGATCGTTCAAACATTTGGCAATAAAGTTTCTTAAGATTGAATCCTGTTGCCGG TCTTGCGATGATTATCATATAATTTCTGTTGAATTACGTTAAGCATGTAATAATTA ACATGTAATGCATGACGTTATTTATGAGATGGGTTTTTATGATTAGAGTCCCGCA ATTATACATTTAATACGCGATAGAAAACAAAATATAGCGCGCAAACTAGGATAA ATTATCGCGCGCGGTGTCATCTATGTTACTAGATCGGGAATTGATCCCCCCTCGA CAG.

[0050] Also provided are host cell(s) comprising a heterologous egg cell-expressing promoter operably linked to a DWT1 polypeptide coding sequence, and optionally further comprising an expression cassette comprising a heterologous egg cell-expressing promoter operably linked to a Babyboom polypeptide coding sequence, as described herein.

Exemplary host cells include, for example, prokaryotic (e.g., including but not limited to E. coli) cells or eukaryotic cells, and can be, for example, plant, fungal, yeast, mammalian, insect, or other cells. Also provided as discussed above are plants comprising a heterologous egg cell-expressing promoter operably linked to a DWT1 polypeptide coding sequence, and optionally further comprising an expression cassette comprising a heterologous egg cell-expressing promoter operably linked to a Babyboom polypeptide coding sequence, as described herein.

[0051] Any method of introducing a first expression cassette comprising a heterologous egg cell-expressing promoter operably linked to a DWT1 polypeptide coding sequence, and a second expression cassette comprising a heterologous egg cell-expressing promoter operably linked to a Babyboom polypeptide coding sequence can be used. In some embodiments, both expression cassettes are introduced in one transformation. In other embodiments, a first expression cassette is introduced into a plant and then the resulting transformant is further transformed with the second expression cassette. See, e.g., the Example. In some embodiments, the expression cassettes as described herein are combined with suitable T-DNA flanking regions and introduced into a conventional Agrobacterium tumefaciens host vector. The virulence functions of the Agrobacterium tumefaciens host will direct the transfer of the T-DNA into plant cells when the cell is infected by the bacteria. Agrobacterium tumefaciens-mediated transformation techniques, including disarming and use of binary vectors, are well described in the scientific literature. See, for example, Horsch et al. Science 233:496-498 (1984), and Fraley et al. Proc. Natl. Acad. Sci. USA 80:4803 (1983). Alternatively, other transformation methods can be used.

[0052] In some embodiments, transformation will be performed on embryonic plant tissue. For example, Agrobacterium tumefaciens can be co-cultivated with seed embryo-derived secondary calluses (see, e.g., Sallaud, C. et al., Theor. Appl. Genet. 106, 1396-1408 (2003); US Patent No. 10,584,345; EP0290395; and US2011/0212525). In some embodiments, transformation will be performed on somatic tissue. In some embodiments, transformation will be performed on plant protoplasts. Transformed cells can subsequently be selected (e.g., selecting for antibiotic resistance or other selectable marker introduced with the T-DNA or as otherwise known in the art). Primary transformed cells can subsequently be regenerated into plants.

[0053] The plant manipulated as described herein can be any plant species. In some embodiments, the plant is a dicot plant. In some embodiments the plant is a monocot plant. In some embodiments, the plant is a grass. In some embodiments, the plant is a cereal (e.g., including but not limited to Poaceae, e.g., rice, barley, wheat, maize). In some embodiments, the plant is a species of plant of the genus Abelmoschus, Allium, Apium, Amaranthus, Arachis, Arabidopsis, Asparagus, Atropa, Avena, Benincasa, Beta, Brassica, Cannabis, Capsella, Cica, Cichorium, Citrus, Citrullus, Capsicum, Carthamus, Cocos, Coffea, Cucumis, Cucurbita, Cynasa, Daucus, Diplotaxis, Dioscorea, Elais, Eruca, Foeniculum, Fragaria, Glycine, Gossypium, Helianthus, Heterocallis, Hordeum, Hyoscyamus, Ipomea, Lactuca, Lagenaria, Lepidium, Linum, Lolium, Luffa, Luzula, Lycopersicon, Malus, Manihot, Majorana, Medicago, Momodica, Musa, Nicotiana, Olea, Oryza, Panicum, Pastinaca, Pennisetum, Persea, Petroselinium, Phaseolus, Physalis, Pinus, Pisum, Populus, Pyrus, Prunus, Raphanus, Saccharum, Secale, Senecio, Sesamum, Sinapis, Solanum, Sorghum, Spinacia, Theobroma, Trichosantes, Trigonella, Triticum, Turritis, Valerianelle, Vitis, Vigna, or Zea.

[0054] As noted above, by introducing the two expression cassettes or otherwise inducing expression of Babyboom and DWT1 in egg cells, one can induce a high rate of parthenogenesis. For example, expression of both Babyboom and DWT1 in egg cells results in greater rates of parthenogenesis than expression of Babyboom alone, DWT1 alone, or in the absence of expression of either in egg cells. In some embodiments, the rate of parthenogenesis is at least, e.g., 40%, 50%, 60%, 70%, 80%, 85%, 90%, or 95%. Parthenogenesis results in a reduction (halving) of the number of chromosomes because the progeny carry only the chromosomes delivered by one parent through the egg. In the absence of MiMe (discussed below), this means for example that a diploid parent plant will produce seed that is haploid, or for example a tetraploid plant would produce diploid seed. The percent of seed having only the chromosomes of one parent represents the rate of parthenogenesis.
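The definition of the rate of parthenogenesis given above can be sketched as a short calculation. This is a minimal illustration only, assuming each scored seed or seedling has already been assigned a ploidy call (for example by genotyping); the function name and the example data are hypothetical and do not represent results from the disclosure.

```python
def parthenogenesis_rate(ploidy_calls, parent_ploidy=2):
    """Percent of scored progeny whose ploidy is half that of the parent.

    ploidy_calls  -- list of integer ploidy calls, one per scored seed
    parent_ploidy -- ploidy of the parent plant (2 for diploid, 4 for tetraploid)
    Assumes no MiMe phenotype, so parthenogenetic progeny have reduced ploidy.
    """
    if not ploidy_calls:
        raise ValueError("no progeny were scored")
    if parent_ploidy % 2 != 0:
        raise ValueError("parent ploidy must be even to be halved")
    reduced = parent_ploidy // 2
    n_reduced = sum(1 for p in ploidy_calls if p == reduced)
    return 100.0 * n_reduced / len(ploidy_calls)


# Hypothetical scoring of ten progeny from a diploid parent: nine haploid, one diploid.
calls = [1] * 9 + [2]
print(parthenogenesis_rate(calls))  # 90.0
```

For a tetraploid parent the same function would be called with parent_ploidy=4, in which case diploid progeny are counted as parthenogenetic.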

[0055] In some embodiments, a portion of the seed or seed coat is removed and a genetic test is performed to determine whether the seed is haploid prior to germination. In other embodiments, the seeds are germinated and the resulting progeny plants are screened for those that are haploid, either by testing their genotype or by observation (haploid plants in many cases are smaller than diploid progeny, see FIG. 2). Optionally, one can physically separate progeny into groups of only haploid plants, optionally discarding diploid progeny or otherwise physically separating diploid progeny from haploid progeny.

[0056] Once generated, haploid plants can be used for a variety of useful endeavors, including but not limited to the generation of doubled haploid plants, which comprise an exact duplicate copy of chromosomes. Such doubled haploid plants are of particular use to speed plant breeding, for example. A wide variety of methods are known for generating doubled haploid organisms from haploid organisms.

[0057] Somatic haploid cells, haploid embryos, haploid seeds, or haploid plants produced from haploid seeds can be treated with a chromosome doubling agent. Homozygous double haploid plants can be regenerated from haploid cells by contacting the haploid cells, including but not limited to haploid callus, with chromosome doubling agents, such as colchicine, anti-microtubule herbicides, or nitrous oxide to create homozygous doubled haploid cells.

[0058] Methods of chromosome doubling are disclosed in, for example, US Patent No. 5,770,788; 7,135,615, and US Patent Publication No. 2004/0210959 and 2005/0289673; Antoine-Michard, S. et al., Plant Cell, Tissue Organ Cult., Dordrecht, the Netherlands, Kluwer Academic Publishers 48(3):203-207 (1997); Kato, A., Maize Genetics Cooperation Newsletter 1997, 36-37; Wan, Y. et al., Trends Genetics 77: 889-892 (1989); and Wan, Y. et al., Trends Genetics 81: 205-211 (1991), the disclosures of which are incorporated herein by reference. Methods can involve, for example, contacting the haploid cell with nitrous oxide, anti-microtubule herbicides, or colchicine. Optionally, the haploids can be transformed with a heterologous gene of interest, if desired.

[0059] Double haploid plants can be further crossed to other plants to generate F1, F2, or subsequent generations of plants with desired traits.

[0060] In some embodiments, one can make clonal plants from a parent plant expressing BABYBOOM and DWT1 in egg cells as described herein. This can be achieved, for example, when the parent plant, which is parthenogenic as described above, produces gametes (e.g., egg or pollen cells) having the same number of chromosomes as somatic cells in the plant. Thus for example, if the plant is diploid (the somatic tissue is diploid) then the gametes are also diploid. This can be achieved in various ways, for example by inducing a "mitosis instead of meiosis" (MiMe) phenotype in the parent plant (in addition to the expression of BABYBOOM). See, e.g., US Patent Publication No. 2012/0042408 and PCT Publication No. WO 2012/075195. Seed from a plant expressing BABYBOOM and DWT1 in egg cells, and having mutations that induce the MiMe phenotype, will be clonal to the parent plant. Mutations that induce the MiMe phenotype are known and can be introduced into the plant as desired. In some embodiments, an RNA-guided nuclease and sufficient guide RNAs are expressed in the plant to induce mutations that cause the MiMe phenotype.

[0061] As noted above, in some embodiments the plant also comprises an expression cassette comprising a promoter operably linked to an RNA-guided nuclease. The RNA-guided nuclease can recognize a sequence of a target nucleic acid (e.g., via an RNA guide), bind to the target nucleic acid, and modify the target nucleic acid. The RNA-guided nuclease has nuclease activity. For example, the RNA-guided nuclease can modify the target nucleic acid by cleaving the target nucleic acid. After the action of the nuclease at the beginning of a coding sequence (as targeted by a gRNA), the introduction of inserts or deletions by the error-prone non-homologous end joining repair of double-strand breaks (DSBs) introduces frame-shift mutations and for example subsequent premature stop codons, leading to mRNA elimination by nonsense-mediated mRNA decay. For example, the Cas nuclease can direct cleavage of one or both strands at a location in a target nucleic acid. Non-limiting examples of Cas nucleases include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 and Csx12), Cas10, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4, Cpf1, homologs thereof, variants thereof, mutants thereof, and derivatives thereof. There are three main types of Cas nucleases (type I, type II, and type III), and 10 subtypes including 5 type I, 3 type II, and 2 type III proteins (see, e.g., Hochstrasser and Doudna, Trends Biochem Sci, 2015;40(1):58-66). Type II Cas nucleases include Cas1, Cas2, Csn2, Cas9, and Cfp1. These Cas nucleases are known to those skilled in the art. For example, the amino acid sequence of the Streptococcus pyogenes wild-type Cas9 polypeptide is set forth, e.g., in NCBI Ref. Seq. No. NP_269215, and the amino acid sequence of Streptococcus thermophilus wild-type Cas9 polypeptide is set forth, e.g., in NCBI No. WP_011681470.

[0062] Cas nucleases, e.g., Cas9 nucleases, can be derived from a variety of bacterial species including, but not limited to, Veillonella atypical, Fusobacterium nucleatum, Filifactor alocis, Solobacterium moorei, Coprococcus catus, Treponema denticola, Peptoniphilus duerdenii, Catenibacterium mitsuokai, Streptococcus mutans, Listeria innocua, Staphylococcus pseudintermedius, Acidaminococcus intestine, Olsenella uli, Oenococcus kitaharae, Bifidobacterium bifidum, Lactobacillus rhamnosus, Lactobacillus gasseri, Finegoldia magna, Mycoplasma mobile, Mycoplasma gallisepticum, Mycoplasma ovipneumoniae, Mycoplasma canis, Mycoplasma synoviae, Eubacterium rectale, Streptococcus thermophilus, Eubacterium dolichum, Lactobacillus coryniformis subsp. Torquens, Ilyobacter polytropus, Ruminococcus albus, Akkermansia muciniphila, Acidothermus cellulolyticus, Bifidobacterium longum, Bifidobacterium dentium, Corynebacterium diphtheria, Elusimicrobium minutum, Nitratifractor salsuginis, Sphaerochaeta globus, Fibrobacter succinogenes subsp. Succinogenes, Bacteroides fragilis, Capnocytophaga ochracea, Rhodopseudomonas palustris, Prevotella micans, Prevotella ruminicola, Flavobacterium columnare, Aminomonas paucivorans, Rhodospirillum rubrum, Candidatus Puniceispirillum marinum, Verminephrobacter eiseniae, Ralstonia syzygii, Dinoroseobacter shibae, Azospirillum, Nitrobacter hamburgensis, Bradyrhizobium, Wolinella succinogenes, Campylobacter jejuni subsp. Jejuni, Helicobacter mustelae, Bacillus cereus, Acidovorax ebreus, Clostridium perfringens, Parvibaculum lavamentivorans, Roseburia intestinalis, Neisseria meningitidis, Pasteurella multocida subsp. Multocida, Sutterella wadsworthensis, proteobacterium, Legionella pneumophila, Parasutterella excrementihominis, Wolinella succinogenes, and Francisella novicida.

[0063] Cas9 protein refers to an RNA-guided double-stranded DNA-binding nuclease protein or nickase protein. Wild-type Cas9 nuclease has two functional domains, e.g., RuvC and HNH, that cut different DNA strands. Cas9 can induce double-strand breaks in genomic DNA (target DNA) when both functional domains are active. The Cas9 enzyme can comprise one or more catalytic domains of a Cas9 protein derived from bacteria belonging to the group consisting of Corynebacter, Sutterella, Legionella, Treponema, Filifactor, Eubacterium, Streptococcus, Lactobacillus, Mycoplasma, Bacteroides, Flaviivola, Flavobacterium, Sphaerochaeta, Azospirillum, Gluconacetobacter, Neisseria, Roseburia, Parvibaculum, Staphylococcus, Nitratifractor, and Campylobacter. In some embodiments, the Cas9 can be a fusion protein, e.g., the two catalytic domains are derived from different bacteria species.

[0064] In some embodiments, a Cas protein can be a Cas protein variant. For example, useful variants of the Cas9 nuclease can include a single inactive catalytic domain, such as a RuvC⁻ or HNH⁻ enzyme or a nickase. A Cas9 nickase has only one active functional domain and can cut only one strand of the target DNA, thereby creating a single strand break or nick. In some embodiments, the Cas9 nuclease can be a mutant Cas9 nuclease having one or more amino acid mutations. For example, the mutant Cas9 having at least a D10A mutation is a Cas9 nickase. In other embodiments, the mutant Cas9 nuclease having at least a H840A mutation is a Cas9 nickase. Other examples of mutations present in a Cas9 nickase include, without limitation, N854A and N863A. A double-strand break can be introduced using a Cas9 nickase if at least two DNA-targeting RNAs that target opposite DNA strands are used. A double-nicked induced double-strand break can be repaired by NHEJ or HDR (Ran et al., 2013, Cell, 154: 1380-1389). Non-limiting examples of Cas9 nucleases or nickases are described in, for example, U.S. Patent No. 8,895,308; 8,889,418; and 8,865,406 and U.S. Application Publication Nos. 2014/0356959, 2014/0273226 and 2014/0186919. The Cas9 nuclease or nickase can be codon-optimized for the target cell or target organism.

[0065] In some embodiments, the Cas nuclease can be a high-fidelity or enhanced specificity Cas9 polypeptide variant with reduced off-target effects and robust on-target cleavage. Non-limiting examples of Cas9 polypeptide variants with improved on-target specificity include the SpCas9 (K855A), SpCas9 (K810A/K1003A/R1060A) (also referred to as eSpCas9(1.0)), and SpCas9 (K848A/K1003A/R1060A) (also referred to as eSpCas9(1.1)) variants described in Slaymaker et al., Science, 351(6268): 84-8 (2016), and the SpCas9 variants described in Kleinstiver et al., Nature, 529(7587):490-5 (2016) containing one, two, three, or four of the following mutations: N497A, R661A, Q695A, and Q926A (e.g., SpCas9-HF1 contains all four mutations).

[0066] The promoter operably linked to the sequence encoding the RNA-guided nuclease can be a constitutive promoter or an egg-specific promoter or be otherwise selected such that the RNA-guided nuclease is expressed in egg cells. Examples of constitutive promoters include the cauliflower mosaic virus (CaMV) 35S transcription initiation region, the 1'- or 2'- promoter derived from T-DNA of Agrobacterium tumefaciens, the parsley UBI promoter (Kawalleck et al., Plant Mol Biol. (1993 Feb) 21(4):673-84), RPS5 (Hiroki Tsutsui et al. Plant and Cell Physiology (2016)); 2X35SΩ (Belhaj, Khaoula, et al. Plant methods 9.1 (2013): 39); AtUBI10 (Callis J, et al. Genetics 139: 921-939 (1995)); SlUBI10 (Dahan-Meir, Tal, et al. The Plant Journal (2018)); G10-90 (Ishige, Fumiharu, et al. The Plant Journal 18.4 (1999): 443-448) and other transcription initiation regions from various plant genes known to those of skill. In some embodiments, each expression cassette in the single construct uses a different promoter.

[0067] The RNA-guided nuclease will be expressed with a sufficient set of expression cassettes directing expression of guide RNAs (gRNAs) to induce a meiosis-to-mitosis phenotype. Plant genes to be targeted to obtain a MiMe phenotype are known and are also described below. In general, expression of a single guide RNA per gene can be sufficient to reduce expression of each target gene, but if desired, two or more guide RNAs can be targeted to one or more of the genes to further reduce their expression.

[0068] As used throughout, a guide RNA (gRNA) sequence is a sequence that interacts with a site-specific or targeted nuclease and specifically binds to or hybridizes to a target nucleic acid within the genome of a cell, such that the gRNA and the targeted nuclease colocalize to the target nucleic acid in the genome of the cell. Each gRNA includes a DNA targeting sequence or protospacer sequence of about 10 to 50 nucleotides in length that specifically binds to or hybridizes to a target DNA sequence in the genome. For example, the DNA targeting sequence is about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides in length. In some embodiments, the gRNA comprises a crRNA sequence and a transactivating crRNA (tracrRNA) sequence. In some embodiments, the gRNA does not comprise a tracrRNA sequence. The guide sequence can be used in a single-guide RNA (sgRNA) as described below, or in a split crRNA + tracrRNA construct.

[0069] In some embodiments, the targeted nuclease (e.g., a Cas protein) is guided to its target DNA by a single-guide RNA (sgRNA). An sgRNA is a version of the naturally occurring two-piece guide RNA (crRNA and tracrRNA) engineered into a single, continuous sequence. An sgRNA typically contains (1) a guide sequence (e.g., the crRNA equivalent portion of the sgRNA) that targets the Cas protein to the target DNA, and (2) a scaffold sequence that interacts with a nuclease such as a Cas protein (e.g., the tracrRNA equivalent portion of the sgRNA). An sgRNA may be selected using software. As a non-limiting example, considerations for selecting an sgRNA can include, e.g., the PAM sequence for the Cas9 protein to be used, and strategies for minimizing off-target modifications. Tools such as NUPACK® and the CRISPR Design Tool can provide sequences for preparing the sgRNA, for assessing target modification efficiency, and/or assessing cleavage at off-target sites.

[0070] The guide sequence in the sgRNA may be complementary to a specific sequence within a target DNA. The 3' end of the target DNA sequence can be followed by a PAM sequence. Approximately 20 nucleotides upstream of the PAM sequence is the target DNA. In general, a Cas9 protein or a variant thereof cleaves about three nucleotides upstream of the PAM sequence. The guide sequence in the sgRNA can be complementary to either strand of the target DNA.
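The geometry described above (a target of about 20 nucleotides immediately upstream of a PAM, cleavage about three nucleotides upstream of the PAM, and targets on either strand) can be sketched in Python as follows. This is an illustrative sketch only and is not one of the design tools named above. It assumes the 5'-NGG-3' PAM commonly associated with Streptococcus pyogenes Cas9, which the text does not specify; the function names and the example sequence are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_targets(seq, guide_len=20):
    """List candidate target sites followed by an NGG PAM on either strand.

    Returns tuples (strand, protospacer, pam, cut_index), where cut_index is
    the 0-based position, in that strand's own 5'->3' coordinates, of the
    first base 3' of the expected cut (three nucleotides upstream of the PAM).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            pam_start = match.start() + guide_len
            hits.append((strand, match.group(1), match.group(2), pam_start - 3))
    return hits


# Hypothetical 23-nt sequence: 20-nt target followed by a TGG PAM.
print(find_targets("A" * 20 + "TGG"))
```

Off-target assessment, which the text mentions as a separate consideration, is outside the scope of this sketch.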

[0071] The promoter operably linked to the sequence encoding the guide RNA can be a constitutive promoter or an egg-specific promoter or be otherwise selected such that the guide RNA is expressed in egg cells. Examples of constitutive promoters include the cauliflower mosaic virus (CaMV) 35S transcription initiation region, the 1'- or 2'- promoter derived from T-DNA of Agrobacterium tumefaciens, the parsley UBI promoter (Kawalleck et al., Plant Mol Biol. (1993 Feb) 21(4):673-84), RPS5 (Hiroki Tsutsui et al. Plant and Cell Physiology (2016)); 2X35SΩ (Belhaj, Khaoula, et al. Plant methods 9.1 (2013): 39); AtUBI10 (Callis J, et al. Genetics 139: 921-939 (1995)); SlUBI10 (Dahan-Meir, Tal, et al. The Plant Journal (2018)); G10-90 (Ishige, Fumiharu, et al. The Plant Journal 18.4 (1999): 443-448) and other transcription initiation regions from various plant genes known to those of skill.

[0072] Genes necessary to knock out for generation of plants having the MiMe phenotype are known. See, e.g., US Patent Publication No. 2012/0042408; US Patent Publication No. 2014/0298507, and PCT Publication No. WO 2012/075195. A plant having the MiMe (mitosis instead of meiosis) genotype is a plant in which a deregulation of meiosis results in a mitotic-like division and in which meiosis is replaced by mitosis. Plants having the MiMe genotype produce functional (e.g., diploid) gametes that are genetically identical to their parent. Exemplary MiMe plants combine phenotypes of (1) no second meiotic division, (2) no recombination and (3) modified chromatid segregation. MiMe plants are exemplified by MiMe-1 plants as described by d'Erfurth, I. et al. PLoS Biol 7, e1000124 (2009) and WO2001/079432, and MiMe-2 plants as described by d'Erfurth, I. et al. PLoS Genet 6, e1000989 (2010). In some embodiments, the MiMe phenotype is induced by inhibiting or mutating OSD1 or an ortholog thereof, REC8 or an ortholog thereof, and at least one of SPO11, PRD1, PRD2, or PRD3/PAIR1 (see, e.g., Mieulet D., Cell Res. 2016 Nov; 26(11): 1242-1254).

[0073] Exemplary MiMe-1 plants combine inactivation of the OSD1 gene with the inactivation of two or more other genes, one which encodes a protein necessary for efficient meiotic recombination in plants (e.g., SPO11-1, SPO11-2, PRD1, PRD2, or PAIR1), and whose inhibition eliminates recombination and pairing (see, e.g., Grelon M, et al. EMBO Journal 20, 589-600 (2001)), and another which encodes a protein necessary for the monopolar orientation of the kinetochores during meiosis, e.g., REC8, and whose inhibition modifies chromatid segregation (see, e.g., Chelysheva L, et al., Journal of Cell Science 118, 4621-4632 (2005)). Exemplary MiMe-2 plants combine inactivation of the TAM gene with the inactivation of two or more other genes, one which encodes a protein necessary for efficient meiotic recombination in plants (e.g., SPO11-1, SPO11-2, PRD1, PRD2, or PAIR1), and whose inhibition eliminates recombination and pairing, and another which encodes a protein necessary for the monopolar orientation of the kinetochores during meiosis, e.g., REC8, and whose inhibition modifies chromatid segregation.

[0074] Exemplary OSD1 gene sequences include, e.g., those described in US Patent Publication No. 2014/0298507 and rice and Arabidopsis OSD1 protein sequences as provided in SEQ ID NOS: 34 and 36, respectively.

[0075] Exemplary TAM gene sequences are described in, e.g., US Patent Publication No. 2014/0298507. Arabidopsis TAM1 protein sequence is provided as SEQ ID NO:46. Illustrative rice Cyclin-A1 protein sequences are provided as SEQ ID NOS:48, 50, 52, 54, and 56. Illustrative Cyclin-A3 protein sequences are provided as SEQ ID NOS:58 and 60.

[0076] Exemplary Arabidopsis DYAD cDNA coding sequence and the sequence of the protein encoded by the nucleic acid are provided as SEQ ID NOS: 70 and 71, respectively. Exemplary rice DYAD homolog (SWITCH 1) protein sequences are provided as SEQ ID NOS:64, 66, and 68.

[0077] Examples of SPO11-1 and SPO11-2 proteins are provided in US Patent Publication No. 2014/0298507. An illustrative Arabidopsis SPO11-2 protein sequence is provided as SEQ ID NO:40.

[0078] Arabidopsis PAIR1 is described in, e.g., US Patent Publication No. 2014/0298507. An exemplary rice PAIR1 protein sequence is provided as SEQ ID NO:38.

[0079] Exemplary rice and Arabidopsis REC8 protein sequences are provided as SEQ ID NOS:60 and 62, respectively.

[0080] In some embodiments, sufficient expression cassettes to produce the MiMe phenotype include at least one expression cassette comprising a promoter operably linked to one or more guide RNA targeting a gene or coding sequence encoding (a) a TAM (Cyclin A CYCA1;2) or DYAD protein or ortholog thereof; (b) a protein involved in initiation of meiotic recombination in plants exemplified herein as SPO11-1; SPO11-2; PRD; PRD2; or PAIR1 (also called PRD3) or ortholog thereof; and (c) a protein necessary for the monopolar orientation of the kinetochores during meiosis, for example REC8 protein or ortholog thereof. Orthologs have the functionality of the proteins described herein but are from different plant species. Orthologs can be substantially identical to the polypeptides as provided herein or can otherwise be selected from genomic databases.

[0081] In some embodiments, sufficient expression cassettes to produce the MiMe phenotype include at least one expression cassette comprising a promoter operably linked to one or more guide RNA targeting a gene or coding sequence encoding (a) an OSD1 protein or ortholog thereof; (b) a protein involved in initiation of meiotic recombination in plants, exemplified herein as SPO11-1; SPO11-2; PRD; PRD2; or PAIR1 (also called PRD3) or ortholog thereof; and (c) a protein necessary for the monopolar orientation of the kinetochores during meiosis, for example REC8 protein or ortholog thereof.

EXAMPLE

[0082] The ability of another male-expressed transcription factor, DWARF TILLER1 (OsDWT1), to induce parthenogenesis when expressed in the egg cell was tested. The DWT1 transcription factor is encoded by a member of the WUSCHEL-related homeobox (WOX) gene family (Wang et al. 2014). OsDWT1 was tested, either alone or in combination with OsBBM1, for the ability to increase the parthenogenesis efficiency in rice.

Methods

[0083] The OsDWT1 CDS was cloned under Arabidopsis and rice egg cell-specific promoters (see sequences in List 2 below) for expression in the egg cells (Figure 1, attached). The OsDWT1 CDS was initially amplified from cDNAs from rice callus tissues with the primers listed in List 1. The CDS was amplified in three fragments, which were joined at two unique restriction sites (NcoI and NheI) to complete the sequence. The complete CDS was then cloned into pCAMBIA2300-based binary vectors in which egg cell promoters from Arabidopsis and rice, and Nos transcription terminators, were already cloned (Figure 1A). The binary constructs containing the egg cell-specific OsDWT1 expression cassettes were supertransformed, using Agrobacterium-mediated transformation, into seeds homozygous for pEC1.2::OsBBM1 from line #8c (Figure 1B; Khanday et al., 2019). Thirteen T0 transgenic plants were raised with the pEC1.2::OsDWT1:NosT construct. The primary T0 transgenic plants were confirmed for the presence of the transgene by PCR amplification of the NptII selection marker. Ten T0 lines, hemizygous for the pEC1.2::OsDWT1:NosT transgene, were analyzed for their capacity to induce haploidy. T1 seeds were germinated on ½ MS media containing 300 mg/L of G418 in a growth chamber with a 16/8 hour light/dark cycle at 25 °C for 12 days. In parallel, T1 seeds were also germinated on ½ MS media without G418. The germinated seedlings resistant to G418 (both hemizygous and homozygous for pEC1.2::OsDWT1:NosT) and those germinated on media without G418 were transferred to the greenhouse after 12 days. The seedlings were allowed to undergo flowering transition and phenotypes were scored after the panicles fully emerged.

Results

[0084] When OsDWT1 alone was expressed in the egg cell, it was unable to induce parthenogenesis. We then tested the possibility of OsBBM1 and OsDWT1 acting synergistically to increase the parthenogenesis efficiency in rice. Several independent T0 lines were generated by Agrobacterium transformation and their T1 progenies were analyzed (see Methods above for details). Depending upon the efficiency of parthenogenesis, the T1 progenies are expected to be a mixture of haploids (parthenogenetic progeny) and diploids (sexual progeny). Ploidy was first determined by observing the plant phenotype: the haploids are dwarf with narrow leaves compared to diploids, and sterile due to meiotic defects (Figure 2). The final confirmation of ploidy was done by flow cytometry (Figure 3). As expected, haploid progenies displayed a flow cytometric peak at 80 (Figure 3B), whereas diploid peaks were at about 160, double the haploid value (Figure 3A). Thus, these results confirm the genome size of haploid and diploid progenies. The haploid induction frequencies were calculated as the number of haploid progenies obtained divided by the total number of seedlings germinated. From the hemizygous T0 mother plant (line #12b), the induction frequency of haploids in the T1 generation was calculated to be 45.8% (Table 1A). Since only half the egg cells of the hemizygous T0 parent would have inherited the pEC1.2::OsDWT1:NosT transgene, the actual parthenogenesis efficiency (% of egg cells carrying the transgene that underwent parthenogenesis and produced haploids) will be higher, i.e., about 92% (45.8% divided by 0.5).
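The two calculations described above can be sketched as follows. This is a minimal illustration, not the inventors' analysis code; the function names are hypothetical, and the correction simply divides the observed haploid frequency by the fraction of egg cells expected to carry the transgene (0.5 for a hemizygous parent, 1.0 for a homozygous parent), as the text does.

```python
def haploid_induction_frequency(n_haploid: int, n_germinated: int) -> float:
    """Percent haploids among all germinated seedlings."""
    if n_germinated <= 0:
        raise ValueError("n_germinated must be positive")
    return 100.0 * n_haploid / n_germinated


def parthenogenesis_efficiency(haploid_pct: float, transgene_egg_fraction: float) -> float:
    """Haploid frequency rescaled to the egg cells that carry the transgene.

    transgene_egg_fraction is an assumption supplied by the user:
    0.5 for a hemizygous mother plant, 1.0 for a homozygous one.
    """
    if not 0.0 < transgene_egg_fraction <= 1.0:
        raise ValueError("transgene_egg_fraction must be in (0, 1]")
    return haploid_pct / transgene_egg_fraction


# Line #12b, hemizygous T0 parent: 45.8% haploids observed in T1 (Table 1A).
print(round(parthenogenesis_efficiency(45.8, 0.5), 1))  # 91.6, i.e. about 92%
```

For a homozygous mother plant the fraction is 1.0, so the efficiency equals the observed haploid frequency.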

[0085] We then identified homozygous pEC1.2::OsDWT1:NosT diploid T1 individuals by germination on G418: all T2 progeny of homozygotes will be resistant to this antibiotic, whereas hemizygous individuals will produce ¾ resistant and ¼ sensitive T2 progeny. The T2 progeny of these homozygous plants were analyzed to estimate parthenogenesis efficiency. The haploid frequency in these T2 progenies from homozygous T1 mother plants increased to 91% (Table 1A). As the T1 parents were homozygous for the transgene, the parthenogenesis efficiency is equal to the haploid frequency, i.e., 91%. This is a greater than 3-fold increase over the 15-29% efficiency of the parent pEC1.2::OsBBM1 #8c line that was used for transformation of the pEC1.2::OsDWT1:NosT construct (Table 1A).
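The G418 screen above relies on a hemizygous parent producing some sensitive progeny. As a rough guide to how many T2 seedlings must be scored before an all-resistant result can be taken as evidence of homozygosity, the sketch below assumes independent seedlings and the ¾ resistant expectation stated in the text; the function names and the 5% threshold are illustrative assumptions, not part of the described method.

```python
def prob_all_resistant_if_hemizygous(n_progeny: int, resistant_fraction: float = 0.75) -> float:
    """Chance that a hemizygous parent yields only resistant progeny by luck."""
    return resistant_fraction ** n_progeny


def min_progeny_to_call_homozygous(alpha: float = 0.05, resistant_fraction: float = 0.75) -> int:
    """Smallest progeny count for which an all-resistant result from a
    hemizygous parent has probability at most alpha."""
    n = 1
    while prob_all_resistant_if_hemizygous(n, resistant_fraction) > alpha:
        n += 1
    return n


print(min_progeny_to_call_homozygous())  # 11
```

Under these assumptions, scoring 11 or more seedlings with no sensitive individuals makes a hemizygous parent unlikely (probability of about 0.042).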

[0086] We also screened for diploid T1 progenies in which the pEC1.2::OsDWT1:NosT transgene had segregated out (negative for the pEC1.2::OsDWT1:NosT transgene) from plants germinated on ½ MS media without G418. The parthenogenesis frequency in T2 progenies from these pEC1.2::OsDWT1:NosT negative plants was found to be 20.1% (Table 1A). Thus, the combination of OsBBM1 and OsDWT1 acts synergistically to increase the parthenogenesis efficiency, in this instance by 4.5-fold over OsBBM1 alone, and thereby the number of haploid progenies.

[0087] We carried out similar analyses for an independent transgenic line, #7d (Table 1B). Similar to line #12b, the parthenogenesis efficiencies were much higher than with just the pEC1.2::OsBBM1 transgene. The haploid frequency of pEC1.2::OsDWT1:NosT hemizygous T0 mother plants increased to 48.2%, and that of homozygous T1 mother plants increased to 86.4% (Table 1B). The haploid frequency from sibling T1 mother plants of the same line that were negative for presence of the pEC1.2::OsDWT1:NosT transgene, i.e., carrying only pEC1.2::OsBBM1, was 5.5% (Table 1B). Thus, in this instance, the combination of OsBBM1 and OsDWT1 acting synergistically increased the parthenogenesis efficiency by 15-fold over OsBBM1 alone. Together, these results show that OsDWT1, when expressed in the egg cell, increases the parthenogenetic capacity of OsBBM1 by 4- to 15-fold over OsBBM1 alone.
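The fold increases quoted for the two lines follow directly from the haploid frequencies reported in the text. The short sketch below reproduces that arithmetic; the dictionary layout and function name are illustrative only, and the percentages are those stated for homozygous and transgene-negative T1 mother plants.

```python
def fold_increase(combined_pct: float, bbm1_only_pct: float) -> float:
    """Ratio of haploid frequency with OsBBM1 + OsDWT1 to OsBBM1 alone."""
    return combined_pct / bbm1_only_pct


# (homozygous for the DWT1 transgene, DWT1-negative siblings), in % haploids
lines = {"#12b": (91.0, 20.1), "#7d": (86.4, 5.5)}

for name, (combined, bbm1_only) in lines.items():
    print(name, round(fold_increase(combined, bbm1_only), 1))
# #12b 4.5
# #7d 15.7
```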

Discussion

[0088] Heterosis refers to the enhanced vigor in F1 progenies compared to their inbred parents. F1 hybrids have been used to substantially increase crop yields. However, due to genetic segregation resulting from sexual reproduction, F1 hybrid seeds need to be created afresh for cultivation during every sowing season, as high-yielding hybrids cannot be maintained through normal seed propagation. To circumvent this problem and to fix the vigor in hybrid crops, we combined the parthenogenesis ability of OsBBM1 with a method of substituting mitosis for meiosis, thus bypassing segregation and fertilization to develop a method of synthetic apomixis or clonal seed formation (Khanday et al., 2019).

[0089] However, low parthenogenesis frequency (~29%) remains a bottleneck for field application of this technology, which would require parthenogenesis frequencies of at least 80% to be commercially useful. Through this invention, we have attained parthenogenesis efficiencies of 86 to 91%, which will pave the way for synthetic apomixis technology to be introduced into the farmer’s field for hybrid crop cultivation.

Table 1: Increase in parthenogenesis frequencies by DWT1 expressed in rice egg cell

Symbols: eDT = Transgene pEC1.2::DWT1; eB1 = Transgene pEC1.2::BBM1

A. Transformant #12b

T0 genotype eB1/eB1; eDT/- (Homozygous for BBM1 transgene, hemizygous for DWT1 transgene)

% Haploids in T1 progeny of T0 parent: 45.8% (n = 83)

Frequency of parthenogenesis in T1 generation measured by % of haploids in T2 progeny

Genotype of plants Progeny (Generation) % Haploids

B. Transformant #7d

T0 genotype eB1/eB1; eDT/- (Homozygous for BBM1 transgene, hemizygous for DWT1 transgene)

% Haploids in T1 progeny of T0 parent: 48.2% (n = 56)

Frequency of parthenogenesis in T1 generation measured by % of haploids in T2 progeny

Genotype of plants Progeny (Generation) % Haploids

List 1: DNA primers for PCR amplification of sequences

Primers for rice egg cell promoter pECA1

REG1 F ATG GAA TGA TGG ATG AAT GTT CAC

REG1 R GG TTT TTC TTT CTA GCT TTG CTG

Primers for amplification of Arabidopsis EC1.1 promoter

EC1.1 F GTTGCCTTATGATTTCTTCGGTTT

EC1.1 R TTCTCAACAGATTGATAAGGTCGAAA

Primers for amplification of Arabidopsis EC1.2 promoter

DD45PstI F CTGCAGAAATGTTCCTCGCTGACGTA

DD45SalI R GTCGACTATTCTTTCTTTTTGGGGTTTTTG

Primers for amplification of DWT1

DWT1 FF ATG GCG TCG TCG AAC AGG CAC

DWT1 R2 GTCCATGGCCGTCGTCACG

DWT1 F2 CGT GAC GAC GGC CAT GGA C

DWT1 R3 CAGTCAGGCTAGCGGTGGC

DWT1 F3 GCC ACC GCT AGC CTG ACT G

DWT1 RL TTACATGACAACAATGTAGACGGC
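The Methods state that the CDS was amplified in three fragments joined at NcoI and NheI sites. A quick consistency check on the DWT1 primers listed above is sketched below: it tests whether each internal reverse primer is the reverse complement of the following forward primer and whether the pair spans the expected recognition site. The recognition sequences CCATGG (NcoI) and GCTAGC (NheI) are standard; the variable and function names are illustrative.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Primer sequences from List 1, with spaces removed.
DWT1_R2 = "GTCCATGGCCGTCGTCACG"
DWT1_F2 = "CGTGACGACGGCCATGGAC"
DWT1_R3 = "CAGTCAGGCTAGCGGTGGC"
DWT1_F3 = "GCCACCGCTAGCCTGACTG"

NCOI_SITE = "CCATGG"
NHEI_SITE = "GCTAGC"

# Each junction: (reverse primer of one fragment, forward primer of the next, site)
junctions = [(DWT1_R2, DWT1_F2, NCOI_SITE), (DWT1_R3, DWT1_F3, NHEI_SITE)]

for rev, fwd, site in junctions:
    overlaps = reverse_complement(fwd) == rev
    has_site = site in fwd
    print(overlaps, has_site)
# True True
# True True
```

Both junction primer pairs are exact reverse complements of each other and contain the corresponding restriction site, which is consistent with the three-fragment assembly described in the Methods.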

References:

1. Anderson SN, Johnson CS, Chesnut J, Jones DS, Khanday I, Woodhouse M, Li C, Conrad LJ, Russell SD, Sundaresan V (2017) The zygotic transition is initiated in unicellular plant zygotes with asymmetric activation of parental genomes. Developmental Cell 43: 349-358.e4.

2. Khanday I, Skinner D, Yang B, Mercier R, Sundaresan V (2019) A male-expressed rice embryogenic trigger redirected for asexual propagation through seeds. Nature 565: 91-95. doi:10.1038/s41586-018-0785-8

3. Wang W, Li G, Zhao J, Chu H, Lin W, et al. (2014) DWARF TILLER1, a WUSCHEL-Related Homeobox Transcription Factor, Is Required for Tiller Growth in Rice. PLoS Genet 10(3): e1004154. doi:10.1371/journal.pgen.1004154

[0090] It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.

SEQUENCES

SEQ ID NO: 1

DWT1 Oryza sativa Japonica

1 massnrhwps mfrskhatqp wqtqpdmags ppsllsgssa gsaggggysl ksspfssvge

61 ervpdpkprw nprpeqiril eaifnsgmvn pprdeiprir mqlqeygqvg danvfywfqn

121 rksrsknklr sggtgraglg lggnrasapa aahreavaps ftppppilpa pqpvqpqqql

181 vspvaaptss sssssdrssg sskparatst qamsvttamd llsplaaach qqmlyqgqpl

241 esppapapkv hgivphdepv flqwpqspcl savdlgaail ggqymhlpvp apqppsspga

301 agmfwglcnd vqapnntghk scawsaglgq hwcgsadqlg lgkssaasia tvsrpeeahd

361 vdatkhgllq ygfgittpqv hvdvtssaag vlppvpssps ppnaavtvas vaatasltdf

421 aasaisagav annqfqglad fglvagacsg agaaaaaaap eagssvaavv cvsvagaapp

481 lfypaahfnv rhygdeaell ryrggsrtep vpvdesgvtv eplqqgavyi vvm

SEQ ID NO: 2 DWL2

Oryza sativa Japonica

1 maspnrhwps mfrsnlacni qqqqqpdmng ngsssssili spptaattgn gkpsilssgc

61 eegtrnpepk prwnprpeqi rilegifnsg mvnpprdeir rirlqlqeyg qvgdanvfyw

121 fqnrksrtkn klraaghhhh hgraaalpra sappstnivl psaaaaaplt pprrhllaat

181 sssssssdrs sgssksvkpa aaalltsaai dlfspapapt tqlpacqlyy hshptplard

241 dqlitspess slllqwpasq ympatelggv Igssshtqtp aaitthpsti spsvllglcn

301 ealgqhqqet mddmmitcsn pskvfdhhsm ddmsctdavs avnrddekar Igllhygigv

361 taaanpaphh hhhhhhlasp vhdavsaada staamilpft ttaaatpsnv vatssaladq 421 Iqglldagll qggaapppps atvvavsrdd etmctkttsy sfpatmhlnv kmfgeaavlv 481 rysgepvlvd dsgvtveplq qgatyyvlvs eeavh

SEQ ID NO: 3. DWL1

Oryza sativa Japonica

1 mtllavivfg gggggsrssa Ipslstgvag vegsatgpwl pesltlnaat hripstspls 61 ipqtltitrd ppypmlprsh ghrtggggfs Iksspfssvg eervpdpkpr rnprpeqiri 121 leaifnsgmv npprdeipri rmqlqeygqv gdanvfywfq nrksrsknkl rsggtgragl 181 glggnrasep paaatahrea vapsftpppi Ippqpvqpqq qlvspvaapt slsssssdrs 241 sgsskparat Itqamsvtaa mdllsplrrs arprqeqrhv

SEQ ID NO: 4 Maize DWT1

1 massfnnkts hwpsmfrskh aaepwqaqpd isssppslls gggsssttti grclkhplsg

61 ysggeertpd pkprwnprpe qirileaifn sgmvnpprde iprirmrlqe ygqvgdanvf

121 ywfqnrksrs knkqrtgqlg Iglarapgcg aaappvtpqp liqnqfqvla spaqapasss

181 ssssdrssgs skpapqpmsa taaamnfpgp Igaacaqmyy qahpvapvsa Ipahkvqdpv

241 asdepvfqpw Iqgyflsaae vasilggqyr hdvpvqqqpp atlpagaflg lynevteptv

301 tghrtcawgp aglgqfwpvg gadhhqhhKh nttaatntva rdaahehatt Igllqygfea

361 saametasaa vplaaspgta asvataglzs Ipastnavvv nydllqglav pgsgsgavgv

421 stggappvav aaaptaaqeg vvvalcitds vtgksvahnv aaarldvraq fgeaavllra

481 vgdrggldlv pvpvdalgct veplqhgafy yvlv

SEQ ID NO: 5 Maize DWL2

1 massnrhwps myrsslacnf qqpqpqpdmn nggksslmss rceenggrnp eprprwnprp

61 eqirilegif nsgmvnpprd eirrirlqlq eygpvgdanv fywfqnrksr tkhklraagq

121 Iqpsgsgrsa Iqaracapap vtpprnlqla aaapvappts ssssssdrss gssssksvtv

181 tpttavalas pagaapaavf rqqgvmptza mdlltplpss saalaarqly yqyhsqimap

241 aappmpdtvi aspeqflpqw qqggqqhyyl patelggvld ghshhthepp aaihrpvsls

301 psvlfglcne alrqdycadi svvptkglgh ghqfwnsttc gsdmgnsnsk idavsavird

361 deksrlgllh yyglagattt aaaavapapl aadaaagtat llpssaasdq Iqglldaagl

421 Imgetpptpt atvvavarda vtcaatataq fsvpasmrld vrlafgeaal larhtgeavp

481 vdesgvtvep Iqqdtlyyvl mqatnn

SEQ ID NO: 6 Maize DWL1

1 masssfnnsh wpsmfrsKha aepcqttqpd isssppslls aggasttttt grclkhsisv

61 ggeerapdpk prwnprpeqi rileaifnsg mvnpprdeip rirmrlqqyg qvgdanvfyw

121 fqnrksrskn klrsstagtg rlglqglara pgrgaaaapp pveppplvqn qfhmlaspaq

181 aptsssssss drssgsskpa aepampataa pmdllgplaa acpqmyyqgs pvapah^vld

241 Ivasvepvfq pwpqgyclsa aevatilggq ymhvpvqqqp paplpagall glcndvcept

301 avvtghktca wgpaglgqsw pcggadhhqp gknnntaare laheddatkl gllqygfgat

361 tameaapava plaaspagga vtmasvsast agltgfpast ngvvanydll qglavpggga

421 gagrapaava vaadaaptaa qegvvalcit dsitgksvah nvaaarldvr aqfgeaavll

481 rcggergldl epvpvdasgc tveplqrgaf yyvll

SEQ ID NO: 7: DWT1 = DWARF TILLER1 Nucleotide sequence

ATGGCGTCGTCGAACAGGCACTGGCCGAGCATGTTCAGGTCGAAGCACGCCACGCAG CCG

TGGCAGACGCAGCCTGACATGGCCGGGTCGCCGCCCTCCCTCCTCTCCGGCTCCTCC GCC

GGCAGCGCCGGCGGCGGCGGCTACTCCCTCAAGTCGTCGCCCTTCTCGTCAGTGGGC GAG

GAGAGGGTTCCGGACCCGAAGCCGCGGTGGAACCCGCGGCCGGAGCAGATCCGGATC CTG

GAGGCGATCTTCAACTCCGGCATGGTCAACCCGCCGCGCGACGAGATCCCGCGCATC CGC

ATGCAGCTGCAGGAGTACGGCCAGGTCGGCGACGCCAACGTCTTCTACTGGTTCCAG AAC

CGCAAGTCCCGCTCCAAGAACAAGCTGCGCTCCGGCGGGACAGGCCGCGCGGGGCTC GGC

CTCGGCGGCAACCGGGCCTCCGCGCCGGCGGCGGCGCACCGGGAGGCCGTGGCGCCG TCG

TTCACGCCGCCGCCACCAATCCTCCCGGCGCCCCAGCCGGTGCAGCCGCAGCAGCAG CTT

GTCTCGCCTGTGGCGGCGCCTACCTCGTCGTCGTCTTCCTCCTCCGACCGTTCGTCC GGG

TCCAGCAAGCCTGCGAGGGCTACGTCGACGCAGGCGATGTCCGTGACGACGGCCATG GAC

CTGCTCTCGCCGCTCGCCGCGGCGTGCCACCAGCAGATGCTCTATCAAGGCCAGCCA CTG

GAGTCGCCGCCGGCGCCTGCTCCCAAAGTGCACGGCATCGTGCCACACGACGAGCCG GTC

TTCCTGCAGTGGCCGCAGAGCCCCTGCCTGTCGGCCGTCGACCTCGGCGCCGCCATT CTT

GGCGGCCAGTACATGCACCTGCCGGTGCCCGCTCCGCAGCCACCGTCGTCGCCGGGC GCG

GCGGGCATGTTCTGGGGGCTCTGCAACGACGTGCAAGCGCCAAACAACACCGGCCAC AAG

AGCTGCGCCTGGAGCGCCGGGCTCGGCCAGCACTGGTGCGGCTCCGCCGATCAGCTCGGC

CTCGGCAAGAGCAGCGCGGCGTCGATCGCCACCGTGTCTAGGCCGGAGGAGGCGCAC GAC

GTCGACGCCACGAAGCACGGTCTGCTACAGTACGGCTTTGGCATCACCACGCCGCAA GTG

CACGTGGACGTTACCTCCTCGGCTGCTGGCGTTCTGCCTCCTGTTCCGTCCTCGCCG TCG

CCGCCGAACGCCGCCGTCACCGTCGCGAGCGTGGCCGCCACCGCTAGCCTGACTGAT TTT

GCTGCAAGTGCTATATCTGCTGGCGCCGTCGCTAACAATCAGTTTCAAGGTCTCGCG GAT

TTCGGGCTCGTCGCCGGCGCCTGCTCCGGCGCCGGAGCCGCCGCCGCCGCCGCCGCG CCC

GAGGCGGGCAGTTCCGTGGCCGCGGTTGTGTGCGTCAGCGTCGCGGGCGCCGCGCCG CCG

CTCTTCTACCCGGCCGCGCACTTCAACGTGAGGCACTACGGCGACGAGGCCGAGCTG CTC

CGCTACAGAGGAGGCAGCCGCACGGAGCCTGTGCCCGTCGACGAGTCGGGCGTCACC GTC

GAGCCGCTCCAGCAGGGCGCCGTCTACATTGTTGTCATGTAA
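As a spot check that the nucleotide sequence of SEQ ID NO: 7 encodes the protein of SEQ ID NO: 1, the first and last codons can be translated with the standard genetic code. The sketch below is illustrative only; the helper names are assumptions, and only short fragments copied from the listing are translated rather than the full sequence.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 21 nt and last 15 nt of SEQ ID NO: 7.
print(translate("ATGGCGTCGTCGAACAGGCAC"))  # MASSNRH
print(translate("ATTGTTGTCATGTAA"))        # IVVM*
```

These match the start (massnrh) and end (ivvm) of the DWT1 protein sequence in SEQ ID NO: 1.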

SEQ ID NO: 8 DWL2 = DWARF TILLER-LIKE2 Nucleotide sequence

ATGGCCTCACCGAACAGGCACTGGCCGAGCATGTTCAGGTCCAATCTTGCCTGCAAC ATC

CAGCAGCAGCAGCAGCCTGACATGAACGGCAACGGCAGCTCGTCCTCTTCCTTCCTC CTC

TCGCCACCTACTGCTGCGACCACCGGCAACGGCAAGCCCTCCTTGCTCTCCTCAGGG TGT

GAGGAGGGGACGAGGAATCCGGAGCCGAAGCCGCGGTGGAACCCGAGGCCGGAGCAG ATA

AGGATACTGGAGGGGATCTTCAACTCCGGGATGGTGAACCCGCCGCGCGACGAGATC CGC

CGCATCCGCCTGCAGCTGCAGGAGTACGGCCAGGTCGGCGACGCCAACGTCTTCTAC TGG

TTCCAGAACCGCAAGTCCCGCACCAAGAACAAGCTGCGCGCCGCCGGCCACCACCAC CAC

CACGGCCGCGCCGCCGCCCTGCCGCGCGCGTCGGCGCCGCCGTCGACGAACATCGTA CTC

CCCTCTGCAGCGGCGGCGGCGCCCTTGACGCCGCCGCGGCGCCATCTCCTCGCCGCG ACC

TCCTCCTCGTCCTCCTCCTCCGACCGCTCCTCCGGGTCCAGCAAGTCGGTGAAACCA GCT

GCTGCCGCGCTGCTGACGTCAGCCGCCATCGACCTTTTCTCGCCGGCGCCGGCGCCG ACG

ACCCAGCTGCCCGCGTGCCAGCTCTACTACCATAGCCATCCCACGCCGCTGGCACGT GAT

GATCAGCTCATCACCTCGCCGGAGTCGTCGTCGCTCCTCCTGCAGTGGCCGGCGAGC CAG

TACATGCCGGCGACGGAGCTCGGCGGCGTCCTCGGCTCGTCGTCCCACACGCAAACC CCG

GCAGCGATCACCACCCACCCATCGACGATCTCACCCAGCGTGCTCCTCGGCCTATGC AAC

GAGGCACTAGGGCAGCATCAGCAAGAGACCATGGACGACATGATGATCACCTGCTCC AAC

CCCTCCAAGGTGTTCGACCACCATTCCATGGACGACATGAGCTGCACCGACGCGGTG AGC

GCCGTGAACAGGGACGACGAGAAGGCGAGGCTGGGGTTACTGCACTACGGCATCGGC GTC

ACTGCTGCTGCAAATCCGGCACCACATCATCATCATCATCATCATCATCTTGCCTCT CCT

GTGCATGATGCTGTCTCGGCTGCAGATGCTAGTACGGCGGCCATGATCCTTCCATTC ACC

ACCACTGCTGCTGCGACGCCGAGCAACGTCGTCGCTACAAGCTCTGCACTCGCTGAT CAG

TTGCAAGGGCTGTTGGATGCTGGGTTGCTGCAGGGAGGGGCGGCGCCGCCGCCGCCC TCG

GCGACGGTGGTGGCGGTGAGCCGCGACGACGAGACGATGTGCACCAAGACCACGAGC TAC

AGCTTCCCGGCGACGATGCACCTCAACGTGAAGATGTTCGGCGAGGCGGCCGTGCTG GTG

CGCTACAGCGGCGAGCCGGTGCTCGTCGACGACTCCGGCGTCACCGTCGAGCCGCTG CAG

CAGGGCGCGACCTACTACGTGCTGGTATCTGAGGAAGCTGTGCATTGA

SEQ ID NO: 9 Rice DWL1 = DWARF TILLER-LIKE1 Nucleotide sequence

ATGATGGCCTTAGGCGTGCCACCGCCTCCCTCGCGCGCCTACGTGTCCGGCCCGCTA CGC

GACGATGACACTTTTGGCGGTGATCGTGTTCGGCGGCGGCGGCGGTGGCTCAAGGAG CAG

TGCCCTGCCATCATTGTCCACGGGGGTGGCAGGCGTGGAGGGGTCGGCCACAGGGCC CTG

GCTGCCGGAGTCTCTAAAATGCGTCTCCCAGCCCTAAACGCCGCCACCCACCGGATC CCC

TCCACCTCGCCCTTAAGTATCCCTCAGACCCTCACCATCACCCGCGATCCTCCCTAC CCA

ATGCTGCCTCGAAGTCACGGCCACCGGACCGGCGGCGGCGGCTTCTCCCTCAAGTCC TCG

CCCTTCTCGTCAGTGGGCGAGGAGAGGGTTCCGGACCCGAAGCCGCGGCGGAACCCG CGG

CCGGAGCAGATCCGGATCCTGGAGGCCATCTTCAACTCCGGCATGGTCAACCCGCCG CGC

GACGAGATCCCGCGCATCCGCATGCAGCTGCAGGAGTACGGCCAGGTCGGCGACGCC AAC

GTCTTCTACTGGTTCCAGAACCGCAAGTCCCGCTCCAAGAACAAGCTGCGCTCCGGC GGG

ACAGGCCGCGCGGGGCTCGGGCTCGGCGGCAACCGGGCCTCAGAGCCGCCGGCGGCG GCG

ACGGCGCACCGGGAGGCCGTGGCACCGTCGTTCACGCCGCCACCGATCCTCCCGCCC CAG

CCGGTGCAGCCGCAGCAGCAGCTTGTCTCGCCGGTGGCGGCGCCCACCTCGTTGTCG TCA

TCGTCCTCCGACCGCTCGTCCGGGTCCAGCAAGCCCGCGAGGGCTACGTTGACGCAG GCG

ATGTCCGTGACGGCGGCCATGGACCTGCTCTCGCCGCTCCGCCGATCAGCTCGGCCA CGG

CAAGAGCAGCGCCATGTCTAG

SEQ ID NO: 10 OsBBM1 GenBank accession number: AAX95437.1

MASITNWLGFSS SSFSGAGADPVLPHPPLQEWGSAYEGGGTVAAAGGEETAAPKLEDFLG

MQVQQETAAAAAGHGRGGSSSWGLSMIKNWLRSQPPPAWGGEDAMMALAVSTSASPP V DATVPACI SPDGMGSKAADGGGAAEAAAAAAAQRMKAAMDTFGQRTSIYRGVTKHRWTGR

YEAHLWDNSCRREGQTRKGRQVNAGGYDKEEKAARAYDLAALKYWGTTTTTNFPVSN YEK

ELDEMKHMNRQEFVASLRRKSSGFSRGASIYRGVTRHHQHGRWQARIGRVAGNKDLY LGT

FGTQEEAAEAYDIAAIKFRGLNAVTNFDMSRYDVKSIIESSNLPIGTGTTRRLKDSS DHT

DNVMDINVNTEPNNWSSHFTNGVGNYGSQHYGYNGWSPISMQPIPSQYANGQPRAWL KQ

EQDSSWTAAQNLHNLHHFSSLGYTHNFFQQSDVPDVTGFVDAPSRSSDSYSFRYNGT NG

FHGLPGGISYAMPVATAVDQGQGIHGYGEDGVAGIDTTHDLYGSRNVYYLSEGSLLA DVE

KEGDYGQSVGGNSWVLPTP*

SEQ ID NO: 11

Arabidopsis thaliana BABY BOOM (AtBBM), NCBI Protein Accession # NP_001332647.1; NP_197245

1 MNSMNNWLGF SLSPHDQNHH RTDVDSSTTR TAVDVAGGYC FDLAAPSDES

51 SAVQTSFLSP FGVTLEAFTR DNNSHSRDWD INGGACNNIN NNEQNGPKLE 101 NFLGRTTTIY NTNETWDGN GDCGGGDGGG GGSLGLSMIK TWLSNHSVAN 151 ANHQDNGNGA RGLSLSMNSS TSDSNNYNNN DDWQEKTIV DWETTPKKT 201 IESFGQRTSI YRGVTRHRWT GRYEAHLWDN SCKREGQTRK GRQVYLGGYD 251 KEEKAARAYD LAALKYWGTT TTTNFPLSEY EKEVEEMKHM TRQEYVASLR 301 RKSSGFSRGA SIYRGVTRHH QHGRWQARIG RVAGNKDLYL GTFGTQEEAA 351 EAYDIAAIKF RGLSAVTNFD MNRYNVKAIL ESPSLPIGSS AKRLKDVNNP 401 VPAMMISNNV SESANNVSGW QNTAFQHHQG MDLSLLQQQQ ERYVGYYNGG 451 NLSTESTRVC FKQEEEQQHF LRNSPSHMTN VDHHSSTSDD SVTVCGNWS 501 YGGYQGFAIP VGTSVNYDPF TAAEIAYNAR NHYYYAQHQQ QQQIQQSPGG 551 DFPVAISNNH SSNMYFHGEG GGEGAPTFSV WNDT

SEQ ID NO: 12

Brassica napus BABY BOOM1 (BnBBM1)

NCBI Protein Accession # NP_001302749 ; AAM33802

1 mnnnwlgfsl spyeqnhhrk dvysstttcv vdvageycyd ptaasdessa iqtsfpspfg 61 vvvdaftrdn nshsrdwdin gcacnnihnd eqdgpklenf Igrtttiynt nenvgdgsgs 121 gcygggdggg gslglsmikt wlrnqpvdnv dnqengnaak glslsmnsst s cdnnndsnn 181 nvvaqgktid dsveatpkkt iesfgqrtsi yrgvtrhrwt gryeahlwdn s ckregqtrk 241 grqvylggyd keekaarayd laalkywgct tttnfpmsey ekeveemkhm trqeyvaslr 301 rkssgfsrga siyrgvtrhh qhgrwqarig rvagnkdlyl gtfgtqeeaa eaydiaaikf 361 rgltavtnfd mnrynvkail espslpigsa akrlkeanrp vpsmmmisnn vsesensasg 421 wqnaavqhhq gvdlsllhqh qeryngyyyn ggnlssesar acfkqeddqh hflsntqslm 481 tnidhqssvs ddsvtvcgnv vgyggyqgfa apvncdayaa sefdynarnh yyfaqqqqtq 541 qspggdfpaa mtnnvgsnmy yhgegggeva ptftvwndn

SEQ ID NO: 13

Brassica napus BABY BOOM2 (BnBBM2)

NCBI Protein Accession # AAM33801; NP_001303138

1 mnnnwlgfsl spyeqnhhrk dvcsstttca vdvageycyd ptaasdessa iqtsfpspfg 61 vvldaftrdn nshsrdwdin gsacnnihnd eqdgpklenf Igrtttiynt nenvgdidgs 121 gcygggdggg gslglsmikt wlrnqpvdnv dnqengngak glslsmnsst s cdnnnyssn 181 nlvaqgktid dsveatpkkt iesfgqrtsi yrgvtrhrwt gryeahlwdn s ckregqtrk 241 grqvylggyd keekaarayd laalkywgct tttnfpmsey ekeieemkhm trqeyvaslr 301 rkssgfsrga siyrgvtrhh qhgrwqarig rvagnkdlyl gtfgtqeeaa eaydiaaikf 361 rgltavtnfd mnrynvkail espslpigsa akrlkeanrp vpsmmmisnn vsesennasg 421 wqnaavqhhq gvdlsllqqh qeryngyyyn ggnlssesar acfkqeddqh hflsntqslm 481 tnidhqssvs ddsvtvcgnv vgyggyqgfa apvncdayaa sefdynarnh yyfaqqqqtq 541 hspggdfpaa mtnnvgsnmy yhgegggeva ptftvwndn

SEQ ID NO: 14

Oryza sativa BABY BOOM1 (OsBBM1)

NCBI Protein Accession # XP_015616214

1 masitnwlgf ssssfsgaga dpvlphpplq ewgsayeggg tvaaaggeet aapkledflg 61 mqvqqetaaa aaghgrggss svvglsmirn wlrsqpppav vggedammal avstsasppv 121 datvpacisp dgmgskaadg ggaaeaaaaa aaqrmkaamd tfgqrtsiyr gvtkhrwtgr 181 yeahlwdnsc rregqtrkgr qvylggydre ekaaraydla alkywgtttt tnfpvsnyek 241 eldemkhmnr qefvaslrrk ssgfsrgasi yrgvtrhhqh grwqarigrv agnkdlylgt 301 fgtqeeaaea ydiaaikfrg Inavtnfdms rydvksiies snlpigtgtt rrlkdssdht 361 dnvmdinvnt epnnvvsshf tngvgnygsq hygyngwspi smqpipsqya ngqprawlkq 421 eqdssvvtaa qnlhnlhhfs slgythnffq qsdvpdvtgf vdapsrssds ysfryngtng 481 fhglpggisy ampvatavdq gqgihgyged gvagidtthd lygsrnvyyl segslladve 541 kegdygqsvg gnswvlptp

SEQ ID NO: 15

Oryza sativa BABY BOOM (OsBBM)

NCBI Protein Accession # XP_015634444

1 ma tmnnwlaf slspqdqlpp sqtnstlisa aattttagds stgdvcfnip qdwsmrgsel

61 salvaepkle df Iggisf se qqhhhggkgg vipssaaacy assgssvgyl ypppsssslq

121 fadsvmvats spvvahdgvs gggmvsaaaa aaasgnggig Ismiknwlrs qpapqpaqal

181 slsmnmagtt taqgggamal lagagergrt tpaseslsts ahgattatma ggr keineeg

241 sgsagavvav gsesggsgav veagaaaaaa rks vdtf gqr tsiyrgvtrh rwtgryeahl

301 wdnscrregq trkgrqvylg gydkeekaar aydlaalkyw gpttttnfpv nnyekeleem

361 khmtrqef va slrrkssgfs rgasiyrgvt rhhqhgrwqa rigrvagnkd lylgtf stqe

421 eaaeaydiaa ikfrglnavt nfdms rydvk sildsaalpv gtaakrlkda eaaaaydvgr

481 iashlggdga yaahyghhhh saaaawptia f qaaaappph aaglyhpyaq plrgwcxqeq

541 dhaviaaahs Iqdlhhlnlg aaaaahdf f s qamqqqhglg sidnaslehs tgsns vvyng

601 dnggggggyi mapmsavsat atavasshdh ggdggkqvqm gydsylvgad ayggggagrm

661 pswamtpasa paatsssdmt gvchgaqlf s vwndt

SEQ ID NO: 16

Zea mays BABY BOOM1 (ZmBBM1) GRMZM2G366434

NCBI Protein Accession # NP_001147535

1 masannwlgf slsgqdnpqp nqdsspaagi disgasdfyg Iptqqgsdgh Igvpglrddh

61 asygimeayn rvpqetqdwn mrgldynggg selsmlvgss gggggngkra vedsepxled

121 flggnsfvsd qdqsggylfs gvpiassans nsgsntmels miktwlrnnq vaqpqppaph

181 qpqpeemstd asgssfgcsd smgrnsmvaa ggssqslals mstgshlpmv vpsgaasgaa

241 sestssenkr asgamdspgs aveavprksi dtfgqrtsiy rgvtrhrwtg ryeahlwdns

301 crregqsrkg rqvylggydk edkaaraydl aalkywgttt ttnfpisnye keleemxhmt

361 rqeyiaylrr nssgfsrgas kyrgvtrhhq hgrwqarigr vagnkdlylg tfsteeeaae

421 aydiaaikfr glnavtnfdm srydvksile sstlpvggaa rrlkdavdhv eagatiwrad

481 mdgavisqla eagmggyasy ghhgwptiaf qqpsplsvhy pygqpsrgwc kpeqdaaaaa

541 ahslqdlqql hlgsaahnff qasssstvyn ggagasggyq glgggssflm psstvvaaad

601 qghsstanqg stcsygddhq egkligydaa mvataaggdp yaaarngyqf sqgsgsrvsi

661 arangyannw sspfnngmg

SEQ ID NO: 17

Glycine max BABY BOOM1 (GmBBM1)

NCBI Protein Accession # XP_006586645.1 ; ADP37371

1 mgsmnllgfs Ispheehpss qdhsqttpsr fsfnpdgsis stdvaggcfd Itsdstphll

61 nlpsygiyea fhrnnsintt qdwkenynsq nlllgtscnk qnmnqnqqqq pklenflggh

121 sfgeheqtyg gnsastdymf paqpvsaggg gsgggsnnnn nsnsiglsmi ktwlrnqppn

181 seninnnnes ggnirssvqq tlslsmstgs qsstslpllt asvdngesps dnkqpncsaa

241 Idstqtgaie taprksidtf gqrtsiyrgv trhrwtgrye ahlwdnscrr egqtrkgrqv

301 ylggydkeek aaraydlaal kywgttttcn fpishyekel eemkhmtrqe yvaslrrkss

361 gfsrgasiyr gvtrhhqhgr wqarigrvag nkdlylgtfs tqeeaaeayd vaaikfrgls

421 avtnfdmsry dvksilestt Ipiggaakrl kdmeqvelsv dnghradqvd hsiimsshlt

481 qginnnyagg gtathhnwhn ahafhqpqpc ttmhypygqr inwckqeqqd nsdaphslsy

541 sdihqlqlgn ngthnffhtn sglhpmlsmd sasidnssss nsvvydgygg gggynvmpmg

601 tttavvasdg dqnprsnhgf gdneikalgy esvygsatds yhaharnlyy Itqqqsssvd

661 tvkasaydqg sacntwvpta ipthaprsrt smalchgatt pfsllhe

SEQ ID NO: 18

Capsicum annuum BABY BOOM (CaBBM)

NCBI Protein Accession # XP_016568915

1 mkmksmndds sssnnsnsnn nnhssaatns nnwlgfslsp hmkmevtnas etqqqqhphq

61 qqfaqsfyls sspttmnvst asalcyennp fhsslsvmpl ksdgslcime als rshadam 121 vqssspkled flggasqygs hereamalsl dslyyhqnde diqvhshhpy yspmhchgmy 181 qeslleetkp tqisncdaqm tgnemkswgh yaidqhindt csmvaaaaav aagggggtvg 241 cndlqslsls mnpgtqsscv tprqisptgl ecvaieskkr asakvaqkqp vhrksidtf g 301 qrtsqyrgvt rhrwtgryea hlwdnsckre gqtrkgrqvy Iggydmedka araydqaalk 361 ywgpsthinf plenyqkele emknmtrqey vahlrrkssg fsrgasiyrg vtrhhqhgrw 421 qarigrvagn kdlylgtfst qeeaaeaydv aaikfrgvna vtnfdisryd vekimasnnl 481 pagelarrtk erepresiey nnisvhknee cvqnnnnngn itdwkmvlyq asnpsigsnn 541 yrnpsfsval qdligidsin nstshatild heqnkiganh fsnasslvts Igssreaspd 601 ktaaslvfak ptkfvvpttn vnacipsaql rpipvsmahl pvfaalnda

SEQ ID NO: 19

Medicago truncatula BABY BOOM (MtBBM)

NCBI Protein Accession # XP_003624212

1 masmnllgfs Ispqeqhpst qdqtvasrfg fnpneisgsd vqgdhcydls shttphhsln

61 Ishpfsiyea fhtnnnihtt qdwkenynnq nlllgtscmn qnvnnnnqqa qpklenf Igg 121 hsftdhqeyg gsnsysslhl pphqpeascg ggdgstsnnn siglsmiktw Irnqppppen 181 nnnnnnesga rvqtlslsms tgsqssssvp llnanvmsge isssenkqpp ttavvldsnq 241 tsvvesavpr ksvdtfgqrt siyrgvtrhr wtgryeahlw dnscrregqt rkgrqvylgg 301 ydkeekaara ydlaalkywg tttttnfpis hyekeveemk hmtrqeyvas Irrkssgfsr 361 gasiyrgvtr hhqhgrwqar igrvagnkdl ylgtfstqee aaeaydvaai kfrglsavtn 421 fdmsrydvkt ilesstlpig gaakrlkdme qvelnhvnvd ishrteqdhs iinntshlte 481 qaiyaatnas nwhalsfqhq qphhhynann mqlqnypygt qtqklwckqe qdsddhstyt 541 tatdihqlql gnnnnnthnf fglqnimsmd sasmdnssgs nsvvygggdh ggyggnggym 601 ipmaiandgn qnprsnnnfg eseikgfgye nvfgtttdpy haqaarnlyy qpqqlsvdqg 661 snwvptaipt laprttnvsl cppftllhe

SEQ ID NO: 20

Cenchrus ciliaris CcASGR-BBM-like1

NCBI Protein Accession # ACD80125

1 mgstnnwlrf vsfsggggak daaallplpp sprgdvdeag aepkledflg Iqepsaaavg 61 agrpfavggg assiglsmik nwlrsqpapa gpaagvdsmv laaaaastev agdgaeggga 121 vadavqqrka aavdtfgqrt siyrgvtkhr wtgryeahlw dnscrregqt rkgrqvylgg 181 ydkeekaara ydlaalkyrg tttttnfpms nyekeleemk hmsrqeyvas Irrkssgfsr 241 gasiyrgvtr hhqhgrwqar igsvagnkdl ylgtfstqee aaeaydiaai kf rglnavtn 301 fdmsrydvks iiessslpvg gapkrlkevp dqsdmginin gdsaghmtai nlltdgndsy 361 gaesygysgw cptamtpipf qfsighdhsr Iwckpeqdna vvaalhnlhh Iqhlpapvgt 421 hnffqpspvq dmtgvadass ppvesnsfly ngdvgyhgam ggsyampvat Ivegnsagsg 481 ygveegtgse ifggrnlysl sqgssgancg kadayeswdp smlvisqksa nvtvchgapv 541 fsvwk

SEQ ID NO: 21

Pennisetum squamulatum PsASGR-BBM-like1

NCBI Protein Accession # ACD80127

1 mgstnnwlrf asfsggggak daaallplpp sprgdvdeag aepkledflg Iqepsaaavg 61 agrpfavggg assiglsmir nwlrsqpapa gpaagvdsmv laaaaastev agdgaeggga 121 vadavqqrka aavdtfgqrt siyrgvtkhr wtgryeahlw dnscrregqt rkgrqggydk 181 eekaaraydl aalkyrgttt ttnfpmsnye keleemkhms rqeyvaslrr kssgf srgas 241 iyrgvtrhhq hgrwqarigs vagnkdlylg tfstqeeaae aydiaaikfr glnavtnfdm 301 srydvksiie ssslpvggtp krlkevpdqs dmginingds aghmtainll tdgndsygae 361 sygysgwcpt amtpipfqfs nghdhsrlwc kpeqdnavva alhnlhhlqh Ipapvgrhnf 421 fqpspvqdmt gvadassppv esnsflyngd vgyhgamggs yampvatlve gnsagsgygv 481 eegtgseifg grnlyslsqg ssgantgkad ayeswdpsml visqksanvt vchgapvf s v 541 wk

SEQ ID NO: 22

Pennisetum squamulatum PsASGR-BBM-like2

NCBI Protein Accession # ACD80124.2

1 mgstnnwlrf asfsggggak daaallplpp sprgdvdeag aepkledflg lqepsaaavg

61 agrpfavggg assiglsmir nwlrsqpapa gpaagvdsmv laaaaastev agdgaeggga

121 vadavqqrka aavdtfgqrt siyrgvtkhr wtgryeahlw dnscrregqt rkgrqvylgg

181 ydkeekaara ydlaalkyrg tttttnfpms nyekeleemk hmsrqeyvas Irrkssgfsr

241 gasiyrgvtr hhqhgrwqar igsvagnkdl ylgtfstqee aaeaydiaai kfrglnavtn

301 fdmsrydvks iiessslpvg gtpkrlkevp dqsdmginin gdsaghmtai nlltdgndsy

361 gaesygysgw cptamtpipf qfsnghdhsr Iwckpeqdna vvaalhnlhh Iqhlpapvgt

421 hnffqpspvq dmtgvadass ppvesnsfly ngdvgyhgam ggsyampvat Ivegnsagsg

481 ygveegtgse ifggrnlysl sqgssgancg kadayeswdp smlvisqksa nvtvchgapv

541 fsvwk

SEQ ID NO: 23

Rosa canina BABY BOOM1 (RcBBM1)

NCBI Protein Accession # AGZ02154.1

1 mnmnwlgfsl spqeddhapi shqladqecl asrlgfnsne iqgagggtdv sgggssecfd

61 vtnsdstasv nhhltpsifg iheaaafnrn ndhihsqdwn mkgagmnssd snnykrosss

121 dlstmlmgsn tsttstsysi isqqanlenh hqqlpklenf Igrhsfadhd sssaghdymf

181 dmnngpgpvs snvmntktns nniglsmikt wlrnqpsqpr dnhhqleqes knskesrnqp

241 qsslslsmgt gslqtvttat aggaatgeot sssdkntkqs pvvtattttg tdaqtqstgg

301 aieavprkai dtfgqrtsiy rgvtrhrwog ryeahlwdns crregqtrkg rqvylggydk

361 edkaaraydl aalkywgttt ttnfpissye keidemkpmt rqeyvaslrr kssgfsrgas

421 iyrgvtrhhq hgrwqarigr vagnkdlylg tfstqeeaae aydiaaikfr glnavtnfdm

481 srydvksile ssalpittga takrlkdvqq qqppppadhh hqimlssvld hhgqvirsss

541 stehdimsnv ysaygsygaq qgcswptlaf nqaqaqaaaa phqapfaagi ngmqlhyspy

601 gygygnahaq rvwckqeqdt nsnqersfhh qdddhlrqql qlggthnffh dhdqqqqqqq

661 qqtsglmglm dssaasmehs sgsnsviysg gdhhgnnngy gssttgtggg yimpmvmstv

721 vanddqnqad gnnnningfg dgddqeanik aqqlgydhhp qnmflgssst idpayqhhas

781 nrnlyyhlpv qddqhesssv avatssttcn mnwvptavpt lahptftvwn dt

SEQ ID NO: 24

Rosa canina BABY BOOM2 (RcBBM2)

NCBI Protein Accession # AGZ02155.1

1 mnmnwlgfsl spqeddhapi shqladqecl asrlgfnsne ihgagggtdv sgggssecfd

61 Itnsdstasv nhhltpsifg iheaaafnrn ndhihsqdwn mkgagmnssd snnykrosss

121 dlstmlmgsn tttstscsii sqqanlenhh qqlpklenfl grhsfadhdr ssaghdymfd

181 mnngpgpvss nvmntktnsn niglsmiknw Irnqpsqprd nhhqleqesk nskesrnqpq

241 sslslsmgtg slqtvttata ggaataatge ttcssdkntk qspvvtattt tgtdaqcqst

301 ggaieavprk aidtfgqrts iyrgvtrhrw tgryeahlwd nscrregqtr kgrqvylggy

361 dkedkaaray dlaalkywgt ttttnfpiss yekeidemkp mtrqeyvasl rrkssgfsrg

421 asiyrgvtrh hqhgrwqari grvagnkdly Igtfstqeea aeaydiaaik frglnavtnf

481 dmsrydvksi lessalpitt gatakrlkev qqqqppppad hhhqimlssv Idhhgqiirs

541 ssstehdims nvysaygsyg aqqgcswpcl afnqaqaqaa aaphqapfaa gingmqlhys

601 pygygygnah aqrvwckqeq dtnsnqessf hhqdddhlrq qlqlggthnf fhdhdqqqts

661 glmglmdssa asmehssgsn sviysggdhh gnnngygsst tgtgggyimp mvmstvvand

721 dqnqadgnnn ningfedgdd qeanikaqql gydhhhqnmf Igssstidpa yqhhasnrnl

781 yyhlpvqddq hesssvavat ssttcnmnwv ptavptlahp tftvwndt

SEQ ID NO: 25

Ricinus communis AIL6

NCBI Protein Accession # XP_015583464

1 mapattnwls fslspmemlr sstesqfisy egsstatpsp hyfidnfyan gwgnpkeaqg

61 attmaaetsi Itsfidpeth hqqvpkledf Igdsssivry sdnsqtdtqd sslthiydqg

121 saayfseqqd Ikaiagfqaf stnsgsevdd sasiarthlg gefmghsids sgndqlggfs

181 nctaannals lavnnnnnnn gnqsatnsrt iapviesdcp kkiadtfgqr tsiyrgvtrh

241 rwtgryeahl wdnscrregq arkgrqgalf flfspsssyh Islfvacffn yssvkilgiy

301 axsaiphghv lyffqvtnyt keldemkyvs kqefiaslrr kssgfsrgas iyrgvtrhhq

361 qgrwqarigr vagnkdlylg tfateeeaae aydiaaikfr gmnavtnfem srydveaimk

421 salpiggaak rlklsleseq kpnlnheqqp qgsssnsssn nisfasmppv taipcgipfe

481 nttaqlyhhh hhhhhhqhhn Ifhhlqttnn nlggttdiss gsttssmatt msmlpqraef

541 flwphhqsy

SEQ ID NO: 26

Elaeis guineensis EgAP2-1

NCBI Protein Accession # AAV98627 ; NP_001290493

1 mdmdtshswl afslsyhqpy llealssapp hggggmtaee rggsaevaam avvgpkledf

61 iggcgepmgr yaggetgdag giydselkhi aagylqglpa teqqdsemak vaapaesrka

121 vetfgqrtsi yrgvtrhrwt gryeahlwdn scrregqsrk grqvylggyd keekaarayd

181 laalkywgpt tttnfpisny ekeleemknm trqefvaslr rkssgfsrga siyrgvcrhh

241 qhgrwqarig rvagnkdlyl gtfstqeeaa eaydiaaikf rglnavtnfd isrydvrsia

301 nsnlpiggmt grpskatess pssssdamrv eakqlldgrd psaslgfaal pikhdqdfws

361 Ifalqqqqqq qqqqsnqasg fglfssgvcm dfstasngvi sqgcggslvw nggvvgqqqe

421 qsqnnscssi pyatpiafgg nyegssyvgs wvtpppsyyh epakpnvavf qtpifgme

SEQ ID NO: 27

Populus trichocarpa BABY BOOM1 (PtBBM1)

NCBI Protein Accession # XP_002316179

1 masmnnwlgf slshqelpss qsdhhqdhsq ntdsrlgfhs deisgtnvsg ecfdltsdst 61 apslnlpatf gileaf rnnq pqdwnmkslg mnpdtnykta sglpifmgts cnsqtidqnq 121 epklenflgg hs fgnhehkl ngcntmydct gdyvfqncsl qlpseatsne rtsnngggdn 181 knssiglsmi ktwlrnqpap tqqdtnnknn ggaqslslsm stgsqsaasa Ipllavnggv 241 nntggdqsss dnnkqqkstt psldsqtgai esvprksidt fgqrtsiyrg vtrhrwcgry 301 eahlwdns cr regqtrkgrq vylggydkee kaaraydlaa Ikywgttttt nfpitnyeke 361 ieemkhmtrq eyvaslrrks sgf srgasiy rgvtrhhqhg rwqarigrva gnkdlylgt f 421 stqeeaaeay diaaikf rgl navtnfdms r ydvnsiles s tlpiggaakr Ikeaehaeia 481 mdiaqrtddh dnmgsqltdg issygavqhg wptvaf qqaq pf smhypygq rlwckqeqds 541 dnrs f qelhq Iqlgnthnf f qps vlhnlvs mdsssmehss gsnsvvyssg vndgtscgtn 601 ggyqgigygs sagyavpmat visnndnnhn qgngygdgdq vkalgyenmf spsdpyharn 661 Ihylsqqpsa ggikasaydq gsacynwvpt avptiaaars nnmavchgaq pftvwndgt

SEQ ID NO: 28

Populus trichocarpa BABY BOOM2 (PtBBM2)

NCBI Protein Accession # XP_002311259

1 mastnnwlgf slspqelpss qsdhhdhpqn tdsrlrfhsd eisgtdvsge s fdltsdsta 61 pslnlpasfg ileafrnnqs qdwnnmkrsg inedtsyntt sdvpifmgss cnsqnidqnq 121 epklenflgg hsfgnhehkl nvcstmygst ghymfhncsl qlpsedasne rts snggadt 181 sinnnntnss iglsmiktwl knqpaptqqd tnnksnggaq slslsmstgs qsgsdlplla 241 vngggnrtrg eqsssdnnkq qkttpsldsq tgaievvprk sidtfgqrts iyrgvtrhrw 301 tgryeahlwd nscrregqtr kgrqggydxe dkaaraydla alkywgtttt tnfpmsnyek 361 eieemkhmtr qehvaslrrk ssgfsrgasi yrgvtrhhqh grwqarigrv agnkdlylgt 421 fstqeeaaea ydiaaikfrg Inavtnfdmn rydvnsimes stlpiggaak rlkeaehaei 481 ttrvqrtddh dstssqltdg isnygtaahh gwptiafqqa qaftmhypyg qrlwckqeqd 541 sdnhsfqelh qlqlgntqnf Iqpsvlhnvm smesssmehs sgsdsvmyss gghdgtgtgt 601 ngsyqgigyg sntgyaipma tviandvncq dqgngygdge vkalgyenmf sssdpyharn 661 lyylsqqssa gvikasaydq gstcnnwlpt avptiaarsn nmavchgapt f tvwnest

SEQ ID NO: 29

Larix gmelinii var. olgensis x Larix kaempferi BABY BOOM (BBM); KJ004517

NCBI Protein Accession # AHH34920.1

1 mgstsnwlaf slsphltvdm pdstqprsrs aasnhsrhhn df sngtvhdc yelhptdtmq 61 mplrpdgslc ilealdrtqn nqdwqlksle npgsmdlesd vqsqqmmkse Isilaggsse 121 qmsasigrhk nvdqegpkle dflggaslrg hyndartdsi ygnddafdek mmapglrdvv 181 pnclngfdvt dtelssgskk tdqnqdstrn insiqnslvq dsydqnsndq ymf qdcslql 241 ppnsgannmi glsmiktwlr sqp cpenkmn aatnsstpts akdqslgnlt niqslslsms 301 pgsqssspla Ipvqyqntna dspsseskxr slekqslvsv eatprksidt fgqrtsiyrg 361 vtrhrwtgry eahlwdns cr regqtrkgrq vylggydkee kaaraydlaa ikywgpcttt 421 nfptgnyeke Ieemkhmtrq eyvaslrrks sgf srgasiy rgvtrhhqhg rwqarigrva 481 gnkdlylgtf ssqeeaaeay diaaikf rgl navtnfdmtr ydvnsiless tlpiggaaak 541 rikdaepsdp svdgrrtdde isstissqia dtltsygnaa ypnghagwpi iaf qqqcnph 601 apafysqqra aagwckqehn niqnhdlqlh f qs stqnf Iq psirmtsnant vlhnlmnles 661 saqldgtntn snsgl fsnis gnlagns lqm anspips git vcds artpfs tendgs s tkn 721 ss yndnml sn sdpfarglyy I sqhspsvvk anyenaaynn wmtpavqtla prpnltvcha 781 pi ftvwndt

SEQ ID NO:30: Arabidopsis DD45 promoter

CCTCTTTGTCACCGTCACTCTTCTCCTCGTTCTCAACGTCTCCAGCAGAGCACTCCC GCCCGTGGCGGATTCCAC CAACATAGCGGCTAGACTAACCGGAGGAGGACTGATGCAGTGTTGGGATGCACTCTACGA GCTGAAGTCATGTAC TAATGAGATCGTTCTCTTCTTTCTCAACGGTGAGACCAAACTCGGCTACGGTTGCTGCAA CGCCGTTGATGTCAT TACCACTGATTGTTGGCCGGCGATGCTTACTTCTCTTGGCTTTACACTGGAGGAAACCAA TGTCCTCCGTGGTTT CTGTCAATCTCCGAACTCCGGCGGTTCTTCTCCAGCTCTTTCCCCTGTCAAACTTTGATA AATGTTCCTCGCTGA CGTAAGAAGACATTAGTAATGGTTATAATATATAGCTTTCTATGAATGTATGGTGAGAAA ATGTCTGTTCACTGA TTTTGAGTTTG

GAATAAAAGCATTTGCGTTTGGTTTATCATTGCGTTTATACAAGGACAGAGATCCAC TGAGCTGGAATAGCTTAA AACCATTATCAGAACAAAATAAACCATTTTTTGTTAAGAATCAGAGCATAGTAAACAACA GAAACAACCTAAGAG AGGTAACTTGTCCAAGAAGATAGCTAATTATATCTATTTTATAAAAGTTATCATAGTTTG TAAGTCACAAAAGAT GCAAATAACAGAGAAACTAGGAGACTTGAGAATATACATTCTTGTATATTTGTATTCGAG ATTGTGAAAATTTGA CCATAAGTTTAAATTCTTAAAAAGATATATCTGATCTAGATGATGGTTATAGACTGTAAT TTTACCACATGTTTA ATGATGGATAGTGACACACATGACACATCGACAACACTATAGCATCTTATTTAGATTACA ACATGAAATTTTTCT GT7V1TACATGTCTTTGTACAT7V1TTTAAAAGTAATTCCTAAGAAATATATTTATACAA GGAGTTTA7\AG7\AAACA TAGCAT AAAGTT CAAT GAGTAGTAAAAAC CATATACAGTATATAGCATAAAGTT CAAT GAGTTTATTACAAAAGC ATTGGTTCACTTTCTGTAACACGACGTTAAACCTTCGTCTCCAATAGGAGCGCTACTGAT TCAACATGCCAATAT ATACTAAATACGTTTCTACAGTCAAATGCTTTAACGTTTCATGATTAAGTGACTATTTAC CGTCAATCCTTTCCC ATTCCTCCCACTAATCCAACTTTTTAATTACTCTTAAATCACCACTAAGCTTCGAATCCA TCCAAAACCACAATA TAAAAACAGAACTCTCGTAACTCAATCATCGCAAAACAAAACAAAACAAAACAAAAACCC CAAAAAGAAAGAATA

SEQ ID NO:31: Rice egg cell-specific promoter sequence from LOC_Os03g18530 OsECA1 gene

ATGGAATGATGGATGAATGTTCACGTTCTTGAGTTCCTAAATGGTACTAATTTTGCA AAAACTTTCTATATGTGT TTTTTGTTAAGAATGTTGTTTTAAACCCATCTTTTCACTTTATAATATTTAATTAAATCG TTCGTACCCTCGAAT AGTTATTGCAAATTATACTTAACTATTCAGTCATTCAGCACAAAAGAACAGGGCCATGAA ATTGTAATACTAGTA CATTTCTGTTCTTTTCTTTTCTTTTTGAGGTTGTCTGAAACACCTGTATCTTAAACTATC GCAGACTAGCCAATG AGTCGTACTCACCTGAAACTGAAACCAAGTGATTAACCAAGCTGGTTCGACAGTAATTCC ATCCATAATGCAGCT CCGGAGCCCTTCATATCCTGCATGTTACTCAAACAACATCCCCACCTCCTCATTTCCTCT CCCCTATTGCATTGC ATAATTGCAGAAGATTAAGCCGCTAATGCATAATTACACATTATTTGTGTCCACTAATTT TCCCTTTCCCACACG CTACGAAACTCAAAAGCCGGCCTCCTCGCCTCCTTCCCTGAACGTTACTAATCGCGTCAT GTATAAATACAGAGC TTGCCCACGCACCGGCACATTGCATCGCACTACGCACATCTACACGATACCCAAGCAGCA AAGCTAGAAAGAAAA ACC

SEQ ID NO:32: ARABIDOPSIS EGG CELL 1.1 (EC1.1) PROMOTER SEQUENCES FROM AT1G76750 GENE

GTTGCCTTATGATTTCTTCGGTTTCAAGATGATCAAATAGTTATAGATTTCATGCT CACACATGCTCATTAGATGTGTACATACTTTACTTACCCAAATCTATTTTCTCGCA AAGATTTTGATGGTAAAGCTGATTTGGTTCTATTGAACTAAATCAAACGAGTTTC AGACTGAGTGATTCTAATCCGGCCCATTAGCCCCTAAACAGACCCACTAATTACG CAGCTTTTAATAGAGTAATTACACCTAGTTTACCCACTAAACCACTAAGCACTAA TTATCTCACAATCTAATGAGCTTCCCTCGTAATTACTTGGGCTTTCACTCTACCAT TTATTTGTAACAGTCAAGTCTCTACTGTCTCTATATAAACTCTCTAAAGTTAACAC ACAATTCTCATCACAAACAAATCAACCAAAGCAACTTCTACTCTTTCTTCTTTCG ACCTTATCAATCTGTTGAGAA

SEQ ID NO:33: DWT conserved central domain from rice:

PDPKPRWNPRPEQIRILEAI FNSGMVNPPRDEIPRIRMQLQEYGQVGDANVFYWFQNRKSRSKNKLR

SEQ ID NO:34

Rice OSD1; Locus# LOC_Os02g37850 CDS Sequence

Os02g37850.2

atgcctgaagtgagaaattccggcggtagggcggcgctcgccgacccctcgggtggtggg ttctttatcaggaggacgacgtcgccgccgggagccgtggcggtcaagccgctggctcgg cgggccctgccgccgacgagcaacaaggagaacgtgccgccgtcctgggctgtgaccgtg agggctacacccaagaggaggagccccctgcccgagtggtacccgaggagcccactccgc gacatcacgtcagtcgtcaaggcagttgagaggaaaagtcgcctcggaaatgctgcggtt cggcagcagatccagttgagtgaagattcttcacgatctgtggatccagcaactccagta caaaaagaagaaggtgtccctcaaagcacaccaacaccaccaactcaaaaggccctggat gctgctgccccttgtcctggctcaacccaagctgttgcaagcacatcaacagcttacttg gccgagggcaagccgaaggcatcatcttcttctccatctgactgctcctttcagacacca tccagaccaaatgatccagctcttgctgatctcatggagaaggaactgtccagctccata gagcagatagagaagatggtaaggaagaacctcaagagagctccgaaggctgctcagcct tccaaggtgaccatccagaagcgcaccctgttgtccatgagatga

SEQ ID NO:35

Rice OSD1; Locus# LOC_Os02g37850 Protein Sequence

MPEVRNSGGRAALADP3GGGFFIRRTTSPPGAVAVKPLARRALPPTSNKENVPPSMA VTV RATPKRRSPLPEWYPRSPLRDITSWKAVERKSRLQiAAVRQQIQLSEDSSRSVDPATPV QKEEGVPQSTPTPPTQKALDAAAPCPGSTQAVASTSTAYLAEGKPKASSSSPSDCSFQTP SRPNDPALADIMEKELSSSIEQIEKMVRKNLKRAPKAAQPSKVTIQKRTLLSMR*

SEQ ID NO:36

Arabidopsis OSD1; Locus# AT3G57860 CDS Sequence

AT3G57860.1

1 ATGCCAGAAG CAAGAGATCG AACCGAGAGG CCTGTGGATT ACTCGACTAT

51 ATTTGCOAAC CGACGGAGAC ATGGTATTTT ACTTGACGAG CCAGATTCAC 101 GGCTTAGTTT GATTGAATCT CCGGTGAATC CAGATATTGG GTCTATTGGT 151 GGAACGGGCG GGCTTGTGAG AGGCAATTTC ACTACATGGA GGCCTGGFAA 201 TGGCAGAGGT GGTCACACTC CATTTAGATT GCCAGAGGGA AGAGAGAAIA 251 TGCCCATAGT GACCGCTAGG CGTGGAAGAG GTGGTGGTTT GTTGCCTrCT 301 TGGTATCCAA GAACACCrCT ACGCGACATA ACXCAl’ATTG TGCGGGCTAT 351 TGAGAGAAGA AGAGGAGCTG GGAGTGGAGG AGACGATGGC CGAGTTATTG 401 AGATCCCAAC TCATCGACAA GTTGGTGTTC TTGAATCTCC AGTACCACTG 451 TCAGGAGAAC ACAAATGCTC GATGGTCACT CCTGGACGAT CTGTGGGATT 501 CAAGCGTAGT TGCCCACCAT CAACTGCTAA AGTTCAAAAG ATGTTACTTG 551 ACATCACTAA AGAGATAGCT GAGGAAGAAG CTGGCTTGAT CACACCCGAG 601 AAGAAGCTAC TCAATTCTAT TGACAAAGTT GAGAAAATTG TGATGGCGGA 651 GATCCAGAAC TTGAAGAGCA CTCCTCAAGC TAAAAGGGAA GAGCGCGAGA 701 AGAGGGTGCG GACTTTAATG ACTATGCGAT GA

SEQ ID NO:37

Arabidopsis OSD1; Locus# AT3G57860 Protein Sequence

1 MPEARDRTER FVDYSTIEA# RPJIHG1LLDE PDSRLSLIKS PVMPDIGSIG

51 GTGGLVRGWF TTWRPG#GRG CTTPFRLPQG RKNMPI'VTAR RGRGGGLLPS 101 WZPRTPLRDI THIVRAIBRR RGAGTGGDtX» RVIEIPTHRQ VGVLESPVPL 151 3OEHKC3>TVT I-GP3VGFKR3 CPPSTAKVQK MLLDITKBIA SEBAGFXTPS 201 KiU-LHSIOKV EKIWCAKIQK LK5TPQAKRE ERBKKVRTU4 TMR

SEQ ID NO:38

Rice PAIR1, Locus# LOC_Os03g01590 CDS Sequence

>LOC_Os03g01590.1

ATGAAGCTTAAGATGAACAAGGCCTGCGACATCGCCTCCATCTCCGTCCTCCCTCCC CGG

AGGACCGGAGGGAGCAGCGGCGCGTCGGCTTCCGGTTCCGTGGCGGTGGCGGTGGCG TCT CAGCCGCGGTCGCAGCCGCTCTCGCAGTCGCAGCAGTCCTTCTCGCAGGGCGCCTCCGCC TCGCTCTTGCACTCGCAGTCGCAGTTCTCGCAGGTCTCCCTCGACGACAACCTCCTCACC CTCCTCCCTTCCCCCACCCGCGATCAGAGATTTGGCTTGCATGATGACTCATCCAAGAGG ATGTCCTCTTTACCAGCCAGTTCAGCTTCTTGCGCGCGAGAAGAGTCTCAGCTGCAACTG GCAAAATTACCAAGCAACCCAGTGCACCGCTGGAACCCCTCCATTGCAGATACTAGATCA GGTCAGGTTACTAATGAGGATGTTGAGCGCAAATTTCAGCATCTGGCAAGCTCAGTACAT AAGATGGGGATGGTGGTAGACTCAGTCCAAAGTGACGTAATGCAGTTAAACAGAGCCATG AAGGAGGCATCATTAGATTCTGGTAGCATACGGCAAAAGATTGCTGTCCTTGAAAGCTCA CTTCAGCAAATTCTTAAGGGACAAGACGATCTCAAAGCACTCTTTGGAAGCAGCACAAAA CACAATCCTGATCAGACAAGTGTTCTGAATTCTCTAGGCAGCAAATTGAATGAGATATCC TCGACCCTTGCAACCTTGCAGACACAAATGCAAGCAAGACAACTGCAGGGTGATCAGACA ACTGTTCTGAATTCTAATGCCAGCAAATCGAATGAGATATCCTCGACTCTTGCAACCCTG CAGACACAAATGCAAGCAGATATAAGACAACTGCGGTGTGACGTCTTCAGAGTTTTTACA AAAGAGATGGAGGGGGTTGTTAGAGCTATCAGGTCTGTCAATAGTAGGCCTGCTGCAATG CAAATGATGGCAGACCAGAGTTACCAAGTACCAGTTTCAAATGGATGGACCCAGATTAAC CAGACACCAGTAGCAGCTGGAAGGTCTCCAATGAACCGAGCACCAGTAGCAGCTGGAAGG TCCCGGATGAACCAATTACCTGAAACAAAAGTGCTTTCTGCACATTTGGTTTATCCTGCA AAGGT GACAGAT CTGAAGC CAAAGGT GGAGCAGGGAAAGGTAAAAGCAGCT CCACAAAAG CCGTTTGCTTCGAGCTACTACAGGGTGGCACCTAAACAGGAAGAGGTAGCGATTAGAAAG GT CAATATACAAGTGCCAGCAAAGAAGGCACCAGT CAGCATAAT CAT CGAGT CGGAT GAT GACAGTGAAGGACGTGCGTCCTGCGTGATTTTGAAGACAGAAACAGGTAGCAAGGAGTGG AAAGTGACAAAGCAAGGCACCGAAGAGGGCCTGGAGATCCTGCGGAGGGCGAGGAAGAGG AGGAGGAGAGAGATGCAGTCCATCGTGCTCGCATCCTAG

SEQ ID NO:39

Rice PAIR1, Locus# LOC_Os03g01590 Protein Sequence

MKLKMNKACDIAS I SVLPPRRTGGSSGASASGSVAVAVASQPRSQPLSQSQQSFSQGASA SLLHSQSQFSQVSLDDNLLTLLPSPTRDQRFGLHDDSSKRMS SLPAS SASCAREESQLQL AKLPSNPVHRWNPSIADTRSGQVTNEDVERKFQHLASSVHKMGMWDSVQSDVMQLNRAM KEASLDSGSI RQKIAVLES SLQQILKGQDDLKALFGSSTKHNPDQTSVLNSLGSKLNEIS STLATLQTQMQARQLQGDQTTVLNSNASKSNEI SSTLATLQTQMQADI RQLRCDVFRVFT KEMEGWRAI RSVNSRPAAMQMMADQSYQVPVSNGWTQINQTPVAAGRSPMNRAPVAAGR SRMNQLPETKVLSAHLVYPAKVTDLKPKVEQGKVKAAPQKPFAS SYYRVAPKQEEVAI RK VNIQVPAKKAPVS I I I ESDDDSEGRASCVI LKTETGSKEWKVTKQGTEEGLEILRRARKR RRREMQSIVLAS *

SEQ ID NO:40

Arabidopsis SPO11-2; Locus# AT1G63990 CDS Sequence

951 TTTGGTTCCC TTAAAGCCAA AAGATTCACA GATTGCTAAG AGCTTATTGT 1001 CCTCCAAAAT ATTGCAGGAA AACTACATAG AGGAGTTGTC ACTGATGGTT 1051 CAAACTGGTA AAAGAGCGGA AATTGAAGCT CTCTATTGTC ATGGTTAIAA 1101 TTATCTCGGT AAATATATAG CTACCAAGAT CGTGCAAGGC AAATACAIAT 1151 AA

SEQ ID NO:41

Arabidopsis SPO11-2; Locus# AT1G63990 Protein Sequence

1 MEE3SGLSSM KFFSDQHLSY ADILLPHEAR ARIEVSVLNL LRILNSPDPA 51 ISCLSLINRK RSN3CINKGI LTDVSY1FL3 T3FTKSSLTH AKTAKAFVFV 101 WKVMEICFQ1 LLQEKRVTQR ELFYKLLCDS PD1F3SQ1EV NRSVQDWAL 151 LRC3PY3LGI MASSRGLVAG RLFLQBPGKE AVDC3ACGSS GFAITGDLUL 201 LDNTIMRTDA RYI1IVEKHA IFHRLVEDRV FKH1PCVFIT AKGfPDIATR 251 FFLHRM3TTF PDLPILVLVD WUPAGLAILC TFKFGS1GMG LEAYR1ACNV 301 KNIGLRGDDL NLIPEE3LVP LKPKDSQIAK SLLSSKILQE NYIEELSLMV 351 QTGKPAEIEA LYCHGYNZLG KYIATK1VQG KYI

SEQ ID NO:42

RICE OsREC8; Locus# LOC_Os05g50410 CDS Sequence

>LOC_Os05g50410.1

ATGTTCTACTOKMCAGCTCCTCGCGCGGAAGGCTCCGCTCGGCCAGATATGGATGGC X; GCGACGCOTCACrcreAAGATCAACCGGAAGaHKnTGACAAt^

TGTGAGGAGATTTTGAACCCGTCGGTACCCATGGCACTAAGGCTCTCCGGAATTCTC A.TG GGTGGTGTGGCGATCGTGTACGAGAGGAAGOTGAAGGCTCTGTATGATGATGTGTCTCGG TTTCT<3VraGAGATCAACGMMXKATGGC®XCTa\AGC^

CXZCAAGGGCAAAACCCAAGCCAAGTATGAAGCAGTAACACTGCCAGAGAATATCAT GGAT ATGGATGTGGAGCAGCCCATGCTTTTCTCAGAGGCTGATACTACAAGGTTCCGGGGAATG CGTmGGfVGGArTTGGMGfMXyUVrACATTAATGTCAACCTAGACGATGATGACTTCTto OXIGCTGAGAATCATCACCAAGCTGATGCAGAAAATATCACCCTGGCTGArAATTTCGGG TCTGGGCTTGGAGAGACTGATGTGTTCAATCGTTTTGAGAGATTCGACATAACAGATGAT GATGCAACTTTCAATGTCACTCCT^TGGACACCCACAGGTTCCAAGTAATCTGGTTCCT TCTCCACCTAGGCAGGAAGMVCTCTCCTCAGCAACAAGAAAACCATCATGCTGCXn'CAT CX: CXTTCTTCACGAAGAAGCrrcyVKCAAGGGGGGGaATCTGTAAAAAATGAGCAAGAGCAG CAG AAGATGAAGGGTGAGCAACCTGCTAAATCATCAAAGAGAAAAAAACGTAGGAAAGATGAT GAGGTGATGATGGArAACGACCAGArAATGATCX:CAGGAAATGTATATCAAACATGGCT G AAGGATCCATCAAGCCTXyVCTACCAAAAGGCACAGU^ATCAACAGTAAAGTTAATCTTA TT C^GTCAATCAAGATAAGAGACCTCATGGACTTGCCCCTCGTTTCTCTAATATCTTCXrrT G GAGAAGTCACarCTAGAATTTTATTATCCTAAGGAACTTATGCAGCTTTGGAAGGAATGT ACTGAAGTCAAGTCCCCAAAAGCrrcCATCTTCAGGAGGGCAGCAGTCATCATCACCAGU KA CAACAGCAAAGAAACTTGCCTCCTCAGGCATTTCCAACCCAGCCTCAGGTTGATAATGAC AGGGAAATGGGATTTCACCCAGTGGACTTTGCAGATGACATCGAAAAACTCCGAGGAAAC ACTAGTGGGGAATATGGAAGAGATTATGATGCTTTTCAOXGTGATCATAGTGTTACTCCT GGAAGTCCTGMXKTAACTCGCAGOTXn'GCTTCAAGXnX^

ACGCAGTTGGATCCAGAAGTACAGTTGCCATCCGGAAGGTCCAAGAGGCAGCATTCA TCT GGAAAAAGCTTTGGGAACCTCGATC(y^GTTGAAGAAGAATTCCCATTCGAGCAAGAACT T AGAGATTTCAAGATGAGAAGGCmTCAGATGTTGGGCCAACTCCAGACCTGCTGGAAGUKA ATCGAACXrTACTCAAACCCCATATGAAAAGAAATCCAATCX^ATCGACCAGGTCACACA A TCAATCCACTCGTACCTCAAGCTACACTTTGACACCCCAGGGGCCTCACAGTCTGAATCA TTAAGTCAGCTAGCACATGGGArGACTACAGCAAAGGCTGCCCGACTCTTCTATCAAGCA TGCGTTTTAGX3KACTCATGATTTTATCAAGGrCTAACX3MCT®yU^ TTGATCTCGAGGGGACCAAAGATGTGA

SEQ ID NO:43

RICE OsREC8; Locus# LOC_Os05g50410 Protein Sequence

MFYSHQLLARKAPLGQIWMAATLHSKINRKRLDKLDI I KICEEI LNPSVPMALRLSGI IM GGVAIVYERKVKALYDDVSRFLIEINEAWRVKPVADPTVLPKGKTQAKYEAVTLPENIMD MDVEQPMLFSEADTTRFRGMRLEDLDDQYINVNLDDDDFSRAENHHQADAENITLADNFG SGLGETDVFNRFERFDITDDDATFNVTPDGHPQVPSNLVPSPPRQEDSPQQQENHHAASS PLHEEAQQGGASVKNEQEQQKMKGQQPAKSSKRKKRRKDDEVMMDNDQIMI PGNVYQTWL KDPSSLITKRHRINSKVNLIRSIKIRDLMDLPLVSLISSLEKSPLEFYYPKELMQLWKEC TEVKSPKAPSSGGQQSSSPEQQQRNLPPQAFPTQPQVDNDREMGFHPVDFADDIEKLRGN

TSGEYGRDYDAFHSDHSVTPGSPGLSRRSASSSGGSGRGFTQLDPEVQLPSGRSKRQ HSS

GKSFGNLDPVEEEFPFEQELRDFKMRRLSDVGPTPDLLEEIEPTQTPYEKKSNPIDQ VTQ SIHSYLKLHFDTPGASQSESLSQLAHGMTTAKAARLFYQACVLATHDFIKVNQLEPYGDI LI SRGPKM*

SEQ ID NO:44

Arabidopsis REC8; Locus# AT5G05490 CDS Sequence

>AT5G05490.1

SEQ ID NO:45

Arabidopsis REC8; Locus# AT5G05490 Protein Sequence

J. MLRLES LIVT VRGRATLLAR KAFJ GQ1 vVL'LL ?J? I HAKLRRK FL I

51 GLEI LR PSVR GGVV1VYFRK 1

101 TKSVRRPTL1 PKGKFHARKE AVTLPER FEA LEGRFEQTRR ' 151 QQTFI SMPLL ESHVNRNREP EELGQQEHQA PAEN ITLFEY i

201 IDRFEPFDI E GLYETQLlll SN PREGAEI PTT LI PSP?RH!-!O J 251 ORQEQQENRR DGFAEQL1EEQ li I RGKEEHRR RQPAKKRARR 301 T1 IAGRVYQE WLQDT3D1 LC RGEKRKVRGT IRPRAESEKR } 351 KL3SYRPQLY QLWSK’iTQVL QTSSSESRHJ? ELRAEQSPGF > 401 TDHHERSDTS SQHLDSPAEI LkTVRTGKGA SVESMHAGSR ASPETINRQA

451 ADINVTPFYS GDDVRSMPST PSARGAASIN MIEISSKSPM PNRKRPNSSP

501 RRGLEFZAEE RPWEHP.EYEF EFSMLPEKP.F TADKEILFET ASTQTQKPVC

551 NQSDEM1TDS IKSHLKTHFE TPGAPQVESL NKLAVGMDPN AAAKLFF23C

601 VLATRGVIKV NQAEPYGDIL IARGPUM

SEQ ID NO:46

Arabidopsis Gene Name: TAM1 (TARDY ASYNCHRONOUS MEIOSIS1); Locus# AT1G77390 CDS Sequence

>AT1G77390.1

1 ATGTCTTGTT CGTCGAGAAA TCTATCTCAG GAGAATCCGA TTCCTCGTCC

51 GAACTTAGCC AAGACTCGAA CCTOACTCCG CGATGTTGGA AACCGTCGTG 101 CTCCCCTCGG CGACATCACA AATCAGAAGA ATGGATCTAG AAATCCTTCA 151 CCGTCGTCTA CTCTGGTGAA TTGTTCAAAT AAGATCGGCC AATCTAAGAA 201 AGCACCAAAA CCTGCTTTAX CTCGTAATTG GAATTTGGGA ATTCTCGATT 251 CC3GTTTACC TCCOAASCCA AATGCGAAAT CAAACAIAAT CGTTCCTtAC 301 GAAGAGACCG AATTGCTCGA AAGCGATGM ACTCTTCTAT GTTCTTGACC 351 TGCATTATCC TTGGATGCCI' CTCCTACTCA ATCTGACCCG TCAATTTCCA 401 CTOATGACTC TTTGACGAAC GACGTTGTAG ATTACATGGT CGAGAGCACT 451 ACTGATGATG GAAATGATGA TGATGATGAT SAfiATTGTTA ACATTGATAG 501 TGACTTGATG GATCCACASC TTTGTGCTTC TTTTGCTTOT GATATCTACG 551 AGCATTTGCG TGTATCTGAG GTGAACAAAA G3UXGGCTCT AGATTAOA.TG 601 GAAAGAACTC AGTOAGCAT CAATGCTAGC ATGCGTTCTA TACTGATTGA 651 CTGGCTTGTG GAGGTTGCTG AAGACFTATAS GCTTTCGCCC GAGACGTTGT 701 ATTTGGCAGT AAACTACGTT GATCGGTATG TTACAGGAAA TGCAATCAAC 751 AAGCAAAATC TGCAGCTACT TGCTGTTACC TGCATGATGA TAGCAGCAAA 801 ATATCAAGAA CTCTCTCTCC CCCAACTGCA CGATTTCTGT TACATCACTG 851 ATAACACATA CTTAAGAAAT GAGCTTTTGG AGATGGAGTC TTCTGTTCTG 901 AACTAGTTGA AGTTCGAATT AACAACTCCA ACAGCAAAAT GTTTCTTGAG 951 GCGCTTTCTT CGTGCTGCTC AAGGCAGAAA GGAGGTACCA TOACTGCTGT 1001 CTGAGTGTCT GGCCTGCTAT CTCACCGAAT TATCGCTGTT AGATTACGCT 1051 ATGCTTCGAT ACGCTCCATC ACTTGTTGCA GCCTCTGCAG TTTTCTTGGC 1101 ACAATACACT CTACACCCTT CAAGAAAACC ATGGAATGCT ACGCTACStoC 1151 ATTACACATC GTAO.GGGCT AAACATATGG AAGCATGCGT TAAGMTCTr 1201 GTrOAGCTGT GTAATGA6AA ACTCTCATCT GATGTGGTTG CAATCAGAAA 1251 5AAGTACAGT CAACACAAAT ACAAGTTTGC: AGCAAAGAAG CTTTGTCCCA 1301 CGTCACTACC GCAAGAGCTT TTCCTCTGA

SEQ ID NO:47

Arabidopsis TAM1 Protein Sequence

>AT1G77390.1

1 MSS3SRNL30 BHPIPRPNLA KTRTSLRDVG NRRAPLGDIT NQKNGSRHPS

51 P3STLVHCSN KIGQ3KKAPK PALSRHWHLG ILDSGLPPKP NAKSNIIVPY 101 EDTELLQSDD SLLC3SPALS LDASPT'jSDP SISTHDSLTN HWDYMVEST 151 TDCGtiDDDDD EIVNIDSDLM DPQLCASFAC C17EHLRVSE VNKRPALDYM 201 ERTQ3SIHA3 MRS1LIDWLV EVAEEZRLSP ETLYLAVtIZV DRYLTGNAIN 251 KQHLQLLGVT CW41AAKXEE VCVPQVEDFC Y1TONTYLRN ELLEMESSVL 301 MiLKlE-.rrP TAKCFLRRFL HAAQGPJUSVP SLLSECLACY LTELSLLDYA 351 MLRYAP3LVA A3AVFLAQYT LHPSRKPWHA TLEHYTSYPA KHMEACVKHL 401 LQLCHEKLS3 DWAIRKKYS QHKYKFAAKK LCPT3LPQEL FL

SEQ ID NO:48 cyclin-A1; LOCUS# LOC_Os12g20324 CDS Sequence

>LOC_Os12g20324.3

ATGTCGACGTGCGACTCAATGAAAAGCCCAGA.CTTTGAGTATATTGATAATGGGGA TTCX: TCCTXy«^CTAGGTTCXnrrGa^GAAGAGXy\AAa3AGAACCTGCGTATCTC^^ AGAGATGTTGAAGU^AACTAAGTGGAAGAAGGATGCTCCTTCCCCAATGGAAATCGACCA A ATTTGTGATGTTGACAATAACTACGAGGATCCGCAGTTGTGTGCTACTCTTGCTTCTGAT ATCTACMGCACTTGCGKGMVGGCaMVGACa\GGM\AACATCCATCAACCGATT^ ACACTCCAAAAGGATGTAAACCCAAGCMGAGAGCGATCCTGMAGACTGGCTTGTGGAA GTCGCTGAAGAATATCGTCTTGTTCCTGATACATTATACCTGACAGTTAACTACATTGAC CGTTATCTTTCTGGCAATGAGATCAATCGTCAAAGACTGCAATTACTTGGAGTTGCTTGT ATGCTTATTGCTGCAAAATACAAGGAGATATGTGCACCTCAAGTAGAAGAATTCTGCTAT ATAACTGACAACACATACTTCAGAGATGAGGTTTTGGAAATGGAAGCTTCTGTCCTGAAT TACCTGAAGTTTGAAATGACTGCACCTACAGCAAAATGCTTTTTGAGGAGATTTGTCCGT GTTGCACAAGTATCTGATGAGGATCCAGCATTGCATCTTGAGTTCCTAGCCAATTATGTT GCTGAGCTATCACTGCTGGAGTACAATCTACTTTCTTACCCTCCTTCACTAGTAGCGGCA TCAGCTATTTTCCTGGCCAAATTCATACTGCAGCCAGCAAAGCACCCTTGGAATTCCACC CTTGCTCACTACACACAATACAAGTCGTCAGAGTTAAGCGACTGCGTTAAGGCATTGCAC CGCCTTTTCTGTGTTGGTCCTGGGAGTAACCTTCCTGCAATCAGGGAGAAGTATACCCAA CATAAGTACAAATTTGTGGCGAAGAAGCCCTGCCCACCCTCAATACCGACCGAATTCTTT CGCGACTCAACATGCTGA

SEQ ID NO:49 cyclin-A1; LOCUS# LOC_Os12g20324 Protein Sequence

MSTCDSMKSPDFEYIDNGDSSSVLGSLQRRANENLRISEDRDVEETKWKKDAPSPME IDQ ICDVDNNYEDPQLCATLASDIYMHLREAETRKHPSTDFMETLQKDVNPSMRAILIDWLVE VAEEYRLVPDTLYLTVNYIDRYLSGNEINRQRLQLLGVACMLIAAKYKEICAPQVEEFCY ITDNTYFRDEVLEMEASVLNYLKFEMTAPTAKCFLRRFVRVAQVSDEDPALHLEFLANYV AELSLLEYNLLSYPPSLVAASAI FLAKFILQPAKHPWNSTLAHYTQYKSSELSDCVKALH RLFCVGPGSNLPAIREKYTQHKYKFVAKKPCPPSI PTEFFRDSTC

SEQ ID NO:50 cyclin-A1; LOCUS# LOC_Os05g14730 CDS Sequence

>LOC_Os05g14730.1

ATGTCTAAGGAAGATGCTATGTCAACTGGTGATTCAACGGAAAGCCTTGATATTGAT TGC CTTGATGATGGGGACTCCGAAGTGGTATCTTCCTTGCAACATTTGGCAGATGATAAGCTT CATATTTCTGACAACAGGGATGTTGCAGGTGTGGCATCCAAATGGACGAAGCATGGTTGT AATTCAGTAGAAATTGATTATATCGTCGACATTGACAACAACCATGAGGATCCACAGCTG TGTGCAACTCTTGCTTTTGACATTTACAAGCACTTGCGAGTGGCTGAGACCAAGAAAAGG CCTTCAACAGATTTTGTGGAAACCATTCAGAAGAACATTGACACAAGCATGAGGGCAGTG TTAATAGACTGGCTTGTGGAAGTCACAGAAGAATATCGGCTTGTACCTGAAACCTTATAC CTCACAGTCAATTACATTGACCGGTATCTCTCGAGCAAGGTGATCAATCGGCGGAAAATG CAATTACTTGGTGTCGCTTGCCTGCTTATAGCTTCTAAGTATGAAGAGATATGCCCACCC CAAGTAGAAGAGCTCTGCTATATTTCTGACAATACATACACTAAGGATGAGGTTTTGAAA

ATGGAAGCTTCTGTCCTGAAATACTTGAAGTTTGAGATGACTGCACCTACAACAAAA TGC TTTTTGAGGAGATTTCTACGAGCTGCTCAAGTATGCCATGAGGCTCCAGTTTTGCATCTT GAGTTCCTAGCTAATTACATTGCGGAGCTATCACTTCTGGAGTACAGCTTAATTTGCTAT GTACCGTCACTTATAGCTGCGTCTTCTATTTTCTTGGCGAAGTTTATCCTTAAGCCAACA GAGAATCCTTGGAATTCAACACTTTCATTCTACACACAATACAAACCATCCGACCTATGC AATTGTGCAAAAGGACTACACCGGCTTTTCTTGGTTGGCCCTGGAGGCAACCTTCGAGCA GTTAGAGAAAAATACAGT CAACACAAGTACAAATT CGTAGCAAAGAAGTACT CT CCACCA T C AAT T C C AG CAG AGT T T T T C GAAG AT C CAAG CAG CT ACAAG C C T GAT T AA

SEQ ID NO:51 cyclin-A1; LOCUS# LOC_Os05g14730 Protein Sequence

MSKEDAMSTGDSTESLDIDCLDDGDSEWSSLQHLADDKLHI SDNRDVAGVASKWTKHGC NSVEIDYIVDIDNNHEDPQLCATLAFDIYKHLRVAETKKRPSTDFVETIQKNIDTSMRAV LIDWLVEVTEEYRLVPETLYLTVNYIDRYLSSKVINRRKMQLLGVACLLIASKYEEICPP QVEELCYI SDNTYTKDEVLKMEASVLKYLKFEMTAPTTKCFLRRFLRAAQVCHEAPVLHL EFLANYIAELSLLEYSLICYVPSLIAASSI FLAKFILKPTENPWNSTLSFYTQYKPSDLC NCAKGLHRLFLVGPGGNLRAVREKYSQHKYKFVAKKYSPPSI PAEFFEDPSSYKPD

SEQ ID NO:52 cyclin-A1; Locus# LOC_Os01g13229 CDS Sequence

>LOC_Os01g13229.1

ATGGCCTTGGTGTGTGCGGAATGCCGTCTTGTTGACATCGTCCGGGTGCATGCGAAC CTG ATGGTACCCGAAATGGAGATCCAGCTTGGAGAAGAAGTGGGCGTCGCAAAGTTCATCAAG CAATTCGTCGAGAACCAAAATGGGAAACATGTCCTTGGCAGTCACCGTGTTGAGGGCTCG GTAGTCCACGTAGAAGCGCCATGGCTTGTGGACCAACAGAACCGACGACGAGAATGGCGA CGTGTTGGGGCAGATGGAGAGCTTGTCAGCGCATTATTTCTCCAATTCGTCGTTTTGAAG CTGTGGGTAGCGATACGGTCGAACCACTACCGGGGGCTTGCCGGGCTTGAGGTGGATGTG ATGATTGATGGATCGGGTCGGTGGGAGGCCGGTTGGAGCGCGAAGAGGTCGTCAAATTCC GCCAGCAAGGACGGAAGGAGATCGTCGGCGTGCAGTGCAGCGCAGCTGAGGTCGAGGGCG AAGCAGTCGACGTCGAACACCTCGGTGGCCACCTACAGGCGAAGCCCGTGCACGACACCC GCGCCGCGGATGCGATCGTTGTCAGCCACCACCACGCGGAGACCTTGGCGTACCGGTTGG AGAGGTAGGCCGATGCGCTGGGCCAGCTTGGTGGTGATGAAATTGTGTGTTGAAGCCGGA GTCAACAAGGGCCATCACCTGCTCGGCCGCGATACGCGCCACGAGGCACATCGTGCTAGT GCCATTGACACCGAAAATCGCGTTGAGCGAGATGTGTGGTTCATCGGTGTCGTCTGTGTC GTCCCATTCGCTGAAATTAGCCGTGGTGTCGTCGTACTCAAGAGGCCTTGTTTACAGCGC TCTGGTAACTCTGCTGGCGAGAGACAACGAAATTGGTGTCCTGGCACGAGCCAAGGCCAT GGCCTCCTGGAAATCCTCGGGCTGTTGGAGCTCGACATCGATGCAGATGTCGTCGGTGAG CCCCGCTGTGAAGAGCTGAACTTGTTGGCGGTCGGTGAGGAGGTCGGACGTGCGGCTACC CAACGCCAAAAGGCGCTGCTGGTACGTCGTCACCGTGCCAACCTGAGTGAGATGCTTGAG TTCACCCAGGGAATTGTTGCGGATCGGCGGTCCATACTGACCGTAACAGAGCTGCTTGAA CGTATCCCAATCCGGAGGACCCATGTGCTGCTCATAATGGAAGTACCATTCCTGTGCAGC TCAAGTGAGATGACCAGGAAACGTCCATCAACTGATTTTATGGAAACAATCCAAAAGGAT GT A7XAC C C AAGC AT GAGAG C GAT OCT GAT AGACT GGCTTGTG GAAGT C GC T GAAGAAT AT CGTCTTGTTCCTGATACATTATACCTGACAGTTAACTACATTGATCGTTATCTTTCTGGC AATGAGATCAATCGTCAAAGACTGCAATTACTTGGAGTTGCTTGTATGCTTATTGCTGCA AAATACGAGGAGATAT GT GCAC CTCAAGTAGAAGAATT CTGCTATATAACT GACAACACA TACTTCAGAGATGAGTGCTGGAATGAATCGAACTCTAATAACTCTCTTATTGCCTACAAC AGGAGATTTGTCCGTGTTGCACAAGTATCGGATGAGCTTTTCATCGTGCAGGATCCAGCA TTGCATCTTGAGTTCCTAGCCAATTATGTTGCTGAGCTATCACTGCTGGAGTACAATCTA CTTTCTTACCCTCCTTCACTAGTAGCGGCATCGGCTATTTTCTTGGCCAAATTCATACTG CAGCCAACAAAGCACCCTTGGAATTCCACCCTTGCTCACTACACACAATACAAGTCGTCA GAGTTAAGCGACTGTGTAAAGGCATTGCACCGCCTTTTTAGCGTTGGTCCCGGGAGTAAC CTTCCTGCAATCAGGGAGAAGTATACCCAACATAAGATACTGCATGCAGCTGATGTGATC 
GACTTGAACATGGCAAATGCATTTAAGAATGTGAAAATATTATGTCAATGTCCCTGTCAA TGCAACCTTCTTGAAGAAGTCATGCTCAAGCTATTTCCATACTGGAAGCTAAGCACAGCT GTTTAG

SEQ ID NO:53 cyclin-A1; Locus# LOC_Os01g13229 Protein Sequence

MALVCAECRLVDIVRVHANLMVPEMEIQLGEEVGVAKFI KQFVENQNGKHVLGSHRVEGS WHVEAPWLVDQQNRRREWRRVGADGELVSALFLQFWLKLWVAI RSNHYRGLAGLEVDV MI DGSGRWEAGWSAKRSSNSASKDGRRSSACSAAQLRSRAKQSTSNTSVATYRRS PCTTP APRMRSLSATTTRRPWRTGWRGRPMRWASLWMKLCVEAGVNKGHHLLGRDTRHEAHRAS AI DTENRVERDVWFI GWCWPFAEI SRGVWLKRPCLQRSGNSAGERQRNWCPGTSQGH GLLEI LGLLELDI DADWGEPRCEELNLLAVGEEVGRAATQRQKALLVRRHRANLSEMLE FTQGIVADRRSILTVTELLERI PIRRTHVLLIMEVPFLCSS SEMTRKRPSTDFMETIQKD VNPSMRAI LI DWLVEVAEEYRLVPDTLYLTVNYIDRYLSGNEINRQRLQLLGVACMLIAA KYEEI CAPQVEEFCYITDNTYFRDECWNESNSNNSLLAYNRRFVRVAQVSDELFIVQDPA LHLEFLANYVAELSLLEYNLLSYPPSLVAASAI FLAKFI LQPTKHPWNSTLAHYTQYKSS ELSDCVKALHRLFSVGPGSNLPAIREKYTQHKILHAADVIDLNMANAFKNVKILCQCPCQ CNLLEEVMLKLFPYWKLSTAV*

SEQ ID NO:54 cyclin-A1; Locus# LOC_Os12g31810 CDS Sequence

>LOC_Os12g31810.1

ATGGCTGGAAGGAAGGAAAATCCGGTGCTTACTGCTTGCCAAGCACCCAGTGGTCGA ATC ACACGAGCTCAAGCTGCTGCAAATCGTGGACGGTTTGGGTTTGCTCCCTCCGTATCACTA CCCGCAAGAACTGAACGAAAGCAGACAGCAAAAGGAAAGACAAAAAGGGGAGCTTTGGAT GAAATCACTAGTGCAAGTACTGCAACTTCAGCTCCTCAGCCTAAACGGCGCACAGTGCTC AAGGATGTAACCAACATCGGCTGTGCCAACTCATCCAAAAATTGCACCACCACGAGCAAG CT GCAGCAAAAGT CAAAGC CCACCCAAAGGGT GAAACAAAT CCC GAGCAAAAAGCAGT GT GCAAAGAAGGTTCCTAAGCTACCCCCTCCGGCTGTTGCTGGAACTTCATTTGTGATTGAT TCTAAAAGTTCTGAAGAAACTCAAAAGGTGGAGCTTTTGGCAAAAGCAGAGGAACCCACA AATTTGTTTGAAAACGAGGGGTTACTGTCATTGCAGAATATTGAGCGAAACAGGGACAGT AATTGCCATGAGGCATTCTTTGAGGCAAGAAACGCCATGGATAAACATGAACTCGCTGAC TCCAAGCCTGGTGACTCTAGTGGTTTAGGTTTTATAGATATTGACAATGATAATGGAAAT CCTCAAATGTGTGCTTCCTATGCTTCAGAGATATACACAAATCTGATGGCCTCTGAGCTT AT CAGAAGAC CCAGGT CAAATTACAT GGAGGCTTT GCAACGT GACAT CACAAAGGGCATG CGAGGCATTCTCATTGATTGGCTTGTTGAGGTTTCTGAAGAATATAAGCTTGTGCCAGAC ACACTCTACCTAACCATTAATCTTATTGACCGATTTCTTTCTCAACATTATATTGAAAGA CAGAAACTCCAACTTCTTGGAATAACAAGCATGCTGATTGCCTCGAAATATGAAGAGATA TGTGCTCCTCGTGTTGAAGAATTTTGTTTCATAACTGACAATACATACACAAAAGCTGAG GTGCTGAAAATGGAGGGCCTGGTGCTTAATGATATGGGGTTTCATCTATCTGTTCCAACA ACAAAAACATTTCTCAGGAGATTCCTTAGAGCCGCACAGGCTTCTCGTAATGTTCCTTCA ATTACCTTGGGATATCTGGCCAATTATCTTGCAGAGCTGACCCTGATCGATTACAGTTTC CTCAAATTTCTTCCTTCAGTGGTGGCAGCATCTGCAGTCTTTCTTGCAAGATGGACACTT GACCAATCTGACATTCCATGGAATCATACTCTTGAGCACTACACTTCTTACAAAAGCTCT GATATTCAAATATGTGTCTGTGCTCTACGGGAACTGCAGCATAACACCAGTAATTGCCCT CT CAATGCTATAC GT GAAAAGTATAGGCAACAAAAGTTT GAGTGT GTAGC CAACCTGACA TCACCGGAGCTGGGGCAGTCACTCTTCAGCTGA

SEQ ID NO:55 cyclin-A1; Locus# LOC_Os12g31810 Protein Sequence

MAGRKENPVLTACQAPSGRITRAQAAANRGRFGFAPSVSLPARTERKQTAKGKTKRG ALD EITSASTATSAPQPKRRTVLKDVTNI GCANSSKNCTTTSKLQQKSKPTQRVKQI PSKKQC AKKVPKLPPPAVAGTS FVI DSKS SEETQKVELLAKAEEPTNLFENEGLLSLQNI ERNRDS NCHEAFFEARNAMDKHELADSKPGDS SGLGFI DIDNDNGNPQMCASYASEIYTNLMASEL IRRPRSNYMEALQRDITKGMRGI LIDWLVEVSEEYKLVPDTLYLTINLI DRFLSQHYI ER QKLQLLGITSMLIASKYEEI CAPRVEEFCFITDNTYTKAEVLKMEGLVLNDMGFHLSVPT T KT FL RRFL RAAQAS RNVP S I T L GYLAN YLAE LTLIDYS FLKFLP S WAAS AVFLARWTL DQSDI PWNHTLEHYTSYKS SDIQICVCALRELQHNTSNCPLNAI REKYRQQKFECVANLT SPELGQSLFS *

SEQ ID NO:56 cyclin-A1; Locus# LOC_Os01g13260 CDS Sequence

>LOC_Os01g13260.1

ATGTCGAGCAACCTAGCAGCCTCCCGCCGCTCGTCGTCGTCGTCCTCGGTGGCGGCG GCG GCGGCGGCGAAGCGACCCGCGGTGGGGGAGGGAGGAGGAGGAGGAGGAGGGAAGGCGGCA GCGGGCGCCGCCGCGGCAAAGAAGCGCGTGGCGCTTAGCAACATCAGCAACGTCGCCGCT GGTGGTGGCGCCCCAGGGAAGGCCGGCAATGCGAAGTTGAATTTAGCTGCCTCAGCTGCA CCAGTGAAGAAGGGATCTTTGGCCAGTGGCCGCAATGTGGGCACGAATCGGGCCTCGGCG GTGAAATCGGCTTCCGCCAAGCCGGCTCCGGCCATATCCCGCCATGAGAGCGCCACACAG AAGGAGTCTGTTCTTCCTCCTAAAGTGCCTAGCATTGTGCCGACTGCTGCACTGGCACCT GTCACTGTACCCTGCAGCAGCTTCGTCTCCCCTATGCATTCAGGAGATTCAGTTTCGGTT GACGAGACGATGTCGACGTGTGACTCAATGAAAAGCCCAGAATTTGAGTACATTGATAAT GGGGATTCCTCCTCAGTTCTAGGTTCCTTGCAGCGAAGAGCAAACGAAAACCTGCGTATC TCAGAGGATAGAGATGTCGAAGAAACTAAGTGGAAGAAGGATGCTCCTTCCCCAATGGAA ATCGACCAAATTTGTGATGTTGACAATAACTACGAGGATCCGCAGTTGTGTGCTACTCTT GCTTCTGATATCTACATGCACTTGCGCGAGGCTGAGACCAGGAAACGTCCATCAACTGAT TTTATGGAAACAATCCAAAAGGATGTAAACCCAAGCATGAGAGCGATCCTGATAGACTGG CTTGTGGAAGTCGCTGAAGAATATCGTCTTGTTCCTGATACATTATACCTGACAGTTAAC TACATTGATCGTTATCTTTCTGGCAATGAGATCAATCGTCAAAGACTGCAATTACTTGGA GTTGCTTGTATGCTTATTGCTGCAAAATACGAGGAGATATGTGCACCTCAAGTAGAAGAA TTCTGCTATATAACTGACAACACATACTTCAGAGATGAGGTTTTGGAAATGGAAGCTTCT GTCCTGAATTACCTGAAGTTTGAAGTGACTGCACCTACAGCAAAATGCTTTTTGAGGAGA TTTGTCCGTGTTGCACAAGTATCGGATGAGGATCCAGCATTGCATCTTGAGTTCCTAGCC AATTATGTTGCTGAGCTATCACTGCTGGAGTACAATCTACTTTCTTACCCTCCTTCACTA GTAGCGGCATCGGCTATTTTCTTGGCCAAATTCATACTGCAGCCAACAAAGCACCCTTGG AATTCCACCCTTGCTCACTACACACAATACAAGTCGTCAGAGTTAAGCGACTGTGTAAAG GCATTGCACCGCCTTTTTAGCGTTGGTCCCGGGAGTAACCTTCCTGCAATCAGGGAGAAG TATACCCAACATAAGTACAAATTTGTGGCGAAGAAGCCCTGCCCACCCTCAATACCGACC GAATTCTTTCGCGACGCAACATGCTGA

SEQ ID NO:57 cyclin-A1; Locus# LOC_Os01g13260 Protein Sequence

MS SNLAAS RRS S S S S S VAAAAAAKRPAVGEGGGGGGGKAAAGAAAAKKRVAL SN I SNVAA GGGAPGKAGNAKLNLAASAAPVKKGSLASGRNVGTNRASAVKSASAKPAPAI SRHESATQ KESVLPPKVPSIVPTAALAPVTVPCS SFVS PMHSGDSVSVDETMSTCDSMKS PEFEYI DN GDSSSVLGSLQRRANENLRI SEDRDVEETKWKKDAPSPMEI DQI CDVDNNYEDPQLCATL ASDIYMHLREAETRKRPSTDFMETIQKDVNPSMRAILI DWLVEVAEEYRLVPDTLYLTVN YI DRYLSGNEINRQRLQLLGVACMLIAAKYEEICAPQVEEFCYITDNTYFRDEVLEMEAS VLNYLKFEVTAPTAKCFLRRFVRVAQVSDEDPALHLEFLANYVAELSLLEYNLLSYPPSL VAASAI FLAKFILQPTKHPWNSTLAHYTQYKS SELSDCVKALHRLFSVGPGSNLPAI REK YTQHKYKFVAKKPCPPSI PTEFFRDATC*

SEQ ID NO:58

Cyclin-A3; Locus# LOC_Os12g39210 CDS Sequence

>LOC_Os12g39210.1

ATGGCTGACAAGGAGAACTCCACCCCGGCCTCCGCGGCGCGGCTCACCCGCTCGTCT GCG GCGGCTGGGGCGCAGGCCAAGCGTTCGGCCGCCGCGGGCGTCGCCGACGGTGGCGCGCCG CCGGCGAAGAGGAAGCGCGTCGCGCTCAGCGACCTCCCGACCCTCTCCAACGCCGTCGTC GTCGCCCCCCGCCAGCCGCACCACCCCGTCGTCATCAAGCCGTCGTCCAAGCAGCCCGAG CCCGCCGCGGAGGCGGCGGCGCCCAGCGGCGGCGGCGGCGGCTCCCCCGTGTCATCCGCG TCGACGTCGACGGCGTCGCCCTCCTCCGGTTGGGACCCGCAGTACGCCTCCGACATCTAC ACCTACCTCCGATCCATGGAGGTGGAGGCGCGGAGGCAGTCGGCGGCGGACTACATCGAG GCGGTGCAGGTGGACGTGACGGCGAACATGCGGGCCATCCTCGTGGACTGGCTGGTGGAG GTCGCCGACGAGTACAAGCTCGTCGCCGACACGCTCTACCTCGCCGTCTCCTACCTCGAC CGCTACCTCTCCGCCCACCCGCTCAGGCGCAACAGGCTGCAGCTCCTCGGCGTCGGCGCC ATGCTCATCGCTGCGAAGTACGAGGAGATTAGCCCTCCTCATGTGGAGGATTTCTGCTAC ATCACTGATAATACGTACACTAGGCAGGAGGTTGTCAAGATGGAGAGCGACATACTCAAG CTTCTCGAGTTCGAGATGGGCAATCCTACCATCAAGACATTCCTCAGGCGGTTCACGAGA TCTTGCCAGGAAGACAAAAAGCGCTCCAGCTTGTTATTGGAGTTCATGGGGAGTTATCTT GCTGAGCTTAGTCTACTTGACTACGGCTGTCTCCGGTTCTTGCCATCGGTGGTTGCTGCC TCAGTGGTGTTTGTTGCTAAACTGAACATTGATCCGTACACCAATCCTTGGAGCAAGAAG ATGCAGAAGTTGACAGGATACAAGGTGTCTGAACTGAAGGATTGCATCTTGGCCATTCAT GACTTGCAGCTCAGAAAAAAATGTTCAAACTTAACTGCAATTCGCGACAAGTACAAGCAA CACAAGTTCAAGTGTGTCTCAACATTGCTTCCCCCTGTTGATATCCCTGCGTCATACCTC CAAGATTTAACAGAGTAG

SEQ ID NO:59

Cyclin-A3; Locus# LOC_Os12g39210 Protein Sequence

MADKENSTPASAARLTRSSAAAGAQAKRSAAAGVADGGAPPAKRKRVALSDLPTLSNAW VAPRQPHHPWIKPSSKQPEPAAEAAAPSGGGGGSPVSSASTSTASPSSGWDPQYASDIY TYLRSMEVEARRQSAADYIEAVQVDVTANMRAILVDWLVEVADEYKLVADTLYLAVSYLD RYLSAHPLRRNRLQLLGVGAMLIAAKYEEISPPHVEDFCYITBNTYTRQEWKMESDILK LLEFTMQiPTIKTFLRRFTRSCQEDKKRSSLLLEFMGSYLAELSLLDYGCLRFLPSWAA SWFVAKLNIDPYTNPWSKKMQKLTGYKVSELKDCILAIHDLQLRKKCSNLTAIRDKYKQ

HKFKCVSTLLPPVDI PAS YLQDLTE*

SEQ ID NO:60

Cyclin-A3; Locus# LOC_Os03g41100 CDS sequence

>LOC_Os03g41100.1

ATGGCCGGCAAGGAGAACGCCGCGGCGGCGCAGCCCCGCCTCACCCGCGCCGCCGCX yUV; CGCGCGGCCGCCGTCACCGCCGTGGCCGTCGCCGCCAAGCGCAAGCGCGTCGCGCTCAGC GAGCTCC£CAaKrR7rayUtfAACAAC^CCT(CTa^a\AG^

GGCGGCAAGAGGGCCGCCTCCCACXX:CGCCGAGCCCAAGAAGCCAGCTCCGX:ax: CXtoCG CCGGCGGTGGTGGTCGTGGTCGACGACGACGAGGAGGGGGAGGGGGATCCGCAGCTCTGC GCGCCCTACGCCTCCGACATCAACTCCTACCTCCGCTCaXTGGAGGTGCAAGCGAAGCGG CGKKCGGCGGCGGACTACATCGAGACGCTGraVGGTGGACGTGMG^CAACATGCGAGGC ATCCTGGTCGACTGGCTCGTCGAGGTCGCCGAGGAGTACAAGCTCGTCTCCGACACKrcT C TACCTCACCGTCTCCTACATCGACCGCTTCCTCTCCGCCAAATCCATCAACCGCCAGAAG CTGCAGCTCCTCGGCGTCTCCGCCATGCTCATaKCTCGAAGTATGAGGAGATCAGCCCX: ayU\ATGTGGAGGATTTCTGCTATATAACCGACAATACCTATATGUKAACAGGAGGTTGT C AAGATGGAGCGCGATATACTGAATGTTCTCAAGTTTGAGATGGGCAATCCTACAACKAAG ACGTTCCTGAGGATGTTCATCAGATCTAGCCAAGAAGACGATAAGTATCCTAGCCTTCCC TTGGAGTTCATGTGTAGCTATCTTGCCXAGCTGAGCCTGCTGGAGTACGGCTGTGTTCGG CTCirTGC^TCXIG'ITrGTTGCAGCCTCAGTGGTGTTTGTTGCAAGGCTAACCCTTGAT TCA GACACCAATCCTTGGAGCAAGAAGTTGCAAGAGGTGACCGGCTACAGGGCATCTGAGTTG AAGGATTGCATTACCTGCATACA.TGACmGCAGCTAAAa\GGAAAGGGTCATCTCTAATG GCTATCCGKXMUyU^GTACAAGCAACA.TAGGaTCAAGX^CTATCAACA.TTGnTACCX: CCT GTTGAGATCCCTGCATCATACTTCGAAGACCTAAACGAGTAG

SEQ ID NO:61

Cyclin-A3; Locus# LOC_Os03g41100 Protein sequence

MAGKENAAAAQPRLTRAAAKRAAAVTAVAVAAKRKPVALSELPTLSNNNAVVLKPQP APR GGKRAASHAAEPKKPAPPPAPAWVWDDDEEGEGDPQLCAPYASDINSYLRSMEVQAKR RPAADYIETVQVDVTANMRGILVDWLVEVAEEYKLVSDTLYLTVSYIDRFLSAKSINRQK LQLLGVSAMLIASKYEEISPPNVEDFCYITIWTYMKQEWKMERDILNVLKFEMQiPTTK TFLRMFIRSSQEDDKYPSLPLEEUCSYLAELSLLEYGCVRLLPSWAASWFVARLTLDS DTNPWSKKLQEVTGYRASELKDCITCIHDLQLNRKGSSLMAIRDKYKQHRFKGVSTLLPP VEIPASYFEDUiE

SEQ ID NO:62

Arabidopsis Gene Name: DYAD; Locus# AT5G51330, CDS Sequence

1 ATGAGTAGTA CGATGTTCGT GAAACGGAAT CCGATTAGAG AAACCACCGC
51 CGGGAAAATC TCTTCGCCGT CGTCACCGAC TTTGAATGTT GCAGTCGCGC
101 ATATAAGAGC TGGATCTTAT TACGAAATCG ATGCTTCGAT TCTTCCTCAG
151 AGATCGCCGG AAAATCTTAA ATCGATTAGA GTCGTCATGG TGAGCAAAAT
201 CACGGCGAGT GACGTGTCTC TCCGSTACCC AAGCATGTTT TCACTCCGAT
251 CGCATTTCGA TTACAGTAGG ATGAACCGGA ATAAACCGAT GAAGAAGAGG
301 AGTGGTGGTG GTCTTCTTCC TGTTTTCGAC GAGAGTCATG TGATGGCTTC
351 GGAGCTAGCT GGAGACTTGC TTTACASAAG AATCGOACCT CATGAACTTT
401 CTATGAAIAG AAATTCCTGG GGTTTCTGGG TTTCTA6TTC TTCTCGOAGG
451 AACAAAT1TC CAAGAAGGGA GGl'GGTn'CT CAACCGGCGT ACAATACICG
501 TCTCTGTCGC GCTTGCTTCAC CGGAGGGAAA GTGCTCGTCT GAGCTGAAAT
551 CGGGAGGGAT GAI'CAAGTGG GGAAGGAGAT TGCGTGTGCA GTMCAGAGT
601 CGGCATATTG ATACTASGAA GAATAAGGAA GGTGAGGAGA GTTCTAGAGT
651 GAASGATGAA GTTTACAAAG AAGAAGAGAT GGAGAAAGAA GAGGATGATG
701 ATGATGGGAA TGAAATAGGA GGCACTAAAC AAGAGOCAAA GGAGATAACT
751 AATGGAAATC GTAAGAGAAA GCTGATTGAA TCAAGTACTG AGAGACTCOC
801 TCAGAAAGCT AAGGTTTATG ATCAGAAGAA GGAAACTCAA ATTGTGGTTT
851 ATAAGAGGAA ATCAGAGAGG AAGTTCATTG ATAGATGGTC TGTTGAGAGG
901 TACAAACTAG CTGAGAGGAA CATGTTAAAA GTGATGAAGG AGAAGAATGC
951 AGTGTTTGGC AACTCCATAC TCAGGCCAGA GTTGAGGTCA GAAGCAAGGA
1001 AGCTGATTGG TGACACAGGT CTATTGGATG ATCTGCTTAA GO.CATGGCT
1051 GGTAAGGTGG CTCCTGGAGG TCAAGATAGG TTTATGAGAA AGCACAATGC
1101 AGATGGGGCA ATGGAGTATT GGTTGGAGA5 TTCTGATTTG ATTCACAIAA
1151 GGAAAGAAGC AGGAGTTAAA GATCCTTACT tKACTCCTCC AOCTGGTTGG
1201 AAGCTTGCTG M>A£CCTTC TCAAGATCCT GTCTGCGCTG GAGAAAtCCG
1251 TGACATCAGA GAAGAATTAG CTAGCCTGAA AAGAGAATTG AAGAAACTTG
1301 CGTCAAAGAA GGAA6AGGAG GAGCTTGTTA TCATGACTAC GCCTAATTCT
1351 TGTGTTACTA GTCAGAATGA TAXTCTGATG ACT'JCAGCAA AGGAAATCTA
1401 CGCTGATCTG CTGAAAAAGA AATACAAAAT TGAGGACCAG CTAGTGATTA
1451 TTGGAGAAAC CTTGCGTAAA ATGGAGGAAG ACATGGGATG GCTTAAGAAA
1501 ACAGTGGACG AGAACTATCC TAAAAAGCCA GStCTCAACAG AGACACOMT
1551 GCTACTAGAG GATTCACCAC CAATACAGAC ACTAGAAGGA GAAGTGAAGG
1601 TGGTGAACAA GGGTAACCAA ATCACAGAGT CACCTCAAAA CAGAGAAAAA
1651 GGAAGGAAGC ATGATCAACX AGAAAGATC5A CCACTTrCAC TAATAAGCAA
1701 CACTCGTTTC AGAATCTGC31 GGCCTGTGGG GATGTTCGCA TGGCCCC55AT
1751 TGCCTGCTCT TGCTGCTGCT ACTGATACTA ATGCTTCTTC G0CAA6TCAC
1801 A6ACAAGCCT ACCCATCCCC TTTTCCAGTC AAGGCACTTG CAGCTAAGCG
1851 TCCTCTTGGC TTGACGTTTC CCTTCACCAT CATACCCGAA GAAGCTCCCA
1901 AGAATCTCTT CAACGTTTGA

SEQ ID NO:63

Arabidopsis DYAD; Locus #AT5G51330 Protein Sequence

1 MSSTMFVKRH PIRETTACKI SSP3SPTLNV AVAHIRAGSY YEIDASILP2
51 RSPEtILKSIR WMV3KITAS DVSLRZP3MF SLRSHF0YSR MNRWKPMKKR
101 SGGGLLPVfD EStTVMASBLA GDLLYRRIAP HELSMNRIISW GFWVSSSSPR
151 KK1FRREVV3 QPAYNTRLCK AASPEGKCSS ELKSGGM1KW GHRLRVQYQS
201 RHIDTRKHKB GBES3RVKDE VYKEEEMBKE EDDDDGNEIG GTKQEAKEIT
251 NGNRKRKLIE S3TERLAQKA KVYDQKKETQ IWYKPK3ER KFIDRNSVER
301 YKLAERRMLk VMKEKNAVFG N3ILRPELRS EAPKLIGDTG LLDHLLKHMA
351 GKVAPGGODR FMRKHNADGA MEZWLES3DL IHIRKEAGVK DPYWTPPPGW
401 KLGDUP3QDP VCAGEIRDIR EELA3LKREL KKLA3KKEBE ELVIMTTPNS
451 CVTSQNDRLM TPAKEIYADL LKKKYKIEDQ LVIIGETLPX MEEDMGWLKK
501 TVCEHYPKKP D3TETPLLLE D3PPIQTLEG EVKWHKGUQ ITESPQNREK
551 GRKHDQQER3 PLSLISNTGF RICRPVGMFA WPQLPALAAA TDTHA3SP3H
601 RQAYPSPFPV KPLAAKRPLG LTFPFT1IPE EAPKHLFHV

SEQ ID NO:64

Rice homologs of AtDYAD/SWITCH1; LOC_Os12g42820 CDS Sequence

>LOC_Os12g42820.1

ATGACGCCGCCGCCGTCAACGCCTCGCGCGTCATGCGCCGCCGCGCCGCCGGCGAGG AC CTGGGCGACGGCGACTTCTGGGCCGGTGGCGCCCCGCGCCTCTACGACTTCTCCCAGCAG GAGCALGAAGCCGTTCTTGCCGGCGCCGCCGTCGCCCGCGCCCGTCCCGGCGTCGCCGCC G TCCCCCGCGGCGGAGTCGGTGGCGCCGTGCCTCXn'CACGCTGCAGTGCAGCGGCGTCGG G TGGGGGGTCAGGAAGCGGGTCAGGTACGTCGGGAGGCA£:cACCACCTCGCGCGcaACC AC GCTCCCGAACGCGCCGTGGACGCCGCGCXSGGACGACGACGAGGCGAGCTCCGCaAAGGC C AAGAA.CGAGAGCCCGAAGGAGGAGGCGGCGGCGGCAGAAGAAGA.CGA.CGA.CGTCGA ACAC AAGGTGGCGGTGCGCACCAGGTCGGAGGAGAAXyUKGAAGAAGJUiGAGGAGGAAGCGCG GC CSTGGCCOTGTCCGTGGCCA.TGGCGTCGCCAAGCGTCCCAAGAAGGAGGATGAGGAGGG G ACGAAGCTCTCGGCTCCCAAGGCCGAGCAGCTCGAGGAGGAGGAGGAGGGCGCCGCGGTG GCGGCGCCGAGCGGCATGATCGA.CCGGTGGAAGGCGACCCGGTACGCCACCGCGGAGGC G TCGCrGCrCGCCATCATGCGC^CCCACGGCGCGCGCGCCGGGAAGCCCGTCCCGCGCGCG GCGCTGCXSGGAGGAGGCGCGCGCCCACATCGGCGACACGGGCCTCCTCGACCACCTCCT C AGGCACATCGCCGACAAGGTGGCGCCCGGCGGCGCCGAGCGGTTCCGGCGGCGGCACAAC GCCGGCGGCGGGCTGGAGTACTGGCTGGAGCCCGCCGAGCTCGCCGCCGTGCGGCGGAAG GCCGGCGTGGCCGACCCGTACTGGGTGCCTCCTCCCGGATGGAAGCCAGGGGACCCCGTG TCGCCGGAGGGCTACTTGCTGGAGGTGAGGAAGCAGGTGGAGCAGCTCGCCGTTGAGCTC GCCGGCGTCAGAAGGCACATGGATCACCTCACTTCCAATGTGAGTCAAGTGGGCAAGGAA ATCAAATCTGAGGCTGAGAAGTCCTACAATACATGTCAGGGTGGGGACCCACCCTACCTT GACCGGATCTCGATCCGTGCCTTCGCCCGGAAGCTAAGCCGCAGCTCGTCCTTGAGCGGG ACCCAGCGCCGTCCCCCGCCTGACACGGGGGACAGCCCTGTCATTCCCTCATAA

SEQ ID NO:65

SWITCH1; LOC_Os12g42820 Protein Sequence

MTAAAVNASRVMRRRAAGEDLGDGDFWAGGAPRLYDFSQQEQKPFLPAPPSPAPVPASPP SPAAESVAPCLLTLQCSGVGWGVRKRVRYVGRHHHLARHHAPERAVDAARDDDEASSAKA KNESPKEEAAAAEEDDDVEHKVAVRTTSEEKKKKRRRKRGRGRVRGHGVAKRPKKEDEEG TKLSAPKAEQLEEEEEGAAVAAPSGMIDRWKATRYATAEASLLAIMRAHGARAGKPVPRA ALREEARAHIGDTGLLDHLLRHIADKVAPGGAERFRRRHNAGGGLEYWLEPAELAAVRRK AGVADPYWVPPPGWKPGDPVSPEGYLLEVRKQVEQLAVELAGVRRHMDHLTSNVSQVGKE IKSEAEKSYNTCQGGDPPYLDRISIRAFARKLSRSSSLSGTQRRPPPDTGDSPVIPS*

SEQ ID NO:66

Locus #LOC_Os12g42830 CDS Sequence

>LOC_Os12g42830.1

ATGACCGCCGCCGCCGTCAACGCCTCGCGCGTCATGCGCCGCCGCGCCGCAGGCGAGGAC CTGGGCGACGGCGGCGATGGCGACGGCGACTTCTGGGCCGGTGGCGCCCCCCGCCTCTAC GACTTCTCCCAGCAGGAGCAGAAGCCGTTCTTGCCCGCGCCCGCGCCCGCGCCGCCGTCG CCCGCGCCCGTCCCGGCGTCGCCGCCGTCCCCCGCGGCGGAGTCGGTGGCGCCGTGCCTC CTCACGCTGCAGTGCAGCGGCGTCGGGTGGGGTGTCAGGAAGCGGGTCCGGTACGTCGGG AGGCACCACCACCTCGCGCGCCACCACGCTCCCGAGCGCGCCGTGGACGCCGCGCGGGAC GACGACGAGGCGAGCTCCGCCAAGGCCAAGAACGAGAGCCCGAAGGAGGAGGCAGCGGCG GCAGAAGAAGACGACGACAACGTCGAACACAAGGTGGCGGTGCCCACCACGTCGGAGGAG AAGAAGAGGAGGAGGAGGAGGAAGCGTGGCCGTGGCCGTGTCGGTGGCCATGGCGTCGCC AAGCGTCCCAAGAAGGAGGAGGAGGAGGAGGAGACGAAGCTCTCGGCTCCCAAGGCCGAG CAGCTCGAGGAGGAGGAGGGCGCCGCGGTGGCGGCGCCGAGCGGCATGATCGACCGGTGG AAGGCGACCCGGTACGCCACCGCGGAGGCGTCGCTGCTCGCCATCATGCGCGCCCGCGGC GCGCGCGCCGGGAAGCCCGTCCCGCGCGGGGCGCTGCGGGAGGAGGCGCGCGCCCACATT GGCGACACGGGCCTCCTCGACCACCTCCTCAGGCACATCGCCGACAAGGTGGCGCCCGGC GGCGCCGAGCGGTTCCGGCGGCGGCACAACGCCGGCGGCGGGCTGGAGTACTGGCTCGAG CCCGCCGAGCTCGCCGCCGTGCGGCGGAACGCCGGCGTGGCCGACCCGTACTGGGTGCCT CCTCCCGGATGGAAGCCAGGGGACCCCGTCTCGCCGGAGGGCTACTTGCTGGAGGTGAGG AAGCAGGTGGAGAAGCTCGCCGTGGAGCTCGCCGGCGTCAGAAGGCACATGGATCACCTC TCTTCCAATGTGAGTCAAGTGGGCAAGGAAATCAAATCTGAGGCTGAGAAATCCTACAAT ACATGCCAGGAGAAGTATGCCTGTATGGAGAAAGCCAATGGCAATCTGGAAAAGCAGCTT CTGTCCTTGGAGGAGAAGTATGAGAATGCAACACACGCAAATGGCGAGCTGAAGGAGGAG TTGTTGTTTCTCAAGGAGAAGTTTGTGAGTGTGGTCGAGAACAACACCAGACTGGAGCAC CAGCTGACTGCTTTATCCACTTCTTTCCTGTCTCTAAAGGAGGAACTGCTCTGGCTGGAA AAAGAAGAAGCTGATCTGTATGTCAAGGAACCATGGGAAGACGACGATGAAAAGCAAGAA CACGATGCCGGGAAAGAGGCGAAGGACGACGATGTCGCCGGCGTCAGTGCAGCCAACGAC CAGCCGGACGTCGACGGCGATGGCACCACCACCACCACCACCACCAGCAGCAATGGTGGC AGCGGGAAGAGAACATCGAGGAAGTGCAGCGTGCGCATCTCCAAGCCGCAGGGCGCGTTC CAGTGGCCGACGCCGAGCCTGCCGTTCTCGCCGGAGCTCGCCGCGCCGCCGTCGCCGCCG CTGACCCCGACGGCGCCCGTCGTCGCCGGCGCCGCCAACTTCGCCACCATGGACGAGCTC TACGAGTACATGATGGCCGGCGGCCTCCCCACGCCACCGTCCACCACCAGCAACGCCGGG AAGCTCCCCTCGCTGCCCGCCGCCACGGCCTGCGCCACGACGCCGCCGGTGAAGACGGCG GACGCCGCCGGCGACGTGGGCACCGAGCTGGCACTGGCCACTCCCGCCTACTGA

SEQ ID NO:67

Locus #LOC_Os12g42830 Protein sequence

MTAAAVNASRVMRRRAAGEDLGDGGDGDGDFWAGGAPRLYDFSQQEQKPFLPAPAPAPPS PAPVPASPPSPAAESVAPCLLTLQCSGVGWGVRKRVRYVGRHHHLARHHAPERAVDAARD DDEASSAKAKNESPKEEAAAAEEDDDNVEHKVAVPTTSEEKKRRRRRKRGRGRVGGHGVA KRPKKEEEEEETKLSAPKAEQLEEEEGAAVAAPSGMIDRWKATRYATAEASLLAIMRARG ARAGKPVPRGALREEARAHIGDTGLLDHLLRHIADKVAPGGAERFRRRHNAGGGLEYWLE PAELAAVRRNAGVADPYWVPPPGWKPGDPVSPEGYLLEVRKQVEKLAVELAGVRRHMDHL SSNVSQVGKEIKSEAEKSYNTCQEKYACMEKANGNLEKQLLSLEEKYENATHANGELKEE LLFLKEKFVSVVENNTRLEHQLTALSTSFLSLKEELLWLEKEEADLYVKEPWEDDDEKQE HDAGKEAKDDDVAGVSAANDQPDVDGDGTTTTTTTSSNGGSGKRTSRKCSVRISKPQGAF QWPTPSLPFSPELAAPPSPPLTPTAPVVAGAANFATMDELYEYMMAGGLPTPPSTTSNAG KLPSLPAATACATTPPVKTADAAGDVGTELALATPAY*
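Because the CDS and protein listings for LOC_Os12g42830 describe the same gene product, a stretch of the CDS can be translated and compared against the matching stretch of the protein listing as a consistency check. The following is a minimal sketch, assuming the standard genetic code; the `translate` helper and the choice of excerpt are illustrative and not part of the application. The 60-nucleotide excerpt is copied from the CDS listing above and starts at an in-frame position.

```python
# Standard genetic code (NCBI translation table 1), laid out in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(cds: str) -> str:
    """Translate an in-frame DNA string (hypothetical helper).

    Whitespace is ignored, a trailing partial codon is dropped, and any
    codon containing a non-ACGT character is reported as 'X'.
    """
    seq = "".join(cds.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


# In-frame excerpt from the LOC_Os12g42830 CDS listing.
excerpt = "TTGTTGTTTCTCAAGGAGAAGTTTGTGAGTGTGGTCGAGAACAACACCAGACTGGAGCAC"
print(translate(excerpt))  # LLFLKEKFVSVVENNTRLEH
```

Any position where the translated stretch and the protein listing disagree is a candidate transcription or character-recognition error in one of the two listings.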

SEQ ID NO:68

SWI1; LOCUS# LOC_Os03g44760 CDS Sequence

>LOC_Os03g44760.1

ATGGACGCGGAGATGGCGGCTCCTGCGCTTGCGGCAGCTCATCTGCTGGACTCGCCC ATG AGGCCACAGGTGAGCAGATACTACTCCAAGAAGAGGGGTAGCAGCCACAGCAGAAATGGC AAGGATGATGCCAACCATGACGAGTCCAAGAACCAATCACCCGGCTTGCCCCTGAGCAGA CAGAGCCTGTCCTCATCTGCCACCCACACCTACCACACCGGAGGGTTCTACGAGATCGAC CACGAGAAGCTTCCCCCCAAATCCCCAATTCATCTCAAGTCCATACGCGTGGTAAAGGTG AGCGGCTACACAAGCCTGGACGTCACAGTGAGCTTCCCGTCCCTCCTGGCGCTGCGAAGC TTCTTCTCCTCCTCCCCACGGTCGTGCACTGGGCCGGAGCTCGACGAGCGCTTCGTCATG AGCAGCAACCACGCGGCCCGCATCCTGCGCCGTCGGGTGGCCGAGGAGGAGCTCGCGGGC GACGTGATGCACCAGGACAGCTTCTGGCTCGTCAAGCCCTGCCTCTATGACTTCTCCGCG TCGTCACCACATGATGTGCTGACCCCGTCGCCGCCGCCTGCCACAGCGCAGGCGAAGGCG CCGGCAGCCAGTTCCTGCCTTCTCGACACCTTGAAGTGCGACGGCGCCGGGTGGGGCGTG AGGCGCCGTGTCAGGTACATTGGTCGCCACCACGATGCTTCCAAGGAGGCCAGCGCTGCC AGCCTCGATGGCTACAACACAGAGGTCAGCGTCCAGGAGGAGCAGCAGCAGCGACTGCGG CTTCGACTGCGGTTGCGACAACGCCGGGAGCAGGAAGACAACAAGAGCACTAGCAATGGC AAGAGGAAGC GGGAGGAGGCAGAGAGCAGCAT GGACAAGAGCAGAGC CGC CAGGAAGAAG AAAGCCAAGACTTACAAGAGTCCCAAGAAGGTGGAGAAGAGGCGCGTCGTGGAGGCTAAA GACGGCGACCCTCGGCGCGGCAAGGACCGGTGGTCGGCCGAGCGGTACGCAGCGGCGGAG AGGAGCCTGCTGGATATAATGCGCTCCCATGGTGCCTGCTTCGGTGCGCCGGTGATGCGG CAGGCTCTGCGGGAGGAAGCCCGCAAGCATATCGGTGACACCGGCCTCCTTGACCACCTG CTCAAGCACATGGCCGGCAGGGTACCGGAAGGCAGCGCGGACCGGTTCCGTCGCCGGCAC AATGCGGATGGTGCCATGGAGTACTGGCTGGAGCCGGCGGAGCTTGCCGAGGTACGGCGG CTGGCTGGAGTGTCTGATCCATACTGGGTGCCGCCACCTGGGTGGAAGCCAGGTGATGAC GTGTCCGCAGTCGCCGGTGACCTCCTGGTCAAGAAGAAGGTGGAAGAGCTCGCTGAGGAG GTTGATGGTGTAAAAAGGCACATCGAGCAGCTCAGTTCTAATTTGGTGCAGCTGGAGAAG GAAACAAAATCTGAGGCAGAGCGATCTTACAGCTCTAGGAAGGAGAAGTATCAGAAGTTG AT GAAGGCAAAT GAAAAGCT CGAGAAACAGGT GTTATCTAT GAAGGACAT GTAT GAGCAT CTGGTTCAGAAAAAGGGTAAGCTGAAGAAGGAGGTGCTGTCCTTGAAGGATAAATACAAG CTTGTGCTGGAGAAGAATGATAAACTGGAGGAACAGATGGCTAGTCTCTCCAGCTCCTTC CTTTCTTTGAAGGAACAATTGCTGCTGCCAAGAAATGGAGATAATCTGAACATGGAAAGG GAAAGGGTGGAAGTGACTTTGGGCAAGCAAGAAGGCCTTGTTCCCGGCGAACCACTGTAT GTTGATGGTGGTGACCGGATCAGCCAGCAAGCAGATGCCACCGTCGTCCAAGTCGGCGAG AAGAGGACGGCGAGGAAGAGCAGCTTCCGCATCTGCAAGCCACAGGGAACGTTCATGTGG 
CCACACATGGCGTCTGGCACGAGCATGGCCATCAGTGGGGGAGGCAGCAGCAGCTGCCCT GTCGCCTCCGGGCCAGAGCAGCTCCCTCGCAGCAGCAGCTGCCCCAGCATTGGGCCTGGT GGCCTCCCGCCGTCGTCACGAGCCCCAGCCGAGGTGGTGGTCGCGTCGCCACTGGACGAG CACGTGGCGTTCCGCGGGGGCTTCAACACGCCGCCCTCGGCATCGTCCACCAACGCCGCC GCTGCCGCCAAGCTGCCTCCCCTGCCCAGCCCGACGTCACCTCTCCAGACACGGGCCCTG TTCGCCGCTGGCTTCACTGTCCCGGCATTACACAACTTCTCCGGCCTCACCTTACGCCAT GTGGACTCCTCGTCGCCGTCGTCCGCGCCATGCGGTGCTAGGGAGAAGATGGTGACCCTG TTCGATGGAGACTGCCGGGGGATCAGCGTCGTGGGCACCGAGCTGGCACTGGCCACTCCG TCCTACTGCTGA

SEQ ID NO:69

SWI1; LOCUS# LOC_Os03g44760 Protein sequence

MDAEMAAPALAAAHLLDSPMRPQVSRYYSKKRGSSHSRNGKDDANHDESKNQSPGLPLSR QSLSSSATHTYHTGGFYEIDHEKLPPKSPIHLKSIRVVKVSGYTSLDVTVSFPSLLALRS FFSSSPRSCTGPELDERFVMSSNHAARILRRRVAEEELAGDVMHQDSFWLVKPCLYDFSA SSPHDVLTPSPPPATAQAKAPAASSCLLDTLKCDGAGWGVRRRVRYIGRHHDASKEASAA SLDGYNTEVSVQEEQQQRLRLRLRLRQRREQEDNKSTSNGKRKREEAESSMDKSRAARKK KAKTYKSPKKVEKRRVVEAKDGDPRRGKDRWSAERYAAAERSLLDIMRSHGACFGAPVMR QALREEARKHIGDTGLLDHLLKHMAGRVPEGSADRFRRRHNADGAMEYWLEPAELAEVRR LAGVSDPYWVPPPGWKPGDDVSAVAGDLLVKKKVEELAEEVDGVKRHIEQLSSNLVQLEK ETKSEAERSYSSRKEKYQKLMKANEKLEKQVLSMKDMYEHLVQKKGKLKKEVLSLKDKYK LVLEKNDKLEEQMASLSSSFLSLKEQLLLPRNGDNLNMERERVEVTLGKQEGLVPGEPLY VDGGDRISQQADATVVQVGEKRTARKSSFRICKPQGTFMWPHMASGTSMAISGGGSSSCP VASGPEQLPRSSSCPSIGPGGLPPSSRAPAEVVVASPLDEHVAFRGGFNTPPSASSTNAA AAAKLPPLPSPTSPLQTRALFAAGFTVPALHNFSGLTLRHVDSSSPSSAPCGAREKMVTL FDGDCRGISVVGTELALATPSYC

SEQ ID NO:70

ECS1 PROMOTER. Sense strand: gagatttgggaaatgtgcaatttgggtttatctggttttgttgttttggttatttagttt tgagtccggtttgaagaaatgcttcatagatatataa atgtaaccagaaatataaataaatcataactaagtagtattatatttttgttaagcttaa ttatttcaattccaagtctttcctaagaatttgttgaa aatttataatttacgttacactttgtaaaatcagaacgatccaattcacaagataatgct acgactctgtttttctttaaaaataaatcaataa tcatctccactaaacctctcaataacttagcagtcttaatgaaatttaaagctaatctat caatcacattctcaccacgtcgccaaactcgttg ccgtttcaatcttttcaagttccctttcatgctgtttataaacccttgcactctttcact cacagactcactacaagtctacaccacaaacttac caaatcatccaaaa

SEQ ID NO: 71

ECS2 PROMOTER Sense strand: acacgttatggaaagcaaagaataacaaaagtaatattcttacctatcttttttagttgg aaaacttgcattgtgtaacgtattcaaacattttc gaaatggtttattggtttttgtatataattaatttgggttaaactgatatattttatgag ataaatatagaatctcatgtgcttatacaaagcaact atattaattttgttaaccgtaagttacaaaaacagtggtcggaggaaatcaggaaaataa aaagagagaaaagagtctacacaatgggc caattattataagtaaatgatagtcatgaaagcccatttcagaagaagatcttttggaaa tgagaatagtgctctaggctcactggtccttttt actattggtatagaaactgtcaaagcccaacaggtttaaactagcatttcaggcgctgta attcttctgcagtttgtttgtataaacttggaat atggatgggttaaacactgatatttcttcactcgttttgtctcacatactgttcgatgct taacacaggtcttaaAAAGAAACTGG GTTTGATGTCTCAAATCTACTCAAGAAAGAAAGATATCTTGAGTTTGCATCGAGA CAGAAAAGAGTACGACTATACATAGCTGCTGGGGTAGTAGCCTGCAGAGAATAC AGATTTTGAACCACGTACAAGGAACCAAATCAGTGTATGTATCTAACTATTAACC TTGTGGTGTGATCTTGTCCTCTTAGGTATTGTGGAATCCTTGTAGGAAATGTCATG GCAACTCATTAGTCATCTTGAACCAAATGAGATGATACATGATGGTCTCAAATTG GACATGGTGGCACCTTTTGTTTCGTGAGTGGCTTTCAATTTATCTCCATAGAAATT GTTTAATTTTCGTTATTGGTGCCTGTCAATAAAAATTACAAACATATGCAGAGCG TTGGATTCGTGGATCGTTGAACATCCTATTGAGAGACAGGGCCAGCCTCCTAATT GTATGACATCGTCTCTTTCACAATATACTCATTAAATGAGAGGTTGAGATTTGAC TTATTTGCTTTATACAGCCTGCACAGTGTGGAAGACCCCTCTAAAGACTGAACTG GGGACAGCAACAATGGGAATCTGAccatcctcatgacagtacctggaaagagtctcagaa gcttcaagttcagt acgcagcttgaccagtctttcagtcatatagccataggggttgaattagtgtccatcttc ccattgtgattaacgttctgatttagcatgcac cttcgaattaagtgaatctattaccatgtgaccaagccattgcattactaatataagcat atcacatttcccttttctccgtgccaactgaattt gaattattttccctcaacttaatcacatgttttcctcacggccaaaagtactctcagtgt tacatgattaccacaacaaatgatttaaactttga acttttgaagttatcgagcaacatggcaaatcctggtcctatatgacataacatgagttc ctctgcctattgtaaaattaggaaacacaaaa ccaaaatgattatatctggtattatagtgtggtgtataacatatactcacacaagatatg ctcttaagatgataaatgtctaatcttccaagtc ccaattttgaaaacgttgatattaatttcccctcaaccccactagcctcaaattaaatta gcagccttagtgtgaaattaaaagatagctaat gaattgcatttcagactttcacctccccactcacgtagctataactccttaccgtttcaa atctcttcacttccccaattttgttgtgtataaaaa cctcttctccacttcactctttccaccacaaactttctaaaactaatcaaca

SEQ ID NO:72:

Amborella trichopoda DWT1 protein sequence

MASSNRHWPSMFKSKPCNQWQHDINSPLICQKPPFTAEERSPEPKPRWNPKPEQIRILEAIFNSGMVNPP
REEIRRIRAQLQEYGQVGDANVFYWFQNRKSRSKHKHKQLHQSSAKPATPSPPTVPNQNYQPTPQSSQTP
NSSSSSSEKSEASPVQLGSIKPGATVNVMEGLNAANSPTCSVNQVAYLGSQPEPSPLFFQTESGCEMSAF
SELANMLQQQEKMKMGHIAMNDILNGVGEGTANSNGCSGGGGRVTVFINEMAFEVGAGGRVNVREAFGEA
MLIHSSGHPVPTNEWGFTLQPLQHGHFYYLV

SEQ ID NO: 73:

Amborella DWT1 protein conserved 67 amino acid domain

PEPKPRWNPKPEQIRILEAIFNSGMVNPPREEIRRIRAQLQEYGQVGDANVFYWFQNRKSRSKHKHK
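The conserved domain of SEQ ID NO:73 can be located within the full-length Amborella protein of SEQ ID NO:72 by a plain substring search once layout whitespace is removed. The sketch below is illustrative only; the helper name `clean` and the variable names are assumptions, and both sequences are copied from the listings above.

```python
# Full-length Amborella trichopoda DWT1 protein (SEQ ID NO:72) as listed.
full_raw = """
MASSNRHWPSMFKSKPCNQWQHDINSPLICQKPPFTAEERSPEPKPRWNPKPEQIRILEAIFNSGMVNPP
REEIRRIRAQLQEYGQVGDANVFYWFQNRKSRSKHKHKQLHQSSAKPATPSPPTVPNQNYQPTPQSSQTP
NSSSSSSEKSEASPVQLGSIKPGATVNVMEGLNAANSPTCSVNQVAYLGSQPEPSPLFFQTESGCEMSAF
SELANMLQQQEKMKMGHIAMNDILNGVGEGTANSNGCSGGGGRVTVFINEMAFEVGAGGRVNVREAFGEA
MLIHSSGHPVPTNEWGFTLQPLQHGHFYYLV
"""

# Conserved domain (SEQ ID NO:73) as listed.
domain_raw = "PEPKPRWNPKPEQIRILEAIFNSGMVNPPREEIRRIRAQLQEYGQVGDANVFYWFQNRKSRSKHKHK"


def clean(seq: str) -> str:
    """Remove spaces and line breaks introduced by page layout."""
    return "".join(seq.split())


full = clean(full_raw)
domain = clean(domain_raw)

start = full.find(domain)  # 0-based index, or -1 if the domain is absent
print(len(domain))         # 67
print(start + 1, start + len(domain))  # 42 108 (1-based, inclusive)
```

Under these inputs the domain spans residues 42 to 108 of the full-length sequence, which agrees with the stated length of 67 amino acids.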