Title:
DOMESTICATION AND IMPROVEMENT OF COCOA PLANT
Document Type and Number:
WIPO Patent Application WO/2022/185312
Kind Code:
A1
Abstract:
The present disclosure relates to conferring desirable domestication traits in Cocoa plants. More particularly, the current invention pertains to producing Cocoa plants with improved traits by manipulating genes controlling plant architecture and metabolite expression.

Inventors:
MARGALIT IDO (IL)
SHERMAN TAL (IL)
COREM SHIRA (IL)
Application Number:
PCT/IL2022/050234
Publication Date:
September 09, 2022
Filing Date:
March 02, 2022
Assignee:
BETTERSEEDS LTD (IL)
International Classes:
C12N15/82; A01H5/00; C12N5/14; C12N15/74
Domestic Patent References:
WO2019211750A1 2019-11-07
WO1998036053A2 1998-08-20
Foreign References:
US20090320163A1 2009-12-24
US20110041195A1 2011-02-17
Other References:
PREWITT SARAH F, AYRE BRIAN G, MCGARRY ROISIN C: "Cotton CENTRORADIALIS/TERMINAL FLOWER 1/SELF-PRUNING genes functionally diverged to differentially impact plant architecture", JOURNAL OF EXPERIMENTAL BOTANY, OXFORD UNIVERSITY PRESS, GB, vol. 69, 1 January 2018 (2018-01-01), GB , pages 5403 - 5417, XP055967356, ISSN: 0022-0957, DOI: 10.1093/jxb/ery324
XIE, KABIN ET AL.: "RNA-guided genome editing in plants using a CRISPR-Cas system", MOLECULAR PLANT, vol. 6, no. 6, 17 August 2013 (2013-08-17), pages 1975 - 1983, XP055154141, DOI: 10.1093/mp/sst119
OGITA, SHINJIRO ET AL.: "Application of RNAi to confirm theobromine as the major intermediate for caffeine biosynthesis in coffee plants with potential for construction of decaffeinated varieties", PLANT MOLECULAR BIOLOGY, vol. 54, no. 6, 1 April 2004 (2004-04-01), pages 931 - 941, XP019262473, DOI: 10.1007/s11103-004-0393-x
Attorney, Agent or Firm:
HAGAI, Keren et al. (IL)
Claims:
CLAIMS

1. A modified Cocoa plant exhibiting at least one improved domestication trait as compared with a corresponding Cocoa plant lacking the modification, wherein said modified Cocoa plant comprises at least one mutated Cocoa (Theobroma cacao) gene selected from SELF PRUNING (SP), Caffeine synthase and/or Theobromine synthase gene.

2. The modified Cocoa plant according to claim 1, wherein said at least one Cocoa (Theobroma cacao) SELF PRUNING (SP) gene comprises a genomic nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1, SEQ ID NO:142, SEQ ID NO:276, SEQ ID NO:411, SEQ ID NO:676 and SEQ ID NO:901, and encodes a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:2, SEQ ID NO:143, SEQ ID NO:277, SEQ ID NO:412, SEQ ID NO:677 and SEQ ID NO:902, respectively, or a functional variant thereof.

3. The modified Cocoa plant according to claim 1, wherein said at least one Cocoa (Theobroma cacao) Caffeine synthase gene comprises a genomic nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1004, SEQ ID NO:1187, SEQ ID NO:1342 and SEQ ID NO:1500, and encoding a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:1005, SEQ ID NO:1188, SEQ ID NO:1343 and SEQ ID NO:1501, respectively, or a functional variant thereof.

4. The modified Cocoa plant according to claim 1, wherein said at least one Cocoa (Theobroma cacao) Theobromine synthase gene comprises a nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1663, SEQ ID NO:1858, SEQ ID NO:2065, SEQ ID NO:2207, SEQ ID NO:2379 and SEQ ID NO:2533, and encodes a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:1664, SEQ ID NO:1859, SEQ ID NO:2066, SEQ ID NO:2208, SEQ ID NO:2380 and SEQ ID NO:2534, respectively, or a functional variant thereof.

5. The modified Cocoa plant according to claim 1, wherein said mutation is introduced using mutagenesis, small interfering RNA (siRNA), microRNA (miRNA), artificial miRNA (amiRNA), DNA introgression, endonucleases or any combination thereof.

6. The modified Cocoa plant according to claim 1, wherein said mutation is introduced using targeted genome modification.

7. The modified Cocoa plant according to claim 6, wherein said mutation is introduced using CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and CRISPR-associated (Cas) gene (CRISPR/Cas), Transcription activator-like effector nuclease (TALEN), Zinc Finger Nuclease (ZFN), meganuclease or any combination thereof.

8. The modified Cocoa plant according to claim 7, wherein said Cas gene is selected from the group consisting of Cas9, Cas12, Cas13, Cas14, CasX, CasY, Csn1, Cpf1 and any combination thereof.

9. The modified Cocoa plant according to claim 1, wherein the at least one mutated Cocoa gene is a CRISPR/Cas9-induced heritable mutated allele.

10. The modified Cocoa plant according to claim 1, wherein said mutation is a missense mutation, nonsense mutation, insertion, deletion, indel, substitution or duplication.

11. The modified Cocoa plant of claim 10, wherein the insertion or the deletion produces a gene comprising a frameshift.

12. The modified Cocoa plant of claim 1, wherein said plant is homozygous for said at least one Cocoa SP, Caffeine synthase and/or Theobromine synthase mutated gene.

13. The modified Cocoa plant according to claim 1, wherein said mutation is in the coding region of said gene, a mutation in the regulatory region of said gene, or an epigenetic factor.

14. The modified Cocoa plant according to claim 1, wherein said mutation is a silencing mutation, a knockdown mutation, a knockout mutation, a loss of function mutation or any combination thereof.

15. The modified Cocoa plant according to claim 1, wherein said mutation is generated in planta.

16. The modified Cocoa plant according to claim 1, wherein said mutation in said at least one Cocoa (Theobroma cacao) SP gene is generated in planta via introduction of a construct comprising (a) Cas DNA and gRNA sequence selected from the group consisting of SEQ ID NO:3-141, SEQ ID NO:144-275, SEQ ID NO:278-410, SEQ ID NO:413-675, SEQ ID NO:678-900 and SEQ ID NO:903-1003, or (b) a ribonucleoprotein (RNP) complex comprising Cas protein and gRNA sequence selected from the group consisting of SEQ ID NO:3-141, SEQ ID NO:144-275, SEQ ID NO:278-410, SEQ ID NO:413-675, SEQ ID NO:678-900 and SEQ ID NO:903-1003.

17. The modified Cocoa plant according to claim 1, wherein said mutation in said at least one Cocoa (Theobroma cacao) Caffeine synthase gene is generated in planta via introduction of a construct comprising (a) Cas DNA and gRNA sequence selected from the group consisting of SEQ ID NO:1006-1186, SEQ ID NO:1189-1341, SEQ ID NO:1344-1499 and SEQ ID NO:1502-1662, or (b) a ribonucleoprotein (RNP) complex comprising Cas protein and gRNA sequence selected from the group consisting of SEQ ID NO:1006-1186, SEQ ID NO:1189-1341, SEQ ID NO:1344-1499 and SEQ ID NO:1502-1662.

18. The modified Cocoa plant according to claim 1, wherein said mutation in said at least one Cocoa (Theobroma cacao) Theobromine synthase gene is generated in planta via introduction of a construct comprising (a) Cas DNA and gRNA sequence selected from the group consisting of SEQ ID NO:1665-1857, SEQ ID NO:1860-2064, SEQ ID NO:2067-2206, SEQ ID NO:2209-2378, SEQ ID NO:2381-2532 and SEQ ID NO:2535-2666, or (b) a ribonucleoprotein (RNP) complex comprising Cas protein and gRNA sequence selected from the group consisting of SEQ ID NO:1665-1857, SEQ ID NO:1860-2064, SEQ ID NO:2067-2206, SEQ ID NO:2209-2378, SEQ ID NO:2381-2532 and SEQ ID NO:2535-2666.

19. The modified Cocoa plant according to any one of claims 16-18, wherein said gRNA sequence comprises a 3' NGG Protospacer Adjacent Motif (PAM).

20. The modified Cocoa plant according to any one of claims 16-18, wherein said construct is introduced into the plant cells via Agrobacterium infiltration, virus-based plasmids for delivery of the genome editing molecules, or mechanical insertion such as polyethylene glycol (PEG) mediated DNA transformation, electroporation or gene gun biolistics.

21. The modified Cocoa plant according to claim 1, wherein said plant has decreased expression levels of at least one of said Cocoa SP genes and/or said Cocoa Caffeine and/or Theobromine synthase genes.

22. The modified Cocoa plant according to claim 1, wherein said mutation confers reduced expression of said at least one SP gene and/or said at least one Cocoa Caffeine and/or Theobromine synthase gene.

23. The modified Cocoa plant according to claim 1, wherein said plant is semi-determinant.

24. The modified Cocoa plant according to claim 1, wherein said plant has determinant growth habit.

25. The modified Cocoa plant according to claim 1, wherein said plant flowers earlier than a corresponding wild type or control Cocoa plant lacking said at least one mutated SP gene.

26. The modified Cocoa plant according to claim 1, wherein said plant exhibits improved earliness as compared to a corresponding wild type or control Cocoa plant lacking said at least one mutated SP gene.

27. The modified Cocoa plant according to claim 1, wherein said plant exhibits suppressed and/or similar sympodial shoot termination as compared to a corresponding wild type or control Cocoa plant lacking said at least one mutated SP gene.

28. The modified Cocoa plant according to claim 1, wherein said plant exhibits suppressed or reduced day-length sensitivity as compared to a corresponding wild type or control Cocoa plant lacking said mutation.

29. The modified Cocoa plant according to claim 1, wherein said plant exhibits suppressed and/or no Caffeine and/or Theobromine synthase gene expression compared to a corresponding wild type or control Cocoa plant lacking said at least one mutated Caffeine synthase and/or Theobromine synthase gene.

30. The modified Cocoa plant according to claim 1, wherein said domestication trait is selected from the group consisting of reduced flowering time, earliness, synchronous flowering, reduced day-length sensitivity, determinant or semideterminant architecture, early termination of sympodial cycling, earlier axillary shoot flowering, compact growth habit, reduced height, reduced number of sympodial units, adaptation to mechanical harvest, higher harvest index, reduced Caffeine level or concentration, reduced Theobromine level or concentration, and any combination thereof.

31. A Cocoa plant, plant part, plant tissue or plant cell according to claim 1, wherein said plant does not comprise a transgene.

32. A plant part, plant cell, plant tissue or plant seed of a plant according to claim 1.

33. A tissue culture of regenerable cells, protoplasts or callus obtained from the modified Cocoa plant according to claim 1.

34. A method for producing a modified Cocoa plant exhibiting at least one improved domestication trait as compared with a corresponding Cocoa plant lacking the modification, said method comprises steps of genetically modifying at least one Cocoa (Theobroma cacao) gene selected from SELF PRUNING (SP), Caffeine synthase, and Theobromine synthase gene, the resultant mutated gene has reduced expression level.

35. The method according to claim 34, wherein said method comprises steps of genetically modifying the at least one Cocoa (Theobroma cacao) gene using targeted genome editing introducing a loss of function mutation in the at least one Cocoa (Theobroma cacao) gene selected from SELF PRUNING (SP), Caffeine synthase, and Theobromine synthase gene.

36. The method according to any one of claims 34-35, wherein said method comprises steps of: a. identifying at least one Cocoa (Theobroma cacao) SP gene and/or Caffeine synthase and/or Theobromine synthase gene allele; b. synthesizing at least one guide RNA (gRNA) comprising a nucleotide sequence complementary to said at least one identified gene allele; c. transforming Cocoa plant cells with a construct comprising (a) Cas nucleotide sequence operably linked to said at least one gRNA, or (b) a ribonucleoprotein (RNP) complex comprising Cas protein and said at least one gRNA; d. screening the genome of said transformed plant cells for induced targeted loss of function mutation in at least one of said Cocoa SP gene and/or Cocoa Caffeine and/or Theobromine synthase gene allele; e. regenerating Cocoa plants carrying said loss of function mutation in at least one of said Cocoa SP gene and/or Cocoa Caffeine and/or Theobromine synthase gene allele; and f. screening said regenerated plants for a Cocoa plant with improved domestication trait.

37. The method according to any one of claims 34-36, wherein said step of screening the genome of said transformed plant cells for induced targeted loss of function mutation further comprises steps of obtaining a nucleic acid sample of said transformed plant and performing a nucleic acid amplification and optionally restriction enzyme digestion to detect a mutation in said at least one of said Cocoa SP genes and/or Cocoa Caffeine and/or Theobromine synthase gene allele.

38. The method according to any one of claims 34-37, wherein said at least one Cocoa (Theobroma cacao) SELF PRUNING (SP) gene comprises a genomic nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1, SEQ ID NO:142, SEQ ID NO:276, SEQ ID NO:411, SEQ ID NO:676 and SEQ ID NO:901, and encodes a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:2, SEQ ID NO:143, SEQ ID NO:277, SEQ ID NO:412, SEQ ID NO:677 and SEQ ID NO:902, respectively, or a functional variant thereof.

39. The method according to any one of claims 34-38, wherein said at least one Cocoa (Theobroma cacao) Caffeine synthase gene comprises a genomic nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1004, SEQ ID NO:1187, SEQ ID NO:1342 and SEQ ID NO:1500, and encoding a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:1005, SEQ ID NO:1188, SEQ ID NO:1343 and SEQ ID NO:1501, respectively, or a functional variant thereof.

40. The method according to any one of claims 34-39, wherein said at least one Cocoa (Theobroma cacao) Theobromine synthase gene comprises a nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1663, SEQ ID NO:1858, SEQ ID NO:2065, SEQ ID NO:2207, SEQ ID NO:2379 and SEQ ID NO:2533, and encodes a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:1664, SEQ ID NO:1859, SEQ ID NO:2066, SEQ ID NO:2208, SEQ ID NO:2380 and SEQ ID NO:2534, respectively, or a functional variant thereof.

41. The method according to any one of claims 34-40, wherein said mutation is introduced using mutagenesis, small interfering RNA (siRNA), microRNA (miRNA), artificial miRNA (amiRNA), DNA introgression, endonucleases or any combination thereof.

42. The method according to any one of claims 34-41, wherein said mutation is introduced using targeted genome modification.

43. The method according to claim 42, wherein said mutation is introduced using CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and CRISPR-associated (Cas) gene (CRISPR/Cas), Transcription activator-like effector nuclease (TALEN), Zinc Finger Nuclease (ZFN), meganuclease or any combination thereof.

44. The method according to claim 43, wherein said Cas gene is selected from the group consisting of Cas9, Cas12, Cas13, Cas14, CasX, CasY, Csn1, Cpf1 and any combination thereof.

45. The method according to any one of claims 34-44, wherein the at least one mutated Cocoa gene is a CRISPR/Cas9-induced heritable mutated allele.

46. The method according to any one of claims 34-45, wherein said mutation is a missense mutation, nonsense mutation, insertion, deletion, indel, substitution or duplication.

47. The method according to claim 46, wherein the insertion or the deletion produces a gene comprising a frameshift.

48. The method according to any one of claims 34-47, wherein said plant is homozygous for said at least one Cocoa SP, Caffeine synthase and/or Theobromine synthase mutated gene.

49. The method according to any one of claims 34-48, wherein said mutation is in the coding region of said gene, a mutation in the regulatory region of said gene, or an epigenetic factor.

50. The method according to any one of claims 34-49, wherein said mutation is a silencing mutation, a knockdown mutation, a knockout mutation, a loss of function mutation or any combination thereof.

51. The method according to any one of claims 34-50, wherein said mutation is generated in planta.

52. The method according to any one of claims 34-51, wherein said mutation in said at least one Cocoa (Theobroma cacao) SP gene is generated in planta via introduction of a construct comprising (a) Cas DNA and gRNA sequence selected from the group consisting of SEQ ID NO:3-141, SEQ ID NO:144-275, SEQ ID NO:278-410, SEQ ID NO:413-675, SEQ ID NO:678-900 and SEQ ID NO:903-1003, or (b) a ribonucleoprotein (RNP) complex comprising Cas protein and gRNA sequence selected from the group consisting of SEQ ID NO:3-141, SEQ ID NO:144-275, SEQ ID NO:278-410, SEQ ID NO:413-675, SEQ ID NO:678-900 and SEQ ID NO:903-1003.

53. The method according to any one of claims 34-52, wherein said mutation in said at least one Cocoa (Theobroma cacao) Caffeine synthase gene is generated in planta via introduction of a construct comprising (a) Cas DNA and gRNA sequence selected from the group consisting of SEQ ID NO:1006-1186, SEQ ID NO:1189-1341, SEQ ID NO:1344-1499 and SEQ ID NO:1502-1662, or (b) a ribonucleoprotein (RNP) complex comprising Cas protein and gRNA sequence selected from the group consisting of SEQ ID NO:1006-1186, SEQ ID NO:1189-1341, SEQ ID NO:1344-1499 and SEQ ID NO:1502-1662.

54. The method according to any one of claims 34-53, wherein said mutation in said at least one Cocoa (Theobroma cacao) Theobromine synthase gene is generated in planta via introduction of a construct comprising (a) Cas DNA and gRNA sequence selected from the group consisting of SEQ ID NO:1665-1857, SEQ ID NO:1860-2064, SEQ ID NO:2067-2206, SEQ ID NO:2209-2378, SEQ ID NO:2381-2532 and SEQ ID NO:2535-2666, or (b) a ribonucleoprotein (RNP) complex comprising Cas protein and gRNA sequence selected from the group consisting of SEQ ID NO:1665-1857, SEQ ID NO:1860-2064, SEQ ID NO:2067-2206, SEQ ID NO:2209-2378, SEQ ID NO:2381-2532 and SEQ ID NO:2535-2666.

55. The method according to any one of claims 52-54, wherein said gRNA sequence comprises a 3' NGG Protospacer Adjacent Motif (PAM).

56. The method according to any one of claims 52-54, wherein said construct is introduced into the plant cells via Agrobacterium infiltration, virus-based plasmids for delivery of the genome editing molecules, or mechanical insertion such as polyethylene glycol (PEG) mediated DNA transformation, electroporation or gene gun biolistics.

57. The method according to any one of claims 34-56, wherein said plant has decreased expression levels of at least one of said Cocoa SP genes and/or said Cocoa Caffeine and/or Theobromine synthase genes.

58. The method according to any one of claims 34-57, wherein said mutation confers reduced expression of said at least one SP gene and/or said at least one Cocoa Caffeine and/or Theobromine synthase gene.

59. The method according to any one of claims 34-58, wherein said plant is semi-determinant.

60. The method according to any one of claims 34-59, wherein said plant has determinant growth habit.

61. The method according to any one of claims 34-60, wherein said plant flowers earlier than a corresponding wild type or control Cocoa plant lacking said at least one mutated SP gene.

62. The method according to any one of claims 34-61, wherein said plant exhibits improved earliness as compared to a corresponding wild type or control Cocoa plant lacking said at least one mutated SP gene.

63. The method according to any one of claims 34-62, wherein said plant exhibits suppressed and/or similar sympodial shoot termination as compared to a corresponding wild type or control Cocoa plant lacking said at least one mutated SP gene.

64. The method according to any one of claims 34-63, wherein said plant exhibits suppressed and/or no Caffeine and/or Theobromine synthase gene expression compared to a corresponding wild type or control Cocoa plant lacking said at least one mutated Caffeine synthase and/or Theobromine synthase gene.

65. The method according to any one of claims 34-64, wherein said domestication trait is selected from the group consisting of reduced flowering time, earliness, synchronous flowering, reduced day-length sensitivity, determinant or semideterminant architecture, early termination of sympodial cycling, earlier axillary shoot flowering, compact growth habit, reduced height, reduced number of sympodial units, adaptation to mechanical harvest, higher harvest index, reduced Caffeine level or concentration, reduced Theobromine level or concentration, and any combination thereof.

66. A Cocoa plant, plant part, plant tissue or plant cell produced by the method according to any one of claims 34-65, wherein said plant does not comprise a transgene.

67. A plant part, plant cell, plant tissue or plant seed of a plant produced by the method according to any one of claims 34-65.

68. A tissue culture of regenerable cells, protoplasts or callus obtained from the modified Cocoa plant produced by the method according to any one of claims 34-65.

69. An isolated nucleotide sequence comprising at least 75% sequence identity to a Cocoa (Theobroma cacao) genomic sequence of (a) a SELF PRUNING (SP) gene comprising a sequence selected from SEQ ID NO:1, SEQ ID NO:142, SEQ ID NO:276, SEQ ID NO:411, SEQ ID NO:676 and SEQ ID NO:901, (b) a Caffeine synthase gene comprising a sequence selected from SEQ ID NO:1005, SEQ ID NO:1188, SEQ ID NO:1343 and SEQ ID NO:1501, and (c) a Theobromine synthase gene comprising a sequence selected from SEQ ID NO:1663, SEQ ID NO:1858, SEQ ID NO:2065, SEQ ID NO:2207, SEQ ID NO:2379 and SEQ ID NO:2533.

70. An isolated polypeptide sequence comprising at least 75% identity to a Cocoa (Theobroma cacao) polypeptide of (a) a SELF PRUNING (SP) gene encoding a polypeptide sequence selected from SEQ ID NO:2, SEQ ID NO:143, SEQ ID NO:277, SEQ ID NO:412, SEQ ID NO:677 and SEQ ID NO:902, (b) a Caffeine synthase gene encoding a polypeptide sequence selected from SEQ ID NO:1004, SEQ ID NO:1187, SEQ ID NO:1342 and SEQ ID NO:1500, and (c) a Theobromine synthase gene comprising a sequence selected from SEQ ID NO:1664, SEQ ID NO:1859, SEQ ID NO:2066, SEQ ID NO:2208, SEQ ID NO:2380 and SEQ ID NO:2534.

71. An isolated nucleotide sequence comprising at least 75% sequence identity to a Cocoa (Theobroma cacao) (a) SELF PRUNING (SP) gene-targeted gRNA sequence selected from the group consisting of SEQ ID NO:3-141, SEQ ID NO:144-275, SEQ ID NO:278-410, SEQ ID NO:413-675, SEQ ID NO:678-900 and SEQ ID NO:903-1003, (b) Caffeine synthase gene-targeted gRNA sequence selected from the group consisting of SEQ ID NO:1006-1186, SEQ ID NO:1189-1341, SEQ ID NO:1344-1499 and SEQ ID NO:1502-1662, and (c) Theobromine synthase gene-targeted gRNA sequence selected from the group consisting of SEQ ID NO:1665-1857, SEQ ID NO:1860-2064, SEQ ID NO:2067-2206, SEQ ID NO:2209-2378, SEQ ID NO:2381-2532 and SEQ ID NO:2535-2666.

72. Products derived from a modified Cocoa plant according to any one of claims 1-31 and/or from harvestable parts of a modified Cocoa plant according to any one of claims 1-31.

73. Use of a nucleic acid encoding a polypeptide comprising at least 75% sequence identity to a sequence as defined in SEQ ID NO:2, SEQ ID NO:143, SEQ ID NO:277, SEQ ID NO:412, SEQ ID NO:677, SEQ ID NO:902, SEQ ID NO:1005, SEQ ID NO:1188, SEQ ID NO:1343, SEQ ID NO:1501, SEQ ID NO:1664, SEQ ID NO:1859, SEQ ID NO:2066, SEQ ID NO:2208, SEQ ID NO:2380 and SEQ ID NO:2534 in enhancing domestication traits in Cocoa plants.
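
For illustration only, and not forming part of the claims: the 3' NGG PAM recited in claims 19 and 55 is the motif recognized by Streptococcus pyogenes Cas9 in the target DNA immediately 3' of the protospacer. A minimal sketch of how candidate target sites on one strand might be enumerated is given below. The function name `find_ngg_targets`, the 20-nucleotide protospacer length and the toy input sequence are assumptions made for this example; none of the SEQ ID sequences is reproduced.

```python
import re


def find_ngg_targets(seq, protospacer_len=20):
    """List candidate sites as (start, protospacer, pam) on the given strand.

    A site is `protospacer_len` bases followed directly by an NGG PAM.
    Only the strand passed in is scanned; the reverse complement would
    need a second call. The names and the default length are illustrative.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Toy sequence (hypothetical, not from the sequence listing).
example = "ACGT" * 5 + "AGG" + "CC"
for start, protospacer, pam in find_ngg_targets(example):
    print(start, protospacer, pam)
```

A real guide-design workflow would also scan the reverse strand and score off-target matches, neither of which is attempted here.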

Description:
DOMESTICATION AND IMPROVEMENT OF COCOA PLANT

FIELD OF THE INVENTION

The present disclosure relates to conferring desirable domestication traits in Cocoa plants. More particularly, the current invention pertains to producing Cocoa plants with improved traits by manipulating genes controlling plant architecture and metabolite expression.

BACKGROUND OF THE INVENTION

One of the most important determinants of crop productivity is plant architecture. For many crops, artificial selection for modified shoot architectures provided critical steps towards improving yield, followed by improvements enabling large-scale field production. A prominent example is tomato, in which the discovery of a mutation in the antiflorigen-encoding self-pruning gene (sp) led to determinate plants that provided a burst of flowering and synchronized fruit ripening, permitting mechanical harvesting.

The publication of Li et al. (2018), Nature Biotechnology, "Domestication of wild tomato is accelerated by genome editing", teaches the assembly of a set of six gRNAs to edit four genes (SlCLV3, SlWUS, SP and SP5G) into one construct. The construct was transformed into four S. pimpinellifolium accessions, all of which are resistant to bacterial spot disease, and two of which are salt tolerant. Small indels and large insertions were identified in the targeted regulatory regions of SlCLV3 and SlWUS in T0 and their T1 mutant plants. It was reported in this publication that although SP and SP5G are crucial for improving the harvest index, the limited allelic variation has hampered efforts to optimize this trait. It was further reported that locule number was not increased in T0 and T1 plants with large insertions and inversions in the targeted SlCLV3 promoter region. One explanation for this finding is that the targeted region of the SlCLV3 promoter may not be essential for regulating SlCLV3 transcription. Alternatively, it was suggested that disruption of regions (gRNA-5) flanking the CArG element downstream of SlWUS may have decreased its transcription and counteracted the effects of mutation of SlCLV3, owing to a negative feedback loop of CLV3-WUS in controlling stem cell proliferation.

The publication of Zsögön et al. (2018), Nature Biotechnology, "De novo domestication of wild tomato using genome editing", discloses a devised CRISPR-Cas9 genome engineering strategy to combine agronomically desirable traits with useful traits presented in Solanum pimpinellifolium wild lines. The four edited genes were SELF-PRUNING (SP), OVATE (O), FRUIT WEIGHT 2.2 (FW2.2) and LYCOPENE BETA CYCLASE (CycB).

Lemmon et al (2018), Nature Plants, "Rapid improvement of domestication traits in an orphan crop by genome editing", describes the usage of CRISPR-Cas9 to mutate orthologues of tomato domestication and improvement genes that control plant architecture, flower production and fruit size in the orphan Solanaceae crop 'groundcherry' (Physalis pruinosa).

In open field crops, such as legumes and Cocoa, the ability to harvest the plants mechanically, rather than by manual labor, is essential for sustainable agriculture. Furthermore, resistance to plant diseases, such as those caused by fungi and viruses, is also essential in order to maintain the profitability and sustainability of open field crops. Certain open field crops, including soybean, have been domesticated through conventional breeding to express determinate characteristics that enable easier harvesting via machinery.

Certain legumes, such as peanuts and/or cowpea, and also the cocoa plant, due to lack of genetic diversity or investment in breeding, do not exhibit this trait.

In addition to plant habit adaptation, certain crops possess metabolites which are considered harmful or undesirable for human and animal consumption, such as phytoestrogens in soybeans and Caffeine in coffee beans. Cocoa plants also express metabolites, such as Theobromine and Caffeine, which at certain levels are considered to be toxic to the general population and specifically to children and mammalian animals, which are susceptible to such metabolites. Theobromine exists in trace amounts in processed chocolate, and more in darker chocolate than in light chocolate, but it is highly toxic to some mammalian animals, namely pets such as dogs and cats, and is also harmful to children. Caffeine also exists in Cocoa, in lesser amounts than in coffee beans, but at concentrations that, when combined with Theobromine, can cause adverse effects if consumed in relatively large quantities, especially by children.

In view of the above, there is still a long felt and unmet need to manipulate Cocoa plant architecture and metabolite levels in a rapid and efficient way to increase yield, reduce production costs, and improve the nutritional content.

SUMMARY OF THE INVENTION

It is one object of the present invention to disclose a modified Cocoa plant exhibiting at least one improved domestication trait as compared with a corresponding Cocoa plant lacking the modification, wherein said modified Cocoa plant comprises at least one mutated Cocoa (Theobroma cacao) gene selected from SELF PRUNING (SP), Caffeine synthase and/or Theobromine synthase gene.

It is another object of the present invention to disclose the modified Cocoa plant as defined above, wherein said at least one Cocoa (Theobroma cacao) SELF PRUNING (SP) gene comprises a genomic nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1, SEQ ID NO:142, SEQ ID NO:276, SEQ ID NO:411, SEQ ID NO:676 and SEQ ID NO:901, and encodes a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:2, SEQ ID NO:143, SEQ ID NO:277, SEQ ID NO:412, SEQ ID NO:677 and SEQ ID NO:902, respectively, or a functional variant thereof.
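
By way of illustration only, the "at least 75% identity" threshold recited above can be sketched as a simple calculation. The sketch assumes that the two sequences have already been aligned to equal length with '-' marking gaps, and that identity is counted over all aligned columns. The excerpt above specifies neither the alignment tool nor the denominator convention, so both are assumptions, as are the function names and the toy sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap ('-') are ignored. A column
    counts as a match only when both residues are identical and not a gap.
    This denominator convention is an assumption made for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_identity_threshold(aligned_a, aligned_b, threshold=75.0):
    """True when the pair reaches the stated identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned pair (hypothetical): 7 of 8 columns match.
print(percent_identity("ACGTACGT", "ACGTACGA"))
print(meets_identity_threshold("ACGT", "AC-T"))
```

Different tools normalize by alignment length, by the shorter sequence or by the query length, so a sequence close to the threshold may pass under one convention and fail under another.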

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said at least one Cocoa (Theobroma cacao) Caffeine synthase gene comprises a genomic nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1004, SEQ ID NO:1187, SEQ ID NO:1342 and SEQ ID NO:1500, and encoding a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:1005, SEQ ID NO:1188, SEQ ID NO:1343 and SEQ ID NO:1501, respectively, or a functional variant thereof.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said at least one Cocoa (Theobroma cacao) Theobromine synthase gene comprises a nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1663, SEQ ID NO:1858, SEQ ID NO:2065, SEQ ID NO:2207, SEQ ID NO:2379 and SEQ ID NO:2533, and encodes a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:1664, SEQ ID NO:1859, SEQ ID NO:2066, SEQ ID NO:2208, SEQ ID NO:2380 and SEQ ID NO:2534, respectively, or a functional variant thereof.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said mutation is introduced using mutagenesis, small interfering RNA (siRNA), microRNA (miRNA), artificial miRNA (amiRNA), DNA introgression, endonucleases or any combination thereof.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said mutation is introduced using targeted genome modification.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said mutation is introduced using CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and CRISPR-associated (Cas) gene (CRISPR/Cas), Transcription activator-like effector nuclease (TALEN), Zinc Finger Nuclease (ZFN), meganuclease or any combination thereof.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said Cas gene is selected from the group consisting of Cas9, Casl2, Casl3, Casl4, CasX, CasY, Csnl, Cpfl and any combination thereof.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein the at least one mutated Cocoa gene is a CRISPR/Cas9-induced heritable mutated allele.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said mutation is a missense mutation, nonsense mutation, insertion, deletion, indel, substitution or duplication.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein the insertion or the deletion produces a gene comprising a frameshift.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said plant is homozygous for said at least one Cocoa SP, Caffeine synthase and/or Theobromine synthase mutated gene.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said mutation is in the coding region of said gene, a mutation in the regulatory region of said gene, or an epigenetic factor.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said mutation is a silencing mutation, a knockdown mutation, a knockout mutation, a loss of function mutation or any combination thereof.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said mutation is generated in planta.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said mutation in said at least one Cocoa (Theobroma cacao) SP gene is generated in planta via introduction of a construct comprising (a) Cas DNA and gRNA sequence selected from the group consisting of SEQ ID NO:3-141, SEQ ID NO:144-275, SEQ ID NO:278-410, SEQ ID NO:413-675, SEQ ID NO:678-900 and SEQ ID NO:903-1003, or (b) a ribonucleoprotein (RNP) complex comprising Cas protein and gRNA sequence selected from the group consisting of SEQ ID NO:3-141, SEQ ID NO:144-275, SEQ ID NO:278-410, SEQ ID NO:413-675, SEQ ID NO:678-900 and SEQ ID NO:903-1003.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said mutation in said at least one Cocoa (Theobroma cacao) Caffeine synthase gene is generated in planta via introduction of a construct comprising (a) Cas DNA and gRNA sequence selected from the group consisting of SEQ ID NO:1006-1186, SEQ ID NO:1189-1341, SEQ ID NO:1344-1499 and SEQ ID NO:1502-1662, or (b) a ribonucleoprotein (RNP) complex comprising Cas protein and gRNA sequence selected from the group consisting of SEQ ID NO:1006-1186, SEQ ID NO:1189-1341, SEQ ID NO:1344-1499 and SEQ ID NO:1502-1662.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said mutation in said at least one Cocoa (Theobroma cacao) Theobromine synthase gene is generated in planta via introduction of a construct comprising (a) Cas DNA and gRNA sequence selected from the group consisting of SEQ ID NO:1665-1857, SEQ ID NO:1860-2064, SEQ ID NO:2067-2206, SEQ ID NO:2209-2378, SEQ ID NO:2381-2532 and SEQ ID NO:2535-2666, or (b) a ribonucleoprotein (RNP) complex comprising Cas protein and gRNA sequence selected from the group consisting of SEQ ID NO:1665-1857, SEQ ID NO:1860-2064, SEQ ID NO:2067-2206, SEQ ID NO:2209-2378, SEQ ID NO:2381-2532 and SEQ ID NO:2535-2666.
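By way of non-limiting illustration only, the gRNA SEQ ID NO ranges recited above for the SP, Caffeine synthase and Theobromine synthase genes may be arranged as a simple lookup. The following Python sketch is offered as a reading aid and forms no part of the disclosed method; the names `GRNA_RANGES` and `target_of` are hypothetical, and only the numeric ranges are taken from the text.

```python
# Hypothetical reading aid: only the SEQ ID NO ranges come from the text above.
GRNA_RANGES = {
    "SP": [(3, 141), (144, 275), (278, 410), (413, 675), (678, 900), (903, 1003)],
    "Caffeine synthase": [(1006, 1186), (1189, 1341), (1344, 1499), (1502, 1662)],
    "Theobromine synthase": [(1665, 1857), (1860, 2064), (2067, 2206),
                             (2209, 2378), (2381, 2532), (2535, 2666)],
}


def target_of(seq_id_no):
    """Return the gene family targeted by a gRNA SEQ ID NO, or None.

    None is returned for numbers outside the recited gRNA ranges, for
    example the genomic and polypeptide SEQ ID NOs that sit between them.
    """
    for gene, ranges in GRNA_RANGES.items():
        for low, high in ranges:
            if low <= seq_id_no <= high:
                return gene
    return None
```

For example, `target_of(144)` falls in the second SP range, whereas `target_of(142)` returns `None` because SEQ ID NO:142 is recited as a genomic SP sequence rather than a gRNA.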

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said gRNA sequence comprises a 3' NGG Protospacer Adjacent Motif (PAM).

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said construct is introduced into the plant cells via Agrobacterium infiltration, virus-based plasmids for delivery of the genome editing molecules, or mechanical insertion such as polyethylene glycol (PEG) mediated DNA transformation, electroporation or gene gun biolistics.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said plant has decreased expression levels of at least one of said Cocoa SP genes and/or said Cocoa Caffeine and/or Theobromine synthase genes.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said mutation confers reduced expression of said at least one SP gene and/or said at least one Cocoa Caffeine and/or Theobromine synthase gene.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said plant is semi-determinant.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said plant has determinant growth habit.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said plant flowers earlier than a corresponding wild type or control Cocoa plant lacking said at least one mutated SP gene.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said plant exhibits improved earliness as compared to a corresponding wild type or control Cocoa plant lacking said at least one mutated SP gene.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said plant exhibits suppressed and/or similar sympodial shoot termination as compared to a corresponding wild type or control Cocoa plant lacking said at least one mutated SP gene.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said plant exhibits suppressed or reduced day-length sensitivity as compared to a corresponding wild type or control Cocoa plant lacking said mutation.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said plant exhibits suppressed and/or no Caffeine and/or Theobromine synthase gene expression compared to a corresponding wild type or control Cocoa plant lacking said at least one mutated Caffeine synthase and/or Theobromine synthase gene.

It is another object of the present invention to disclose the modified Cocoa plant as defined in any of the above, wherein said domestication trait is selected from the group consisting of reduced flowering time, earliness, synchronous flowering, reduced day-length sensitivity, determinant or semi-determinant architecture, early termination of sympodial cycling, earlier axillary shoot flowering, compact growth habit, reduced height, reduced number of sympodial units, adaptation to mechanical harvest, higher harvest index, reduced Caffeine level or concentration, reduced Theobromine level or concentration, and any combination thereof.

It is another object of the present invention to disclose a Cocoa plant, plant part, plant tissue or plant cell as defined in any of the above, wherein said plant does not comprise a transgene.

It is another object of the present invention to disclose a plant part, plant cell, plant tissue or plant seed of a plant as defined in any of the above.

It is another object of the present invention to disclose a tissue culture of regenerable cells, protoplasts or callus obtained from the modified Cocoa plant as defined in any of the above.

It is another object of the present invention to disclose a method for producing a modified Cocoa plant exhibiting at least one improved domestication trait as compared with a corresponding Cocoa plant lacking the modification, said method comprising steps of genetically modifying at least one Cocoa (Theobroma cacao) gene selected from SELF PRUNING (SP), Caffeine synthase, and Theobromine synthase gene, wherein the resultant mutated gene has a reduced expression level.

It is another object of the present invention to disclose the method as defined above, wherein said method comprises steps of genetically modifying the at least one Cocoa (Theobroma cacao) gene using targeted genome editing introducing a loss of function mutation in the at least one Cocoa (Theobroma cacao) gene selected from SELF PRUNING (SP), Caffeine synthase, and Theobromine synthase gene.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said method comprises steps of: (a) identifying at least one Cocoa (Theobroma cacao) SP gene and/or Caffeine synthase and/or Theobromine synthase gene allele; (b) synthesizing at least one guide RNA (gRNA) comprising a nucleotide sequence complementary to said at least one identified gene allele; (c) transforming Cocoa plant cells with a construct comprising (i) Cas nucleotide sequence operably linked to said at least one gRNA, or (ii) a ribonucleoprotein (RNP) complex comprising Cas protein and said at least one gRNA; (d) screening the genome of said transformed plant cells for induced targeted loss of function mutation in at least one of said Cocoa SP gene and/or Cocoa Caffeine and/or Theobromine synthase gene allele; (e) regenerating Cocoa plants carrying said loss of function mutation in at least one of said Cocoa SP gene and/or Cocoa Caffeine and/or Theobromine synthase gene allele; and (f) screening said regenerated plants for a Cocoa plant with improved domestication trait.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said step of screening the genome of said transformed plant cells for induced targeted loss of function mutation further comprises steps of obtaining a nucleic acid sample of said transformed plant and performing a nucleic acid amplification and optionally restriction enzyme digestion to detect a mutation in said at least one of said Cocoa SP genes and/or Cocoa Caffeine and/or Theobromine synthase gene allele.
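By way of non-limiting illustration only, one common reading of the amplification and optional restriction enzyme digestion step is that an edit which destroys a recognition site renders the amplicon resistant to digestion. The following Python sketch models that reading under stated assumptions: the function names are hypothetical, the site `GGATCC` (the BamHI recognition sequence) is an arbitrary example not taken from the present disclosure, and only the given strand is searched, which suffices for palindromic sites only.

```python
def digest_fragments(amplicon, site, cut_offset):
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the position of the cut within the recognition site
    (an assumption supplied by the caller, e.g. 1 for G^GATCC).
    """
    amplicon = amplicon.upper()
    cuts = []
    pos = amplicon.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = amplicon.find(site, pos + 1)
    bounds = [0] + cuts + [len(amplicon)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


def site_lost(wild_type, edited, site):
    """True if the wild-type amplicon carries the site and the edited one does not."""
    return site in wild_type.upper() and site not in edited.upper()


# Toy amplicons (not real Cocoa sequence): a 1-bp deletion removes the site.
wild_type = "AAAA" + "GGATCC" + "TTTT"
edited = "AAAA" + "GGTCC" + "TTTT"
```

In this toy case the wild-type amplicon yields two fragments and the edited amplicon remains uncut, which is the band pattern such an assay would look for on a gel.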

It is another object of the present invention to disclose the method as defined in any of the above, wherein said at least one Cocoa (Theobroma cacao) SELF PRUNING (SP) gene comprises a genomic nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1, SEQ ID NO:142, SEQ ID NO:276, SEQ ID NO:411, SEQ ID NO:676 and SEQ ID NO:901, and encodes a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:2, SEQ ID NO:143, SEQ ID NO:277, SEQ ID NO:412, SEQ ID NO:677 and SEQ ID NO:902, respectively, or a functional variant thereof.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said at least one Cocoa (Theobroma cacao) Caffeine synthase gene comprises a genomic nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1004, SEQ ID NO:1187, SEQ ID NO:1342 and SEQ ID NO:1500, and encodes a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:1005, SEQ ID NO:1188, SEQ ID NO:1343 and SEQ ID NO:1501, respectively, or a functional variant thereof.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said at least one Cocoa (Theobroma cacao) Theobromine synthase gene comprises a nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1663, SEQ ID NO:1858, SEQ ID NO:2065, SEQ ID NO:2207, SEQ ID NO:2379 and SEQ ID NO:2533, and encodes a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:1664, SEQ ID NO:1859, SEQ ID NO:2066, SEQ ID NO:2208, SEQ ID NO:2380 and SEQ ID NO:2534, respectively, or a functional variant thereof.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said mutation is introduced using mutagenesis, small interfering RNA (siRNA), microRNA (miRNA), artificial miRNA (amiRNA), DNA introgression, endonucleases or any combination thereof.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said mutation is introduced using targeted genome modification.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said mutation is introduced using CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and CRISPR-associated (Cas) gene (CRISPR/Cas), Transcription activator-like effector nuclease (TALEN), Zinc Finger Nuclease (ZFN), meganuclease or any combination thereof.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said Cas gene is selected from the group consisting of Cas9, Cas12, Cas13, Cas14, CasX, CasY, Csn1, Cpf1 and any combination thereof.

It is another object of the present invention to disclose the method as defined in any of the above, wherein the at least one mutated Cocoa gene is a CRISPR/Cas9-induced heritable mutated allele.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said mutation is a missense mutation, nonsense mutation, insertion, deletion, indel, substitution or duplication.

It is another object of the present invention to disclose the method as defined in any of the above, wherein the insertion or the deletion produces a gene comprising a frameshift.
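By way of non-limiting illustration only, whether an insertion or deletion produces a frameshift may be checked from the net length change of the allele: a change that is not a multiple of three shifts the reading frame. The following Python sketch assumes that the indel lies wholly within the coding sequence and does not affect splicing; the function name `is_frameshift` is hypothetical.

```python
def is_frameshift(ref_allele, alt_allele):
    """True if the net length change between alleles is not a multiple of 3.

    Assumes both alleles describe the same stretch of coding sequence.
    """
    return (len(alt_allele) - len(ref_allele)) % 3 != 0
```

For example, a 1-bp insertion (`"ATG"` to `"ATGA"`) is reported as a frameshift, whereas a 3-bp deletion (`"ATGAAA"` to `"ATG"`) is in-frame and is not.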

It is another object of the present invention to disclose the method as defined in any of the above, wherein said plant is homozygous for said at least one Cocoa SP, Caffeine synthase and/or Theobromine synthase mutated gene.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said mutation is in the coding region of said gene, a mutation in the regulatory region of said gene, or an epigenetic factor.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said mutation is a silencing mutation, a knockdown mutation, a knockout mutation, a loss of function mutation or any combination thereof.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said mutation is generated in planta.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said mutation in said at least one Cocoa (Theobroma cacao) SP gene is generated in planta via introduction of a construct comprising (a) Cas DNA and gRNA sequence selected from the group consisting of SEQ ID NO:3-141, SEQ ID NO:144-275, SEQ ID NO:278-410, SEQ ID NO:413-675, SEQ ID NO:678-900 and SEQ ID NO:903-1003, or (b) a ribonucleoprotein (RNP) complex comprising Cas protein and gRNA sequence selected from the group consisting of SEQ ID NO:3-141, SEQ ID NO:144-275, SEQ ID NO:278-410, SEQ ID NO:413-675, SEQ ID NO:678-900 and SEQ ID NO:903-1003.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said mutation in said at least one Cocoa (Theobroma cacao) Caffeine synthase gene is generated in planta via introduction of a construct comprising (a) Cas DNA and gRNA sequence selected from the group consisting of SEQ ID NO:1006-1186, SEQ ID NO:1189-1341, SEQ ID NO:1344-1499 and SEQ ID NO:1502-1662, or (b) a ribonucleoprotein (RNP) complex comprising Cas protein and gRNA sequence selected from the group consisting of SEQ ID NO:1006-1186, SEQ ID NO:1189-1341, SEQ ID NO:1344-1499 and SEQ ID NO:1502-1662.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said mutation in said at least one Cocoa (Theobroma cacao) Theobromine synthase gene is generated in planta via introduction of a construct comprising (a) Cas DNA and gRNA sequence selected from the group consisting of SEQ ID NO:1665-1857, SEQ ID NO:1860-2064, SEQ ID NO:2067-2206, SEQ ID NO:2209-2378, SEQ ID NO:2381-2532 and SEQ ID NO:2535-2666, or (b) a ribonucleoprotein (RNP) complex comprising Cas protein and gRNA sequence selected from the group consisting of SEQ ID NO:1665-1857, SEQ ID NO:1860-2064, SEQ ID NO:2067-2206, SEQ ID NO:2209-2378, SEQ ID NO:2381-2532 and SEQ ID NO:2535-2666.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said gRNA sequence comprises a 3' NGG Protospacer Adjacent Motif (PAM).
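By way of non-limiting illustration only, candidate target sites followed by a 3' NGG PAM may be located by scanning a DNA sequence for a protospacer immediately followed by any nucleotide and two guanines. The following Python sketch makes assumptions that are not stated in the text: a protospacer length of 20 nucleotides, a search of the given strand only, and the hypothetical function name `find_ngg_targets`. It does not reproduce the design procedure used to obtain the gRNA sequences listed herein.

```python
import re


def find_ngg_targets(dna, protospacer_len=20):
    """Return (start, protospacer, pam) for each NGG-adjacent site.

    Only the strand given is scanned; the 20-nt default is an assumption.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]
```

A fuller design tool would also scan the reverse complement and score candidates for off-target matches; those steps are deliberately omitted here.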

It is another object of the present invention to disclose the method as defined in any of the above, wherein said construct is introduced into the plant cells via Agrobacterium infiltration, virus-based plasmids for delivery of the genome editing molecules, or mechanical insertion such as polyethylene glycol (PEG) mediated DNA transformation, electroporation or gene gun biolistics.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said plant has decreased expression levels of at least one of said Cocoa SP genes and/or said Cocoa Caffeine and/or Theobromine synthase genes.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said mutation confers reduced expression of said at least one SP gene and/or said at least one Cocoa Caffeine and/or Theobromine synthase gene.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said plant is semi-determinant.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said plant has determinant growth habit.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said plant flowers earlier than a corresponding wild type or control Cocoa plant lacking said at least one mutated SP gene.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said plant exhibits improved earliness as compared to a corresponding wild type or control Cocoa plant lacking said at least one mutated SP gene.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said plant exhibits suppressed and/or similar sympodial shoot termination as compared to a corresponding wild type or control Cocoa plant lacking said at least one mutated SP gene.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said plant exhibits suppressed and/or no Caffeine and/or Theobromine synthase gene expression compared to a corresponding wild type or control Cocoa plant lacking said at least one mutated Caffeine synthase and/or Theobromine synthase gene.

It is another object of the present invention to disclose the method as defined in any of the above, wherein said domestication trait is selected from the group consisting of reduced flowering time, earliness, synchronous flowering, reduced day-length sensitivity, determinant or semi-determinant architecture, early termination of sympodial cycling, earlier axillary shoot flowering, compact growth habit, reduced height, reduced number of sympodial units, adaptation to mechanical harvest, higher harvest index, reduced Caffeine level or concentration, reduced Theobromine level or concentration, and any combination thereof.

It is another object of the present invention to disclose a Cocoa plant, plant part, plant tissue or plant cell produced by the method as defined in any of the above, wherein said plant does not comprise a transgene.

It is another object of the present invention to disclose a plant part, plant cell, plant tissue or plant seed of a plant produced by the method as defined in any of the above.

It is another object of the present invention to disclose a tissue culture of regenerable cells, protoplasts or callus obtained from the modified Cocoa plant produced by the method as defined in any of the above.

It is another object of the present invention to disclose an isolated nucleotide sequence comprising at least 75% sequence identity to a Cocoa (Theobroma cacao) genomic sequence of (a) a SELF PRUNING (SP) gene comprising a sequence selected from SEQ ID NO:1, SEQ ID NO:142, SEQ ID NO:276, SEQ ID NO:411, SEQ ID NO:676 and SEQ ID NO:901, (b) a Caffeine synthase gene comprising a sequence selected from SEQ ID NO:1005, SEQ ID NO:1188, SEQ ID NO:1343 and SEQ ID NO:1501, and (c) a Theobromine synthase gene comprising a sequence selected from SEQ ID NO:1663, SEQ ID NO:1858, SEQ ID NO:2065, SEQ ID NO:2207, SEQ ID NO:2379 and SEQ ID NO:2533.

It is another object of the present invention to disclose an isolated polypeptide sequence comprising at least 75% identity to a Cocoa (Theobroma cacao) polypeptide of (a) a SELF PRUNING (SP) gene encoding a polypeptide sequence selected from SEQ ID NO:2, SEQ ID NO:143, SEQ ID NO:277, SEQ ID NO:412, SEQ ID NO:677 and SEQ ID NO:902, (b) a Caffeine synthase gene encoding a polypeptide sequence selected from SEQ ID NO:1004, SEQ ID NO:1187, SEQ ID NO:1342 and SEQ ID NO:1500, and (c) a Theobromine synthase gene comprising a sequence selected from SEQ ID NO:1664, SEQ ID NO:1859, SEQ ID NO:2066, SEQ ID NO:2208, SEQ ID NO:2380 and SEQ ID NO:2534.

It is another object of the present invention to disclose an isolated nucleotide sequence comprising at least 75% sequence identity to a Cocoa (Theobroma cacao) (a) SELF PRUNING (SP) gene-targeted gRNA sequence selected from the group consisting of SEQ ID NO:3-141, SEQ ID NO:144-275, SEQ ID NO:278-410, SEQ ID NO:413-675, SEQ ID NO:678-900 and SEQ ID NO:903-1003, (b) Caffeine synthase gene-targeted gRNA sequence selected from the group consisting of SEQ ID NO:1006-1186, SEQ ID NO:1189-1341, SEQ ID NO:1344-1499 and SEQ ID NO:1502-1662, and (c) Theobromine synthase gene-targeted gRNA sequence selected from the group consisting of SEQ ID NO:1665-1857, SEQ ID NO:1860-2064, SEQ ID NO:2067-2206, SEQ ID NO:2209-2378, SEQ ID NO:2381-2532 and SEQ ID NO:2535-2666.

It is another object of the present invention to disclose products derived from a modified Cocoa plant as defined in any of the above and/or from harvestable parts of a modified Cocoa plant as defined in any of the above.

It is another object of the present invention to disclose use of a nucleic acid encoding a polypeptide comprising at least 75% sequence identity to a sequence as defined in SEQ ID NO:2, SEQ ID NO:143, SEQ ID NO:277, SEQ ID NO:412, SEQ ID NO:677, SEQ ID NO:902, SEQ ID NO:1005, SEQ ID NO:1188, SEQ ID NO:1343, SEQ ID NO:1501, SEQ ID NO:1664, SEQ ID NO:1859, SEQ ID NO:2066, SEQ ID NO:2208, SEQ ID NO:2380 and SEQ ID NO:2534 in enhancing domestication traits in Cocoa plants.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

In the following detailed description of the preferred embodiments, reference is made to the accompanying drawings that form a part hereof, and in which are shown by way of illustration specific embodiments in which the invention may be practiced. It is understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the present invention. The present invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the present invention is not unnecessarily obscured.

The present invention provides a modified Cocoa plant exhibiting at least one improved domestication (agronomic and/or nutritional) trait compared with wild type or control Cocoa, wherein said modified plant comprises at least one mutated Cocoa SELF PRUNING (SP) gene and/or mutated Cocoa Caffeine and/or Theobromine synthase gene.

According to main embodiments, the present invention provides a modified Cocoa plant exhibiting at least one improved domestication trait as compared with a corresponding Cocoa plant lacking the modification. The modified Cocoa plant comprises at least one mutated Cocoa (Theobroma cacao) gene selected from SELF PRUNING (SP), Caffeine synthase and/or Theobromine synthase gene.

It is within the scope that the at least one Cocoa (Theobroma cacao) SELF PRUNING (SP) gene comprises a genomic nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1, SEQ ID NO:142, SEQ ID NO:276, SEQ ID NO:411, SEQ ID NO:676 and SEQ ID NO:901, and encodes a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:2, SEQ ID NO:143, SEQ ID NO:277, SEQ ID NO:412, SEQ ID NO:677 and SEQ ID NO:902, respectively, or a functional variant thereof.

It is further within the scope that the at least one Cocoa (Theobroma cacao) Caffeine synthase gene comprises a genomic nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1004, SEQ ID NO:1187, SEQ ID NO:1342 and SEQ ID NO:1500, and encodes a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:1005, SEQ ID NO:1188, SEQ ID NO:1343 and SEQ ID NO:1501, respectively, or a functional variant thereof.

It is further disclosed that the at least one Cocoa (Theobroma cacao) Theobromine synthase gene comprises a nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1663, SEQ ID NO:1858, SEQ ID NO:2065, SEQ ID NO:2207, SEQ ID NO:2379 and SEQ ID NO:2533, and encodes a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:1664, SEQ ID NO:1859, SEQ ID NO:2066, SEQ ID NO:2208, SEQ ID NO:2380 and SEQ ID NO:2534, respectively, or a functional variant thereof.
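By way of non-limiting illustration only, the "at least 75% identity" criterion recited above may be evaluated on two sequences that have already been aligned. The following Python sketch assumes that the inputs are pre-aligned strings of equal length with `-` marking gaps, and that identity is computed as identical positions divided by alignment length; other conventions for the denominator exist and the present text does not specify one. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.

    Gap columns count towards the length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a, aligned_b, threshold=75.0):
    """True if the aligned pair reaches the identity threshold (default 75%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, the aligned pair `"ACGT"` and `"ACGA"` shares three of four positions and therefore just meets the 75% threshold.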

The present invention further provides methods for producing the aforementioned modified Cocoa plant using genome editing or other genome modification techniques.

The solution proposed by the current invention is to use genome editing, such as the CRISPR/Cas system, to create cultivated Cocoa plants with improved yield, more specifically with determinate growth habit, and with improved metabolite content in the form of less Theobromine and Caffeine. Breeding using genome editing allows a precise and significantly shorter breeding process with a much higher success rate. Thus, genome editing has the potential to generate improved varieties faster and at a lower cost.

It is further noted that genome editing is considered non-GMO by the Israeli regulator, and in the US the USDA has already classified a dozen genome-edited plants as non-regulated and non-GMO (https://www.usda.gov/media/press-releases/2018/03/28/secretary-perdue-issues-usda-statement-plant-breeding-innovation).

Legal limitations and outdated breeding techniques significantly hamper the efforts of generating new and improved Cocoa varieties fit for intensive agriculture.

The present invention provides Cocoa plants with improved domestication traits such as plant architecture, and Cocoa with less Theobromine and Caffeine metabolites. The current invention discloses the generation of non-transgenic Cocoa plants with improved yield and metabolite traits, using the genome editing technology, e.g., the CRISPR/Cas9 highly precise tool. The generated mutations can be introduced into elite or locally adapted Cocoa lines rapidly, with relatively minimal effort and investment.

Genome editing is an efficient and useful tool for increasing crop productivity, and there is particular interest in advancing manipulation of domestication genes in Cocoa wild species, which often have undesirable characteristics.

In the context of the present invention, domestication traits or genes include agronomic traits controlled by Cocoa (Theobroma cacao) SELF PRUNING (SP) genes, and nutritional-related traits controlled by Cocoa (Theobroma cacao) Caffeine synthase and/or Theobromine synthase genes.

Genome-editing technologies, such as the Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-CRISPR-associated protein-9 nuclease (Cas9) (CRISPR-Cas9) system, provide opportunities to address these deficiencies, with the aims of increasing quality and yield, improving adaptation and expanding geographical ranges of cultivation. A major obstacle for CRISPR-Cas9 plant genome editing is the lack of efficient tissue culture and transformation methodologies. The present invention achieves these aims and surprisingly provides transformed and regenerated Cocoa plants with modified desirable domestication and metabolite synthesis genes.

To that end, guide RNAs (gRNAs) were designed for each of the target genes identified in Cocoa to induce mutations in SP and Cocoa Caffeine and/or Theobromine synthase genes through genome editing.

As used herein the term "about" denotes ± 25% of the defined amount or measure or value.

As used herein the term "similar" denotes a correspondence or resemblance range of about ± 20%, particularly ± 15%, more particularly about ± 10% and even more particularly about ± 5%.
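By way of non-limiting illustration only, the numeric ranges given for the terms "about" and "similar" may be expressed as relative tolerance checks. The following Python sketch assumes that each percentage is read relative to a reference value; the function names are hypothetical, and the narrower "similar" ranges (15%, 10%, 5%) are passed as an argument.

```python
def within(value, reference, tolerance):
    """True if value lies within reference +/- tolerance * |reference|."""
    return abs(value - reference) <= tolerance * abs(reference)


def is_about(value, reference):
    """'About' as defined above: within 25% of the reference value."""
    return within(value, reference, 0.25)


def is_similar(value, reference, tolerance=0.20):
    """'Similar' as defined above: within 20% by default, or a narrower range."""
    return within(value, reference, tolerance)
```

For example, `is_about(125, 100)` holds while `is_about(126, 100)` does not, and `is_similar(104, 100, 0.05)` applies the narrowest of the recited ranges.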

As used herein the term "corresponding" generally means similar, analogous, like, alike, akin, parallel, identical, resembling or comparable. In further aspects it means having or participating in the same relationship (such as type or species, kind, degree, position, correspondence, or function). It further means related or accompanying. In some embodiments of the present invention, it refers to plants of the same Cocoa species or strain or variety, or to sibling plants, or to one or more individuals having one or both parents in common.

A "plant" as used herein refers to any plant at any stage of development, particularly a seed plant. The term "plant" includes the whole plant or any parts or derivatives thereof, such as plant cell, plant tissue, seeds, plant protoplasts, plant cell tissue culture from which Cocoa plants can be regenerated, plant callus or calli, meristematic cells, microspores, embryos, immature embryos, pollen, ovules, anthers, fruit, flowers, leaves, cotyledons, pistil, seeds, beans, seed coat, roots, root tips and the like.

The term "plant cell" used herein refers to a structural and physiological unit of a plant, comprising a protoplast and a cell wall. The plant cell may be in a form of an isolated single cell or a cultured cell.

The term "plant cell culture" as used herein means cultures of plant units such as, for example, protoplasts, regenerable cells, cell culture, cells, cells in plant tissues, pollen, pollen tubes, ovules, embryo sacs, zygotes and embryos at various stages of development, leaves, roots, root tips, anthers, meristematic cells, microspores, flowers, cotyledons, pistil, fruit, seeds, seed coat or any combination thereof.

The term "plant material" or "plant part" used herein refers to leaves, stems, roots, root tips, flowers or flower parts, fruits, pollen, egg cells, zygotes, seeds, beans, seed coat, cuttings, cell or tissue cultures, or any other part or product of a plant or a combination thereof.

A "plant organ" as used herein means a distinct and visibly structured and differentiated part of a plant such as a root, stem, leaf, flower, flower bud, or embryo.

The term "Plant tissue" as used herein means a group of plant cells organized into a structural and functional unit. Any tissue of a plant in planta or in culture is included. This term includes, but is not limited to, whole plants, plant organs, plant seeds, tissue culture, protoplasts, meristematic cells, calli and any group of plant cells organized into structural and/or functional units. The use of this term in conjunction with, or in the absence of, any specific type of plant tissue as listed above or otherwise embraced by this definition is not intended to be exclusive of any other type of plant tissue.

As used herein, the term "progeny" or "progenies" refers in a non limiting manner to offspring or descendant plants. According to certain embodiments, the term "progeny" or "progenies" refers to plants developed or grown or produced from the disclosed or deposited seeds as detailed inter alia. The grown plants preferably have the desired traits of the disclosed or deposited seeds, i.e. loss of function mutation in at least one Cocoa SP gene and/or Cocoa Caffeine and/or Theobromine synthase gene.

The term "Cocoa" refers hereinafter to a genus of plants in the family Malvaceae, specifically to the Theobroma genus, and more specifically to the Theobroma cacao species. It is herein also acknowledged that Theobroma cacao, also called the cacao tree or cocoa tree, is an evergreen tree in the family Malvaceae. Its seeds, cocoa beans, are used in the chocolate and cocoa industries.

The term 'SELF-PRUNING' or 'SP' in the context of the present invention refers to a gene which encodes a flowering repressor that modulates sympodial growth. It is herein shown that mutations in the SP orthologue cause an acceleration of sympodial cycling and shoot termination. It is further acknowledged that the SELF PRUNING (SP) gene controls the regularity of the vegetative-reproductive switch along the compound shoot of, for example, tomato and thus conditions the 'determinate' (sp/sp) and 'indeterminate' (SP) growth habits of the plant. SP is a developmental regulator which is considered as similar or homologous to CENTRORADIALIS (CEN) from Antirrhinum and TERMINAL FLOWER 1 (TFL1) and FLOWERING LOCUS T (FT) from Arabidopsis.

The present invention discloses that SP is a member of a gene family in Cocoa. The Cocoa (Theobroma cacao) SP genes included within the scope of the present invention are encoded by genomic sequence comprising at least 75% identity to at least one sequence as set forth in SEQ ID NO:1, SEQ ID NO:142, SEQ ID NO:276, SEQ ID NO:411, SEQ ID NO:676 and SEQ ID NO:901. The aforementioned Cocoa (Theobroma cacao) SP gene sequences encode a polypeptide sequence comprising at least 75% identity to a sequence as set forth in SEQ ID NO:2, SEQ ID NO:143, SEQ ID NO:277, SEQ ID NO:412, SEQ ID NO:677 and SEQ ID NO:902, respectively.

The Cocoa (Theobroma cacao) Caffeine synthase genes included within the scope of the present invention are encoded by genomic sequence comprising at least 75% identity to at least one sequence as set forth in SEQ ID NO:1004, SEQ ID NO:1187, SEQ ID NO:1342 and SEQ ID NO:1500. The aforementioned Cocoa (Theobroma cacao) Caffeine synthase gene sequences encode a polypeptide sequence comprising at least 75% identity to a sequence as set forth in SEQ ID NO:1005, SEQ ID NO:1188, SEQ ID NO:1343 and SEQ ID NO:1501, respectively.

The Cocoa (Theobroma cacao) Theobromine synthase genes included within the scope of the present invention are encoded by genomic sequence comprising at least 75% identity to at least one sequence as set forth in SEQ ID NO:1663, SEQ ID NO:1858, SEQ ID NO:2065, SEQ ID NO:2207, SEQ ID NO:2379 and SEQ ID NO:2533. The aforementioned Cocoa (Theobroma cacao) Theobromine synthase gene sequences encode a polypeptide sequence comprising at least 75% identity to a sequence as set forth in SEQ ID NO:1664, SEQ ID NO:1859, SEQ ID NO:2066, SEQ ID NO:2208, SEQ ID NO:2380 and SEQ ID NO:2534, respectively.

According to main aspects of the present invention, a genome editing-targeted mutation in at least one of the aforementioned Cocoa SP genes, which reduces the functional expression of the gene, affects the plant sympodial growth habit, which plays a key role in determining plant architecture.

As used herein the term "genetic modification" refers hereinafter to genetic manipulation or modulation, which is the direct manipulation of an organism's genes using biotechnology. It also refers to a set of technologies used to change the genetic makeup of cells, including the transfer of genes within and across species, targeted mutagenesis and genome editing technologies to produce improved organisms. According to main embodiments of the present invention, modified Cocoa plants with improved domestication and/or nutritional traits are generated using a genome editing mechanism. This technique enables in planta modification of specific genes that relate to and/or control the flowering time, plant architecture and metabolite levels in the Cocoa plant.

The term "genome editing", or "genome/genetic modification" or "genome engineering" or "gene editing" generally refers hereinafter to a type of genetic engineering in which DNA is inserted, deleted, modified or replaced in the genome of a living organism. Unlike previous genetic engineering techniques that randomly insert genetic material into a host genome, genome editing targets the insertions to site specific locations.

It is within the scope of the present invention that the common methods for such editing use engineered nucleases, or "molecular scissors". These nucleases create site-specific double-strand breaks (DSBs) at desired locations in the genome. The induced double-strand breaks are repaired through nonhomologous end joining (NHEJ) or homologous recombination (HR), resulting in targeted mutations ('edits'). Families of engineered nucleases used by the current invention include, but are not limited to: meganucleases, zinc finger nucleases (ZFNs), transcription activator-like effector-based nucleases (TALEN), and the clustered regularly interspaced short palindromic repeats (CRISPR/Cas9) system.

Reference is now made to exemplary genome editing terms used by the current disclosure:

According to specific aspects of the present invention, the CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and CRISPR-associated (Cas) genes are used for the first time for generating genome modification in targeted genes in the Cocoa plant. It is herein acknowledged that the functions of CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and CRISPR-associated (Cas) genes are essential in adaptive immunity in select bacteria and archaea, enabling the organisms to respond to and eliminate invading genetic material. These repeats were initially discovered in the 1980s in E. coli. Without wishing to be bound by theory, reference is now made to a type of CRISPR mechanism, in which invading DNA from viruses or plasmids is cut into small fragments and incorporated into a CRISPR locus comprising a series of short repeats (around 20 bps). The loci are transcribed, and transcripts are then processed to generate small RNAs (crRNA, namely CRISPR RNA), which are used to guide effector endonucleases that target invading DNA based on sequence complementarity.

According to further aspects of the invention, Cas protein, such as Cas9 (also known as Csn1) is required for gene silencing. Cas9 participates in the processing of crRNAs, and is responsible for the destruction of the target DNA. Cas9's function in both of these steps relies on the presence of two nuclease domains, a RuvC-like nuclease domain located at the amino terminus and a HNH-like nuclease domain that resides in the mid-region of the protein. To achieve site-specific DNA recognition and cleavage, Cas9 is complexed with both a crRNA and a separate trans-activating crRNA (tracrRNA or trRNA), that is partially complementary to the crRNA. The tracrRNA is required for crRNA maturation from a primary transcript encoding multiple pre-crRNAs. This occurs in the presence of RNase III and Cas9.

Without wishing to be bound by theory, it is herein acknowledged that during the destruction of target DNA, the HNH and RuvC-like nuclease domains cut both DNA strands, generating double-stranded breaks (DSBs) at sites defined by a 20-nucleotide target sequence within an associated crRNA transcript. The HNH domain cleaves the complementary strand, while the RuvC domain cleaves the noncomplementary strand.

It is further noted that the double-stranded endonuclease activity of Cas9 also requires that a short conserved sequence (2-5 nts), known as protospacer-adjacent motif (PAM), follows immediately 3' of the crRNA complementary sequence. According to further aspects of the invention, a two-component system may be used by the current invention, combining trRNA and crRNA into a single synthetic single guide RNA (sgRNA) for guiding targeted gene alterations.

It is further within the scope that Cas9 nuclease variants include wild-type Cas9, Cas9D10A and nuclease-deficient Cas9 (dCas9).

An example of CRISPR/Cas9 mechanism of action is described by Xie, Kabin, and Yinong Yang. "RNA-guided genome editing in plants using a CRISPR-Cas system." Molecular plant 6.6 (2013): 1975-1983. As shown in this publication, the Cas9 endonuclease forms a complex with a chimeric RNA (called guide RNA or gRNA), replacing the crRNA-tracrRNA heteroduplex, and the gRNA could be programmed to target specific sites. The gRNA-Cas9 should comprise at least 15-base-pairing (gRNA seed region) without mismatch between the 5'-end of engineered gRNA and targeted genomic site, and an NGG motif (called protospacer-adjacent motif or PAM) that follows the base-pairing region in the complementary strand of the targeted DNA.
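By way of non-limiting illustration only, the targeting rule described above (a base-pairing region immediately followed by an NGG PAM) can be sketched as a simple sequence scan. The function name `find_ngg_targets`, the example sequence and the choice to scan only the strand as given are assumptions made for this illustration and do not form part of the disclosed method; the default length of 20 nucleotides follows the target length mentioned above.

```python
import re


def find_ngg_targets(dna, protospacer_len=20):
    """Return (start, protospacer, pam) for every position on the given
    strand where `protospacer_len` bases are immediately followed by NGG.

    Only the strand as written is scanned; the reverse complement would
    need a second pass (an assumption of this sketch).
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    # Hypothetical sequence, not taken from any SEQ ID NO of the disclosure.
    example = "ATGATTACAGATTACAGATTACTGGAT"
    for start, protospacer, pam in find_ngg_targets(example):
        print(start, protospacer, pam)
```

A practical guide design would additionally screen each candidate against the whole genome for off-target matches, which this sketch does not attempt.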

The term "meganucleases" as used herein refers hereinafter to endodeoxyribonucleases characterized by a large recognition site (double-stranded DNA sequences of 12 to 40 base pairs); as a result this site generally occurs only once in any given genome. Meganucleases are therefore considered to be the most specific naturally occurring restriction enzymes.

The term "protospacer adjacent motif" or "PAM" as used herein refers hereinafter to a 2-6 base pair DNA sequence immediately following the DNA sequence targeted by the Cas9 nuclease in the CRISPR bacterial adaptive immune system. PAM is a component of the invading virus or plasmid, but is not a component of the bacterial CRISPR locus. PAM is an essential targeting component which distinguishes bacterial self from non-self DNA, thereby preventing the CRISPR locus from being targeted and destroyed by nuclease.

The term "Next-generation sequencing" or "NGS" as used herein refers hereinafter to massively parallel, high-throughput or deep sequencing technology platforms that perform sequencing of millions of small fragments of DNA in parallel. Bioinformatics analyses are used to piece together these fragments by mapping the individual reads to the reference genome.

The term "gene knockdown" as used herein refers to an experimental technique by which the expression of one or more of an organism's genes is reduced. The reduction can occur through genetic modification, i.e. targeted genome editing or by treatment with a reagent such as a short DNA or RNA oligonucleotide that has a sequence complementary to either a gene or an mRNA transcript. The reduced expression can be at the level of RNA or at the level of protein. It is within the scope of the present invention that the term gene knockdown also refers to a loss of function mutation and/or gene knockout mutation or silencing mutation in which an organism's gene is made inoperative or nonfunctional.

The term "gene silencing" as used herein refers hereinafter to the regulation of gene expression in a cell to prevent or significantly reduce the expression of a certain gene. Gene silencing can occur during either transcription or translation. In certain aspects of the invention, gene silencing is considered to have a similar meaning as gene knockdown. When genes are silenced, their expression is reduced. In contrast, when genes are knocked out, they are not expressed at all. Gene silencing may be considered a gene knockdown mechanism since the methods used to silence genes, such as RNAi, CRISPR, or siRNA, generally reduce the expression of a gene by at least 70% but do not completely eliminate it. The term "loss of function mutation" as used herein refers to a type of mutation in which the altered gene or allele product lacks the function of the wild-type gene or allele. A synonym of the term included within the scope of the present invention is null mutation.

The term "microRNAs" or "miRNAs" refers hereinafter to small noncoding RNAs that have been found in most eukaryotic organisms. They are involved in the regulation of gene expression at the post-transcriptional level in a sequence specific manner. MiRNAs are produced from their precursors by a Dicer-dependent small RNA biogenesis pathway. MiRNAs are candidates for studying gene function using different RNA-based gene silencing techniques. For example, artificial miRNAs (amiRNAs) targeting one or several genes of interest are a potential tool in functional genomics.

The term "in planta" means in the context of the present invention within the plant or plant cell(s). More specifically, it means introducing CRISPR/Cas complex into plant material comprising a tissue culture of several cells, a whole plant, or into a single plant cell, without introducing a foreign gene or a mutated gene. It is also used to describe conditions present in a non-laboratory environment (e.g. in vivo).

The term "domestication trait" as used herein encompasses traits conferred by SELF PRUNING (SP), Caffeine synthase and/or Theobromine synthase gene expression level. Domestication traits within the scope of the present invention include agronomic traits, plant architecture and metabolite expression traits and nutritional-related or nutritional quality or level traits, including, but not limited to reduced flowering time, earliness, synchronous flowering, reduced day-length sensitivity, determinant or semi-determinant architecture, early termination of sympodial cycling, earlier axillary shoot flowering, compact growth habit, reduced height, reduced number of sympodial units, adaptation to mechanical harvest, higher harvest index, reduced Caffeine level or concentration, reduced Theobromine level or concentration, and any combination thereof.

The term "Caffeine synthase" herein refers to a methyltransferase enzyme involved in the caffeine biosynthesis pathway. It is expressed in tea species, coffee species, and cocoa species. Caffeine synthase is a S-adenosyl-L-methionine (SAM)-dependent methyltransferase involved in the last two steps of caffeine biosynthesis.

The term "Theobromine synthase" herein refers to an enzyme that catalyzes a chemical reaction whose two substrates are S-adenosyl methionine and 7-methylxanthine, and whose two products are S-adenosylhomocysteine and 3,7-dimethylxanthine. This enzyme belongs to the family of transferases, specifically those transferring one-carbon groups (methyltransferases). The systematic name of this enzyme class is S-adenosyl-L-methionine:7-methylxanthine N3-methyltransferase. Other names in common use include monomethylxanthine methyltransferase, MXMT, CTS1, CTS2, and S-adenosyl-L-methionine:7-methylxanthine 3-N-methyltransferase.

Theobromine, also known as xantheose, is an alkaloid from the Theobroma family. It is the principal alkaloid of Theobroma cacao (cacao plant), tastes bitter, and has chemical formula C7H8N4O2. It is found in chocolate, as well as in a number of other foods, including the leaves of the tea plant, and the kola nut.

The term 'sympodial growth' as used herein refers to a type of bifurcating branching pattern where one branch develops more strongly than the other, resulting in the stronger branches forming the primary shoot and the weaker branches appearing laterally. A sympodium, also referred to as a sympode or pseudaxis, is the primary shoot, comprising the stronger branches, formed during sympodial growth. In some aspects of the present invention, sympodial growth occurs when the apical meristem is terminated and growth is continued by one or more lateral meristems, which repeat the process. The apical meristem may be consumed to make an inflorescence or other determinate structure, or it may be aborted.

It is further within the scope of the current invention that the shoot section between two successive inflorescences is called the 'sympodium', and the number of leaf nodes per sympodium is referred to as the 'sympodial index' (spi). The first termination event activates the 'sympodial cycle'. In sympodial plants, the apparent main shoot consists of a reiterated array of 'sympodial units'. A mutant sp gene accelerates the termination of sympodial units but does not change the sympodial habit. The result is a progressive reduction in the number of vegetative nodes between inflorescences in a pattern that depends on light intensity and genetic background.
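By way of non-limiting illustration only, the sympodial index described above can be computed from a record of the nodes along a shoot. The encoding used here (a string read from base to tip in which 'L' denotes a leaf node and 'I' denotes an inflorescence), the function names and the example shoot are assumptions made for this sketch and are not measurements from the disclosure.

```python
def leaves_before_first_inflorescence(nodes):
    """Number of leaf nodes ('L') produced before the first inflorescence
    ('I'), or None if the shoot has not flowered."""
    if "I" not in nodes:
        return None
    return nodes[: nodes.index("I")].count("L")


def sympodial_indices(nodes):
    """Leaf-node count of each sympodium, i.e. of each shoot section lying
    between two successive inflorescences, listed from base to tip."""
    positions = [i for i, node in enumerate(nodes) if node == "I"]
    return [
        nodes[a + 1 : b].count("L")
        for a, b in zip(positions, positions[1:])
    ]


if __name__ == "__main__":
    # Hypothetical shoot: 8 leaves, then sympodia of 3, 2 and 1 leaf nodes.
    shoot = "LLLLLLLLILLLILLILI"
    print(leaves_before_first_inflorescence(shoot))
    print(sympodial_indices(shoot))
```

In this hypothetical record the successive sympodia shorten from base to tip, which is the kind of progressive reduction in vegetative nodes between inflorescences described above for a mutant sp gene.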

The term "earliness" refers hereinafter to early flowering and/or rapid transition from the vegetative to reproductive stages, or reduced 'time to initiation of flowering' and more generally to earlier completion of the life-cycle.

The term 'reduced flowering time' as used herein refers to time to production of first inflorescence. Such a trait can be evaluated or measured, for example, with reference to the number of leaves produced prior to appearance of the first inflorescence.

The term 'harvest index' can be herein defined as the total yield per plant weight.

The term 'day length' or 'day length sensitivity' as used in the context of the present invention generally refers to photoperiodism, which is the physiological reaction of organisms to the length of day or night. Photoperiodism can also be defined as the developmental responses of plants to the relative lengths of light and dark periods. Plants are classified under three groups according to the photoperiods: short-day plants, long-day plants, and day-neutral plants. Photoperiodism affects flowering by inducing the shoot to produce floral buds instead of leaves and lateral buds. It is within the scope of the present invention that Cocoa is included within the short-day facultative plants. The Cocoa plants of the present invention are genetically modified so as to exhibit loss of day-length sensitivity, which is a highly desirable agronomic trait enabling enhanced yield of the cultivated crop.

The term 'determinate' or 'determinate growth' as used herein refers to plant growth in which the main stem ends in an inflorescence or other reproductive structure (e.g. a bud) and stops elongating, with only branches from the main stem having further and similarly restricted growth. It also refers to growth characterized by sequential flowering from the central or uppermost bud to the lateral or basal buds. It further means naturally self-limited growth, resulting in a plant of a definite maximum size.

The term 'indeterminate' or 'indeterminate growth' as used herein refers to plant growth in which the main stem continues to elongate indefinitely without being limited by a terminal inflorescence or other reproductive structure. It also refers to growth characterized by sequential flowering from the lateral or basal buds to the central or uppermost buds.

It is within the scope of the present invention that 'yield related traits' comprise one or more of early flowering time, yield, biomass, seed yield, early vigour, greenness index, increased growth rate and improved agronomic traits (such as improved plant architecture, improved domestication i.e. determinate growth habit, and modulated metabolite expression or enhanced nutritional quality or value).

The term "yield" in general means a measurable produce of economic value, typically related to a specified crop, to an area, and to a period of time. Individual plant parts directly contribute to yield based on their number, size and/or weight, or the actual yield is the yield per square meter for a crop and year, which is determined by dividing total production (includes both harvested and appraised production) by planted square meters. The terms "yield" of a plant and "plant yield" are used interchangeably herein and are meant to refer to vegetative biomass such as root and/or shoot biomass, to reproductive organs, and/or to propagules such as seeds of that plant.

The terms "increase", "improve" or "enhance" are interchangeable and shall mean in the sense of the application at least a 3%, 4%, 5%, 6%, 7%, 8%, 9% or 10%, preferably at least 15% or 20%, more preferably 25%, 30%, 35% or 40% more yield, growth or any other agronomic trait such as domestication trait and/or plant architecture, and metabolite expression or nutritional quality or value trait such as Caffeine and/or Theobromine value, in comparison to control plants as defined herein.

Increased seed yield may be defined as one or more of the following:(a) an increase in seed biomass (total seed weight) which may be on an individual seed basis and/or per plant and/or per square meter; (b) increased number of flowers per plant; (c) increased number of seeds; and (d) increased harvest index, which is expressed as a ratio of the yield of harvestable parts, such as seeds, divided by the biomass of aboveground plant parts. An increase in seed yield may also be manifested as an increase in seed size and/or seed volume.
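By way of non-limiting illustration only, the harvest index ratio given in item (d) above, and a percentage increase relative to a control plant, can be worked through as follows. The function names and all numerical values are hypothetical and are chosen solely to show the arithmetic; they are not results of the disclosure.

```python
def harvest_index(harvestable_yield, aboveground_biomass):
    """Yield of harvestable parts (e.g. seeds) divided by the biomass of
    the aboveground plant parts, both in the same unit of weight."""
    if aboveground_biomass <= 0:
        raise ValueError("aboveground biomass must be positive")
    return harvestable_yield / aboveground_biomass


def percent_increase(modified_value, control_value):
    """Relative increase of a trait in a modified plant over its control,
    expressed as a percentage of the control value."""
    if control_value == 0:
        raise ValueError("control value must be non-zero")
    return 100.0 * (modified_value - control_value) / control_value


if __name__ == "__main__":
    # Hypothetical weights in kilograms per plant.
    control_hi = harvest_index(30.0, 120.0)
    modified_hi = harvest_index(40.0, 100.0)
    print(control_hi, modified_hi)
    print(percent_increase(modified_hi, control_hi))
```

A result from `percent_increase` can then be compared against whichever threshold (for example 3%, 10% or 25%) is relevant to the sense of "increase" being applied.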

The term "biomass" as used herein is intended to refer to the total weight of a plant. Within the definition of biomass, a distinction may be made between the biomass of one or more parts of a plant, which may include: aboveground (harvestable) parts such as but not limited to shoot biomass, seed biomass, leaf biomass, etc., and/or (harvestable) parts below ground, such as but not limited to root biomass, etc., and/or vegetative biomass such as root biomass, shoot biomass, etc., and/or reproductive organs, and/or propagules such as seed.

Control plant(s) within the scope of the present invention include corresponding wild type plants or corresponding naturally occurring plants or corresponding plants lacking the edited or mutated gene of interest or the specific generated mutation. The choice of suitable control plants is a routine part of an experimental setup and may include corresponding wild type plants or corresponding plants without the gene of interest. The control plant is typically of the same plant species or the same genetic background or even of the same variety as the plant to be assessed. The control plant of the plant to be assessed may also be plant individuals missing the transgene or modified/edited gene. A "control plant" or a "wild type" plant as used herein refers not only to whole plants, but also to plant parts, including cells, seeds and seed parts.

The term "orthologue" as used herein refers hereinafter to one of two or more homologous gene sequences found in different species.

The term "functional variant" or "functional variant of a nucleic acid or amino acid sequence" as used herein refers to a variant of a sequence or part or fragment of a sequence which retains the biological function of the full non-variant allele (e.g. Cocoa SP genes and Cocoa Caffeine and/or Theobromine synthase gene allele) and hence has the activity of the expressed gene or protein. A functional variant also comprises a variant of the gene of interest encoding a polypeptide which has sequence alterations that do not affect function of the resulting protein, for example, in non-conserved residues. Also encompassed is a variant that is substantially identical, i.e. has only some sequence variations, for example, in non-conserved residues, to the wild type nucleic acid or amino acid sequences of the alleles as shown herein, and is biologically active or has the same biological activity (e.g. functional or nonfunctional).

The term "variety" or "cultivar" used herein means a group of similar plants that by structural features and performance can be identified from other varieties within the same species.

The term "allele" used herein means any of one or more alternative or variant forms of a gene or a genetic unit at a particular locus, all of which alleles relate to one trait or characteristic at a specific locus. In a diploid cell of an organism, alleles of a given gene are located at a specific location, or locus (loci plural) on a chromosome. Alternative or variant forms of alleles may be the result of single nucleotide polymorphisms, insertions, inversions, translocations or deletions, gene editing, or the consequence of gene regulation caused by, for example, chemical or structural modification, transcription regulation or post-translational modification/regulation. An allele associated with a qualitative trait may comprise alternative or variant forms of various genetic units including those that are identical or associated with a single gene or multiple genes or their products or even a gene disrupting or controlled by a genetic factor contributing to the phenotype represented by the locus. According to further embodiments, the term "allele" designates any of one or more alternative forms of a gene at a particular locus. Heterozygous alleles are two different alleles at the same locus. Homozygous alleles are two identical alleles at a particular locus. A wild type (or control) allele is a naturally occurring allele. In the context of the current invention, the term allele refers to the 6 identified SP Cocoa genes and 10 Cocoa Caffeine and/or Theobromine synthase genes, comprising the genomic nucleotide sequence as set forth in SEQ ID NO:1, SEQ ID NO:142, SEQ ID NO:276, SEQ ID NO:411, SEQ ID NO:676 and SEQ ID NO:901 for Cocoa (Theobroma cacao) SP genes; SEQ ID NO:1004, SEQ ID NO:1187, SEQ ID NO:1342 and SEQ ID NO:1500 for Cocoa (Theobroma cacao) Caffeine synthase genes; and SEQ ID NO:1663, SEQ ID NO:1858, SEQ ID NO:2065, SEQ ID NO:2207, SEQ ID NO:2379 and SEQ ID NO:2533 for Cocoa (Theobroma cacao) Theobromine synthase genes.

As used herein, the term "locus" (loci plural) means a specific place or places or region or a site on a chromosome where for example a gene or genetic marker element or factor is found. In specific embodiments, such a genetic element is contributing to a trait.

As used herein, the term "homozygous" refers to a genetic condition or configuration existing when two identical or like alleles reside at a specific locus, but are positioned individually on corresponding pairs of homologous chromosomes in the cell of a diploid organism.

In specific embodiments, the Cocoa plants of the present invention comprise homozygous configuration of at least one of the mutated Cocoa SP genes and Cocoa Caffeine and/or Theobromine synthase genes. Conversely, as used herein, the term "heterozygous" means a genetic condition or configuration existing when two different or unlike alleles reside at a specific locus, but are positioned individually on corresponding pairs of homologous chromosomes in the cell of a diploid organism.
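By way of non-limiting illustration only, the homozygous and heterozygous configurations defined above can be expressed as a classification of the two alleles found at one locus of a diploid plant. The function name, the returned labels and the short allele strings are hypothetical and serve only to illustrate the definitions.

```python
def classify_locus(allele_1, allele_2, wild_type):
    """Classify a diploid locus from its two alleles.

    Two identical alleles are homozygous (mutant or wild type, depending
    on whether they differ from `wild_type`); two different alleles are
    heterozygous.
    """
    if allele_1 == allele_2:
        if allele_1 == wild_type:
            return "homozygous wild type"
        return "homozygous mutant"
    return "heterozygous"


if __name__ == "__main__":
    # Hypothetical alleles: the mutant carries a one-base deletion.
    wild_type = "ACGTTGCA"
    mutant = "ACGTGCA"
    print(classify_locus(mutant, mutant, wild_type))
    print(classify_locus(wild_type, mutant, wild_type))
```

In this sketch two different mutant alleles at the same locus are reported as heterozygous, in line with the definition above; whether such a biallelic configuration is treated as equivalent to a homozygous mutation is a separate question that the sketch does not address.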

As used herein, the phrase "genetic marker" or "molecular marker" or "biomarker" refers to a feature in an individual's genome e.g., a nucleotide or a polynucleotide sequence that is associated with one or more loci or trait of interest. In some embodiments, a genetic marker is polymorphic in a population of interest, or the locus occupied by the polymorphism, depending on context. Genetic markers or molecular markers include, for example, single nucleotide polymorphisms (SNPs), indels (i.e. insertions/deletions), simple sequence repeats (SSRs), restriction fragment length polymorphisms (RFLPs), random amplified polymorphic DNAs (RAPDs), cleaved amplified polymorphic sequence (CAPS) markers, Diversity Arrays Technology (DArT) markers, and amplified fragment length polymorphisms (AFLPs) or combinations thereof, among many other examples such as the DNA sequence per se. Genetic markers can, for example, be used to locate genetic loci containing alleles on a chromosome that contribute to variability of phenotypic traits. The phrase "genetic marker" or "molecular marker" or "biomarker" can also refer to a polynucleotide sequence complementary or corresponding to a genomic sequence, such as a sequence of a nucleic acid used as a probe or primer.

As used herein, the term "germplasm" refers to the totality of the genotypes of a population or other group of individuals (e.g., a species). The term "germplasm" can also refer to plant material; e.g., a group of plants that act as a repository for various alleles. Such germplasm genotypes or populations include plant materials of proven genetic superiority; e.g., for a given environment or geographical area, and plant materials of unknown or unproven genetic value; that are not part of an established breeding population and that do not have a known relationship to a member of the established breeding population.

The terms "hybrid", "hybrid plant" and "hybrid progeny" used herein refer to an individual produced from genetically different parents (e.g., a genetically heterozygous or mostly heterozygous individual).

As used herein, "sequence identity" or "identity" in the context of two nucleic acid or polypeptide sequences makes reference to the residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window. When percentage of sequence identity is used in reference to proteins, it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. The term further refers hereinafter to the number of characters which match exactly between two different sequences. Hereby, gaps are not counted and the measurement is relative to the shorter of the two sequences.
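By way of non-limiting illustration only, the character-matching sense of identity described above (exact matches, gaps not counted, measured relative to the shorter sequence) can be sketched for two sequences that have already been aligned. The function name, the use of '-' as the gap character and the example alignment are assumptions of this sketch; producing the alignment itself is left to an alignment tool and is not shown.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of exactly matching, non-gap positions between two
    pre-aligned sequences, relative to the shorter ungapped sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count columns where both sequences carry the same residue (not a gap).
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    shorter = min(
        len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, ""))
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


if __name__ == "__main__":
    # Hypothetical alignment: 6 matches over a shorter sequence of 7 residues.
    identity = percent_identity("ACGTACGT", "ACG-ACCT")
    print(round(identity, 1), identity >= 75.0)
```

Other conventions exist, for example dividing by the alignment length or by the length of the longer sequence, and they can give a different percentage for the same pair of sequences.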

It is further within the scope that the terms "similarity" and "identity" additionally refer to local homology, identifying domains that are homologous or similar (in nucleotide and/or amino acid sequence). It is acknowledged that bioinformatics tools such as BLAST, SSEARCH, FASTA, and HMMER calculate local sequence alignments which identify the most similar region between two sequences. For domains that are found in different sequence contexts in different proteins, the alignment should be limited to the homologous domain, since the domain homology is providing the sequence similarity captured in the score. According to some aspects the term similarity or identity further includes a sequence motif, which is a nucleotide or amino-acid sequence pattern that is widespread and has, or is conjectured to have, a biological significance. Proteins may have a sequence motif and/or a structural motif, a motif formed by the three-dimensional arrangement of amino acids which may not be adjacent.

As used herein, the terms "nucleic acid", "nucleic acid sequence", "nucleotide", "nucleic acid molecule" or "polynucleotide" are intended to include DNA molecules (e.g., cDNA or genomic DNA), RNA molecules (e.g., mRNA), naturally occurring, mutated, synthetic DNA or RNA molecules, and analogs of the DNA or RNA generated using nucleotide analogs. It can be single-stranded or double-stranded. Such nucleic acids or polynucleotides include, but are not limited to, coding sequences of structural genes, anti-sense sequences, and non-coding regulatory sequences that do not encode mRNAs or protein products. These terms also encompass a gene. The term "gene", "allele" or "gene sequence" is used broadly to refer to a DNA nucleic acid associated with a biological function. Thus, genes may include introns and exons as in the genomic sequence, or may comprise only a coding sequence as in cDNAs, and/or may include cDNAs in combination with regulatory sequences. Thus, according to the various aspects of the invention, genomic DNA, cDNA or coding DNA may be used. In one embodiment, the nucleic acid is cDNA or coding DNA.

The terms "peptide", "polypeptide" and "protein" are used interchangeably herein and refer to amino acids in a polymeric form of any length, linked together by peptide bonds.

According to other aspects of the invention, a "modified" or a "mutant" plant is a plant that has been altered compared to the naturally occurring wild type (WT) or control plant. Specifically, the endogenous nucleic acid sequences of each of the Cocoa (Theobroma cacao) SP homologs (nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1, SEQ ID NO:142, SEQ ID NO:276, SEQ ID NO:411, SEQ ID NO:676 and SEQ ID NO:901), Caffeine synthase homologs (nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1004, SEQ ID NO:1187, SEQ ID NO:1342 and SEQ ID NO:1500), and Theobromine synthase homologs (nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1663, SEQ ID NO:1858, SEQ ID NO:2065, SEQ ID NO:2207, SEQ ID NO:2379 and SEQ ID NO:2533) have been altered compared to wild type sequences using mutagenesis and/or genome editing methods as described herein. This causes inactivation of the endogenous Cocoa SP and/or Caffeine and/or Theobromine synthase gene and thus disables their expression and/or function.

According to one embodiment, Cocoa plants with the at least one mutated sp gene have an altered phenotype and show improved domestication traits such as determinate plant architecture, synchronous and/or early flowering and loss of day length sensitivity compared to wild type or control plants (lacking the at least one mutated gene).

According to another embodiment, the Cocoa plants with the at least one mutated Caffeine and/or Theobromine synthase gene show lower levels of Caffeine and/or Theobromine, respectively. Therefore, the improved domestication phenotype (including modulated metabolite level or expression or nutritional value) is conferred by the presence of at least one mutated endogenous SP and/or Caffeine and/or Theobromine synthase gene, respectively, in the Cocoa (Theobroma cacao) plant genome, which has been specifically targeted using a genome editing technique. According to further aspects of the present invention, the at least one improved domestication trait (including modulated metabolite level or expression or nutritional value) is not conferred by the presence of transgenes expressed in Cocoa.

It is further within the scope of the current invention that mutations that down-regulate or disrupt functional expression of the wild-type SP sequence and/or Cocoa Caffeine and/or Theobromine gene sequence, may be recessive, such that they are complemented by expression of a wild-type sequence.

It is further noted that a wild type or control Cocoa plant is a plant that does not have any mutant sp and/or Cocoa Caffeine and/or Theobromine alleles.

Main aspects of the invention involve targeted mutagenesis methods, specifically genome editing, and exclude embodiments that are solely based on generating plants by traditional breeding methods. In a further embodiment of the current invention, as explained herein, the at least one improved domestication and/or modulated metabolite level or expression or nutritional value trait is not due to the presence of a transgene.

The inventors have generated mutant Cocoa lines with mutations inactivating at least one Cocoa SP and/or Caffeine and/or Theobromine synthase gene homoeoallele which confer heritable improved domestication trait(s). In this way no functional Cocoa SP and/or Caffeine and/or Theobromine synthase gene protein is made. Thus, the invention relates to these mutant Cocoa lines and related methods.

It is further within the scope of the present invention that breeding Cocoa cultivars with mutated sp allele enables the mechanical harvest of the plant. According to a further aspect of the present invention, loss of SP function results in compact Cocoa plants with reduced height, reduced number of sympodial units and determinate growth when compared with WT Cocoa.

According to a main aspect of the present invention, modifying Cocoa shoot architecture by selection for mutations in florigen flowering pathway genes allowed major improvements in plant architecture and yield. In particular, a mutation in the antiflorigen SELF PRUNING (SP) gene (sp classic) provided compact 'determinate' growth that translated to a burst of flowers, thereby enabling large-scale field production.

The work inter alia described has important implications. The results have shown that CRISPR/Cas9 can be used to create heritable mutations in florigen pathway family members that result in desirable phenotypic effects.

To edit multiple domestication genes simultaneously and stack the resulting allelic variants, one option is to assemble several gRNAs, targeting several genes, into one construct by using the Csy4 multi-gRNA system. The construct is then transformed via an appropriate vector into several Cocoa accessions.

It is further within the scope of the current invention that Cocoa SP genes having genomic nucleotide sequence as set forth in SEQ ID NO:1, SEQ ID NO:142, SEQ ID NO:276, SEQ ID NO:411, SEQ ID NO:676 and SEQ ID NO:901, Cocoa Caffeine synthase genes having genomic nucleotide sequence as set forth in SEQ ID NO:1004, SEQ ID NO:1187, SEQ ID NO:1342 and SEQ ID NO:1500, and Cocoa Theobromine synthase genes having genomic nucleotide sequence as set forth in SEQ ID NO: 1663, SEQ ID NO:1858, SEQ ID NO: 2065, SEQ ID NO:2207, SEQ ID NO:2379 and SEQ ID NO:2533 have been identified. Several mutated alleles of these genes have been generated by the current invention, using targeted genome editing. Notably, the plants with mutated sp alleles were more compact than the wild type plants lacking the mutated allele.

The loss of function mutation may be a deletion or insertion ("indels") with reference to the wild type Cocoa SP genes and Cocoa Caffeine and/or Theobromine synthase gene allele sequence. The deletion may comprise 1-20 or more nucleotides, for example 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 or 20 nucleotides or more in one or more strand. The insertion may comprise 1-20 or more nucleotides, for example 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 or 20 or more nucleotides in one or more strand.

The plant of the invention includes plants wherein the plant is heterozygous for each of the mutations. In a preferred embodiment, however, the plant is homozygous for the mutations. Progeny that are also homozygous can be generated from these plants according to methods known in the art.

It is further within the scope that variants of a particular Cocoa SP and/or Caffeine and/or Theobromine synthase gene nucleotide or amino acid sequence according to the various aspects of the invention will have at least about 50%-99%, for example at least 75%, for example at least 85%, 86%, 87%, 88%, 89%, 90%, 92%, 94%, 95%, 96%, 97%, 98% or 99% or more sequence identity to that particular non-variant Cocoa SP and/or Caffeine and/or Theobromine synthase gene sequence as follows:

(a) Cocoa (Theobroma cacao) SELF PRUNING (SP) gene comprising a genomic nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1, SEQ ID NO:142, SEQ ID NO:276, SEQ ID NO:411, SEQ ID NO:676 and SEQ ID NO:901, and encoding a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:2, SEQ ID NO:143, SEQ ID NO:277, SEQ ID NO:412, SEQ ID NO:677 and SEQ ID NO:902, respectively, or a functional variant thereof;

(b) Cocoa (Theobroma cacao) Caffeine synthase gene comprising a genomic nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1004, SEQ ID NO:1187, SEQ ID NO:1342 and SEQ ID NO:1500, and encoding a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:1005, SEQ ID NO:1188, SEQ ID NO:1343 and SEQ ID NO:1501, respectively, or a functional variant thereof; and

(c) Cocoa (Theobroma cacao) Theobromine synthase gene comprising a nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO: 1663, SEQ ID NO:1858, SEQ ID NO:2065, SEQ ID NO:2207, SEQ ID NO:2379 and SEQ ID NO:2533, and encoding a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:1664, SEQ ID NO:1859, SEQ ID NO:2066, SEQ ID NO:2208, SEQ ID NO:2380 and SEQ ID NO:2534, respectively, or a functional variant thereof.
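
By way of non-limiting illustration, a percent identity value such as the 75% threshold recited in items (a) to (c) may be computed from a pairwise alignment along the following lines. This is a minimal sketch under stated assumptions: the two inputs are already aligned (equal length, gaps written as "-"), and identity is taken as identical columns divided by total alignment columns. Other tools may use a different denominator, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both rows.

    Assumes pre-aligned, equal-length strings with '-' marking gaps.
    Gap columns count toward the denominator but never as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold=75.0):
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, under these assumptions an 8-column alignment with a single mismatch has 87.5% identity and therefore satisfies an "at least 75% identity" condition, whereas a 4-column alignment with two mismatches (50%) does not.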

Also, the various aspects of the invention encompass not only Cocoa SP genes and Cocoa Caffeine and/or Theobromine synthase genes or any other aforementioned nucleic acid sequence or amino acid sequence, but also fragments thereof. By "fragment" is intended a portion of the nucleotide sequence or a portion of the amino acid sequence and hence of the protein encoded thereby. Fragments of a nucleotide sequence may encode protein fragments that retain the biological activity of the native protein.

According to a further embodiment of the invention, the herein newly identified Cocoa SP genes and Cocoa Caffeine and/or Theobromine synthase genes have been targeted using the double sgRNA strategy. According to further embodiments of the present invention, DNA introduction into the plant cells can be done by Agrobacterium infiltration, virus based plasmids for delivery of the genome editing molecules and mechanical insertion of DNA (PEG mediated DNA transformation, biolistics, etc.).

In addition, it is within the scope of the present invention that the Cas9 protein is directly inserted together with a gRNA (as a ribonucleoprotein complex, RNP) in order to bypass the need for in vivo transcription and translation of the Cas9+gRNA plasmid in planta to achieve gene editing.

It is also possible to create a genome edited plant and use it as a rootstock. Then, the Cas protein and gRNA can be transported via the vasculature system to the top of the plant and create the genome editing event in the scion.

It is within the scope of the present invention that the usage of the CRISPR/Cas system for the generation of Cocoa plants with at least one improved domestication and/or modulated metabolite level or expression or nutritional quality trait allows the modification of predetermined specific DNA sequences without introducing foreign DNA into the genome by GMO techniques. According to one embodiment of the present invention, this is achieved by combining the Cas nuclease (e.g. Cas9, Cpf1 and the like) with a predefined guide RNA molecule (gRNA). The gRNA is complementary to a specific DNA sequence targeted for editing in the plant genome and guides the Cas nuclease to that specific nucleotide sequence. The predefined gene specific gRNAs are cloned into the same plasmid as the Cas gene and this plasmid is inserted into plant cells. Insertion of the aforementioned plasmid DNA can be done using, but is not limited to, different delivery systems, biological and/or mechanical, e.g. Agrobacterium infiltration, virus based plasmids for delivery of the genome editing molecules and mechanical insertion of DNA (PEG mediated DNA transformation, biolistics, etc.).

It is further within the scope of the present invention that upon reaching the specific predetermined DNA sequence, the Cas9 nuclease cleaves both DNA strands to create double stranded breaks leaving blunt ends. This cleavage site is then repaired by the cellular non homologous end joining DNA repair mechanism resulting in insertions or deletions which eventually create a mutation at the cleavage site. For example, it is acknowledged that a deletion form of the mutation consists of at least 1 base pair deletion. As a result of this base pair deletion the gene coding sequence is disrupted and the translation of the encoded protein is compromised either by a premature stop codon or disruption of a functional or structural property of the protein. Thus DNA is cut by the Cas9 protein and re-assembled by the cell's DNA repair mechanism.
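
By way of non-limiting illustration, the effect of a single base pair deletion on the coding sequence may be sketched as follows. The coding sequence used here is a made-up toy sequence and not any of the sequences of the invention; the function names are hypothetical. The sketch only shows how a deletion whose length is not a multiple of three shifts the reading frame and may expose a premature stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def delete_at_cut(cds, cut_index, deletion_len):
    """Remove deletion_len bases starting at cut_index (toy repair outcome)."""
    return cds[:cut_index] + cds[cut_index + deletion_len:]


def is_frameshift(indel_len):
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_len % 3 != 0


def first_stop_codon(cds):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Hypothetical toy coding sequence: ATG GCT TTG ACC GGT AAA TAA
WILD_TYPE = "ATGGCTTTGACCGGTAAATAA"
# One base pair deleted at index 4 gives: ATG GTT TGA ...
MUTANT = delete_at_cut(WILD_TYPE, cut_index=4, deletion_len=1)
```

In this toy case the wild type reads through to its normal stop at codon index 6, whereas the 1 bp deletion brings a TGA into frame at codon index 2, truncating the encoded product.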

It is further within the scope that improved domestication and/or modulated metabolite level or expression or nutritional-related traits in Cocoa plants are herein produced by generating gRNA with homology to a specific site of predetermined genes in the Cocoa genome, i.e. SP gene and/or Cocoa Caffeine and/or Theobromine synthase genes, subcloning this gRNA into a plasmid containing the Cas9 gene, and inserting the plasmid into the Cocoa plant cells. In this way site specific mutations in the SP and/or Caffeine and/or Theobromine synthase genes are generated, thus effectively creating non-active molecules, resulting in determinate growth habit and/or reduced Caffeine and/or Theobromine levels of the genome edited plant.

In order to understand the invention and to see how it may be implemented in practice, a plurality of preferred embodiments will now be described, by way of non-limiting example only, with reference to the following examples.

EXAMPLE 1

Production of Cocoa plants with improved domestication trait by targeted genome editing

Production of Cocoa lines with mutated sp and/or Caffeine synthase and/or Theobromine synthase gene may be achieved by at least one of the following breeding/cultivation schemes:

Scheme 1:

• Line stabilization by self pollination

• Generation of F6 parental lines

• Genome editing of parental lines

• Crossing edited parental lines to generate an F1 hybrid plant

Scheme 2:

• Identifying genes/alleles of interest

• Designing gRNA

• Transformation of plants with Cas9+gRNA constructs

• Screening and identifying editing events

• Genome editing of parental lines

It is noted that line stabilization may be performed by the following:

• Induction of male flowering on plants

• Self pollination

According to some embodiments of the present invention, line stabilization requires about 6 rounds of self-crossing (6 generations) and is done through a single seed descent (SSD) approach.

F1 hybrid seed production: Novel hybrids are produced by crosses between different Cocoa strains. According to a further aspect of the current invention, shortening line stabilization is performed by Doubled Haploids (DH). More specifically, the CRISPR-Cas9 system is transformed into microspores to achieve DH homozygous parental lines. A doubled haploid (DH) is a genotype formed when haploid cells undergo chromosome doubling. Artificial production of doubled haploids is important in plant breeding. It is herein acknowledged that conventional inbreeding procedures take about six generations to achieve approximately complete homozygosity, whereas doubled haploidy achieves it in one generation.
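
By way of non-limiting illustration, the statement that about six generations of inbreeding give approximately complete homozygosity may be checked with the following minimal sketch. It assumes a simplified model that is not part of the disclosure: a single locus that starts fully heterozygous, strict self pollination, Mendelian segregation and no selection, so that the heterozygous fraction halves in every generation.

```python
def expected_homozygosity(generations):
    """Expected homozygous fraction at one locus after n selfing generations.

    Simplified model: the locus starts heterozygous and each round of
    self pollination halves the remaining heterozygous fraction.
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return 1.0 - 0.5 ** generations


if __name__ == "__main__":
    for n in range(0, 7):
        print(n, f"{expected_homozygosity(n):.4f}")
```

Under this model six generations of selfing give an expected homozygosity of 1 - 1/64, i.e. about 98.4%, which is consistent with "approximately complete" homozygosity, while a doubled haploid is homozygous at every locus after a single step.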

It is within the scope of the current invention that genetic markers specific for Cocoa are developed and provided by the current invention:

• Genotyping markers: germplasm used in the current invention is genotyped using molecular markers, in order to allow a more efficient breeding process and identification of the SP and/or Caffeine synthase and/or Theobromine synthase gene editing event.

It is further within the scope of the current invention that allele and genetic variation is analyzed for the Cocoa strains used.

Reference is now made to optional stages that have been used for the production of mutated SP and/or Caffeine synthase and/or Theobromine synthase Cocoa plants by genome editing:

Stage 1: Identifying Cocoa SP, and/or Cocoa Caffeine and/or Theobromine synthase genes.

Six SP gene orthologues have herein been identified in Cocoa. These homologous genes have been sequenced and mapped. These Cocoa (Theobroma cacao) SELF PRUNING (SP) genes comprise a genomic nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1, SEQ ID NO:142, SEQ ID NO:276, SEQ ID NO:411, SEQ ID NO:676 and SEQ ID NO:901, and encode a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:2, SEQ ID NO:143, SEQ ID NO:277, SEQ ID NO:412, SEQ ID NO:677 and SEQ ID NO:902, respectively, or a functional variant thereof.

Four Caffeine synthase gene orthologues have herein been identified in Cocoa. These homologous genes have been sequenced and mapped. These Cocoa (Theobroma cacao) Caffeine synthase genes comprise a genomic nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1004, SEQ ID NO:1187, SEQ ID NO:1342 and SEQ ID NO:1500, and encode a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:1005, SEQ ID NO:1188, SEQ ID NO:1343 and SEQ ID NO:1501, respectively, or a functional variant thereof.

Six Theobromine synthase gene orthologues have herein been identified in Cocoa. These homologous genes have been sequenced and mapped. These Cocoa (Theobroma cacao) Theobromine synthase genes comprise a nucleic acid sequence with at least 75% identity to a sequence selected from SEQ ID NO:1663, SEQ ID NO:1858, SEQ ID NO:2065, SEQ ID NO:2207, SEQ ID NO:2379 and SEQ ID NO:2533, and encode a polypeptide sequence with at least 75% identity to a sequence selected from SEQ ID NO:1664, SEQ ID NO:1859, SEQ ID NO:2066, SEQ ID NO:2208, SEQ ID NO:2380 and SEQ ID NO:2534, respectively, or a functional variant thereof.

Stage 2: Designing and synthesizing gRNA molecules corresponding to the sequence targeted for editing, i.e. sequences of each of the genes Cocoa SP genes and Cocoa Caffeine and/or Theobromine synthase genes. It is noted that the editing event is preferably targeted to a unique restriction site sequence to allow easier screening for plants carrying an editing event within their genome. According to some aspects of the invention, the nucleotide sequence of the gRNAs should be completely compatible with the genomic sequence of the target gene. Therefore, for example, suitable gRNA molecules should be constructed for different SP and/or Caffeine synthase and/or Theobromine synthase gene homologues of different Cocoa strains.

Reference is now made to Tables 1-3 presenting a summary of the sequences within the scope of the current invention, including gRNA molecules targeted for silencing Cocoa SP genes and Cocoa Caffeine and/or Theobromine synthase genes. The term 'PAM' refers hereinafter to Protospacer Adjacent Motif, which is a 2-6 base pair DNA sequence immediately following the DNA sequence targeted by the Cas9 nuclease in the CRISPR bacterial adaptive immune system.
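
By way of non-limiting illustration, candidate target sites followed by a PAM may be located in a sequence along the following lines. The sketch rests on assumptions not stated in the text above: a 20-nucleotide protospacer, the NGG PAM and a cut position 3 bp upstream of the PAM as commonly described for Streptococcus pyogenes Cas9, and a search of the forward strand only. The function name and the test sequence are hypothetical and are not the gRNA sequences of Tables 1-3.

```python
import re


def find_protospacers(seq, spacer_len=20):
    """List (start, protospacer, pam, cut_index) for forward-strand NGG sites.

    Assumptions: SpCas9-style NGG PAM directly 3' of the protospacer and a
    blunt cut 3 bp upstream of the PAM. cut_index is the 0-based index of
    the first base to the right of the cut.
    """
    seq = seq.upper()
    # Lookahead keeps the match zero-width so overlapping sites are reported.
    pattern = r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len
    hits = []
    for m in re.finditer(pattern, seq):
        start = m.start()
        hits.append((start, m.group(1), m.group(2), start + spacer_len - 3))
    return hits
```

A reverse-strand search would apply the same function to the reverse complement of the sequence; that step is omitted here to keep the sketch short.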

Table 1: Summary of Cocoa (Theobroma cacao) SELF PRUNING (SP) sequences within the scope of the present invention

Table 2: Summary of Cocoa (Theobroma cacao) Caffeine synthase sequences within the scope of the present invention

Table 3: Summary of Cocoa (Theobroma cacao) Theobromine synthase sequences within the scope of the present invention

The above gRNA molecules have been cloned into suitable vectors and their sequence has been verified. In addition different Cas9 versions have been analyzed for optimal compatibility between the Cas9 protein activity and the gRNA molecule in the Cocoa plant.

The efficiency of the designed gRNA molecules has been validated by transiently transforming Cocoa tissue culture. A plasmid carrying a gRNA sequence together with the Cas9 gene has been transformed into Cocoa protoplasts. The protoplast cells have been grown for a short period of time and then analyzed for the existence of genome editing events. The positive constructs have been subjected to the herein established stable transformation protocol into Cocoa plant tissue for producing Cocoa plants genome edited in SP genes and/or Caffeine and/or Theobromine synthase genes.

Stage 3: Transforming Cocoa plants using Agrobacterium or biolistics (gene gun) methods. For Agrobacterium and biolistics, a DNA plasmid carrying (Cas9 + gene specific gRNA) can be used. A vector containing a selection marker, Cas9 gene and relevant gene specific gRNAs is constructed. For biolistics, Ribonucleoprotein (RNP) complexes carrying (Cas9 protein + gene specific gRNA) are used. RNP complexes are created by mixing the Cas9 protein with relevant gene specific gRNAs. According to some embodiments of the present invention, transformation of various Cocoa tissues was performed using particle bombardment of:

• DNA vectors

• Ribonucleoprotein complexes (RNPs)

According to further embodiments of the present invention, transformation of various Cocoa tissues was performed using Agrobacterium (Agrobacterium tumefaciens) by:

• Regeneration-based transformation

• Floral-dip transformation

• Seedling transformation

Transformation efficiency by A. tumefaciens has been compared to the bombardment method by transient GUS transformation experiment. After transformation, GUS staining of the transformants has been performed.

Screening for CRISPR/Cas9 gene editing events has been performed by at least one of the following analysis methods:

• Restriction Fragment Length Polymorphism (RFLP)

• Next Generation Sequencing (NGS)

• PCR fragment analysis

• Fluorescent-tag based screening

• High resolution melting curve analysis (HRMA)
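
By way of non-limiting illustration, the RFLP screening listed above may be sketched as a comparison of predicted digestion fragment lengths for an unedited and an edited amplicon. The amplicon sequences below are made-up toy sequences; the recognition site GAATTC with a cut after the first base (as for EcoRI) is used only as a familiar example and is not stated in the text to be the site used. The function name is hypothetical.

```python
def digest(amplicon, site, cut_offset):
    """Return fragment lengths after cutting at every occurrence of site.

    cut_offset is the number of bases from the start of the recognition
    site to the cut position on the strand given.
    """
    fragments, start = [], 0
    pos = amplicon.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(amplicon[start:cut])
        start = cut
        pos = amplicon.find(site, pos + 1)
    fragments.append(amplicon[start:])
    return [len(f) for f in fragments]


# Toy amplicons: the edited one lost a single base inside the site.
WILD_TYPE_AMPLICON = "ACGTACGT" + "GAATTC" + "TTGGCCAA"
EDITED_AMPLICON = "ACGTACGT" + "GATTC" + "TTGGCCAA"
```

In this toy case the unedited amplicon is cut into two fragments, whereas the edited amplicon has lost the recognition site and remains as a single uncut fragment, which is the difference an RFLP assay visualises on a gel.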

References:

Tingdong Li, Xinping Yang, Yuan Yu, Xiaomin Si, Xiawan Zhai, Huawei Zhang, Wenxia Dong, Caixia Gao & Cao Xu. "Domestication of wild tomato is accelerated by genome editing" Nature Biotechnology 36(2018):1160-1163.

Agustin Zsögön, Tomas Cermak, Emmanuel Rezende Naves, Marcela Morato Notini, Kai H Edel, Stefan Weinl, Luciano Freschi, Daniel F Voytas, Jörg Kudla & Lazaro Eustaquio Pereira Peres. "De novo domestication of wild tomato using genome editing". Nature Biotechnology 36(2018): 1211-1216.

Zachary H. Lemmon, Nathan T. Reem, Justin Dalrymple, Sebastian Soyk, Kerry E. Swartwood, Daniel Rodriguez-Leal, Joyce Van Eck & Zachary B. Lippman. "Rapid improvement of domestication traits in an orphan crop by genome editing". Nature Plants 4(2018): 766-770.

Kabin Xie & Yinong Yang. "RNA-guided genome editing in plants using a CRISPR-Cas system". Molecular Plant 6.6 (2013): 1975-1983.