Title:
REGULATION OF GENE EXPRESSION
Document Type and Number:
WIPO Patent Application WO/2023/152723
Kind Code:
A1
Abstract:
The invention provides a method of producing a host cell, plant cell or plant with increased L-ascorbic acid (AsA) content or increased GDP-L-galactose phosphorylase (GGP) translation, production and/or activity, the method comprising transformation of the host cell or plant cell with a polynucleotide encoding a polypeptide that regulates GDP-L-galactose phosphorylase (GGP). The invention also provides host cells, plant cells and plants, genetically modified to contain and/or express the polynucleotides.

Inventors:
LI DAWEI (CN)
LIU XIAOYING (CN)
ZHONG CAIHONG (CN)
XIE XIAODONG (CN)
BULLEY SEAN MICHAEL WINSLEY (NZ)
Application Number:
PCT/IB2023/051306
Publication Date:
August 17, 2023
Filing Date:
February 14, 2023
Assignee:
NZ INST PLANT & FOOD RES LTD (NZ)
WUHAN BOTANICAL GARDEN (CN)
LI DAWEI (CN)
LIU XIAOYING (CN)
ZHONG CAIHONG (CN)
XIE XIAODONG (CN)
BULLEY SEAN MICHAEL WINSLEY (NZ)
International Classes:
C12N15/82; C12N5/04; C12N9/12
Foreign References:
US20100077503A1 (2010-03-25)
US20140196161A1 (2014-07-10)
US20160264984A1 (2016-09-15)
US20170114359A1 (2017-04-27)
US20200299714A1 (2020-09-24)
US20040010745A1 (2004-01-15)
Other References:
LAING WILLIAM A., MARTÍNEZ-SÁNCHEZ MARCELA, WRIGHT MICHELE A., BULLEY SEAN M., BREWSTER DI, DARE ANDREW P., RASSAM MAYSOON, WANG D: "An Upstream Open Reading Frame Is Essential for Feedback Regulation of Ascorbate Biosynthesis in Arabidopsis", THE PLANT CELL, AMERICAN SOCIETY OF PLANT BIOLOGISTS, US, vol. 27, no. 3, 1 March 2015 (2015-03-01), US , pages 772 - 786, XP093085924, ISSN: 1040-4651, DOI: 10.1105/tpc.114.133777
DATABASE GenBank NCBI; . : "Malus x domestica MYBR domain class transcription factor (MYBR16) mRNA, complete cds", XP093085933
FENECH MARIO, AMAYA IRAIDA, VALPUESTA VICTORIANO, BOTELLA MIGUEL A.: "Vitamin C Content in Fruits: Biosynthesis and Regulation", FRONTIERS IN PLANT SCIENCE, vol. 9, XP093085940, DOI: 10.3389/fpls.2018.02006
CHEN YI-SHIH; CHAO YI-CHI; TSENG TZU-WEI; HUANG CHUN-KAI; LO PEI-CHING; LU CHUNG-AN: "Two MYB-related transcription factors play opposite roles in sugar signaling in Arabidopsis", PLANT MOLECULAR BIOLOGY, SPRINGER, DORDRECHT., NL, vol. 93, no. 3, 19 November 2016 (2016-11-19), NL , pages 299 - 311, XP036153288, ISSN: 0167-4412, DOI: 10.1007/s11103-016-0562-8
LIU XIAOYING, WU RONGMEI, BULLEY SEAN M., ZHONG CAIHONG, LI DAWEI: "‐ascorbic acid biosynthesis by activating transcription of GDP‐L‐galactose phosphorylase 3", NEW PHYTOLOGIST, WILEY-BLACKWELL PUBLISHING LTD., GB, vol. 234, no. 5, 1 June 2022 (2022-06-01), GB , pages 1782 - 1800, XP093085948, ISSN: 0028-646X, DOI: 10.1111/nph.18097
Attorney, Agent or Firm:
AJ PARK (NZ)
Claims:
CLAIMS

1. A method of producing a plant cell or plant with increased L-ascorbic acid (AsA) content, the method comprising transforming a plant cell with a polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 1-102 and 348-356, or a variant of the polypeptide having at least about 70% sequence identity to a polypeptide with the amino acid sequence of any one of SEQ ID NOs: 1-102 and 348-356.

2. A method of producing a plant cell or plant having increased GDP-L-galactose phosphorylase 3 (GGP) translation, production and/or activity, the method comprising transforming a plant cell with a polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 1-102 and 348-356, or a variant of the polypeptide having at least about 70% sequence identity to a polypeptide with the amino acid sequence of any one of SEQ ID NOs: 1-102 and 348-356.

3. A method of producing a plant cell or plant with increased L-ascorbic acid (AsA) content, the method comprising transforming a plant cell with a polynucleotide comprising a nucleotide sequence of any one of SEQ ID NOs: 103-108, or a variant thereof having at least about 70% sequence identity to a nucleotide having the nucleotide sequence of any one of SEQ ID NOs: 103-108.

4. A method of producing a plant cell or plant having increased GDP-L-galactose phosphorylase 3 (GGP) translation, production and/or activity, the method comprising transforming a plant cell with a polynucleotide comprising a nucleotide sequence of any one of SEQ ID NOs: 103-108, or a variant thereof having at least about 70% sequence identity to a nucleotide having the nucleotide sequence of any one of SEQ ID NOs: 103-108.

5. A method of producing a plant cell or plant with increased L-ascorbic acid (AsA) content or increased GDP-L-galactose phosphorylase 3 (GGP) translation, production and/or activity, the method comprising upregulating in the plant cell, or expressing in the plant, a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356, or a variant of the polypeptide having at least about 70% sequence identity to the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356.

6. A method of producing a plant cell or plant with increased L-ascorbic acid (AsA) content or increased GDP-L-galactose phosphorylase 3 (GGP) translation, production and/or activity, the method comprising upregulating in the plant cell, or expressing in the plant, a polynucleotide comprising a nucleotide sequence selected from any one of the sequences SEQ ID NO: 103-108 or a variant having at least about 70% sequence identity to the nucleotide sequence of any one of SEQ ID NO: 103-108.

7. The method of any one of claims 1 to 6, wherein the method further comprises transforming, or co-transforming with the polynucleotide, a second polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 109-209, or a variant of the polypeptide having at least about 70% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 109-209.

8. The method of any one of claims 1 to 4 or 7, wherein the method comprises stably transforming the plant cell with the polynucleotide or variant thereof.

9. The method of any one of claims 1 to 8, wherein the method further comprises transforming, or co-transforming with the polynucleotide, a second polynucleotide having a nucleotide sequence of SEQ ID NO: 210, or a variant thereof having at least about 70% sequence identity to the nucleotide sequence of SEQ ID NO: 210.

10. The method of any one of claims 1 to 9, wherein the plant cell or plant is from a food or biofuel crop.

11. An isolated polynucleotide encoding a polypeptide with the amino acid sequence of any one of SEQ ID NOs: 1-4 and 102, or a variant of the polypeptide having at least about 75% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 1-4 and 102.

12. An isolated polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 109-153, 155-157, 159, 162-164, 166-174, 176, 178, 180-183, 185, 186, 188-190, 192-200, 202-204, and 206-209, or a variant of the polypeptide having at least about 75% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 109-153, 155-157, 159, 162-164, 166-174, 176, 178, 180-183, 185, 186, 188-190, 192-200, 202-204, and 206-209.

13. A genetic construct comprising a polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 1-4 and 102, or a variant of the polypeptide having at least about 75% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 1-4 and 102.

14. A genetic construct comprising a polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 109-153, 155-157, 159, 162-164, 166-174, 176, 178, 180-183, 185, 186, 188-190, 192-200, 202-204, and 206-209, or a variant of the polypeptide having at least about 75% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 109-153, 155-157, 159, 162-164, 166-174, 176, 178, 180-183, 185, 186, 188-190, 192-200, 202-204, and 206-209.

15. A genetic construct comprising a polynucleotide comprising a nucleotide sequence of any one of SEQ ID NOs: 103-108, or a variant thereof having at least about 75% sequence identity to the nucleotide sequence of any one of SEQ ID NOs: 103-108.

16. A genetic construct comprising a polynucleotide comprising a nucleotide sequence of SEQ ID NO: 210, or a variant thereof having at least about 70% sequence identity to the nucleotide sequence of SEQ ID NO: 210.

17. The genetic construct of claim 13 or 15, wherein the construct further comprises a second polynucleotide a) encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 109-209, or a variant of the polypeptide having at least about 75% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 109-209; and/or b) having a nucleotide sequence of SEQ ID NO: 210, or a variant thereof having at least about 70% sequence identity to the nucleotide sequence of SEQ ID NO: 210.

18. A host cell genetically modified to express a polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 1-102 and 348-356, or a variant of the polypeptide having at least about 70% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 1-102 and 348-356.

19. A host cell genetically modified to express a polynucleotide comprising a nucleotide sequence of any one of SEQ ID NOs: 103-108, or a variant thereof having at least about 70% sequence identity to the nucleotide sequence of any one of SEQ ID NOs: 103-108.

20. A plant cell genetically modified to express a polynucleotide encoding a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356 or a variant of the polypeptide having at least 70% sequence identity to a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356.

21. A plant cell genetically modified to express a polynucleotide comprising a nucleotide sequence selected from any one of SEQ ID NO: 103-108 or a variant thereof having at least 70% sequence identity to the nucleotide sequence of any one of SEQ ID NO: 103-108.

22. The host cell or plant cell of any one of claims 18 to 21, wherein the cell is genetically modified to express a second polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 109-209, or a variant of the polypeptide having at least about 70% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 109-209.

23. The host cell or plant cell of any one of claims 18 to 21, wherein the cell is genetically modified to express a second polynucleotide having a nucleotide sequence of SEQ ID NO: 210, or a variant thereof having at least about 70% sequence identity to the nucleotide sequence of SEQ ID NO: 210.

24. The method, isolated polynucleotide, genetic construct, host cell or plant cell of any one of claims 1-11, 13, 15, and 17-23, wherein the polynucleotide encodes a polypeptide with an amino acid sequence of: a) any one of SEQ ID Nos: 1-102 and 348-356; b) any one of SEQ ID Nos: 1-4, 38, 41-45, 48, 49, 51, 52, 56, 60, 63, 66, 79, 80, 85, 89, 90, 92, 96, 97, 100-102, and 348-356; c) any one of SEQ ID Nos: 1-4, 49, 51, 66, 80, 85, 90, 92, 96, 102, and 348-356; d) any one of SEQ ID Nos: 1-4 and 102; e) any one of SEQ ID Nos: 1-4; or f) SEQ ID No: 1 or 102.

25. The method, isolated polynucleotide, genetic construct, host cell or plant cell of any one of claims 1-11, 13, 15, and 17-23, wherein the polynucleotide encodes a variant of the polypeptide having at least about 70, 75, 80, 85, 90, 95, 96, 97, 98, 99 or 100% sequence identity to an amino acid sequence of a) any one of SEQ ID Nos: 1-102 and 348-356; b) any one of SEQ ID Nos: 1-4, 38, 41-45, 48, 49, 51, 52, 56, 60, 63, 66, 79, 80, 85, 89, 90, 92, 96, 97, 100-102, and 348-356; c) any one of SEQ ID Nos: 1-4, 49, 51, 66, 80, 85, 90, 92, 96, 102, and 348-356; d) any one of SEQ ID Nos: 1-4 and 102; e) any one of SEQ ID Nos: 1-4; or f) SEQ ID No: 1 or 102.

26. The method, isolated polynucleotide, genetic construct, host cell or plant cell of any one of claims 1-11, 13, 15, and 17-23, wherein the polynucleotide has a nucleotide sequence of a) any one of SEQ ID NOs: 103-108, b) SEQ ID No: 103 or 105, or c) SEQ ID No: 103.

27. The method, isolated polynucleotide, genetic construct, host cell or plant cell of any one of claims 1-11, 13, 15, and 17-23, wherein the polynucleotide encodes a variant having at least about 70, 75, 80, 85, 90, 95, 96, 97, 98, 99 or 100% sequence identity to a nucleotide sequence of a) any one of SEQ ID NOs: 103-108, b) SEQ ID No: 103 or 105, or c) SEQ ID No: 103.

28. The method, isolated polynucleotide, genetic construct, host cell or plant cell of any one of claims 8, 12, 17, 22 and 24-27, wherein the polynucleotide or second polynucleotide encodes a polypeptide with an amino acid sequence of: a) any one of SEQ ID Nos: 109-209; b) any one of SEQ ID Nos: 109-114, 125, 126, 142, 177, 179, 187, 191 and 205; c) any one of SEQ ID Nos: 109-114 and 179; d) any one of SEQ ID Nos: 109-114; or e) SEQ ID NO: 109.

29. The method, isolated polynucleotide, genetic construct, host cell or plant cell of any one of claims 8, 12, 17, 22 and 24-27, wherein the polynucleotide or second polynucleotide encodes a variant having at least about 70, 75, 80, 85, 90, 95, 96, 97, 98, 99 or 100% sequence identity to an amino acid sequence of: a) any one of SEQ ID Nos: 109-209; b) any one of SEQ ID Nos: 109-114, 125, 126, 142, 177, 179, 187, 191 and 205; c) any one of SEQ ID Nos: 109-114 and 179; d) any one of SEQ ID Nos: 109-114; or e) SEQ ID NO: 109.

30. The method, isolated polynucleotide, genetic construct, host cell or plant cell of any one of claims 8, 12, 17, 22 and 24-27, wherein the polynucleotide or second polynucleotide has a nucleotide sequence of SEQ ID No: 210.

31. The method, isolated polynucleotide, genetic construct, host cell or plant cell of any one of claims 8, 12, 17, 22 and 24-27, wherein the polynucleotide or second polynucleotide encodes a variant having at least about 70, 75, 80, 85, 90, 95, 96, 97, 98, 99 or 100% sequence identity to a nucleotide sequence of SEQ ID No: 210.

32. A kit comprising a first genetic construct of any one of claims 13, 15, 24 and 25; and a second genetic construct of any one of claims 14, 16, 28 and 29.

33. A host cell comprising one or more genetic constructs of any one of claims 13 to 17 and 24 to 31.

34. A plant comprising a plant cell of any one of claims 20, 21 and 24 to 31.

35. A method for selecting a plant with altered GDP-L-galactose phosphorylase 3 (GGP) activity, the method comprising testing a plant for altered expression of a polynucleotide encoding a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356 or a variant of the polypeptide having at least 70% sequence identity to a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356.

36. A method for selecting a plant with altered GDP-L-galactose phosphorylase 3 (GGP) activity, the method comprising testing a plant for altered expression of a polynucleotide comprising a nucleotide sequence selected from any one of the sequences SEQ ID NO: 103-108 or a variant thereof wherein the variant comprises a sequence that has at least 70% sequence identity to the nucleotide sequence of any one of SEQ ID NO: 103-108.

37. A method for selecting a plant with altered L-ascorbic acid (AsA) content, the method comprising testing a plant for altered expression of a polynucleotide encoding a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356 or a variant of the polypeptide having at least 70% sequence identity to a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356.

38. A method for selecting a plant with altered L-ascorbic acid (AsA) content, the method comprising testing a plant for altered expression of a polynucleotide comprising a nucleotide sequence selected from any one of the sequences SEQ ID NO: 103-108 or a variant thereof comprising a sequence that has at least 70% sequence identity to the nucleotide sequence of any one of SEQ ID NO: 103-108.

39. A plant cell or plant produced by a method according to any one of claims 1 to 10 and 24-31.

40. A plant cell or plant selected by a method according to any one of claims 35-38.

41. A group or population of plants produced by a method according to any one of claims 1 to 10 and 24-31.

42. A method of producing AsA, the method comprising extracting AsA from a host cell of any one of claims 18, 19, and 22-31.

43. A method of producing AsA, the method comprising extracting AsA from a plant cell or plant of any one of claims 20 to 31.

Description:
REGULATION OF GENE EXPRESSION

FIELD OF THE INVENTION

[0001] The present invention relates to polynucleotides, genetic constructs and methods for producing plants with altered GDP-L-galactose phosphorylase 3 (GGP) translation, production and/or activity; and/or altered L-ascorbic acid (AsA) content; and plants and plant cells produced from said methods.

BACKGROUND TO THE INVENTION

[0002] L-ascorbic acid (AsA) is the most abundant soluble antioxidant in plants and is also an essential nutrient for humans and a few other animals. AsA contributes significantly to the overall intake of "free radical scavengers" or "anti-oxidative metabolites" in the human diet. Convincing evidence now shows that such metabolites, either singly or in combination, benefit health and well-being, acting as agents against cancer formation and protecting against coronary heart disease. Eating sufficient amounts of fruits and vegetables to consistently maintain optimum AsA concentrations is still a challenge for people in both developed and developing countries.

[0003] Almost the entire dietary AsA intake in humans is derived from plant products. However, the AsA content of plant tissues is remarkably variable. Whilst leaf AsA content is generally high and relatively uniform in herbaceous and woody plants, a huge and unexplained variability in AsA content is found in non-green edible plant tissues.

[0004] Understanding how AsA biosynthesis is regulated may provide tools to manipulate biosynthesis in plants. Understanding the regulation of gene expression, and the factors/elements controlling such expression, also provides valuable tools for genetic manipulation. The enzyme GDP-L-galactose phosphorylase (GGP) is known to play a key role in AsA biosynthesis. The regulation of GGP expression and activity is yet to be fully resolved.

[0005] It would be desirable to increase the AsA content of plants.

[0006] It is an object of the invention to provide new, improved and/or alternative tools for manipulating the expression and content of AsA in plants, and/or to at least provide the public with a useful choice.

[0007] In this specification, where reference has been made to external sources of information, including patent specifications and other documents, this is generally for the purpose of providing a context for discussing the features of the present invention. Unless stated otherwise, reference to such sources of information is not to be construed, in any jurisdiction, as an admission that such sources of information are prior art or form part of the common general knowledge in the art.

SUMMARY OF THE INVENTION

[0008] In a first aspect the invention relates to a method of producing a plant cell or plant with increased L-ascorbic acid (AsA) content, the method comprising transforming a plant cell with a polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 1-102 and 348-356, or a variant of the polypeptide having at least about 70% sequence identity to a polypeptide with the amino acid sequence of any one of SEQ ID NOs: 1-102 and 348-356.

[0009] In a second aspect the invention relates to a method of producing a plant cell or plant having increased GDP-L-galactose phosphorylase 3 (GGP) translation, production and/or activity, the method comprising transforming a plant cell with a polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 1-102 and 348-356, or a variant of the polypeptide having at least about 70% sequence identity to a polypeptide with the amino acid sequence of any one of SEQ ID NOs: 1-102 and 348-356.

[0010] In a third aspect the invention relates to a method of producing a plant cell or plant with increased L-ascorbic acid (AsA) content, the method comprising transforming a plant cell with a polynucleotide comprising a nucleotide sequence of any one of SEQ ID NOs: 103-108, or a variant thereof having at least about 70% sequence identity to a nucleotide having the nucleotide sequence of any one of SEQ ID NOs: 103-108.

[0011] In a fourth aspect the invention relates to a method of producing a plant cell or plant having increased GDP-L-galactose phosphorylase 3 (GGP) translation, production and/or activity, the method comprising transforming a plant cell with a polynucleotide comprising a nucleotide sequence of any one of SEQ ID NOs: 103-108, or a variant thereof having at least about 70% sequence identity to a nucleotide having the nucleotide sequence of any one of SEQ ID NOs: 103-108.

[0012] In a fifth aspect the invention relates to a method of producing a plant cell or plant with increased L-ascorbic acid (AsA) content or increased GDP-L-galactose phosphorylase 3 (GGP) translation, production and/or activity, the method comprising upregulating in the plant cell, or expressing in the plant, a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356, or a variant of the polypeptide having at least about 70% sequence identity to the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356.

[0013] In a sixth aspect the invention relates to a method of producing a plant cell or plant with increased L-ascorbic acid (AsA) content or increased GDP-L-galactose phosphorylase 3 (GGP) translation, production and/or activity, the method comprising upregulating in the plant cell, or expressing in the plant, a polynucleotide comprising a nucleotide sequence selected from any one of the sequences SEQ ID NO: 103-108 or a variant having at least about 70% sequence identity to the nucleotide sequence of any one of SEQ ID NO: 103-108.

[0014] In a seventh aspect, the invention relates to an isolated polynucleotide encoding a polypeptide with the amino acid sequence of any one of SEQ ID NOs: 1-4 and 102, or a variant of the polypeptide having at least about 70% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 1-4 and 102, preferably at least about 75% sequence identity.

[0015] In an eighth aspect, the invention relates to an isolated polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 109-153, 155-157, 159, 162-164, 166-174, 176, 178, 180-183, 185, 186, 188-190, 192-200, 202-204, and 206-209, or a variant of the polypeptide having at least about 70% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 109-153, 155-157, 159, 162-164, 166-174, 176, 178, 180-183, 185, 186, 188-190, 192-200, 202-204, and 206-209, preferably at least about 75% sequence identity.

[0016] In a ninth aspect the invention relates to a genetic construct comprising a polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 1-4 and 102, or a variant of the polypeptide having at least about 70% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 1-4 and 102, preferably at least about 75% sequence identity.

[0017] In a tenth aspect the invention relates to a genetic construct comprising a polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 109-153, 155-157, 159, 162-164, 166-174, 176, 178, 180-183, 185, 186, 188-190, 192-200, 202-204, and 206-209, or a variant of the polypeptide having at least about 70% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 109-153, 155-157, 159, 162-164, 166-174, 176, 178, 180-183, 185, 186, 188-190, 192-200, 202-204, and 206-209, preferably at least about 75% sequence identity.

[0018] In various embodiments the genetic construct may further comprise a second polynucleotide a) encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 109-209, or a variant of the polypeptide having at least about 70% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 109-209, preferably at least about 75% sequence identity; and/or b) having a nucleotide sequence of SEQ ID NO: 210, or a variant of the nucleotide having at least about 70% sequence identity to the nucleotide sequence of SEQ ID NO: 210.

[0019] In an eleventh aspect the invention relates to a genetic construct comprising a polynucleotide comprising a nucleotide sequence of any one of SEQ ID NOs: 103-108, or a variant thereof having at least about 70% sequence identity to the nucleotide sequence of any one of SEQ ID NOs: 103-108, preferably at least about 75% sequence identity.

[0020] In a twelfth aspect the invention relates to a genetic construct comprising a polynucleotide comprising a nucleotide sequence of SEQ ID NO: 210, or a variant thereof having at least about 70% sequence identity to the nucleotide sequence of SEQ ID NO: 210.

[0021] In a thirteenth aspect the invention relates to a kit comprising a first genetic construct of the ninth or eleventh aspects described herein; and a second genetic construct of the tenth or twelfth aspects described herein.

[0022] In a fourteenth aspect the invention provides a host cell comprising one or more genetic constructs described herein.

[0023] In a fifteenth aspect the invention provides a host cell genetically modified to express a polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 1-102 and 348-356, or a variant of the polypeptide having at least about 70% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 1-102 and 348-356.

[0024] In a sixteenth aspect the invention provides a host cell genetically modified to express a polynucleotide comprising a nucleotide sequence of any one of SEQ ID NOs: 103-108, or a variant thereof having at least about 70% sequence identity to the nucleotide sequence of any one of SEQ ID NOs: 103-108.

[0025] In a seventeenth aspect the invention provides a plant cell genetically modified to express a polynucleotide encoding a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356 or a variant of the polypeptide having at least 70% sequence identity to a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356.

[0026] In an eighteenth aspect the invention provides a plant cell genetically modified to express a polynucleotide comprising a nucleotide sequence selected from any one of SEQ ID NO: 103-108 or a variant thereof having at least 70% sequence identity to the nucleotide sequence of any one of SEQ ID NO: 103-108.

[0027] In a nineteenth aspect the invention provides a plant comprising a plant cell described herein.

[0028] In a twentieth aspect the invention provides a method for selecting a plant with altered GDP-L-galactose phosphorylase 3 (GGP) activity, the method comprising testing a plant for altered expression of a polynucleotide encoding a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356 or a variant of the polypeptide having at least 70% sequence identity to a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356.

[0029] In a further aspect the invention provides a method for selecting a plant with altered GDP-L-galactose phosphorylase 3 (GGP) activity, the method comprising testing a plant for altered expression of a polynucleotide comprising a nucleotide sequence selected from any one of the sequences SEQ ID NO: 103-108 or a variant thereof wherein the variant comprises a sequence that has at least 70% sequence identity to the nucleotide sequence of any one of SEQ ID NO: 103-108.

[0030] In a further aspect the invention provides a method for selecting a plant with altered L-ascorbic acid (AsA) content, the method comprising testing a plant for altered expression of a polynucleotide encoding a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356 or a variant of the polypeptide having at least 70% sequence identity to a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356.

[0031] In a further aspect the invention provides a method for selecting a plant with altered L-ascorbic acid (AsA) content, the method comprising testing a plant for altered expression of a polynucleotide comprising a nucleotide sequence selected from any one of the sequences SEQ ID NO: 103-108 or a variant thereof comprising a sequence that has at least 70% sequence identity to the nucleotide sequence of any one of SEQ ID NO: 103-108.

[0032] In a further aspect the invention provides a plant cell or plant produced by any method described herein.

[0033] In a further aspect the invention provides a plant cell or plant selected by any method described herein.

[0034] In a further aspect the invention provides a group or population of plants produced by any method described herein.

[0035] In a further aspect the invention provides a method of producing AsA, the method comprising extracting AsA from a host cell described herein.

[0036] In a further aspect the invention provides a method of producing AsA, the method comprising extracting AsA from a plant cell or plant described herein.

[0037] The following embodiments may relate to any or all of the above aspects.

[0038] In various embodiments the method may further comprise transforming, or co-transforming with the polynucleotide, a second polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 109-209, or a variant of the polypeptide having at least about 70% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 109-209.

[0039] In various embodiments the method may further comprise transforming, or co-transforming with the polynucleotide, a second polynucleotide with a nucleotide sequence of SEQ ID NO: 210, or a variant thereof having at least about 70% sequence identity to the nucleotide sequence of SEQ ID NO: 210. In various embodiments the variant has at least about 75, 80, 85, 90, 95, 96, 97, 98, 99 or 100% sequence identity to the nucleotide sequence of SEQ ID NO: 210.
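By way of illustration only, the percent sequence identity thresholds recited above can be pictured with the following Python sketch. It assumes that the two sequences have already been aligned to equal length, with gaps written as "-", and it divides the number of identical positions by the number of aligned columns. That denominator is one common convention and is an assumption made here, not a definition taken from this specification. The function names percent_identity and meets_threshold are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: both strings come from the same alignment, so they have
    equal length, and gaps are written as '-'.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no comparable columns")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the aligned pair reaches the given percent identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy peptides (not sequences from the sequence listing): 9 of 10 positions match.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
```

In practice the alignment itself would come from a dedicated alignment tool, and the reported identity depends on the alignment parameters and on how gaps are counted.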

[0040] In various embodiments the host cell or plant cell may be genetically modified to express a second polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 109-209, or a variant of the polypeptide having at least about 70% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 109-209.

[0041] In various embodiments the host cell or plant cell may be genetically modified to express a second polynucleotide with a nucleotide sequence of SEQ ID NO: 210, or a variant thereof having at least about 70% sequence identity to the nucleotide sequence of SEQ ID NO: 210.

[0042] In various embodiments the polynucleotide may encode a polypeptide with an amino acid sequence of, or may encode a variant of the polypeptide having at least about 70, 75, 80, 85, 90, 95, 96, 97, 98, 99 or 100% sequence identity to an amino acid sequence of a) any one of SEQ ID Nos: 1-102 and 348-356; b) any one of SEQ ID Nos: 1-50, 52-60, 62-83, 85-95, 97-102, 348-350, and 352-354; c) any one of SEQ ID Nos: 1-49, 53-56, 58-60, 62, 64-66, 70-79, 82, 83, 85-87, 89-91, 93-95, 98, 102, 348-350, 352, and 353; d) any one of SEQ ID Nos: 1-49, 53-56, 58, 60, 62, 64, 66, 70, 71, 74-76, 78, 79, 83, 85-87, 89-91, 93-95, 98, 102, 348-350, 352, and 353; e) any one of SEQ ID Nos: 1-49, 53-56, 58, 60, 62, 64, 66, 70, 71, 74-76, 78, 79, 83, 85-87, 89-90, 93, 94, 98, 102, 348-350, 352, and 353; f) any one of SEQ ID Nos: 1-5, 7-10, 12-19, 21-24, 28-31, 33, 35, 36, 38-45, 48, 49, 53, 55, 56, 58, 60, 62, 64, 66, 71, 74-76, 79, 83, 85, 87, 89, 90, 93, 94, 98, 102, 350, 352; g) any one of SEQ ID Nos: 1-4, 38, 41-45, 48, 49, 51, 52, 56, 60, 63, 66, 79, 80, 85, 89, 90, 92, 96, 97, 100-102, and 348-356; h) any one of SEQ ID Nos: 1-4, 49, 51, 66, 80, 85, 90, 92, 96, 102, and 348-356; i) any one of SEQ ID Nos: 1-4 and 102; j) any one of SEQ ID Nos: 1-4; k) SEQ ID No: 1 or 102; l) SEQ ID No: 1; or m) SEQ ID No: 102.

[0043] In various embodiments the polynucleotide may have a nucleotide sequence of, or may encode a variant of the polynucleotide having at least about 70, 75, 80, 85, 90, 95, 96, 97, 98, 99 or 100% sequence identity to a nucleotide sequence of: a) any one of SEQ ID NOs: 103-108, b) SEQ ID No: 103 or 105, or c) SEQ ID No: 103.

[0044] In various embodiments the polynucleotide, or the second polynucleotide, may encode a polypeptide with an amino acid sequence of, or may encode a variant of the polypeptide having at least about 70, 75, 80, 85, 90, 95, 96, 97, 98, 99 or 100% sequence identity to an amino acid sequence of a) any one of SEQ ID Nos: 109-209; b) any one of SEQ ID Nos: 109-174, 176-178, and 180-209; c) any one of SEQ ID Nos: 109-159, 162-164, 166-174, 176, 178, 180-186, 188-190, 192-204, and 206-209; d) any one of SEQ ID Nos: 109-153, 155-157, 159, 162-164, 166-174, 176, 178, 180-186, 188-190, 192-204, and 206-209; e) any one of SEQ ID Nos: 109-153, 155-157, 159, 162-164, 166-174, 176, 178, 180-183, 185, 186, 188-190, 192-200, 202-204, and 206-209; f) any one of SEQ ID Nos: 109-114, 125, 126, 142, 177, 179, 187, 191 and 205; g) any one of SEQ ID Nos: 109-114 and 179; h) any one of SEQ ID Nos: 109-114; or i) SEQ ID No: 109.

[0045] In various embodiments the polynucleotide, or the second polynucleotide, may have a nucleotide sequence of, or may encode a variant of the polynucleotide having at least about 70, 75, 80, 85, 90, 95, 96, 97, 98, 99 or 100% sequence identity to a nucleotide sequence of SEQ ID No: 210.

[0046] In various embodiments the polypeptide or polypeptide variant may comprise one or more SANT/MYB domains. In various embodiments the polypeptide or polypeptide variant may comprise two SANT/MYB domains.

[0047] In various embodiments the upregulating may comprise genetic engineering.

[0048] In various embodiments the transformation comprises stable transformation.

In various embodiments, the method comprises stably transforming the plant cell with the polynucleotide or variant thereof. In various embodiments the host cell, or plant cell, is stably transformed and/or stably genetically modified.

[0049] In various embodiments the method may comprise transforming a plant cell with a genetic construct described herein. In various embodiments the method may comprise stably transforming a plant cell with a genetic construct described herein.

[0050] In various embodiments the host cell or plant cell comprises an endogenous GGP gene, preferably a functional GGP gene. For example, in some embodiments the host cell or plant cell has been previously transformed with a functional GGP gene. In some embodiments the method comprises transforming the host cell or plant cell with a GGP gene.

[0051] In some embodiments, the GGP gene has a promoter that comprises one or more MYBS1 binding sites, preferably two or more MYBS1 binding sites. In some embodiments, the method comprises gene-editing the promoter of an endogenous GGP gene to provide a promoter comprising one or more MYBS1 binding sites. In some embodiments, the method comprises modifying an endogenous GGP promoter to provide a GGP promoter comprising one or more MYBS1 binding sites. In some embodiments, the method comprises transforming the plant cell or host cell with a GGP gene. In some embodiments, the method comprises transforming the plant cell or host cell with a GGP gene having a promoter that comprises one or more MYBS1 binding sites, preferably two or more MYBS1 binding sites.

[0052] In various embodiments the upregulating may comprise crossing with a plant that expresses a polypeptide comprising an amino acid sequence having at least 70, 75, 80, 85, 90, 95, 96, 97, 98, or 99% sequence identity to a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356 and/or SEQ ID No: 109-209.

[0053] According to some embodiments, the upregulating comprises crossing with a plant which expresses a polynucleotide comprising a nucleotide sequence selected from any one of the sequences SEQ ID NO: 103 to 108 and 210.

[0054] In various embodiments the genetic construct may comprise one or more polynucleotides operably linked to a promoter. In various embodiments the promoter is at least one of: i) a promoter that is not normally associated with the polynucleotide in nature, ii) a promoter derived from a bacterium, a fungus, an insect, a mammal or a virus, iii) a bacterial promoter, iv) a fungal promoter, v) an insect promoter, vi) a mammalian promoter, and vii) a virus promoter.

[0055] In various embodiments the promoter may be derived from a bacterium, a fungus, an insect, a mammal or a virus.

[0056] In various embodiments, the promoter may be a constitutive promoter, or a tissue-specific promoter. In various embodiments, the promoter results in expression in leaves and/or roots. Preferably, the promoter results in expression in leaves. More preferably, the promoter is a tissue-specific promoter that results in targeted expression in leaves and/or roots.

[0057] In various embodiments the one or more polynucleotides may be regulatable by a compound. In some embodiments the compound is L-ascorbic acid (AsA), or a related metabolite.

[0058] In various embodiments the host cell may be a plant cell.

[0059] The term "comprising" as used in this specification and claims means "consisting at least in part of". When interpreting statements in this specification and claims which include the term "comprising", other features besides the features prefaced by this term in each statement can also be present. Related terms such as "comprise" and "comprised" are to be interpreted in similar manner.

[0060] As used herein the term "and/or" means "and" or "or", or both.

[0061] As used herein the term '(s)' following a noun means the plural and/or singular form of that noun.

[0062] It is intended that reference to a range of numbers disclosed herein (for example, 1 to 10) also incorporates reference to all rational numbers within that range (for example, 1, 1.1, 2, 3, 3.9, 4, 5, 6, 6.5, 7, 8, 9 and 10) and also any range of rational numbers within that range (for example, 2 to 8, 1.5 to 5.5 and 3.1 to 4.7) and, therefore, all sub-ranges of all ranges expressly disclosed herein are hereby expressly disclosed. These are only examples of what is specifically intended and all possible combinations of numerical values between the lowest value and the highest value enumerated are to be considered to be expressly stated in this application in a similar manner.

[0063] This invention may also be said broadly to consist in the parts, elements and features referred to or indicated in the specification of the application, individually or collectively, and any or all combinations of any two or more said parts, elements or features, and where specific integers are mentioned herein which have known equivalents in the art to which this invention relates, such known equivalents are deemed to be incorporated herein as if individually set forth.

[0064] Although the present invention is broadly as defined above, those persons skilled in the art will appreciate that the invention is not limited thereto and that the invention also includes embodiments of which the following description gives examples.

BRIEF DESCRIPTION OF THE DRAWINGS

[0065] The invention will now be described by way of example only and with reference to the drawings in which:

[0066] Figure 1 provides a line graph demonstrating AsA content in six species of Actinidia in the days after flowering (DAF).

[0067] Figure 2 provides graphs demonstrating the expression of GGP (GGP1, GGP2 and GGP3) (lines) and AsA content (bars) in (A) A. rufa and (B) A. eriantha in the days after flowering (DAF). Data is shown as the mean +/- standard deviation (n=3).

[0068] Figure 3 provides bar graphs demonstrating that over-expression or knock down expression of AceGGP3 alters AsA concentration in A. eriantha fruits. (A) shows the expression abundance of AceGGP3 and (B) shows the AsA content in A. eriantha fruit 7 days after infiltration; and (C) shows the expression abundance of AceGGP3 and (D) shows the AsA content in A. eriantha calli 6 days after infiltration with: EV (empty vector); OX-AceGGP3 (AceGGP3 overexpression) or TRV-AceGGP3 (AceGGP3-antisense expression). Data is shown as the mean +/- standard deviation (n = 3). Significant differences were detected by t-test using GraphPad Prism 8 (*, p <0.05; **, p <0.01; ***, p <0.001).

[0069] Figure 4 provides bar graphs showing (A) RT-qPCR analysis of AceGGP3 and (B) relative AsA content in 5 transgenic lines of A. eriantha overexpressing AceGGP3. WT: wild-type; OE-GGP3#10, OE-GGP3#14, OE-GGP3#20, OE-GGP3#21, OE-GGP3#23 represent the five different transgenic lines, respectively. Data is shown as the mean +/- standard deviation (n = 3). Significant differences were detected by t-test using GraphPad Prism 8 (*, p <0.05; **, p <0.01; ***, p <0.001).

[0070] Figure 5 provides a bar graph showing the relative AsA content of A. eriantha calli in CRISPR/Cas9-induced AceGGP3 mutants. WT: wild-type; ggp3#2, ggp3#11, ggp3#13 and ggp3#15 represent four gene editing lines, respectively. Data is shown as the mean +/- standard deviation (n = 3). Significant differences were detected by t-test using GraphPad Prism 8 (*, p <0.05; **, p <0.01; ***, p <0.001).

[0071] Figure 6 provides bar and line graphs showing a correlation between (A) AsA content (bars) and expression of AceMYBS1 (lines), and (B) the expression of AceGGP3 and AceMYBS1 in different developmental stages of A. eriantha fruits (every 20 days). Error bars represent +/-SD (n = 3).

[0072] Figure 7A provides a bar graph showing the results of dual-luciferase assays in tobacco leaves showing that AceMYBS1 activates transcription of different length fragments of AceGGP3 promoters (P2660, P2088, P1606, P1106). The empty vector (EV) is a control. Error bars: +/- SD. Significant differences were detected by t-test (**, p <0.01; ns: no significance). Figure 7B shows the position of two AceMYBS1 predicted binding targets on the AceGGP3 promoter and electrophoretic mobility shift assay (EMSA) showing the binding of AceMYBS1 to the AceGGP3 promoter. The unlabelled probes were used as competitors. 100x and 300x represent the rates of the competitor.

[0073] Figure 8 provides bar graphs showing the results of RT-qPCR analysis of (A) AceMYBS1 and (B) AceGGP3 transcripts, and (C) AsA content of A. eriantha fruits (of B) 7 days after infiltration with vectors for transient AceMYBS1 expression. EV: empty vector; OX-AceMYBS1: AceMYBS1-overexpression; TRV-AceMYBS1: AceMYBS1-antisense expression. Experiments were repeated three times and each experiment contained three to six kiwifruits per genotype. Error bars: +/-SD. Significant differences were detected by t-test (*, p<0.05; **, p<0.01; ***, p<0.001; ns: no significance).

[0074] Figure 9 provides bar graphs showing the results of RT-qPCR analysis of (A) AceMYBS1 and (B) AceGGP3; and (C) AsA content in A. eriantha calli transfected with the following vectors: EV: empty vector; OX-AceMYBS1: AceMYBS1-overexpression; TRV-AceMYBS1: AceMYBS1-antisense expression; TRV-AceGGP3: AceGGP3-antisense expression; TRV-AceGGP3+OX-AceMYBS1: overexpressed AceMYBS1 in TRV-AceGGP3 background. All experiments were performed in three replicates. Error bars: +/-SD. Significant differences were detected by t-test (*, p<0.05; **, p <0.01; ***, p <0.001; ns: no significance).

[0075] Figure 10 provides bar graphs showing the results of RT-qPCR analysis of (A) AceMYBS1 and (B) AceGGP3 in wild-type (WT) and six AceMYBS1-overexpression transgenic kiwifruit (A. eriantha) lines: OE-AceMYBS1#1, OE-AceMYBS1#2, OE-AceMYBS1#3, OE-AceMYBS1#4, OE-AceMYBS1#5, OE-AceMYBS1#7.

[0076] Figure 11 provides bar graphs showing the results of (A) bimolecular luminescence complementation (BiLC) assay demonstrating that AceMYBS1 interacts with AceGBF3 in vivo and (B) an in vitro pull-down assay showing the interaction between AceMYBS1 and AceGBF3. In (A), Agrobacterium clones containing the respective recombinant plasmids were combined at 1:1 (v/v) and infiltrated into N. benthamiana leaves. Error bars show the mean +/- SD; Significant differences were detected by t-test: ***, p<0.001; ns: no significance. In (B), AceGBF3-GST protein was incubated with immobilized AceMYBS1-6xHis or 6xHis protein, and immuno-precipitated fractions were detected by Anti-GST antibody.

[0077] Figure 12 provides bar graphs showing the results of (A) dual luciferase assays in tobacco leaves showing the transcription of AceGGP3 activated by AceMYBS1 and AceGBF3 individually or collectively; (B) AsA content and (C) RT-qPCR analysis of the expression level in transiently expressed A. eriantha fruits 7 days post-transformation; the expression level in transiently expressed A. eriantha calli of (D) AceGGP3 and (E) AsA content; and the expression level in transiently expressed tobacco leaves of (F) NbGGP and (G) AsA content. EV: empty vector; OX-AceGBF3: AceGBF3-overexpression; TRV-AceGBF3: AceGBF3-antisense expression. The experiments were repeated three times and each experiment contained three to six kiwifruits per genotype. All error bars denote standard deviation (+/-SD), with three technical repeats per experimental group. Significant differences were detected by t-test (*, p<0.05; **, p<0.01; ***, p<0.001; ns: no significance). Different letters above the bars indicate significant differences (p<0.05) as obtained by one-way ANOVA.

[0078] Figure 13 provides bar graphs showing (A) AceGGP3 expression level as determined by RT-qPCR analysis and (B) AsA content in AceMYBS1 and AceGBF3 transiently co-expressed A. eriantha fruits. EV: empty vector; TRV-AceMYBS1: AceMYBS1-antisense expression; OX-AceMYBS1: AceMYBS1-overexpression; TRV-AceMYBS1+TRV-AceGBF3: co-antisense expression of AceMYBS1 and AceGBF3; OX-AceMYBS1+OX-AceGBF3: co-overexpression of AceMYBS1 and AceGBF3; OX-AceMYBS1+TRV-AceGBF3: AceMYBS1-overexpression in the AceGBF3-antisense expression background; OX-AceGBF3+TRV-AceMYBS1: AceGBF3-overexpression in the AceMYBS1-antisense expression background. The experiments were repeated three times and each experiment contained three to six kiwifruits per genotype. Significant differences were detected by t-test (*, p<0.05; **, p<0.01; ***, p<0.001; ns: no significance).

[0079] Figure 14 provides bar graphs showing (A) relative AsA content and (B) RT-qPCR analysis of AceGGP3 in wild type (WT) and transgenic A. eriantha calli overexpressing AceGBF3. All error bars denote standard deviation (+/-SD), with three technical repeats per experimental group. Significant differences were detected by t-test (*, p<0.05; **, p<0.01; ***, p<0.001; ns: no significance).

[0080] Figure 15 provides bar graphs showing the mean leaf AsA content (in mg/100 g fresh weight) of (A) rice, (B) soybean, and (C) Arabidopsis thaliana plants stably transformed to express AceMYBS1. Individual transformed lines are shown as separate bars. Measurements were performed in triplicate, and error bars denote the standard deviation. In (A) and (B), asterisks denote significant difference from wild-type according to Student's T-test (p<0.05). In (C), different letters denote significantly different groupings according to Student's T-test (p<0.05). WT: wild-type control plants.

DETAILED DESCRIPTION OF THE INVENTION

[0081] The present invention, in some embodiments thereof, relates to methods for producing host cells, including plant cells or plants, having increased L-ascorbic acid (AsA) and/or increased GGP translation, production and/or activity.

[0082] The present invention is based on the identification, through genetic and molecular characterisation described herein, of two bZIP transcription factors that are positive regulators of AsA biosynthesis by activating transcription of GGP. An MYBS1-like transcription factor in kiwifruit was shown to bind the promoter of GGP3 and when overexpressed in kiwifruit resulted in significantly increased GGP3 expression and AsA accumulation. Overexpression of a GBF3 bZIP transcription factor also increased AsA content in an additive manner with MYBS1.

[0083] Identification of transcription factors that contribute to the regulation of AsA biosynthesis in plants allows for marker aided selection to be developed to breed plants having higher AsA content, and for high levels of AsA to be produced via biotechnological means, such as biopharming in crop plants and metabolic engineering of host cells such as yeast.

[0084] Thus, according to one aspect of the invention there is provided a method of producing a plant cell or plant with increased L-ascorbic acid (AsA) content, and/or increased GDP-L-galactose phosphorylase 3 (GGP) translation, production and/or activity, the method comprising transforming a plant cell with a polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 1-102 and 348-356, or a variant of the polypeptide having at least about 70% sequence identity to a polypeptide with the amino acid sequence of any one of SEQ ID NOs: 1-102 and 348-356 (MYBS1-like protein).

[0085] In various embodiments the method may further comprise transforming, or co-transforming with the polynucleotide, a second polynucleotide encoding a polypeptide with an amino acid sequence of any one of SEQ ID NOs: 109-209, or a variant of the polypeptide having at least about 70% sequence identity to the amino acid sequence of any one of SEQ ID NOs: 109-209 (GBF3-like protein).

[0086] The applicants have identified a polynucleotide (SEQ ID No: 103) encoding an MYBS1-like transcription factor polypeptide (SEQ ID No: 1) and a polynucleotide (SEQ ID No: 210) that encodes a GBF3-like transcription factor polypeptide (SEQ ID No: 109) from Actinidia eriantha as described in the examples.

[0087] The applicants have shown that these polypeptides induce expression of GGP3 and increase AsA content in several kiwifruit species and a tobacco species.

[0088] The applicants have also identified a) MYBS1-like polypeptide sequences from a number of species that have significant sequence conservation with SEQ ID No: 1 and are variants of each other (SEQ ID Nos: 2-102 and 348-356), and b) GBF3-like polypeptide sequences from a number of species that have significant sequence conservation with SEQ ID No: 109 and are variants of each other (SEQ ID Nos: 110-209).

[0089] Genetic constructs, vectors and plants containing polynucleotide sequences encoding an MYBS1-like polypeptide (SEQ ID NOs: 1-102 and 348-356) or sequences encoding the polypeptide sequences (SEQ ID NO: 103-108) and/or polynucleotide sequences encoding a GBF3-like polypeptide (SEQ ID Nos: 109-209) or sequences encoding the polypeptide sequences (SEQ ID NO: 210) are disclosed herein.

[0090] In certain embodiments, there are provided plants and host cells comprising the genetic constructs and vectors disclosed herein. Preferably the plants and host cells are stably transformed with the genetic constructs and/or vectors.

[0091] In some embodiments, there are provided plants altered in GGP translation, production and/or activity relative to suitable control plants, and plants altered in AsA content relative to suitable control plants. In some embodiments, there are provided plants with increased GGP translation, production and/or activity and increased AsA. Preferably the plants are stably transformed or modified.

[0092] In other embodiments there are provided methods for the production of such plants and methods of selection of such plants.

[0093] Suitable control plants include non-transformed plants of the same species or variety or plants transformed with control constructs.

1. Polynucleotides and fragments

[0094] The term "polynucleotide(s)" as used herein, means a single or double-stranded deoxyribonucleotide or ribonucleotide polymer of any length but preferably at least 15 nucleotides, and includes as non-limiting examples, coding and non-coding sequences of a gene, sense and antisense sequences, complements, exons, introns, genomic DNA, cDNA, pre-mRNA, mRNA, rRNA, siRNA, miRNA, tRNA, ribozymes, recombinant polynucleotides, isolated and purified naturally occurring DNA or RNA sequences, synthetic RNA and DNA sequences, nucleic acid probes, primers and fragments.

[0095] A "fragment" of a polynucleotide sequence provided herein is a subsequence of contiguous nucleotides that is capable of specific hybridization to a target of interest, e.g., a sequence that is at least 15 nucleotides in length. Fragments as herein disclosed comprise 15 nucleotides, preferably at least 20 nucleotides, more preferably at least 30 nucleotides, more preferably at least 50 nucleotides and most preferably at least 60 nucleotides of contiguous nucleotides of a polynucleotide as herein disclosed. A fragment of a polynucleotide sequence can be used in antisense, gene silencing, triple helix or ribozyme technology, or as a primer, a probe, included in a microarray, or used in polynucleotide-based selection methods as herein disclosed.

[0096] The term "primer" refers to a short polynucleotide, usually having a free 3'OH group, that is hybridized to a template and used for priming polymerization of a polynucleotide complementary to the template. Such a primer is preferably at least 5, more preferably at least 6, more preferably at least 7, more preferably at least 9, more preferably at least 10, more preferably at least 11, more preferably at least 12, more preferably at least 13, more preferably at least 14, more preferably at least 15, more preferably at least 16, more preferably at least 17, more preferably at least 18, more preferably at least 19, more preferably at least 20 nucleotides in length.

[0097] The term "probe" refers to a short polynucleotide that is used to detect a polynucleotide sequence, that is complementary to the probe, in a hybridization-based assay. The probe may consist of a "fragment" of a polynucleotide as defined herein. Preferably such a probe is at least 5, more preferably at least 10, more preferably at least 20, more preferably at least 30, more preferably at least 40, more preferably at least 50, more preferably at least 100, more preferably at least 200, more preferably at least 300, more preferably at least 400 and most preferably at least 500 nucleotides in length.

2. Polypeptides and fragments

[0098] The term "polypeptide", as used herein, encompasses amino acid chains of any length but preferably at least 5 amino acids, including full-length proteins, in which amino acid residues are linked by covalent peptide bonds. Polypeptides as herein disclosed may be purified natural products, or may be produced partially or wholly using recombinant or synthetic techniques. The term may refer to a polypeptide, an aggregate of a polypeptide such as a dimer or other multimer, a fusion polypeptide, a polypeptide fragment, a polypeptide variant, or derivative thereof.

[0099] A "fragment" of a polypeptide is a subsequence of the polypeptide that performs a function that is required for the biological activity and/or provides three dimensional structure of the polypeptide. The term may refer to a polypeptide, an aggregate of a polypeptide such as a dimer or other multimer, a fusion polypeptide, a polypeptide fragment, a polypeptide variant, or derivative thereof capable of performing the above enzymatic activity.

[00100] The term "isolated" as applied to the polynucleotide or polypeptide sequences disclosed herein is used to refer to sequences that are removed from their natural cellular environment. An isolated molecule may be obtained by any method or combination of methods including biochemical, recombinant, and synthetic techniques.

[00101] The term "recombinant" refers to a polynucleotide sequence that is removed from sequences that surround it in its natural context and/or is recombined with sequences that are not present in its natural context.

[00102] A "recombinant" polypeptide sequence is produced by translation from a "recombinant" polynucleotide sequence.

[00103] The term "derived from" with respect to polynucleotides or polypeptides as disclosed herein being derived from a particular genus or species, means that the polynucleotide or polypeptide has the same sequence as a polynucleotide or polypeptide found naturally in that genus or species. The polynucleotide or polypeptide, derived from a particular genus or species, may therefore be produced synthetically or recombinantly.

3. Variants

[00104] As used herein, the term "variant" refers to polynucleotide or polypeptide sequences different from the specifically identified sequences, wherein one or more nucleotides or amino acid residues is deleted, substituted, or added. Variants may be naturally occurring allelic variants, or non-naturally occurring variants. Variants may be from the same or from other species and may encompass homologues, paralogues and orthologues. Variants described herein can also be created via site-directed mutagenesis of the coding sequence for a polypeptide, or by combining domains from the coding sequences for different naturally-occurring polypeptides ("domain swapping").

Techniques for modifying genes encoding functional polypeptides described herein are known and include, inter alia, directed evolution techniques, site-directed mutagenesis techniques and random mutagenesis techniques, and can be useful to increase specific activity of a polypeptide, alter substrate specificity, alter expression levels, alter subcellular location, or modify polypeptide-polypeptide interactions in a desired manner. Such modified polypeptides are considered variants.

[00105] In certain embodiments, variants of the polynucleotides and polypeptides disclosed herein possess biological activities that are the same or similar to those of the polynucleotides or polypeptides disclosed herein. The term "variant" with reference to polynucleotides and polypeptides encompasses all forms of polynucleotides and polypeptides as defined herein.

4. Polynucleotide variants

[00106] Variant polynucleotide sequences preferably exhibit at least 70%, more preferably at least 71%, more preferably at least 72%, more preferably at least 73%, more preferably at least 74%, more preferably at least 75%, more preferably at least 76%, more preferably at least 77%, more preferably at least 78%, more preferably at least 79%, more preferably at least 80%, more preferably at least 81%, more preferably at least 82%, more preferably at least 83%, more preferably at least 84%, more preferably at least 85%, more preferably at least 86%, more preferably at least 87%, more preferably at least 88%, more preferably at least 89%, more preferably at least 90%, more preferably at least 91%, more preferably at least 92%, more preferably at least 93%, more preferably at least 94%, more preferably at least 95%, more preferably at least 96%, more preferably at least 97%, more preferably at least 98%, and most preferably at least 99% identity to a sequence as disclosed herein. Identity is found over a comparison window of at least 20 nucleotide positions, preferably at least 50 nucleotide positions, more preferably at least 100 nucleotide positions, more preferably at least 200 nucleotide positions, more preferably at least 300 nucleotide positions, more preferably at least 400 nucleotide positions, more preferably at least 500 nucleotide positions, and most preferably over the entire length of a polynucleotide disclosed herein.

[00107] Polynucleotide sequence identity can be determined in the following manner. The subject polynucleotide sequence is compared to a candidate polynucleotide sequence using BLASTN (from the BLAST suite of programs, version 2.2.5 [Nov 2002]) in bl2seq (Tatiana A. Tatusova, Thomas L. Madden (1999), "Blast 2 sequences - a new tool for comparing protein and nucleotide sequences", FEMS Microbiol Lett. 174:247-250), which is publicly available from NCBI (ftp://ftp.ncbi.nih.gov/blast/). The default parameters of bl2seq are utilized except that filtering of low complexity parts should be turned off.

[00108] The identity of polynucleotide sequences may be examined using the following unix command line parameters: bl2seq -i nucleotideseq1 -j nucleotideseq2 -F F -p blastn

[00109] The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. The bl2seq program reports sequence identity as both the number and percentage of identical nucleotides in a line "Identities = ".
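The "Identities = " line described above can be read out of a saved report programmatically. The following Python sketch is illustrative only: it assumes the line has the form "Identities = 95/100 (95%)" as in legacy BLAST text reports, and the function names, the 70% default and the 20-position default window (both taken from the preferred values stated for variant polynucleotides) are hypothetical choices rather than part of the bl2seq program.

```python
import re

# Assumed shape of the line in a legacy BLAST text report, for example:
#   " Identities = 95/100 (95%), Gaps = 1/100 (1%)"
IDENTITY_LINE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_identities(report_text):
    """Return (identical, aligned, percent) for each 'Identities =' line."""
    results = []
    for match in IDENTITY_LINE.finditer(report_text):
        identical = int(match.group(1))
        aligned = int(match.group(2))
        # Recompute the percentage instead of using the rounded report value.
        results.append((identical, aligned, 100.0 * identical / aligned))
    return results


def meets_threshold(report_text, min_percent=70.0, min_window=20):
    """True if any reported alignment spans at least min_window positions
    and reaches at least min_percent identity."""
    return any(
        aligned >= min_window and percent >= min_percent
        for _, aligned, percent in parse_identities(report_text)
    )
```

A report saved from the command line could be passed to `meets_threshold` as a single string; alignments shorter than the chosen comparison window are ignored.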

[00110] Polynucleotide sequence identity may also be calculated over the entire length of the overlap between a candidate and subject polynucleotide sequences using global sequence alignment programs (e.g. Needleman, S. B. and Wunsch, C. D. (1970) J. Mol. Biol. 48, 443-453). A full implementation of the Needleman-Wunsch global alignment algorithm is found in the needle program in the EMBOSS package (Rice, P., Longden, I. and Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite, Trends in Genetics June 2000, vol 16, No 6. pp.276-277) which can be obtained from http://www.hgmp.mrc.ac.uk/Software/EMBOSS/. The European Bioinformatics Institute server also provides the facility to perform EMBOSS-needle global alignments between two sequences online at http://www.ebi.ac.uk/emboss/align/.
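By way of illustration, a global alignment of the Needleman-Wunsch type can be sketched in a few lines of Python. This is a simplified teaching version and not the EMBOSS needle implementation: it assumes a linear gap penalty and simple match/mismatch scores (hypothetical defaults of +1, -1 and -1), whereas needle uses a substitution matrix with separate gap-opening and gap-extension penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b; return the two gapped strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and first row: alignment against leading gaps only.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the dynamic-programming matrix.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * identical / len(aligned_a)
```

In this sketch the percentage is taken over every column of the alignment, gaps included. Other denominators (for example the length of the shorter sequence) give different values, so the convention should be stated whenever a percentage is quoted.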

[00111] Alternatively the GAP program may be used which computes an optimal global alignment of two sequences without penalizing terminal gaps. GAP is described in the following paper: Huang, X. (1994) On Global Sequence Alignment. Computer Applications in the Biosciences 10, 227-235.

[00112] Another method for calculating polynucleotide % sequence identity is based on aligning sequences to be compared using Clustal X (Jeanmougin et al., 1998, Trends Biochem. Sci. 23, 403-5.)

[00113] Polynucleotide variants disclosed herein also encompass those which exhibit a similarity to one or more of the specifically identified sequences that is likely to preserve the functional equivalence of those sequences and which could not reasonably be expected to have occurred by random chance. Such sequence similarity with respect to polypeptides may be determined using the publicly available bl2seq program from the BLAST suite of programs described supra.

[00114] The similarity of polynucleotide sequences may be examined using the following unix command line parameters: bl2seq -i nucleotideseq1 -j nucleotideseq2 -F F -p tblastx

[00115] The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. This program finds regions of similarity between the sequences and for each such region reports an "E value" which is the expected number of times one could expect to see such a match by chance in a database of a fixed reference size containing random sequences. The size of this database is set by default in the bl2seq program. For small E values, much less than one, the E value is approximately the probability of such a random match.

[00116] Variant polynucleotide sequences preferably exhibit an E value of less than 1 x 10^-10, more preferably less than 1 x 10^-20, more preferably less than 1 x 10^-30, more preferably less than 1 x 10^-40, more preferably less than 1 x 10^-50, more preferably less than 1 x 10^-60, more preferably less than 1 x 10^-70, more preferably less than 1 x 10^-80, more preferably less than 1 x 10^-90 and most preferably less than 1 x 10^-100 when compared with any one of the specifically identified sequences.
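The relationship between an E value and the probability of a chance match, and the comparison against a chosen E value threshold, can be sketched as follows. The sketch assumes the usual statistical model in which the number of chance matches is Poisson-distributed, so that the probability of at least one such match is 1 - exp(-E). The function names are hypothetical.

```python
import math


def p_value_from_e(e_value):
    """Probability of at least one chance match, assuming the number of
    chance matches follows a Poisson distribution with mean e_value."""
    # -expm1(-E) equals 1 - exp(-E) but keeps precision for very small E.
    return -math.expm1(-e_value)


def passes_e_threshold(e_value, threshold=1e-10):
    """True if the reported E value is below the chosen threshold."""
    return e_value < threshold
```

For E values far below one the two quantities are practically indistinguishable, which is why the E value itself can be read as an approximate probability in that regime. At E = 1 the probability is only about 0.63.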

[00117] Alternatively, variant polynucleotides as disclosed herein hybridize to the specified polynucleotide sequences, or complements thereof under stringent conditions.

[00118] The term "hybridize under stringent conditions", and grammatical equivalents thereof, refers to the ability of a polynucleotide molecule to hybridize to a target polynucleotide molecule (such as a target polynucleotide molecule immobilized on a DNA or RNA blot, such as a Southern blot or Northern blot) under defined conditions of temperature and salt concentration. The ability to hybridize under stringent hybridization conditions can be determined by initially hybridizing under less stringent conditions then increasing the stringency to the desired stringency.

[00119] With respect to polynucleotide molecules greater than about 100 bases in length, typical stringent hybridization conditions are no more than 25 to 30°C (for example, 10°C) below the melting temperature (Tm) of the native duplex (see generally, Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Ausubel et al., 1987, Current Protocols in Molecular Biology, Greene Publishing). Tm for polynucleotide molecules greater than about 100 bases can be calculated by the formula Tm = 81.5 + 0.41% (G + C) - log (Na+) (Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Bolton and McCarthy, 1962, PNAS 84: 1390). Typical stringent conditions for a polynucleotide of greater than 100 bases in length would be hybridization conditions such as prewashing in a solution of 6X SSC, 0.2% SDS; hybridizing at 65°C, 6X SSC, 0.2% SDS overnight; followed by two washes of 30 minutes each in 1X SSC, 0.1% SDS at 65°C and two washes of 30 minutes each in 0.2X SSC, 0.1% SDS at 65°C.

[00120] With respect to polynucleotide molecules having a length less than 100 bases, exemplary stringent hybridization conditions are 5 to 10°C below Tm. On average, the Tm of a polynucleotide molecule of length less than 100 bp is reduced by approximately (500/oligonucleotide length)°C.
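The arithmetic for short polynucleotides can be sketched as follows. This is a minimal illustration that assumes the (500/length)°C reduction is applied to a Tm value supplied by the user (for example, one estimated for a long duplex of the same composition). The function name and the example values are hypothetical.

```python
from typing import Tuple


def short_oligo_hybridization_window(tm_celsius: float, length: int) -> Tuple[float, float]:
    """Return (lower, upper) hybridization temperatures in degrees C for a
    polynucleotide shorter than 100 bases.

    Assumption: the supplied Tm is first reduced by 500/length degrees C,
    and stringent conditions are then taken as 5 to 10 degrees C below
    the adjusted Tm.
    """
    if not 0 < length < 100:
        raise ValueError("length must be between 1 and 99 bases")
    adjusted_tm = tm_celsius - 500.0 / length
    return adjusted_tm - 10.0, adjusted_tm - 5.0


# Hypothetical example: a 50-base probe with a starting Tm estimate of 80 degrees C
low, high = short_oligo_hybridization_window(80.0, 50)
print(f"Hybridize between {low:.1f} and {high:.1f} degrees C")
```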

[00121] With respect to the DNA mimics known as peptide nucleic acids (PNAs) (Nielsen et al., Science. 1991 Dec 6;254(5037): 1497-500) Tm values are higher than those for DNA-DNA or DNA-RNA hybrids, and can be calculated using the formula described in Giesen et al., Nucleic Acids Res. 1998 Nov 1;26(21):5004-6. Exemplary stringent hybridization conditions for a DNA-PNA hybrid having a length less than 100 bases are 5 to 10°C below the Tm.

[00122] Variant polynucleotides as disclosed herein also encompass polynucleotides that differ from the sequences as herein disclosed but that, as a consequence of the degeneracy of the genetic code, encode a polypeptide having similar activity to a polypeptide encoded by a polynucleotide of the present invention. A sequence alteration that does not change the amino acid sequence of the polypeptide is a "silent variation". Except for ATG (methionine) and TGG (tryptophan), other codons for the same amino acid may be changed by art-recognized techniques, e.g., to optimize codon usage in a particular host organism.

[00123] Polynucleotide sequence alterations resulting in conservative substitutions of one or several amino acids in the encoded polypeptide sequence without significantly altering its biological activity are also included in the invention. A skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al., 1990, Science 247, 1306).
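The notion of a silent variation arising from codon degeneracy can be illustrated with the standard genetic code. The following sketch builds the standard codon table, confirms that methionine and tryptophan are the only amino acids with a single codon, and checks whether two coding sequences encode the same polypeptide. The function names are illustrative, and the sketch assumes the standard (nuclear) genetic code and in-frame DNA input.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid (or stop signal '*') they encode
SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMS[aa].append(codon)


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_silent(original: str, variant: str) -> bool:
    """True if both sequences have equal length and encode the same amino acids."""
    return len(original) == len(variant) and translate(original) == translate(variant)


single_codon = sorted(aa for aa, codons in SYNONYMS.items() if len(codons) == 1)
print(single_codon)                      # ['M', 'W']
print(is_silent("ATGGCT", "ATGGCC"))     # True: GCT and GCC both encode alanine
```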

[00124] Variant polynucleotides due to silent variations and conservative substitutions in the encoded polypeptide sequence may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from NCBI (ftp://ftp.ncbi.nih.gov/blast/) via the tblastx algorithm as previously described.

[00125] The function of the polypeptide encoded by a variant polynucleotide disclosed herein as modifier of GGP translation, production and/or activity may be assessed for example by expressing such a sequence in a host cell and testing its activity as described herein in the Examples. Function of a variant may also be tested for its ability to alter AsA content in plants, also as described in the Examples section herein.

5. Polypeptide variants

[00126] The term "variant" with reference to polypeptides encompasses naturally occurring, recombinantly and synthetically produced polypeptides. Variant polypeptide sequences preferably exhibit at least 70%, more preferably at least 71%, more preferably at least 72%, more preferably at least 73%, more preferably at least 74%, more preferably at least 75%, more preferably at least 76%, more preferably at least 77%, more preferably at least 78%, more preferably at least 79%, more preferably at least 80%, more preferably at least 81%, more preferably at least 82%, more preferably at least 83%, more preferably at least 84%, more preferably at least 85%, more preferably at least 86%, more preferably at least 87%, more preferably at least 88%, more preferably at least 89%, more preferably at least 90%, more preferably at least 91%, more preferably at least 92%, more preferably at least 93%, more preferably at least 94%, more preferably at least 95%, more preferably at least 96%, more preferably at least 97%, more preferably at least 98%, and most preferably at least 99% identity to a sequence of the present invention. Identity is found over a comparison window of at least 20 amino acid positions, preferably at least 50 amino acid positions, more preferably at least 100 amino acid positions, and most preferably over the entire length of a polypeptide as herein disclosed.
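By way of illustration only, percent identity over a comparison window can be computed from two sequences that have already been aligned. The sketch below assumes gaps are written as "-", counts a column as identical only when both residues match, and divides by the number of aligned columns. Alignment programs differ in how they treat gaps in the denominator, so values from this sketch need not match those reported by any particular program. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, min_window: int = 20) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which both sequences carry a gap are ignored. The remaining
    columns form the comparison window, which must contain at least
    min_window positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if len(columns) < min_window:
        raise ValueError(f"comparison window shorter than {min_window} positions")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical 22-residue sequences differing at the final position
subject = "MKTAYIAKQRQISFVKSHFSRQ"
candidate = "MKTAYIAKQRQISFVKSHFSRA"
print(f"{percent_identity(subject, candidate):.1f}% identity")
```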

[00127] Polypeptide sequence identity can be determined in the following manner. The subject polypeptide sequence is compared to a candidate polypeptide sequence using BLASTP (from the BLAST suite of programs, version 2.2.5 [Nov 2002]) in bl2seq, which is publicly available from NCBI (ftp://ftp.ncbi.nih.gov/blast/). The default parameters of bl2seq are utilized except that filtering of low complexity regions should be turned off.

[00128] Polypeptide sequence identity may also be calculated over the entire length of the overlap between a candidate and subject polypeptide sequences using global sequence alignment programs. EMBOSS-needle (available at http://www.ebi.ac.uk/emboss/align/) and GAP (Huang, X. (1994) On Global Sequence Alignment. Computer Applications in the Biosciences 10, 227-235.) as discussed above are also suitable global sequence alignment programs for calculating polypeptide sequence identity.

[00129] Another method for calculating polypeptide % sequence identity is based on aligning sequences to be compared using Clustal X (Jeanmougin et al., 1998, Trends Biochem. Sci. 23, 403-5.)

[00130] Polypeptide variants as disclosed herein also encompass those which exhibit a similarity to one or more of the specifically identified sequences that is likely to preserve the functional equivalence of those sequences and which could not reasonably be expected to have occurred by random chance. Such sequence similarity with respect to polypeptides may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from NCBI (ftp://ftp.ncbi.nih.gov/blast/). The similarity of polypeptide sequences may be examined using the following unix command line parameters: bl2seq -i peptideseq1 -j peptideseq2 -F F -p blastp

[00131] The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. This program finds regions of similarity between the sequences and for each such region reports an "E value" which is the expected number of times one could expect to see such a match by chance in a database of a fixed reference size containing random sequences. For small E values, much less than one, this is approximately the probability of such a random match.

[00132] Variant polypeptide sequences preferably exhibit an E value of less than 1 x 10^-10, more preferably less than 1 x 10^-20, more preferably less than 1 x 10^-30, more preferably less than 1 x 10^-40, more preferably less than 1 x 10^-50, more preferably less than 1 x 10^-60, more preferably less than 1 x 10^-70, more preferably less than 1 x 10^-80, more preferably less than 1 x 10^-90 and most preferably less than 1 x 10^-100 when compared with any one of the specifically identified sequences.

[00133] Conservative substitutions of one or several amino acids of a described polypeptide sequence without significantly altering its biological activity are also included in the invention. A skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al., 1990, Science 247, 1306).

[00134] The function of a variant polypeptide disclosed herein as modifier of GGP translation, production and/or activity may be assessed for example by expressing such a sequence in a host cell and testing its activity as described herein in the Examples. Function of a variant may also be tested for its ability to alter AsA content in plants, also as described in the Examples section herein.

[00135] In various embodiments the polypeptide or polypeptide variant may comprise one or more SANT/MYB domains. A SANT domain is a protein domain that allows many chromatin remodelling proteins to interact with histones (Boyer et al., 2002, Molecular Cell 10(4): 935-942; Boyer et al., 2004, Nature Reviews Molecular Cell Biology 5(2): 158-163). SANT domains have an acidic predicted isoelectric point (pI), whereas MYB domains have a basic pI (Ko et al., 2008, Molecular Cancer 7(1): 77). MYBS1 proteins are predicted to have two SANT/MYB domains. For example, MYBS1 from Actinidia eriantha (SEQ ID No: 103) has an N-terminal SANT domain with a predicted acidic isoelectric point (pI), and a second SANT/MYB region with a predicted basic pI.

6. Methods for identifying variants

Physical methods

[00136] Variant polypeptides may be identified using PCR-based methods (Mullis et al., Eds. 1994 The Polymerase Chain Reaction, Birkhauser). Typically, the polynucleotide sequence of a primer, useful to amplify variants of polynucleotide molecules as disclosed herein by PCR, may be based on a sequence encoding a conserved region of the corresponding amino acid sequence.

[00137] Alternatively, library screening methods, well known to those skilled in the art, may be employed (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987). When identifying variants of the probe sequence, hybridization and/or wash stringency will typically be reduced relative to when exact sequence matches are sought.

[00138] Polypeptide variants may also be identified by physical methods, for example by screening expression libraries using antibodies raised against polypeptides disclosed herein (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987) or by identifying polypeptides from natural sources with the aid of such antibodies.

Computer based methods

[00139] The variant sequences as disclosed herein, including both polynucleotide and polypeptide variants, may also be identified by computer-based methods well-known to those skilled in the art, using public domain sequence alignment algorithms and sequence similarity search tools to search sequence databases (public domain databases include Genbank, EMBL, Swiss-Prot, PIR and others). See, e.g., Nucleic Acids Res. 29: 1-10 and 11-16, 2001 for examples of online resources. Similarity searches retrieve and align target sequences for comparison with a sequence to be analyzed (i.e., a query sequence). Sequence comparison algorithms use scoring matrices to assign an overall score to each of the alignments.

[00140] An exemplary family of programs useful for identifying variants in sequence databases is the BLAST suite of programs (version 2.2.5 [Nov 2002]) including BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX, which are publicly available from (ftp://ftp.ncbi.nih.gov/blast/) or from the National Center for Biotechnology Information (NCBI), National Library of Medicine, Building 38A, Room 8N805, Bethesda, MD 20894 USA. The NCBI server also provides the facility to use the programs to screen a number of publicly available sequence databases. BLASTN compares a nucleotide query sequence against a nucleotide sequence database. BLASTP compares an amino acid query sequence against a protein sequence database. BLASTX compares a nucleotide query sequence translated in all reading frames against a protein sequence database. tBLASTN compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames. tBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database. The BLAST programs may be used with default parameters or the parameters may be altered as required to refine the screen.
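The pairing of each BLAST program with its query and database types, as described above, can be summarized in a short lookup. This is an illustrative sketch only: the function name and its arguments are hypothetical and do not correspond to any option of the BLAST programs themselves.

```python
def select_blast_program(query_type: str, database_type: str, translated: bool = False) -> str:
    """Return the BLAST program matching the query and database types.

    query_type and database_type are "nucleotide" or "protein". The
    translated flag only matters for nucleotide-versus-nucleotide
    searches, where it selects a six-frame translated comparison.
    """
    valid = {"nucleotide", "protein"}
    if query_type not in valid or database_type not in valid:
        raise ValueError("types must be 'nucleotide' or 'protein'")
    if query_type == "protein" and database_type == "protein":
        return "BLASTP"
    if query_type == "nucleotide" and database_type == "protein":
        return "BLASTX"   # nucleotide query translated in all reading frames
    if query_type == "protein" and database_type == "nucleotide":
        return "tBLASTN"  # database translated in all reading frames
    # nucleotide query against nucleotide database
    return "tBLASTX" if translated else "BLASTN"


print(select_blast_program("nucleotide", "nucleotide", translated=True))  # tBLASTX
```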

[00141] The use of the BLAST family of algorithms, including BLASTN, BLASTP, and BLASTX, is described in the publication of Altschul et al., Nucleic Acids Res. 25: 3389-3402, 1997.

[00142] The "hits" to one or more database sequences by a queried sequence produced by BLASTN, BLASTP, BLASTX, tBLASTN, tBLASTX, or a similar algorithm, align and identify similar portions of sequences. The hits are arranged in order of the degree of similarity and the length of sequence overlap. Hits to a database sequence generally represent an overlap over only a fraction of the sequence length of the queried sequence.

[00143] The BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX algorithms also produce "Expect" values for alignments. The Expect value (E) indicates the number of hits one can "expect" to see by chance when searching a database of the same size containing random contiguous sequences. The Expect value is used as a significance threshold for determining whether the hit to a database indicates true similarity. For example, an E value of 0.1 assigned to a polynucleotide hit is interpreted as meaning that in a database of the size of the database screened, one might expect to see 0.1 matches over the aligned portion of the sequence with a similar score simply by chance. For sequences having an E value of 0.01 or less over aligned and matched portions, the probability of finding a match by chance in that database is 1% or less using the BLASTN, BLASTP, BLASTX, tBLASTN or tBLASTX algorithm.

[00144] Multiple sequence alignments of a group of related sequences can be carried out with CLUSTALW (Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22:4673-4680), CLUSTAL Omega (Sievers et al., (2011). Molecular Systems Biology 7:539, https://www.ebi.ac.uk/Tools/msa/clustalo/) or T-COFFEE (Cedric Notredame, Desmond G. Higgins, Jaap Heringa, T-Coffee: A novel method for fast and accurate multiple sequence alignment, J. Mol. Biol. (2000) 302: 205-217) or PILEUP, which uses progressive, pairwise alignments (Feng and Doolittle, 1987, J. Mol. Evol. 25, 351).

[00145] Pattern recognition software applications are available for finding motifs or signature sequences. For example, MEME (Multiple Em for Motif Elicitation) finds motifs and signature sequences in a set of sequences, and MAST (Motif Alignment and Search Tool) uses these motifs to identify similar or the same motifs in query sequences. The MAST results are provided as a series of alignments with appropriate statistical data and a visual overview of the motifs found. MEME and MAST were developed at the University of California, San Diego.

[00146] PROSITE (Bairoch and Bucher, 1994, Nucleic Acids Res. 22, 3583; Hofmann et al., 1999, Nucleic Acids Res. 27, 215) is a method of identifying the functions of uncharacterized proteins translated from genomic or cDNA sequences. The PROSITE database (www.expasy.org/prosite) contains biologically significant patterns and profiles and is designed so that it can be used with appropriate computational tools to assign a new sequence to a known family of proteins or to determine which known domain(s) are present in the sequence (Falquet et al., 2002, Nucleic Acids Res. 30, 235). Prosearch is a tool that can search SWISS-PROT and EMBL databases with a given sequence pattern or signature.

[00147] Another example of a protein domain model database is Pfam (Sonnhammer et al., 1997, A comprehensive database of protein families based on seed alignments, Proteins, 28: 405-420; Finn et al., 2010, The Pfam protein families database, Nucl. Acids Res., 38: D211-D222). "Pfam" refers to a large collection of protein domains and protein families maintained by the Pfam Consortium and available at several sponsored world wide web sites, including: pfam.xfam.org/ (European Bioinformatics Institute (EMBL-EBI)). The latest release of Pfam is Pfam 35.0 (November 2021). Pfam domains and families are identified using multiple sequence alignments and hidden Markov models (HMMs). Pfam-A family or domain assignments are high quality assignments generated by a curated seed alignment using representative members of a protein family and profile hidden Markov models based on the seed alignment. (Unless otherwise specified, matches of a queried protein to a Pfam domain or family are Pfam-A matches.) All identified sequences belonging to the family are then used to automatically generate a full alignment for the family (Sonnhammer (1998) Nucleic Acids Research 26, 320-322; Bateman (2000) Nucleic Acids Research 26, 263-266; Bateman (2004) Nucleic Acids Research 32, Database Issue, D138-D141; Finn (2006) Nucleic Acids Research Database Issue 34, D247-251; Finn (2010) Nucleic Acids Research Database Issue 38, D211-222). By accessing the Pfam database, for example, using the above-referenced website, protein sequences can be queried against the HMMs using HMMER homology search software (e.g., HMMER2, HMMER3, or a higher version, hmmer.org). Significant matches that identify a queried protein as being in a Pfam family (or as having a particular Pfam domain) are those in which the bit score is greater than or equal to the gathering threshold for the Pfam domain. Expectation values (e values) can also be used as a criterion for inclusion of a queried protein in a Pfam or for determining whether a queried protein has a particular Pfam domain, where low e values (much less than 1.0, for example less than 0.1, or less than or equal to 0.01) represent low probabilities that a match is due to chance.
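The significance criteria for a Pfam match described above can be expressed as a simple decision rule. The following sketch is illustrative only: the class and function names are hypothetical, the scores and threshold are invented example values, and treating the e value as an optional additional filter (rather than as an alternative criterion) is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DomainHit:
    family: str       # name of the matched domain or family
    bit_score: float  # bit score reported by the HMM search
    e_value: float    # expectation value reported by the HMM search


def is_significant(hit: DomainHit, gathering_threshold: float,
                   max_e_value: Optional[float] = None) -> bool:
    """True if the bit score meets the gathering threshold and, when
    max_e_value is given, the e value does not exceed it."""
    if hit.bit_score < gathering_threshold:
        return False
    if max_e_value is not None and hit.e_value > max_e_value:
        return False
    return True


# Invented example values, not taken from any Pfam release
hit = DomainHit(family="EXAMPLE_DOMAIN", bit_score=31.2, e_value=4e-8)
print(is_significant(hit, gathering_threshold=25.0, max_e_value=0.01))  # True
```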

7. Methods for isolating or producing polynucleotides

[00148] The polynucleotide molecules disclosed herein can also be isolated by using a variety of techniques known to those of ordinary skill in the art. By way of example, such polynucleotides can be isolated through use of the polymerase chain reaction (PCR) described in Mullis et al., Eds. 1994 The Polymerase Chain Reaction, Birkhauser, incorporated herein by reference. The polynucleotides as herein disclosed can be amplified using primers, as defined herein, derived from the polynucleotide sequences as herein disclosed.

[00149] Further methods for isolating polynucleotides as disclosed herein include use of all, or portions of, the polynucleotides having the sequence set forth herein as hybridization probes. The technique of hybridizing labelled polynucleotide probes to polynucleotides immobilized on solid supports such as nitrocellulose filters or nylon membranes, can be used to screen the genomic or cDNA libraries. Exemplary hybridization and wash conditions are: hybridization for 20 hours at 65°C in 5.0 X SSC, 0.5% sodium dodecyl sulfate, 1 X Denhardt's solution; washing (three washes of twenty minutes each at 55°C) in 1.0 X SSC, 1% (w/v) sodium dodecyl sulfate, and optionally one wash (for twenty minutes) in 0.5 X SSC, 1% (w/v) sodium dodecyl sulfate, at 60°C. An optional further wash (for twenty minutes) can be conducted under conditions of 0.1 X SSC, 1% (w/v) sodium dodecyl sulfate, at 60°C.

[00150] The polynucleotide fragments as disclosed herein may be produced by techniques well-known in the art such as restriction endonuclease digestion, oligonucleotide synthesis and PCR amplification.

[00151] A partial polynucleotide sequence may be used, in methods well-known in the art, to identify the corresponding full-length polynucleotide sequence. Such methods include PCR-based methods, 5'RACE (Frohman MA, 1993, Methods Enzymol. 218: 340-56), hybridization-based methods and computer/database-based methods. Further, by way of example, inverse PCR permits acquisition of unknown sequences, flanking the polynucleotide sequences disclosed herein, starting with primers based on a known region (Triglia et al., 1988, Nucleic Acids Res 16, 8186, incorporated herein by reference). The method uses several restriction enzymes to generate a suitable fragment in the known region of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR template. Divergent primers are designed from the known region. In order to physically assemble full-length clones, standard molecular biology approaches can be utilized (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).

[00152] It may be beneficial, when producing a transgenic plant from a particular species, to transform such a plant with a sequence or sequences derived from that species. The benefit may be to alleviate public concerns regarding cross-species transformation in generating transgenic organisms. Additionally when down-regulation of a gene is the desired result, it may be necessary to utilise a sequence identical (or at least highly similar) to that in the plant, for which reduced expression is desired. For these reasons among others, it is desirable to be able to identify and isolate orthologues of a particular gene in several different plant species.

[00153] Variants (including orthologues) may be identified by the methods described herein.

8. Methods for isolating or producing polypeptides

[00154] The polypeptides as disclosed herein, including variant polypeptides, may be prepared using peptide synthesis methods well known in the art such as direct peptide synthesis using solid phase techniques (e.g. Stewart et al., 1969, in Solid-Phase Peptide Synthesis, WH Freeman Co, San Francisco California), or automated synthesis, for example using an Applied Biosystems 431A Peptide Synthesizer (Foster City, California). Mutated forms of the polypeptides may also be produced during such syntheses.

[00155] The polypeptides and variant polypeptides as disclosed herein may also be purified from natural sources using a variety of techniques that are well known in the art (e.g. Deutscher, 1990, Ed, Methods in Enzymology, Vol. 182, Guide to Protein Purification,).

[00156] Alternatively the polypeptides and variant polypeptides as disclosed herein may be expressed recombinantly in suitable host cells as disclosed herein and separated from the cells as discussed below.

9. Constructs, vectors and components thereof

[00157] According to one embodiment, the polynucleotides useful in the methods according to some embodiments of the invention may be provided in a nucleic acid construct useful in transforming a plant or host cell. Suitable plant and host cells are described herein.

[00158] The term "genetic construct" refers to a polynucleotide molecule, usually double-stranded DNA, which may have inserted into it another polynucleotide molecule (the insert polynucleotide molecule) such as, but not limited to, a cDNA molecule. A genetic construct may contain the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide. The insert polynucleotide molecule may be derived from the host cell, or may be derived from a different cell or organism and/or may be a synthetic or recombinant polynucleotide. Once inside the host cell the genetic construct may become integrated in the host chromosomal DNA. The genetic construct may be linked to a vector.

[00159] The term "vector" refers to a polynucleotide molecule, usually double stranded DNA, which is used to transport the genetic construct into a host cell. The vector may be capable of replication in at least one additional host system, such as E. coli.

[00160] The term "expression construct" refers to a genetic construct that includes the necessary regulatory elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide. An expression construct typically comprises in a 5' to 3' direction: a) a promoter functional in the host cell into which the construct will be transformed, b) the polynucleotide to be expressed, and c) a terminator functional in the host cell into which the construct will be transformed.

[00161] The term "coding region" or "open reading frame" (ORF) refers to the sense strand of a genomic DNA sequence or a cDNA sequence that is capable of producing a transcription product and/or a polypeptide under the control of appropriate regulatory sequences. The coding sequence is identified by the presence of a 5' translation start codon and a 3' translation stop codon. When inserted into a genetic construct, a "coding sequence" is capable of being expressed when it is operably linked to promoter and terminator sequences.
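The identification of a coding sequence by a 5' translation start codon and a 3' translation stop codon can be sketched as a simple scan of the sense strand. This is a minimal illustration that assumes the standard start codon ATG and stop codons TAA, TAG and TGA, considers only the supplied strand, and returns the first complete open reading frame it finds. The function name and example sequence are hypothetical.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_first_orf(sense_strand: str) -> Optional[str]:
    """Return the first ATG-to-stop open reading frame on the given strand,
    including the stop codon, or None if no complete ORF is present."""
    seq = sense_strand.upper()
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon in the frame set by this start codon
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]
        # No in-frame stop for this start codon; try the next ATG
        start = seq.find("ATG", start + 1)
    return None


print(find_first_orf("CCATGGCTTGATAA"))  # ATGGCTTGA
```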

[00162] Because many microorganisms are capable of expressing multiple gene products from a polycistronic mRNA, multiple polypeptides can be expressed under the control of a single regulatory region for those microorganisms, if desired.

[00163] "Operably-linked" means that the sequence to be expressed is placed under the control of regulatory elements that include promoters, tissue-specific regulatory elements, temporal regulatory elements, enhancers, repressors and terminators. Typically, the translation initiation site of the translational reading frame of the coding sequence is positioned between one and about fifty nucleotides downstream of the regulatory region for a monocistronic gene.

[00164] "Regulatory region" refers to a nucleic acid having nucleotide sequences that influence transcription or translation initiation and rate, and stability and/or mobility of a transcription or translation product. Regulatory regions include, without limitation, promoter sequences, enhancer sequences, response elements, protein recognition sites, inducible elements, protein binding sequences, 5' and 3' untranslated regions (UTRs), transcriptional start sites, termination sequences, polyadenylation sequences, introns, and combinations thereof. A regulatory region typically comprises at least a core (basal) promoter. A regulatory region also can include at least one control element, such as an enhancer sequence, an upstream element or an upstream activation region (UAR). A regulatory region is operably linked to a coding sequence by positioning the regulatory region and the coding sequence so that the regulatory region is effective for regulating transcription or translation of the sequence. For example, to operably link a coding sequence and a promoter sequence, the translation initiation site of the translational reading frame of the coding sequence is typically positioned between one and about fifty nucleotides downstream of the promoter. A regulatory region can, however, be positioned as much as about 5,000 nucleotides upstream of the translation initiation site, or about 2,000 nucleotides upstream of the transcription start site.

[00165] The choice of regulatory regions to be included depends upon several factors, including, but not limited to, efficiency, selectability, inducibility, desired expression level, and preferential expression during certain culture stages. It is a routine matter for one of skill in the art to modulate the expression of a coding sequence by appropriately selecting and positioning regulatory regions relative to the coding sequence. It will be understood that more than one regulatory region can be present, e.g., introns, enhancers, upstream activation regions, transcription terminators, and inducible elements.

[00166] The term "noncoding region" refers to untranslated sequences that are upstream of the translational start site and downstream of the translational stop site. These sequences are also referred to respectively as the 5' UTR and the 3' UTR. These sequences may include elements required for transcription initiation and termination and for regulation of translation efficiency. The term "noncoding" also includes intronic sequences within genomic clones.

[00167] Terminators are sequences, which terminate transcription, and are found in the 3' untranslated ends of genes downstream of the translated sequence. Terminators are important determinants of mRNA stability and in some cases have been found to have spatial regulatory functions.

[00168] The term "promoter" refers to nontranscribed cis-regulatory elements upstream of the coding region that regulate gene transcription. Promoters comprise cis-initiator elements which specify the transcription initiation site and conserved boxes such as the TATA box, and motifs that are bound by transcription factors.

[00169] A "transgene" is a polynucleotide that is taken from one organism and introduced into a different organism by transformation. The transgene may be derived from the same species or from a different species as the species of the organism into which the transgene is introduced.

[00170] An "inverted repeat" is a sequence that is repeated, where the second half of the repeat is in the complementary strand, e.g.,

(5')GATCTA.......TAGATC(3')

(3')CTAGAT.......ATCTAG(5')

[00171] Read-through transcription will produce a transcript that undergoes complementary base-pairing to form a hairpin structure provided that there is a 3-5 bp spacer between the repeated regions.
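The inverted repeat shown above can be checked computationally: the second half of the repeat is the reverse complement of the first half. The sketch below is illustrative only. The function names are hypothetical, the spacer sequence is an invented example, and the minimum spacer length of 3 bases is taken from the lower end of the 3-5 bp range stated above.

```python
# Translation table mapping each DNA base to its complement
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def can_form_hairpin(left: str, spacer: str, right: str, min_spacer: int = 3) -> bool:
    """True if right is the reverse complement of left and the spacer
    separating them is at least min_spacer bases long."""
    return len(spacer) >= min_spacer and reverse_complement(left) == right.upper()


print(reverse_complement("GATCTA"))                 # TAGATC
print(can_form_hairpin("GATCTA", "ACGT", "TAGATC"))  # True
```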

10. Methods for producing constructs and vectors

[00172] The genetic constructs as disclosed herein comprise one or more polynucleotide sequences as disclosed herein and/or polynucleotides encoding polypeptides as disclosed herein, and may be useful for transforming, for example, bacterial, fungal, insect, mammalian or plant organisms. The genetic constructs disclosed herein are intended to include expression constructs as herein defined.

[00173] Methods for producing and using genetic constructs and vectors are well known in the art and are described generally in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987; Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing, 1987.

11. Host cells

[00174] In other embodiments, there is provided a host cell which comprises a genetic construct or vector as disclosed herein. In preferred embodiments, the host cell is genetically modified to express one of: a) a polynucleotide encoding a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356, or a variant of the polypeptide, or b) a polynucleotide comprising a nucleotide sequence selected from any one of the sequences SEQ ID NO: 103-108, or a variant thereof, and/or, one of c) a polynucleotide encoding a polypeptide with the amino acid sequence of any one of SEQ ID NO: 109 to 209, or a variant of the polypeptide, or d) a polynucleotide comprising a nucleotide sequence of SEQ ID NO: 210, or a variant thereof.

[00175] Host cells comprising genetic constructs, such as expression constructs, as disclosed herein are useful in methods well known in the art (e.g. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987; Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing, 1987) for recombinant production of polypeptides disclosed herein. Such methods may involve the culture of host cells in an appropriate medium in conditions suitable for or conducive to expression of a polynucleotide or polypeptide disclosed herein. The expressed recombinant polypeptide, which may optionally be secreted into the culture, may then be separated from the medium, host cells or culture medium by methods well known in the art (e.g. Deutscher, Ed, 1990, Methods in Enzymology, Vol 182, Guide to Protein Purification).

12. Methods for producing plant cells and plants comprising constructs and vectors

[00176] In other embodiments there is provided a plant cell which comprises a genetic construct as disclosed herein, and a plant cell modified to alter expression of a polynucleotide or polypeptide as disclosed herein. Plants comprising such cells are also provided.

[00177] GGP translation, production and/or activity may be altered in a plant through methods according to some embodiments of the invention. Such methods may involve the transformation of plant cells and plants with a construct designed to alter expression of a polynucleotide or polypeptide that alters GGP expression and/or AsA content in such plant cells and plants. Such methods also include the transformation of plant cells and plants with a combination of a construct as disclosed herein and one or more other constructs designed to alter expression of one or more polynucleotides or polypeptides which modulate GGP activity and/or AsA content in such plant cells and plants.

[00178] Methods for transforming plant cells, plants and portions thereof with polypeptides are described in Draper et al., 1988, Plant Genetic Transformation and Gene Expression. A Laboratory Manual. Blackwell Sci. Pub. Oxford, p. 365; Potrykus and Spangenberg, 1995, Gene Transfer to Plants. Springer-Verlag, Berlin; and Gelvin et al., 1993, Plant Molecular Biol. Manual. Kluwer Acad. Pub. Dordrecht. A review of transgenic plants, including transformation techniques, is provided in Galun and Breiman, 1997, Transgenic Plants. Imperial College Press, London.

[00179] Transformation may be transient or stable, as is known in the art. Transient transformation results in the temporary introduction of nucleic acid into a cell; however, the introduced nucleic acid does not integrate into the cell's genome. Stable transformation results in modification of the cell's genome, which will persist and may be passed on to subsequent generations of the cell. Preferably the plant cells and plants are stably transformed.

[00180] In some embodiments, the plant cell to be transformed comprises an endogenous GGP gene, preferably a functional GGP gene. For example, in some embodiments, the plant cell to be transformed has been previously transformed with a functional GGP gene. In other embodiments, the plant cell is co-transformed or subsequently transformed with a functional GGP gene. A functional GGP gene is a gene that encodes and expresses a functional GGP protein. A functional GGP protein is one that is capable of performing one or more functions of a GGP protein, such as catalysing the conversion of GDP-L-galactose to L-galactose 1-phosphate.

[00181] In some embodiments the GGP gene is a GGP3 gene. In some embodiments, the GGP gene encodes a protein with at least about 70% sequence identity to SEQ ID No: 365, preferably at least about 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, at least about 99%, or 100% sequence identity to SEQ ID No: 365.
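The percentage thresholds recited above can be illustrated with a minimal Python sketch. It assumes the two protein sequences have already been aligned to equal length by an external tool, with "-" marking gaps, and it takes the number of compared alignment columns as the denominator. That denominator is an assumption made for illustration only; alignment programs differ on this point, and the method of determining sequence identity defined elsewhere in this specification governs. The function names and the toy sequences are hypothetical and are not SEQ ID No: 365.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption (illustrative only): identity = matching columns divided by
    all compared columns, where columns that are gaps in both sequences
    are skipped and a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the pair reaches the given percent identity (default 70%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy example with hypothetical ten-residue sequences (two mismatches).
reference = "MKTAYIAKQR"
candidate = "MKTAYLAKQK"
print(percent_identity(reference, candidate))
print(meets_identity_threshold(reference, candidate))
```

In practice the alignment itself would come from an established alignment program, and only the final counting step resembles the sketch.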

[00182] In some embodiments, the GGP gene has a promoter that comprises one or more MYBS1 binding sites, preferably two or more MYBS1 binding sites. In some embodiments, the methods described herein may further comprise gene-editing the promoter of an endogenous GGP gene to provide a promoter comprising one or more MYBS1 binding sites. This may be achieved, for example, by modifying an endogenous GGP promoter. Alternatively, this may be achieved by replacing an endogenous promoter with a GGP promoter comprising one or more MYBS1 binding sites, such as SEQ ID No: 345. In some embodiments, the methods described herein may further comprise transforming the plant cell with a GGP gene. In some embodiments, the methods may further comprise transforming the plant cell with a GGP gene having a promoter that comprises one or more MYBS1 binding sites, preferably two or more MYBS1 binding sites.

[00183] An MYBS1 binding site is any nucleotide sequence that an MYBS1 protein is capable of binding to. MYBS1 binding sites may be determined by a number of methods known in the art, for example using bioinformatic prediction (using, for example, JASPAR 2020 (Fornes et al., 2019, Nucleic Acids Research 48(D1): D87-D92)) or using an electrophoretic mobility shift assay (EMSA). Some exemplary methods for identifying MYBS1 binding sites are presented in Example 4.

[00184] Preferably the one or more MYBS1 binding sites comprise the sequence TCTTATC or its reverse complement GATAAGA. In some embodiments, the GGP promoter has at least about 70% sequence identity to SEQ ID No: 345, preferably at least about 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, at least about 99%, or 100% sequence identity to SEQ ID No: 345.
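As a simple illustration of the motif recited in paragraph [00184], the following Python sketch scans a promoter sequence for TCTTATC on the given strand and for its reverse complement GATAAGA, which corresponds to a site on the opposite strand. It is an exact-match search only and is not a substitute for the bioinformatic prediction or EMSA methods referred to in paragraph [00183]. The function names and the toy promoter string are hypothetical; the toy string is not SEQ ID No: 345.

```python
MYBS1_MOTIF = "TCTTATC"  # motif recited in paragraph [00184]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif_sites(promoter: str, motif: str = MYBS1_MOTIF):
    """Return sorted (start, strand) tuples for exact motif matches.

    start is a 0-based index into the given strand; strand "+" means the
    motif itself was found, "-" means its reverse complement was found.
    Overlapping matches are reported.
    """
    seq = promoter.upper()
    patterns = {"+": motif.upper()}
    rev = reverse_complement(motif)
    if rev != patterns["+"]:  # avoid double counting palindromic motifs
        patterns["-"] = rev
    hits = []
    for strand, pattern in patterns.items():
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical toy promoter containing one site on each strand.
toy_promoter = "AAATCTTATCGGGATAAGATT"
sites = find_motif_sites(toy_promoter)
print(sites)
print("two or more sites:", len(sites) >= 2)
```

Counting the returned sites gives a quick check of whether a candidate promoter contains one or more, or two or more, exact copies of the motif.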

Methods for genetic manipulation of plants

[00185] A number of plant transformation strategies are available (e.g. Birch, 1997, Ann Rev Plant Phys Plant Mol Biol, 48, 297, Hellens RP, et al (2000) Plant Mol Biol 42: 819-32, Hellens R et al Plant Meth 1: 13). For example, strategies may be designed to increase expression of a polynucleotide/polypeptide in a plant cell, organ and/or at a particular developmental stage where/when it is normally expressed or to ectopically express a polynucleotide/polypeptide in a cell, tissue, organ and/or at a particular developmental stage which/when it is not normally expressed. The expressed polynucleotide/polypeptide may be derived from the plant species to be transformed or may be derived from a different plant species.

[00186] Genetic constructs for expression of genes in transgenic plants typically include promoters for driving the expression of one or more cloned polynucleotides, terminators and selectable marker sequences to detect presence of the genetic construct in the transformed plant.

[00187] The promoters suitable for use in the constructs as described herein are functional in a cell, tissue or organ of a monocot or dicot plant and include cell-, tissue- and organ-specific promoters, cell cycle specific promoters, temporal promoters, inducible promoters, constitutive promoters that are active in most plant tissues, and recombinant promoters. Choice of promoter will depend upon the temporal and spatial expression of the cloned polynucleotide, so desired. The promoters may be those normally associated with a transgene of interest, or promoters which are derived from genes of other plants, viruses, and plant pathogenic bacteria and fungi. Those skilled in the art will, without undue experimentation, be able to select promoters that are suitable for use in modifying and modulating plant traits using genetic constructs comprising the polynucleotide sequences as herein disclosed. Examples of constitutive plant promoters include the CaMV 35S promoter, the nopaline synthase promoter and the octopine synthase promoter, and the Ubi 1 promoter from maize. Plant promoters which are active in specific tissues, respond to internal developmental signals or external abiotic or biotic stresses are described in the scientific literature. Exemplary promoters are described, e.g., in WO 02/00894, which is herein incorporated by reference.

[00188] In some embodiments, the promoter is active in leaves and/or roots. In some embodiments, the promoter is a tissue-specific promoter that is active in leaves and/or roots. In a preferred embodiment, the promoter is active in leaves. Promoters that are active in leaves and/or roots, including tissue-specific promoters, are known in the art.

[00189] Exemplary terminators that are commonly used in plant transformation genetic constructs include, e.g., the cauliflower mosaic virus (CaMV) 35S terminator, the Agrobacterium tumefaciens nopaline synthase or octopine synthase terminators, the Zea mays zein gene terminator, the Oryza sativa ADP-glucose pyrophosphorylase terminator and the Solanum tuberosum PI-II terminator.

[00190] Selectable markers commonly used in plant transformation include the neomycin phosphotransferase II gene (NPT II) which confers kanamycin resistance, the aadA gene, which confers spectinomycin and streptomycin resistance, the phosphinothricin acetyl transferase (bar gene) for Ignite (AgrEvo) and Basta (Hoechst) resistance, and the hygromycin phosphotransferase gene (hpt) for hygromycin resistance.

[00191] Use of genetic constructs comprising reporter genes (coding sequences which express an activity that is foreign to the host, usually an enzymatic activity and/or a visible signal (e.g., luciferase, GUS, GFP)), which may be used for promoter expression analysis in plants and plant tissues, is also contemplated. The reporter gene literature is reviewed in Herrera-Estrella et al., 1993, Nature 303, 209, and Schrott, 1995, In: Gene Transfer to Plants (Potrykus, T., Spangenberg, Eds), Springer Verlag, Berlin, pp. 325-336.

[00192] The following are representative publications disclosing genetic transformation protocols that can be used to genetically transform the following plant species: rice (Alam et al., 1999, Plant Cell Rep. 18, 572); apple (Yao et al., 1995, Plant Cell Reports 14, 407-412); maize (US Patent Serial Nos. 5,177,010 and 5,981,840); wheat (Ortiz et al., 1996, Plant Cell Rep. 15, 877); tomato (US Patent Serial No. 5,159,135); potato (Kumar et al., 1996, Plant J. 9, 821); cassava (Li et al., 1996, Nat. Biotechnology 14, 736); lettuce (Michelmore et al., 1987, Plant Cell Rep. 6, 439); tobacco (Horsch et al., 1985, Science 227, 1229); cotton (US Patent Serial Nos. 5,846,797 and 5,004,863); grasses (US Patent Nos. 5,187,073 and 6,020,539); peppermint (Niu et al., 1998, Plant Cell Rep. 17, 165); citrus plants (Pena et al., 1995, Plant Sci. 104, 183); caraway (Krens et al., 1997, Plant Cell Rep. 17, 39); banana (US Patent Serial No. 5,792,935); soybean (US Patent Nos. 5,416,011; 5,569,834; 5,824,877; 5,563,04455 and 5,968,830); pineapple (US Patent Serial No. 5,952,543); poplar (US Patent No. 4,795,855); monocots in general (US Patent Nos. 5,591,616 and 6,037,522); brassica (US Patent Nos. 5,188,958; 5,463,174 and 5,750,871); cereals (US Patent No. 6,074,877); pear (Matsuda et al., 2005, Plant Cell Rep. 24(1):45-51); Prunus (Ramesh et al., 2006, Plant Cell Rep. 25(8):821-8; Song and Sink, 2005, Plant Cell Rep. 2006; 25(2):117-23; Gonzalez Padilla et al., 2003, Plant Cell Rep. 22(1):38-45); strawberry (Oosumi et al., 2006, Planta 223(6):1219-30; Folta et al., 2006, Planta Apr 14; PMID: 16614818); rose (Li et al., 2003); Rubus (Graham et al., 1995, Methods Mol Biol. 44:129-33); tomato (Dan et al., 2006, Plant Cell Reports 25:432-441); apple (Yao et al., 1995, Plant Cell Rep. 14, 407-412) and Actinidia eriantha (Wang et al., 2006, Plant Cell Rep. 25(5):425-31). Transformation of other species is also contemplated by the invention. Suitable methods and protocols are available in the scientific literature.

[00193] In one embodiment, there is provided a method of producing a plant cell or plant with increased L-ascorbic acid (AsA) content or increased GDP-L-galactose phosphorylase 3 (GGP) translation, production and/or activity, the method comprising upregulating in the plant cell or plant expression of one of: a) a polynucleotide encoding a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356, or a variant of the polypeptide, or b) a polynucleotide comprising a nucleotide sequence selected from any one of the sequences SEQ ID NO: 103-108, or a variant thereof, and/or, one of c) a polynucleotide encoding a polypeptide with the amino acid sequence of any one of SEQ ID NO: 109 to 209, or a variant of the polypeptide, or d) a polynucleotide comprising a nucleotide sequence of SEQ ID NO: 210, or a variant thereof.

[00194] Several methods known in the art may be employed to alter expression of a nucleotide and/or polypeptide as herein disclosed. Such methods include but are not limited to Tilling (Till et al., 2003, Methods Mol Biol, 236, 205), and so-called "Deletagene" technology (Li et al., 2001, Plant Journal 27(3), 235).

[00195] Other methods may involve the use of sequence-specific nucleases that generate targeted double-stranded DNA breaks in genes of interest. Examples of such methods include: zinc finger nucleases (Curtin, et al., 2011, Sander, et al., 2011), transcription activator-like effector nucleases or "TALENs" (Cermak, et al., 2011, Mahfouz, et al., 2011, Li, et al., 2012), and LAGLIDADG homing endonucleases, also termed "meganucleases" (Tzfira, et al., 2012).

[00196] Targeted genome editing using engineered nucleases such as clustered, regularly interspaced, short palindromic repeat (CRISPR) technology, is an important new approach for generating RNA-guided nucleases, such as Cas9, with customizable specificities. Genome editing mediated by these nucleases has been used to rapidly, easily and efficiently modify endogenous genes in a wide variety of biomedically important cell types and in organisms that have traditionally been challenging to manipulate genetically. A modified version of the CRISPR-Cas9 system has been developed to recruit heterologous domains that can regulate endogenous gene expression or label specific genomic loci in living cells (Sander and Joung, 2014). The technique is applicable to fungi (Nodvig, et al., 2015).

[00197] Upregulating expression of a polypeptide in a plant, for example by genome editing, can be achieved by: (i) replacing an endogenous sequence encoding the polypeptide of interest or a regulatory sequence under the control of which it is placed, and/or (ii) inserting a new gene encoding the polypeptide of interest in a targeted region of the genome, and/or (iii) introducing point mutations which result in up-regulation of the endogenous gene encoding the polypeptide of interest (e.g., by altering the regulatory sequences such as promoter, enhancers, 5'-UTR and/or 3'-UTR, or mutations in the coding sequence).

[00198] In this manner, an endogenous gene encoding a polypeptide with the amino acid sequence of any one of SEQ ID NO: 1-102 and 348-356 or 109-209 or a variant of the polypeptide, or comprising a nucleotide sequence selected from any one of the sequences SEQ ID NO: 103-108 or 210 or a variant thereof, may be upregulated, resulting in increased AsA content and/or increased GGP translation, production and/or activity.

[00199] Antibodies or fragments thereof, targeted to a particular polypeptide may also be expressed in plants to modulate the activity of that polypeptide (Jobling et al., 2003, Nat. Biotechnol., 21(1), 35). Transposon tagging approaches may also be applied. Additionally peptides interacting with a polypeptide as herein disclosed may be identified through technologies such as phage-display (Dyax Corporation). Such interacting peptides may be expressed in or applied to a plant to affect activity of a polypeptide as herein disclosed. Use of each of the above approaches in alteration of expression of a nucleotide and/or polypeptide as herein disclosed is specifically contemplated.

[00200] The terms "to alter expression of" and "altered expression" of a polynucleotide or polypeptide as herein disclosed, are intended to encompass the situation where genomic DNA corresponding to a polynucleotide as herein disclosed is modified thus leading to altered expression of a polynucleotide or polypeptide as herein disclosed. Modification of the genomic DNA may be through genetic transformation or other methods known in the art for inducing mutations. The "altered expression" can be related to an increase or decrease in the amount of messenger RNA and/or polypeptide produced and may also result in altered activity of a polypeptide due to alterations in the sequence of a polynucleotide and polypeptide produced.

13. Methods of selecting plants

[00201] Methods are also provided for selecting plants with altered GGP activity or AsA content. Such methods involve testing of plants for altered expression of a polynucleotide or polypeptide as herein disclosed. Such methods may be applied at a young age or early developmental stage when the altered GGP activity or AsA content may not necessarily be easily measurable.

[00202] The expression of a polynucleotide, such as a messenger RNA, is often used as an indicator of expression of a corresponding polypeptide. Exemplary methods for measuring the expression of a polynucleotide include but are not limited to Northern analysis, RT-PCR and dot-blot analysis (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987). Polynucleotides or portions of the polynucleotides as herein disclosed are thus useful as probes or primers, as herein defined, in methods for the identification of plants with altered levels of GGP or AsA. The polynucleotides as herein disclosed may be used as probes in hybridization experiments, or as primers in PCR based experiments, designed to identify such plants.

[00203] Alternatively antibodies may be raised against polypeptides as herein disclosed. Methods for raising and using antibodies are standard in the art (see for example: Antibodies, A Laboratory Manual, Harlow & Lane, Eds, Cold Spring Harbor Laboratory, 1998). Such antibodies may be used in methods to detect altered expression of the polypeptides disclosed herein. Such methods may include ELISA (Kemeny, 1991, A Practical Guide to ELISA, NY Pergamon Press) and Western analysis (Towbin & Gordon, 1994, J Immunol Methods, 72, 313).

[00204] These approaches for analysis of polynucleotide or polypeptide expression and the selection of plants with altered GGP activity or altered AsA content are useful in conventional breeding programs designed to produce varieties with altered GGP activity or AsA content.

14. Plants

[00205] The term "plant" is intended to include a whole plant, any part of a plant, propagules and progeny of a plant.

[00206] The term "propagule" means any part of a plant that may be used in reproduction or propagation, either sexual or asexual, including seeds and cuttings.

[00207] A "transgenic" or "transformed" plant refers to a plant which contains new genetic material as a result of genetic manipulation or transformation. The new genetic material may be derived from a plant of the same species as the resulting transgenic or transformed plant or from a different species. A transformed plant includes a plant which is either stably or transiently transformed with new genetic material. Preferably a transformed plant is stably transformed.

[00208] The plants and plant cells according to some embodiments of the invention may be grown and either selfed or crossed with a different plant strain and the resulting hybrids, with the desired phenotypic characteristics, may be identified. Two or more generations may be grown to ensure that the subject phenotypic characteristics are stably maintained and inherited. Plants and plant cells resulting from such standard breeding approaches also form an aspect of the present invention.

[00209] The function of a variant polynucleotide disclosed herein as encoding a MYBS1-like or GBF3-like transcription factor may be assessed for example by expressing such a sequence in bacteria and testing activity of the encoded protein as described in the Example section herein.

[00210] GGP activity and/or AsA content may also be altered in a plant or plant cell through methods according to some embodiments of the invention. Such methods may involve the transformation of plant cells and plants with a construct as herein disclosed designed to alter expression of a polynucleotide or polypeptide which modulates GGP activity and/or AsA content in such plant cells and plants. Such methods preferably also include the transformation of plant cells and plants with a combination of the construct as herein disclosed and one or more other constructs designed to alter expression of one or more other polynucleotides or polypeptides which modulate AsA content in such plant cells and plants.

[00211] Any plant is suitable for use in the invention. The L-galactose biosynthetic pathway, which produces AsA, is present in all plants. The enzyme GDP-L-galactose phosphorylase (GGP) is critical to the pathway and is also present in all plants. Therefore, the methods of the invention can be used to alter GGP expression and/or increase AsA production in any plant.

[00212] In various embodiments the plant or plant cell is a gymnosperm plant species.

[00213] In a further embodiment the plant or plant cell is an angiosperm plant species.

[00214] In a further embodiment the plant or plant cell is a dicotyledonous plant species.

[00215] Plants and plant cells that are particularly useful in the methods of the invention disclosed herein include all plants and plant cells which belong to the superfamily Viridiplantae, in particular monocotyledonous and dicotyledonous plants including a fodder or forage legume, ornamental plant, food crop, tree, or shrub selected from the list comprising Abrus precatorius, Acacia spp., Acer spp., Actinidia spp., Aesculus spp., Arachis duranensis, Arabidopsis spp., Arachis ipaensis, Betula spp., Brassica spp., Buddleja alternifolia, Cajanus cajan, Camellia sinensis, Capsicum spp., Carex littledalei, Carica papaya, Carya illinoinensis, Castanea mollissima, Catharanthus roseus, Cephalotus follicularis, Chenopodium quinoa, Cinnamomum cassia, Citrus clementina, Citrus sinensis, Citrus unshiu, Coffea arabica, Coronillia varia, Corchorus olitorius, Corylus heterophylla, Cotoneaster serotina, Crataegus spp., Cucumis spp., Cupressus spp., Cyathea dealbata, Cydonia oblonga, Cryptomeria japonica, Cymbopogon spp., Dalbergia monetaria, Davallia divaricata, Desmodium spp., Dicksonia squarosa, Diheteropogon amplectens, Dioclea spp., Dolichos spp., Dorycnium rectum, Durio zibethinus, Echinochloa pyramidalis, Ehrartia spp., Eleusine coracana, Eragrestis spp., Erythrina spp., Eucalyptus spp., Euclea schimperi, Eulalia villosa, Fagopyrum spp., Feijoa sellowiana, Fragaria spp., Flemingia spp., Freycinetia banksii, Geranium thunbergii, Ginkgo biloba, Glycine javanica, Glycine max, Glycine soja, Gliricidia spp., Gossypium anomalum, Gossypium barbadense, Gossypium darwinii, Gossypium hirsutum, Gossypium mustelinum, Gossypium raimondii, Gossypium stocksii, Grevillea spp., Guibourtia coleosperma, Hedysarum spp., Helianthus annuus, Hemarthia altissima, Herrania umbratica, Heteropogon contortus, Hevea brasiliensis, Hibiscus syriacus, Hordeum vulgare, Hyparrhenia rufa, Hypericum erectum, Hyperthelia dissoluta, Indigo incarnata, Ipomoea nil, Ipomoea triloba, Iris spp., Jatropha curcas, Juglans macrocarpa, Juglans regia, Juglans macrocarpa x Juglans regia, Lactuca saligna, Lactuca sativa, Leptarrhena pyrolifolia, Lespediza spp., Lettuca spp., Leucaena leucocephala, Loudetia simplex, Lotonus bainesii, Lotus spp., Lupinus angustifolius, Macadamia integrifolia, Macrotyloma axillare, Malus spp., Manihot esculenta, Medicago sativa, Metasequoia glyptostroboides, Morella rubra, Morus notabilis, Mucuna pruriens, Musa sapientum, Nicotiana spp., Nelumbo nucifera, Nyssa sinensis, Onobrychis spp., Ornithopus spp., Oryza spp., Panicum hallii, Panicum virgatum, Papaver somniferum, Peltophorum africanum, Pennisetum spp., Persea gratissima, Petunia spp., Phalaenopsis equestris, Phaseolus spp., Phoenix canariensis, Phoenix dactylifera, Phormium cookianum, Photinia spp., Picea glauca, Pinus spp., Pisum sativum, Podocarpus totara, Pogonarthria fleckii, Pogonarthria squarrosa, Populus spp., Prosopis alba, Prosopis cineraria, Prunus armeniaca, Prunus avium, Prunus dulcis, Prunus mume, Prunus persica, Prunus yedoensis var. nudiflora, Pseudotsuga menziesii, Pterolobium stellatum, Punica granatum, Pyrus spp., Quercus spp., Rhamnella rubrinervis, Rhaphiolepsis umbellata, Rhodamnia argentea, Rhododendron griersonianum, Rhododendron simsii, Rhododendron williamsianum, Rhopalostylis sapida, Rhus natalensis, Ribes grossularia, Ribes spp., Ricinus communis, Robinia pseudoacacia, Rosa spp., Rubus spp., Salix spp., Schyzachyrium sanguineum, Sciadopitys verticillata, Sequoia sempervirens, Sequoiadendron giganteum, Solanum chilense, Solanum commersonii, Solanum lycopersicum, Solanum pennellii, Solanum tuberosum, Sorghum bicolor, Spatholobus suberectus, Spinacia spp., Sporobolus fimbriatus, Stiburus alopecuroides, Stylosanthos humilis, Tadehagi spp., Taxodium distichum, Telopea speciosissima, Tetracentron sinense, Themeda triandra, Theobroma cacao, Trema orientale, Trifolium spp., Triticum spp., Tsuga heterophylla, Vaccinium spp., Vicia spp., Vigna angularis, Vigna radiata var. radiata, Vigna unguiculata, Vitis riparia, Vitis vinifera, Watsonia pyramidata, Zantedeschia aethiopica, Zea mays, Ziziphus jujuba, amaranth, artichoke, asparagus, broccoli, Brussels sprouts, cabbage, canola, carrot, cauliflower, celery, collard greens, flax, kale, lentil, oilseed rape, okra, onion, potato, rice, soybean, straw, sugar beet, sugar cane, sunflower, tomato, squash and tea, amongst others.

[00216] In some embodiments, the plant or plant cell may be a crop plant, such as a food crop or a biofuel crop. In one embodiment, the plant or plant cell may be a cereal, legume, or fruit plant. In one embodiment, the plant or plant cell may be a cereal or legume plant. In one embodiment, the plant or plant cell may be from the family Poaceae. In one embodiment, the cereal may be rice, wheat, oats, barley, triticale, rye, finger millet, Sonoran millet, sorghum, or maize. In one embodiment, the cereal may be barley, wheat, rice, or maize. In another embodiment, the legume may be alfalfa or soybeans. In another embodiment, the fruit plant or plant cell may be apple, kiwifruit, or tomato. In one embodiment, the fruit plant or plant cell may be apple or kiwifruit. In one embodiment, the plant or plant cell may be selected from the group comprising kiwifruit, maize, tomato, wheat, barley, tobacco, soybean, rice, apple, cotton, brassicas, and alfalfa. In one embodiment, the plant or plant cell may be selected from the list consisting of Actinidia arguta, Actinidia chinensis, Actinidia eriantha, Arabidopsis thaliana, Glycine max, Gossypium hirsutum, Hordeum vulgare, Malus domestica, Medicago sativa, Nicotiana benthamiana, Nicotiana tabacum, Oryza sativa, Solanum lycopersicum, Triticum aestivum, and Zea mays. In one embodiment, the plant or plant cell may be a rice, soybean, or kiwifruit plant or plant cell. In one embodiment, the plant or plant cell may be selected from the list consisting of Actinidia arguta, Actinidia chinensis, Actinidia eriantha, Glycine max, and Oryza sativa.

[00217] In some embodiments, plants or plant cells grown specifically for "biomass" may be used. For example, suitable plants or plant cells include corn, switchgrass, sorghum, miscanthus, sugarcane, poplar, pine, wheat, rice, soy, cotton, barley, turf grass, tobacco, potato, bamboo, rape, sugar beet, sunflower, willow, and eucalyptus. In further embodiments, the plant or plant cell is switchgrass (Panicum virgatum), giant reed (Arundo donax), reed canarygrass (Phalaris arundinacea), Miscanthus x giganteus, Miscanthus sp., sericea lespedeza (Lespedeza cuneata), millet, ryegrass (Lolium multiflorum, Lolium sp.), timothy, Kochia (Kochia scoparia), forage soybeans, alfalfa, clover, sunn hemp, kenaf, bahiagrass, bermudagrass, dallisgrass, pangolagrass, big bluestem, indiangrass, fescue (Festuca sp.), Dactylis sp., Brachypodium distachyon, smooth bromegrass, orchardgrass, or Kentucky bluegrass amongst others.

[00218] Alternatively algae and other non-Viridiplantae can be used for the methods of some embodiments of the invention.

[00219] In some embodiments the plant or plant cell is a fruit species. In one embodiment, the plant or plant cell is a plant of the Cucurbitaceae family, such as S. grosvenorii. In various embodiments the fruit species is selected from the group comprising the following genera: Actinidia, Malus, Citrus, Fragaria and Vaccinium.

[00220] In a further embodiment the plant or plant cell is selected from the group consisting of Actinidia eriantha, Cucumis sativus, Glycine max, Solanum lycopersicum, Vitis vinifera, Arabidopsis thaliana, Malus x domesticus, Medicago truncatula, Populus trichocarpa, Actinidia arguta, Actinidia chinensis, Fragaria vulgaris, Solanum tuberosum, and Zea mays. In a further embodiment the plant or plant cell is selected from the group consisting of Actinidia eriantha, Cucumis sativus, Glycine max, Vitis vinifera, Arabidopsis thaliana, Malus x domesticus, Medicago truncatula, Populus trichocarpa, Actinidia arguta, Actinidia chinensis, Fragaria vulgaris, and Zea mays.

[00221] According to one embodiment, the plant or plant cell is a plant of the Rosaceae family, such as but not limited to, apple tree, pear tree, quince tree, apricot tree, plum tree, cherry tree, peach tree, raspberry bush, loquat tree, strawberry plant, almond tree, and ornamental trees and shrubs (e.g. roses, meadowsweets, photinias, firethorns, rowans, and hawthorns).

[00222] In one embodiment the pear is of the genus Pyrus. Preferred pear species include: Pyrus calleryana, Pyrus caucasica, Pyrus communis, Pyrus elaeagrifolia, Pyrus hybrid cultivar, Pyrus pyrifolia, Pyrus salicifolia, Pyrus ussuriensis and Pyrus x bretschneideri.

[00223] In one embodiment the plant or plant cell is of the genus Malus. Preferred Malus species include: Malus aldenhamensis, Malus angustifolia, Malus asiatica, Malus baccata, Malus coronaria, Malus domestica, Malus doumeri, Malus florentina, Malus floribunda, Malus fusca, Malus halliana, Malus honanensis, Malus hupehensis, Malus ioensis, Malus kansuensis, Malus mandshurica, Malus micromalus, Malus niedzwetzkyana, Malus ombrophilia, Malus orientalis, Malus prattii, Malus prunifolia, Malus pumila, Malus sargentii, Malus sieboldii, Malus sieversii, Malus sylvestris, Malus toringoides, Malus transitoria, Malus trilobata, Malus tschonoskii, Malus x domestica, Malus x domestica x Malus sieversii, Malus x domestica x Pyrus communis, Malus xiaojinensis, Malus yunnanensis, Malus sp., and Mespilus germanica. In one embodiment the plant species is Malus domestica. In a specific embodiment, the plant is a Malus domestica, Malus trilobata or Malus sieboldii.

[00224] In another embodiment, the plant or plant cell is a plant of a Vitis species. Exemplary Vitis species include, but are not limited to, Vitis piasezkii Maxim. and Vitis saccharifera Makino.

[00225] In one embodiment the plant or plant cell is a plant from a species selected from a group comprising but not limited to the following genera: Smilax (e.g. Smilax glyciphylla), and Fragaria.

[00226] In a further embodiment the plant or plant cell is from a vegetable species selected from a group comprising but not limited to the following genera: Brassica, Lycopersicon and Solanum.

[00227] Particularly preferred vegetable plant species are: Solanum lycopersicum (formerly Lycopersicon esculentum) and Solanum tuberosum.

[00228] In a further embodiment the plant or plant cell is from monocotyledonous species.

[00229] In a further embodiment the plant or plant cell is from a crop species selected from a group comprising but not limited to the following genera: Glycine, Zea, Hordeum and Oryza. Particularly preferred crop plant species are: Oryza sativa, Glycine max and Zea mays.

[00230] In various embodiments, the plant or plant cell is from a food crop or biofuel crop. Species useful for food or biofuel production are known in the art. For example, species useful for biofuel production may include Miscanthus x giganteus, Cenchrus purpureus, Cocos nucifera L., Jatropha L., and Ricinus communis L.

[00231] In various embodiments, the plant or plant cell is selected from the group consisting of rice, wheat, oats, barley, triticale, rye, finger millet, Sonoran millet, sorghum, maize, banana, Miscanthus, elephant grass or Uganda grass, coconut, sugarcane, cotton, sunflower, soybean, flax, sesame, jatropha, sugar beet, alfalfa, forage brassica, oilseed rape, mustard seed, almond, walnut, pecan, macadamia, peanut, and castor bean.

[00232] Plants may be grouped by phylogenetic classification as described in APG IV (The Angiosperm Phylogeny Group et al. (2016), An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG IV, Botanical Journal of the Linnean Society 181(1), 1-20, https://doi.org/10.1111/boj.12385), which is incorporated herein by reference.

[00233] In a further embodiment the plant or plant cell is selected from the group consisting of magnoliids, monocots, basal eudicots, Gunnerales, Dilleniales, superrosids, Santalales, Berberidopsidales, Caryophyllales, Cornales, Ericales, campanulids, Icacinales, Metteniusales, Garryales, Boraginales, Gentianales, Vahliales, and Lamiales.

[00234] In a further embodiment the plant or plant cell is selected from the group consisting of magnoliids, monocots, basal eudicots, Gunnerales, Dilleniales, superrosids, Santalales, Berberidopsidales, Caryophyllales, Cornales, Ericales, campanulids, Icacinales, Metteniusales, Garryales, Boraginales, Gentianales, Vahliales, Lamiales, Convolvulaceae, Montiniaceae, Sphenocleaceae, and Hydroleaceae.

[00235] In a further embodiment the plant or plant cell is selected from the group consisting of magnoliids, monocots, basal eudicots, Gunnerales, Dilleniales, Fabids, Geraniales, Myrtales, Crossosomatales, Picramniales, Malvales, Huerteales, Sapindales, Vitales, Saxifragales, Santalales, Berberidopsidales, Caryophyllales, Cornales, Ericales, campanulids, Icacinales, Metteniusales, Garryales, Boraginales, Gentianales, Vahliales, Lamiales, Convolvulaceae, Montiniaceae, Sphenocleaceae, and Hydroleaceae.

[00236] In a further embodiment the plant or plant cell is selected from the group consisting of magnoliids, monocots, basal eudicots, Gunnerales, Dilleniales, Fabids, Geraniales, Myrtales, Crossosomatales, Picramniales, Malvales, Huerteales, Sapindales, Vitales, Saxifragales, Santalales, Berberidopsidales, Caryophyllales, Cornales, Ericales, campanulids, Icacinales, Metteniusales, Garryales, Boraginales, Gentianales, Vahliales, Lamiales, Convolvulaceae, Montiniaceae, Sphenocleaceae, Hydroleaceae, Akaniaceae, Tropaeolaceae, Moringaceae, Caricaceae, Setchellanthaceae, Limnanthaceae, Salvadoraceae, Bataceae, Koeberliniaceae, Emblingiaceae, Pentadiplandraceae, Resedaceae, Gyrostemonaceae, Tovariaceae, Capparaceae, and Cleomaceae.

[00237] In some embodiments, the plant or plant cell contains an endogenous GGP3 gene, preferably a functional GGP3 gene. In some embodiments, the plant or plant cell has been previously transformed with a functional GGP3 gene. In some embodiments, the plant or plant cell is co-transformed or subsequently transformed with a functional GGP3 gene. In some embodiments, the GGP3 gene has a promoter comprising one or more MYBS1 binding sites, more preferably two or more MYBS1 binding sites. Preferably the MYBS1 binding sites comprise the sequence TCTTATC or its reverse complement GATAAGA.
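The motif stated above can be located in a promoter sequence with a simple two-strand scan. The following is a minimal sketch only: the function names are hypothetical, and the two example sequences are the wild-type EMSA probe sequences given in Table 1, used here purely as test input.

```python
# Minimal sketch: scan a DNA sequence for the MYBS1 binding motif TCTTATC
# on the forward strand and for its reverse complement GATAAGA.
# Function names are illustrative, not part of any published tool.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_mybs1_sites(promoter: str, motif: str = "TCTTATC"):
    """Return sorted (0-based start, strand) tuples for every motif hit.

    '+' marks a match to the motif itself, '-' a match to its
    reverse complement (i.e. the motif lies on the opposite strand).
    """
    promoter = promoter.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    probe_a = "TAATTAAATAGATAAGAAAAGAGAAAAAGG"
    probe_b = "AAATAAGGGCAATCTTATCATTTATGTCAA"
    print(find_mybs1_sites(probe_a))
    print(find_mybs1_sites(probe_b))
```

Counting the returned hits gives a quick way to check whether a candidate promoter carries one, two or more putative MYBS1 binding sites; it does not by itself demonstrate binding.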

15. Methods for extracting and measuring AsA from plants

[00238] Methods are also provided for the production of L-ascorbic acid (AsA) by extraction of AsA from a plant of the invention. AsA may be extracted from plants as follows:

[00239] Frozen tissue samples are ground to a fine powder in a Cryomill at liquid nitrogen temperature. About 200 mg of frozen powdered tissue is then suspended in 5 volumes of 7% metaphosphoric acid containing 2 mM TCEP (Pierce), vortexed for 20 sec and incubated in a heating block for 2 h at 40 °C. TCEP is used in the extraction solution because it is a more effective reducing agent than DTT under acidic conditions, ensuring that all of the vitamin C is in the reduced ascorbic acid form. The extract is centrifuged at 4 °C and twenty µL of the supernatant is injected into a Rocket column and eluted using two solvents, A (0.28% o-phosphoric acid, 0.1 mM EDTA and 0.25% methanol) and B (acetonitrile). L-ascorbic acid and other compounds are eluted using a 5-min gradient to 90% B. Standards are run with every batch or 20 samples processed. AsA is calculated from the area under the 240 nm absorbance curve at ~1 minute of elution.

[00240] This method may be up-scaled for larger scale AsA extraction using approaches well-known to those skilled in the art.

[00241] The above methods should be considered in no way limiting and suitable variations or alternatives will be apparent to those skilled in the art.

EXAMPLE

1. Materials and methods

Plant materials and growth conditions

[00242] Fruit samples were collected from field-grown plants in Hubei, China, during the 2016-2018 growing seasons. The AsA content in the fruits of 48 Actinidia species and one cross population (A. eriantha x A. rufa) was measured during 2017-2019. Six kiwifruit taxa with >150-fold variation in AsA content were selected to study the changes in AsA during fruit development (A. eriantha, A. latifolia, A. chinensis var. deliciosa, A. chinensis var. chinensis, A. rufa and A. cylindrica). Tissue culture materials for tobacco (Nicotiana benthamiana) and kiwifruit (A. eriantha) were grown at 23-25 °C under long-day conditions (16 h of light/8 h of darkness). Transgenic and gene-edited plants were potted and grown in Containment Glasshouse-1 at the Wuhan Botanical Garden, Chinese Academy of Sciences, Hubei, China (14 h of light/10 h of darkness, 18 °C min/30 °C max). Samples collected from individual plants were considered biological replicates.

Illumina RNA-seq and transcriptome analysis

[00243] Fruit RNA was extracted separately from A. eriantha and A. rufa fruits sampled from 20 days after flowering (DAF20) to DAF120, using RNeasy Plant Mini Kits (Qiagen, Inc., USA). Library construction and sequencing were performed at BerryGenomics (www.berrygenomics.com) using the Illumina NovaSeq 2000 platform, obtaining paired-end reads of 150 bp and generating > 20 million reads per sample. Sequencing adapters and low-quality raw reads were filtered out using Trimmomatic (DOI: 10.1093/bioinformatics/btu170) with default parameters. The filtered clean reads were mapped to the Hongyang_v3 reference genome (http://kiwifruitgenome.org/) by Hisat2 (DOI: 10.1038/s41587-019-0201-4). Read counts and FPKM (fragments per kilobase of exon per million mapped fragments) of annotated genes were calculated by HTSeq (DOI: 10.1093/bioinformatics/btu638) and Stringtie2 (DOI: 10.1038/nbt.3122), respectively. Differentially expressed genes between kiwifruit materials differing in AsA content at the same developmental stage were identified with the R package edgeR (p-value < 0.05), and Gene Ontology (GO) term enrichment was performed using the Gene Ontology Resource (http://geneontology.org/).
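To make the FPKM definition above concrete, the following minimal sketch computes FPKM from raw fragment counts and exon lengths. It is an illustration only and not the StringTie implementation; in particular, it assumes the library size is the sum of the counts supplied, and the gene names and numbers are hypothetical.

```python
# Minimal sketch of the FPKM formula:
#   FPKM = fragments * 1e9 / (exon length in bp * total mapped fragments)
# Assumption: the total number of mapped fragments is taken as the sum
# of the counts passed in. Gene names and values are hypothetical.


def fpkm(counts: dict, exon_lengths_bp: dict) -> dict:
    """Return {gene: FPKM} for raw fragment counts and exon lengths (bp)."""
    total_fragments = sum(counts.values())
    if total_fragments == 0:
        raise ValueError("no mapped fragments")
    return {
        gene: counts[gene] * 1e9 / (exon_lengths_bp[gene] * total_fragments)
        for gene in counts
    }


if __name__ == "__main__":
    example_counts = {"geneA": 500, "geneB": 999_500}
    example_lengths = {"geneA": 2000, "geneB": 1000}
    for gene, value in fpkm(example_counts, example_lengths).items():
        print(gene, round(value, 2))
```

In the toy input, geneA has 500 fragments over 2 kb in a library of one million fragments, so its FPKM is 500 / 2 / 1 = 250.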

RNA extraction and quantitative RT-PCR analysis

[00244] Quantitative reverse transcription (qRT)-PCR was performed following previously described methods (Wang et al., 2018. Plant biotechnology journal 16(8): 1424-1433). Total RNA was isolated with an RNA Extraction Kit (TIANGEN, Beijing, China) and the single-stranded cDNA of all samples was obtained using a one-step gDNA removal and cDNA Synthesis Supermix Kit (TransGen, Beijing, China). Kiwifruit ACTIN (Achn107181) and Protein Phosphatase 2A (Achn381211) were used as the reference genes for expression normalisation. The 2^-ΔΔCt method (Livak & Schmittgen, 2001) was used to calculate the relative expression of each gene. Primers used for qRT-PCR are provided in Table 1 below. All qPCR analyses were performed with three technical replicates.
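The relative-expression calculation of Livak & Schmittgen (2001) can be sketched as follows. This is a hedged illustration: the function name and Ct values are hypothetical, and averaging the Ct values of the two reference genes is an assumption made here, since the text does not state how the two references were combined.

```python
# Minimal sketch of the 2^-ddCt relative expression calculation.
# Assumption: the two reference genes are combined by averaging their
# Ct values. All Ct values below are hypothetical.
from statistics import mean


def relative_expression(target_ct, reference_cts,
                        calibrator_target_ct, calibrator_reference_cts):
    """Return fold change of the target gene versus the calibrator sample."""
    delta_ct_sample = target_ct - mean(reference_cts)
    delta_ct_calibrator = calibrator_target_ct - mean(calibrator_reference_cts)
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Sample: target Ct 22, references 18 and 20 (dCt = 3).
    # Calibrator: target Ct 25, references 18 and 20 (dCt = 6).
    print(relative_expression(22.0, [18.0, 20.0], 25.0, [18.0, 20.0]))
```

A ΔΔCt of -3 corresponds to an eight-fold higher transcript level in the sample than in the calibrator, assuming amplification efficiency close to 100%.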

AsA measurement

[00245] Measurement of total AsA concentration was performed using high-performance liquid chromatography (HPLC) following a previously described method (Queval & Noctor, 2007 Anal Biochem 363(1): 58-69; Li et al., 2016 Plant Molecular Biology 92(4): 473-482).

Phylogenetic analysis and tree construction

[00246] A phylogenetic tree was constructed using MEGA7.0 (Kumar et al., 2016 Mol Biol Evol 33(7): 1870-1874) software with kiwifruit and other species sequences retrieved from kiwifruit (http://kiwifruitgenome.org/) and GenBank databases (Table 2). Genetic distances were calculated using the Jukes-Cantor distance matrix and evolutionary relationships were inferred using the neighbour-joining method with 1000 bootstrap resampling.
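For reference, the Jukes-Cantor distance between two aligned nucleotide sequences is d = -(3/4) ln(1 - (4/3)p), where p is the proportion of differing sites. The sketch below illustrates that formula only; it is not MEGA's implementation, and skipping gapped positions (pairwise deletion) is an assumption made here.

```python
# Minimal sketch of the Jukes-Cantor distance for two aligned sequences.
# Assumption: positions with a gap ('-') in either sequence are skipped.
import math


def jukes_cantor_distance(seq1: str, seq2: str) -> float:
    """Return d = -3/4 * ln(1 - 4p/3) for two equal-length aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        raise ValueError("distance undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # One mismatch in ten compared sites: p = 0.1
    print(round(jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAT"), 4))
```

The correction matters most for divergent sequences: at p = 0.1 the corrected distance is only slightly above p, while it grows without bound as p approaches 0.75.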

Vector construction and kiwifruit transformation

[00247] The coding DNA sequences (CDS) of AceGGP3, AceMYBS1, AcrMYBS1 and AceGBF3 were amplified from cDNA of A. eriantha or A. rufa and cloned into the overexpression vector (POE-3Flag-DN, from the Wuhan Botanical Garden laboratory; 35S promoter-driven expression, with G418 (Geneticin, Invitrogen) or kanamycin selectable markers) to generate 35S::AceGGP3, 35S::AceMYBS1, 35S::AcrMYBS1 and 35S::AceGBF3, respectively.

[00248] AceGGP3 and AceMYBS1 were edited by CRISPR/Cas9 as described previously (Wang et al., 2018. Plant biotechnology journal 16(8): 1424-1433). CRISPR RGEN Tools (http://www.rgenome.net/) was used to select specific sgRNAs that targeted AceGGP3 and AceMYBS1, respectively, and the sgRNAs were cloned into a CRISPR/Cas9 vector to generate the Cas9-AceGGP3 and Cas9-AceMYBS1 editing vectors. Using Agrobacterium-mediated transformation, the recombinant plasmids were transformed into the calli of A. eriantha following previously described methods (Akbaş et al., 2009. Appl Biochem Biotechnol 158(2): 470-475; Yuan, 2011 Proceedings of the 2011 International Conference on Future Computer Science and Application (FCSA 2011). 17-20; Wang et al., 2018. Plant biotechnology journal 16(4): 844-855; Wang et al., 2018. Plant biotechnology journal 16(8): 1424-1433). Primers used for vector construction and identification of transgenic lines are listed in Table 1.

Transient expression assay in kiwifruit and tobacco

[00249] Antisense viral vectors AceGGP3-TRV2, AceMYBS1-TRV2 and AceGBF3-TRV2 were obtained by cloning the CDS of AceGGP3, AceMYBS1, and AceGBF3, respectively, into the TRV2 vector (Yu et al., 2019a). The TRV1 vector was used as an auxiliary plasmid. The TRV1 and TRV2 vectors were transformed separately into Agrobacterium tumefaciens GV3101A and mixed with each other before injection into the calli and fruits of A. eriantha. The same method was used for transient overexpression experiments with 35S-driven AceGGP3, AceMYBS1 and AceGBF3 vectors in A. eriantha and Nicotiana benthamiana (tobacco) leaves.

Yeast one-hybrid (Y1H) assay

[00250] Yeast one-hybrid assay was performed as previously described (Lin et al., 2007. Science 318(5854): 1302-1305; Cheng et al., 2021. The Plant cell 33(4): 1229-1251). The promoter sequences of AceGGP3 (2.6-kb, Table S2) and AcrGGP3 (2.5-kb, Table S3) were amplified and inserted into the corresponding sites of the reporter plasmid pLacZi (Clontech) to generate AceGGP3pro::LacZ and AcrGGP3pro::LacZ. The CDS of AceMYBS1 and AcrMYBS1 were separately cloned into pJG4-5 vectors (Clontech) to construct pJG-AceMYBS1 and pJG-AcrMYBS1. AceGGP3pro::LacZ or AcrGGP3pro::LacZ was co-transformed with pJG-AceMYBS1 or pJG-AcrMYBS1, respectively, into yeast strain EGY48 using a high-efficiency yeast transformation method (Gietz & Schiestl, 2007). The p53::LacZ + pJG-p53 combination was used as the positive control, and AceGGP3pro::LacZ or AcrGGP3pro::LacZ plus the pJG4-5 empty vector served as negative controls. Transformants were grown on SD/-Trp-Ura dropout medium: 6.7 g.L⁻¹ yeast nitrogen base, 20 g.L⁻¹ galactose, 10 g.L⁻¹ raffinose, 2 g.L⁻¹ dropout mix (-Trp-Ura) and 20 g.L⁻¹ agar. After sterilization at 121 °C for 15 min, 100 mL of 10x BU salts (containing 70 g.L⁻¹ Na2HPO4·7H2O and 30 g.L⁻¹ NaH2PO4, pH 7.0) and 0.08 mg.mL⁻¹ X-gal were added for colorimetric screening. All yeast strains were incubated at 30 °C for 3 d. Primers are listed in Table 1.

Yeast two-hybrid (Y2H) assay

[00251] Construction of a kiwifruit cDNA library for yeast two-hybrid experiments was carried out by GeneCreate Biological Engineering Co., Ltd (Wuhan, China) using mRNA from leaf and fruit. The CDS of AceMYBS1 was cloned into the pGBKT7 vector (Clontech) as a bait plasmid. Y2H screening assays were performed as described in the BD Matchmaker Two-Hybrid Library Screening Kit user manual (Clontech). For the Y2H assay, AceGBF3 was cloned from the cDNA library and inserted into the pGADT7 vector (Clontech). The combinations of recombinant plasmids were transferred into yeast strain Y2HGold and grown on SD/-Trp-Leu and SD/-Trp-Leu-His-Ade dropout medium plates supplemented with 0.02 mg.L⁻¹ X-α-gal, at 30 °C for 3 d. The primers are listed in Table 1.

Transcriptional activation analysis

[00252] The full-length cDNAs of AceMYBS1 and AceGBF3 were PCR amplified and fused into the pGBKT7 vector to generate two constructs (BD-AceMYBS1 and BD-AceGBF3), which were transformed into the yeast strain AH109. The transcriptional activation analysis was performed as described previously (Geng & Liu, 2018. Journal of experimental botany 69(10): 2677-2692).

Dual-luciferase (Dual-LUC) assay

[00253] For the Dual-LUC assay, the full-length promoters of AceGGP3 (2.6 kb; SEQ ID No: 345) and AcrGGP3 (2.5 kb; SEQ ID No: 346) or their truncated fragments were cloned into the pGreen0800-Luc vector (Hellens et al., 2005) to obtain the reporters AceGGP3pro-2660::LUC (P2660), AceGGP3pro-1106::LUC (P1106), AceGGP3pro-1606::LUC (P1606), AceGGP3pro-2088::LUC (P2088), and AcrGGP3pro::LUC. The 35S::AceMYBS1, 35S::AcrMYBS1 and 35S::AceGBF3 effectors were transformed into A. tumefaciens strain EHA105. Effector and reporter cultures were mixed 5:1 (v/v) and then co-infiltrated into 4-week-old tobacco leaves as previously described (Gao et al., 2020. Journal of experimental botany 71(12): 3560-3574). After 2-3 days at 23 °C, the promoter activities were determined by measuring Firefly Luciferase to Renilla Luciferase (LUC/REN) ratios using a Dual-luciferase Kit (TransGen, Beijing, China) with a Chemiluminescence Imaging System (Clinx, Shanghai, China).
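The LUC/REN readout described above reduces to a per-replicate ratio that is then averaged. The sketch below shows one way this could be computed; expressing the result relative to an empty-effector control is an assumption made here for illustration, and all readings are hypothetical.

```python
# Minimal sketch of a dual-luciferase calculation: per-replicate
# firefly/Renilla (LUC/REN) ratios, averaged, then expressed relative
# to a control. Normalising to an empty-effector control is an
# assumption; the readings below are hypothetical.
from statistics import mean


def luc_ren_ratio(luc, ren):
    """Return the mean LUC/REN ratio across paired replicate readings."""
    if len(luc) != len(ren) or not luc:
        raise ValueError("need equal, non-empty replicate lists")
    return mean(l / r for l, r in zip(luc, ren))


def relative_promoter_activity(effector, control):
    """effector and control are (luc_readings, ren_readings) tuples."""
    return luc_ren_ratio(*effector) / luc_ren_ratio(*control)


if __name__ == "__main__":
    with_effector = ([300.0, 330.0, 270.0], [100.0, 110.0, 90.0])
    empty_vector = ([100.0, 120.0, 80.0], [100.0, 120.0, 80.0])
    print(relative_promoter_activity(with_effector, empty_vector))
```

Dividing by the Renilla signal corrects each leaf disc for differences in infiltration efficiency before replicates are compared.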

Protein expression and electrophoretic mobility shift assay (EMSA)

[00254] The full-length AceMYBS1 CDS was inserted into the pET32a vector (Novagen) containing 6xHis (fused at both the N-terminus and the C-terminus) and expressed using E. coli strain BL21 (TransGen, Beijing, China) to produce recombinant AceMYBS1-His protein. An E. coli strain expressing 6xHis was used as a negative control. E. coli was cultivated for 12 hours at 16 °C, then diluted 1:100 (v/v) into fresh medium and grown for another 2-3 hours at 37 °C. When cell growth reached the logarithmic phase, IPTG was added to a final concentration of 0.5 mM and induced protein expression proceeded at 16 °C for 10-12 hours. The fusion protein was purified according to the manufacturer's instructions using ProteinIso Ni-NTA Resin (TransGen, Beijing, China). The oligonucleotide probes (Table S1) containing the AceMYBS1 binding sequences of AceGGP3, which were predicted by JASPAR 2020 (http://jaspar.genereg.net/), were synthesized and labelled with biotin. Double-stranded DNA probes were obtained by annealing two complementary oligonucleotides. The fusion protein AceMYBS1-His was mixed with probes and incubated at room temperature for 20 min. The DNA gel mobility shift assay was performed using the EMSA Kit (Beyotime, China) following the manufacturer's protocol.

Subcellular localization

[00255] The AceMYBS1-YFP or AceGBF3-YFP constructs, mixed with NLS-mCherry-RFP, were co-infiltrated into tobacco leaves with infiltration buffer as described previously (Gao et al., 2020. Journal of experimental botany 71(12): 3560-3574). Fluorescence was observed 48 h post-infiltration by confocal microscopy (Leica TCS-SP8; excitation wavelengths: YFP, 510 nm; RFP, 552 nm).

Bimolecular fluorescence complementation assay (BiFC)

[00256] The full-length AceMYBS1 CDS was fused with C-terminal YFP (cYFP), and AceGBF3 was fused with N-terminal YFP (nYFP). The recombinant vectors or control (empty vectors) were transformed into A. tumefaciens strain EHA105 and then co-transformed into onion epidermal cells with infiltration buffer (Zhu et al., 2020. The Plant cell 32(10): 3155-3169). YFP fluorescence was detected by confocal microscopy (Leica TCS-SP8; excitation wavelengths: YFP, 510 nm; DAPI, 488 nm) 48 h after infiltration.

Pull down

[00257] The full-length CDS of AceGBF3 was cloned into the pGEX-4T vector (GE Healthcare), in which it is fused to the glutathione-S-transferase (GST) sequence. The recombinant vectors were introduced into BL21 cells to produce AceGBF3-GST fusion protein, and the expressed AceGBF3-GST and AceMYBS1-6xHis fusion proteins were purified as described previously (Xu, 2020). Aliquots (20 µL) of AceGBF3-GST and AceMYBS1-His proteins were mixed with 1 mL binding buffer (50 mM Tris-HCl [pH 7.5], 100 mM NaCl, 0.25% Triton X-100 [v/v], 35 mM β-mercaptoethanol), then 50 µL ProteinIso GST Resin or 50 µL ProteinIso Ni-NTA Resin (TransGen, Beijing, China) was added and the mixture was rotated at 4 °C for 3-4 hours. The samples were washed 4 times with binding buffer, with expressed GST or 6xHis used as negative controls. 6x SDS protein loading buffer was added to the samples (to 1x final) and the samples were denatured by boiling for 10 min before electrophoresis. After electrophoresis the gels were analysed by Western blot using anti-His (1:10,000 [v/v], Proteintech, 66005-1-Ig) and anti-GST (1:10,000 [v/v], Proteintech, 66002-2-Ig) antibodies.

Bimolecular luminescence complementation (BiLC) assay

[00258] BiLC was performed as described (Chen et al., 2008. Plant physiology 146(2): 368-376). The full-length CDS of AceMYBS1 and AceGBF3 were inserted into the pCAMBIA1300-cLUC and pCAMBIA1300-nLUC vectors to create the AceMYBS1-cLUC and AceGBF3-nLUC constructs, respectively. Agrobacterium cultures harbouring the different constructs were mixed at 1:1 (v/v) and co-transformed into tobacco leaves. Plants were incubated in the dark for 12 h and then transferred to light conditions at 25 °C for 48 h. Immediately prior to luciferase activity observation, the transformed tobacco leaves were soaked in 0.15 mg.mL⁻¹ D-Luciferin potassium (Coolaber, China) for 2-3 min and images were captured using a Chemiluminescence Imaging System (Clinx, Shanghai, China).

Statistical analyses

[00259] One-way analysis of variance (ANOVA) was performed using SPSS v20 (IBM Corp., Armonk, NY, USA), and Student's t-test was performed using GraphPad 8.0 software. Significant differences were detected by t-tests. In the figures, the following notations are used: *, P < 0.05; **, P < 0.01; and ***, P < 0.001. In figures, the different letters above the bars denote significance groupings (P < 0.05) as determined by ANOVA, the data represent the mean values, and error bars represent standard deviations.
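The asterisk notation defined above maps directly onto P-value thresholds. The short sketch below encodes that mapping; the function name is hypothetical, and returning an empty string for non-significant results is an assumption, since the text does not define a symbol for that case.

```python
# Minimal sketch mapping a P value to the figure notation:
# * P < 0.05, ** P < 0.01, *** P < 0.001.
# Assumption: non-significant results get an empty label.


def significance_label(p_value: float) -> str:
    """Return the asterisk label for a P value between 0 and 1."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("P value must lie between 0 and 1")
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


if __name__ == "__main__":
    for p in (0.2, 0.03, 0.004, 0.0002):
        print(p, repr(significance_label(p)))
```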

Accession numbers

[00260] The sequence information for this study was obtained from the Kiwifruit (http://kiwifruitgenome.org/), Nicotiana benthamiana (https://solgenomics.net/organism/Nicotiana_benthamiana/genome) and National Center for Biotechnology Information (https://www.ncbi.nlm.nih.gov/) databases. Additionally, corresponding Acc gene models from A. chinensis (genotype Red5) are also referred to and represent PSI.1.69.0 version gene models (Pilkington et al., 2018, BMC Genomics 19). GenBank accession numbers are listed in Table 2.

Table 1: Primers used for qRT-PCR, vector construction and transformation, and yeast one-hybrid and yeast two-hybrid assays.

Primer Sequence (5'-3') SEQ ID Objective

AcrGGP1-qpcrF1 TGTGCTGATGATAGAAGTGGCA 211 qRT-PCR

AcrGGP1-qpcrR1 GGGCATCATTCGCTACTTCATG 212 qRT-PCR

AcrGGP3-qpcrF1 TGAAGCCATCTCTTGTGCTGAT 213 qRT-PCR

AcrGGP3-qpcrR1 TGCTTCTTCACGAGATTGAGGA 214 qRT-PCR

AceGGP1-qpcrF1 TGAAGCCATCTCTTGTGCTGAT 215 qRT-PCR

AceGGP1-qpcrR1 AAGGGCATCATTCGCTACTTCA 216 qRT-PCR

AceGGP3-qpcrF1 TGAAGCCATCTCTTGTGCTGAT 217 qRT-PCR

AceGGP3-qpcrR1 TGCTTCTTCACGAGATTGAGGA 218 qRT-PCR

AcrGGP1-qpcrF2 ATGATAGAAGTGGCAGCACGGCTGA 219 qRT-PCR

AcrGGP1-qpcrR2 ATCATTCGCTACTTCATGAGATTGT 220 qRT-PCR

AcrGGP3-qpcrF2 TGGAAGCGGCAGCACAGCTGA 221 qRT-PCR

AcrGGP3-qpcrR2 ACAGTGGGAGCCTTGGTTAAG 222 qRT-PCR

AceGGP1-qpcrF2 ATGATAGAAGCGGCAGCACGGCTGG 223 qRT-PCR

AceGGP1-qpcrR2 ATCATTCGCTACTTCACGAGATTGA 224 qRT-PCR

AceGGP3-qpcrF2 GGGAAGCGGCAGCACAGCTGA 225 qRT-PCR

AceGGP3-qpcrR2 ACTGTGGGAGCCTTTGTTAAT 226 qRT-PCR

AcrGGP2-qpcrF CCGACGGTTGTTTCCAATTACC 227 qRT-PCR

AcrGGP2-qpcrR CTGAGGCAATTACGTCCACAAC 228 qRT-PCR

AceGGP2-qpcrF CCGACGGTTGTTTCCAATTACC 229 qRT-PCR

AceGGP2-qpcrR CTGAGGCAATTACGTCCACAAC 230 qRT-PCR

AceMYBS1-qpcrF AATTCACTGGGCAGAGGAC 231 qRT-PCR

AceMYBS1-qpcrR CATCCTCAACTAGTAATTGG 232 qRT-PCR

AcrMYBS1-qpcrF ATGCCATTGCAATTCACTG 233 qRT-PCR

AcrMYBS1-qpcrR TGCAGCTACATCCTCAACTA 234 qRT-PCR

AceGBF3-qpcrF GGGACCACCACAGCCTATGATGCCA 235 qRT-PCR

AceGBF3-qpcrR TGACATTGCTAGCCCATCGAAACCT 236 qRT-PCR

NbGGP3-qpcrF TTTCGCATCGATCAGGTTCTTCA 237 qRT-PCR

NbGGP3-qpcrR CTCAAACCTGAAAAGCACTTC 238 qRT-PCR

Nbβ-Tubulin-F CAAGATGCTACTGCAGACGAG 239 qRT-PCR

Nbβ-Tubulin-R CTGGAAGTTGTGGTTTTGGC 240 qRT-PCR

PMM-qpcrF CCATGTGGAGTGTGATGCCAGTGTC 241 qRT-PCR

PMM-qpcrR CAAAGCAGCAAGTTGGATGCCGTTC 242 qRT-PCR

GMP-qpcrF GCTCTGGCTAGGGACAAACTGAT 243 qRT-PCR

GMP-qpcrR TGGAATTCGATCATCTCTTTG 244 qRT-PCR

GME-qpcrF ATGGTCAGCATGAATGAGATGGCCG 245 qRT-PCR

GME-qpcrR GCTTCTCCTTAATCAGGGTGTTGTC 246 qRT-PCR

GGP-qpcrF GAGAGCTTCTTGCTTGC 247 qRT-PCR

GGP-qpcrR GCACCCAAGCTGTTGTAACC 248 qRT-PCR

GPP-qpcrF ATCAGAGAAGGGCGAAGGAGACAAT 249 qRT-PCR

GPP-qpcrR ACGAACCAGGTCAACGGGCGTCTTA 250 qRT-PCR

GLDH-qpcrF CGGCATCGTTACCTACTACTCCTTC 251 qRT-PCR

GLDH-qpcrR ATCTTCGGGCAATGGGGCGTAGCGG 252 qRT-PCR

GDH-qpcrF TCAACTTCTTCGACACCTCTCCGTA 253 qRT-PCR

POE-AceGGP3-F GGGACTCTAGAGGATCCATGTTGAAGATCAAGAGGGTTC 259 Overexpression

POE-AceGGP3-R GTGGTACCCGGGGATCCTCAGTGCTGAACTAGGCATTC 260 Overexpression

POE-AceMYBS1-F GGGACTCTAGAGGATCCATGTCAGCTTCAGTGGATT 261 Overexpression

POE-AceMYBS1-R GTGGTACCCGGGGATCCTTATTGGTGCATTGTTGGGGGTGCC 262 Overexpression

POE-AcrMYBS1-F GGACTCTAGAGGATCCATGTCAGCTTCAGTGGATT 263 Overexpression

POE-AcrMYBS1-R TGGTACCCGGGGATCCTTGGTGCATTGTTGGGGGTGCC 264 Overexpression

POE-AceGBF3-F GGGACTCTAGAGGATCCATGGGAAGTAGTGAAGACGTGA 265 Overexpression

POE-AceGBF3-R GTGGTACCCGGGGATCCTCAGTTGACAGAGCCTG 266 Overexpression

POE-SnRK1A-F GGACTCTAGAGGATCCATGGATGGATCTGGTGGTCAAGGA 267 Overexpression

POE-SnRK1A-R TGGTACCCGGGGATCCAAGAACCCGGAGCTGAGCAAGAAA 268 Overexpression

TRV2-AceGGP3/1-F TAAGGTTACCGAATTCTCACTAAAGTTGGACAGGAAG 269 Inhibit expression

TRV2-AceGGP3/1-R GCTCGGTACCGGATCCAGCAGCTCAGAGATTTTCA 270 Inhibit expression

TRV2-AceMYBS1-R GCTCGGTACCGGATCCAGGAGCATTTCAAGAAACTT 272 Inhibit expression

TRV2-AceGBF3-F TAAGGTTACCGAATTCCAGTTAGGTTGAGCTAAGTC 273 Inhibit expression

TRV2-AceGBF3-R GCTCGGTACCGGATCCTGCTGTCAAGGACAATGTTG 274 Inhibit expression

Crispr-AceGGP3-T1F GGTCTCTTGCACTCCAGAAGTGTTGCATTCAGTTTCAGAGCTATGC 275 Gene editing

Crispr-AceGGP3-T2R GGTCTCTAAACCATTCACCCTCTTGAAAGCATGCACCAGCCGGGAA 276 Gene editing

Crispr-AceMYBS1-T1F GGTCTCTTGCACGTCAATAGTGGCGATGCTTGTTTCAGAGCTATGCTGGA 277 Gene editing

Crispr-AceMYBS1-T2R GGTCTCTAAACCAACATCCTCAACTAGTAATTGCACCAGCCGGGAATCGA 278 Gene editing

Idcrispr-AceGGP3-F AACCCAACTTAATTTCGGGCCTTTC 279 Identification of gene edit

Idcrispr-AceGGP3-R ACAAACTATAAAACTGTAGAGAACG 280 Identification of gene edit

Idcrispr-AceMYBS1-F GGTCTGGACAATTTGGGAA 281 Identification of gene edit

Idcrispr-AceMYBS1-R CTGGCGTCCCAACAGCTGAT 282 Identification of gene edit

YFP-AceMYBS1-F TTTACAATTACGGATCCATGTCAGCTTCAGTGGATT 283 Subcellular localization

YFP-AceMYBS1-R CCCTTGCCCATGGATCCTTGGTGCATTGTTGGGGGTG 284 Subcellular localization

YFP-AceGBF3-F TTTACAATTACGGATCCATGGGAAGTAGTGAAGACGT 285 Subcellular localization

YFP-AceGBF3-R CCCTTGCCCATGGATCCGTTGACAGAGCCTGGCT 286 Subcellular localization

pLaczi-AceGGP3-F GGAATTCCGAAAAGCTCAATTAAAATGAATATA 287 Yeast one hybrid

pLaczi-AceGGP3-R GGGGTACCCCCTCGAACTCAGAAAACGCAAAAACA 288 Yeast one hybrid

pLaczi-AcrGGP3-F GGAATTCCATACGACTCACTATAGGGCGAATTG 289 Yeast one hybrid

pLaczi-AcrGGP3-R GGGGTACCCCCTCGAACTCAGAAAACGCAAAAACA 290 Yeast one hybrid

PJG-AceMYBS1-F ATGCCTCTCCCGAATTCATGTCAGCTTCAGTGGATT 291 Yeast one hybrid

PJG-AceMYBS1-R GTCCAAAGCTTCTCGAGTTATTGGTGCATTGTTGGGGGTGCC 292 Yeast one hybrid

PJG-AcrMYBS1-F ATGCCTCTCCCGAATTCATGTCAGCTTCAGTGGATT 293 Yeast one hybrid

PJG-AcrMYBS1-R GTCCAAAGCTTCTCGAGTTATTGGTGCATTGTTGGGGGTGCC 294 Yeast one hybrid

pGreen-AceGGP3-lianF GGTACCTTGAAAAGCTCAATTAAAATGAATATA 295 Dual-LUC assay

pGreen-AceGGP3-lianR AAGCTTGTCGACACCGAGCTCGAATTCAAGCT 296 Dual-LUC assay

pGreen-AceGGP3-2088F GGTACCTATCCCAAAATATTTATTCACTTAG 297 Dual-LUC assay

pGreen-AceGGP3-1606F GGTACCCGGATCAGCATTTGTTTTCTTCTTA 298 Dual-LUC assay

pGreen-AceGGP3-606F TATAGGGCGAATTGGTAGAAAGTGAGGTGTTGCGTCAAGA 300 Dual-LUC assay

pGreen-AceGGP3-inR AGCTTGATATCGAATTGTCGACACCGAGCTCGAATTCAAGCT 301 Dual-LUC assay

pGreen-AcrGGP3-lianF GGTACCCTAAACCTAACACAAATGGGAAGCA 302 Dual-LUC assay

pGreen-AcrGGP3-lianR AAGCTTCTCGAACTCAGAAAACGCAAAAACA 303 Dual-LUC assay

pGreen-AcrGGP3-1919F GGTACCCTAAACCTAACACAAATGGGAAGCA 304 Dual-LUC assay

pGreen-AcrGGP3-1439F GGTACCTCCAATCATCTCACGCCATCCAAGC 305 Dual-LUC assay

pGreen-AcrGGP3-1009F TATAGGGCGAATTGGATCATAGGGTGGTTGGGTTGTTTGG 306 Dual-LUC assay

pGreen-AcrGGP3-515F TATAGGGCGAATTGGGCCTCTCCTCACCTCACCCCCAAAG 307 Dual-LUC assay

pGreen-AcrGGP3-inR AGCTTGATATCGAATCTCGAACTCAGAAAACGCAAAAACA 308 Dual-LUC assay

pGreen-AceMYBS1-F CGGGGTACCGTGATTGGACCAAGAAACTGTTGGCTCTTCTTATTGGAG 309 Dual-LUC assay

pGreen-AceMYBS1-R CCCAAGCTTATAGCTGGGGATTGGCACGTGTCCGGATTCGATTGCAGC 310 Dual-LUC assay

AD-AceMYBS1-F GGAGGCCAGTGAATTCATGTCAGCTTCAGTGGATTG 311 Yeast two hybrid

AD-AceMYBS1-R CGAGCTCGATGGATCCTTGGTGCATTGTTGGGGGTG 312 Yeast two hybrid

BD-AceMYBS1-F CATGGAGGCCGAATTCATGTCAGCTTCAGTGGATTG 313 Yeast two hybrid

BD-AceMYBS1-R GCAGGTCGACGGATCCTTGGTGCATTGTTGGGGGTG 314 Yeast two hybrid

AD-AceGBF3-F GGAGGCCAGTGAATTCATGGGAAGTAGTGAAGACGT 315 Yeast two hybrid

AD-AceGBF3-R CGAGCTCGATGGATCCGTTGACAGAGCCTGGCT 316 Yeast two hybrid

BD-AceGBF3-F CATGGAGGCCGAATTCATGGGAAGTAGTGAAGACGT 317 Yeast two hybrid

BD-AceGBF3-R GCAGGTCGACGGATCCGTTGACAGAGCCTGGCT 318 Yeast two hybrid

AD-AceGBF3-NF GGAGGCCAGTGAATTCATGGGAAGTAGTGAAGACGT 319 Yeast two hybrid

AD-AceGBF3-NR CGAGCTCGATGGATCCCATAGCAACTCCATCCACA 320 Yeast two hybrid

YCE-AceMYBS1-F CGCCACTAGTGGATCCATGTCAGCTTCAGTGGATTGG 321 BiFC assay

YCE-AceMYBS1-R TCCCGGGAGCGGTACCTTGGTGCATTGTTGGGGGTG 322 BiFC assay

YNE-AceGBF3-F CGCCACTAGTGGATCCATGGGAAGTAGTGAAGACGTG 323 BiFC assay

YNE-AceGBF3-R TCCCGGGAGCGGTACCTCAGTTGACAGAGCCTGGCT 324 BiFC assay

1300CLUC-AceMYBS1-F GACGAGCTCGGTACCATGGGAAGTAGTGAAGACGTG 325 BiLC assay

1300CLUC-AceMYBS1-R ACGAGATCTGGTCGACGTTGACAGAGCCTGGCTTGT 326 BiLC assay

GST-AceGBF3-F GGTTCCGCGTGGATCCATGGGAAGTAGTGAAGACG 331 Pull-down

GST-AceGBF3-R CAGTCACGATGAATTCGTTGACAGAGCCTGGCTTG 332 Pull-down

Bio-AceMYBS1A-F Bio-TAATTAAATAGATAAGAAAAGAGAAAAAGG 333 Probe of EMSA assay

Bio-AceMYBS1A-R Bio-CCTTTTTCTCTTTTCTTATCTATTTAATTA 334 Probe of EMSA assay

Bio-AceMYBS1A-mutF Bio-TAATTAAATAAAAAAAAAAAGAGAAAAAGG 335 Mutated probe of EMSA assay

Bio-AceMYBS1A-mutR Bio-CCTTTTTCTCTTTTTTTTTTTATTTAATTA 336 Mutated probe of EMSA assay

Bio-AceMYBS1B-F Bio-AAATAAGGGCAATCTTATCATTTATGTCAA 337 Probe of EMSA assay

Bio-AceMYBS1B-R Bio-TTGACATAAATGATAAGATTGCCCTTATTT 338 Probe of EMSA assay

Bio-AceMYBS1B-mutF Bio-AAATAAGGGCAAAAAAAAAATTTATGTCAA 339 Mutated probe of EMSA assay

Bio-AceMYBS1B-mutR Bio-TTGACATAAATTTTTTTTTTGCCCTTATTT 340 Mutated probe of EMSA assay

AceMYBS1A-F TAATTAAATAGATAAGAAAAGAGAAAAAGG 341 Competing probe of EMSA assay

AceMYBS1A-R CCTTTTTCTCTTTTCTTATCTATTTAATTA 342 Competing probe of EMSA assay

AceMYBS1B-F AAATAAGGGCAATCTTATCATTTATGTCAA 343 Competing probe of EMSA assay

AceMYBS1B-R TTGACATAAATGATAAGATTGCCCTTATTT 344 Competing probe of EMSA assay

Table 2: GenBank accession numbers.

Gene name Gene ID Species CaMYBSl KAG7023245.1 Cucurbita argyrosperma subsp. argyrosperma

MtMYBSl XP_ 013458057.1 Medicago truncatula BhMYBSl XP_ 038904415.1 Benincasa hispida JcMYBSl XP_012092968.1 Jatropha curcas ZmMYBSl-1 XP_035817471.1 Zea mays ZmMYBSl-2 XP_ 008655472.1 Zea mays PduMYBSl XP 034216346.1 Prunus dulcis SpMYBSl XP_015067500.1 Solanum pennellii SIMYBS1 XP_ 004235740.1 Solanum lycopersicum ZjMYBSl XP_015883030.1 Ziziphus jujuba CisiMYBSl XP_ 006485754.1 Citrus sinensis PtMYBSl XP_006376751.1 Populus trichocarpa CcMYBSl XP_ 006440920.1 Citrus Clementina McMYBSl XP 022134696.1 Momordica charantia PavMYBSl XP_ 021810789.1 Prunus avium HuMYBSl XP 021292289.1 Herrania umbratica CnMYBSl KAG 1346991.1 Cocos nucifera GhMYBSl XP_016668580.2 Gossypium hirsutum RchMYBSl XP_ 024170989.1 Rosa chinensis

AtsMYBSl-1 XP_ 020182102.2 Aegilops tauschii subsp.

Strangulata

AtsMYBSl-2 XP_020153631.2 Aegilops tauschii subsp.

Strangulata

AtsMYBSl-3 XP 020147245.1 Aegilops tauschii subsp.

Strangulata

EgrMYBSl XP_ 018721495.1 Eucalyptus grandis PdaMYBSl XP_ 008777770.2 Phoenix dactylifera TwMYBSl XP_038680313.1 Tripterygium wilfordii BrMYBSl-1 XP_ 009107320.1 Brassica rapa BrMYBSl-2 XP_ 009147920.1 Brassica rapa StMYBSl KAF7806327.1 Senna tora HaMYBSl XP_ 022017137.1 Helianthus annuus TtMYBSl KAF5200833.1 Thalictrum thalictroides PaMYBSl XP_ 034914820.1 Populus alba VriMYBSl XP_ 034672935.1 Vitis riparia CsaMYBSl-1 XP_030492585.1 Cannabis sativa CsaMYBSl-2 XP_ 004138233.1 Cucumis sativus PgMYBSl XP_ 031371370.1 Punica grana turn PvMYBSl XP_031288186.1 Pistacia vera ItMYBSl XP_031105045.1 Ipomoea triloba QIMYBS1 XP_030937185.1 Quercus lobata RaMYBSl XP_030521694.1 Rhodamnia argentea SoMYBSl XP_030439880.1 Syzygium oleosum AhMYBSl-1 XP 025644418.1 Arachis hypogaea AhMYBSl-2 XP_ 025616452.1 Arachis hypogaea EguMYBSl XP_ 010923684.1 Elaeis guineensis PalMYBSl-1 XP_028753871.1 Prosopis alba PalMYBSl-2 XP_ 028753870.1 Prosopis alba DcMYBSl XP_ 020701896.1 Dendrobium catenatum CasiMYBSl XP_ 028110714.1 Camellia sinensis GsMYBSl RZC07495.1 Glycine soja ApMYBSl XP_ 027331692.1 Abrus precatorius CaMYBSl XP_004508571.1 Cicer arietinum OsMYBSl-1 XP_025882005.1 Oryza sativa Japonica Group OsMYBSl-2 XP_O 15627636.1 Oryza sativa Japonica Group RCOMYBS1 XP_ 002511974.1 Ricinus communis BdMYBSl-1 XP_ 010236797.1 Brachypodium distachyon BdMYBSl-2 XP_ 003565794.1 Brachypodium distachyon MnMYBSl XP_010105364.1 Morus notabilis EsMYBSl XP_006393319.1 Eutrema salsugineum QsMYBSl XP_023913108.1 Quercus suber LsMYBSl XP_023738581.1 Lactuca sativa CrMYBSl-1 XP_006305410.1 Capsella rubella CrMYBSl-2 XP_006298960.1 Capsella rubella SiMYBSl XP_004968851.1 Setaria italica VraMYBSl XP_ 014507110.1 Vigna radiata var. 
radiata BnMYBSl XP_ 013715778.1 Brassica napus HbMYBSl XP_ 021648857.1 Hevea brasiliensis MeMYBSl XP_ 021611780.1 Manihot esculenta SbMYBSl-1 XP 002447620.2 Sorghum bicolor SbMYBSl-2 XP_ 002455466.1 Sorghum bicolor AdMYBSl XP 015944219.1 Arachis duranensis AiMYBSl-1 XP_016189009.1 Arachis ipaensis AiMYBSl-2 XP_ 002891472.1 Arabidopsis lyrata subsp. Lyrata SinMYBSl XP_011073055.1 Sesamum indicum FcMYBSl BAG74460.1 Fag us crenata DeMYBSl-1 KAF8730393.1 Digitaria exit is DeMYBSl-2 KAF8670787.1 Digitaria exit is AsuGBF3.1 KAG7644487.1 Arabidopsis suecica AsuGBF3.2 KAG7644488.1 Arabidopsis suecica AthGBF3.1 NP_ 001323456.1 Arabidopsis thaliana AthGBF3.2 NP_ 001323457.1 Arabidopsis thaliana ECOGBF3 ACC77654.1 Eleusine coracana VviGBF3 RVX04237.1 Vitis vinifera AlyGBF3 XP_002880229.1 Arabidopsis lyrata subsp. lyrata CroGBF3 AAK14790.1 Catharanthus roseus AthGBF3.3 CAA45358.1 Arabidopsis thaliana AthGBF3.4 NP 174494.2 Arabidopsis thaliana AthGBF3.5 AAA90947.1 Arabidopsis thaliana AthGBF3.6 NP_ 850248.2 Arabidopsis thaliana AthGBF3.7 NP_ 171893.1 Arabidopsis thaliana BraGBF3 KAG5395572.1 Brassica rapa subsp. trilocularis DexGBF3.1 KAF8695903.1 Digitaria exit is DexGBF3.2 KAF8675825.1 Digitaria exit is

DexGBF3.3 KAF8641874.1 Digitaria exit is

CsaGBF3.1 KAF4389315.1 Cannabis sativa

CsaGBF3.2 KAF4354785.1 Cannabis sativa

AthGBF3.8 NP_171893.1 Arabidopsis thaliana

TurGBF3 EMS64924.1 Triticum urartu

GarGBF3 XP_017635172.1 Gossypium arboreum

MnoGBF3 EXC02957.1 Morus notabilis

AceGGP3 DTZ79_29g10040/Actinidia32270 Actinidia eriantha/A. chinensis

AeGGP1 DTZ79_17g07300/Actinidia05074 Actinidia eriantha/A. chinensis

AeGGP2 DTZ79_27g04660/Actinidia36370 Actinidia eriantha/A. chinensis

AceMYBS1 DTZ79_16g09490/Actinidia31027 Actinidia eriantha/A. chinensis

AceGBF3 DTZ79_15g00300/Actinidia27344 Actinidia eriantha/A. chinensis

PMM DTZ79_08g09640 Actinidia eriantha

GMP DTZ79_03g03560 Actinidia eriantha

GME DTZ79_24g13050 Actinidia eriantha

2. GGP3 regulates AsA biosynthesis in kiwifruit

[00261] To determine the AsA variation among members of the Actinidia genus, the AsA content in the fruits of 48 species was determined by HPLC. The concentration varied tremendously among the different species, ranging from 4.4 to 1185 mg 100 g⁻¹ FW. Species with low (0-30 mg 100 g⁻¹ FW), moderate (30-200 mg 100 g⁻¹ FW) and high (200-1200 mg 100 g⁻¹ FW) vitamin C contents constituted 29.1%, 56.3% and 14.6% of the species, respectively. Changes in AsA content during the growing season were determined in six Actinidia species representing low, moderate and high AsA concentrations (Figure 1). Species with high or moderate fruit AsA content accumulated AsA rapidly after fertilization, peaking at the 60 days after flowering (DAF60) sampling point, before AsA decreased as the fruit progressed towards maturity, whereas AsA in the low-AsA species stayed low throughout the growing season. AsA biosynthesis during early fruit development is thus the main reason for AsA accumulation among members of the Actinidia genus, and differences in AsA synthesis at this stage might constitute the main determinant of AsA variation among Actinidia species.
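The three content classes described above can be expressed as a simple threshold rule. The following is a minimal illustrative sketch only; the function name is hypothetical, and because the stated ranges share their boundary values (30 and 200), the sketch assumes that a boundary value belongs to the higher class.

```python
def classify_asa(asa_mg_per_100g_fw: float) -> str:
    """Assign a fruit AsA content (mg per 100 g fresh weight) to a class."""
    if asa_mg_per_100g_fw < 0:
        raise ValueError("AsA content cannot be negative")
    # Assumption: the shared boundaries (30 and 200) fall in the higher class.
    if asa_mg_per_100g_fw < 30:
        return "low"
    if asa_mg_per_100g_fw < 200:
        return "moderate"
    return "high"


# The two extremes reported above, plus one hypothetical mid-range value.
for value in (4.4, 120.0, 1185.0):
    print(value, classify_asa(value))
```

Applied to the measured content of each species, a rule of this kind would yield the class proportions reported in the paragraph.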

[00262] To identify potential AsA regulatory genes, transcriptome sequencing of the fruits at DAF20, DAF40, DAF60 and DAF120 was performed for A. eriantha and A. rufa, whose fruits presented a more than 30-fold difference in AsA content. A total of 24415 differentially expressed genes (DEGs) were identified by pairwise comparisons, and Gene Ontology (GO) enrichment analysis revealed that these DEGs were significantly enriched in biological processes related to the response to catalytic activity and biosynthetic processes (data not shown). The transcript Actinidia32270, encoded by GGP, was identified in A. eriantha on the basis of the increased transcript abundance in the L-galactose pathway. Three GGP homologous genes were further subjected to RT-qPCR, and a correlation analysis with the AsA contents in the fruits of A. eriantha, A. rufa and their hybrids at different developmental stages was conducted (Figure 2, A and B). GGP3 in A. eriantha (AceGGP3) was the most highly expressed and correlated with high AsA content in the fruits.

[00263] GGP3 expression was also highly correlated with AsA concentration in the A. eriantha × A. rufa hybrid (data not shown), and the expression of the AceGGP3 allele derived from A. eriantha was significantly higher than that of the A. rufa-derived allele AcrGGP3 (data not shown).

3. AceGGP3 overexpression leads to a sharp accumulation of AsA in kiwifruit

[00264] Transient expression assays were performed to confirm the function of AceGGP3 in AsA accumulation using 35S-driven AceGGP3 overexpression and silencing constructs in on-vine kiwifruit and calli in tissue culture. At 7 days after infiltration, AceGGP3 expression and AsA content were significantly higher in infiltrated fruit than in the control fruit (Figure 3A and B). Moreover, silencing of AceGGP3 in fruiting plants led to an approximately 24.0% decrease in AsA compared with that in fruits infiltrated with bacteria harbouring empty vectors (Figure 3B). Similar results were obtained in transient assays of A. eriantha calli, in which the overexpression of AceGGP3 significantly increased the AsA content (Figure 3C and D).

[00265] To further analyse AceGGP3 function, transgenic kiwifruit lines were generated by Agrobacterium transformation of calli. Five independent transgenic lines of A. eriantha overexpressing AceGGP3 were obtained. Compared to wild-type (WT) plants, the transgenic plants exhibited AceGGP3 expression levels that were 3.1- to 8.1-fold higher (Figure 4A), and AsA contents increased by 6.3-, 20.0-, 22.7-, 16.7- and 14.1-fold, respectively (Figure 4B).

[00266] AceGGP3 was then mutated in kiwifruit via the CRISPR/Cas9 system, with the two targeted sites located in the first and second exons of AceGGP3. In total, four G418-resistant lines containing three types of homozygous mutant AceGGP3 genes (ggp3#2, a 9 bp deletion; ggp3#11 and #15, a 1 bp insertion; ggp3#13, a 2 bp insertion) were selected. The three mutations resulted in frameshift mutations or amino acid deletions that induced premature termination or truncation of the predicted protein, causing a loss of AceGGP3 gene function. The AsA content in the fruits of the ggp3 mutants decreased by 32.2%, 5.11% (not statistically significant), 18.2% and 15.9% compared with that of the WT (Figure 5). The 1 bp insertion in ggp3#11 was predicted to cause a nonsense frameshift, but AsA levels remained effectively unchanged. However, a scan of ORFs in this mutated CDS showed that translation initiation from the 4th ATG codon downstream would produce a truncated version of GGP missing 53 amino acids from the N-terminus, and the unchanged AsA content in ggp3#11 suggests that this variant has functional GGP activity. Collectively, the results show that up- or downregulated expression of AceGGP3 significantly affects AsA accumulation in A. eriantha.
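A scan of this kind, listing the open reading frame that would be translated from each ATG codon in a coding sequence, can be sketched as follows. This is an illustrative sketch only and not the analysis actually used: the function names are hypothetical, the standard genetic code is assumed, and the toy sequence is invented for demonstration (it is not the AceGGP3 CDS).

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from(cds: str, start: int) -> str:
    """Translate from `start` until the first stop codon or the end of the sequence."""
    protein = []
    for i in range(start, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def orfs_from_atgs(cds: str) -> list[tuple[int, str]]:
    """Return (0-based position, translated peptide) for every ATG in the sequence."""
    cds = cds.upper()
    return [
        (i, translate_from(cds, i))
        for i in range(len(cds) - 2)
        if cds[i:i + 3] == "ATG"
    ]


# Hypothetical toy sequence: the first ATG hits an early stop codon,
# while a downstream ATG in another frame gives a longer peptide.
toy_cds = "ATGAAATGACCATGGGCTAA"
for position, peptide in orfs_from_atgs(toy_cds):
    print(position, peptide)
```

In the toy sequence the first ATG yields only a two-residue peptide before a stop codon, whereas the second ATG restores a longer reading frame, which mirrors the situation proposed for ggp3#11.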

4. AceMYBS1 acts as a transcriptional activator of AceGGP3

[00267] Through transcriptome analysis, a transcript, Actinidia31027 (Actinidia chinensis (Hong yang) protein v3; Actinidia39811: Actinidia chinensis (Hong yang) protein v3; Acc18653.1:LG16-18681898..18684251 [e=0]), whose expression is strongly correlated with Actinidia32270 (GGP3; SEQ ID No: 347) expression, was identified (SEQ ID No: 204), and RT-qPCR analysis also showed that its expression was highly correlated with both AceGGP expression and AsA content in A. eriantha (Figure 6A and B). A phylogenetic tree comprising the amino acid sequences of Actinidia31027 and MYBs of other plant species showed that Actinidia31027 was most closely related to TwMYBS1, with which it shared 86% sequence identity, whereas it was most distant from CcMYB5 (not shown). Therefore, this protein has been designated as AceMYBS1. Multiple alignments performed on the SMART website (http://smart.embl-heidelberg.de/) (Schultz et al., 1998, Proceedings of the National Academy of Sciences of the United States of America 95(11): 5857-5864; Letunic et al., 2020, Nucleic Acids Research 49(D1): D458-D460) showed that Actinidia31027 contains two SANT/MYB regions. A SANT domain is a protein domain that allows many chromatin remodelling proteins to interact with histones (Boyer et al., 2002, Molecular Cell 10(4): 935-942; Boyer et al., 2004, Nature Reviews Molecular Cell Biology 5(2): 158-163). Because SANT domains share many similarities with MYB DNA binding domains, the two are often conflated. SANT and MYB domains can be distinguished by the predicted isoelectric point (pI) of the domain peptide, with histone interacting SANT domains having acidic pIs and MYB DNA binding domains having basic pIs (Ko et al., 2008, Molecular Cancer 7(1): 77). The N-terminal SANT region of Actinidia31027 has a predicted acidic pI, and the second SANT/MYB region has a predicted basic pI (data not shown). AceMYBS1 is therefore predicted to have one histone interacting domain towards its N-terminus and a MYB DNA binding domain in the middle of the protein.
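The pI-based distinction between SANT and MYB domains can be illustrated with a simple theoretical pI estimate. The sketch below is an assumption-laden illustration, not the prediction tool used here: the pKa values are commonly used approximations (results differ between pKa sets), the neutral cut-off of pH 7 is an assumed dividing line, and the two peptides are invented examples rather than AceMYBS1 domain sequences.

```python
# Approximate side-chain and terminal pKa values (assumed; other sets exist).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERMINUS = 8.6
PKA_C_TERMINUS = 3.6


def net_charge(peptide: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))
    for residue in peptide.upper():
        if residue in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return positive - negative


def isoelectric_point(peptide: str) -> float:
    """Find the pH of zero net charge by bisection (charge falls as pH rises)."""
    low, high = 0.0, 14.0
    for _ in range(60):
        mid = (low + high) / 2.0
        if net_charge(peptide, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def domain_type(peptide: str, cutoff: float = 7.0) -> str:
    """Assumed rule: acidic pI suggests SANT-like, basic pI suggests MYB-like."""
    return "SANT-like (acidic)" if isoelectric_point(peptide) < cutoff else "MYB-like (basic)"


# Hypothetical peptides for demonstration only.
for peptide in ("ADEELDEFLG", "AKRWTKEEHRK"):
    print(peptide, round(isoelectric_point(peptide), 2), domain_type(peptide))
```

A calculation of this type, applied to each predicted SANT/MYB region, gives a quick first indication of whether the region is more likely to contact histones or DNA.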

[00268] To reveal the mechanism by which AceGGP3 is regulated by AceMYBS1, a Y1H assay was conducted. All the yeast cells grew well on SD/-Trp/-Ura media, whereas only the positive control and the bait vector AceGGP3pro::LacZ co-transformed with the prey vector pJG-AceMYBS1 or pJG-AcrMYBS1 had blue cells on media supplemented with X-gal (data not shown). Bioinformatics predictions were performed with JASPAR 2020 (Fornes et al., 2019, Nucleic Acids Research 48(D1): D87-D92). Among many predicted MYBS1 cis-elements, two main cis-elements were identified within the AceGGP3 promoter (-2455 bp, TCTTATC; -1354 bp, TCTTATC). Four 5'-truncated AceGGP3 promoters (designated P1106, P1606, P2088, and P2660) were amplified from A. eriantha, and the transcriptional activity was assessed with a dual-LUC reporter system (Figure 7). The activities of P1606 and P2660 were higher than those of P1106 and P2088, indicating that cis-elements are likely located between positions -1106 and -1606 bp and between positions -2088 and -2660 bp, which are the intervals containing the predicted positions. However, regardless of which 5'-truncated AceGGP3 promoter was used to drive the reporter gene, AceMYBS1 co-transfected tobacco presented a much higher relative luciferase (LUC) activity than did AceGGP3 co-transfected tobacco, suggesting that AceMYBS1 also binds to other regions of the AceGGP3 promoter (Figure 7A). To verify the binding specificity of AceMYBS1 to these motifs, we carried out EMSAs and found that AceMYBS1-HIS fusion proteins could bind DNA probes containing the motifs, whereas non-labelled competing probes effectively reduced the binding ability of AceMYBS1 in a dose-dependent manner, and mutation of the core sequence abolished the binding (Figure 7B). The experiments above were repeated in A. rufa (which has low AsA), and it was found that AcrMYBS1 could not bind to the AcrGGP3 promoter to regulate its expression (data not shown). The A. rufa GGP3 promoter (SEQ ID No: 346) has a number of polymorphisms (deletions/insertions) compared to that of A. eriantha (SEQ ID No: 345), including in putative MYBS1 binding sites. Without wishing to be bound by theory, the investigators hypothesise that, as the AcrMYBS1 protein is almost identical to AceMYBS1 (97.7% identity at the amino acid level), the lack of binding is likely due to a defective AcrGGP3 promoter rather than a non-functional MYBS1 protein. It may therefore be possible to increase AsA content in A. rufa by replacing or modifying the AcrGGP3 promoter, either alone or in combination with overexpression of MYBS1.
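Locating a short cis-element such as TCTTATC in a promoter can be sketched with a plain string search on both strands. This is a minimal illustration and not the JASPAR matrix-based prediction described above: the function names are hypothetical, the coordinate convention (last promoter base is position -1, hits reported at the 5' end on the forward strand) is an assumption, and the toy promoter is invented.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(promoter: str, motif: str) -> list[tuple[int, str]]:
    """Return (position relative to the promoter's 3' end, strand) for each exact hit.

    Assumed convention: the last base of the promoter is position -1.
    """
    promoter, motif = promoter.upper(), motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start - len(promoter), strand))
            start = promoter.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical 22 bp toy promoter with one forward and one reverse-strand site.
toy_promoter = "GGTCTTATCAAAAGATAAGACC"
print(find_sites(toy_promoter, "TCTTATC"))
```

Run against the full-length A. eriantha and A. rufa promoter sequences, a search of this kind would show directly which putative MYBS1 sites are affected by the polymorphisms.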

[00269] To explore the AceMYBS1 expression profiles of A. eriantha, RT-qPCR was performed to assess transcript accumulation in the fruits. Consistent with AsA content, AceMYBS1 was highly expressed in the early fruit development stage and was positively correlated with the expression of AceGGP3 (Figure 6A and B). A transient expression assay in A. eriantha fruits confirmed that the overexpression of AceMYBS1 increased the expression of AceGGP3 by 1.4-fold (Figures 8A and B), leading to a 1.7-fold increase in the AsA content in the fruits (Figure 8C). In contrast, the expression level of AceGGP3 in the TRV-AceMYBS1 fruits decreased by 30%. As expected, overexpression of AceMYBS1 in A. eriantha calli equally led to increased expression of AceGGP3 (1.8-fold) and content of AsA (2.36-fold), and compared with empty vector-infected calli, TRV-AceMYBS1-infected calli exhibited significantly decreased AceGGP3 expression and AsA content (Figures 9A-C). When AceGGP3 was suppressed by TRV-AceGGP3, AsA content did not increase, even when AceMYBS1 was overexpressed in A. eriantha calli and transgenic lines, suggesting that AceMYBS1 is an upstream regulatory gene of AceGGP3.

[00270] To gain further understanding of the regulatory roles of AceMYBS1, six stable overexpression transgenic kiwifruit lines were generated. RT-qPCR confirmed that the six independent transgenic lines accumulated high levels of AceMYBS1 transcripts (>32- to 55-fold; Figure 10A), which successfully increased the expression levels of AceGGP3 by 2.5- to 4-fold (Figure 10B). AsA content increased to 1.4-, 1.6-, 2.0-, 1.5-, 1.7- and 1.8-fold that of controls, respectively (data not shown). Using the CRISPR/Cas9 system, AceMYBS1 was further mutated in A. eriantha. The two targeted sites were located within the first and third exons. In total, five independent AceMYBS1 mutants (mybs1#12, a 1 bp deletion; mybs1#28, a 26 bp deletion; mybs1#30, a 3 bp deletion; mybs1#41, a 1 bp insertion; mybs1#53, a 4 bp deletion) were identified (data not shown). The AsA contents of the AceMYBS1-edited lines decreased by 33.2%, 23.1%, 9.8%, 40.4% and 40.0%, respectively (data not shown). mybs1#30 had only a 9.8% reduction in AsA content. Although its 3 bp deletion (ATG) was not aligned with the codon boundaries, the effect was only slight because the result was effectively synonymous due to codon degeneracy: the only difference from WT was the loss of an aspartic acid at position 205 (just over 20 amino acids downstream from the MYB DNA binding domain). Taken together, these findings support that AceMYBS1 is a positive factor that modulates AsA synthesis.
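How a 3 bp deletion that straddles two codons can remove a single amino acid without otherwise changing the protein is easiest to see on a small example. The sketch below is purely illustrative: the 12 bp sequence is invented (it is not the AceMYBS1 sequence around position 205), the helper names are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate complete codons from the first base, stopping at a stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_bases(cds: str, start: int, length: int) -> str:
    """Remove `length` bases beginning at 0-based index `start`."""
    return cds[:start] + cds[start + length:]


# Hypothetical fragment: GCA GAT GAC AAA encodes Ala-Asp-Asp-Lys.
wild_type = "GCAGATGACAAA"
# Delete the "ATG" that spans the second and third codons (indices 4 to 6).
edited = delete_bases(wild_type, 4, 3)

print(wild_type, translate(wild_type))  # two adjacent aspartic acids
print(edited, translate(edited))        # one aspartic acid remains
```

Because GAT and GAC both encode aspartic acid, the fused codon left after the deletion still encodes aspartic acid, so the only net change in this toy case is the loss of one residue.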

5. AceGBF3 functions additively with AceMYBS1 to upregulate the expression of AceGGP3 and increase the synthesis of AsA in kiwifruit

[00271] To identify AceMYBS1-interacting proteins, a yeast two-hybrid (Y2H) screen was conducted. Approximately 85 yeast transformants were screened, and 8 positive clones were found to contain the same full-length cDNA. This sequence encodes a bZIP protein (Actinidia27344; SEQ ID No: 109), whose expression was highly correlated with that of AceGGP3 (SEQ ID No: 347); this protein was then designated as AceGBF3 based on bioinformatic analysis (data not shown). Further domain mapping analysis revealed that the N-terminal region (AceGBF3-N, amino acids 1-151) but not the C-terminal half (AceGBF3-C, amino acids 152-299) of AceGBF3 interacted with AceMYBS1 (data not shown). An additional transcriptional activity assay was then performed. Yeast cells transformed with BD-AceMYBS1 or BD-AceGBF3, but not the empty pGBKT7 vector, grew normally and turned blue on SD/-T/-H/-A/+X-α-Gal media (data not shown). Subcellular localization assays were performed in tobacco leaves transformed with a nuclear marker to visualize the subcellular locations of AceMYBS1 and AceGBF3. Fluorescence of the AceMYBS1- and AceGBF3-fused YFP proteins was detected only in the nucleus and merged perfectly with the nuclear markers (data not shown).

[00272] To further explore potential interactions between the AceGBF3 and AceMYBS1 proteins, three different methods were employed. Firstly, a BiFC experiment was selected as a method for an in vivo assay using a cell biology approach. AceMYBS1 and AceGBF3 proteins were fused to the C-terminus of yellow fluorescent protein (YFP) (AceMYBS1-cYFP) and the N-terminus of YFP (AceGBF3-nYFP), respectively, and then transiently transformed into onion epidermal cells. The YFP signal localized to the nucleus, showing close interaction between AceMYBS1 and AceGBF3 as well as nuclear localization (data not shown). To confirm this interaction, AceMYBS1 was fused to the C-terminal half of LUC (AceMYBS1-cLUC) and AceGBF3 was fused to the N-terminal half of LUC (AceGBF3-nLUC), and the constructs were transiently expressed in tobacco leaves. Only leaves co-transformed with AceMYBS1-cLUC and AceGBF3-nLUC produced a strong LUC signal (Figure 11A). Finally, AceMYBS1-AceGBF3 physical interactions were confirmed via an in vitro pulldown assay using recombinant purified proteins. The AceGBF3-glutathione S-transferase (GST) fusion protein was precipitated with AceMYBS1-6x His but not with 6x His alone when GST resin was used (Figure 11B).

[00273] To test the hypothesis that AceGBF3 and AceMYBS1 co-regulate AsA synthesis, we performed dual-LUC, overexpression and virus-induced gene silencing experiments. These showed that AceMYBS1 and AceMYBS1 plus AceGBF3, but not AceGBF3 alone, were capable of inducing the expression of AceGGP3 (which increased by 3.74- and 6.67-fold, respectively) (Figure 12A). Transient overexpression of AceGBF3 in different plant materials, including kiwifruit (Figure 12B, C), calli (Figure 12D, E) and tobacco leaves (Figure 12F, H), consistently promoted the accumulation of AsA and expression of GGP3. Furthermore, co-overexpression of AceGBF3 together with AceMYBS1 promoted the accumulation of AsA additively (Figure 13A, B).

[00274] When AceMYBS1 was silenced in A. eriantha fruits, both the expression level of AceGGP3 and the content of AsA were notably reduced or unchanged, regardless of whether AceGBF3 was overexpressed or suppressed, confirming that AceGBF3 has to interact with AceMYBS1 to regulate the transcription of AceGGP3. Moreover, five transgenic A. eriantha lines constitutively overexpressing AceGBF3 were generated, and their AceGGP3 transcript levels and AsA contents increased by 2.11- to 3.38-fold and 1.22- to 1.78-fold, respectively, consistent with previous results (Figure 14A, B). The average AsA content in calli of the AceGBF3 and AceMYBS1 co-expression lines (2-fold higher than that in WT calli) was significantly higher than that in the calli of lines expressing AceGBF3 (1.4-fold) or AceMYBS1 (1.6-fold) alone (Figure 14B). Taken together, these results show that AceGBF3 interacts with AceMYBS1 to co-regulate AceGGP3 expression, forming an AceGBF3-AceMYBS1-AceGGP3 regulatory network involved in the synthesis and metabolism of AsA in kiwifruit.

6. AceMYBS1 domain structure and genetics

[00275] The AceMYBS1 identified herein is located on chr16, and its A. chinensis and A. eriantha isoforms share 98% amino acid sequence identity and show no obvious perturbation in the two predicted SANT domains (data not shown). In addition to AceMYBS1, there is another highly similar MYBS1-like gene, sharing 87% amino acid identity, on chr26, which lies inside the recently identified A. eriantha AsA supergene QTL interval (McCallum et al., 2019, Plants (Basel, Switzerland) 8(7): 237). Alignment of the A. chinensis chromosome 26 and AceMYBS1 (chr16) protein sequences revealed that the chromosome 16 and 26 alleles share highly similar SANT/MYB domains and mainly differ by a 12 amino acid deletion in the chr26 allele, in an unstructured region in the C-terminal third of the protein (data not shown). Unstructured regions are typically associated with protein-protein binding interactions (Kragelund et al., 2012, Trends in Plant Science 17(11): 625-632; Millard et al., 2019, Nucleic Acids Research 47(18): 9592-9608). Protein disorder predictions using ODiNPred (Dass et al., 2020, Scientific Reports 10(1): 14780) also support the SMART domain architecture predictions (data not shown). The high degree of similarity, particularly in both the SANT and MYB domains, suggests that the chr26 AceMYBS1 gene should also regulate GGP3 expression, but it is expressed at very low levels in fruit and showed little correlation with AsA levels (data not shown).
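Percent identity figures such as those quoted above depend on how the aligned sequences are compared. The sketch below shows one simple way to compute identity from two already-aligned sequences. It is illustrative only: the function name is hypothetical, the choice of denominator (all alignment columns that are not gaps in both sequences) is an assumption, since other conventions exist, and the example sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: identity = identical residue columns / all columns that are
    not gaps in both sequences. Other denominators give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments: the second carries a one-residue deletion.
print(round(percent_identity("MKTALVDE", "MKT-LVDE"), 1))
```

With this convention a deletion lowers the identity even when all remaining residues match, which is why two proteins with highly similar domains can still differ by several percent overall.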

7. Stable expression of AceMYBS1 in plants

Materials and methods

[00276] A plasmid T-DNA construct as described in Li et al. (2022, New Phytologist 235(6): 2497-2497) containing the AceMYBS1 gene (derived from Actinidia eriantha) was transformed using the Agrobacterium tumefaciens-mediated method into rice (Oryza sativa L.), soybean (Glycine max), and Arabidopsis (Arabidopsis thaliana). Stably transformed lines were selected by PCR methods. The mature leaf L-ascorbic acid (AsA) content of each of the lines (triplicate plants) was then measured using HPLC as previously described (Li et al., 2022).

Rice (Oryza sativa L.)

[00277] AsA content was measured in 11 transformed lines of Oryza sativa L. and compared to wild-type non-transformed controls (WT). AsA content was higher in 10 out of 11 lines (p<0.05), with values ranging from 1.5- to 89-fold that of untransformed WT (Table 3; Figure 15A).

Table 3: Leaf AsA content of stably-transformed rice plants.

* p<0.05.

Soybean (Glycine max)

[00278] AsA content was measured in 5 transformed lines of Glycine max and compared to wild-type non-transformed controls (WT). AsA content was higher in 3 out of 5 lines (p<0.05), with one line having reduced AsA. Values ranged from 0.17- to 5.7-fold that of untransformed WT (Table 4; Figure 15B).

Table 4: Leaf AsA content of stably-transformed soybean plants.

* p<0.05.

Arabidopsis thaliana

[00279] The AsA content was measured in 15 transformed lines of Arabidopsis thaliana and compared to wild-type non-transformed controls (WT). Two lines displayed small increases (1.21-1.26-fold), while six lines had slightly reduced AsA (p<0.05). Variation ranged from 0.75 to 1.26-fold that of untransformed WT (Table 5; Figure 15E). This shows that by generating a number of transformed lines and selecting plants that show the highest AsA content, it is possible to produce Arabidopsis thaliana plants having increased AsA content compared to wild-type.

Table 5: Leaf AsA content of stably-transformed Arabidopsis thaliana plants.

Line number   Mean AsA (mg/100 g fresh weight)   AsA content relative to WT   Std deviation   T-test stat
#1            1.29*                              1.21                         0.063           0.020
#2            1.34*                              1.26                         0.084           0.037
#3            1.16                               1.09                         0.046           0.102
#4            1.16                               1.09                         0.034           0.149
#5            0.96                               0.91                         0.051           0.125
#6            1.15                               1.08                         0.047           0.176
#8            0.82*                              0.77                         0.048           0.041
#9            0.96                               0.90                         0.129           0.217
#10           0.96*                              0.90                         0.009           0.042
#11           0.83*                              0.78                         0.027           0.003
#12           0.98                               0.92                         0.041           0.220
#13           0.80*                              0.75                         0.010           0.009
#14           0.97                               0.91                         0.033           0.057
#15           0.95*                              0.90                         0.059           0.011
#17           0.90*                              0.85                         0.015           0.024
WT            1.06                               1.00                         0.042

* p<0.05.
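The relative values in Table 5 can be approximately re-derived from the tabulated means, and the lines can be grouped by the direction of any significant change. The sketch below is illustrative only; the variable names are hypothetical, and because it starts from the rounded means shown in the table, the recomputed ratios may differ from the tabulated ones in the second decimal place.

```python
# Table 5 values: line -> (mean AsA in mg/100 g fresh weight, flagged p < 0.05).
TABLE_5 = {
    "#1": (1.29, True), "#2": (1.34, True), "#3": (1.16, False),
    "#4": (1.16, False), "#5": (0.96, False), "#6": (1.15, False),
    "#8": (0.82, True), "#9": (0.96, False), "#10": (0.96, True),
    "#11": (0.83, True), "#12": (0.98, False), "#13": (0.80, True),
    "#14": (0.97, False), "#15": (0.95, True), "#17": (0.90, True),
}
WT_MEAN = 1.06

# AsA content relative to WT, recomputed from the rounded means.
relative_to_wt = {line: mean / WT_MEAN for line, (mean, _) in TABLE_5.items()}

# Lines flagged as significant, split by direction of change.
increased = [line for line, (mean, flagged) in TABLE_5.items() if flagged and mean > WT_MEAN]
decreased = [line for line, (mean, flagged) in TABLE_5.items() if flagged and mean < WT_MEAN]

for line, ratio in relative_to_wt.items():
    print(f"{line}: {ratio:.2f}-fold of WT")
print("significantly increased:", increased)
print("significantly decreased:", decreased)
```

The grouping reproduces the summary given in the text: two lines with significant increases and six with significant decreases among the 15 lines.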

Discussion

[00280] This example shows that constitutive over-expression of AceMYBS1 can greatly increase AsA concentration in plants. Very large increases were observed in some rice and soybean lines (~89- and ~5.7-fold that of the wild-type control, respectively). These are the largest increases in AsA observed by transgenic approaches to date (Bulley et al., 2012, Plant Biotechnology Journal 10(4): 390-397; Macknight et al., 2017, Current Opinion in Biotechnology 44: 153-160). Furthermore, as rice (a monocot) and soybean (a dicot) are only distantly related, this example suggests that overexpression of MYBS1 genes can increase AsA concentration in a wide variety of plant species.

[00281] While the AsA content of individual lines varied considerably, this is not unexpected. The expression of transgenes in plants is greatly affected by the genomic context into which they are inserted, and this will vary between individual transformed lines. It is common practice to generate many transformants and select the lines that show the greatest effect. Accordingly, it is not necessary for every transformed line to show an equivalent increase in AsA content. Rather, a combination of transformation and selection can be used to provide plants with greatly increased AsA content, even if most transformed lines show only modest increases.

[00282] Over-expression of AceMYBS1 in Arabidopsis showed mixed results, but in two cases modest increases in AsA were observed. As previously stated, expression of transgenes can vary by genomic context, and this experiment confirms that it is possible to increase AsA content in Arabidopsis by over-expressing MYBS1 genes. Greater increases in AsA content may be possible by screening additional lines. This provides further evidence that the observed effect is broadly applicable to a variety of different plant species.

8. Expression of MYBS1 from various plant species.

[00283] MYBS1 genes from various plant species will be stably expressed in Arabidopsis thaliana, and the leaf AsA content measured.

[00284] The plant transformation vector pHex2s will be used, with the MYBS1 transgene being driven by a 35S constitutive promoter. The MYBS1 transgenes to be used are listed in Table 6.

Table 6: MYBS1 genes.

Gene Species SEQ ID No

AceMYBS1 Actinidia eriantha 103

AcrMYBS1 Actinidia rufa 357

MYBS1-like Actinidia chinensis 108

MYBS1-like Actinidia eriantha 358

AtMYBS1 Arabidopsis thaliana 359

AtMYBS2 Arabidopsis thaliana 360

TaMYBS1 Triticum aestivum 361

OsMYBS1 Oryza sativa Japonica group 362

GmMYBS1 Glycine max 363

CaMYBS1 Capsicum annuum 364

[00285] Arabidopsis thaliana (Col-0) will be transformed by Agrobacterium-mediated transformation using the floral dip method according to standard protocols.

[00286] Transformed lines will be identified and gene expression confirmed using qRT-PCR methods as described in Example 1. Transformed lines will also have their ascorbate content determined as described in Example 1, and will also be tested for increased expression of the MYBS1 target gene GGP (GDP-galactose phosphorylase) using qRT-PCR.

[00287] It is not the intention to limit the scope of the invention to the abovementioned examples only. As would be appreciated by a skilled person in the art, many variations are possible without departing from the scope of the invention as set out in the appended claims.

INDUSTRIAL APPLICATION

[00288] The methods and tools described herein are useful for the production of plants having altered AsA content, including crops such as fruits and grains.