

Title:
SYNTHETIC GROWTH ON ONE-CARBON SUBSTRATES
Document Type and Number:
WIPO Patent Application WO/2022/047039
Kind Code:
A1
Abstract:
Many biotechnologically relevant organisms cannot utilize cheap and abundant one-carbon feedstocks, e.g. CO2, CO, formaldehyde, methanol, and methane, for growth and instead prefer complex feedstocks such as sugars. Disclosed herein is a system that enables organisms to consume one-carbon molecules for growth and maintenance via a formyl-CoA elongation pathway. Utilization of one-carbon feedstocks can replace the use of sugar as the primary means of cultivating organisms in biotechnological applications. This has the potential to be more cost effective and to avoid the controversial use of food as feedstocks. Intermediates of the formyl-CoA elongation pathway may also be converted to desired chemical products.

Inventors:
GONZALEZ RAMON (US)
CHOU ALEXANDER (US)
CLOMBURG JAMES (US)
ZHU FAYIN (US)
LEE SEUNG HWAN (US)
Application Number:
PCT/US2021/047765
Publication Date:
March 03, 2022
Filing Date:
August 26, 2021
Assignee:
UNIV SOUTH FLORIDA (US)
International Classes:
C12N15/52; C12N15/70; C12P1/04
Domestic Patent References:
WO2016069929A1 (2016-05-06)
WO2014099725A1 (2014-06-26)
Attorney, Agent or Firm:
BABEL, Angeline R. et al. (US)
Claims:
WHAT IS CLAIMED IS:

1. A non-natural microbial system capable of utilizing one-carbon (C1) substrates for growth and product synthesis, comprising: a first set of nucleic acids encoding enzymes to convert the single carbon substrate to formyl-CoA and formaldehyde; and a second set of nucleic acids encoding enzymes to convert formyl-CoA and formaldehyde to native multi-carbon substrates or metabolites that enable growth.

2. The system of claim 1, wherein the system comprises a one carbon substrate, optionally, wherein the C1 substrate comprises methane.

3. The system of claim 2, wherein the first set of metabolic enzymes comprises a methane monooxygenase that can convert the methane to methanol, a methanol dehydrogenase that can convert the methanol to formaldehyde, and an acyl-CoA reductase that can convert the formaldehyde to formyl-CoA.

4. The system of claim 1, wherein the C1 substrate comprises carbon dioxide.

5. The system of claim 4, wherein the first set of metabolic enzymes comprises a formate dehydrogenase that can convert the carbon dioxide to formate.

6. The system of claim 5, wherein the first set of metabolic enzymes further comprises an enzyme that can convert the formate to formyl-CoA.

7. The system of claim 5, wherein the first set of metabolic enzymes further comprises a formaldehyde dehydrogenase that can convert formate to formaldehyde, and an acyl-CoA reductase that can convert the formaldehyde to formyl-CoA.

8. The system of claim 5, wherein the first set of metabolic enzymes further comprises a formate kinase that can convert the formate to formyl-phosphate and a phosphotransacylase that can convert the formyl-phosphate to formyl-CoA.

9. The system of claim 1, wherein the C1 substrate comprises formate.

10. The system of claim 9, wherein the first set of metabolic enzymes comprises an enzyme that can convert the formate to formyl-CoA.

11. The system of claim 9, wherein the first set of metabolic enzymes comprises a formaldehyde dehydrogenase that can convert formate to formaldehyde, and an acyl-CoA reductase that can convert the formaldehyde to formyl-CoA.

12. The system of claim 9, wherein the first set of metabolic enzymes comprises a formate kinase that can convert the formate to formyl-phosphate and a phosphotransacylase that can convert the formyl-phosphate to formyl-CoA.

13. The system of claim 1, wherein the C1 substrate comprises carbon monoxide.

14. The system of claim 13, wherein the first set of metabolic enzymes comprises a carbon monoxide dehydrogenase that can convert the carbon monoxide to carbon dioxide, and a formate dehydrogenase that can convert the carbon dioxide to formate.

15. The system of claim 14, wherein the first set of metabolic enzymes further comprises an enzyme that can convert the formate to formyl-CoA.

16. The system of claim 14, wherein the first set of metabolic enzymes further comprises a formaldehyde dehydrogenase that can convert formate to formaldehyde, and an acyl-CoA reductase that can convert the formaldehyde to formyl-CoA.

17. The system of claim 14, wherein the first set of metabolic enzymes further comprises a formate kinase that can convert the formate to formyl-phosphate and a phosphotransacylase that can convert the formyl-phosphate to formyl-CoA.

18. The system of claim 1, wherein the C1 substrate comprises methanol.

19. The system of claim 18, wherein the first set of metabolic enzymes comprises a methanol dehydrogenase that can convert the methanol to formaldehyde, and an acyl-CoA reductase that can convert the formaldehyde to formyl-CoA.

20. The system of claim 1, wherein the C1 substrate comprises formaldehyde.

21. The system of claim 20, wherein the first set of metabolic enzymes comprises an acyl-CoA reductase that can convert the formaldehyde to formyl-CoA.

22. The system of any one of claims 1 to 21, wherein the second set of metabolic enzymes comprises 2-hydroxyacyl-CoA lyase (HACL).

23. The system of claim 22, wherein the second set of metabolic enzymes further comprises an acyl-CoA reductase.

24. The system of claim 23, wherein the second set of metabolic enzymes further comprises a 1,2-diol oxidoreductase.

25. The system of claim 24, wherein the second set of metabolic enzymes further comprises a diol dehydratase.

26. The system of any one of claims 1 to 25, further comprising a third set of nucleic acids encoding enzymes that produce molecules of interest.

27. The system of any one of claims 1 to 26, comprising one microbial host encompassing all nucleic acids.

28. The system of any one of claims 1 to 27, comprising multiple microbial host strains having different sets of nucleic acids.

29. A method of culturing the microbial system of any one of claims 1 to 28, comprising incubating the microbial system with a C1 feedstock under suitable conditions, wherein the C1 feedstock is utilized for growth.

30. The method of claim 29 further comprising isolating a product of interest from the microbial culture system.

31. A metabolically engineered microorganism, comprising: a first set of nucleic acids encoding metabolic enzymes that convert a single carbon substrate to formyl-CoA; and a second set of nucleic acids encoding metabolic enzymes that extend a carbon backbone via a formyl-CoA elongation pathway that uses the formyl-CoA as an elongation unit.

32. The metabolically engineered microorganism of claim 31, wherein one or more intermediates of the formyl-CoA elongation pathway serve as metabolites for the growth of the microorganism.

33. The metabolically engineered microorganism of claims 31 or 32, wherein one or more intermediates of the formyl-CoA elongation pathway serve as precursors to one or more chemical products.

34. The metabolically engineered microorganism of any one of claims 31 to 33, wherein the metabolic enzymes encoded by the first set and the second set of nucleic acids are recombinantly expressed or overexpressed in the microorganism.

35. The metabolically engineered microorganism of any one of claims 31 to 34, wherein one or more genes encoding a metabolic enzyme involved in a metabolic pathway that competes with the formyl-CoA elongation pathway are knocked out in the microorganism.

36. The metabolically engineered microorganism of any one of claims 31 to 35, further comprising a third set of nucleic acids encoding metabolic enzymes that convert one or more intermediates of the formyl-CoA elongation pathway to a chemical product.

37. A method of cultivating a microorganism on a single carbon substrate, comprising: providing the microorganism with a first set of nucleic acids encoding metabolic enzymes for converting the single carbon substrate to formyl-CoA, and a second set of nucleic acids encoding metabolic enzymes for extending a carbon backbone via a formyl-CoA elongation pathway that uses the formyl-CoA as an elongation unit; and culturing the microorganism in a growth medium containing the single carbon substrate, wherein one or more intermediates of the formyl-CoA elongation pathway serve as a growth substrate or a precursor to a growth substrate of the microorganism.

38. The method of claim 37, wherein the growth substrate includes one or more of an aldose sugar, a diol, a 2-hydroxyacid, glycolic acid, glyceraldehyde, lactic acid, and acetyl-CoA.

39. A method of chemical product synthesis from a single carbon substrate, comprising: providing a microorganism with a first set of nucleic acids encoding metabolic enzymes that convert the single carbon substrate to formyl-CoA, and a second set of nucleic acids encoding metabolic enzymes that extend a carbon backbone via a formyl-CoA elongation pathway that uses the formyl-CoA as an elongation unit; and feeding the microorganism the single carbon substrate, wherein one or more intermediates of the formyl-CoA elongation pathway serve as a precursor to a chemical product.

40. The method of claim 39, wherein the chemical product is selected from a 2-hydroxyacid, an aldose, a diol, a polyol, a carboxylic acid compound, lactic acid, and an alcohol.

41. The method of claims 39 or 40, wherein the chemical product is a multi-carbon chemical.

42. The method of any one of claims 39 to 41, further comprising purifying the chemical product from a fermentation broth of the microorganism.

43. The method of any one of claims 39 to 42, further comprising providing the microorganism with a third set of nucleic acids encoding metabolic enzymes for converting intermediates of the formyl-CoA elongation pathway to the chemical product.

44. The method of any one of claims 37 to 43, wherein providing the microorganism with the first set of nucleic acids and the second set of nucleic acids comprises transforming the microorganism with one or more expression vectors containing one or more genes encoding one or more of the metabolic enzymes.

45. The method of any one of claims 37 to 43, wherein providing the microorganism with the first set of nucleic acids and the second set of nucleic acids comprises adding genes encoding one or more of the metabolic enzymes by one of recombinant engineering, homologous recombination, and gene editing.

46. The method of any one of claims 37 to 45, further comprising overexpressing one or more of the metabolic enzymes for converting the single carbon substrate to formyl-CoA and/or one or more of the metabolic enzymes for extending the single carbon substrate via the formyl-CoA elongation pathway.

47. The method of any one of claims 37 to 46, further comprising suppressing metabolic pathways that compete with the formyl-CoA elongation pathway in the microorganism by knocking out one or more genes in the microorganism that encode a metabolic enzyme involved in the competing metabolic pathway.

48. A cell-free system, comprising: a first set of metabolic enzymes that convert a single carbon substrate to formyl-CoA; and a second set of metabolic enzymes that extend a carbon backbone via a formyl-CoA elongation pathway that uses the formyl-CoA as an elongation unit.

49. The cell-free system of claim 48, wherein the first set of metabolic enzymes and the second set of metabolic enzymes are derived from extracts of one or more microorganisms expressing the first and second set of metabolic enzymes.

50. The cell-free system of claims 48 or 49, wherein the cell-free system further comprises a third set of metabolic enzymes that convert one or more intermediates of the formyl-CoA elongation pathway into a chemical product.

51. The metabolically engineered microorganism, the method, or the cell-free system of any of claims 31 to 50, wherein the formyl-CoA elongation pathway includes at least one of an aldose elongation pathway and an aldehyde elongation pathway.

52. The metabolically engineered microorganism, the method, or the cell-free system of any one of claims 31 to 51, wherein the single carbon substrate is selected from methane, methanol, carbon dioxide, formate, carbon monoxide, and combinations thereof.

53. The metabolically engineered microorganism, the method, or the cell-free system of any one of claims 31 to 52, wherein the metabolic enzymes that convert the single carbon substrate to formyl-CoA include one or more enzymes that convert the single carbon substrate to formaldehyde and an acyl-CoA reductase that converts the formaldehyde to formyl-CoA.

54. The metabolically engineered microorganism, the method, or the cell-free system of any one of claims 31 to 53, wherein the single carbon substrate is methane.

55. The metabolically engineered microorganism, the method, or the cell-free system of claim 54, wherein the metabolic enzymes that convert the single carbon substrate to formyl-CoA include a methane monooxygenase that oxidizes the methane to methanol.

56. The metabolically engineered microorganism, the method, or the cell-free system of any one of claims 31 to 53, wherein the single carbon substrate is methanol.

57. The metabolically engineered microorganism, the method, or the cell-free system of any one of claims 55 or 56, wherein the metabolic enzymes that convert the single carbon substrate to formyl-CoA include a methanol dehydrogenase that oxidizes the methanol to formaldehyde, and an acyl-CoA reductase that converts the formaldehyde to formyl-CoA.

58. The metabolically engineered microorganism, the method, or the cell-free system of any one of claims 31 to 53, wherein the single carbon substrate is carbon dioxide.

59. The metabolically engineered microorganism, the method, or the cell-free system of claim 58, wherein the metabolic enzymes that convert the single carbon substrate to formyl-CoA include a formate dehydrogenase that converts the carbon dioxide to formate.

60. The metabolically engineered microorganism, the method, or the cell-free system of any one of claims 31 to 53, wherein the single carbon substrate is carbon monoxide.

61. The metabolically engineered microorganism, the method, or the cell-free system of claim 60, wherein the metabolic enzymes that convert the single carbon substrate to formyl-CoA include a carbon monoxide dehydrogenase that converts the carbon monoxide to carbon dioxide, and a formate dehydrogenase that converts the carbon dioxide to formate.

62. The metabolically engineered microorganism, the method, or the cell-free system of any one of claims 31 to 52, wherein the single carbon substrate is formate.

63. The metabolically engineered microorganism, the method, or the cell-free system of any one of claims 59, 61, or 62, wherein the metabolic enzymes that convert the single carbon substrate to formyl-CoA include one or more enzymes that convert the formate to formyl-CoA.

64. The metabolically engineered microorganism, the method, or the cell-free system of any one of claims 59, 61, or 62, wherein the metabolic enzymes that convert the single carbon substrate to formyl-CoA include a formaldehyde dehydrogenase that converts the formate to formaldehyde, and an acyl-CoA reductase that converts the formaldehyde to formyl-CoA.

65. The metabolically engineered microorganism, the method, or the cell-free system of any one of claims 59, 61, or 62, wherein the metabolic enzymes that convert the single carbon substrate to formyl-CoA include a formate kinase that converts the formate to formyl-phosphate and a phosphotransacylase that converts the formyl-phosphate to formyl-CoA.

66. The metabolically engineered microorganism, the method, or the cell-free system of any one of claims 59, 61, or 62, wherein the metabolic enzymes that convert the single carbon substrate to formyl-CoA include an acyl-CoA synthetase that converts the formate to formyl-CoA.

67. The metabolically engineered microorganism, the method, or the cell-free system of any one of claims 31 to 66, wherein the metabolic enzymes that extend the carbon backbone via the formyl-CoA elongation pathway include 2-hydroxyacyl-CoA lyase (HACL) or oxalyl-CoA decarboxylase (OXC), and wherein the HACL or the OXC produce 2-hydroxyacyl-CoA from formyl-CoA and an aldehyde.

68. The metabolically engineered microorganism, the method, or the cell-free system of claim 67, wherein the metabolic enzymes that extend the carbon backbone via the formyl-CoA elongation pathway further include an acyl-CoA reductase that reduces the 2-hydroxyacyl-CoA to a 2-hydroxyaldehyde.

69. The metabolically engineered microorganism, the method, or the cell-free system of claim 68, wherein the HACL or the OXC catalyze the one-carbon elongation of the 2-hydroxyaldehyde with formyl-CoA.

70. The metabolically engineered microorganism, the method, or the cell-free system of claim 68, wherein the metabolic enzymes that extend the carbon backbone via the formyl-CoA elongation pathway further include a 1,2-diol oxidoreductase or alcohol dehydrogenase that reduce the 2-hydroxyaldehyde to a 1,2-diol.

71. The metabolically engineered microorganism, the method, or the cell-free system of claim 70, wherein the metabolic enzymes that extend the carbon backbone via the formyl-CoA elongation pathway further include a diol dehydratase that dehydrates the 1,2-diol to an aldehyde.

72. The metabolically engineered microorganism, the method, or the cell-free system of claim 71, wherein the HACL or OXC catalyze the one carbon elongation of the aldehyde with formyl-CoA.

73. A two-strain microbial system, comprising: a first microorganism including nucleic acids encoding one or more first metabolic enzymes that convert a single carbon substrate to formyl-CoA, and nucleic acids encoding one or more second metabolic enzymes that produce glycolate from the formyl-CoA, wherein the first microorganism is unable to consume and grow on the glycolate; and a second microorganism lacking nucleic acids encoding the first and second metabolic enzymes, wherein the second microorganism is able to consume and grow on the glycolate, and wherein co-culturing the first microorganism and the second microorganism in media containing the single carbon substrate leads to growth of the second microorganism.

Description:
SYNTHETIC GROWTH ON ONE-CARBON SUBSTRATES

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is related to, claims priority to, and incorporates herein by reference for all purposes U.S. Provisional Patent Application No. 63/070,464, filed on August 26, 2020.

BACKGROUND

[0002] One-carbon (C1) compounds represent potential low-cost and abundant feedstocks for the chemical industry (Durre, P. & Eikmanns, BJ. Curr. Opin. Biotechnol. 35:63-72 (2015)). Due to the often dilute and disperse nature of these feedstocks, biochemical processes have the potential to be effective technologies for C1 utilization by enabling lower capital expenditure (CapEx) and distributed manufacturing in ways that current chemical technologies are limited (Clomburg, JM., et al. Science 355:aag0804 (2017)). While C1 molecules can be effectively utilized by biology for growth, the efficient biological production of varied industrial chemicals from C1 substrates remains an open challenge.

[0003] Approaches toward biological product synthesis from C1 substrates involving both natural (Bar-Even, A., et al. J. Exp. Bot. 63:2325-42 (2012); Kalyuzhnaya, MG., et al. Metab. Eng. 29:142-152 (2015)) and synthetic (Bogorad, I. W. et al. Proc. Natl. Acad. Sci. U. S. A. 111:15928-33 (2014); Siegel, J. B. et al. Proc. Natl. Acad. Sci. 112:3704-3709 (2015); Schwander, T., et al. Science 354:900-904 (2016); Lu, X. et al. Nat. Commun. 10:1378 (2019); Kim, S. et al. Nat. Chem. Biol. 16:538-545 (2020)) pathways tend to share a common metabolic architecture (Fig. 1A). C1 molecules at varying levels of reduction are first assimilated to produce C2 and C3 central metabolites. These C2/C3 metabolites are either precursors for product synthesis pathways or are further converted through central metabolic pathways to generate said precursors. As a result, carbon assimilation, central metabolism, and product synthesis pathways must be concurrently engineered to enable effective production of targeted chemicals. Conceptually, though, this architecture should not be a necessity for chemical production from C1 molecules. Unlike the utilization of multi-carbon substrates for which there is an advantage to conserve carbon-carbon bonds, all carbon-carbon bonds must be created de novo when starting from C1 molecules. In principle, then, it should be possible to build diverse chemical products from C1 molecules using C1 “building blocks” without the added complexity of first generating multi-carbon units.

[0004] While fuel and chemical production based on C1 feedstocks is promising, significant challenges impede the implementation of efficient biocatalysts for C1 bioconversion. Many biotechnologically relevant organisms cannot utilize cheap and abundant one-carbon feedstocks (e.g. CO2, methane) for growth and instead prefer complex feedstocks such as sugar.

SUMMARY

[0005] As disclosed herein, formyl-CoA can serve as a C1 building block or elongation unit in a reaction catalyzed by 2-hydroxyacyl-CoA lyase (HACL) or oxalyl-CoA decarboxylase (OXC), allowing organisms to utilize C1 feedstocks for growth and resulting in a more cost-effective way of cultivating said organisms. These pathways, referred to herein as formyl-CoA elongation (FORCE) pathways, can be used for the production of growth substrates for biocatalyst growth or maintenance and ultimately for biochemical product synthesis. In some embodiments, the disclosed FORCE pathways can be used with multi-carbon substrates, such as C2, C3, C4, C5, C6 substrates. For example, multi-carbon co-substrates can include sugars (e.g. glucose), glycerol, acetate, and fatty acids.

[0006] Therefore, disclosed herein are microorganisms that are not naturally able to utilize C1 substrates for growth (i.e. heterotrophs) but which have been engineered to be able to do so. Engineering of these organisms, which are referred to as either methylotrophs, formatotrophs, or autotrophs, involves providing a cell system with a first set of metabolic enzymes to convert the single carbon substrate to formyl-CoA and formaldehyde and a second set of metabolic enzymes to elongate aldoses or aldehydes with the formyl-CoA molecules, feeding the system a C1 substrate under suitable conditions for the metabolic enzymes to produce multi-carbon native substrates or metabolites, and optionally providing a third set of metabolic enzymes to convert substrates or metabolites into a desired multi-carbon chemical.
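By way of illustration only, the conversion of each C1 substrate to formyl-CoA by the first set of metabolic enzymes, as recited in the claims, may be sketched as a simple lookup. The following Python sketch is not part of the disclosed system; the names `UPSTREAM`, `FORMATE_ROUTES`, and `route_to_formyl_coa`, and the route labels `"synthetase"`, `"reductive"`, and `"kinase"`, are hypothetical and chosen only for this example.

```python
# Illustrative sketch only: enzyme steps from a C1 substrate to formyl-CoA,
# following the options recited in the claims. All identifiers are hypothetical.

# Substrate -> (enzyme, product) for the steps that have a single recited option.
UPSTREAM = {
    "methane": ("methane monooxygenase", "methanol"),
    "methanol": ("methanol dehydrogenase", "formaldehyde"),
    "formaldehyde": ("acyl-CoA reductase", "formyl-CoA"),
    "carbon monoxide": ("carbon monoxide dehydrogenase", "carbon dioxide"),
    "carbon dioxide": ("formate dehydrogenase", "formate"),
}

# Formate can be activated by any of three recited alternatives.
FORMATE_ROUTES = {
    "synthetase": [("acyl-CoA synthetase", "formyl-CoA")],
    "reductive": [("formaldehyde dehydrogenase", "formaldehyde"),
                  ("acyl-CoA reductase", "formyl-CoA")],
    "kinase": [("formate kinase", "formyl-phosphate"),
               ("phosphotransacylase", "formyl-CoA")],
}


def route_to_formyl_coa(substrate, formate_route="synthetase"):
    """Return the ordered (enzyme, product) steps from substrate to formyl-CoA."""
    steps = []
    current = substrate
    while current != "formyl-CoA":
        if current == "formate":
            # Append the chosen formate activation branch and stop.
            steps.extend(FORMATE_ROUTES[formate_route])
            break
        enzyme, product = UPSTREAM[current]
        steps.append((enzyme, product))
        current = product
    return steps


if __name__ == "__main__":
    for enzyme, product in route_to_formyl_coa("carbon monoxide", "kinase"):
        print(f"{enzyme} -> {product}")
```

In this sketch, a substrate that is not listed raises a `KeyError`, which makes it obvious when a route outside the recited options is requested.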

[0007] Disclosed herein is a non-natural microbial system capable of utilizing one-carbon (C1) substrates for growth and product synthesis. The non-natural microbial system may include a first set of nucleic acids encoding enzymes to convert the single carbon substrate to formyl-CoA and formaldehyde, and a second set of nucleic acids encoding enzymes to convert formyl-CoA and formaldehyde to native multi-carbon substrates or metabolites that enable growth.

[0008] Further disclosed herein is a metabolically engineered microorganism. The metabolically engineered microorganism may include a first set of nucleic acids encoding metabolic enzymes that convert a single carbon substrate to formyl-CoA, and a second set of nucleic acids encoding metabolic enzymes that extend a carbon backbone via a formyl-CoA elongation pathway that uses the formyl-CoA as an elongation unit.

[0009] Also disclosed herein is a method of cultivating a microorganism on a single carbon substrate. The method may include providing the microorganism with a first set of nucleic acids encoding metabolic enzymes for converting the single carbon substrate to formyl-CoA, and a second set of nucleic acids encoding metabolic enzymes for extending a carbon backbone via a formyl-CoA elongation pathway that uses the formyl-CoA as an elongation unit. The method may further include culturing the microorganism in a growth medium containing the single carbon substrate. One or more intermediates of the formyl-CoA elongation pathway may serve as a growth substrate or a precursor to a growth substrate of the microorganism.

[0010] Further disclosed herein is a method of chemical product synthesis from a single carbon substrate. The method may include providing a microorganism with a first set of nucleic acids encoding metabolic enzymes that convert the single carbon substrate to formyl-CoA, and a second set of nucleic acids encoding metabolic enzymes that extend a carbon backbone via a formyl-CoA elongation pathway that uses the formyl-CoA as an elongation unit. The method may further include feeding the microorganism the single carbon substrate. One or more intermediates of the formyl-CoA elongation pathway may be a chemical product or may serve as a precursor to a chemical product.

[0011] Also disclosed herein is a cell-free system including a first set of metabolic enzymes that convert a single carbon substrate to formyl-CoA, and a second set of metabolic enzymes that extend a carbon backbone via a formyl-CoA elongation pathway that uses the formyl-CoA as an elongation unit.

[0012] Also disclosed herein is a two-strain microbial system. The two-strain microbial system may include a first microorganism including nucleic acids encoding one or more first metabolic enzymes that convert a single carbon substrate to formyl-CoA, and nucleic acids encoding one or more second metabolic enzymes that produce glycolate from the formyl-CoA. The first microorganism may be unable to consume and grow on the glycolate. The two-strain microbial system may further include a second microorganism lacking nucleic acids encoding the first and second metabolic enzymes. The second microorganism may be able to consume and grow on glycolate. Coculturing the first microorganism and the second microorganism in media containing the single carbon substrate may lead to growth of the second microorganism.

[0013] The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.

DESCRIPTION OF DRAWINGS

[0014] Figure 1. FORCE pathways for product synthesis from C1 substrates. a) A synthetic, orthogonal architecture for C1 utilization based on formyl-CoA elongation (FORCE) pathways. Carbon skeletons are directly built from activated C1 units in the form of formyl-CoA, thus bypassing the “bowtie” architecture of metabolism for product synthesis. b) One-carbon substrates are activated to the C1 elongation unit formyl-CoA through various redox reactions (blue box). Formyl-CoA serves to elongate an aldehyde in a reaction catalyzed by HACL, resulting in the production of 2-hydroxyacyl-CoA. 2-Hydroxyacyl-CoA can be further reduced to a 2-hydroxyaldehyde. The 2-hydroxyaldehyde can be further elongated by formyl-CoA, which we refer to as aldose elongation. Alternatively, α-reduction can take place via reduction to a 1,2-diol and dehydration to a nonfunctionalized aldehyde. The resulting aldehyde can then be further elongated. These collective routes for elongation, referred to as formyl-CoA elongation (FORCE), are boxed in green. The various intermediates of these elongation pathways can be converted to desirable chemical products (red) including 2-hydroxy-acids, aldoses, diols, polyols, carboxylic acids, and alcohols. A number of these products and intermediates can also serve as substrates for growth (highlighted in orange), such as glycolic acid, glyceraldehyde, and acetyl-CoA. Abbreviations: MDH: methanol dehydrogenase; ACR: acyl-CoA reductase; FaldDH: formaldehyde dehydrogenase; ACS: acyl-CoA synthetase; ACT: acyl-CoA transferase; FOK: formate kinase; PTA: phosphotransacylase; HACL: 2-hydroxyacyl-CoA lyase; ADH: alcohol dehydrogenase; DDR: diol dehydratase; TES: thioesterase; ALDH: aldehyde dehydrogenase. Standard Gibbs free energies of reactions are given for each pathway reaction in the direction indicated by the arrow.
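By way of illustration only, the iterative elongation described for Figure 1b may be sketched as a trace of intermediates and their carbon counts. The following Python sketch assumes each HACL-catalyzed step adds exactly one carbon from formyl-CoA; the function name `force_trace` and the variant labels `"aldose"` and `"alpha"` are hypothetical and are not taken from the disclosure.

```python
# Illustrative sketch only: carbon count of FORCE pathway intermediates.
# Function and variant names are hypothetical.

def force_trace(cycles, variant="aldose", start_carbons=1):
    """List (carbon_count, species) for `cycles` rounds of formyl-CoA elongation.

    variant="aldose": the 2-hydroxyaldehyde is elongated directly in the next round.
    variant="alpha":  the 2-hydroxyaldehyde is reduced to a 1,2-diol and dehydrated
                      to a nonfunctionalized aldehyde before the next round.
    """
    if variant not in ("aldose", "alpha"):
        raise ValueError("variant must be 'aldose' or 'alpha'")
    if cycles < 0 or start_carbons < 1:
        raise ValueError("cycles must be >= 0 and start_carbons >= 1")

    carbons = start_carbons
    trace = [(carbons, "aldehyde")]
    for _ in range(cycles):
        carbons += 1  # HACL or OXC: aldehyde + formyl-CoA -> 2-hydroxyacyl-CoA
        trace.append((carbons, "2-hydroxyacyl-CoA"))
        trace.append((carbons, "2-hydroxyaldehyde"))  # acyl-CoA reductase
        if variant == "alpha":
            trace.append((carbons, "1,2-diol"))  # 1,2-diol oxidoreductase
            trace.append((carbons, "aldehyde"))  # diol dehydratase
    return trace


if __name__ == "__main__":
    for carbons, species in force_trace(2, variant="alpha"):
        print(f"C{carbons} {species}")
```

Starting from formaldehyde (one carbon), the first round in this sketch corresponds to glycolyl-CoA and glycolaldehyde, and, in the α-reduction variant, to ethylene glycol and acetaldehyde.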

[0015] Figure 2. Thermodynamic analysis of FORCE pathways. Thermodynamic feasibility was evaluated by calculating the Min-max Driving Force (MDF) of specified conversions. a) The MDF for the utilization of different C1 substrates for the production of glycolate or acetate via the synthetic pathway. Open bars refer to the standard conditions (maximum substrate concentration constraint 10 mM), while filled bars refer to adjusted constraints reflecting the approximate toxicity of each substrate to E. coli. Three pathways for formate utilization were evaluated, each different in the ATP requirement for formate activation. b) The influence of factors such as NADH/NAD+ ratio, ATP consumption, and formate concentration on the MDF of formate to product conversion. c) The MDF of iterative FORCE pathways as a function of the product carbon chain length using formaldehyde as the representative substrate and where the product is the aldose or aldehyde corresponding to the indicated chain length. These aldose and aldehyde elongation pathways are shown in Fig. 1 (see formyl-CoA elongation panel).

[0016] Figure 3. In vitro assessment of core module of the FORCE pathway using purified enzymes. a) Pathways for conversion of C1 substrates formaldehyde and formate (individually and in combination) to glycolyl-CoA. Enzymes and co-factors for each step are indicated. Substrate(s) added are shown in bold and underlined. b) A Liquid Chromatography-Mass Spectrometry (LC-MS) extracted ion chromatography (EIC) of formyl-CoA and glycolyl-CoA through Find By Formula (FBF) function in MassHunter Qualitative Analysis B.05.00. The data is representative of duplicate experiments. c) Relative abundance of formyl-CoA and glycolyl-CoA in the in vitro samples. Formyl-CoA and glycolyl-CoA were quantified based on EIC peak area by LC-MS analysis. Resulting C2 in the samples were further quantified as glycolic acid after NaOH hydrolysis of glycolyl-CoA. All data is shown for technical replicates (n=2) with bars drawn to the mean values.

[0017] Figure 4. Cell-free prototyping of the α-reduction variant of the FORCE product synthesis pathway. a) Overview of the prototyped α-reduction pathway for the production of various C2 products from formaldehyde. Products that were detected in this work are boxed with a solid outline. Enzyme abbreviations: DDR: Klebsiella oxytoca diol dehydratase; End. (1): endogenous aldehyde dehydrogenase; End. (2): endogenous thioesterase; End. (3): endogenous alcohol dehydrogenase; FucO: E. coli 1,2-diol oxidoreductase; LmACR: Listeria monocytogenes acyl-CoA reductase; RuHACL: Rhodospirillales bacterium URHD0017 HACL. b) Product and substrate profiles of cell-free systems with indicated pathway enzymes as detected by HPLC under conditions in which carboxylates are detected in their acid form. Concentrations are given on a carbon basis. All data points are shown for triplicate technical replicates. Lines are drawn to the mean values with error bars indicating the standard deviation.

[0018] Figure 5. Resting cell bioconversions of C1 substrate formaldehyde using the aldose elongation and α-reduction variants of the FORCE pathways. a) Strategies used in this work to demonstrate diverse product synthesis using FORCE pathways from formaldehyde. Detected products and byproducts are boxed with a solid outline. Knockout strategies to reduce byproduct synthesis are indicated in red. b) Metabolite profiles for strains engineered for product synthesis from formaldehyde using FORCE pathways after 24 hour resting cell bioconversions with OD600 = 10 (5×10^9 CFU/mL) and two additions of 10 mM formaldehyde at 0 and 1.5 hours. In the legend, + refers to overexpression of the indicated enzyme. Δaldh refers to knockouts of aldehyde dehydrogenases: ΔaldA ΔaldB ΔpatD ΔpuuC. End. tes refers to endogenous thioesterases and spontaneous thioester hydrolysis. No multi-carbon products were observed in a strain that was expressing LmACR and EcAldA only without RuHACL. Concentrations are given on a carbon basis and were determined by HPLC under conditions in which carboxylates are detected in their acid form. All data points are shown for duplicate technical replicates. Bars are drawn to the mean values. c) Spectra of multi-carbon products generated from experiments using 13C-labeled formaldehyde in comparison to products from unlabeled formaldehyde. The [M-15]+ ion is shown. A +2 shift in m/z is observed for glycolic acid and ethylene glycol, and a +3 shift in m/z is observed for glyceric acid.
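
As an illustration of the carbon-basis convention used in these figure descriptions, the following Python sketch converts molar concentrations to carbon-molar concentrations by multiplying by the number of carbon atoms per molecule. The function names and the example product concentrations are hypothetical and are not data from the disclosure; only the 20 mM total formaldehyde (two additions of 10 mM) is taken from the description of Figure 5.

```python
# Carbon atoms per molecule for compounds named in the Figure 5 description.
CARBONS_PER_MOLECULE = {
    "formaldehyde": 1,
    "formate": 1,
    "glycolic acid": 2,
    "ethylene glycol": 2,
    "glyceric acid": 3,
}


def to_carbon_basis(compound, conc_mM):
    """Return the concentration in mM of carbon (molar conc. x carbons)."""
    return CARBONS_PER_MOLECULE[compound] * conc_mM


def carbon_recovery(products_mM, substrate, substrate_mM):
    """Fraction of substrate carbon recovered in the listed products."""
    product_carbon = sum(to_carbon_basis(c, v) for c, v in products_mM.items())
    return product_carbon / to_carbon_basis(substrate, substrate_mM)


# Hypothetical product titers (mM); NOT measurements from the disclosure.
example_products = {"glycolic acid": 3.0, "glyceric acid": 1.0}
# 20 mM formaldehyde in total: two additions of 10 mM (Figure 5 description).
recovery = carbon_recovery(example_products, "formaldehyde", 20.0)
print(f"carbon recovered in products: {recovery:.2f}")
```

Reporting on a carbon basis makes C1 substrates and C2 or C3 products directly comparable, since one mole of a three-carbon product accounts for three moles of substrate carbon.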

[0019] Figure 6. FORCE pathway implementation in growing cell cultures using methanol as the C1 substrate. a) Host and pathway designs for the production of glycolate from methanol in actively growing E. coli cultures. Knockout strategies to reduce byproduct synthesis and prevent glycolate utilization are indicated in red and correspond to host strain AC440. ΔTE refers to knockouts of endogenous thioesterases (ΔyciA ΔtesA ΔtesB ΔybgC ΔydiI ΔfadM). End. (1) refers to endogenous aldehyde oxidation activity. Enzyme abbreviations: BmMDH2 MGA3: Bacillus methanolicus MGA3 NAD+-dependent methanol dehydrogenase; LmACR: Listeria monocytogenes acyl-CoA reductase; RuHACL G390N: Rhodospirillales bacterium URHD0017 HACL (G390N); BsmHACL: Beach sand metagenome HACL; EcAldA: E. coli aldehyde dehydrogenase A; CbAbfT: Clostridium aminobutyricum CoA transferase. b) Time course of production of glycolate and formate from methanol. FORCE pathway designs were implemented by overexpressing LmACR, EcAldA, and BmMdh MGA3 with or without RuHACL G390N. All data is shown for biological replicates (n=3 for samples with RuHACL G390N; n=2 for samples without RuHACL G390N). Lines are drawn to the mean values with error bars indicating the standard deviation. Concentrations are given on a carbon basis. c) Improvement of glycolate production from methanol in growing E. coli cultures via rational engineering. Glycolate and formate concentrations are given on a carbon basis for the 72-hour time point. All data is shown for biological replicates (n=3 for samples with RuHACL G390N; n=4 for others). Bars are drawn to the mean values. d) Spectra of the [M-15]+ ion of the 2TMS derivative of glycolic acid produced by E. coli incubated with either 12C (unlabeled) or 13C (labeled) methanol.

[0020] Figure 7. Simulated flux maps from genome scale E. coli models for growth using FORCE pathway variants: a) (form)aldehyde elongation, b) α-reduction, c) aldose elongation. Substrate uptake reactions are indicated in green. The reactions implemented for each FORCE pathway variant are drawn in blue. FORCE pathway termination is indicated in orange. Carbon dissimilated as CO2 export is highlighted in red. Fluxes are given in mmol/g DCW/hr. Only major fluxes (threshold set as > p) are drawn for clarity. Reactions of the pentose phosphate pathway resulting in the rearrangement of erythrose 4-phosphate into glyceraldehyde 3-phosphate in panel b are simplified.

[0021] Figure 8. Two-strain system for evaluating the ability of FORCE pathways to enable growth on C1 substrates. a) FORCE pathways can enable synthetic methylotrophy by converting non-native C1 substrates into native multi-carbon substrates that serve as carbon and energy sources. b) Conceptual scheme of the two-strain system. Producer strains (yellow outline) that are unable to consume glycolate were engineered to produce glycolate from one of three C1 substrates: methanol (red), (para)formaldehyde (blue), or formate and formaldehyde (green). A second consumer strain capable of consuming glycolate was added to the culture, acting as a detectable signal to evaluate growth. c) Time course of glycolate concentration (blue) and cell growth (orange) in the two-strain system with (para)formaldehyde as the sole source of carbon. 5 mM (mass equivalent) paraformaldehyde was added to AC440 (3×10^9 CFU/mL) expressing LmACR, AldA, and BsmHACL. All data points are shown for duplicate replicates. The line for glycolate concentration is drawn to the mean values. The line for cell growth is the fit of the data to exponential growth by least squares regression, which was used to calculate the specific growth rate (μ). Full metabolite and cell growth profiles, including for control samples, are shown in Fig. 16. d) Growth of the consumer strain when incubated for the indicated time with the relevant producer strain with (+) or without (-) HACL and the indicated C1 substrate (pFALD: paraformaldehyde; MeOH: methanol; FALD: formaldehyde; FA: sodium formate). See also Fig. 16-18. All data is shown for duplicate technical replicates with bars drawn to the mean values. e) Plate images demonstrating growth of the consumer strain corresponding to the conditions in panel d.
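
The exponential growth fit mentioned for panel c can be sketched as follows. This is a minimal illustration only: it assumes the fit is done by ordinary least squares on log-transformed cell counts, which the description does not specify, and the data points are synthetic values generated from a known rate rather than measurements from the disclosure. The function name is hypothetical.

```python
import math


def specific_growth_rate(times_h, cell_counts):
    """Fit N(t) = N0 * exp(mu * t) by least squares on ln(N) versus t.

    Returns (mu, N0), with mu in 1/h when times are given in hours.
    """
    ys = [math.log(n) for n in cell_counts]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, ys))
    mu = sxy / sxx                      # slope of the regression line
    n0 = math.exp(y_mean - mu * t_mean)  # intercept, back-transformed
    return mu, n0


# Synthetic data generated with mu = 0.1 per hour; NOT data from the disclosure.
times = [0.0, 6.0, 12.0, 24.0]
counts = [1e6 * math.exp(0.1 * t) for t in times]
mu, n0 = specific_growth_rate(times, counts)
doubling_time_h = math.log(2) / mu
print(f"mu = {mu:.3f} 1/h, doubling time = {doubling_time_h:.1f} h")
```

A nonlinear fit on the untransformed counts would weight the late time points more heavily; the log-linear form is used here only because it needs no external library.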

[0022] Figure 9. Canonical (a) and orthogonal, synthetic (b) architectures for biological C1 utilization. a) “Bowtie” architecture of metabolism in which carbon substrates are consolidated into central metabolites from which a host of products can be produced through fermentative and biosynthetic pathways. Metabolic engineering typically operates within this framework by manipulating either one or all of the three components of the bowtie. b) The orthogonal FORCE pathways serve as a platform for both product synthesis and for providing substrates/metabolites for growth. This is an alternative framework to the traditional approach, which feeds all carbon through central metabolism, and from which both products and biomass are derived.

[0023] Figure 10. An alternative FORCE pathway based on dehydration of the 2-hydroxyacyl-CoA and α-reduction. The pathway resembles β-oxidation reversal (β-reduction) (ref. 39). This pathway is also a potential route for the production of unsaturated products. HACL: 2-hydroxyacyl-CoA lyase; HACD: 2-hydroxyacyl-CoA dehydratase; TER: trans-2-enoyl-CoA reductase; ACR: acyl-CoA reductase.

[0024] Figure 11. The impact of NADH/NAD+ ratio on formaldehyde (top) and methanol (bottom) conversion to glycolate or acetate via FORCE pathways.

[0025] Figure 12. The impact of NADH/NAD+ ratio on formaldehyde (top) and methanol (bottom) conversion to glycolate or acetate via FORCE pathways. Termination by hydrolysis of the acyl-CoA to produce sugar acids increases the driving force of the pathway for low numbers of iterations, but the driving forces converge as the number of iterations increases.

[0026] Figure 13. Production of glycolate from formate by E. coli engineered with a formate-activating pathway. Resting cell experiments were performed with a strain expressing CaAbfT and BsmHACL (blue bars) and the corresponding control lacking BsmHACL (orange bars). Cultures (2.5 OD600 = 2.5×10^9 CFU/mL) were incubated at 30°C for 24 hours in 25 mL flasks shaking at 200 rpm using 10 mM formate (plus 1 mM formaldehyde) as carbon source (control cultures with 1 mM formaldehyde and no formate also shown). All data points are shown for n = 6 replicates. Bars are drawn to the mean values.

[0027] Figure 14. Predicted biomass electron and carbon yields from various C1 substrates by the implementation of select pathways enabling methylotrophy. Abbreviations: Formald - formaldehyde, FORCE-Glycerald - FORCE pathway with reactions enabling glyceraldehyde production, RuMP - Ribulose monophosphate pathway, FORCE-Ac - FORCE pathway with reactions enabling acetate production, SACA - Synthetic Acetyl-CoA pathway, FORCE-Glycolate - FORCE pathway with reactions enabling glycolate production. The scenarios in bold correspond to the predicted flux maps illustrated in Fig. 7.

[0028] Figure 15. Paraformaldehyde solubilization rate and resting cell bioconversion with paraformaldehyde. a) Solubilization rate of commercially available paraformaldehyde (pFALD) with different particle sizes. Solubilization rates are measured in 10 mL M9 media in a 25 mL flask at 30°C shaking at 200 rpm. b) Resting cell bioconversion of strains expressing BsmHACL, LmACR and AldA induced with 40 μM cumate and 100 μM IPTG. 3 mg prilled paraformaldehyde is added to 20 mL M9 media (2.5 mM formaldehyde equivalent) in a 25 mL flask at 30°C shaking at 200 rpm. Formaldehyde accumulates only at sub-millimolar concentrations under these conditions.

[0029] Figure 16. Time course profiles for glycolate, formate, and formaldehyde concentration and cell growth of the sensor strain in the two-strain system with 5 mM paraformaldehyde. a) Time course in which the producer strain did not express an HACL. b) Plates from a representative experiment of the time course shown in panel a. c) Time course in which the producer strain expresses HACL. d) Plates from a representative experiment of the time course shown in panel c. 50 μL of cultures (5×10^-3 dilution) at various time points plated on minimal media plates containing 2.5 g/L glycolate. All data is shown for duplicate replicates (n = 2). Lines are drawn to the mean values.
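
For readers relating plate counts to culture titers, a minimal sketch of the usual back-calculation is given below. It assumes that the stated 5×10^-3 dilution means the plated sample contains 0.5% of the original culture concentration, which the description does not state explicitly. The colony count is a hypothetical number, and the function name is illustrative.

```python
def cfu_per_ml(colonies, plated_volume_ul, dilution_factor):
    """Estimate CFU/mL of the undiluted culture from a plate count.

    dilution_factor is the fraction of the original concentration that
    remains in the plated sample (e.g. 5e-3), an assumed interpretation.
    """
    plated_volume_ml = plated_volume_ul / 1000.0
    return colonies / (plated_volume_ml * dilution_factor)


# Hypothetical count of 100 colonies; 50 uL plated at a 5e-3 dilution
# (plating volume and dilution as given in the figure description).
titer = cfu_per_ml(colonies=100, plated_volume_ul=50.0, dilution_factor=5e-3)
print(f"{titer:.2e} CFU/mL")
```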

[0030] Figure 17. Time course profiles for glycolate, formate, and formaldehyde concentration and cell growth of the sensor strain in the two-strain system with 500 mM methanol. a) Time course in which the producer strain did not express an HACL. b) Plates from a representative experiment of the time course shown in panel a. c) Time course in which the producer strain expresses HACL. d) Plates from a representative experiment of the time course shown in panel c. 50 μL of cultures (5×10^-3 dilution) at various time points plated on minimal media plates containing 2.5 g/L glycolate. All data is shown for duplicate replicates (n = 2). Lines are drawn to the mean values.

[0031] Figure 18. Time course profiles for glycolate and formaldehyde concentration in the two-strain system with 1 mM formaldehyde and 10 mM formate. a) Time course in which the producer strain expresses BsmHACL. b) Plates from a representative experiment of the time course shown in panel a. 50 μL of cultures (5×10^-3 dilution) at various time points plated on minimal media plates containing 2.5 g/L glycolate. All data is shown for duplicate replicates (n = 2). Lines are drawn to the mean values.

DETAILED DESCRIPTION

[0032] Before the present disclosure is described in greater detail, it is to be understood that this disclosure is not limited to particular embodiments described, and as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present disclosure will be limited only by the appended claims.

[0033] Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the disclosure. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the disclosure, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the disclosure.

[0034] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present disclosure, the preferred methods and materials are now described.

[0035] All publications and patents cited in this specification are herein incorporated by reference as if each individual publication or patent were specifically and individually indicated to be incorporated by reference and are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited. The citation of any publication is for its disclosure prior to the filing date and should not be construed as an admission that the present disclosure is not entitled to antedate such publication by virtue of prior disclosure. Further, the dates of publication provided could be different from the actual publication dates that may need to be independently confirmed.

[0036] As will be apparent to those of skill in the art upon reading this disclosure, each of the individual embodiments described and illustrated herein has discrete components and features which may be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the present disclosure. Any recited method can be carried out in the order of events recited or in any other order that is logically possible.

[0037] Embodiments of the present disclosure will employ, unless otherwise indicated, techniques of chemistry, biology, and the like, which are within the skill of the art.

[0038] The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to perform the methods and use the probes disclosed and claimed herein. Efforts have been made to ensure accuracy with respect to numbers (e.g., amounts, temperature, etc.), but some errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, temperature is in °C, and pressure is at or near atmospheric. Standard temperature and pressure are defined as 20 °C and 1 atmosphere.

[0039] Before the embodiments of the present disclosure are described in detail, it is to be understood that, unless otherwise indicated, the present disclosure is not limited to particular materials, reagents, reaction materials, manufacturing processes, or the like, as such can vary. It is also to be understood that the terminology used herein is for purposes of describing particular embodiments only, and is not intended to be limiting. It is also possible in the present disclosure that steps can be executed in different sequence where this is logically possible.

[0040] It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise.

[0041] As defined herein, the phrases “recombinant host microorganism”, “genetically engineered host microorganism”, “engineered host microorganism” and “genetically modified host microorganism” may be used interchangeably and refer to host microorganisms that have been genetically modified to (a) express one or more exogenous polynucleotides, (b) over-express one or more endogenous and/or one or more exogenous polynucleotides, such as those included in a vector, or which have an alteration in expression of an endogenous gene or (c) knock-out or down-regulate an endogenous gene. In addition, certain genes may be physically removed from the genome (e.g., knock-outs) or they may be engineered to have reduced, altered or enhanced activity.

[0042] The terms “engineer”, “genetically engineer” or “genetically modify” refer to any manipulation of a microorganism that results in a detectable change in the microorganism, wherein the manipulation includes, but is not limited to, introducing non-native metabolic functionality via heterologous (exogenous) polynucleotides or removing native functionality via polynucleotide deletions, mutations or knock-outs. The term “metabolically engineered” generally involves rational pathway design and assembly of biosynthetic genes (ORFs), genes associated with operons, and control elements of such polynucleotides, for the production of a desired metabolite. “Metabolically engineered” may further include optimization of metabolic flux by regulation and optimization of transcription, translation, protein stability and protein functionality using genetic engineering and appropriate culture conditions, including the reduction of, disruption of, or knocking out of a competing metabolic pathway that competes for an intermediate leading to a desired pathway.

[0043] The phrases “metabolically engineered microorganism” and “modified microorganism” are used interchangeably herein, and refer not only to the particular subject host cell, but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[0044] The term “mutation” as used herein indicates any modification of a nucleic acid and/or polypeptide which results in an altered nucleic acid or polypeptide (i.e., relative to the wild-type nucleic acid or polypeptide sequence). Mutations include, for example, point mutations, substitutions, deletions, or insertions of single or multiple residues in a polynucleotide (or the encoded polypeptide), which includes alterations arising within a protein-encoding region of a gene as well as alterations in regions outside of a protein-encoding sequence, such as, but not limited to, regulatory or promoter sequences. A genetic alteration may be a mutation of any type. For instance, the mutation may constitute a point mutation, a frame-shift mutation, an insertion, or a deletion of part or all of a gene. In certain embodiments, a portion of a genetically modified microorganism's genome may be replaced with one or more heterologous (exogenous) polynucleotides. In some embodiments, the mutations are naturally-occurring. In other embodiments, the mutations are the results of artificial selection pressure. In still other embodiments, the mutations in the microorganism genome are the result of genetic engineering.

[0045] The term “expression” or “expressed” with respect to a gene sequence, an ORF sequence or polynucleotide sequence, refers to transcription of the gene, ORF or polynucleotide and, as appropriate, translation of the resulting mRNA transcript to a protein. Thus, as will be clear from the context, expression of a protein results from transcription and translation of the open reading frame sequence. The level of expression of a desired product in a host microorganism may be determined on the basis of either the amount of corresponding mRNA that is present in the host, or the amount of the desired product encoded by the selected sequence. For example, mRNA transcribed from a selected sequence can be quantitated by PCR or by northern hybridization (see Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, 1989). Protein encoded by a selected sequence can be quantitated by various methods (e.g., by ELISA, by assaying for the biological activity of the protein, or by employing assays that are independent of such activity, such as western blotting or radioimmunoassay, using antibodies that recognize and bind the protein).

[0046] The term “endogenous”, as used herein with reference to polynucleotides (and the polypeptides encoded therein), indicates polynucleotides and polypeptides that are expressed in the organism in which they originated (i.e., they are innate to the organism). In contrast, the terms “heterologous” and “exogenous” are used interchangeably, and as defined herein with reference to polynucleotides (and the polypeptides encoded therein), indicate polynucleotides and polypeptides that are expressed in an organism other than the organism from which they (i.e., the polynucleotide or polypeptide sequences) originated or were derived.

[0047] The term “feedstock” is defined as a raw material or mixture of raw materials supplied to a microorganism, or fermentation process, from which other products can be made. For example, as set forth in the present invention, a methane carbon source or a methanol carbon source or a formaldehyde carbon source, either alone or in combination, are feedstocks for a microorganism that produces a bio-fuel or bio-based chemical in a fermentation process. However, in addition to a feedstock (e.g., a methane substrate) of the invention, the fermentation media contains suitable minerals, salts, cofactors, buffers and other components, known to those skilled in the art, suitable for the growth of the cultures and promotion of the enzymatic pathways necessary for multi-carbon compound production.

[0048] The term “substrate” refers to any substance or compound that is converted, or meant to be converted, into another compound by the action of an enzyme. The term includes not only a single compound, but also combinations of compounds, such as solutions, mixtures and other materials which contain at least one substrate, or derivatives thereof. Further, the term “substrate” encompasses not only compounds that provide a carbon source suitable for use as a starting material (e.g., methane), but also intermediate and end product metabolites used in a pathway associated with a metabolically engineered microorganism as described herein.

[0049] The term "native multi-carbon substrate" as used herein refers to multi-carbon compounds that serve as a growth substrate or metabolite that enables growth of a microorganism.

[0050] The term “fermentation” or “fermentation process” is defined as a process in which a host microorganism is cultivated in a culture medium containing raw materials, such as feedstock and nutrients, wherein the microorganism converts raw materials, such as a feedstock, into products.

[0051] The term “polynucleotide” is used herein interchangeably with the term “nucleic acid” and refers to an organic polymer composed of two or more monomers including nucleotides, nucleosides or analogs thereof, including but not limited to single stranded or double stranded, sense or antisense deoxyribonucleic acid (DNA) of any length and, where appropriate, single stranded or double stranded, sense or antisense ribonucleic acid (RNA) of any length, including siRNA. The term “nucleotide” refers to any of several compounds that consist of a ribose or deoxyribose sugar joined to a purine or a pyrimidine base and to a phosphate group, and that are the basic structural units of nucleic acids. The term “nucleoside” refers to a compound (as guanosine or adenosine) that consists of a purine or pyrimidine base combined with deoxyribose or ribose and is found especially in nucleic acids. The term “nucleotide analog” or “nucleoside analog” refers, respectively, to a nucleotide or nucleoside in which one or more individual atoms have been replaced with a different atom or with a different functional group. Accordingly, the term polynucleotide includes nucleic acids of any length, including DNA, RNA, ORFs, analogs and fragments thereof.

[0052] As defined herein, the term “open reading frame” (hereinafter, “ORF”) means a nucleic acid or nucleic acid sequence (whether naturally occurring, non-naturally occurring, or synthetic) comprising an uninterrupted reading frame consisting of (i) an initiation codon, (ii) a series of two (2) or more codons representing amino acids, and (iii) a termination codon, the ORF being read (or translated) in the 5' to 3' direction.
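
The ORF definition above can be expressed as a simple check on a DNA string written 5' to 3'. The sketch below is illustrative only: it assumes the standard genetic code (termination codons TAA, TAG and TGA) and, by default, ATG as the only initiation codon, although some hosts also use alternative initiation codons. The function name and parameters are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def is_orf(seq, start_codons=("ATG",), min_sense_codons=2):
    """Check a 5'->3' DNA string against the ORF definition:
    (i) initiation codon, (ii) two or more amino acid codons,
    (iii) termination codon, with no interruption in between.
    """
    seq = seq.upper()
    if len(seq) % 3 != 0 or not set(seq) <= set("ACGT"):
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if len(codons) < min_sense_codons + 2:
        return False
    if codons[0] not in start_codons or codons[-1] not in STOP_CODONS:
        return False
    # "Uninterrupted": no termination codon before the final codon.
    return not any(c in STOP_CODONS for c in codons[1:-1])


print(is_orf("ATGGCTAAATAA"))   # start + 2 sense codons + stop
print(is_orf("ATGGCTTAA"))      # only one sense codon after the start
print(is_orf("ATGTAAGCTTAA"))   # interrupted by an internal stop
```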

[0053] It is understood that the polynucleotides described herein include “genes” and that the nucleic acid molecules described herein include “vectors” or “plasmids”. Accordingly, the term “gene”, refers to a polynucleotide that codes for a particular sequence of amino acids, which comprise all or part of one or more proteins or enzymes, and may include regulatory (non-transcribed) DNA sequences, such as promoter sequences, which determine for example the conditions under which the gene is expressed. The transcribed region of the gene may include untranslated regions, including introns, 5'-untranslated region (UTR), and 3'-UTR, as well as the coding sequence.

[0054] The term “promoter” refers to a nucleic acid sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3' to a promoter sequence. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic nucleic acid segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental or physiological conditions. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as “constitutive promoters”. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.

[0055] The term “operably linked” refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of effecting the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.

[0056] The term “codon-optimized” as it refers to genes or coding regions of nucleic acid molecules (or ORFs) for transformation of various hosts, refers to the alteration of codons in the gene or coding regions of the nucleic acid molecules to reflect the typical codon usage of the host organism without altering the polypeptide encoded by the DNA.
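
A minimal sketch of codon optimization in the sense defined above is shown below: each codon is replaced by a host-preferred synonymous codon, and the encoded polypeptide is verified to be unchanged. The translation table is the standard genetic code. The preference table passed in the example is an illustrative placeholder, not a codon usage table taken from the disclosure, and the function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code; '*' marks a termination codon.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence (length divisible by 3) codon by codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna, preferred):
    """Swap each codon for the host-preferred synonym, if one is listed.

    `preferred` maps one-letter amino acid codes to a single codon.
    Amino acids missing from the map keep their original codon.
    """
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    for aa, codon in preferred.items():
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    optimized = "".join(
        preferred.get(CODON_TABLE[dna[i:i + 3]], dna[i:i + 3])
        for i in range(0, len(dna), 3)
    )
    # The defining constraint: the encoded polypeptide must not change.
    assert translate(optimized) == translate(dna)
    return optimized


# Placeholder preference table for illustration only.
host_preference = {"L": "CTG", "A": "GCG", "K": "AAA"}
original = "ATGTTAGCTAAGTAA"
optimized = codon_optimize(original, host_preference)
print(original, "->", optimized, translate(optimized))
```

Practical tools usually weight codons by usage frequency and also avoid unwanted motifs; the one-codon-per-amino-acid rule here is the simplest variant that satisfies the definition.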

[0057] The term “operon” refers to two or more genes which are transcribed as a single transcriptional unit from a common promoter. In certain embodiments, the genes, polynucleotides or ORFs comprising the operon are contiguous genes. It is understood that transcription of an entire operon can be modified (i.e., increased, decreased, or eliminated) by modifying the common promoter. Alternatively, any gene, polynucleotide or ORF, or any combination thereof in an operon can be modified to alter the function or activity of the encoded polypeptide. The modification can result in an increase or a decrease in the activity or function of the encoded polypeptide. Further, the modification can impart new activities on the encoded polypeptide.

[0058] A “vector” is any means by which a nucleic acid can be propagated and/or transferred between organisms, cells, or cellular components. Vectors include viruses, bacteriophage, pro-viruses, plasmids, phagemids, transposons, and artificial chromosomes such as YACs (yeast artificial chromosomes), BACs (bacterial artificial chromosomes), and PLACs (plant artificial chromosomes), and the like, that are “episomes”, that is, that replicate autonomously or can integrate into a chromosome of a host microorganism. A vector can also be a naked RNA polynucleotide, a naked DNA polynucleotide, a polynucleotide composed of both DNA and RNA within the same strand, a poly-lysine-conjugated DNA or RNA, a peptide-conjugated DNA or RNA, a liposome-conjugated DNA, or the like, that are not episomal in nature, or it can be an organism which comprises one or more of the above polynucleotide constructs such as an agrobacterium or a bacterium.

[0059] The term “homolog”, as used with respect to an original enzyme, polypeptide, gene or polynucleotide (or ORF encoding the same) of a first family or species, refers to distinct enzymes, genes or polynucleotides of a second family or species, which are determined by functional, structural or genomic analyses to be an enzyme, gene or polynucleotide of the second family or species, which corresponds to the original enzyme or gene of the first family or species. Most often, “homologs” will have functional, structural or genomic similarities. Techniques are known by which homologs of an enzyme, gene or polynucleotide can readily be cloned using genetic probes and PCR. Identity of cloned sequences as “homologs” can be confirmed using functional assays and/or by genomic mapping of the genes.

[0060] A polypeptide (or protein or enzyme) has “homology” or is “homologous” to a second polypeptide if the nucleic acid sequence that encodes the polypeptide has a similar sequence to the nucleic acid sequence that encodes the second polypeptide. Alternatively, a polypeptide has homology to a second polypeptide if the two proteins have “similar” amino acid sequences. Thus, the terms “homologous proteins” or “homologous polypeptides” are defined to mean that the two polypeptides have similar amino acid sequences. In certain embodiments of the invention, polynucleotides and polypeptides homologous to one or more polynucleotides and/or polypeptides set forth in Table 1 may be readily identified using methods known in the art for sequence analysis and comparison.

[0061] The term "CoA" as used herein refers to coenzyme A.

[0062] A homologous polynucleotide or polypeptide sequence of the invention may also be determined or identified by BLAST analysis (Basic Local Alignment Search Tool) or similar bioinformatic tools, which compare a query nucleotide or polypeptide sequence to a database of known sequences. For example, a search analysis may be done using BLAST to determine sequence identity or similarity to previously published sequences, and if the sequence has not yet been published, can give relevant insight into the function of the DNA or protein sequence.
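
As a simple illustration of the sequence identity such tools report, the sketch below computes percent identity for two sequences that have already been aligned. It is not an implementation of BLAST, which also performs the database search and the local alignment itself. Identity is defined here over all alignment columns, including gap columns, which is one of several conventions in use. The aligned peptide strings are made-up examples and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the
    same residue. Inputs must be pre-aligned and of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap.
    columns = [
        (a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Made-up aligned peptides: 4 identical columns out of 6.
identity = percent_identity("MKT-AL", "MKSQAL")
print(f"{identity:.1f}% identity")
```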

[0063] The current invention provides systems and microorganisms engineered to endow them with pathways that enable growth on C1 substrates (without said engineered/synthetic pathways, said microorganisms are not able to grow on any C1 substrate). In some embodiments, the system comprises a C1 substrate and modified organisms capable of growth on C1 substrates. This invention provides systems, organisms, and methods of conversion of C1 substrates to cells (i.e., growth on C1 substrates). The Examples and Figures 7-9, 14, and 16-18 (and material that relates to these figures) demonstrate growth on C1 substrates.

[0064] In various embodiments, the invention provides for the single carbon (C1) compound serving as a source of both energy and carbon for the organism. Single carbon molecules of various reduction levels are interconverted to produce formyl-CoA, the single carbon unit used to extend a carbon backbone. Systems and methods for bioconversion of C1 feedstocks based on the use of formate acyltransferases are described in WO 2017/210381, which is incorporated by reference for these teachings. In contrast, the disclosed system uses formyl-CoA as the C1 building block or elongation unit in a reaction catalyzed by 2-hydroxyacyl-CoA lyase (HACL). This approach is both simpler in design (fewer overall reaction steps) and in practice (increased oxygen tolerance).

[0065] In some embodiments, single carbon (C1) molecules are the solely supplied carbon source. In these situations, a one-carbon acyl-CoA, formyl-CoA, is produced. In some embodiments, formate can be converted to formyl-CoA either directly by a suitable acetyl-CoA synthetase or through the intermediate formyl-phosphate by a suitable formate kinase and phosphate acetyl-transferase. Formaldehyde can also be converted to formyl-CoA by a suitable acyl-CoA reductase.

[0066] Combinations of the above reactions can be used to generate formyl-CoA from other single carbon molecules. For example, an implementation that makes use of methane would include the expression of a methane monooxygenase, a methanol dehydrogenase, and an acyl-CoA reductase. Even more combinations of the described reactions and accompanying enzymes can be used to allow for implementations that use a mixture of single carbon units, for example a combination of methane and carbon dioxide through all of the described reactions. At a minimum, this function can be accomplished from either formaldehyde, by the expression of an acylating aldehyde dehydrogenase, or from formate, by a suitable acetyl-CoA synthetase or combined formate kinase and phosphate acetyl-transferase.

[0067] Therefore, disclosed herein is a method for enabling a heterotroph to utilize single carbon (C1) substrates (e.g. methane, methanol, carbon dioxide, formate, formaldehyde) for growth comprising the steps of providing a cell system containing a first set of metabolic enzymes to convert the single carbon substrate to formyl-CoA and formaldehyde, a second set of metabolic enzymes to elongate aldoses or aldehydes (including the produced formaldehyde) with the formyl-CoA molecules, feeding the system a C1 substrate under suitable conditions for the metabolic enzymes to produce aldehyde or aldose intermediates of desired carbon lengths, and optionally providing a third set of metabolic enzymes to convert the aldehyde or aldose intermediates into the desired multi-carbon chemical.

[0068] The first step in the disclosed systems and methods is the conversion of the single carbon substrate (e.g. methane, methanol, carbon dioxide, formate, formaldehyde) into formyl-CoA and formaldehyde. This step is referred to herein as C1 activation.

[0069] In general, the conversion of methane (CH4) to formaldehyde (H2C=O) and formyl-CoA requires at least the following three steps: (1) the methane (CH4) substrate is first oxidized to methanol (CH3OH) via a methane monooxygenase (MMO; EC 1.14.13.25), (2) the methanol (CH3OH) is then oxidized to formaldehyde (H2C=O) via a methanol dehydrogenase (MDH; E.C. 1.1.1.244, 1.1.2.7), and (3) some of the formaldehyde (H2C=O) is oxidized to formyl-CoA via an acyl-CoA reductase (ACR). Exemplary acyl-CoA reductases or acylating aldehyde dehydrogenases include fatty acyl-CoA reductase (EC 1.2.1.84), succinyl-CoA reductase (EC 1.2.1.76), acetyl-CoA reductase, butyryl-CoA reductase, and propionyl-CoA reductase (EC 1.2.1.10).

[0070] In general, the conversion of carbon dioxide (CO2) to formyl-CoA first requires that the CO2 substrate be reduced to formate (HCOO-) via formate dehydrogenase (E.C. 1.2.1.2). The produced formate can then be converted to formyl-CoA by one of three pathways. In some embodiments, the formate is converted to formyl-CoA via acyl-CoA synthetase (ACS; E.C. 6.2.1.1). In some embodiments, the formate is converted to formaldehyde (H2C=O) by formaldehyde dehydrogenase (FaldDH; E.C. 1.2.1.46), which is then oxidized to formyl-CoA via acyl-CoA reductase (ACR; E.C. 1.2.1.-, e.g. 1.2.1.10, 1.2.1.76, 1.2.1.84). In some embodiments, the formate is converted to formyl-phosphate via formate kinase (FOK; E.C. 2.7.2.6), which is then converted to formyl-CoA via phosphotransacylase (PTA; EC 2.3.1.8).
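The C1 activation routes described above can be sketched as a small reaction graph and enumerated programmatically. The following is a minimal illustration only: the reaction list is transcribed from the text, while the function name `routes`, the data layout, and the abbreviation "FDH" for formate dehydrogenase are assumptions made for this sketch.

```python
from collections import defaultdict

# (substrate, product, enzyme) triples transcribed from the C1 activation text.
# "FDH" is a label assumed here for formate dehydrogenase.
REACTIONS = [
    ("methane", "methanol", "MMO"),
    ("methanol", "formaldehyde", "MDH"),
    ("formaldehyde", "formyl-CoA", "ACR"),
    ("CO2", "formate", "FDH"),
    ("formate", "formyl-CoA", "ACS"),
    ("formate", "formaldehyde", "FaldDH"),
    ("formate", "formyl-phosphate", "FOK"),
    ("formyl-phosphate", "formyl-CoA", "PTA"),
]


def routes(start, target="formyl-CoA"):
    """Enumerate every enzyme sequence leading from `start` to `target`."""
    graph = defaultdict(list)
    for substrate, product, enzyme in REACTIONS:
        graph[substrate].append((product, enzyme))

    found = []

    def walk(node, path, seen):
        if node == target:
            found.append(path)
            return
        for nxt, enzyme in graph[node]:
            if nxt not in seen:  # avoid revisiting a metabolite
                walk(nxt, path + [enzyme], seen | {nxt})

    walk(start, [], {start})
    return found


if __name__ == "__main__":
    for c1 in ("methane", "methanol", "formaldehyde", "CO2", "formate"):
        print(c1, routes(c1))
```

Under these assumptions, methane yields the single three-step route (MMO, MDH, ACR), and CO2 yields the three formate-dependent routes described in the preceding paragraph.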

[0071] As disclosed herein, formyl-CoA can be used as the C1 building block or elongation unit in a reaction catalyzed by a 2-hydroxyacyl-CoA lyase (HACL) or an oxalyl-CoA decarboxylase (OXC; E.C. 4.1.1.8). These enzymes can ligate formyl-CoA with a variety of carbonyl-containing acceptors of broad chain length and functionalization, including the C1 compound formaldehyde. Therefore, disclosed herein are reaction pathways that convert the product of the HACL-catalyzed reaction, a 2-hydroxyacyl-CoA, to an aldehyde that can be further extended by formyl-CoA.

[0072] In some embodiments, the 2-hydroxyacyl-CoA is reduced to a 2-hydroxyaldehyde by an acyl-CoA reductase (ACR; E.C. 1.2.1.-, e.g. 1.2.1.10, 1.2.1.76, 1.2.1.84). Further ligation of the 2-hydroxyaldehydes with formyl-CoA by HACL gives polyhydroxyacyl-CoAs and further polyhydroxyaldehydes, commonly known as aldoses. Polyhydroxyaldehydes can in principle serve as substrates of the HACL-catalyzed reaction, which is referred to herein as “aldose elongation.”

[0073] Further reduction of the 2-hydroxyaldehyde to give a 1,2-diol is possible by a suitable 1,2-diol oxidoreductase (DOR; E.C. 1.1.1.77) or alcohol dehydrogenase (ADH; E.C. 1.1.1.71). In some embodiments, the DOR is E. coli FucO.

[0074] Escherichia coli FucO is described in Pereira, B. et al. Metab. Eng. 34, 80-87 (2016), which is incorporated by reference for the teaching of this enzyme. Bacteroides thetaiotaomicron RhaO is described in Patel, E.H., et al. Res. Microbiol. 159, 678-684 (2008), which is incorporated by reference for the teaching of this enzyme. Clostridium sphenoides DOR is described in Tran-Din, K., et al. Arch. Microbiol. 142, 87-92 (1985), which is incorporated by reference for the teaching of this enzyme. Microcyclus eburneus DOR is described in Kawagishi, T., et al. Agric. Biol. Chem. 44, 949-950 (1980), which is incorporated by reference for the teaching of this enzyme. Paenibacillus macerans DOR is described in Weimer, P.J. Appl. Environ. Microbiol. 47, 263-267 (1984), which is incorporated by reference for the teaching of this enzyme.

[0075] Dehydration of the 1,2-diol can be catalyzed by the activity of diol dehydratase (DDR; E.C. 4.2.1.28) to give an aldehyde. Further elongation of the aldehyde by formyl-CoA, which is referred to herein as “aldehyde elongation,” results in the extension of an alkyl chain, analogous to the fatty acid biosynthesis or reverse β-oxidation pathways.

[0076] In some embodiments, a combination of the above routes can be implemented at the same time such that for some molecules, elongation takes place through aldose elongation, whereas for other molecules, elongation takes place through aldehyde elongation. Both routes can be present simultaneously in the same system.
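The two elongation modes described above can be traced with a short sketch. It is illustrative only: it assumes formaldehyde (one carbon) as the starting acceptor, and the function names `force_cycle` and `elongate` and the generic intermediate labels are assumptions for this example. Each round adds one carbon through HACL, and the aldehyde mode appends the DOR and DDR steps before the next ligation.

```python
def force_cycle(carbons, mode):
    """Return the (enzyme, intermediate) steps of one elongation round.

    `carbons` is the chain length of the acceptor entering the round;
    `mode` is "aldose" or "aldehyde".
    """
    n = carbons + 1  # HACL ligation with formyl-CoA adds one carbon
    steps = [
        ("HACL", f"C{n} 2-hydroxyacyl-CoA"),
        ("ACR", f"C{n} 2-hydroxyaldehyde"),
    ]
    if mode == "aldehyde":
        # Reduction to the diol, then dehydration back to an aldehyde
        steps += [("DOR", f"C{n} 1,2-diol"), ("DDR", f"C{n} aldehyde")]
    elif mode != "aldose":
        raise ValueError(f"unknown elongation mode: {mode}")
    return steps


def elongate(rounds, mode):
    """Iterate from formaldehyde (C1); return final chain length and trace."""
    carbons = 1
    trace = []
    for _ in range(rounds):
        trace.extend(force_cycle(carbons, mode))
        carbons += 1
    return carbons, trace


if __name__ == "__main__":
    for mode in ("aldose", "aldehyde"):
        length, trace = elongate(2, mode)
        print(mode, "-> C%d" % length)
        for enzyme, intermediate in trace:
            print("   ", enzyme, "->", intermediate)
```

In this sketch one aldehyde-mode round ends at a C2 aldehyde (acetaldehyde), whereas two aldose-mode rounds end at a C3 2-hydroxyaldehyde (glyceraldehyde).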

[0077] In some embodiments, intermediates of the above reactions serve as metabolites for the growth of the microorganism. In other embodiments, the intermediates of the above reactions serve as precursors to or are converted to growth substrates of the microorganism. Examples of these products are highlighted in FIG. 2 and include ketoacids, hydroxyacids, aldehydes, diols and polyols.

[0078] In some embodiments, the described pathways are provided within the context of a microbial host. In some embodiments, the microbial host is cultured in a fermentation system to produce the multi-carbon molecules. In other embodiments, a microbial system is used to produce the enzymes, which are then extracted from the microbes for use in a cell-free system. In other embodiments, the enzymes are produced separately and individually added to the system.

[0079] In some embodiments, the microbial system is comprised of more than one engineered microbial host, where the functions of C1 utilization, biomass production, and product synthesis are divided into multiple organisms, which are cultured in a fermentation system known as a coculture and wherein the overall result of the coculture is the conversion of C1 substrates into biomass and/or chemical products.

[0080] The pathway in a living system is generally made by transforming the microbe with one or more expression vector(s) containing a gene encoding one or more of the enzymes, but the genes can also be added to the chromosome by recombinant engineering, homologous recombination, gene editing, and similar techniques. Where the needed protein is endogenous, as is the case in some instances, it may suffice as is, but is usually overexpressed for better functionality and control over the level of active enzyme. In some embodiments, one or more, or all, such genes are under the control of an inducible promoter.

[0081] The enzymes can be added to the genome or via expression vectors, as desired. Preferably, multiple enzymes are expressed in one vector or multiple enzymes can be combined into one operon by adding the needed signals between coding regions. Further improvements can be had by overexpressing one or more, or even all of the enzymes, e.g., by adding extra copies to the cell via plasmid or other vector. Initial experiments may employ expression plasmids hosting 3 or more ORFs for convenience, but it may be preferred to insert operons or individual genes into the genome for stability reasons.

[0082] Still further improvements in yield can be had by reducing or suppressing competing pathways, such as those pathways for making e.g., acetate, formate, ethanol, and lactate, and it is already well known in the art how to reduce or knock out these pathways. See e.g., U.S. Patent Nos. 7,569,380, 7,262,046, 8,962,272, 8,795,991, 8,129,157, and 8,691,552, each incorporated by reference herein in its entirety for all purposes. Many others have worked in this area as well.

[0083] Following the construction of a suitable strain containing the engineered pathway, culturing of the developed strains can be performed to evaluate the effectiveness of the pathway at its intended goal: the production of products from single carbon compounds. The organism can be cultured in a suitable growth medium, and can be evaluated for product formation on single carbon substrates, from methane to CO2, either alone or in combination with multi-carbon molecules. The amount of products produced by the organism can be measured by ultra performance liquid chromatography (UPLC) or gas chromatography (GC), and indicators of performance such as growth rate, productivity, titer, yield, or carbon efficiency can be determined.
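The performance indicators named above can be computed from chromatographic measurements along the following lines. This is a sketch under stated assumptions: the definitions used (titer as final product concentration, productivity as titer per hour, yield as moles of product per mole of substrate consumed, carbon efficiency as product carbon per substrate carbon consumed) are common conventions rather than definitions given in this disclosure, and the numbers in the usage example are hypothetical.

```python
def performance(product_mM, substrate_consumed_mM, hours,
                product_carbons, substrate_carbons=1):
    """Compute simple performance indicators for a C1 conversion.

    All definitions here are assumed conventions for illustration.
    """
    if substrate_consumed_mM <= 0 or hours <= 0:
        raise ValueError("substrate consumed and time must be positive")
    return {
        "titer_mM": product_mM,
        "productivity_mM_per_h": product_mM / hours,
        "molar_yield": product_mM / substrate_consumed_mM,
        # Fraction of consumed substrate carbon recovered in the product
        "carbon_efficiency": (product_mM * product_carbons)
        / (substrate_consumed_mM * substrate_carbons),
    }


if __name__ == "__main__":
    # Hypothetical values: 2 mM of a C2 product from 10 mM C1 substrate in 24 h
    print(performance(2.0, 10.0, 24.0, product_carbons=2))
```

With these hypothetical inputs the molar yield is 0.2 and the carbon efficiency is 0.4, since each C2 product molecule accounts for two consumed C1 units.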

[0084] Further evaluation of the interaction of the pathway enzymes with each other and with the host system can allow for the optimization of pathway performance and minimization of deleterious effects. Because the pathway is under synthetic control, rather than under the organism's natively evolved regulatory mechanisms, the expression of the pathway is usually manually tuned to avoid potential issues that slow cell growth or production and to optimize production of desired compounds.

[0085] Additionally, an imbalance in relative enzyme activities might restrict overall carbon flux throughout the pathway, leading to suboptimal production rates and the buildup of pathway intermediates, which can inhibit pathway enzymes or be cytotoxic. Analysis of the cell cultures by high performance liquid chromatography (HPLC) or GC can reveal the metabolic intermediates produced by the constructed strains. This information can point to potential pathway issues.

[0086] As an alternative to the in vivo expression of the pathway, a cell free in vitro version of the pathway can be constructed. By purifying the relevant enzyme for each reaction step, the overall pathway can be assembled by combining the necessary enzymes in a reaction mixture. With the addition of the relevant cofactors and substrates, the pathway can be assessed for its performance independently of a host.

[0087] In some embodiments, single carbon molecules, such as carbon dioxide, formate, formaldehyde, methanol, methane, and carbon monoxide, are solely used in the production of products containing at least one carboxyl group. In this embodiment, both formaldehyde and formyl-CoA are produced from single carbon molecules as described earlier.

[0088] General methods for gene synthesis and DNA cloning, as well as vector and plasmid construction, are well known in the art, and are described in a number of publications. More specifically, techniques such as digestion and ligation-based cloning, as well as in vitro and in vivo recombination methods, can be used to assemble DNA fragments encoding a polypeptide that catalyzes a substrate to product conversion into a suitable vector. These methods include restriction digest cloning, sequence- and ligation-independent cloning (SLIC), Golden Gate cloning, Gibson assembly, and the like. Some of these methods can be automated and miniaturized for high-throughput applications.

[0089] Gene cassettes for expressing an engineered metabolic pathway in a host microorganism are known in the art. The cassette can comprise one or more open reading frames (ORFs) which encode the enzymes of the introduced pathway, a promoter for directing transcription of the downstream ORF(s) within the operon, ribosome binding sites for directing translation of the mRNAs encoded by the individual ORF(s), and a transcriptional terminator sequence. Due to the modular nature of the various components of the expression cassette, one can create combinatorial permutations of these arrangements by substituting different components at one or more of the positions. One can also reverse the orientation of one or more of the ORFs to determine whether any of these alternate orientations improve the product yield.
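The combinatorial space created by swapping cassette components and reversing ORF orientation can be enumerated as sketched below. The part names (`P1`, `RBS_a`, `T1`, and so on) are hypothetical placeholders, and the function `cassette_designs` is an assumption for this example rather than a tool described in this disclosure.

```python
from itertools import product


def cassette_designs(promoters, rbs_sites, orfs, terminators):
    """Yield every cassette built from the given part libraries.

    Each ORF keeps its position but gets an independently chosen
    ribosome binding site and orientation ("+" forward, "-" reversed).
    """
    n = len(orfs)
    for promoter, rbs_combo, strands, terminator in product(
        promoters,
        product(rbs_sites, repeat=n),   # one RBS choice per ORF
        product("+-", repeat=n),        # one orientation per ORF
        terminators,
    ):
        yield {
            "promoter": promoter,
            "orfs": list(zip(rbs_combo, orfs, strands)),
            "terminator": terminator,
        }


if __name__ == "__main__":
    designs = list(
        cassette_designs(
            promoters=["P1", "P2"],          # hypothetical part names
            rbs_sites=["RBS_a", "RBS_b"],
            orfs=["HACL", "ACR"],
            terminators=["T1"],
        )
    )
    print(len(designs), "designs; first:", designs[0])
```

For two promoters, two ribosome binding sites, two ORFs, and one terminator, this enumeration gives 2 x 4 x 4 x 1 = 32 candidate cassettes, which illustrates how quickly the design space grows as parts are added.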

[0090] In some embodiments, the host microorganism for expressing the plasmid is a methanotroph, and plasmid vector(s) containing the metabolic pathway expression cassettes are mobilized into these organisms via conjugation.

[0091] In an alternative method for expressing metabolic pathway genes in a microbial host, the biosynthetic pathway genes can be inserted directly into the chromosome. Methods for chromosomal modification include both non-targeted and targeted deletions and insertions.

[0092] In some embodiments, the disclosed systems and methods also involve recovering and purifying the desired product from the fermentation broth. The method to be used depends on the physico-chemical properties of the product and the nature and composition of the fermentation medium and cells. For example, U.S. Pat. No. 8,101,808 describes methods for recovering C3-C6 alcohols from fermentation broth using continuous flash evaporation and phase separation processing. In some embodiments, solids may be removed from the fermentation medium by centrifugation, filtration, or decantation. In some embodiments, the multi-carbon compounds are isolated from the fermentation medium using methods such as distillation, azeotropic distillation, liquid-liquid extraction, adsorption, gas stripping, membrane evaporation, or pervaporation.

[0093] For longer-chain alcohols, such as fatty alcohols, U.S. Pat. No. 8,268,599 describes methods for separating these components from the aqueous phase of the fermentation by bi-phasic separation, whereby the immiscibility of the product compounds with the fermentation broth allows the organic phase to be collected and removed. This separation can also reduce the toxic effects of the product on the host microbial cells. U.S. Publication No. 2007/0251141 describes methods for recovering fatty acid methyl esters (FAMEs) from a liquid suspension by adding urea and creating a phase separation whereby the saturated and unsaturated FAMEs can be recovered separately. Membrane separation methods can also be applied to purifying fatty acid ester products such as biodiesel.

[0094] In certain embodiments, a methane substrate of the invention is provided or obtained from a natural gas source, wherein the natural gas is “wet” natural gas or “dry” natural gas. Natural gas is referred to as “dry” natural gas when it is almost pure methane, having had most of the other commonly associated hydrocarbons removed. When other hydrocarbons are present, the natural gas is referred to as “wet”. Wet natural gas typically comprises about 70-90% methane, about 0-20% ethane, propane and butane (combined total), about 0-8% CO2, about 0-5% N2, about 0-5% H2S and trace amounts of oxygen, helium, argon, neon and xenon. In certain other embodiments, a methane substrate of the invention is provided or obtained from methane emissions, or methane off-gases, which are generated by a variety of natural and human-influenced processes, including anaerobic decomposition in solid waste landfills, enteric fermentation in ruminant animals, organic solids decomposition in digesters and wastewater treatment operations, and methane leakage in fossil fuel recovery, transport, and processing systems.

[0095] A number of embodiments of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.

EXAMPLES

Example 1:

[0096] This Example investigates an alternative to canonical C1 metabolism that is orthogonal (Pandit, A.V., et al. Nat. Commun. 8:1-11 (2017)) to the central metabolic processes of the host organism and that is based on the newly discovered application of formyl-CoA as a C1 elongation unit by HACL. Rather than the “bowtie” architecture of metabolism (FIG. 9A), in which central metabolic processes are the major branch point between product synthesis and growth, the orthogonal architecture allows for product synthesis independently from the host central and product synthesis pathways (FIG. 1A). This type of metabolic architecture relies on the ability to produce carbon skeletons necessary for diverse product synthesis directly from C1 substrates. This Example reports the conceptualization and design of biochemical pathways enabling this orthogonal architecture, based on formyl-CoA elongation (FORCE) pathways, provides analysis of their feasibility and performance, and demonstrates their function in prototype systems. FORCE pathways can serve as the basis for both bioproduct synthesis and C1-trophy via the production of growth substrates native to the microbial host organism. This Example provides systems comprising a one carbon substrate capable of growing microorganisms that were not previously able to grow on such medium. This growth on C1 substrates is novel and results from a system design that utilizes the one carbon substrate as the only energy source. Figures 7-9, 14, and 16-18 (and material that relates to these figures) describe in detail growth on C1 substrates.

[0097] Results

[0098] Design of an orthogonal metabolic architecture based on C1 utilization and product synthesis

[0099] The orthogonal metabolic architecture developed here has three primary features: 1) activation of C1 substrates into a suitable building block for carbon chain elongation; 2) iterative elongation of a carbon chain by one carbon per cycle; and 3) termination of the pathway resulting in accumulation of the product of interest. Based on our previous findings (Chou, A., et al. Nat. Chem. Biol. 15:900-906 (2019)), we evaluated whether a design could be conceived and implemented based on the use of formyl-CoA.

[0100] The role of formyl-CoA in metabolism is most well-established in the degradation of multi-carbon compounds and reports of the generation of formyl-CoA from C1 molecules are sparse. Acyl-CoAs, though, are a convenient intermediate between the carboxylate and aldehyde forms enabling formyl-CoA generation from both oxidized and reduced C1 substrates (Fig. 1b: one-carbon activation panel). From formaldehyde, formyl-CoA can be produced via acyl-CoA reductase (ACR) 16 activity, and methanol oxidation to formaldehyde by methanol dehydrogenase (MDH) has been the subject of numerous studies 18-20. Formyl-CoA may be produced from formate by CoA transferases 21 or CoA ligases, such as the promiscuous activity of Escherichia coli acetyl-CoA synthetase (EcACS) 6. While the latter is AMP forming (consuming 2 ATP equivalents), evidence of an ADP forming route exists via the intermediate formyl-phosphate through formate kinase (FOK) and phosphotransacylase (PTA) 22. ATP-independent conversion of formate to formyl-CoA via reduction of formate to formaldehyde by formaldehyde dehydrogenase (FaldDH) is also possible 23, albeit thermodynamically challenging (Fig. 1b). Furthermore, CO2 can be converted to formate by the reverse activity of formate dehydrogenase (or carbon dioxide reductase) 24, 25 and methane to methanol by methane monooxygenase 26, which when coupled to the reactions described above can lead to formyl-CoA formation.

[0101] The orthogonal, de novo construction of diverse carbon skeletons by C1 elongation necessitates an iterative pathway similar to those found in nature that construct carbon skeletons from C2-C5 metabolites 27, yet existing outside of central metabolism. Because 2-hydroxyacyl-CoA lyase (HACL) has broad carbon chain length specificity 16, it is a good candidate for establishing an iterative pathway. We evaluated reaction pathways potentially enabling iteration by converting the product of the HACL-catalyzed reaction, 2-hydroxyacyl-CoA, to an aldehyde that can be further extended by formyl-CoA. At the α-carbon, dehydration is possible, transforming the 2-hydroxyacyl-CoA to a 2-enoyl-CoA 28 (Figure 10) similar to the well-established acrylate pathway 29. 2-enoyl-CoA generation is also convenient as these intermediates are involved in β-oxidation, potentially allowing the use of the enzymatic toolkit and knowledge established for the β-oxidation reversal platform 30-32. Dehydration of 2-hydroxyacyl-CoA, however, is much more challenging than dehydration of 3-hydroxyacyl-CoA, thus requiring an oxygen-sensitive radical mechanism 33. It also requires the existence of a β-carbon, thus restricting pathway implementation to intermediates of 3 carbons or larger.

[0102] Due to these issues, we investigated transformations of the thioester. Reduction of the CoA-thioester gives a 2-hydroxyaldehyde (Fig. 1b: formyl-CoA elongation panel), which is possible due to the non-specific activity of certain acyl-CoA reductases (ACRs) 16. Ligation of 2-hydroxyaldehydes with formyl-CoA by HACL gives polyhydroxyacyl-CoAs and further polyhydroxyaldehydes, commonly known as aldoses. Polyhydroxyaldehydes can in principle serve as substrates of the HACL-catalyzed reaction, which we refer to as aldose elongation (Fig. 1b: formyl-CoA elongation panel).

[0103] Reduction of the 2-hydroxyaldehyde via diol oxidoreductase (DOR) activity to give a 1,2-diol is also possible. For example, E. coli FucO catalyzes the interconversion of 1,2-diols with 2-hydroxyaldehydes 34. 1,2-diol dehydration to an aldehyde can be catalyzed by the activity of diol dehydratase (DDR), effectively accomplishing α-reduction. While diol dehydration also requires a radical mechanism, the B12-dependent DDR is oxygen tolerant and has been the subject of numerous protein and metabolic engineering studies 35-37. Elongation of this aldehyde by formyl-CoA, which we refer to as aldehyde elongation, enables extension of an alkyl chain, analogous to the two-carbon elongation in fatty acid biosynthesis 38 or reverse β-oxidation 39 pathways. We collectively refer to these pathways (aldose elongation, α-reduction, and aldehyde elongation) as formyl-CoA elongation (FORCE) pathways, as they facilitate the use of formyl-CoA as a carbon chain elongation unit (Fig. 1b: formyl-CoA elongation panel).

[0104] Various product classes can be produced as intermediates or from derivatives of intermediates of FORCE pathways (Fig. 1b), some of which also support microbial growth (Figure 9). Aldose sugars, for example, are a direct result of the 2-hydroxyaldehyde node. Diols, including major industrial chemicals such as ethylene glycol, are a result of the 1,2-diol node. Derivatives of the 2-hydroxyacyl-CoA node include 2-hydroxyacids, such as industrial products glycolic and lactic acids, produced by a thioesterase catalyzed reaction. Numerous chemical classes can be derived from the aldehyde node 40, including carboxylic acids, alcohols, and acyl-CoAs that can serve as precursors of other products.

Thermodynamic analysis of FORCE pathways for C1 utilization

[0105] The standard Gibbs free energies of the pathway reactions (Fig. 1b) make readily apparent the potentially challenging reactions, but a holistic analysis of pathway thermodynamics is necessary that considers the ability for reactions to influence each other. For this, we applied the “Max-min Driving Force (MDF)” approach 41.
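For orientation, the MDF approach is commonly posed as a linear program over log-concentrations. The formulation below is a sketch of that standard form, not an equation reproduced from this disclosure; the symbols B (the driving force shared by all reactions), S (the stoichiometric matrix), x_i (log-concentrations), and the concentration bounds are notation assumed here.

```latex
\begin{aligned}
\max_{B,\,\mathbf{x}} \quad & B \\
\text{subject to} \quad
  & -\Delta_r G'_j \;\ge\; B
    && \text{for every pathway reaction } j, \\
  & \Delta_r G'_j \;=\; \Delta_r G'^{\circ}_j + RT \sum_i S_{ij}\, x_i,
    && x_i = \ln c_i, \\
  & \ln c_{\min} \;\le\; x_i \;\le\; \ln c_{\max}
    && \text{for every metabolite } i .
\end{aligned}
```

A pathway is considered thermodynamically feasible under the chosen bounds when the optimal B is positive, meaning that every reaction can simultaneously proceed in the desired direction. Cofactor ratios can be imposed as additional equality constraints on the corresponding log-concentrations.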

[0106] We evaluated the MDF of the FORCE pathways to produce C2 metabolites glycolate and acetate from C1 substrates. Only soluble C1 substrates were evaluated as mass transport limitations can significantly limit CO2 and methane utilization and are outside the scope of this analysis. The representative C2 products, glycolate and acetate, are both pathway products and growth substrates, with glycolate requiring the shortest pathway and acetate requiring the entire sequence of aldehyde elongation reactions. As shown in Fig. 2a, there is a greater driving force toward glycolate production than acetate for each substrate. This results from the thermodynamically favorable hydrolysis of glycolyl-CoA, whereas acetate production requires the thermodynamically challenging reduction of glycolyl-CoA. Using standard limits of metabolite concentrations 41, formaldehyde allows the greatest MDF as it does not require NAD+-dependent methanol oxidation or formyl-CoA reduction to formaldehyde. However, despite the thermodynamic challenge of methanol oxidation, there is sufficient driving force in the desired direction for the net production of glycolate and acetate. The driving force for formate utilization is the lowest. Here, ATP hydrolysis assists in the activation of formate. The hydrolysis of 2 ATP equivalents provides just enough driving force for the net production of acetate, while the utilization of 1 ATP equivalent only provides enough driving force for glycolate production. As expected, the ATP-independent route is not feasible under these conditions.

[0107] While the above analysis assumes a standard constraint on metabolite concentrations (1 μM-10 mM 41 ), we sought to apply realistic substrate concentrations based on physical limitations such as the toxicities of the C1 substrates (Fig. 2a). Although some organisms can survive at formaldehyde concentrations on the order of 10 mM 42 , the upper bound of formaldehyde was adjusted to a more reasonable 0.1 mM, resulting in a decrease of the MDFs. Increasing the upper bound of methanol, which has been used at concentrations on the order of 100 mM 43 , led to larger MDFs. Interestingly, at these concentrations, the driving force for methanol utilization becomes greater than that for formaldehyde. Similarly, E. coli has the ability to grow in the presence of formate concentrations on the order of 100 mM 9 44 . Increasing the bound on formate concentration had no effect on the MDF in the 1 or 2 ATP consumption scenarios, but it had a major impact on the MDF of the 0 ATP route, enabling the synthesis of glycolate.

[0108] The NADH/NAD+ ratio was also a major constraint on pathway thermodynamics. While we initially used a constraint of 0.1 41, reflecting growth of E. coli under aerobic conditions, the physiological NADH/NAD+ can vary, reaching values near or greater than 1 under anaerobic conditions 45-47. In the physiological range (0.1-1), pathway driving forces remained positive for formaldehyde and methanol as substrates (Figure 11). As expected, a low ratio was favorable when the pathway was redox generating (methanol to acetate/glycolate and formaldehyde to glycolate), with a high ratio favorable when the pathway was redox consuming (formate to acetate/glycolate) or redox balanced (formaldehyde to acetate - likely because the reduction reactions are more thermodynamically challenging). The NADH/NAD+ ratio is critical for the driving force of formate utilization pathways (Fig. 2b). Here, a NADH/NAD+ ratio at the higher end of the physiological range in combination with increasing the formate concentration to 100 mM enables a positive driving force for glycolate or acetate production even without ATP hydrolysis.
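The size of the effect of the NADH/NAD+ ratio on a single redox step can be estimated from the concentration term of the reaction Gibbs energy, RT ln([NADH]/[NAD+]). The sketch below assumes a temperature of 298.15 K, which is not stated in the text, and the function name `ratio_shift_kj` is hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314e-3  # gas constant in kJ/(mol*K)


def ratio_shift_kj(ratio_from, ratio_to, nadh_produced=1, temp_k=298.15):
    """Change in reaction Gibbs energy (kJ/mol) when NADH/NAD+ changes.

    `nadh_produced` is +1 for a step that forms one NADH (an oxidation)
    and -1 for a step that consumes one NADH (a reduction).
    The temperature default is an assumption for illustration.
    """
    return nadh_produced * R_KJ_PER_MOL_K * temp_k * math.log(ratio_to / ratio_from)


if __name__ == "__main__":
    # Moving across the physiological range quoted in the text (0.1 to 1)
    print("oxidation step:", round(ratio_shift_kj(0.1, 1.0, +1), 2), "kJ/mol")
    print("reduction step:", round(ratio_shift_kj(0.1, 1.0, -1), 2), "kJ/mol")
```

Under these assumptions a tenfold rise in the ratio makes each NADH-forming step about 5.7 kJ/mol less favorable and each NADH-consuming step about 5.7 kJ/mol more favorable, which is consistent with the direction of the trends described for redox-generating and redox-consuming pathways.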

[0109] We further evaluated the ability of FORCE pathways to support iteration using formaldehyde as the exemplary substrate due to its intermediate redox state. The thermodynamics of both the aldose and aldehyde elongation pathways support iteration up to 4 carbons (Fig. 2c). After 4 carbons, the aldose elongation mode becomes unfavorable, likely due to the cumulative effect of successive acyl-CoA reduction reactions. The aldehyde elongation mode remains favorable despite requiring the same acyl-CoA reductions, likely due to the thermodynamically favorable reactions catalyzed by DOR and DDR. Different C1 activation and termination pathways have an influence on the MDF of the overall elongation cycles when the number of iterations is low. As the number of iterations increases, the thermodynamics of the elongation cycle reactions dominate (Figure 12).

In vitro pathway validation

[0110] The key prerequisite to FORCE pathways is the generation of formyl-CoA and formaldehyde. To verify the function of these reactions, we developed purified enzyme systems to monitor formation of formyl-CoA and the HACL condensation product glycolyl-CoA from different C1 substrates (Fig. 3).

[0111] Since formyl-CoA can be produced from formaldehyde by an ACR, we observed the formation of both formyl-CoA and glycolyl-CoA in a reaction containing Listeria monocytogenes ACR (LmACR) and Rhodospirillales bacterium URHD0017 HACL (RuHACL) 16 using formaldehyde as the only C1 substrate. Formyl-CoA can also be derived from oxidized C1 substrates by the activation of formate. Using a formyl-CoA transferase from Oxalobacter formigenes (OfFrc) and succinyl-CoA as the CoA donor, formate activation to formyl-CoA was observed, and, with the addition of formaldehyde, resulted in the formation of glycolyl-CoA. Using formate as sole substrate, formaldehyde was generated in situ by formyl-CoA reduction using LmACR, albeit with lower glycolyl-CoA production than when formaldehyde was added. Together these results suggest that in this enzyme system, the limitation is imposed by the ACR reaction either due to the activity of the enzyme or constraints due to the need for the appropriate form of NAD(H). In support of the latter hypothesis, in the oxidative direction (i.e. formaldehyde to formyl-CoA) the amount of glycolate observed following hydrolysis of the CoA thioesters was nearly equivalent to the 1 mM NADH added to the reaction. In the opposite direction, a less than equivalent amount of glycolate was observed, consistent with the thermodynamics of the reaction becoming unfavorable with decreasing NADH/NAD+ (Fig. 1b, 2b).

[0112] Having validated the core pathway reactions, cell-free metabolic engineering 48 was used to prototype FORCE pathways for product synthesis. Extracts of E. coli expressing each enzyme comprising the α-reductive FORCE pathway were successively combined, demonstrating pathway function in a stepwise manner (Fig. 4). Outside of the direct generation of 2-hydroxycarboxylates (e.g. glycolate) via thioester cleavage of the HACL generated 2-hydroxyacyl-CoA, other C2 products, as well as the aldose or aldehyde elongation pathways, require reduction of this 2-hydroxyacyl-CoA (e.g. glycolyl-CoA for formaldehyde and formyl-CoA ligation) (Fig. 1b: formyl-CoA elongation panel). As we previously found that LmACR was able to act upon glycolaldehyde 16, we used it to catalyze both formaldehyde oxidation to formyl-CoA and glycolyl-CoA reduction to glycolaldehyde (Fig. 4a). While LmACR alone resulted in only the conversion of formaldehyde to formate, inclusion of RuHACL resulted in glycolate production (Fig. 4b). Glycolaldehyde was not detected, possibly due to the presence of endogenous oxidoreductases in the cell extract system, which catalyzed its oxidation to glycolate (e.g. AldA, AldB, PuuC, PatD) or, to a lesser extent, reduction to ethylene glycol (e.g. FucO, YqhD, AdhP, EutG, and others 49).

[0113] The synthesis of the next reduction product, ethylene glycol (Fig. 4a), was significantly increased by the addition of a cell extract of E. coli overexpressing E. coli FucO, a 1,2-diol oxidoreductase (2-fold increase, from 1.37±0.1 mM to 2.73±0.03 mM) (Fig. 4b). Upon further addition of E. coli cell extract expressing Klebsiella oxytoca DDR, along with coenzyme B12, ethanol was detected (1.90±0.03 mM at one hour: Fig. 4b), likely due to the reduction of acetaldehyde (formed via ethylene glycol dehydration) by endogenous oxidoreductases, along with a corresponding decrease in ethylene glycol. At later time points an increase in acetate was observed, likely due to the oxidation of ethanol and acetaldehyde by endogenous oxidoreductase activity.
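The reported 2-fold increase in ethylene glycol can be checked directly from the two concentrations given above. The sketch below also propagates the stated uncertainties, under the assumption (not stated in the text) that the ± values are independent standard deviations; the function name `fold_change` is hypothetical.

```python
import math


def fold_change(before, before_sd, after, after_sd):
    """Return (fold change, propagated uncertainty) for after/before.

    Uses first-order propagation for a ratio of independent quantities,
    which is an assumption about how the reported errors behave.
    """
    fold = after / before
    relative = math.sqrt((before_sd / before) ** 2 + (after_sd / after) ** 2)
    return fold, fold * relative


if __name__ == "__main__":
    # Ethylene glycol without and with FucO-containing extract (mM)
    fold, sd = fold_change(1.37, 0.1, 2.73, 0.03)
    print(f"fold change = {fold:.2f} +/- {sd:.2f}")
```

With the reported values this gives a fold change of approximately 1.99 with a propagated uncertainty of roughly 0.15, in line with the stated 2-fold increase.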

In vivo implementation of FORCE pathways

[0114] We sought to demonstrate key features of the designed platforms, as well as the synthesis of additional products and utilization of various C1 substrates using both resting and growing E. coli cultures (Fig. 5 and 6). A key feature of the FORCE pathway design is iteration, which can be achieved through aldose or aldehyde elongation (Fig. 1b: formyl-CoA elongation panel, Fig. 2c). To demonstrate iterative aldose elongation in vivo we targeted the synthesis of the three-carbon product glycerate from formaldehyde (Fig. 5a). We started with a previously developed strain having C1 dissimilation and glycolate consumption knockouts (AC440: MG1655(DE3) ΔfrmA ΔfdhF ΔfdnG ΔfdoG ΔglcD) and overexpressing RuHACL G390N , LmACR, and EcAldA 16. To promote glycolaldehyde accumulation and condensation with formyl-CoA, we removed EcAldA from the expression vector. While formaldehyde consumption was significantly reduced, accumulation of glycolaldehyde and glycerate was observed (Fig. 5b), demonstrating the iterative aldose elongation pathway. To increase the production of these compounds, we deleted genes encoding aldehyde dehydrogenases (ΔaldA ΔaldB ΔpatD ΔpuuC, collectively referred to as Δaldh), resulting in lower glycolate and higher glycolaldehyde when EcAldA was not overexpressed. However, these knockouts did not impact the accumulation of glycerate, perhaps indicating a limitation on the condensation reaction between glycolaldehyde and formyl-CoA catalyzed by RuHACL. We also extended the pathway to the next reduction product, ethylene glycol, by overexpressing E. coli fucO 50, which led to increased accumulation of ethylene glycol in the extracellular medium, with the Δaldh background further improving production (Fig. 5b). To verify that the observed products were derived from formaldehyde and not from residual multi-carbon substrates or biomass components, 13C-labeled formaldehyde was used as the substrate. Glycolic acid, ethylene glycol, and glyceric acid were found to be fully 13C labeled based on the characteristic [M-15]+ ions of the TMS derivatives of the products (Fig. 5c).

[0115] To extend the above established formaldehyde utilization pathway to methanol, we expressed a well-studied MDH variant from Bacillus methanolicus MGA3 (BmMDH2 MGA3 ) 18 for conversion of methanol to formaldehyde, in combination with RuHACL G390N , LmACR, and EcAldA. Unlike formaldehyde, where toxicity necessitates the use of resting cells, methanol can also be directly added to growing E. coli cultures. When the engineered methanol utilizing strain was grown in the presence of complex nutrients and 500 mM methanol, glycolate formation was observed only in the strain expressing RuHACL (Fig. 6b). The conversion of methanol to glycolate by this strain was inefficient, however, with substantial accumulation of formate.

[0116] Seeking to improve performance, we replaced RuHACL G390N with a newly identified HACL sourced from a beach sand metagenome, referred to here as BsmHACL (UniProt: A0A3C0TX30). BsmHACL increased glycolate accumulation about 3-fold (Fig. 6c). Despite improved glycolate production, formate accumulation remained high. In an effort to address this issue, the termination enzyme EcAldA was replaced with a CoA-transferase from Clostridium aminobutyricum (CaAbfT) previously found to have better properties than OfFrc 51. CaAbfT serves both to release glycolate from glycolyl-CoA and to reactivate formate to formyl-CoA for further condensation. When CaAbfT was expressed, glycolate accumulation increased by around 33%, while formate accumulation was reduced by approximately 36%. Finally, with CaAbfT serving to terminate the pathway via the release of glycolate, endogenous thioesterases were not expected to be needed and were presumed to be in part responsible for the observed formate. Using a host strain deficient in thioesterases (ΔyciA ΔtesA ΔtesB ΔybgC ΔydiI ΔfadM), formate accumulation was further reduced. To verify that glycolate was derived from methanol, we used 13C-labeled methanol and observed that the [M-15]+ ion of the TMS derivative of glycolic acid was fully derived from 13C-methanol (Fig. 6d).

[0117] Having established CaAbfT as a promising route for formate activation, we evaluated whether CaAbfT could be used to incorporate exogenously supplied formate. Here, CaAbfT was expressed to activate formate without LmACR overexpression, as no interconversion of formaldehyde and formyl-CoA is needed upon addition of formaldehyde. In the engineered strain expressing BsmHACL, a 12-fold increase in glycolate was observed when formate was included in the media compared to when formaldehyde was supplied alone (Figure 13), with the total carbon accumulated as glycolate greater than the amount originally added as formaldehyde.

Flux balance analysis of synthetic methylotrophy

[0118] Having demonstrated FORCE pathways for direct product synthesis some of which (e.g. glycolate, glycerate, acetate) can serve as growth substrates, their ability to enable synthetic methylotrophy in E. coli was evaluated in silico. Using a previously developed genome scale model of E. coli, iML1515 52 , we added reactions to the model comprising select pathways reported or proposed to enable methylotrophy. All pathways were evaluated with the reactions enabling the interconversion of C1 molecules at different reduction levels present. The full reactions implementing each pathway are shown in Table 3.

[0119] The simulation results suggest that all pathways previously proposed to enable some form of methylotrophy in E. coli, both natural (ribulose monophosphate or RuMP, serine) and synthetic (formolase, Synthetic Acetyl-CoA or SACA, reductive glycine) 8, 53, 54, are able to do so (Figure 14). The FORCE pathways evaluated for the conversion of non-native C1 substrates to the native growth substrates glycolate, acetate, and glyceraldehyde were no exception. This demonstrates another advantage of the platform’s orthogonality: direct route(s) to compound(s) representing physiological substrates for E. coli, or any other organism, enable FORCE pathways to be integrated at varying or multiple metabolic nodes to capitalize on native metabolism and regulation of substrate(s) utilization, as opposed to needing to engineer them. An analysis of the flux distributions of the three modeled FORCE pathways provides further insights (Fig. 7). The FORCE pathway leading to the formation of glycolate utilizes a carbon-inefficient glycolate utilization pathway present in E. coli, which requires the decarboxylating condensation of two molecules of glyoxylate (Fig. 7a) 55. As such, production of more reduced C2 metabolites, such as glycolaldehyde or acetate, is preferred to glycolate as growth substrate. The predicted metabolism of glycolaldehyde is particularly interesting, as the model suggests a route for glycolaldehyde assimilation involving condensation with glycine and a reverse pyridoxal-5-phosphate biosynthesis pathway, ultimately resulting in pentose phosphate rearrangements to give glyceraldehyde-3-phosphate (Fig. 7b). This route appears to be preferred to the assimilation of acetyl-CoA via the glyoxylate bypass based on the predicted flux distribution. Direct production of glyceraldehyde from the HACL-based pathway results in the conversion of glyceraldehyde to glycerol, followed by native glycerol metabolism (Fig. 7c). As a result, pathways that lead to C3 molecules such as glyceraldehyde or dihydroxyacetone can take advantage of glycolytic reactions for the net production of ATP, ultimately enabling greater biomass yield. FORCE pathways also had promising characteristics based on other metrics such as redox balance, ATP requirements, and number of reactions required (Table 4).
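
The carbon penalty of the glycolate route can be illustrated with a short calculation. The sketch below assumes only the stoichiometry stated in this paragraph (two C2 glyoxylate molecules condense with loss of one CO2); the function and variable names are illustrative and are not part of the genome scale model.

```python
def carbon_retention(carbons_in: int, carbons_lost_as_co2: int) -> float:
    """Fraction of substrate carbon retained across a decarboxylating step."""
    if carbons_in <= 0 or not 0 <= carbons_lost_as_co2 <= carbons_in:
        raise ValueError("carbon counts must be positive and consistent")
    return (carbons_in - carbons_lost_as_co2) / carbons_in


# Two glyoxylate molecules (2 carbons each) enter; one carbon leaves as CO2.
glyoxylate_condensation = carbon_retention(carbons_in=4, carbons_lost_as_co2=1)
print(f"Carbon retained through glyoxylate condensation: {glyoxylate_condensation:.0%}")
```

Under this stoichiometry at most three of every four assimilated carbons pass this step, which is consistent with the preference for more reduced C2 or C3 products described above.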

Two-strain co-culture system to evaluate synthetic methylotrophy

[0120] The orthogonality of FORCE pathways to metabolism allows full decoupling of the C1 conversion pathway from growth. This enables unique designs to evaluate the methylotrophic potential of the pathway (Fig. 8a; Figure 9b). One potentially advantageous implementation might employ division of labor by separating multi-carbon compound generation and cell growth into two hosts, which would not be possible if the pathway directly interfaced with central metabolism, for example via aldose phosphates or acetyl-CoA, two common products of C1 assimilation pathways. Using this concept, we evaluated the ability for FORCE pathways to support E. coli growth on C1 substrates formaldehyde, formate, and methanol.

[0121] A two-strain E. coli system was designed and constructed to work in co-culture (Fig. 8b). The first strain, referred to as the producer strain, contained constructs to express the FORCE pathway for conversion of C1 substrates to the native C2 growth substrate glycolate but was deficient in the ability to consume glycolate. The second strain, referred to as the sensor strain, retained the ability to grow on glycolate and additionally constitutively expressed eGFP as a signal but did not express the FORCE pathway for glycolate production. These strains could thus be differentiated by both selection on glycolate minimal media plates and by detection of fluorescent colonies. To assess the feasibility of different substrates, three different producer strains were devised: for formaldehyde utilization, the producer strain expressed LmACR, BsmHACL, and EcAldA; for evaluating formate utilization with formaldehyde, BsmHACL was expressed with CaAbfT; and for methanol utilization, a thioesterase deficient background expressing BmMdh MGA3 , LmACR, BsmHACL, and CaAbfT (Fig. 8b) was utilized.

[0122] To enable growth conditions with formaldehyde, paraformaldehyde was used. Paraformaldehyde gradually depolymerizes to formaldehyde in aqueous media, with control over the solubilization rate through the selection of particle size and concentration (Fig. 15a). This enabled a system where formaldehyde could be kept at sub-millimolar concentrations, avoiding accumulation to toxic levels, with significant glycolate production still observed (Fig. 15b). In minimal media with (para)formaldehyde (the equivalent of 5 mM) as the sole carbon substrate, growth of the sensor strain was observed as indicated by the increase in colony-forming units (CFUs) relative to a control system in which the producer strain did not express BsmHACL (Fig. 8c, Fig. 16). Glycolate accumulated rapidly in the first 8 hours with sustained exponential growth of the sensor strain occurring after an initial lag phase. The sensor strain was found to have undergone around 6.6 doublings in 30 hours.

[0123] With methanol, growth of the sensor strain was observed only when the producer strain expressed BsmHACL (Fig. 8d, Fig. 17). Compared to paraformaldehyde utilization, however, the growth kinetics of the sensor strain differed, showing an approximately linear increase in CFUs over time. The difference in observed dynamics might reflect the limitation imposed by the rate of glycolate production from methanol by the producer strain, analogous to the phenomenon observed in constant feed-rate fed-batch culture 56. The utilization of methanol was substantially slower than the utilization of (para)formaldehyde, resulting in approximately 4.6 doublings in 72 hours.

[0124] A similar experiment was performed using a co-substrate system of 1 mM formaldehyde and 10 mM formate. Here, more carbon was observed in glycolate than was added as formaldehyde, indicating the incorporation of formate (Fig. 18). Growth of the sensor strain was faster than growth on methanol but did not result in as many doublings as on 5 mM (para)formaldehyde. In 27 hours, around 4.9 doublings were observed (Fig. 8d).
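
For comparison across the three substrate systems, the reported doublings can be converted to fold increases in CFUs and to an average time per doubling. This is a minimal sketch using only the doublings and durations reported above; the helper names are illustrative. The average is taken over the whole observation window, so it includes any lag phase and, for methanol, the approximately linear phase, and it should not be read as a specific growth rate.

```python
import math


def doublings_from_counts(cfu_start: float, cfu_end: float) -> float:
    """Number of doublings implied by an initial and a final CFU count."""
    return math.log2(cfu_end / cfu_start)


def fold_change(doublings: float) -> float:
    """Fold increase in CFUs implied by a number of doublings."""
    return 2.0 ** doublings


def mean_hours_per_doubling(hours: float, doublings: float) -> float:
    """Average time per doubling over the whole observation window."""
    return hours / doublings


# (doublings, hours) as reported for the sensor strain in each system
reported = {
    "5 mM (para)formaldehyde": (6.6, 30.0),
    "500 mM methanol": (4.6, 72.0),
    "1 mM formaldehyde + 10 mM formate": (4.9, 27.0),
}

summary = {
    name: (fold_change(d), mean_hours_per_doubling(h, d))
    for name, (d, h) in reported.items()
}

for name, (fold, hours_per_doubling) in summary.items():
    print(f"{name}: ~{fold:.0f}-fold, ~{hours_per_doubling:.1f} h per doubling on average")
```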

Discussion

[0125] In the canonical architecture of metabolism, substrates are funneled into central metabolism with biosynthetic building blocks and products of interest derived from the resulting central metabolites. To date, attempts to engineer C1 bioconversion, even those exploiting synthetic pathways 5-9 or novel enzyme designs 6 , have relied on central carbon metabolism. These designs, which exhibit minimal orthogonality, require optimizing a host’s metabolic network to accommodate C1 bioconversion, which has proven challenging.

[0126] In this work, we present the design, analysis, and implementation of formyl-CoA elongation (FORCE) pathways, enabling C1 utilization and bioconversion in a manner orthogonal to the host metabolism. FORCE pathways are based on using formyl-CoA as an anabolic metabolite, which is enabled by 2-hydroxyacyl-CoA lyase (HACL) catalyzed acyloin condensation between formyl-CoA and carbonyl-containing substrates. Product synthesis is achieved with relatively high orthogonality to central metabolism compared to other approaches. Our thermodynamic analysis suggested favorable driving forces for FORCE pathway conversions of formate, formaldehyde, and methanol to glycolate or acetate as exemplary products. We demonstrate the potential of the self-contained, orthogonal pathway in both in vitro (purified enzymes and cell extracts) and in vivo (resting and growing cells) implementations, in which products of diverse functionality (e.g. glycolate, glycolaldehyde, ethylene glycol, ethanol, glycerate) could be produced in a growth and host metabolism independent manner using formaldehyde, formate or methanol as the sole C1 substrates. One can envision potential bioprocesses in which growth and maintenance are performed with a multi-carbon substrate, while the biocatalyst is used for C1 bioconversions. Bioprocesses of this nature, based on multi-enzyme cascades and two phase fermentations, have been the subject of recent reviews 57, 58.

[0127] While product synthesis from C1 substrates is a defining feature of FORCE pathways, they also have the potential to enable growth on non-native C1 substrates (e.g. synthetic methylotrophy) via the production of multi-carbon compounds naturally consumed by heterotrophs, such as glycolate, acetate, or glyceraldehyde. Genome scale modeling and flux balance analysis revealed that FORCE pathways are comparable to or better than alternative approaches and guided the design. While the current pathway performance could not support the growth of a single strain of E. coli on C1 substrates, the orthogonal nature of the pathway allowed us to separate and evaluate the pathway limitations to growth on formate, formaldehyde, and methanol in separate strains of E. coli. The potential for FORCE pathways to enable methylotrophy allows for possible bioprocess implementations more similar to traditional fermentations based on C1 as a sole carbon source for both growth and product synthesis. Because the FORCE pathway is the branch point for fluxes toward product synthesis and growth, there is significant potential for facile control over flux partitioning (Figure 1b), especially with recent developments in the area of dynamic metabolic control 59, 60.

[0128] Further FORCE pathway development should enable more efficient designs for synthetic methylotrophy and diverse product synthesis, especially via pathway iteration. While the current demonstration enabled C1 utilization rates of 118 μmoles/OD/h when implemented at physiologically relevant concentrations of formaldehyde (Fig. 15b), we expect that further gains can be made by improving the expression levels and kinetic parameters of HACL. The observation of formate as a byproduct throughout various implementations using formaldehyde or methanol is likely due to an imbalance between the rate of production of formyl-CoA and the rate of its utilization by HACL. We have also observed formyl-CoA hydrolysis 16, which is probably exacerbated in vivo by endogenous thioesterases. Strategies to address this limitation include re-activating formate to formyl-CoA using a CoA-transferase, as done here using the CoA-transferase CaAbfT, and identification or engineering of an HACL enzyme with better characteristics, shown here via the identification of BsmHACL. Finally, host-strain modifications such as the deletion of endogenous aldehyde dehydrogenases and thioesterases were also explored for this purpose.

[0129] As HACL-catalyzed condensation and enzyme activity were only recently described, we expect that further genome mining, bioprospecting, enzyme engineering, and biochemical characterization will result in better performing variants, ultimately overcoming pathway bottlenecks. HACL variants with well-defined chain length and functional group specificities, in combination with compatible, specific termination enzymes, will allow for the production of specific products, analogous to what has been demonstrated with other platform pathways 61-63. These studies will also shed additional light on the role of formyl-CoA in metabolism, which is likely greater than the synthetic pathway described here. Recent reports have already contributed to the advancement of knowledge in this area 17, 64, and further studies are likely to follow.

Methods:

Thermodynamic calculations

[0130] Standard Gibbs free energies of reactions were found either from database sources (MetaCyc) 65 or by using the eQuilibrator biochemical thermodynamics calculator 41 . Min-max driving forces of pathways were calculated using a previously reported method 41 implemented using MATLAB (Mathworks).

Flux balance analysis

[0131] Flux balance analysis was performed using the COBRA Toolbox 66 for MATLAB (Mathworks) with the Gurobi solver (Gurobi Optimization, LLC). Reactions enabling the various methylotrophy pathways (Table 3) were added to, or modified in, the E. coli genome scale model iML1515 52. The limits on the substrate exchange reactions were set to 10 mmol C/g DCW/hr for all C1 substrates.

Reagents

[0132] All chemicals were obtained from Fisher Scientific Co. and Sigma- Aldrich Co. unless otherwise specified. Primers were synthesized by Integrated DNA Technologies or by Eurofins Genomics. Restriction enzymes were obtained from New England Biolabs unless otherwise specified.

Genetic methods

[0133] Genes non-native to E. coli were codon-optimized and synthesized by GeneArt (Thermo Fisher). E. coli genes were amplified from chromosomal DNA according to standard protocols 67 . Plasmid-based gene expression was achieved by cloning the desired gene(s) into pCDFDuet-1 or pETDuet-1 (Novagen) digested with appropriate restriction enzymes and by using In-Fusion cloning technology (Clontech Laboratories, Inc.). Gene knockouts and genomic modifications were created using a CRISPR-Cas9-based system developed for E. coli. pCas and pTargetF were gifts from S. Yang (Addgene plasmids nos. 62225 and 62226, respectively). Plasmids and strains used in this study are listed in Table 2.

Evaluation of core pathway module using purified enzymes

[0134] Genes encoding RuHACL G390N, LmACR, and OfFrc were cloned into pCDFDuet-1, and the resulting plasmids were transformed into E. coli BL21(DE3) for expression. Overnight cultures of the expression strains were grown in LB with 100 mg/L spectinomycin and used to inoculate (1%) 50 ml TB medium supplemented with 50 mg/L spectinomycin in a 250 ml baffled flask. The culture was grown at 30 °C and 250 r.p.m. in an orbital shaker until OD 600 reached 0.4-0.6, at which point expression was induced with 0.1 mM IPTG. Then, 24 h post inoculation, cells were harvested by centrifugation. The cell pellets were washed once with a cold 9 g/L NaCl solution and stored at -80 °C until needed.

[0135] The frozen cell pellets were resuspended in 10 mL of cold lysis buffer (50 mM NaPi pH 7.4, 300 mM NaCl, 20 mM imidazole), to which 250 U of Benzonase nuclease was added. The mixture was further treated by sonication on ice using a Cole-Parmer ultrasonic processor CPX130 (3 min with cycles of 5 seconds pulse on and 6 seconds pulse off, and amplitude set at 30%) and centrifuged at 7,500g for 15 min at 4 °C. The supernatant was applied to a chromatography column containing 5 ml Ni-NTA agarose resin (Qiagen, Inc.), which had been pre-equilibrated with the lysis buffer. The column was then washed first with 10 ml of the lysis buffer and then with 25 ml of wash buffer (50 mM NaPi pH 7.4, 300 mM NaCl, 70 mM imidazole). The His-tagged protein of interest was eluted with 20 ml elution buffer (50 mM NaPi pH 7.4, 300 mM NaCl, 250 mM imidazole). The eluate was collected and applied to a 10,000 molecular weight cut-off Amicon ultrafiltration centrifugal device (Millipore), and the concentrate (<300 μL) was washed twice with 4 ml of 50 mM KPi, 10% glycerol pH 7.4 for desalting. Protein concentrations were calculated using the Bradford Protein Assay (Bio-Rad) according to the manufacturer's protocol. Purified protein was saved in 20 μl aliquots at -80 °C until needed.

[0136] To test the utilization of formaldehyde as the sole C1 substrate, the reaction was comprised of 50 mM KPi pH 7.4, 5 mM MgCl2, 0.1 mM TPP, 1 mM NAD+, 2 mM CoASH, 1 μM RuHACL G390N, 1 μM LmACR, and 100 mM FALD. To test the utilization of formate and formaldehyde as co-substrates, the reaction was comprised of 50 mM KPi pH 7.4, 5 mM MgCl2, 0.1 mM TPP, 1 mM succinyl-CoA, 1 μM RuHACL G390N, 2 μM OfFrc, 100 mM sodium formate, and 100 mM formaldehyde. To test the utilization of formate as the sole C1 substrate, the reaction was comprised of 50 mM KPi pH 7.4, 5 mM MgCl2, 0.1 mM TPP, 1 mM NADH, 2 mM succinyl-CoA, 1 μM RuHACL G390N, 2 μM OfFrc, 1 μM LmACR, and 100 mM sodium formate. As a control, a reaction comprised of 50 mM KPi pH 7.4, 5 mM MgCl2, 0.1 mM TPP, 1 mM NADH, 1 mM NAD+, 2 mM succinyl-CoA, 2 mM CoASH, 2 μM BSA, 100 mM sodium formate, and 100 mM formaldehyde was used. The reaction volumes were 200 μL and the reactions were carried out at room temperature for 30 minutes on a rotisserie shaker.

[0137] Samples (200 μL) containing acyl-CoAs were first treated with 5 μL of 10 M NaOH to hydrolyze the thioesters and produce the carboxylic acid. Ammonium sulfate solution acidified with 1% sulfuric acid was then added to improve the efficiency of acid extraction. The resulting sample was extracted into 4 ml ethyl acetate by vigorous vortexing for 90 s. The organic phase was separated and evaporated to dryness under a stream of nitrogen. The residue was dissolved in 50 μl pyridine and 50 μl N,O-bis(trimethylsilyl)trifluoroacetamide, and incubated at 60 °C for 15 min. Derivatized samples were analyzed by GC-MS using an Agilent 5977B GC/MSD single quadrupole, Intuvo 9000 GC system, with integrated GERSTEL multifunctional autosampler sample preparation robot and an Agilent HP-5ms capillary column (0.25 mm internal diameter, 0.25 μm film thickness, 30 m length). For the gas chromatography, 1 μL of the sample was injected with a 1:1 split ratio using helium as the carrier gas at a flowrate of 1.5 ml/min and the following temperature profile: initial 90 °C for 3 min; ramp at 15 °C per min to 170 °C; ramp at 20 °C per min to 300 °C and hold for 8 min. The injector and detector temperatures were 250 and 350 °C, respectively. Data was acquired using Agilent MassHunter GC/MS Acquisition B.07.06.2704 and analyzed using Agilent MassHunter Workstation Software B.08.00.
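
The oven temperature profile above implies a fixed run time per injection, which can be worked out as a check when setting up the method. The following is a minimal sketch using only the hold times, ramp rates, and temperatures stated above; the function name and the tuple layout are illustrative.

```python
def oven_program_minutes(initial_temp_c, initial_hold_min, ramps):
    """Total oven program time.

    ramps is a list of (rate_c_per_min, target_temp_c, hold_min) tuples,
    applied in order starting from initial_temp_c.
    """
    total = initial_hold_min
    temp = initial_temp_c
    for rate, target, hold in ramps:
        if rate <= 0 or target <= temp:
            raise ValueError("each ramp must heat at a positive rate")
        total += (target - temp) / rate + hold
        temp = target
    return total


# 90 C for 3 min; 15 C/min to 170 C; 20 C/min to 300 C, hold 8 min
gc_run_time = oven_program_minutes(
    initial_temp_c=90.0,
    initial_hold_min=3.0,
    ramps=[(15.0, 170.0, 0.0), (20.0, 300.0, 8.0)],
)
print(f"GC oven program length: {gc_run_time:.1f} min")
```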

[0138] To analyze the acyl-CoAs with LC-MS, the reaction was stopped by adding 8 μL of formic acid to 200 μL of reaction sample and desalted with 1 mL HyperSep C18 Cartridges (Thermo Scientific) that were primed twice with 200 μL methanol and equilibrated with 100 μL of 1 mM ammonium acetate pH 3.0. The columns were washed once with 200 μL of 1 mM ammonium acetate pH 3.0, and the acyl-CoAs were eluted in 200 μL methanol. LC-MS analysis was performed based on what has been previously described 7. An Agilent 6540 Q-TOF LC-MS system was equipped with a Jet-stream electrospray ionization source set to the positive ionization mode and a 100 mm x 4.6 mm Kinetex 2.6 μm Polar C18 100 Å column (Phenomenex). The LC conditions were: column oven set at 40°C, injection volume of 5 μL, and 50 mM ammonium formate and methanol as the mobile phases. Compound separation was achieved using the following gradient method at a flow rate of 400 μL/min: 0 min 0% methanol; 1 min 0% methanol; 3 min 2.5% methanol; 9 min 23% methanol; 14 min 80% methanol; 16 min 80% methanol; 17 min 0% methanol. The MS conditions were: capillary voltage 3.5 kV, nozzle voltage 500 V, fragmentor voltage 150 V, with nitrogen used for nebulizing (25 psig), drying (5 L/min, 225°C), and sheath gas (10 L/min, 400°C). A scan range of 100-1000 m/z was used. Data was acquired using Agilent MassHunter LC/MS data Acquisition B.05.01 and analyzed using MassHunter Qualitative Analysis B.05.00 (Agilent).
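
The gradient method above is given as a table of time points. As a sketch, the mobile phase composition at any time can be estimated by interpolation, assuming linear ramps between the listed points (the text lists only the set points, so linearity is an assumption); the names below are illustrative.

```python
from bisect import bisect_right

# (time in min, % methanol) set points from the gradient method
GRADIENT = [
    (0.0, 0.0), (1.0, 0.0), (3.0, 2.5), (9.0, 23.0),
    (14.0, 80.0), (16.0, 80.0), (17.0, 0.0),
]


def percent_methanol(t_min, gradient=GRADIENT):
    """Estimated % methanol at t_min, assuming linear ramps between set points."""
    times = [t for t, _ in gradient]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time is outside the gradient program")
    i = bisect_right(times, t_min) - 1
    if i == len(gradient) - 1:  # exactly at the final set point
        return gradient[-1][1]
    (t0, p0), (t1, p1) = gradient[i], gradient[i + 1]
    return p0 + (p1 - p0) * (t_min - t0) / (t1 - t0)


for t in (0.0, 6.0, 11.5, 15.0, 17.0):
    print(f"{t:5.1f} min: {percent_methanol(t):5.2f}% methanol")
```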

Cell-free metabolic engineering for pathway validation

[0139] Enzyme expression and cell extract preparation was performed as described previously 16 . Cell-free reactions contained 50 mM KPi pH 7.4, 4 mM MgCl2, 0.1 mM TPP, 2.5 mM CoASH, 5 mM NAD + , 50 mM formaldehyde, and 0.1 mM coenzyme B12. Individual cell extract loading was around 4.4 g/L protein (1/8 of the reaction volume), and the amount of protein added to each reaction was normalized with BL21(DE3) extract to ~26 g/L protein (3/4 of the reaction volume). The reactions were incubated at room temperature for the indicated time, at which point % of the reaction volume of saturated ammonium sulfate solution acidified with 1% sulfuric acid was added to stop the reactions. Samples were centrifuged at 20817xg for 15 minutes and the supernatant analyzed by HPLC using a Shimadzu Prominence SIL 20 system (Shimadzu Scientific Instruments, Inc.) equipped with a refractive index detector and an HPX-87H organic acid column (Bio-Rad) with operating conditions to optimize peak separation (0.3 ml/min flowrate, 30 mM H2SO4 mobile phase, column temperature 42 °C). Data was acquired and analyzed using Shimadzu LabSolutions v5.96.
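
The extract loadings above can be cross-checked with simple arithmetic. This sketch assumes that every extract, including the BL21(DE3) extract used for normalization, has the same protein concentration; that assumption and the variable names are illustrative and not stated in the protocol.

```python
from fractions import Fraction

per_extract_final_g_per_l = 4.4               # protein contributed by one extract
per_extract_volume_fraction = Fraction(1, 8)  # share of reaction volume per extract
total_extract_volume_fraction = Fraction(3, 4)

# Protein concentration of an undiluted extract implied by the loading
implied_extract_stock_g_per_l = per_extract_final_g_per_l / float(per_extract_volume_fraction)

# Number of 1/8-volume extract portions that fit in 3/4 of the reaction volume
extract_portions = total_extract_volume_fraction / per_extract_volume_fraction

# Total protein when all portions are filled (pathway extracts plus BL21(DE3) extract)
total_protein_g_per_l = per_extract_final_g_per_l * float(extract_portions)

print(f"Implied extract concentration: {implied_extract_stock_g_per_l:.1f} g/L")
print(f"Extract portions per reaction: {extract_portions}")
print(f"Total protein in reaction: {total_protein_g_per_l:.1f} g/L")
```

Under that assumption the total is consistent with the approximately 26 g/L stated above.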

Resting cell bioconversions

[0140] Bioconversions using resting cells were performed as described previously 16 with slight modification. The basal salts media used was M9 (6.78 g/L Na2HPO4, 3 g/L KH2PO4, 1 g/L NH4Cl, 0.5 g/L NaCl, 2 mM MgSO4, 100 μM CaCl2, and 15 μM thiamine-HCl) additionally supplemented with the micronutrient solution of Neidhardt 68. An overnight LB culture of each strain was used to inoculate (1%) a 250 mL flask containing 50 mL of the above media further supplemented with 20 g/L glycerol, 10 g/L tryptone, 5 g/L yeast extract, and appropriate antibiotics (50 μg/mL carbenicillin, 50 μg/mL spectinomycin). The flask cultures were incubated at 30°C and 250 rpm in an NBS I24 Benchtop Incubator Shaker (New Brunswick Scientific Co.). After 2.5 hours, gene expression was induced by addition of 0.1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) and 0.04 mM cumate (0.2 mM IPTG and 0.1 mM cumate were used for the experiment with formaldehyde and formate).

[0141] The cells from the above cultures were harvested by centrifugation (5000xg, 22°C, 5 min), and washed twice with the above M9 media without any carbon source. The final cell pellet was resuspended in M9 with the appropriate carbon source (~10 OD 600 with 10 mM formaldehyde or ~5 OD 600 with 1 mM formaldehyde and 10 mM formate). 5 mL of the cell suspension was added to a 25 mL Erlenmeyer flask (Corning Inc.) and topped with a foam plug. Flasks were incubated at 30°C and 200 rpm in an NBS I24 Benchtop Incubator Shaker (New Brunswick Scientific Co.). An additional 10 mM formaldehyde was added after 1.5 hours when formaldehyde was the sole carbon source. Samples were taken after 24 hours for HPLC analysis as described above. When 13C-labeled formaldehyde was used as the substrate, the samples were analyzed by GC-MS after extraction and derivatization as described above.

Fermentation experiments

[0142] The growth media used was M9 (6.78 g/L Na2HPO4, 3 g/L KH2PO4, 1 g/L NH4Cl, 0.5 g/L NaCl, 2 mM MgSO4, 100 μM CaCl2, and 15 μM thiamine-HCl) additionally supplemented with 500 mM methanol, 10 g/L tryptone, 5 g/L yeast extract and the micronutrient solution of Neidhardt 68. An overnight LB culture of each strain was used to inoculate (1%) a 50 mL closed-cap conical tube (Genesee Scientific Co.) containing 5 mL of the above media further supplemented with appropriate antibiotics (50 μg/mL carbenicillin, 50 μg/mL spectinomycin). After approximately 3 hours, gene expression was induced by addition of 0.04 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) and 0.04 mM cumate. Tubes were incubated at 30°C and 200 rpm in an NBS I24 Benchtop Incubator Shaker (New Brunswick Scientific Co.). Samples (100 μL) were taken at 24, 48, 72 and 96 hours after inoculation for OD 600 measurement and HPLC analysis as described above. When 13C-methanol was used as the substrate, the samples were analyzed by GC-MS after extraction and derivatization as described above.

Two-strain E. coli system for growth on C1 substrates

[0143] Two-strain experiments were conducted using strains cultured and induced as described previously using M9 medium 16. The induced cells were resuspended to an initial concentration of 3 × 10^9 CFU (colony forming unit)/mL (equivalent to OD 600 of ~5) in M9 medium. Either 20 mL of the suspension was added to a 25 mL flask containing 3 mg paraformaldehyde (equivalent to 5 mM), or 10 mL of the suspension was added to a 25 mL flask together with either 500 mM methanol or 1 mM formaldehyde and 10 mM sodium formate. A second E. coli strain, AC763, capable of consuming glycolate, was added to an initial concentration of 5 × 10^6 CFU/mL (equivalent to OD 600 of ~0.005). AC763 additionally harbored a chromosomal copy of constitutively expressed eGFP to assist in distinguishing the two strains. Prior to its addition to the culture, AC763 was pre-grown in 25 mL Erlenmeyer flasks (from a single colony inoculation) at 200 rpm and 30°C for 24 hours in 5 mL of the above M9 minimal media supplemented with 5 g/L glycolate and 2 g/L tryptone. Cells were then centrifuged (5000xg, 22°C, 5 min), washed twice with the media supplemented with 5 g/L glycolate, and resuspended to an optical density of ~0.05. Following 24 hours of incubation at 200 rpm and 30°C (5 mL in 25 mL Erlenmeyer flasks), cells were centrifuged (5000xg, 22°C), washed twice with media without any carbon source, and an appropriate volume was added to the two-strain system. The flasks containing both strains were further incubated at 200 rpm and 30°C. Samples were taken at various times for HPLC and cell growth analysis.
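
The stated equivalence between 3 mg of paraformaldehyde and 5 mM formaldehyde in 20 mL can be checked directly. The sketch below treats paraformaldehyde as pure (CH2O)n, ignoring end groups and any water content, which is an assumption; the constant and function names are illustrative.

```python
# Molar mass of one CH2O unit from standard atomic masses (C 12.011, H 1.008, O 15.999)
FORMALDEHYDE_G_PER_MOL = 12.011 + 2 * 1.008 + 15.999


def formaldehyde_equivalent_mM(mass_mg, volume_ml):
    """Formaldehyde-equivalent concentration if all paraformaldehyde depolymerizes."""
    if mass_mg < 0 or volume_ml <= 0:
        raise ValueError("mass must be non-negative and volume positive")
    mmol = mass_mg / FORMALDEHYDE_G_PER_MOL  # mg divided by g/mol gives mmol
    return mmol / volume_ml * 1000.0         # mmol/mL is mol/L; convert to mM


paraformaldehyde_mM = formaldehyde_equivalent_mM(mass_mg=3.0, volume_ml=20.0)
print(f"3 mg in 20 mL corresponds to about {paraformaldehyde_mM:.2f} mM formaldehyde")
```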

[0144] Colony forming units per mL of culture were used as the measurement of cell growth. Appropriate volumes of culture were diluted in the above described minimal media without any carbon source, and 50 μL of various dilutions were plated on minimal media plates containing 2.5 g/L glycolate. Following plate incubation at 37 °C, colonies were counted manually, aided by visualization using a blue-light transilluminator (Vernier, Beaverton, OR) to illuminate the eGFP expressing strain AC763.
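
Colony counts from such plates are typically converted to CFU/mL by scaling for the dilution and the plated volume. The following is a minimal sketch of that conversion with the 50 μL plating volume from above; the colony count and dilution factor in the example are hypothetical values chosen for illustration, not data from this study.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml=0.05):
    """CFU/mL of the undiluted culture from a single plate count.

    dilution_factor is the total fold dilution of the plated sample
    (for example 1e5 for a 10^-5 dilution).
    """
    if colonies < 0 or dilution_factor <= 0 or plated_volume_ml <= 0:
        raise ValueError("inputs must be non-negative counts and positive factors")
    return colonies * dilution_factor / plated_volume_ml


# Hypothetical example: 150 fluorescent colonies on a plate from a 10^-5 dilution
example_cfu = cfu_per_ml(colonies=150, dilution_factor=1e5)
print(f"{example_cfu:.2e} CFU/mL")
```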

[0145] Data Availability Statement All data supporting the findings of this study are included herein as well as the following public databases: MetaCyc (metacyc.org/); eQuilibrator (equilibrator.weizmann.ac.il/); Uniprot (www.uniprot.org/). Uniprot accession numbers for enzymes involved in the study are given in Table 2. All Uniprot sequences are incorporated by reference in their entirety as listed in Table 2.

[0146] Code Availability Statement The scripts used to perform the analyses in the study are found at github.com/ahc7/FORCE_manuscript.

Table 2: Host strains and plasmids used in this study. Uniprot accession numbers for heterologous enzymes used in this work are given in parenthesis.

Table 3: Modifications made to iML1515 to implement methylotrophic pathways. ‘L’ refers to the lower limit of reaction flux, ‘U’ refers to the upper limit of reaction flux, ‘B’ refers to both upper and lower limits of reaction flux.

FORCE-glycolate model

FORCE-acetate model

Table 4: Comparison of the properties of selected C1-utilization pathways with potential to enable methylotrophy (from methanol) to a representative C2 (acetyl-CoA) and C3 (pyruvate) metabolite. The pathways from this work used for the calculations were: to acetyl-CoA via acetaldehyde produced from diol dehydratase and to pyruvate via glycerate. Positive values indicate net production, while negative values indicate net consumption. Calculations assume the hydrolysis of ATP to AMP to be two equivalents of ATP to ADP hydrolysis. Carbon yield for methylotrophic pathways is based on moles of methanol in the C2 or C3 metabolite and thus is greater than 100% in cases where CO2 is also assimilated (i.e. serine pathway).

[0147] Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of skill in the art to which the disclosed invention belongs. Publications cited herein and the materials for which they are cited are specifically incorporated by reference.

[0148] Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.