Title:
METHOD FOR DETERMINING HETEROLOGOUS BIOSYNTHESIS PATHWAYS
Document Type and Number:
WIPO Patent Application WO/2017/134602
Kind Code:
A1
Abstract:
The present invention relates to a method and system for dynamically analyzing, determining, predicting and displaying ranked suitable heterologous biosynthesis pathways for a specified host. The present invention addresses the problem of finding suitable pathways for the endogenous metabolism of a host organism because the efficacy of heterologous biosynthesis is affected by competing endogenous pathways. The present invention is called MRE (Metabolic Route Explorer), and it was conceived and developed to systematically and dynamically search for, determine, analyze, and display promising heterologous pathways while considering competing endogenous reactions in a given host organism.

Inventors:
GAO XIN (SA)
KUWAHARA HIROYUKI (SA)
ALAZMI MESHARI SAUD (SA)
CUI XUEFENG (SA)
Application Number:
PCT/IB2017/050576
Publication Date:
August 10, 2017
Filing Date:
February 02, 2017
Assignee:
UNIV KING ABDULLAH SCI & TECH (SA)
International Classes:
G16B5/30; G16B35/00
Other References:
HIROYUKI KUWAHARA ET AL: "MRE: a web tool to suggest foreign enzymes for the biosynthesis pathway design with competing endogenous reactions in mind", NUCLEIC ACIDS RESEARCH, vol. 44, no. W1, 29 April 2016 (2016-04-29), pages W217 - W225, XP055364895, ISSN: 0305-1048, DOI: 10.1093/nar/gkw342
T. BLUM ET AL: "MetaRoute: fast search for relevant metabolic routes for interactive network navigation and visualization", BIOINFORMATICS., vol. 24, no. 18, 16 July 2008 (2008-07-16), GB, pages 2108 - 2109, XP055364897, ISSN: 1367-4803, DOI: 10.1093/bioinformatics/btn360
G. RODRIGO ET AL: "DESHARKY: automatic design of metabolic pathways for optimal cell growth", BIOINFORMATICS., vol. 24, no. 21, 1 November 2008 (2008-11-01), GB, pages 2554 - 2556, XP055364899, ISSN: 1367-4803, DOI: 10.1093/bioinformatics/btn471
PITKÄNEN ESA ET AL: "Inferring branching pathways in genome-scale metabolic networks", BMC SYSTEMS BIOLOGY, BIOMED CENTRAL LTD, LO, vol. 3, no. 1, 29 October 2009 (2009-10-29), pages 103, XP021064661, ISSN: 1752-0509, DOI: 10.1186/1752-0509-3-103
RODRIGO LIBERAL ET AL: "PathwayBooster: a tool to support the curation of metabolic pathways", BMC BIOINFORMATICS, BIOMED CENTRAL, LONDON, GB, vol. 16, no. 1, 15 March 2015 (2015-03-15), pages 86, XP021217658, ISSN: 1471-2105, DOI: 10.1186/S12859-014-0447-2
HOLLMAN P.C.; KATAN M.B.: "Bioavailability and health effects of dietary flavonols in man.", ARCH. TOXICOL., vol. 20, 1998, pages 237 - 248
Claims:
WHAT IS CLAIMED IS:

1. A method of determining and displaying suitable heterologous biosynthesis pathways to produce a target product in a specified host organism from a selected starting material, comprising the steps of: selecting and inputting user input data that includes designation of a host organism, a starting compound, and a desired target product into the pathway computer system; searching for one or more possible heterologous pathways with consideration of competing endogenous reactions in a given host organism; determining one or more suitable heterologous biosynthesis pathways from the starting compound to achieve the desired target product in light of consideration of competing endogenous reactions in a given host organism; dynamically ranking suitable heterologous biosynthesis pathways from the starting compound to achieve the desired target product by an endogenous pathway score; generating a graph of a predetermined number of most suitable heterologous biosynthesis pathways that shows reactions steps and metabolites, as well as competing endogenous reactions, displaying one or more suitable heterologous biosynthesis pathways from the starting compound to the target product, in which vertices on the graph represent metabolites and arrow edges on the graph represent chemical transformations via verified metabolic reactions; and, utilizing the data supplied in graphical display to select a preferred heterologous biosynthesis pathway for a specified chemical transformation.

2. The method of Claim 1 wherein said determining step includes consideration of thermodynamic criteria for the foreign reaction in view of competing native reactions.

3. The method of Claim 1 wherein said user input data selected and input may also include selecting the number of reactions per pathway or the number of pathways.

4. The method of Claim 1 wherein said user input data may also include designating the KEGG RPAIR constraints.

5. The method of Claim 1 wherein said user input data may also include selecting one or more compounds to exclude from consideration in the suitable pathway designation.

6. The method of Claim 1 wherein said determining step may include providing the user with possible suggestions on one or more heterologous enzymes that may increase the favorability of the reaction pathway.

7. The method of Claim 1 wherein said determining step may include consideration of thermodynamic criteria involved in one or more reactions.

8. The method of Claim 1 wherein dynamically ranking includes consideration of the integration of new reactions into the endogenous metabolic system.

9. The method of Claim 1 further comprises the step of: identifying and suggesting foreign enzymes that may be used to catalyze the desired foreign reactions to increase the efficiency in achieving the target end product.

10. The method of Claim 1 wherein the endogenous pathway score is the sum of all of the reaction weights in a given pathway for the reaction, the pathway from the source compound to the target product, the number of reactions that are native and foreign to the host organism and whether the reactions are endogenous to the host.

11. The method of Claim 1 wherein said predetermined number of suitable heterologous biosynthesis pathways on the generated graph is ten.

12. The method of Claim 1 wherein said predetermined number of suitable heterologous biosynthesis pathways on the generated graph is thirty.

13. The method of Claim 1 wherein the graphical display includes color coding of vertices and edges to indicate starting and ending compounds, and which reactions are native or foreign to the host organism.

14. The method of Claim 1 wherein the graphical display includes a variation of the width of the arrow edges to indicate the value of the Gibbs energy, or strength, of the reaction path.

15. The method of Claim 1 wherein the graphical display includes dynamically changing the display based on the placement of a computer pointer over nodes or edges of compound names or the reaction Gibbs energy.

16. The method of Claim 1 further comprising the step of: displaying a table of the specific reaction steps for the route with the reaction identification, reaction formula, whether the reaction is native to the host, the Gibbs energy of the reaction step, the native enzymes, potential foreign enzymes and data for the competing endogenous reactions.

17. The method of Claim 15 further comprising the step of: selecting one or more enzymes from the table to generate and display enzyme data relating to the selected enzyme.

18. A system for determining and displaying suitable heterologous biosynthesis pathways to produce a target product in a specified host organism from a selected starting material, comprising: an input terminal that receives selected user input data that includes designation of a host organism, a starting compound, and a desired target product into the pathway computer system; a computer engine that searches for one or more possible heterologous pathways with consideration of competing endogenous reactions in a given host organism, determines one or more suitable heterologous biosynthesis pathways from the starting compound to achieve the desired target product in light of consideration of competing endogenous reactions in a given host organism, dynamically ranking suitable heterologous biosynthesis pathways from the starting compound to achieve the desired target product by an endogenous pathway score; and generates a graph of a predetermined number of most suitable heterologous biosynthesis pathways that shows reactions steps and metabolites, as well as competing endogenous reactions, a display that displays one or more suitable heterologous biosynthesis pathways from the starting compound to the target product, in which vertices on the graph represent metabolites and arrow edges on the graph represent chemical transformations via verified metabolic reactions; said display of data supplied in graphical display being utilized by said user to select a preferred heterologous biosynthesis pathway for a specified chemical transformation.

19. The system of Claim 18 wherein said determining step includes consideration of thermodynamic criteria for the foreign reaction in view of competing native reactions.

20. The system of Claim 18 wherein said user input data selected and input may also include selecting the number of reactions per pathway or the number of pathways.

21. The system of Claim 18 wherein said user input data may also include designating the KEGG RPAIR constraints.

22. The system of Claim 18 wherein said user input data may also include selecting one or more compounds to exclude from consideration in the suitable pathway designation.

23. The system of Claim 18 wherein said determining step may include providing the user with possible suggestions on one or more heterologous enzymes that may increase the favorability of the reaction pathway.

24. The system of Claim 18 wherein said determining step may include consideration of thermodynamic criteria involved in one or more reactions.

25. The system of Claim 18 wherein dynamically ranking includes consideration of the integration of new reactions into the endogenous metabolic system.

26. The system of Claim 18 wherein said computer engine also identifies and suggests foreign enzymes that may be used to catalyze the desired foreign reactions to increase the efficiency in achieving the target end product.

27. The system of Claim 18 wherein the endogenous pathway score is the sum of all of the reaction weights in a given pathway for the reaction, the pathway from the source compound to the target product, the number of reactions that are native and foreign to the host organism and whether the reactions are endogenous to the host.

28. The system of Claim 18 wherein said predetermined number of suitable heterologous biosynthesis pathways on the generated graph is ten.

29. The system of Claim 18 wherein said predetermined number of suitable heterologous biosynthesis pathways on the generated graph is thirty.

30. The system of Claim 18 wherein the graphical display includes color coding of vertices and edges to indicate starting and ending compounds, and which reactions are native or foreign to the host organism.

31. The system of Claim 18 wherein the graphical display includes a variation of the width of the arrow edges to indicate the value of the Gibbs energy, or strength, of the reaction path.

32. The system of Claim 18 wherein the graphical display includes dynamically changing the display based on the placement of a computer pointer over nodes or edges of compound names or the reaction Gibbs energy.

33. The system of Claim 18 wherein the display shows a table of the specific reaction steps for the route with the reaction identification, reaction formula, whether the reaction is native to the host, the Gibbs energy of the reaction step, the native enzymes, potential foreign enzymes and data for the competing endogenous reactions.

34. The system of Claim 33 wherein the computer engine selects one or more enzymes from the table to generate and display enzyme data relating to the selected enzyme.

Description:
METHOD FOR DETERMINING HETEROLOGOUS BIOSYNTHESIS PATHWAYS

RELATED APPLICATION DATA

[0001] This application claims the benefit of U.S. Provisional Patent Application Serial No. 62/291,308 filed February 4, 2016.

TECHNICAL FIELD

[0002] The present invention relates to a method and system for analyzing, determining, predicting and displaying ranked suitable heterologous biosynthesis pathways for a specified host.

BACKGROUND OF THE INVENTION

[0003] High-value natural products can be constructed through biosynthesis using recent advances in genome editing and metabolic engineering. Known methods and systems for graphically displaying biosynthesis pathways for natural product construction, for the most part, simply provide for the display of a selection of certain data on a graphical user interface. These prior art graphical systems fail to account for essential analytical functionality of host parameters that is needed to accurately calculate biosynthesis pathways in a high speed system with accuracy and enhanced usability.

[0004] Many design decisions must be made when analyzing possible biosynthesis pathways for a natural product, but prior art graphical display programs do not adequately account for several key decisions, such as the problems associated with foreign gene introduction into a host organism and the suitability of pathways for the endogenous metabolism of a host organism. Specifically, one design decision that may be made for engineering of heterologous biosynthesis systems concerns the decision of which foreign metabolic genes to introduce into a given host organism. The introduction of foreign metabolic genes into the biosynthesis analysis is a decision that must be made based on multifaceted factors, such as the suitability of pathways for the endogenous metabolism of a host organism, in part because the efficacy of heterologous biosynthesis is affected by competing endogenous pathways.

[0005] Known graphical user display systems do not accurately calculate biosynthesis pathways considering this suitability of pathways for the endogenous metabolism of a host organism to maximize speed of the system with accuracy and enhanced usability, which means known systems are not as accurate as possible concerning the design decision of introduction of foreign metabolic genes into a given host organism.

[0006] For instance, several known graphical display systems do not allow the user to specify a host organism in the determination of pathways of construction for a natural product using biosynthesis, such as the graphical systems known as BNICE, PathPred and Metabolic tinker, which were developed to explore pathways irrespective of the consideration for host organisms.

[0007] Table 1a. Graphical Display Systems.

Tool: BNICE
Access: Closed access
Chassis: No host
Chemical transformation: Predicted reactions
Thermodynamic consideration: No
Ranking score: No pathway ranking
Information given for each pathway: 3-level EC number for each predicted chemical transformation
(Ref.): 7

Tool: PathPred
Access: Open access web server
Chassis: No host
Chemical transformation: Predicted reactions
Thermodynamic consideration: No
Ranking score: Chemical similarity
Information given for each pathway: Final compound of biodegradation, predicted intermediates and reactions, confidence for each predicted reaction

Tool: Metabolic tinker
Access: Open access web
Chassis: No host
Chemical transformation: RHEA reactions
Thermodynamic consideration: Directionality, favorability
Ranking score: Net favorability
Information given for each pathway: Possible reactions for each chemical transformation step and net favorability
(Ref.): 4

[0008] These graphical systems cannot assess the suitability of pathways in a specific context without appropriately considering the introduction of foreign metabolic genes into a given host organism or considering the endogenous metabolic system of a host organism. Therefore, these prior art graphical display systems fail to account for essential functionality that is needed to accurately calculate suitability of biosynthesis pathways in a high speed system with accuracy and enhanced usability.

[0009] Several other known graphical display systems do not adequately analyze the basis for the chemical transformation of intermediate precursors that form metabolic routes in a pathway display, such as the FMM, DESHARKY and Metabolic tinker display systems (in Table 1a, above), which specify chemical transformation using metabolic reaction sets from databases. Because they rely on metabolic reaction sets from databases, these display systems do not adequately consider the basis for chemical transformation of intermediate precursors that form metabolic routes.

[0010] Table 1b. Graphical Display Systems.

Tool: FMM
Access: Open access web server
Chassis: Many choices
Chemical transformation: KEGG reactions
Thermodynamic consideration: No
Ranking score: Number of reaction steps
Information given for each pathway: EC numbers for enzymes, availability of each enzyme in various host organisms, suggestion for foreign enzymes
(Ref.): 1

Tool: DESHARKY
Access: Free download
Chassis: E. coli
Chemical transformation: KEGG reactions
Thermodynamic consideration: No
Ranking score: Growth rate
Information given for each pathway: Source or target compound, EC numbers for enzymes, genes for some foreign enzymes, growth rate reduction measures

[0011] The above display systems do not adequately consider the basis for chemical transformation of intermediate precursors that form metabolic routes, but instead consider only reaction sets from databases. For that reason, these graphical display systems cannot adequately assess the suitability of pathways in a specific context without appropriately considering the introduction of foreign metabolic genes into a given host organism or considering the endogenous metabolic system of a host organism. Therefore, these prior art graphical display systems fail to account for essential functionality that is needed to accurately calculate suitability of biosynthesis pathways in a high speed system with accuracy and enhanced usability.

[0012] Other known graphical display systems do not adequately analyze the basis for the chemical transformation of intermediate precursors that form metabolic routes in a pathway display, such as graphical display systems that include BNICE (in Table 1a, above), PathPred (in Table 1a, above) and XTMS, which merely predict some generalized chemical transformation rules using such curated reaction sets and apply those generalized rules to expand potentially feasible metabolic routes.

[0013] Table 1c. Graphical Display Systems.

Tool: XTMS
Access: Open access web server
Chassis: E. coli
Chemical transformation: Predicted reactions
Thermodynamic consideration: Favorability
Ranking score: Gene scores, reaction steps, toxicity, yield, Gibbs energy
Information given for each pathway: Source compound for the retro synthesis path, predicted reactions with EC numbers, genes for foreign enzymes, toxicity, production yield
(Ref.): 3

[0014] Because these graphical systems only consider curated reaction sets and use generalized rules to expand on possible routes, these graphical display systems cannot adequately assess the suitability of pathways in a specific context without appropriately considering the introduction of foreign metabolic genes into a given host organism or considering the endogenous metabolic system of a host organism. Therefore, these prior art graphical display systems fail to account for essential functionality that is needed to accurately calculate suitability of biosynthesis pathways in a high speed system with accuracy and enhanced usability.

[0015] Other known graphical display systems do not adequately analyze the basis for the chemical transformation of intermediate precursors that form metabolic routes in a pathway display, such as Metabolic tinker (in Table 1a, above) and XTMS (in Table 1c, above), which use thermodynamic data to constrain the reaction directionality or to rank pathways based on their net favorability. These systems do not adequately consider competing endogenous reactions; and, therefore, these graphical display systems cannot adequately assess the suitability of pathways in a specific context without appropriately considering the introduction of foreign metabolic genes into a given host organism or considering the endogenous metabolic system of a host organism. Therefore, these prior art graphical display systems fail to account for essential functionality that is needed to accurately calculate suitability of biosynthesis pathways in a high speed system with accuracy and enhanced usability.

[0016] Some graphical display systems allow for the consideration of only one specific host organism in the analysis, such as the display systems that restrict the user to consider Escherichia coli as a host organism. Graphical display systems based on flux balance analysis (FBA), such as XTMS (in Table 1c, above), DESHARKY (in Table 1b, above), OptStrain and GEM-Path, are specific to the Escherichia coli chassis. While FBA-based tools tend to offer certain information to evaluate de novo pathways, these systems demand detailed knowledge of a given metabolic system with tight reaction-flux boundaries in order to identify meaningful steady-state flux distributions among a large number of candidate solutions.

[0017] Such detailed data are only available for well-studied organisms, and this may be a major reason why FBA-based tools focus exclusively on the pathway design in E. coli. Because these graphical display systems are restricted in the type of host organism to be evaluated, these graphical display systems cannot adequately assess the suitability of pathways in a specific context without appropriately considering the introduction of foreign metabolic genes into a given host organism or considering the endogenous metabolic system of a host organism. Therefore, these prior art graphical display systems fail to account for essential functionality that is needed to accurately calculate suitability of biosynthesis pathways in a high speed system with accuracy and enhanced usability.

[0018] Some other graphical display systems, such as FMM and PHT, allow the user to select a host organism from a large set of choices, but these graphical display systems do not use the chassis information to rank suitable biosynthesis pathways for a given endogenous metabolic system. Instead, the PHT display system just reports and displays which enzymes are not natively available in the host, and the FMM display system suggests the introduction of foreign enzymes for certain reactions in heterologous pathways.

[0019] Table 1d. Graphical Display Systems.

Tool: FMM
Access: Open access web server
Chassis: Many choices
Chemical transformation: KEGG reactions
Thermodynamic consideration: No
Ranking score: Number of reaction steps
Information given for each pathway: EC numbers for enzymes, availability of each enzyme in various host organisms, suggestion for foreign enzymes
(Ref.): 1

Tool: PHT
Access: Open access web server
Chassis: Many choices
Chemical transformation: KEGG reactions
Thermodynamic consideration: No
Ranking score: Number of reaction steps
Information given for each pathway: EC numbers for enzymes, local and global compound similarities for each reaction step
(Ref.): 2

[0020] Because these systems do not use the chassis information to rank suitable biosynthesis pathways for a given endogenous metabolic system, these graphical display systems cannot adequately assess the suitability of pathways in a specific context without appropriately considering the introduction of foreign metabolic genes into a given host organism or considering the endogenous metabolic system of a host organism. Therefore, these prior art graphical display systems fail to account for essential functionality that is needed to accurately calculate suitability of biosynthesis pathways in a high speed system with accuracy and enhanced usability.

[0021] Overall, known methods and systems for displaying biosynthesis pathways for natural products, for the most part, simply select and display data for disclosure on a graphical user interface, but these known systems do not accurately or adequately analyze pathways for biosynthesis by properly considering introduction of foreign metabolic genes into a given host organism or the endogenous metabolic system of a host organism. These known prior art display systems: (1) do not specify host organisms at all, or (2) do not analyze the basis for the chemical transformation of intermediate precursors that form metabolic routes in a pathway display, or (3) predict some generalized chemical transformation rules using such curated reaction sets and apply them to expand potentially feasible metabolic routes, or (4) restrict the user to one specific host organism, or use thermodynamic data to constrain the reaction directionality or to rank pathways based on their net favorability without considering competing endogenous reactions, or (5) do not use chassis information to rank suitable biosynthesis pathways for a given endogenous metabolic system.

[0022] All of these known graphical display systems cannot adequately assess the suitability of pathways in a specific context without appropriately considering the introduction of foreign metabolic genes into a given host organism or considering the endogenous metabolic system of a host organism. For the above reasons, these prior art graphical display systems fail to account for essential functionality that is needed to accurately calculate suitability of biosynthesis pathways in a high speed system with accuracy and enhanced usability.

SUMMARY OF THE INVENTION

[0023] The present invention relates to a method and system for analyzing, determining, predicting and displaying ranked suitable heterologous biosynthesis pathways for a specified host. The present invention addresses the problem of finding suitable pathways for the endogenous metabolism of a host organism because the efficacy of heterologous biosynthesis is affected by competing endogenous pathways. The present invention is called MRE (Metabolic Route Explorer), and it was conceived and developed to systematically and dynamically search for, determine, analyze, and display promising heterologous pathways while considering competing endogenous reactions in a given host organism.

Table 1e. Feature summary of present invention.

Tool: MRE
Access: Open access web server
Chassis: Many choices
Chemical transformation: Verified KEGG reactions
Thermodynamic consideration: Boltzmann factor
Ranking score: Fraction of conversions via normalized Boltzmann weights
Information given for each pathway: Required metabolites, EC numbers for enzymes, genes for foreign enzymes, reaction free energy, competing native reactions

[0024] Unlike known prior art display systems, the present invention Metabolic Route Explorer (MRE) disclosed herein focuses on the suggestion of foreign enzymes with well-characterized activities for promising heterologous pathways by taking into account the effects of the existing, endogenous metabolic infrastructure of a host organism. To find promising biosynthesis routes from a large number of potential candidates, thermodynamic data offer useful information. Unlike some other existing pathway display systems, such as Metabolic tinker and XTMS (which use thermodynamic data to constrain the reaction directionality or to rank pathways based on their net favorability, which does not consider competing endogenous reactions), the present invention MRE system uses thermodynamic data to rank pathways in a host-dependent manner from the perspective of the integration of new reactions into the endogenous metabolic system.
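
The host-dependent use of thermodynamic data can be illustrated with a minimal sketch. It assumes, going only by the feature summary entries for MRE ("Boltzmann factor" and "normalized Boltzmann weights"), that each reaction consuming a metabolite is given a factor exp(-ΔG/RT) and that a foreign reaction's weight is its share of the total over all reactions competing for that metabolite. The function name, the default temperature and the ΔG values are hypothetical and are not taken from this disclosure.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol*K)


def normalized_boltzmann_weight(foreign_delta_g, competing_delta_gs,
                                temperature_k=298.15):
    """Share of a metabolite assumed to flow through the foreign reaction.

    foreign_delta_g: Gibbs energy of the foreign reaction (kJ/mol).
    competing_delta_gs: Gibbs energies of native reactions that consume
        the same metabolite (kJ/mol). This weighting rule is an assumption.
    """
    rt = R_KJ_PER_MOL_K * temperature_k
    exponents = [-foreign_delta_g / rt]
    exponents += [-dg / rt for dg in competing_delta_gs]
    # Subtract the largest exponent before exponentiating to avoid overflow.
    shift = max(exponents)
    factors = [math.exp(e - shift) for e in exponents]
    return factors[0] / sum(factors)


# Hypothetical values: one foreign reaction and two native competitors.
weight = normalized_boltzmann_weight(-20.0, [-5.0, -12.0])
print(f"weight of foreign reaction: {weight:.3f}")
```

Under this assumption a foreign reaction with no native competitor has weight 1, and the weight shrinks as competing native reactions become more favorable.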

[0025] In order to suggest actual foreign enzymes for the design of heterologous biosynthesis pathways, the present invention MRE only considers verified reactions as metabolic parts. For each foreign reaction in a suggested heterologous pathway, the present invention MRE generates information about endogenous reactions competing for metabolites. Since one effective approach to increase the productivity is to attenuate or eliminate competing reactions, MRE also offers useful insights into how to debottleneck and optimize heterologous pathways.
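
A minimal sketch of how competing endogenous reactions might be collected for a foreign reaction follows. It assumes that a native reaction "competes" when it consumes at least one substrate of the foreign reaction; the `Reaction` record, the function name and the reaction and compound identifiers are hypothetical and are not part of this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    reaction_id: str
    substrates: frozenset
    products: frozenset
    native: bool  # True if the host already has an enzyme for this reaction


def competing_native_reactions(foreign, host_reactions):
    """Native reactions that share at least one substrate with `foreign`."""
    return [
        r for r in host_reactions
        if r.native
        and r.reaction_id != foreign.reaction_id
        and r.substrates & foreign.substrates
    ]


# Hypothetical host reaction set.
foreign = Reaction("R_foreign", frozenset({"C2"}), frozenset({"C6"}), False)
host = [
    Reaction("R_native_1", frozenset({"C2"}), frozenset({"C7"}), True),
    Reaction("R_native_2", frozenset({"C3"}), frozenset({"C4"}), True),
    foreign,
]
print([r.reaction_id for r in competing_native_reactions(foreign, host)])
```

A list of this kind is what a user would inspect when deciding which native reactions to attenuate or eliminate.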

[0026] To rationally design a productive heterologous biosynthesis system, it is essential to consider the suitability of foreign reactions for the specific endogenous metabolic infrastructure of a host. The present invention MRE has been developed, which, for a given pair of starting and desired compounds in a given chassis organism, dynamically ranks biosynthesis routes from the perspective of the integration of new reactions into the endogenous metabolic system.
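
The route search for a given pair of starting and desired compounds can be sketched as a bounded path enumeration over a graph whose vertices are metabolites and whose edges are reactions. The depth-first strategy, the function name and the toy network below are assumptions for illustration; this section does not state which search algorithm MRE uses.

```python
def enumerate_routes(graph, start, target, max_reactions):
    """List reaction-ID routes from `start` to `target`.

    graph: dict mapping a metabolite to a list of (reaction_id, product).
    max_reactions: upper bound on the number of reactions per route.
    Metabolites are not revisited within a single route.
    """
    routes = []

    def extend(metabolite, visited, route):
        if metabolite == target:
            routes.append(list(route))
            return
        if len(route) == max_reactions:
            return
        for reaction_id, product in graph.get(metabolite, []):
            if product not in visited:
                visited.add(product)
                route.append(reaction_id)
                extend(product, visited, route)
                route.pop()
                visited.remove(product)

    extend(start, {start}, [])
    return routes


# Hypothetical network with two routes from C1 to C6.
toy_graph = {
    "C1": [("R1", "C2"), ("R2", "C3")],
    "C2": [("R3", "C6")],
    "C3": [("R4", "C4")],
    "C4": [("R5", "C6")],
}
print(enumerate_routes(toy_graph, "C1", "C6", max_reactions=3))
```

The `max_reactions` bound plays the role of the user-selected number of reactions per pathway; candidate routes found this way would then be passed to a ranking step.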

[0027] The present invention is more than a mere "a mathematical algorithm," "a fundamental economic or longstanding commercial practice," or "a challenge in business." The present invention is a method and system that more accurately, more comprehensively, more systematically and dynamically searches for, determines, analyzes, and displays promising heterologous pathways in the field of natural product construction while considering competing endogenous reactions in a given host organism. The claimed invention has a specific, structured graphical user interface paired with the above prescribed functionality that directly relates to the graphical user interface's structure, which resolves identified problems in the prior art display systems.

[0028] For instance, the present invention pairs its graphical user interface with its analysis programming to reduce the time for searching, analysis, and dynamic determination and display of suitable biosynthesis pathways over known prior art display systems, and the present invention achieves more accurate predictions of suitable biosynthesis pathways by adequately assessing the suitability of pathways in a specific context, appropriately considering the introduction of foreign metabolic genes into a given host organism, and appropriately considering the endogenous metabolic system of a host organism. The combination of these attributes in the present invention allows researchers to more efficiently and accurately search for, determine, analyze, and display promising heterologous pathways while considering competing endogenous reactions in a given host organism.

[0029] The use of an endogenous pathway score (calculated based on one or more of the reaction weights in a given pathway for the reaction, the route from the source compound to the target product, the number of reactions that are native and foreign to the host organism and whether the reactions are endogenous to the host), specific context factors, host organism factors, and endogenous metabolic system factors are inventive concepts in the context of the present system, which allows the present invention to decrease the design cycle time-periods over known display systems by eliminating erroneous, flawed or unsuitable pathways from the display and consideration in the biosynthesis efforts. For the above reasons, the present invention is a graphical display system that properly accounts for essential factors in the biosynthesis analysis to more accurately calculate suitability of biosynthesis pathways in a high speed system with greater accuracy, enhanced usability, and dynamic displays.
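
A minimal sketch of ranking by a pathway score follows. The paragraph above says the score is calculated from the reaction weights in a pathway, among other factors, but this section gives no formula. The sketch therefore assumes that each reaction weight lies in (0, 1], that the score is the sum of -ln(weight) over the reactions, and that a lower score is better. The function names and the example weights are hypothetical.

```python
import math


def pathway_score(reaction_weights):
    """Additive cost of a pathway; assumes every weight is in (0, 1]."""
    return sum(-math.log(w) for w in reaction_weights)


def rank_pathways(pathways, top_n=10):
    """Return the names of the `top_n` lowest-cost pathways, best first.

    pathways: dict mapping a pathway name to its list of reaction weights.
    """
    scored = sorted((pathway_score(ws), name) for name, ws in pathways.items())
    return [name for _, name in scored[:top_n]]


# Hypothetical candidate pathways and per-reaction weights.
candidates = {
    "route_A": [0.9, 0.8],
    "route_B": [0.5],
    "route_C": [0.99, 0.99, 0.99],
}
print(rank_pathways(candidates, top_n=2))
```

With a cost of this form, a long route made of nearly uncontested reactions can outrank a short route whose single foreign step faces strong native competition.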

BRIEF DESCRIPTION OF THE DRAWINGS

[0030] The above, and other objects and advantages of the present invention will be understood upon consideration of the following detailed description taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:

[0031] FIG. 1A is the present invention Metabolic Route Explorer (MRE) user-interface query input page showing the "auto-completion" field of product;

[0032] FIG. 1B is the present invention Metabolic Route Explorer (MRE) user-interface query input page displaying fields available with "advanced options" checked;

[0033] FIG. 2 is the present invention Metabolic Route Explorer (MRE) user-interface summary page for the top-ranked biosynthesis routes;

[0034] FIG. 3 is the present invention Metabolic Route Explorer (MRE) user-interface page showing a graph of the top-ranked biosynthesis routes;

[0035] FIG. 4 is the present invention Metabolic Route Explorer (MRE) user-interface page showing a graph of the pathway-level information for a specified biosynthesis route;

[0036] FIG. 5 is the present invention Metabolic Route Explorer (MRE) user-interface page showing competing reactions for a specified biosynthesis route;

[0037] FIG. 6 is the present invention workflow diagram of the Metabolic Route Explorer (MRE);

[0038] FIG. 7 is the present invention display of an illustration of a thermodynamic favorability-based weighting scheme;

[0039] FIG. 8 is the present invention display of an illustration of a competition-based weighting scheme for the same reaction in FIG. 7;

[0040] FIG. 9 is the present invention display of a simplified metabolic network showing biosynthesis routes from the C1 starting compound to the C6 product;

[0041] FIG. 10 is the present invention display of an illustration of the difference in ranking outcomes for a thermodynamic favorability-based approach and a competition-based approach;

[0042] FIG. 11 is the present invention display of a graph demonstrating the computational performance of MRE for various settings;

[0043] FIG. 12 is the present invention display of the structure of an experimentally derived heterologous biosynthetic pathway for producing naringenin from L-tyrosine in an E. coli host;

[0044] FIG. 13 is the present invention Metabolic Route Explorer (MRE) user-interface page showing a chart of the reactions for the top-ranked biosynthesis pathway shown in FIG. 12;

[0045] FIG. 14 is the present invention display of a pathway-level graph generated in MRE of the top-ranked pathway for the production of 1,3-PDO from glycerol in E. coli;

[0046] FIG. 15 is the present invention display of a pathway-level graph generated in MRE of the top-ranked pathway for the production of 1,3-PDO from glycerol in yeast;

[0047] FIG. 16 is the present invention display of a pathway-level graph generated in MRE of a known route for the production of artemisinic acid from acetyl-CoA in yeast; and

[0048] FIG. 17 is the present invention display of a pathway-level graph generated in MRE of the top-ranked pathway for the production of artemisinic acid from acetyl-CoA in yeast.

[0049] While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and are herein described in detail. It should be understood that the description herein of specific embodiments is not intended to limit the invention to the particular forms disclosed, but on the contrary, the intention is meant to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the appended claims.

DETAILED DESCRIPTION

[0050] The present invention is a method and system of determining heterologous biosynthesis pathways in a specified host, which not only takes into account thermodynamic criteria for the desired reaction, but also considers the effect of competing endogenous reactions and suggests heterologous enzymes that may increase the favorability of the reaction route. Put another way, the present invention relates to a method and system for analyzing, determining, predicting and displaying ranked suitable heterologous biosynthesis pathways for a specified host.

[0051] The present invention addresses the problem of finding suitable pathways for the endogenous metabolism of a host organism because the efficacy of heterologous biosynthesis is affected by competing endogenous pathways. The present invention is called MRE (Metabolic Route Explorer), and it was conceived and developed to systematically and dynamically search for, determine, analyze, and display promising heterologous pathways while considering competing endogenous reactions in a given host organism and to suggest possible foreign enzymes that may be suitable for use in the reactions. To rationally design a productive heterologous biosynthesis system, it is essential to consider the suitability of foreign reactions for the specific endogenous metabolic infrastructure of a host. For a given pair of starting and desired compounds in a given chassis organism, the present invention MRE dynamically ranks biosynthesis routes from the perspective of the integration of new reactions into the endogenous metabolic system.

[0052] To explore biosynthesis routes with MRE, the user specifies a host organism and a pair of the starting and target compounds. To increase its usability and to help the user specify organisms and compounds, MRE comes with an auto-completion feature. With advanced options, the user can override the default setting for the metabolic route search. These options include the maximum number of reaction steps (denoted by n), the number of top-ranked pathways to generate (denoted by K), and a list of compounds that are not considered as primary metabolic precursors in the search, called the exclusion list. By default, n and K are set to 8 and 50, respectively, while the exclusion list has 101 compounds that have high degrees of connectivity in its metabolic network graph, for example, water, ATP and ADP. This exclusion list can also be customized to have other compounds (e.g., CO2). In addition, MRE allows the user to constrain the chemical transformation of precursors based on RPAIR types (e.g., main, cofac and trans). These filtering schemes to constrain possible chemical transformations were reported to increase the relevance of the de novo biosynthesis route suggestion. By default, MRE considers chemical transformations based on main, cofac and trans RPAIR types.

[0053] For each promising heterologous biosynthesis pathway, the present invention MRE suggests actual enzymes for foreign metabolic reactions and dynamically generates information on competing endogenous reactions for the consumption of metabolites. These unique, chassis-centered features distinguish the present invention MRE from existing display systems and allow synthetic biologists to dynamically evaluate the design of their biosynthesis systems from a different perspective. As disclosed herein, the present invention MRE (Metabolic Route Explorer) was developed to systematically search for promising heterologous pathways by considering competing endogenous reactions in a given host organism. Using the biosynthesis of a range of high-value natural products as a case study, the present invention MRE has been shown to be an effective tool to guide the design and optimization of heterologous biosynthesis pathways.

[0054] The present invention is a novel method and system for determining heterologous biosynthesis pathways to achieve a desired product from a specified host organism considering the suitability of foreign reactions for the specific endogenous metabolic infrastructure of the specified host organism and suggestions of foreign enzymes needed for the reactions using a competition-based weighting approach to determine the top-ranked biosynthesis routes. The present invention has a host-independent metabolic network constructed from databases containing verified metabolic reactions. Weights are assigned to the reactions in host dependent fashion by classifying which enzymatic reactions are native and foreign in the given host organism, by using thermodynamic data, and by identifying competing endogenous reactions. The host-independent metabolic network and weight data are used to construct a metabolic network with host-dependent weights.

[0055] The present invention is dynamic and versatile in that it allows user input to select a host organism, source and target compounds, and search options. The present invention Metabolic Route Explorer exhaustively explores and ranks biosynthesis routes for the selected criteria. The present invention MRE generates ranked biosynthesis routes, genes for foreign enzymes and competing native reactions. The results are displayed in summary tables and reaction pathway graphs with links to more detailed graphs and tables with more specific reaction details including reaction formulas, native and foreign enzymes and competing native reactions.

[0056] The present invention is a computer program based method and system for determining heterologous biosynthesis pathways to produce a target product in a host organism from a selected starting material. The method and system are characterized as follows:

a. First, a user inputs data, including a host organism, a starting compound, and a desired product. A user may also select other criteria such as the number of reactions per route or the number of routes, KEGG RPAIR constraints, or additional compounds to exclude in the search.

b. Second, a summary of pathways is generated by the Metabolic Route Explorer that ranks the pathways by score and displays, for each pathway, the pathway score (the sum of all of the reaction weights in the pathway), the route from the source compound to the target product, the number of reactions that are native and foreign to the host organism, and whether the reactions are endogenous to the host.

c. Next, a graph is generated consisting of the top ten or top thirty routes from the starting compound to the target product, in which vertices represent metabolites and edges represent chemical transformations via verified metabolic reactions.

i. Color coding of vertices and edges indicates starting and ending compounds, and which reactions are native or foreign to the host organism, and the width of the arrows (edges) indicates the value of the Gibbs energy, or strength, of the reaction path.

ii. Hovering a cursor over nodes or edges will dynamically display compound names or the reaction Gibbs energy, respectively.

d. From the summary table, a specific route may also be selected and a graph is generated showing the specified route, which indicates the pathway from starting compound to target product.

i. The graph shows the reaction steps and metabolites, as well as competing endogenous reactions.

ii. With this graph, a table of the specific reaction steps for the route is displayed with the reaction identification, reaction formula, whether the reaction is native to the host, the Gibbs energy of the reaction step, the native enzymes, potential foreign enzymes and data for the competing endogenous reactions.

iii. From this page, a competing reaction can be selected and specific pathway details will be displayed. Selecting an enzyme from the table will generate a display of enzyme data.

[0057] The present invention is more than a mere "a mathematical algorithm," "a fundamental economic or longstanding commercial practice," or "a challenge in business." Utilizing the data supplied in the tables and graphs allows a user to select the best path for a specified chemical transformation which is inclusive of thermodynamic criteria for the foreign reaction in view of competing native reactions, and MRE suggests foreign enzymes that may be used to catalyze the desired foreign reactions to increase the desired end product.

[0058] The present invention is a method and system that more accurately, more comprehensively, more systematically and dynamically searches for, determines, analyzes, and displays promising heterologous pathways in the field of natural product construction while considering competing endogenous reactions in a given host organism. The claimed invention has a specific, structured graphical user interface paired with the above prescribed functionality that is directly related to the graphical user interface's structure, which resolves identified problems in the prior art display systems.

[0059] For instance, the present invention pairs its graphical user interface with its analysis programming to reduce the time for searching, analysis, and dynamic determination and display of suitable biosynthesis pathways over known prior art display systems, and the present invention achieves more accurate predictions of suitable biosynthesis pathways by adequately assessing the suitability of pathways in a specific context, appropriately considering the introduction of foreign metabolic genes into a given host organism, and appropriately considering the endogenous metabolic system of a host organism. The combination of these attributes in the present invention allows researchers to more efficiently and accurately search for, determine, analyze, and display promising heterologous pathways while considering competing endogenous reactions in a given host organism.

[0060] The use of endogenous pathway score (calculated based on one or more of the reaction weights in a given pathway for the reaction, the route from the source compound to the target product, the number of reactions that are native and foreign to the host organism and whether the reactions are endogenous to the host), specific context factors, host organism factors, and endogenous metabolic system factors are inventive concepts in the context of the present system, which allows the present invention to decrease the design cycle time periods over known display systems by eliminating erroneous, flawed or unsuitable pathways from the display and consideration in the biosynthesis efforts. For the above reasons, the present invention is a graphical display system that properly accounts for essential factors in the biosynthesis analysis to more accurately calculate suitability of biosynthesis pathways in a high speed system with greater accuracy, enhanced usability, and dynamic displays.

[0061] FIGS. 1-5 are typical user interface pages of MRE. FIG. 1A is a query input page (100) where a user inputs a host organism, a starting material and a desired product. In the host organism input field (101), a host organism is entered, either by name (e.g. Escherichia coli K-12 MG1655) or by KEGG organism code (e.g. ECO). A starting material is entered in the starting material input field (102) by either KEGG compound ID (e.g. C00031) or name (e.g. D-glucose or grape sugar). The desired product is entered in the desired product input field (103) by either KEGG compound ID (e.g. C00022) or by name (e.g. pyruvate or 2-oxopropanoate). MRE also provides an auto-completion feature. For example, inputting the desired product name generates a drop down list (104) of possible product matches from which the desired product may be selected. When this information is entered, MRE generates a summary page for the top-ranked biosynthesis routes.

[0062] FIG. 1B is the query input page (100) with the advanced options feature (105) selected. A user may customize pathway details that will be displayed by specifying ranges in the option fields. The maximum number of reactions in each route (105) can be selected. The default setting for the number of reactions is n = 8; however, a user can specify up to 20 reactions for a route. The maximum number of biosynthesis routes (106) can be selected. The default setting for the number of routes is K = 50; however, a user can specify up to 500 routes. The KEGG RPAIR (reaction-pair) constraints (108) can be selected. The MRE default setting includes Main, Cofac and Trans; however, a user can remove any of these options or add the Leave or Ligase options. Additional compounds, such as CO2, can be added to the exclusion list in the excluded compounds field (109).
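The search options described above can be pictured as a small validated record. The following is a minimal sketch only: the class name, field names and the lower bound of 1 are assumptions made for illustration, while the defaults (n = 8, K = 50, Main/Cofac/Trans) and the upper limits (20 reactions, 500 routes) are taken from the description.

```python
from dataclasses import dataclass

# RPAIR types named in the description; lowercase spelling is an assumption.
DEFAULT_RPAIR_TYPES = frozenset({"main", "cofac", "trans"})
ALLOWED_RPAIR_TYPES = frozenset({"main", "cofac", "trans", "leave", "ligase"})


@dataclass
class QueryOptions:
    """Hypothetical container for one MRE-style query (names are illustrative)."""
    host: str                                  # host organism name or KEGG code
    source: str                                # KEGG compound ID of the starting material
    target: str                                # KEGG compound ID of the desired product
    max_steps: int = 8                         # n: maximum reactions per route
    max_routes: int = 50                       # K: number of top-ranked routes
    rpair_types: frozenset = DEFAULT_RPAIR_TYPES
    extra_excluded: frozenset = frozenset()    # compounds added to the exclusion list

    def __post_init__(self):
        # Upper limits follow the text; the lower bound of 1 is assumed.
        if not 1 <= self.max_steps <= 20:
            raise ValueError("max_steps (n) must be between 1 and 20")
        if not 1 <= self.max_routes <= 500:
            raise ValueError("max_routes (K) must be between 1 and 500")
        unknown = set(self.rpair_types) - ALLOWED_RPAIR_TYPES
        if unknown:
            raise ValueError(f"unknown RPAIR types: {sorted(unknown)}")
```

A query for pyruvate (C00022) from D-glucose (C00031) with default settings would then be written as QueryOptions(host="ECO", source="C00031", target="C00022").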

[0063] Based on the input query for biosynthesis requirements in FIG. 1, MRE generates the top-K metabolic routes, and the main result page (FIG. 2) summarizes these routes. For each metabolic route, MRE highlights whether it is endogenous or heterologous to the host organism. For each foreign reaction in a heterologous biosynthesis route, MRE predicts which metabolites may not be available in the host, lists exogenous genes for the corresponding enzymatic activity, and suggests a list of foreign genes, based on a taxonomic similarity measure, whose cDNA sequences can be downloaded in the FASTA format. It also shows a list of native reactions competing for a metabolic precursor with each foreign enzymatic reaction. MRE provides a means to visualize a specific pathway with competing endogenous reactions as well as a graph aggregating top-ranked routes.

[0064] Unlike known prior art display systems, the present invention Metabolic Route Explorer (MRE) disclosed herein focuses on the suggestion of foreign enzymes with well-characterized activities for promising heterologous pathways by taking into account the effects of the existing, endogenous metabolic infrastructure of a host organism. To find promising biosynthesis routes from a large number of potential candidates, thermodynamic data offer useful information. Unlike some other existing pathway display systems, such as Metabolic Tinker and XTMS (which use thermodynamic data to constrain the reaction directionality or to rank pathways based on their net favorability, and which do not consider competing endogenous reactions), the present invention MRE system uses thermodynamic data to rank pathways in a host-dependent manner from the perspective of the integration of new reactions into the endogenous metabolic system. In order to suggest actual foreign enzymes for the design of heterologous biosynthesis pathways, the present invention MRE only considers verified reactions as metabolic parts. For each foreign reaction in a suggested heterologous pathway, the present invention MRE generates information about endogenous reactions competing for metabolites. Since one effective approach to increase the productivity is to attenuate or eliminate competing reactions, MRE also offers useful insights into how to debottleneck and optimize heterologous pathways.

[0065] FIG. 2 is an example of a summary page (200) showing a table representation of the ten top-ranked biosynthesis routes for production of naringenin (KEGG compound ID C00509) from L-tyrosine (KEGG compound ID C00082) in an E. coli host organism. The summary page will typically display 50 routes (the default setting for K) unless a different number of routes has been selected in the user-input page. In the example table (200), only 10 of the routes are shown. The No. column (201) in the table shows the ranking of the pathway. The Score column (202) shows the score of the pathway, wherein the higher the score, the better the route. The score is based on a predetermined calculation, such as, for example, the sum of all reaction weights in a given pathway taking into consideration the thermodynamic criteria of the reactions in the route as well as the effect of endogenous competing reactions.

[0066] The Route column (203) shows the steps in the specified metabolic route from the starting material to the target compounds by KEGG compound ID. Alternatively, a user can choose to view the compounds by name instead of ID numbers. The Reactions column (204) shows the number of reactions in the route, indicating how many of the steps in the pathway are reactions native to the host organism (first number) and how many are foreign reactions for the host organism (second number). Column 205 shows whether the reaction pathway is natively present in the user-specified host. The ECO column heading in the example table specifies the host is E. coli. "Yes" indicates that all the reactions exist in the user-specified host and "No" indicates that they do not. In the example table, the reactions listed are not native to the host organism.

[0067] FIG. 3 is an example of a graphical representation page (300) of the top-ten ranked biosynthesis routes seen in the FIG. 2 summary page showing the pathway steps for production of naringenin (KEGG compound ID C00509) from L-tyrosine (KEGG compound ID C00082) in an E. coli host organism. Alternatively, a user can choose to view a graph of the top 30 routes. The displayed graphs are scalable by the user to allow visualization of graph details.

[0068] The example graph shows the L-tyrosine starting compound (301) as an oval with the KEGG compound ID and shows the naringenin desired product (302) as an oval with the KEGG compound ID. Alternatively, a user can choose to view the graph with compound names instead of ID numbers. By hovering over a selected compound, a pop-up box (303) displaying the selected compound by common name, chemical structure and KEGG compound ID can be viewed. In the example, naringenin is displayed in the pop-up box (303). Metabolites are shown by KEGG compound ID in ovals designated 304a-304q along the reaction pathways.

[0069] Reactions in the pathway are shown as arrows or edges in the graph wherein the arrows indicate the direction of the reaction (i.e., the reactants and the products). The width of the arrow indicates the value of the Gibbs energy for the reaction, wherein the stronger the reaction, the wider the arrow will be. In the example, the foreign reaction designated 305e has a thicker arrow than the foreign reaction designated 305d, indicating that 305e is the stronger reaction. Hovering the cursor over a reaction pathway will display the reaction compounds and the reaction's Gibbs energy. Foreign reactions are shown by KEGG reaction IDs along arrows 305a-305y in the example graph. Native reactions are shown by KEGG reaction IDs along arrows 306a-306d in the example graph.

[0070] A user viewing the page would see the compounds and reaction paths in color, for example, the starting compound (301), the desired product (302) and the metabolites (304a-304q) would be seen in red, green and yellow, respectively, and the foreign (305a-305y) and native reaction (306a-306d) arrows would be seen in cyan and purple, respectively. This allows a user to quickly identify the reaction pathways and whether the pathways are native or foreign to the host organism. For instance, in the example graph, cyan colored arrows (305a, 305b and 305d) would indicate that all three of the reactions beginning with the starting compound (301) are foreign reactions to the E. coli host organism.

[0071] FIG. 4 is an example of a pathway-level page (400) showing the graph for only the top-ranked biosynthesis route shown on the FIG. 2 summary page for the production of naringenin (KEGG compound ID C00509) from L-tyrosine (KEGG compound ID C00082) in an E. coli host organism. This page shows detailed information about each reaction step of a given biosynthesis pathway. This pathway information is displayed in a table and on a graph in this page. This page can be viewed in terms of KEGG IDs or names, and this display choice can be changed with a click on the link shown on top of the page. For each reaction, the table shows the reactants, the products, whether the reaction is endogenous, the standard reaction Gibbs energy, and the corresponding EC number (if it is an enzymatic reaction). If the reaction is based on a heterologous enzyme, a list of potential enzymes and a list of native competing reactions are also shown (with links to more detailed pages) in the table. Moreover, all potential cDNA gene sequences for the pathway are available for download in FASTA format.

[0072] The selected reaction pathway from the starting compound (401) to the desired product (402) proceeds along the reaction pathway arrows (408a-408h) and includes the KEGG reaction IDs for each reaction (405a-405d and 407a-407f) and the KEGG compound IDs for compounds that are utilized or produced by the reactions (404a-404j). The pop-up box (403), which can be viewed by hovering the cursor over a compound, shows the common name, chemical structure and KEGG compound ID for an intermediate compound in the reaction pathway. For this pathway, the reactions seen along route 408a-408h are foreign reactions (405a-405d). An important additional piece of information on the detailed graph in FIG. 4 is the inclusion of competing reactions (407a-407f), which can impede the progress of the reaction.

[0073] At the top of FIG. 4, the pathway details for the route are broken down in table format. The Reaction ID column (409) displays the KEGG reaction ID for the step as well as the KEGG compound ID for the primary reactant in that step. The Formula column (410) displays the primary reactants and products in the step by KEGG compound ID. Column 411 shows whether the reaction pathway is natively present in the user-specified host. The ECO column heading in the example table specifies the host is E. coli. "Yes" indicates that the reaction exists in the user-specified host and "No" indicates that it does not. In the example table, the reactions listed are not native to the host organism. The Gibbs energy column (412) displays the energy associated with the reaction, which indicates the favorability of the reaction based on thermodynamic criteria. The EC# column (413) displays the Enzyme Commission number for the enzyme that catalyzes the step and the Potential Enzyme column (414) suggests foreign enzymes that may be utilized in the reaction. The Competing Reaction column (415) displays competing reactions by KEGG reaction ID and the Gibbs energy associated with each competing reaction.

[0074] A user viewing the page would see the compounds and reaction paths in color. For example, the desired path is shown with blue arrows (408a-408h). The starting compound (401), the desired product (402) and the metabolites (404a-404j) would be seen in red, green and yellow, respectively, and the foreign (405a-405d) reaction boxes would be seen in cyan. Competing endogenous reactions (407a-407f) are shown as gray boxes. This allows a user to quickly identify the reaction pathways and whether the reactions are native, foreign or competing reactions for the host organism. For instance, in the example graph, gray boxes 407a-407f would indicate that there are six competing reactions on this route.

[0075] Unlike known prior art display systems, the present invention Metabolic Route Explorer (MRE) disclosed herein focuses on the suggestion of foreign enzymes with well-characterized activities for promising heterologous pathways by taking into account the effects of the existing, endogenous metabolic infrastructure of a host organism. To find promising biosynthesis routes from a large number of potential candidates, thermodynamic data offer useful information. Unlike some other existing pathway display systems, such as Metabolic Tinker and XTMS (which use thermodynamic data to constrain the reaction directionality or to rank pathways based on their net favorability, and which do not consider competing endogenous reactions), the present invention MRE system uses thermodynamic data to rank pathways in a host-dependent manner from the perspective of the integration of new reactions into the endogenous metabolic system. In order to suggest actual foreign enzymes for the design of heterologous biosynthesis pathways, the present invention MRE only considers verified reactions as metabolic parts. For each foreign reaction in a suggested heterologous pathway, the present invention MRE generates information about endogenous reactions competing for metabolites. Since one effective approach to increase the productivity is to attenuate or eliminate competing reactions, MRE also offers useful insights into how to debottleneck and optimize heterologous pathways.

[0076] FIG. 5 is an example of a competing reaction information page (500) displaying the details for the competing endogenous reactions associated with compound 404d in FIG. 4. Compound 404d is a metabolite in the pathway from reaction 405c to reaction 405b and three endogenous reactions in the host organism (407a, 407b, 407c) are competing with the desired pathway for that compound. The reaction column (501) displays the specifics of the three competing reactions. The Gibbs energy column (502) displays the standard reaction Gibbs energy associated with that competing reaction and the Enzyme gene ID column (503) displays the Enzyme Gene ID for the enzyme in the host organism associated with the competing reaction.

[0077] As the display pages represented in FIGS. 1-5 progress, more specific reaction data is provided. FIGS. 1A and 1B are the user input and FIG. 2 shows the top ten routes resulting from that input. FIG. 3 is a graphical representation of the top ten routes displayed in FIG. 2. FIG. 4 is a graphical representation of one selected route from FIG. 3 and provides specific details of the reaction mechanism for that pathway, including any known competing endogenous reactions for each step. The FIG. 5 table displays expanded details of the competing reactions for a given step. MRE also provides self-explanatory pages showing detailed information on compounds, reactions and EC numbers. These pages can be accessed by clicking the internal links on the compound IDs, the reaction IDs and the EC numbers. Moreover, external links to other databases are also provided on these pages as alternatives.

[0078] FIG. 6 depicts the workflow of MRE (600). Metabolic reaction data (601) from several data sources, including but not limited to the Kyoto Encyclopedia of Genes and Genomes (KEGG), the ExPASy ENZYME database, and the eQuilibrator dataset, are compiled. KEGG lists around 4000 organisms, which MRE uses for the selection of a host organism. The KEGG COMPOUND database is used to identify metabolites, while the KEGG REACTION database and the ExPASy ENZYME database are used to find metabolic reactions with verified activities. The eQuilibrator dataset is used to obtain the reaction Gibbs energy in the standard 1 M concentration setting. The KEGG RPAIR database is used to restrict the search space based on the relation between reactants and products. The KEGG GENES database is used for DNA sequence data for enzymatic genes, and the KEGG taxonomy mapping dataset is used to calculate taxonomic distances.

[0079] As seen in FIG. 6, MRE first constructs a directed graph representing a host-independent metabolic network with verified reactions (603). This graph (603) comprises all metabolic reactions with verified activities found in the data source, and it is built regardless of the choice of a host organism for a biosynthesis system. It next assigns weights to the edges in the graph in a host-dependent fashion by classifying which enzymatic reactions are native and foreign in the given host organism and by using the thermodynamic data.

[0080] From the metabolic reaction data compiled from the data sources (601), MRE constructs a host-independent metabolic network with verified reactions (602) by first identifying reactions with verified activities. Enzymatic reactions are categorized based on Enzyme Commission numbers (EC numbers). Each EC reaction (i.e., a reaction class corresponding to each EC number) denotes a class of catalytic reactions with the same chemical transformation. To retrieve verified metabolic reactions with known enzymes, reaction classes with partially qualified EC numbers are filtered out as these partial EC reactions are unverified and can lead to misinterpretation of enzymatic activities. EC reactions that do not contain any enzymes are also removed. With this filtering process, 5389 complete EC reactions and 76 spontaneous reactions with verified activities were identified.
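The filtering step described above can be illustrated with a short sketch. It assumes that a partially qualified EC number is one in which any of the four fields is not a plain integer (for example "1.1.1.-"), and it uses a hypothetical mapping from EC number to a list of known enzymes; neither the function names nor the input layout come from the source.

```python
import re

# A fully qualified EC number has four numeric fields, e.g. "1.1.1.1".
# Anything else (e.g. "1.1.1.-") is treated as partially qualified here;
# this rule is an assumption made for the sketch.
_COMPLETE_EC = re.compile(r"^\d+\.\d+\.\d+\.\d+$")


def is_complete_ec(ec_number):
    """Return True when all four EC fields are numeric."""
    return bool(_COMPLETE_EC.match(ec_number))


def filter_verified(ec_to_enzymes):
    """Keep EC classes with a complete EC number and at least one enzyme.

    ec_to_enzymes: dict mapping an EC number string to a list of enzyme
    identifiers (hypothetical input layout).
    """
    return {
        ec: enzymes
        for ec, enzymes in ec_to_enzymes.items()
        if is_complete_ec(ec) and enzymes
    }
```

With this rule, an entry such as "1.1.1.-" is dropped because it is partial, and an entry with an empty enzyme list is dropped because it contains no enzymes.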

[0081] Next, the standard reaction Gibbs energy ΔrG'° is estimated for each of these verified reactions using eQuilibrator with the absolute temperature set to 298.15 K. Each verified EC reaction is then split into two reactions: the forward reaction with the reaction Gibbs energy ΔrG'° and the backward reaction with the reaction Gibbs energy -ΔrG'°. Those EC reactions whose ΔrG'° could not be estimated were assigned the largest of the estimated values for both directions. This conservative approach is used to avoid the suggestion of biosynthesis routes containing reactions with no thermodynamic information as much as possible.
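The splitting of each verified reaction into a forward and a backward reaction can be sketched as follows. The sketch reads "the largest of the estimated values" as the largest magnitude among all available estimates, which is the largest value once both directions are taken into account; that reading, the function name and the input layout are assumptions for illustration.

```python
def split_directions(estimates):
    """Split each reaction into forward and backward directed reactions.

    estimates: dict mapping a reaction ID to its estimated standard reaction
    Gibbs energy in kJ/mol, or None when no estimate is available.
    Returns a dict mapping (reaction_id, direction) to a Gibbs energy.
    """
    known = [dg for dg in estimates.values() if dg is not None]
    if not known:
        raise ValueError("at least one reaction needs an estimated Gibbs energy")
    # Assumed reading: the least favorable value seen in either direction.
    penalty = max(abs(dg) for dg in known)

    directed = {}
    for reaction_id, dg in estimates.items():
        if dg is None:
            # No thermodynamic data: penalize both directions equally.
            directed[(reaction_id, "forward")] = penalty
            directed[(reaction_id, "backward")] = penalty
        else:
            directed[(reaction_id, "forward")] = dg
            directed[(reaction_id, "backward")] = -dg
    return directed
```

Assigning the same unfavorable value to both directions means a reaction without thermodynamic data is never preferred over a reaction with an estimate, which matches the conservative intent stated above.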

[0082] Using these reactions, a directed graph of the host-independent metabolic network (603) is built that models the transformation of metabolites, where its vertices represent metabolites and its edges represent chemical transformations via verified metabolic reactions. Since this directed graph unifies all metabolic reactions with verified activities in the reaction databases, its structure is independent of the endogenous metabolic system of any host organism.
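As an illustration of the graph structure just described, a directed metabolic network can be held as an adjacency mapping from each metabolite to its outgoing transformations. The tuple layout and function name below are assumptions for the sketch and are not taken from the MRE implementation.

```python
from collections import defaultdict


def build_network(directed_reactions):
    """Build a directed graph with metabolites as vertices.

    directed_reactions: iterable of (reaction_id, substrate, product, delta_g)
    tuples, one per directed substrate-to-product transformation
    (hypothetical input layout).
    Returns a dict mapping each metabolite to a list of
    (product, reaction_id, delta_g) outgoing edges.
    """
    graph = defaultdict(list)
    for reaction_id, substrate, product, delta_g in directed_reactions:
        graph[substrate].append((product, reaction_id, delta_g))
        # Make sure metabolites that are only ever products still appear as vertices.
        graph.setdefault(product, [])
    return dict(graph)
```

Because the graph is built only from the reaction list, the same structure can be reused for any host organism, and only the edge weights need to change per host.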

[0083] User input (604), including the host organism, the source and target compounds and advanced search options, is used in conjunction with the host-independent metabolic network (603) to assign weights to the edges of the directed graph in a host-dependent fashion (605). To assign the weight of each outgoing edge from a given compound node, the reaction is assumed to be present in the host organism, and the probability of converting the precursor via this reaction rather than via the competing native reactions is computed.

[0084] By representing the competition for a metabolic precursor with endogenous reactions by a statistical mechanical model, the probability of each reaction with ΔrG'° was computed through the Boltzmann distribution. The logarithm of this computed probability was assigned as the weight of this outgoing edge. The data from the user input (604), the host-independent metabolic network (603) and the assigned weights lead to a metabolic network with host-dependent weights (606). Use of this type of statistical mechanics modeling in the context of biosynthesis system design is novel. Given the metabolic network graph with host-dependent weights (606), MRE explores and ranks biosynthesis routes (607) by exhaustively searching for biosynthesis paths from the given starting material to the given product and generating results (608) of the top-K metabolic routes, each of which has at most n reaction steps. The results (608) include the ranked biosynthesis routes, genes for foreign enzymes and competing native reactions.

[0085] In this search to explore and rank biosynthesis routes (607), the compounds in the exclusion list are not considered as intermediate precursors of the product. To rank routes, MRE computes their scores by summing all reaction weights in each route and keeps the K routes with the highest scores. MRE transforms the metabolic route search problem into a classical computer science problem known as the K-shortest loopless path problem and uses an efficient algorithm to solve it. The core part of the search was implemented in C++.
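The ranking problem can be illustrated with a naive depth-first enumeration. This is a sketch for small graphs only and is not the efficient algorithm or the C++ implementation mentioned above. The function name, the graph layout and the toy weights are hypothetical. It enumerates loopless paths with at most max_steps edges, skips excluded intermediates, and keeps the k paths with the highest summed weights in a min-heap.

```python
import heapq


def top_k_routes(graph, source, target, k, max_steps, excluded=frozenset()):
    """Return the k best loopless routes as (score, path) pairs, best first.

    graph maps a compound to a list of (next_compound, weight) pairs.
    A route's score is the sum of its edge weights (higher is better).
    """
    best = []  # min-heap holding the current top-k (score, path) items

    def dfs(node, path, score):
        if node == target:
            item = (score, tuple(path))
            if len(best) < k:
                heapq.heappush(best, item)
            elif item > best[0]:
                heapq.heapreplace(best, item)
            return
        if len(path) - 1 == max_steps:
            return  # step budget exhausted
        for nxt, weight in graph.get(node, []):
            if nxt in path:
                continue  # loopless: never revisit a compound
            if nxt in excluded and nxt != target:
                continue  # excluded compounds cannot be intermediates
            path.append(nxt)
            dfs(nxt, path, score + weight)
            path.pop()

    dfs(source, [source], 0.0)
    return sorted(best, reverse=True)


# Toy network with hypothetical log-probability weights.
toy = {"S": [("A", -0.5), ("T", -3.0)], "A": [("T", -1.0), ("S", -0.1)]}
routes = top_k_routes(toy, "S", "T", k=2, max_steps=3)
```

In this toy network the two-step route through A scores -1.5 and outranks the direct route, which scores -3.0. The running time of plain enumeration grows quickly with max_steps, which is why a dedicated K-shortest-path algorithm is preferable in practice.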

[0086] The weighting scheme used to assign weights to edges in a host-dependent fashion (605) depends on a host organism and models the competition for metabolic precursors with the endogenous reactions. Importantly, this competition-based weighting scheme can capture the effects of competing endogenous reactions on heterologous reactions, while a thermodynamic favorability-based weighting scheme cannot. This can make their weight assignments widely different from each other, as illustrated in FIGS. 7 and 8.

[0087] To derive a mathematical description of the weighting scheme, a scenario is used to generate weights for edges in the reactions transforming precursor C. Here, R_N represents a set of native reactions that can transform C in a given host organism. For each reaction r that can transform C, e^(-ΔrG'°/RT) was set as its Boltzmann factor. Then f(r), the normalized Boltzmann factor for r, is defined as follows:

f(r) = \frac{e^{-\Delta_r G'^{\circ}/RT}}{1 + e^{-\Delta_r G'^{\circ}/RT} + \sum_{r' \in R_N \setminus \{r\}} e^{-\Delta_{r'} G'^{\circ}/RT}}    (1)

where R is the gas constant and T is the absolute temperature. Those reactions that are not in the host organism do not affect the calculation of the Boltzmann distribution. If r ∈ R_N, then f(r) is simply based on the Boltzmann distribution of the native reaction system transforming compound C. On the other hand, if r ∉ R_N, then f(r) is based on the Boltzmann distribution of the reaction system that contains all native reactions transforming C and foreign reaction r. With this scheme, every edge in the graph that transforms C in reaction r has the weight log f(r).
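Equation (1) can be sketched in code for a single precursor C. The function name, the reaction labels and the input layout are hypothetical. Energies are passed already divided by RT, and the constant 1 in the denominator is taken from Equation (1) as written. For a native reaction the sum runs over the other native reactions, and for a foreign reaction it runs over all native reactions, which is what excluding r from the native set achieves in both cases.

```python
import math


def weights_for_precursor(dg_rt: dict, native: set) -> dict:
    """Return the weight log f(r) for every reaction r transforming C.

    dg_rt maps a reaction label to its standard Gibbs energy divided by RT.
    native is the set of labels of reactions native to the host (R_N).
    """
    boltzmann = {r: math.exp(-x) for r, x in dg_rt.items()}
    weights = {}
    for r, factor in boltzmann.items():
        # Sum over native reactions other than r itself (R_N without r).
        competitors = sum(boltzmann[q] for q in native if q != r and q in boltzmann)
        weights[r] = math.log(factor / (1.0 + factor + competitors))
    return weights


# Two reactions with the same energy (-1 in units of RT); only one is native.
w = weights_for_precursor({"foreign": -1.0, "native": -1.0}, {"native"})
# w["native"] is about -0.313 and w["foreign"] is about -0.862
```

The example shows the host dependence of the scheme: both reactions have the same Gibbs energy, yet the foreign reaction receives the lower weight because it must compete with the native one.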

[0088] FIGS. 7 and 8 demonstrate the differences between a thermodynamic-favorability-based weighting scheme (700) illustrated in FIG. 7 and a competition-based weighting scheme (800) illustrated in FIG. 8.

[0089] In the thermodynamic-favorability-based weighting scheme (700) illustrated in FIG. 7, node 701 represents a starting metabolite and nodes 704, 707 and 710 are metabolites from the metabolic conversions via reactions represented by the edges (703, 706 and 709). Edge (arrow) 709 represents a reaction that is native to the host organism and edges 703 and 706 represent foreign reactions. The value within the ovals (702, 706, 708) for each edge represents the weight ΔrG'°/RT, where R is the gas constant and T is the absolute temperature.

[0090] In the competition-based weighting scheme (800) illustrated in FIG. 8, node 801 represents a starting metabolite and nodes 804, 807 and 810 are metabolites from the metabolic conversions via reactions represented by the edges (803, 806 and 809). Edge 809 represents a reaction that is native to the host organism and edges 803 and 806 represent foreign reactions. The value within the ovals (802, 806, 808) for each edge represents its weight. With this scheme, edges (or arrows) with the same ΔrG'° value can have different weights in a host-dependent fashion. For example, the weight of C1 → C3 is ln[e^1/(1 + e^1 + e^1)], while that of C1 → C4 is ln[e^1/(1 + e^1)].

[0091] The competition-based weighting scheme illustrated in FIG. 8 is based on the logarithm of normalized Boltzmann weights. Unlike the thermodynamic-favorability-based measure in FIG. 7, the competition-based weighting scheme estimates the fraction of a given precursor that is converted into each next intermediate metabolite. Thus, a pathway score based on the sum of all reaction weights in a given pathway can characterize the lower bound of the fraction of starting material that is converted into the product through this pathway, and the competition-based score can capture the productivity of each pathway more appropriately.
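The link between the summed weights and the converted fraction can be written out explicitly. The symbols S(P), m and r_i are introduced here for illustration and do not appear in the original text; f(r) is the normalized Boltzmann factor defined above. For a pathway P with reaction steps r_1, ..., r_m:

```latex
S(P) \;=\; \sum_{i=1}^{m} \log f(r_i) \;=\; \log \prod_{i=1}^{m} f(r_i)
\qquad\Longrightarrow\qquad
e^{S(P)} \;=\; \prod_{i=1}^{m} f(r_i), \quad 0 < f(r_i) < 1 .
```

Each factor f(r_i) estimates the fraction of the precursor that proceeds through step i, so the product estimates the fraction of starting material carried through all m steps. Ranking by the sum of log weights is therefore equivalent to ranking by this product.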

[0092] FIGS. 9 and 10 are an example set showing differences in ranking outcomes between the thermodynamic favorability-based approach and the competition-based approach. FIG. 9 is a simplified metabolic network (900). The nodes (901, 904, 907, 912, 915, 920, 923) are metabolites and the edges (903, 905, 909, 911, 913, 917, 919, 921 and 925) are metabolic conversions. Edges 909, 917, 919 and 921 indicate native reactions, while edges 903, 905, 911, 913 and 925 indicate foreign reactions. The value within the oval for each edge (902, 906, 908, 910, 914, 916, 918, and 922) indicates ΔrG'°/RT, where R is the gas constant and T is the absolute temperature. In this example, compound C1 (901) is the starting metabolite, and compound C6 (915) is the target product. Three possible routes are shown for the conversion of C1 to C6.

[0093] FIG. 10 shows the ranking of the three biosynthesis routes (1000) with the thermodynamic favorability approach (1001, 1002, 1003), wherein the lower the score, the better or more favorable the route, and the competition-based approach (1004, 1005, 1006), wherein the higher the score, the better or more favorable the route. For example, the score of C1 → C4 → C6 (1003) is -1 + 2 = 1 with the thermodynamic favorability approach, indicating that the 1003 route is the least favorable of the three routes shown. However, for the competition-based approach the score of C1 → C4 → C6 (1004) is ln[e^1/(1 + e^1)] + ln[e^-2/(1 + e^-2 + e^-10)] = -2.44, indicating that the 1004 route is the most favorable.
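The competition-based score quoted above can be checked step by step. The working assumes, as the stated numbers imply, that the two steps have ΔrG'°/RT values of -1 and 2, and that the term e^-10 comes from one native reaction competing for C4 with ΔrG'°/RT = 10:

```latex
\begin{aligned}
\ln\frac{e^{1}}{1+e^{1}} &= 1 - \ln(1+e) \approx 1 - 1.313 = -0.313,\\
\ln\frac{e^{-2}}{1+e^{-2}+e^{-10}} &= -2 - \ln(1.13538) \approx -2 - 0.127 = -2.127,\\
\text{score}(C1 \to C4 \to C6) &\approx -0.313 - 2.127 = -2.44 .
\end{aligned}
```

The second step dominates the score because its own Gibbs energy is unfavorable; the competing reaction contributes very little since e^-10 is small.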

[0094] Biosynthesis pathways of interest are often those that transform a higher fraction of a starting material to a target product. One heuristic to rank pathways based on this productivity criterion is the net favorability of pathways. At first glance, the net thermodynamic favorability (as illustrated in FIGS. 9 and 10) can be seen as a good measure to rank pathways based on this criterion. However, this measure can only quantify the ratio of the target concentration to the source concentration at equilibrium, which may not correspond well with the true picture of the titer of the target product, especially when a given pathway has strong competing reactions and the equilibrium concentration of the starting material is substantially lowered.
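The equilibrium interpretation of the net thermodynamic favorability follows from the standard relation between the Gibbs energy and the equilibrium constant. The sketch below assumes a linear route with steps r_1, ..., r_m, one-to-one conversion of source to target, and all other reactants at their standard concentrations; the symbols m and r_i are introduced here for illustration:

```latex
\Delta G'^{\circ}_{\mathrm{net}} \;=\; \sum_{i=1}^{m} \Delta_{r_i} G'^{\circ},
\qquad
\frac{[\mathrm{target}]_{\mathrm{eq}}}{[\mathrm{source}]_{\mathrm{eq}}}
\;=\; \exp\!\left(-\frac{\Delta G'^{\circ}_{\mathrm{net}}}{RT}\right).
```

This ratio says nothing about how much source material remains available to the route. If competing native reactions drain the source or an intermediate, the ratio can be large while the absolute amount of target stays small, which is the limitation described above.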

[0095] FIG. 11 is a graph showing the computational performance of MRE for various settings. On the graph (1100), n (1101) is the maximum number of reaction steps in biosynthesis routes, and K (1102) is the number of top routes that MRE generates. For this analysis, six different pairs of source and target compounds, four settings for the maximum number of reaction steps (n = 5, 10, 15 and 20), and five settings for the number of top metabolic routes (K = 100, 200, 500, 1000 and 2000) were included. Each point represents the average computational time in seconds (1103) of the six source-target pairs with a given setting for n and K. In this performance test, the default setting of the exclusion list was used. The computation was performed on an Intel Xeon E5-2680 workstation with 256 GB of memory. The computational time increased as the values of n and K increased, but the magnitude of the increase was found to be reasonable. Even with a very computationally demanding setting (i.e., n = 20 and K = 2000), MRE was able to process the queries within 25 seconds on average. The data also indicated that, with the default setting (i.e., n = 8 and K = 50), the processing time is expected to be less than 5 seconds. Also, since MRE caps the values of n and K that the user can set at 20 and 500, respectively, the maximum computational time is expected to be around 10 seconds. These results show that the exhaustive search employed in MRE does not compromise the user experience based on its processing time.

[0096] To further evaluate the computational performance of MRE, the processing time in the runtime environment was measured. 1000 reachable pairs of source and target compounds were randomly selected. With the setting of the largest reaction step size and the largest number of top-ranked pathways (i.e., n = 20 and K = 500), it took less than 10 seconds for MRE to exhaustively explore routes and process queries on average. In 95% of the samples, the processing time was less than 20 seconds, and even in the worst case, it was just less than 30 seconds. With the default setting (i.e., n = 8 and K = 50), the processing time was at most 1.36 seconds. The exhaustive pathway search employed in MRE should not compromise the user experience based on its processing time.

Case Study

[0097] As a case study, MRE was applied to search for pathways for various biosynthesis specifications using either E. coli K-12 MG1655 or Saccharomyces cerevisiae as the host organism. Table 2 summarizes the top-ranked heterologous pathways that MRE discovered. This shows that, in biosynthesis of a range of high-value natural products, MRE was able to identify pathways that are known to be productive. The MRE results were also analyzed by comparing them with results from four open-access web servers that can design heterologous biosynthesis pathways, namely, FMM, Metabolic tinker, PHT and XTMS. To explore biosynthesis pathways with these tools, default configurations were used.

Table 2. Top-ranked pathways identified by MRE for various biosynthesis specifications

Columns 1-3 give the biosynthesis specification, columns 4-6 give the results of the top-ranked pathway identified by MRE, and columns 7-8 give the comparison with existing tools.

Source | Target | Host | Steps | Necessary foreign enzymes | Remark | Found a path^a | Match with MRE^b
L-tyrosine (C00082) | Naringenin (C00509) | E. coli (ECO) | 4 | 4.3.1.23, 6.2.1.12, 2.3.1.74, 5.5.1.6 | Recovered a known route as the top route^1 | FMM, XTMS | FMM
glycerol (C00116) | 1,3-PDO (C02457) | E. coli (ECO) | 2 | 4.2.1.30, 1.1.1.202 | Recovered a known route as the top route^2 | FMM, PHT | FMM, PHT
glycerol (C00116) | R-1,2-PDO (C02912) | Yeast (SCE) | 5 | 4.2.3.3, 1.1.1.79, 1.1.1.77 | Recovered a known route as the top route^3 | MT | MT
acetyl-CoA (C00024) | artemisinic acid (C20309) | Yeast (SCE) | 10 | 2.5.1.92, 4.2.3.50, 4.2.3.82, 4.2.3.24, 1.14.13.158 | Recovered a known route, and predicted better ones^4 | none | none
L-tyrosine (C00082) | resveratrol (C03582) | E. coli (ECO) | 3 | 4.3.1.23, 6.2.1.12, 2.3.1.95 | Recovered a known route as the top route^5 | FMM | FMM
D-xylose (C00181) | xylitol (C00379) | E. coli (ECO) | 2 | 1.1.1.21, 1.1.1.307 | Recovered two known routes as the top routes^6 | FMM, PHT | FMM, PHT
PRPP (C00119) | histidine (C00135) | E. coli (ECO) | 8 | 2.6.1.27 | Predicted better and shorter routes than a known native route^7 | FMM, MT | none
chorismate (C00251) | tryptophan (C00078) | Yeast (SCE) | 5 | none | Predicted the native route as the best, and found shorter routes^8 | FMM, MT, PHT | FMM

For each biosynthesis specification, the source and target compounds are specified by KEGG ID, and the host organism by KEGG organism code. For each pathway, the number of reaction steps and the necessary foreign enzymes (by EC number) are specified. Comparison with FMM, Metabolic tinker (MT), PHT and XTMS is also shown. For each tool, its default setting was used, except for the configuration of the pathway length, which was set to accommodate known pathways. In Table 2, the notation "a" denotes tools that have identified at least one path for a given biosynthesis specification, and the notation "b" denotes tools whose top-ranked pathway is the same as the top-ranked one from MRE.

Biosynthesis of naringenin

[0098] Naringenin is a plant secondary metabolite, which is reported to have various health benefits, including high antioxidant capacities and significant antiviral effects on the hepatitis C virus (Hollman P.C., Katan M.B. Bioavailability and health effects of dietary flavonols in man. Arch. Toxicol. Suppl. 1998;20:237-248). Owing to inefficiencies in the production of naringenin from natural plant sources, metabolic engineering to achieve an efficient microbial synthesis of this high-value natural product is thought to be a commercially viable alternative.

[0099] FIGS. 12 and 13 show a heterologous biosynthesis pathway to produce naringenin from L-tyrosine in an E. coli host. In the analysis seen in FIGS. 12 and 13, L-tyrosine (KEGG compound ID: C00082), an aromatic non-essential amino acid, was selected as the starting material since a state-of-the-art heterologous naringenin production from L-tyrosine in an E. coli strain is known (see FIG. 12). This heterologous biosynthesis route comprises four foreign enzymatic reactions. To analyze the performance of MRE in comparison with other tools, two open-access biosynthesis pathway web servers were applied, Metabolic tinker and XTMS. Since these two recently developed tools also rely on reaction thermodynamic data for their pathway ranking, it was possible to also analyze the effects on the competition-based ranking scheme.

[00100] Given this biosynthesis requirement, Metabolic tinker and PHT were not able to find any pathways, while XTMS generated a predicted pathway with hypothetical reactions as its top-ranked candidate. In contrast, the top-ranked route from MRE and FMM was identical to the state of the art. The pathway information given by MRE indicates that the third reaction in the pathway, which transforms p-coumaroyl-CoA into naringenin chalcone, is a bottleneck and competes for the availability of cofactor malonyl-CoA with a more favorable native reaction involved in the fatty acid biosynthesis in the E. coli host (Figure 13). This suggests that an increase in the concentration of malonyl-CoA or the inhibition of the fatty acid biosynthesis could enhance the productivity of this naringenin biosynthesis pathway. Indeed, previous studies demonstrated that both an increase in the availability of malonyl-CoA in the host and a decrease in the activities in the fatty acid pathway can increase the naringenin titers. While FMM was also able to identify the heterologous naringenin biosynthesis pathway that MRE found, the pathway information given by FMM was not helpful to find an optimization target as FMM does not have a feature to quantify the effects of competing reactions in the host.

[00101] FIG. 12 shows the structure of an experimentally derived biosynthesis pathway (1200) from the L-tyrosine starting compound (1201) through the intermediate metabolites (1205, 1209, 1213) to the desired naringenin product (1217). The KEGG compound ID of each metabolite appears in an oval (1202, 1206, 1210, 1214, 1218) below each structure. The required enzyme for each step is noted above the arrows (1203, 1207, 1211, 1215). The abbreviations for the enzymes are: tyrosine ammonia lyase (TAL); 4-coumarate:CoA ligase (4CL); chalcone synthase (CHS) and chalcone isomerase (CHI). The EC numbers (1204, 1208, 1212, 1216) for each reaction are indicated below each arrow.

[00102] FIG. 13 displays the information of the top-ranked biosynthesis pathway in MRE for the L-tyrosine to naringenin conversion (which is the same pathway associated with FIG. 4). Column 1301 displays the KEGG reaction and compound IDs. Column 1302 displays the reaction step by KEGG compound ID. The ECO column (1303) heading in the example table specifies that the host is E. coli. "Yes" indicates that the reaction exists in the user-specified host and "No" indicates that it does not. In the example table, the reactions listed are not native to the host organism. Column 1304 displays the Gibbs energy. Column 1305 displays the EC number for the enzyme if it is an enzymatic reaction. Column 1306 displays potential enzymes that may be used for the reaction, and Column 1307 displays competing reactions by KEGG ID and the Gibbs energy for each competing reaction.
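One way to hold a row of such a table in code is a small record type. The class names, field names and all sample values below are hypothetical and chosen only to mirror the columns described above; only the compound ID C00082 and the EC number 4.3.1.23 come from the text, and the remaining identifiers and energies are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CompetingReaction:
    kegg_reaction_id: str  # KEGG ID of the competing reaction
    gibbs_energy: float    # Gibbs energy of the competing reaction


@dataclass
class RouteStep:
    kegg_reaction_id: str      # column 1301
    substrate_id: str          # column 1302: step given by compound IDs
    product_id: str
    native_to_host: bool       # column 1303: "Yes" / "No"
    gibbs_energy: float        # column 1304
    ec_number: Optional[str]   # column 1305: None for a spontaneous reaction
    candidate_enzymes: List[str] = field(default_factory=list)          # column 1306
    competing: List[CompetingReaction] = field(default_factory=list)    # column 1307


# Placeholder row: IDs marked "X" and the energy value are not from the source.
step = RouteStep(
    kegg_reaction_id="RXXXXX",
    substrate_id="C00082",
    product_id="CXXXXX",
    native_to_host=False,
    gibbs_energy=0.0,
    ec_number="4.3.1.23",
)
```

Keeping the competing reactions attached to each step makes it straightforward to look up, for any step of a route, which native reactions draw on the same precursor.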

Production of value-added chemicals from glycerol

[00103] Glycerol is a readily available and relatively inexpensive chemical compound that can be generated in large amounts as a byproduct of biodiesel and bioethanol production processes. Because of its economic viability and long-term sustainability, fermentative production of high-value materials from glycerol has gained much attention recently. Using glycerol as the starting material, pathways were searched for the production of two value-added chemicals: 1,3-propanediol (1,3-PDO), a commodity chemical mainly used to make polyester fiber, and 1,2-propanediol (1,2-PDO), another high-demand commodity chemical used to make a wide range of products including antifreeze, thermoset plastics and cosmetics.

[00104] FIGS. 14 and 15 are pathway level graphs generated in MRE for production of 1,3-propanediol (1,3-PDO) from glycerol in two different host organisms. FIG. 14 shows the MRE top-ranked pathway for the production of 1,3-PDO from glycerol in E. coli and FIG. 15 shows the MRE top-ranked pathway for the production of R-1,2-PDO from glycerol in yeast.

[00105] MRE was first applied to search for pathways for the production of 1,3-PDO in the E. coli chassis. The top-ranked pathway (FIG. 14) that MRE identified is a known two-step heterologous pathway, which requires the introduction of a glycerol dehydratase gene and a 1,3-propanediol dehydrogenase gene in the host. Since the first reaction, catalyzed by glycerol dehydratase, competes for the utilization of glycerol against several native reactions including glycerol kinase, MRE predicts that this can be a target for productivity optimization. Metabolic tinker and XTMS were not able to find any pathways for the 1,3-PDO production, whereas FMM and PHT found the same pathway that MRE identified.

[00106] In FIG. 14, the glycerol starting material (1401) is converted to the end product 1,3-PDO (1402) by heterologous reactions 1405a and 1405b, which are foreign to the E. coli host, via the reaction route shown by 1408a-1408d. Intermediate metabolites produced or used by these reactions are shown by 1404a-1404e. There are three competing endogenous reactions (1407a, 1407b, 1407c) shown along the arrows (1417a, 1417b, 1417c), which indicate that these reactions, which are native to the E. coli host, are competing with reaction 1405a along path 1408a for the glycerol starting material (1401). Foreign reaction pathways are indicated by 1415a-d. In an MRE generated pathway level graph, the display would show starting, ending and intermediate compounds in red, green and yellow, respectively. The top-ranked route is shown in blue, while native, foreign and competing reactions and paths are shown in purple, cyan and gray, respectively. The color display of pathways and compounds would allow a user to quickly identify the components in each route.

[00107] Next, MRE was applied to search for pathways for the synthesis of R-1,2-PDO in the yeast chassis. The top-ranked pathway (FIG. 15) found was a known synthesis pathway for 1,2-PDO. In this pathway, glycerol is first converted to dihydroxyacetone phosphate (DHAP) via two native enzymatic reactions. Methylglyoxal synthase then transforms DHAP into methylglyoxal, which is, in turn, converted into (R)-lactaldehyde. Finally, lactaldehyde reductase is used to produce R-1,2-PDO from (R)-lactaldehyde. FMM and PHT were not able to find any pathways that convert glycerol into R-1,2-PDO, whereas Metabolic tinker identified the same pathway that MRE found as the top-ranked one. Since XTMS focuses on the E. coli chassis, this tool was applied to search for heterologous R-1,2-PDO production pathways in E. coli; however, no pathways were found.

[00108] In FIG. 15, the glycerol starting material (1501) is converted to the end product R-1,2-PDO (1502) by native reactions 1506a and 1506b, which are endogenous to the yeast host, and by heterologous reactions 1505a, 1505b and 1505c, which are foreign to the yeast host, via the reaction route shown by 1508a-1508j. Intermediate metabolites produced or used by these reactions are shown by 1504a-1504l. There are five competing endogenous reactions: three (1507a, 1507b, 1507c) are shown along arrows (1517a, 1517b, 1517c) as competing with the desired route for metabolite 1504f, and two (1507d, 1507e) are shown along arrows (1517d, 1517e) as competing with the desired route for metabolite 1504h. Foreign and native reaction pathways are indicated by 1515a-f and 1516a-d, respectively. In an MRE generated pathway level graph, starting, ending and intermediate compounds are shown in red, green and yellow, respectively, the top-ranked route is shown in blue, and native, foreign and competing reactions and paths are shown in purple, cyan and gray, respectively, which allows a user to quickly identify by color the components in each route.

Production of Artemisinic Acid

[00109] Artemisinic acid is an intermediate precursor of the antimalarial drug artemisinin, and its production is often celebrated as one of the early success stories of the combination of metabolic engineering and synthetic biology. This engineered biosynthesis pathway utilizes the endogenous mevalonate pathway in budding yeast to transform acetyl-CoA into farnesyl pyrophosphate (FPP), which is then converted into artemisinic acid with heterologous amorphadiene synthase and three-step oxidation reactions.

[00110] In FIGS. 16 and 17, two routes are shown for the production of artemisinic acid from acetyl-CoA in yeast. FIG. 16 is a known route and FIG. 17 is an MRE top-ranked route. To see if MRE could recover this engineered pathway, MRE was applied to explore pathways for the production of artemisinic acid from acetyl-CoA in yeast. It was found that one of the top-ranked pathways that MRE generated was this known heterologous pathway (FIG. 16). Interestingly, the pathway that MRE identified as the top candidate (FIG. 17) was slightly different from the previously engineered pathway. The difference lies in how isopentenyl pyrophosphate (IPP) is converted into farnesyl pyrophosphate (FPP). In the top-ranked path, IPP is first converted into (2Z,6Z)-farnesyl diphosphate (Z,Z-FPP). This route is chosen because IPP is a precursor of a thermodynamically highly favorable native reaction, and the conversion reaction from IPP to Z,Z-FPP is much more favorable than that from IPP to FPP, enabling a higher fraction of IPP to be utilized in the route. By using Z,Z-FPP as the precursor, this route introduces three foreign carbon-oxygen lyases to form FPP. FMM, Metabolic tinker and PHT were not able to find any pathways. XTMS found a partial pathway that converts FPP into artemisinic acid, although it is for the E. coli chassis.

[00111] In the known route (1600) shown in FIG. 16, the acetyl-CoA starting material (1601) is converted to the desired product artemisinic acid (1602) via reaction route 1608a-1608p. The route includes endogenous reactions 1606a-1606f shown along arrows 1616a-1616p (which indicate paths native to the host) and heterologous reactions 1605a, 1605b shown along arrows 1615a-1615f (which indicate paths foreign to the host). Intermediate metabolites produced or used by these reactions are shown by 1604a-1604l. There are three competing endogenous reactions (nodes 1607a, 1607b, 1607c and arrows 1617a, 1617b, 1617c) that compete with the desired route (1608l-1608m) for the intermediate metabolite trans,trans-farnesyl diphosphate (1604k). In an MRE generated pathway level graph, starting, ending and intermediate compounds are shown in red, green and yellow, respectively, the top-ranked route is shown in dark blue, and native, foreign and competing reactions and paths are shown in purple, cyan and gray, respectively, to allow a user to quickly identify the components by color in each route.

[00112] In an MRE top-ranked route (1700) shown in FIG. 17, the acetyl-CoA starting material (1701) is converted to the desired product artemisinic acid (1702) via reaction route 1708a-1708t. The route includes endogenous reactions 1706a-1706e shown along arrows 1716a-1716n (which indicate paths native to the host) and heterologous reactions 1705a-1705e shown along arrows 1715a-1715i (which indicate paths foreign to the host). Intermediate metabolites produced or used by these reactions are shown by 1704a-1704v. There are three competing endogenous reactions (nodes 1707a, 1707b, 1707c and arrows 1717e, 1717f, 1717g) that compete with the desired route (1708m-1708n) for the intermediate metabolite dimethylallyl diphosphate (1704i), and three competing endogenous reactions (nodes 1707b, 1707c, 1707d and arrows 1717a, 1717b, 1717c) that compete with the desired route (1708p-1708q) for the intermediate metabolite trans,trans-farnesyl diphosphate (1704k). In an MRE generated pathway level graph, starting, ending and intermediate compounds are shown in red, green and yellow, respectively, the top-ranked route is shown in dark blue, and native, foreign and competing reactions and paths are shown in purple, cyan and gray, respectively, to allow a user to quickly identify the components in each route.

[00113] The present invention, MRE, is an open-access biosynthesis design tool that searches for promising metabolic routes for a given biosynthesis specification and suggests exogenous enzymes for heterologous biosynthesis pathways based on the infrastructure of an endogenous metabolic system. The present invention relies on the data sources (mainly KEGG) to mine verified metabolic reactions and to search for biosynthesis routes based on them. Indeed, while painstaking effort has resulted in a large collection of annotated metabolic reaction data, among the 9910 reactions in the KEGG REACTION database (Release 76.0), 1272 reactions with no EC numbers, 1079 with partial EC numbers and 2170 with no annotations for associated genes were found. The number of verified reactions in KEGG is expected to increase over time, which would alleviate any issues related to a lack of verified reactions. Other metabolic reaction databases, such as Rhea, may also be integrated.

[00114] Several existing tools took an approach that expands the list of metabolic parts in hand by defining specific transformation rules, although such rules can be subjective. To design biosynthesis systems, this approach relies on the prediction of metabolic parts with specific metabolic activities, which may or may not exist. Thus, the design of biosynthesis systems via this top-down approach may require the de novo design of unnatural proteins to achieve specific metabolic activities. MRE was developed to suggest actual enzymes for heterologous pathways. Thus, it takes a complementary, bottom-up approach in which a biosynthesis system is designed by using well-characterized metabolic parts. To this end, only verified reactions were used.

[00115] Here, by using the biosynthesis of a range of high-value natural products as a case study, it has been shown that MRE can suggest promising heterologous biosynthesis pathways and provide useful information to pinpoint bottlenecks of pathways. With the host-dependent competition-based pathway ranking scheme, along with the suggestion of foreign enzymes with competing endogenous reactions, MRE offers novel insights into the design and optimization of heterologous biosynthesis systems.

[00116] While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and are herein described in detail. For example, the user-interface example pages shown in Figures 1-5 do not itemize or describe in detail the dimensions, shapes, sizes, inputs or outputs, or exact specification of the identified items (e.g. user input, summary tables, etc.), which are all understood to exist and be within the scope of the invention as described and claimed.

[00117] Furthermore, sizes and shapes of display pages, input fields and linked data are not described in detail, but such details are understood to be varied or modifiable while still complying with the scope of the invention set forth herein and covered by the claims. It should be understood, however, that the description herein of specific embodiments is not intended to limit the invention to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the appended claims.