

Title:
BIOSENSORS FOR POLYKETIDE EXTENDER UNITS AND USES THEREOF
Document Type and Number:
WIPO Patent Application WO/2020/237028
Kind Code:
A1
Abstract:
The present disclosure relates to biosensors and uses thereof for detecting polyketide extender units. Disclosed herein are biosensor systems and methods for detecting polyketide synthase extender units. In some aspects, disclosed herein is a biosensor system comprising: a first nucleic acid comprising a genetically modified fapR gene, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type fapR gene, and wherein the first nucleic acid is operably linked to a first promoter; and a second nucleic acid comprising a reporter gene whose transcription is under the control of a second promoter which is regulated by the fapR transcription factor.

Inventors:
WILLIAMS GAVIN JOHN (US)
KALKREUTER ROBERT EDWARD (US)
MALICO ALEXANDRA ANDERSON (US)
MITCHLER MELISSA (US)
Application Number:
PCT/US2020/033960
Publication Date:
November 26, 2020
Filing Date:
May 21, 2020
Assignee:
UNIV NORTH CAROLINA STATE (US)
International Classes:
C07K14/195; C12N15/00; C12N15/09; C12Q1/68
Domestic Patent References:
WO2018007526A1 2018-01-11
WO2018035158A1 2018-02-22
Foreign References:
US20190085416A1 2019-03-21
US20190112598A1 2019-04-18
US20040209270A1 2004-10-21
US20090181854A1 2009-07-16
Other References:
XU PENG, WANG WENYA, LI LINGYUN, BHAN NAMITA, ZHANG FUMING, KOFFAS MATTHEOS A. G.: "Design and Kinetic Analysis of a Hybrid Promoter-Regulator System for Malonyl-CoA Sensing in Escherichia coli", ACS CHEM. BIOL, vol. 9, 5 November 2013 (2013-11-05), pages 451 - 458, XP055761624
CHAN YOLANDE A., PODEVELS ANGELA M., KEVANY BRIAN M., THOMAS MICHAEL G.: "Biosynthesis of polyketide synthase extender units", NAT PROD REP, vol. 26, 27 October 2008 (2008-10-27), pages 90 - 114, XP055509444
Attorney, Agent or Firm:
PRATHER, Donald M. et al. (US)
Claims:
CLAIMS

We claim:

1. A biosensor system comprising:

a first nucleic acid comprising a genetically modified fapR gene, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type fapR gene, and wherein the first nucleic acid is operably linked to a first promoter; and

a second nucleic acid comprising a reporter gene whose transcription is under the control of a second promoter which is regulated by the fapR transcription factor.

2. The biosensor system of claim 1, wherein the first nucleic acid and the second nucleic acid are located on one recombinant DNA vector.

3. The biosensor system of claim 1 or 2, wherein the first nucleic acid and second nucleic acid do not comprise a lacO sequence.

4. The biosensor system of any one of claims 1-3, wherein the first nucleic acid comprises a first ribosome binding site.

5. The biosensor system of any one of claims 1-4, wherein the second nucleic acid comprises a second ribosome binding site and a fapO operator.

6. The biosensor system of any one of claims 1-5, wherein the first promoter and the second promoter initiate transcription in opposite directions.

7. The biosensor system of any one of claims 1-5, wherein the first promoter and the second promoter initiate transcription in the same direction.

8. The biosensor system of any one of claims 1-7, wherein the first ribosome binding site sequence is TAVRCAGGH (SEQ ID NO:2); wherein V is A, C, or G; wherein R is A or G; and wherein H is A, C, or T.

9. The biosensor system of any one of claims 1-8, wherein the first ribosome binding site sequence is selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID

10. The biosensor system of any one of claims 1-9, wherein the wild-type fapR gene comprises the polynucleotide sequence of SEQ ID NO:25.

11. The biosensor system of any one of claims 1-10, wherein the genetically modified fapR gene encodes a fapR protein comprising a F99L amino acid substitution when compared to SEQ ID NO:29.

12. The biosensor system of any one of claims 1-11, wherein the genetically modified fapR gene comprises the polynucleotide sequence of SEQ ID NO:24.

13. The biosensor system of any one of claims 1-12, wherein the genetically modified fapR gene confers detection of one or more polyketide synthase extender units.

14. The biosensor system of claim 13, wherein the one or more polyketide synthase extender units comprise malonyl-CoA or an α-substituted derivative thereof.

15. The biosensor system of claim 14, wherein the α-substituted derivative comprises a substituent selected from the group consisting of alkyl, alkynyl, and alkenyl.

16. The biosensor system of claim 15, wherein the α-substituted derivative comprises methylmalonyl-CoA or propargylmalonyl-CoA.

17. The biosensor system of any one of claims 1-16, wherein the reporter gene comprises a gene encoding for chloramphenicol acetyltransferase, beta-galactosidase, luciferase, or a fluorescent protein.

18. The biosensor system of claim 17, wherein the fluorescent protein comprises Superfolder GFP.

19. A recombinant DNA vector comprising the biosensor system of any one of claims 1-18.

20. A method for detecting one or more polyketide synthase extender units, said method comprising:

introducing into a cell a recombinant DNA vector comprising:

a first nucleic acid comprising a genetically modified fapR gene, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type fapR gene, and wherein the first nucleic acid is operably linked to a first promoter; and

a second nucleic acid comprising a reporter gene whose transcription is under the control of a second promoter which is regulated by the fapR transcription factor; and

measuring the one or more polyketide synthase extender units based on the expression of the reporter gene in the cell.

21. The method of claim 20, wherein the first nucleic acid and the second nucleic acid are located on one recombinant DNA vector.

22. The method of claim 20 or 21, wherein the first nucleic acid and second nucleic acid do not comprise a lacO sequence.

23. The method of any one of claims 20-22, wherein the first nucleic acid comprises a first ribosome binding site.

24. The method of any one of claims 20-23, wherein the second nucleic acid comprises a second ribosome binding site and a fapO operator.

25. The method of any one of claims 20-24, wherein the first promoter and the second promoter initiate transcription in opposite directions.

26. The method of any one of claims 20-25, wherein the first promoter and the second promoter initiate transcription in the same direction.

27. The method of any one of claims 20-26, wherein the first ribosome binding site sequence is TAVRCAGGH (SEQ ID NO:2); wherein V is A, C, or G; wherein R is A or G; and wherein H is A, C, or T.

28. The method of any one of claims 20-27, wherein the first ribosome binding site sequence is selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, and SEQ ID NO:20.

29. The method of any one of claims 20-28, wherein the wild-type fapR gene comprises the polynucleotide sequence of SEQ ID NO:25.

30. The method of any one of claims 20-29, wherein the genetically modified fapR gene encodes a fapR protein comprising a F99L amino acid substitution when compared to SEQ ID NO:29.

31. The method of any one of claims 20-30, wherein the genetically modified fapR gene comprises the polynucleotide sequence of SEQ ID NO:24.

32. The method of any one of claims 20-31, wherein the genetically modified fapR gene confers detection of one or more polyketide synthase extender units.

33. The method of claim 32, wherein the one or more polyketide synthase extender units comprise malonyl-CoA or an α-substituted derivative thereof.

34. The method of claim 33, wherein the α-substituted derivative comprises a substituent selected from the group consisting of alkyl, alkynyl, and alkenyl.

35. The method of claim 34, wherein the α-substituted derivative comprises methylmalonyl-CoA or propargylmalonyl-CoA.

36. The method of any one of claims 20-35, wherein the reporter gene comprises a gene encoding for chloramphenicol acetyltransferase, beta-galactosidase, luciferase, or a fluorescent protein.

37. The method of claim 36, wherein the fluorescent protein comprises Superfolder GFP.

38. The method of any one of claims 20-37, wherein the cell comprises a bacterial cell, a mammalian cell, or a yeast cell.

39. The method of claim 38, wherein the bacterial cell comprises an E. coli cell.

40. A cell-free biosensor system, comprising:

a recombinant DNA vector comprising: a reporter gene whose transcription is under the control of a promoter, a fapO operator, and a ribosome binding site;

a fapR transcription factor; and

transcription-translation reagents comprising a polymerase and/or a ribosome.

41. The cell-free biosensor system of claim 40, wherein the promoter comprises a T7 promoter.

42. The cell-free biosensor system of claim 40 or 41, wherein the fapR transcription factor confers detection of one or more polyketide synthase extender units.

43. A cell-free method for detecting one or more polyketide synthase extender units, said method comprising:

combining the cell-free biosensor system of any one of claims 40-42 with one or more polyketide synthase extender units; and

measuring the one or more polyketide synthase extender units based on the expression of the reporter gene.

44. The cell-free method of claim 43, wherein the one or more polyketide synthase extender units comprise malonyl-CoA or an α-substituted derivative thereof.

45. The cell-free method of claim 44, wherein the α-substituted derivative comprises a substituent selected from the group consisting of alkyl, alkynyl, and alkenyl.

46. The cell-free method of claim 44 or 45, wherein the α-substituted derivative comprises methylmalonyl-CoA or propargylmalonyl-CoA.

47. The cell-free method of any one of claims 43-46, wherein the reporter gene comprises a gene encoding for chloramphenicol acetyltransferase, beta-galactosidase, luciferase, or a fluorescent protein.

48. The cell-free method of any one of claims 43-47, wherein the fluorescent protein comprises Superfolder GFP.

Description:
BIOSENSORS FOR POLYKETIDE EXTENDER UNITS AND USES THEREOF

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application 62/850,608, filed on May 21, 2019, the entire contents of which are fully incorporated herein by reference.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

This invention was made with government support under grant numbers GM104258 and GM124112 awarded by the National Institutes of Health. The government has certain rights in the invention.

FIELD

The present disclosure relates to biosensors for detecting polyketide extender units.

BACKGROUND

Engineered bacterial transcriptional regulators have contributed immensely to chemical and synthetic biology by providing highly modular and tunable devices for sensing small molecules. When coupled to fluorescent or chromogenic readouts, transcription factor-based biosensors have been applied as tools that detect key metabolites, regulate biosynthetic circuits, guide metabolic engineering, and enable directed evolution of enzymes and pathways. However, biosensors for the detection of natural products and their biosynthetic precursors are not yet widely available, which limits the application of high-throughput strategies to many important classes of molecules. For instance, malonyl-CoA (mCoA) plays an integral role in cellular primary and secondary metabolism as a building block for fatty acids, phenylpropanoids, polyketides, and hybrid natural products. Because of this, mCoA biosynthesis has been a longstanding target of metabolic engineering efforts. Accordingly, genetically-encoded biosensors for the detection of mCoA have been constructed using FapR, a transcriptional regulator that is found in nearly all Gram-positive bacteria and acts as a global regulator of fatty acid biosynthesis. Crystal structures of FapR indicate a dimer in which each monomer comprises a C-terminal ligand-binding domain and an N-terminal domain that binds to its cognate DNA operator, fapO. Depending on the promoter, FapR has been shown to act as either an activator or a repressor in the presence of its native ligand, mCoA. Given its utility as a transcriptional regulator, FapR has been developed and utilized as a mCoA biosensor in several hosts including E. coli, yeast, and mammalian cells. Notably, none of the previously reported FapR-based mCoA biosensors were utilized for detection of ligands beyond mCoA. Indeed, compounds related to mCoA such as acetyl-CoA, propionyl-CoA, succinyl-CoA, and butyryl-CoA have been reported to be non-effectors of FapR. Furthermore, despite the important role of mCoA derivatives substituted at the C2 position as extender unit building blocks for many biologically-relevant polyketides, genetically-encoded biosensors for C2-derivatives of mCoA have yet to be reported. What is needed are improved systems and methods for detecting a broad range of polyketide extender units beyond mCoA.

SUMMARY

Disclosed herein are biosensor systems and methods for detecting polyketide synthase extender units.

In some aspects, disclosed herein is a biosensor system comprising:

a first nucleic acid comprising a genetically modified fapR gene, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type fapR gene, and wherein the first nucleic acid is operably linked to a first promoter; and a second nucleic acid comprising a reporter gene whose transcription is under the control of a second promoter which is regulated by the fapR transcription factor. In some embodiments, the first nucleic acid and the second nucleic acid are located on one recombinant DNA vector.

In some embodiments, the first nucleic acid and second nucleic acid do not comprise a lacO sequence.

In some embodiments, the first nucleic acid comprises a first ribosome binding site. In some embodiments, the second nucleic acid comprises a second ribosome binding site and a fapO operator. In some embodiments, the first promoter and the second promoter initiate transcription in opposite directions. In some embodiments, the first promoter and the second promoter initiate transcription in the same direction.

In some embodiments, the first ribosome binding site sequence is TAVRCAGGH (SEQ ID NO:2); wherein V is A, C, or G; wherein R is A or G; and wherein H is A, C, or T. In some embodiments, the first ribosome binding site sequence is selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, and SEQ ID NO:20. In some embodiments, the wild-type fapR gene comprises the polynucleotide sequence of SEQ ID NO:25. In some embodiments, the genetically modified fapR gene encodes a fapR protein comprising a F99L amino acid substitution when compared to SEQ ID NO:29. In some embodiments, the genetically modified fapR gene comprises the polynucleotide sequence of SEQ ID NO:24. In some embodiments, the genetically modified fapR gene confers detection of one or more polyketide synthase extender units. In some embodiments, the one or more polyketide synthase extender units comprise malonyl-CoA or an α-substituted derivative thereof. In some embodiments, the α-substituted derivative comprises a substituent selected from the group consisting of alkyl, alkynyl, and alkenyl. In some embodiments, the α-substituted derivative comprises methylmalonyl-CoA or propargylmalonyl-CoA.
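For illustration only, the degenerate ribosome binding site motif TAVRCAGGH can be expanded into the concrete sequences it encodes using the base definitions given above (V is A, C, or G; R is A or G; H is A, C, or T). The following is a minimal sketch; the function name expand_degenerate is hypothetical and is not part of the disclosure. The expansion yields 18 sequences, which equals the number of entries from SEQ ID NO:3 through SEQ ID NO:20, although this passage does not state that those entries correspond one-to-one to the expansion.

```python
from itertools import product

# Base codes as defined in the text for the degenerate RBS motif.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "V": "ACG",  # A, C, or G
    "R": "AG",   # A or G
    "H": "ACT",  # A, C, or T
}


def expand_degenerate(motif: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate motif."""
    choices = [IUPAC[base] for base in motif]
    return ["".join(combo) for combo in product(*choices)]


variants = expand_degenerate("TAVRCAGGH")
print(len(variants))  # 18 (3 choices for V x 2 for R x 3 for H)
print(variants[0])    # TAAACAGGA
```

A library of this size is small enough to be screened exhaustively, which is consistent with the 93-variant screen described for FIG. 3 covering each sequence several times on average.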

In some embodiments, the reporter gene comprises a gene encoding for chloramphenicol acetyltransferase, beta-galactosidase, luciferase, or a fluorescent protein. In some embodiments, the fluorescent protein comprises Superfolder GFP.

In some aspects, disclosed herein is a recombinant DNA vector comprising the biosensor system of any preceding aspect.

In some aspects, disclosed herein is a method for detecting one or more polyketide synthase extender units, said method comprising:

1) introducing into a cell a recombinant DNA vector comprising:

a first nucleic acid comprising a genetically modified fapR gene, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type fapR gene, and wherein the first nucleic acid is operably linked to a first promoter; and a second nucleic acid comprising a reporter gene whose transcription is under the control of a second promoter which is regulated by the fapR transcription factor; and

2) measuring the one or more polyketide synthase extender units based on the expression of the reporter gene in the cell.

In some embodiments, the cell comprises a bacterial cell, a mammalian cell, or a yeast cell. In some embodiments, the bacterial cell comprises an E. coli cell.

In some aspects, disclosed herein is a cell-free biosensor system, comprising:

a recombinant DNA vector comprising: a reporter gene whose transcription is under the control of a promoter, a fapO operator, and a ribosome binding site;

a fapR transcription factor; and

transcription-translation reagents comprising a polymerase and/or a ribosome.

In some embodiments, the promoter comprises a T7 promoter. In some embodiments, the fapR transcription factor confers detection of one or more polyketide synthase extender units.

In some aspects, disclosed herein is a cell-free method for detecting one or more polyketide synthase extender units, said method comprising:

combining the cell-free biosensor system of any preceding aspect with one or more polyketide synthase extender units; and

measuring the one or more polyketide synthase extender units based on the expression of the reporter gene.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying figures, which are incorporated in and constitute a part of this specification, illustrate several aspects described below.

FIG. 1 shows biosynthesis and uses of malonyl-CoA analogues. Natural and engineered biosynthetic pathways for mCoA (1) and derivatives involve acyl-CoA ligases, acetyl-CoA dehydrogenases (AcDH), enoyl-CoA-reductases (ECR), malonyl-CoA synthetases, and various acyl-CoA carboxylases (propionyl-CoA carboxylase, PCC; acetyl-CoA carboxylase, ACC). Co-substrates and cofactors are omitted for clarity. Examples of clinically-relevant compounds that utilize malonyl-CoA and its derivatives are shown.

FIGS. 2A-2B illustrate redesigned malonyl-CoA biosensors. FIG. 2A is a scheme showing a previous synthetic gene circuit regulated by IPTG and mCoA. FIG. 2B is a scheme showing a redesigned synthetic gene circuit regulated only by mCoA. At low levels of mCoA, FapR binds to its cognate operator and represses transcription. At increased levels, mCoA binds to FapR and causes a conformational change, allowing transcription of a fluorescent reporter gene. IPTG, isopropyl β-D-1-thiogalactopyranoside; lacI, E. coli lactose repressor; T7, bacteriophage T7 promoter; lacO, lacI repressor binding site; RBS, ribosome binding site; fapR, B. subtilis fatty acid biosynthetic pathway repressor; fapO, FapR repressor binding site; eGFP, enhanced GFP; mCoA, malonyl-CoA; ProfapR/sfGFP, constitutive lac promoters; sfGFP, superfolder GFP.

FIG. 3 shows activation ratios of 93 variants from the FapR RBS library in E. coli 10G. The relative fluorescence (normalized to cell density) of each variant was determined at 0 mM and 25 mM cerulenin and divided to determine the fold activation. These values were compared with replicates (n=3) of the prototype biosensor RBS (1A1).
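By way of a non-limiting illustration, the fold activation described for FIG. 3 can be computed as the ratio of density-normalized fluorescence with cerulenin to that without cerulenin. The sketch below assumes that each measurement is a pair of raw fluorescence and cell density readings; the function names and the numerical values are hypothetical and are not data from the disclosure.

```python
def relative_fluorescence(fluorescence: float, cell_density: float) -> float:
    """Fluorescence normalized to cell density (for example, an optical density reading)."""
    if cell_density <= 0:
        raise ValueError("cell density must be positive")
    return fluorescence / cell_density


def fold_activation(induced: tuple[float, float],
                    uninduced: tuple[float, float]) -> float:
    """Ratio of normalized fluorescence with inducer to that without inducer.

    Each argument is a (raw fluorescence, cell density) pair.
    """
    return relative_fluorescence(*induced) / relative_fluorescence(*uninduced)


# Hypothetical plate-reader values, for illustration only.
example = fold_activation(induced=(12000.0, 0.5), uninduced=(3000.0, 0.6))
print(round(example, 2))  # 4.8
```

Normalizing before dividing keeps a variant that merely grows to a different density from appearing more or less responsive than it is.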

FIG. 4 shows cerulenin dose-response curves of the FapR biosensor in various E. coli strains. The relative fluorescence (normalized to cell density) was determined at each indicated cerulenin concentration and the curves were fit to the Hill equation. Error bars (where visible) are the standard deviation of the mean (3 biological replicates).
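The disclosure does not state which parametrization of the Hill equation was used for the dose-response fits. As a sketch only, a common four-parameter form is assumed below, in which f_min and f_max are the basal and saturated outputs, ec50 is the effector concentration giving a half-maximal response, and n is the Hill coefficient. All names and values are illustrative assumptions.

```python
def hill(x: float, f_min: float, f_max: float, ec50: float, n: float) -> float:
    """Four-parameter Hill response at effector concentration x (x >= 0)."""
    return f_min + (f_max - f_min) * x**n / (ec50**n + x**n)


# Hypothetical parameters, for illustration only.
params = dict(f_min=100.0, f_max=900.0, ec50=10.0, n=2.0)

# At x = ec50 the response lies halfway between f_min and f_max.
print(hill(10.0, **params))  # 500.0

# Tabulate a dose-response curve over a hypothetical concentration series.
for conc in (0.0, 1.0, 10.0, 100.0):
    print(conc, round(hill(conc, **params), 1))
```

In practice the four parameters would be estimated from the measured relative fluorescence values by nonlinear least squares, for instance with scipy.optimize.curve_fit, rather than set by hand as in this sketch.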

FIGS. 5A-5D show probing the effector promiscuity of FapR with malonyl-CoA derivatives by cell-free transcription-translation. FIG. 5A shows the mCoA-binding pocket of FapR (PDB: 2F3X) with key residues highlighted (green sticks). Dashed lines are hydrogen bonds (purple). The bound mCoA is shown (white sticks). The ribbon structure of each subunit of the homodimer is shown in light green and light cyan. FIG. 5B shows structures of each analogue tested with FapR. FIG. 5C is a scheme showing the layout of the cell-free biosensor, pET28a-T7-fapO-sfGFP. FIG. 5D shows fluorescence output of the cell-free biosensor pET28a-FapR/T7-fapO-sfGFP in the presence of each thioester at 200 mM. The output in the presence of the native effector mCoA is set to 100%. The (-) thioester control is boiled MatB. Error bars are the standard deviation of the mean (3 biological replicates). *p < 0.05 by Student’s unpaired two-tailed t-test vs. (-) thioester control. **p < 0.01 by Student’s unpaired two-tailed t-test vs. (-) thioester control.
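As an illustrative sketch of the analysis described for FIG. 5D, the output for each thioester can be expressed as a percentage of the mCoA output and compared with the (-) thioester control using a pooled-variance unpaired t statistic. The replicate values below are hypothetical, and the use of the pooled-variance form is an assumption based on the name "Student's" t-test. With three replicates per group there are 4 degrees of freedom, for which the standard two-tailed critical values are 2.776 (p < 0.05) and 4.604 (p < 0.01).

```python
from math import sqrt
from statistics import mean, variance


def percent_of_reference(sample, reference):
    """Mean sample signal as a percentage of the mean reference signal."""
    return 100.0 * mean(sample) / mean(reference)


def student_t(a, b):
    """Pooled-variance unpaired t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Two-tailed critical values of t for 4 degrees of freedom.
T_CRIT_DF4 = {0.05: 2.776, 0.01: 4.604}


def significance_marker(t):
    """Return '**', '*', or '' using the df = 4 critical values only."""
    if abs(t) > T_CRIT_DF4[0.01]:
        return "**"
    if abs(t) > T_CRIT_DF4[0.05]:
        return "*"
    return ""


# Hypothetical fluorescence replicates, for illustration only.
mcoa = [1000, 1100, 900]
analogue = [520, 480, 500]
control = [100, 120, 80]

pct = percent_of_reference(analogue, mcoa)
t_stat, dof = student_t(analogue, control)
print(pct)                          # 50.0
print(dof, significance_marker(t_stat))  # 4 **
```

The lookup table applies only to three replicates per group; other replicate counts would require the corresponding critical values or a computed p-value.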

FIG. 6 shows dose-response analysis of the cell-free TX-TL FapR assay. MatB-generated malonyl-CoA (mCoA) or commercial mCoA was included in a cell-free TX-TL mixture with purified FapR and T7-fapO-sfGFP and the sfGFP fluorescence was determined. Error bars are the standard deviation of the mean (3 biological replicates).

FIG. 7 shows electrophoretic mobility shift assay of FapR and malonyl-CoA or methylmalonyl-CoA. The 40 bp fapO fragment was incubated with purified FapR and either mCoA or mmCoA and visualized on a PAGE gel with SYBR Green I. The gel shift assay demonstrates binding of the fapO DNA by FapR (lane 1 vs. lane 2) and subsequent release upon addition of either malonyl-CoA (lane 3) or methylmalonyl-CoA (lane 4). (a) unbound fapO and (b) bound fapO.

FIG. 8 shows detection of mmCoA in E. coli K207-3 by FapR 2H8. Fluorescence output of the 2H8 biosensor was determined in the absence/presence of 1 mM IPTG / 1 mM propionate supplemented to the growth media. Error bars represent the standard deviation of three independent biological replicates. *p < 0.05 by Student’s unpaired two-tailed t-test.

FIG. 9 is a graphical abstract of the current invention.

FIG. 10 shows de novo biosynthesis of extender units from malonic acids. Scheme illustrating the combined action of the transport protein MatC and the malonyl-CoA synthetase MatB to furnish de novo extender units in E. coli. In the presence of a suitable FapR biosensor, the extender units are detected by quantification of GFP fluorescence.

FIG. 11 shows dose-response curves of the FapR biosensor in various strains of E. coli BL21(DE3). The relative fluorescence (normalized to cell density) was determined at each indicated malonic acid concentration and the curves were fit to the Hill equation. In the presence of both MatB and MatC, feeding malonic acid (R=H, FIG. 10) leads to dose-dependent GFP fluorescence. RtMatB, Rhizobium trifolii MatB; RpMatB, Rhodopseudomonas palustris MatB; MatC, Rhizobium trifolii MatC.

FIG. 12 shows de novo biosynthesis and detection of methylmalonyl-CoA in E. coli K207-3. Scheme illustrating the action of the propionyl-CoA carboxylase PCC to furnish methylmalonyl-CoA from propionate in E. coli. In the presence of a suitable FapR biosensor, methylmalonyl-CoA is detected by quantification of GFP fluorescence.

FIG. 13 shows dose-response curves of wild-type and mutant FapR biosensor in various E. coli strains. The relative fluorescence (normalized to cell density) was determined at each indicated propionate concentration and the curves were fit to the Hill equation. The GFP fluorescence output of the FapR mutant F99L is higher than that of the wild-type FapR biosensor at low (<3 mM) concentrations of propionate, indicating a mutant with better sensitivity towards methylmalonyl-CoA.

DETAILED DESCRIPTION

Engineered bacterial transcriptional regulators have contributed immensely to chemical and synthetic biology by providing highly modular and tunable devices for sensing small molecules. When coupled to fluorescent or chromogenic readouts, transcription factor-based biosensors have been applied as tools that detect key metabolites, regulate biosynthetic circuits, guide metabolic engineering, and enable directed evolution of enzymes and pathways. However, biosensors for the detection of natural products and their biosynthetic precursors are not yet widely available, which limits the application of high-throughput strategies to many important classes of molecules. For instance, malonyl-CoA (mCoA) plays an integral role in cellular primary and secondary metabolism as a building block for fatty acids, phenylpropanoids, polyketides, and hybrid natural products. Because of this, mCoA biosynthesis has been a longstanding target of metabolic engineering efforts. Accordingly, genetically-encoded biosensors for the detection of mCoA have been constructed using FapR, a transcriptional regulator that is found in nearly all Gram-positive bacteria and acts as a global regulator of fatty acid biosynthesis. Crystal structures of FapR indicate a dimer in which each monomer comprises a C-terminal ligand-binding domain and an N-terminal domain that binds to its cognate DNA operator, fapO. Depending on the promoter, FapR has been shown to act as either an activator or a repressor in the presence of its native ligand, mCoA. Given its utility as a transcriptional regulator, FapR has been developed and utilized as a mCoA biosensor in several hosts including E. coli, yeast, and mammalian cells. Notably, none of the previously reported FapR-based mCoA biosensors were utilized for detection of ligands beyond mCoA. Indeed, compounds related to mCoA such as acetyl-CoA, propionyl-CoA, succinyl-CoA, and butyryl-CoA have been reported to be non-effectors of FapR. Furthermore, despite the important role of mCoA derivatives substituted at the C2 position as extender unit building blocks for many biologically-relevant polyketides, genetically-encoded biosensors for C2-derivatives of mCoA have yet to be reported. What is needed are improved systems and methods for detecting a broad range of polyketide extender units beyond just mCoA.

Herein, the FapR biosensor was re-engineered for a range of mCoA concentrations across a panel of E. coli strains. The effector specificity of FapR was probed by cell-free transcription-translation, revealing that a variety of non-native and non-natural acyl-thioesters are FapR effectors. This FapR promiscuity proved sufficient for the detection of the polyketide extender unit methylmalonyl-CoA in E. coli, providing the first reported genetically encoded biosensor for this important metabolite. As such, the previously unknown broad effector promiscuity of FapR provides a platform to develop new tools and approaches that can be leveraged to overcome limitations of pathways that construct diverse α-carboxyacyl-CoAs and those that are dependent on them, including biofuels, antibiotics, anticancer drugs, and other value-added products.

Described herein is a platform technology that comprises genetically-encoded biosensors and methods for detection of a class of small molecules called polyketides. Such biosensors broadly recognize malonyl-CoA and its derivatives.

Reference will now be made in detail to the embodiments of the invention, examples of which are illustrated in the drawings and the examples. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.

Terminology

Terms used throughout this application are to be construed with ordinary and typical meaning to those of ordinary skill in the art. However, Applicant desires that the following terms be given the particular definition as defined below.

As used herein, the articles “a,” “an,” and “the” mean “at least one,” unless the context in which the article is used clearly indicates otherwise. The term “comprising” and variations thereof as used herein are used synonymously with the term “including” and variations thereof and are open, non-limiting terms. Although the terms “comprising” and “including” have been used herein to describe various embodiments, the terms “consisting essentially of” and “consisting of” can be used in place of “comprising” and “including” to provide for more specific embodiments and are also disclosed.

As used herein, the terms “may,” “optionally,” and “may optionally” are used interchangeably and are meant to include cases in which the condition occurs as well as cases in which the condition does not occur. Thus, for example, the statement that a formulation “may include an excipient” is meant to include cases in which the formulation includes an excipient as well as cases in which the formulation does not include an excipient.

The terms "about" and "approximately" are defined as being "close to" as understood by one of ordinary skill in the art. In one non-limiting embodiment, the terms are defined to be within 10%. In another non-limiting embodiment, the terms are defined to be within 5%. In still another non-limiting embodiment, the terms are defined to be within 1%.

The term “biosensor” is defined as an analytical tool comprised of biological components that are used to detect the presence of target ligand(s) and to generate a signal. As used herein, biosensors can detect polyketide synthase extender units.

A "composition" is intended to include a combination of active agent and another compound or composition, inert (for example, a detectable agent or label) or active, such as an adjuvant.

The term "nucleic acid" as used herein means a polymer composed of nucleotides, e.g. deoxyribonucleotides or ribonucleotides.

The terms "ribonucleic acid" and "RNA" as used herein mean a polymer composed of ribonucleotides.

The terms "deoxyribonucleic acid" and "DNA" as used herein mean a polymer composed of deoxyribonucleotides.

The term "oligonucleotide" denotes single- or double-stranded nucleotide multimers of from about 2 to up to about 100 nucleotides in length. Suitable oligonucleotides may be prepared by the phosphoramidite method described by Beaucage and Carruthers, Tetrahedron Lett., 22: 1859-1862 (1981), or by the triester method according to Matteucci, et al., J. Am. Chem. Soc., 103:3185 (1981), both incorporated herein by reference, or by other chemical methods using either a commercial automated oligonucleotide synthesizer or VLSIPS™ technology. When oligonucleotides are referred to as "double-stranded," it is understood by those of skill in the art that a pair of oligonucleotides exist in a hydrogen-bonded, helical array typically associated with, for example, DNA. In addition to the 100% complementary form of double-stranded oligonucleotides, the term "double-stranded," as used herein is also meant to refer to those forms which include such structural features as bulges and loops, described more fully in such biochemistry texts as Stryer, Biochemistry, Third Ed., (1988), incorporated herein by reference for all purposes.

The term "polynucleotide" refers to a single or double stranded polymer composed of nucleotide monomers.

The term "polypeptide" refers to a compound made up of a single chain of D- or L-amino acids or a mixture of D- and L-amino acids joined by peptide bonds.

The term "promoter" or "regulatory element" refers to a region or sequence determinants located upstream or downstream from the start of transcription and which are involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. Promoters need not be of bacterial origin, for example, promoters derived from viruses or from other organisms can be used in the compositions, systems, or methods described herein.

The term “recombinant” refers to a human manipulated nucleic acid (e.g. polynucleotide) or a copy or complement of a human manipulated nucleic acid (e.g. polynucleotide), or if in reference to a protein (i.e., a “recombinant protein”), a protein encoded by a recombinant nucleic acid (e.g. polynucleotide). In some embodiments, a recombinant expression cassette comprising a promoter operably linked to a second nucleic acid (e.g. polynucleotide) may include a promoter that is heterologous to the second nucleic acid (e.g. polynucleotide) as the result of human manipulation (e.g., by methods described in Sambrook et al., Molecular Cloning—A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., (1989) or Current Protocols in Molecular Biology Volumes 1-3, John Wiley & Sons, Inc. (1994-1998)). In another example, a recombinant expression cassette may comprise nucleic acids (e.g. polynucleotides) combined in such a way that the nucleic acids (e.g. polynucleotides) are extremely unlikely to be found in nature. For instance, human manipulated restriction sites or plasmid vector sequences may flank or separate the promoter from the second nucleic acid (e.g. polynucleotide). One of skill will recognize that nucleic acids (e.g. polynucleotides) can be manipulated in many ways and are not limited to the examples above.

The term “expression cassette” or “vector” refers to a nucleic acid construct, which when introduced into a host cell, results in transcription and/or translation of an RNA or polypeptide, respectively. In some embodiments, an expression cassette comprising a promoter operably linked to a second nucleic acid (e.g. polynucleotide) may include a promoter that is heterologous to the second nucleic acid (e.g. polynucleotide) as the result of human manipulation (e.g., by methods described in Sambrook et al., Molecular Cloning—A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., (1989) or Current Protocols in Molecular Biology Volumes 1-3, John Wiley & Sons, Inc. (1994-1998)).

The terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, preferably 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or higher identity over a specified region when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI web site or the like). Such sequences are then said to be “substantially identical.” This definition also refers to, or may be applied to, the complement of a test sequence. The definition also includes sequences that have deletions and/or additions, as well as those that have substitutions. As described below, the preferred algorithms can account for gaps and the like. Preferably, identity exists over a region that is at least about 10 amino acids or 20 nucleotides in length, or more preferably over a region that is 10-50 amino acids or 20-50 nucleotides in length. As used herein, percent (%) nucleotide sequence identity is defined as the percentage of nucleotides in a candidate sequence that are identical to the nucleotides in a reference sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity. Alignment for purposes of determining percent sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign (DNASTAR) software. 
Appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full-length of the sequences being compared can be determined by known methods.
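
By way of illustration only, the percent identity calculation described above can be sketched for two sequences that have already been aligned. The function name `percent_identity`, the use of `-` as the gap character, and the choice to divide by the number of aligned columns (gaps included) are assumptions made for this sketch; other denominators are in common use, and the sketch is not the BLAST implementation.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of two pre-aligned sequences.

    Assumption for this sketch: the denominator is the number of aligned
    columns, counting columns in which one sequence has a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns in which both sequences carry a gap character.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only when both residues are identical and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Ungapped comparison of two 9-nucleotide sequences.
print(round(percent_identity("CAAGGAGGT", "TACACAGGC"), 1))
```

In this sketch a gap opposite a residue counts as a non-identical column, so introducing gaps can raise or lower the reported value depending on the alignment chosen.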

For sequence comparisons, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Preferably, default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters. Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1990) J. Mol. Biol. 215:403-410, and Altschul et al. (1997) Nuc. Acids Res. 25:3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al. (1990) J. Mol. Biol. 215:403-410). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. 
Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915), alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
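
The three halting conditions for word-hit extension can be illustrated with a simplified, ungapped, rightward-only extension of a nucleotide seed. This is a sketch under stated assumptions and not the BLAST source code: the function name `extend_right`, the default `x_drop` value of 20, and the assumption that the seed word is an exact match are all hypothetical choices made for illustration. The reward M=5, penalty N=-4 and wordlength W=11 are the BLASTN defaults recited above. Extension to the left would mirror the same logic.

```python
def extend_right(query, subject, q_start, s_start, word_len=11,
                 match=5, mismatch=-4, x_drop=20):
    """Extend an exact seed word hit to the right without gaps.

    Returns (best_score, best_length), where best_length counts residues
    from the start of the seed to the end of the best-scoring extension.
    """
    # Assumption: the seed is an exact match, so it scores match * word_len.
    score = best = match * word_len
    best_len = word_len
    i = word_len
    # Halting condition 3: stop when the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        same = query[q_start + i] == subject[s_start + i]
        score += match if same else mismatch
        i += 1
        if score > best:
            best, best_len = score, i
        elif best - score >= x_drop or score <= 0:
            # Halting conditions 1 and 2: score fell off by X from its
            # maximum, or the cumulative score went to zero or below.
            break
    return best, best_len


query = "ACGTACGTACGTTTTT"
subject = "ACGTACGTACGTAAAA"
print(extend_right(query, subject, 0, 0))
```

A smaller `x_drop` stops the extension sooner after a run of mismatches, which trades sensitivity for speed, in line with the statement above that W, T, and X determine the sensitivity and speed of the alignment.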

The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably less than about 0.01.

The phrase “codon optimized” as it refers to genes or coding regions of nucleic acid molecules for the transformation of various hosts, refers to the alteration of codons in the gene or coding regions of polynucleic acid molecules to reflect the typical codon usage of a selected organism without altering the polypeptide encoded by the DNA. Such optimization includes replacing at least one, or more than one, or a significant number, of codons with one or more codons that are more frequently used in the genes of that selected organism.
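
As a non-limiting sketch of the codon replacement described above, the following replaces codons with synonymous codons from a preference table and confirms that the encoded polypeptide is unchanged. The `PREFERRED_CODONS` table is a hypothetical placeholder chosen for illustration and is not a measured codon usage table for any organism; the function names and the toy input sequence are likewise assumptions of this sketch.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical host preferences, for illustration only.
PREFERRED_CODONS = {"L": "CTG", "R": "CGT", "G": "GGC"}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, where one is listed."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    optimized = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        optimized.append(preferred.get(CODON_TABLE[codon], codon))
    result = "".join(optimized)
    # Guard: optimization must not alter the encoded polypeptide.
    if translate(result) != translate(dna):
        raise ValueError("preference table changes the encoded polypeptide")
    return result


original = "ATGTTAAGAGGATAA"  # toy sequence encoding M-L-R-G-stop
optimized = codon_optimize(original, PREFERRED_CODONS)
print(optimized, translate(optimized))
```

Amino acids absent from the preference table keep their original codon, which corresponds to replacing "at least one, or more than one" codon rather than every codon.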

Nucleic acid is “operably linked” when it is placed into a functional relationship with another nucleic acid sequence. For example, DNA for a presequence or secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation. Generally, “operably linked” means that the DNA sequences being linked are near each other, and, in the case of a secretory leader, contiguous and in reading phase. However, operably linked nucleic acids (e.g. enhancers and coding sequences) do not have to be contiguous. Linking is accomplished by ligation at convenient restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice. In some embodiments, a promoter is operably linked with a coding sequence when it is capable of affecting (e.g. modulating relative to the absence of the promoter) the expression of a protein from that coding sequence (i.e., the coding sequence is under the transcriptional control of the promoter).

"Ribosome binding site" or "RBS" is also called the Shine-Dalgarno sequence and generally has a sequence complementary to the 3' terminus of 16S rRNA. The ribosomal binding site is found in bacterial and archaeal messenger RNA, and is generally located about 8 bases upstream of the start codon AUG. In particular, the RBS sequence which appears at high frequency is AGGAGG or AAGGAGG (hereinafter these sequences are referred to as "consensus RBS sequences"), or a sequence homologous with a "consensus RBS sequence". Although these sequences appear at various sites of genes, it is understood that the RBS sequences appear at high frequency in regions upstream of start codons. Also included in the term "RBS" is the RBS sequence from the FapR gene as disclosed herein (CAAGGAGGT (SEQ ID NO:1)). Other functional RBS sequences can also be used in place of the specific sequences disclosed herein. When discussing nucleotide mutations in the RBS, the first C is labeled as nucleotide "1" and the final T is labeled as nucleotide "9". Alternatively, the mutations may sometimes be referred to by their position relative to the ATG start codon. The basic structure of a prokaryotic gene consists of a promoter which starts the synthesis of mRNA, a ribosome binding site which participates in the binding between mRNA and ribosomes and in the translation initiation, a start codon, a translation stop codon and a terminator which terminates the synthesis of mRNA. The AUG codon is the most appropriate as a start codon. Since the start codons and coding regions are determined usually based upon a DNA sequence, in the present specification, the sequences of start codons and stop codons and sequences involved in the binding of ribosomes and mRNA are expressed as DNA sequences appropriately as well as RNA sequences, unless mentioned specifically.

The term "gene" or "gene sequence" refers to the coding sequence or control sequence, or fragments thereof. A gene may include any combination of coding sequence and control sequence, or fragments thereof. Thus, a "gene" as referred to herein may be all or part of a native gene. A polynucleotide sequence as referred to herein may be used interchangeably with the term "gene", or may include any coding sequence, non-coding sequence or control sequence, fragments thereof, and combinations thereof. The term "gene" or "gene sequence" includes, for example, control sequences upstream of the coding sequence (for example, the ribosome binding site).

Compositions, Biosensors, and Methods of Use

In some aspects, disclosed herein is a biosensor system comprising:

a first nucleic acid comprising a genetically modified fapR gene, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type fapR gene, and wherein the first nucleic acid is operably linked to a first promoter; and

a second nucleic acid comprising a reporter gene whose transcription is under the control of a second promoter which is regulated by the fapR transcription factor.

In some embodiments, the first nucleic acid and the second nucleic acid are located on one recombinant DNA vector.

In some embodiments, the first nucleic acid and second nucleic acid do not comprise a lacO sequence.

In some embodiments, the first nucleic acid comprises a first ribosome binding site. In some embodiments, the second nucleic acid comprises a second ribosome binding site and a fapO operator. In some embodiments, the first promoter and the second promoter initiate transcription in opposite directions. In some embodiments, the first promoter and the second promoter initiate transcription in the same direction.

In some embodiments, the first ribosome binding site sequence is TAVRCAGGH (SEQ ID NO:2); wherein V is A, C, or G; wherein R is A or G; and wherein H is A, C, or T.
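
For illustration, the degenerate sequence TAVRCAGGH can be expanded into the concrete 9-nucleotide sequences it covers using the degeneracy definitions given above (V is A, C, or G; R is A or G; H is A, C, or T). The function name `expand_degenerate` and the dictionary layout are assumptions made for this sketch.

```python
from itertools import product

# Degeneracy codes limited to those used in TAVRCAGGH, as defined above.
DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "V": "ACG",  # A, C, or G
    "R": "AG",   # A or G
    "H": "ACT",  # A, C, or T
}


def expand_degenerate(pattern: str) -> list:
    """Return every concrete sequence matched by a degenerate pattern."""
    choices = [DEGENERATE[symbol] for symbol in pattern.upper()]
    return ["".join(combo) for combo in product(*choices)]


variants = expand_degenerate("TAVRCAGGH")
print(len(variants))            # 3 * 2 * 3 = 18 sequences
print("TACACAGGC" in variants)  # RBS 2H8 (SEQ ID NO:10) fits the pattern
```

The native fapR RBS, CAAGGAGGT (SEQ ID NO:1), is not among the expanded sequences because the degenerate pattern fixes a T at the first position.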

In some embodiments, the first ribosome binding site sequence is selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, and SEQ ID NO:20. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:3. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:4. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:5. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:6. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:7. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:8. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:9. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:10. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:11. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:12. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:13. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:14. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:15. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:16. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:17. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:18. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:19. In some embodiments, the first ribosome binding site sequence comprises SEQ ID NO:20.
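
An RBS variant such as those listed above can be described relative to the native fapR RBS (SEQ ID NO:1) using the numbering convention set out in the definitions, in which the first C is nucleotide 1 and the final T is nucleotide 9. The sketch below is illustrative only; the function name `rbs_substitutions` and the "C1T" style of label (original base, position, new base) are assumptions of this sketch and are not notation taken from the disclosure.

```python
WILD_TYPE_RBS = "CAAGGAGGT"  # native fapR RBS, SEQ ID NO:1


def rbs_substitutions(variant: str, wild_type: str = WILD_TYPE_RBS) -> list:
    """List substitutions as <wild-type base><position><variant base>.

    Positions are 1-based, so the first C of the native RBS is nucleotide 1
    and the final T is nucleotide 9.
    """
    variant = variant.upper()
    if len(variant) != len(wild_type):
        raise ValueError("variant and wild-type RBS must have the same length")
    return [
        f"{wt}{position}{var}"
        for position, (wt, var) in enumerate(zip(wild_type, variant), start=1)
        if wt != var
    ]


# RBS 2H8 (TACACAGGC, SEQ ID NO:10) compared with the native RBS.
print(rbs_substitutions("TACACAGGC"))
```

The same comparison could instead be reported relative to the ATG start codon, the alternative convention noted in the definitions, by offsetting each position by the distance between the RBS and the start codon in the construct concerned.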

In some embodiments, the first ribosome binding site sequence is selected from the group comprising a nucleic acid sequence at least 60% (for example, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%) identical to SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, or SEQ ID NO:20.

In some embodiments, the wild-type fapR gene comprises a polynucleotide sequence at least 60% (for example, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%) identical to SEQ ID NO:25. In some embodiments, the wild-type fapR gene comprises the polynucleotide sequence of SEQ ID NO:25.

In some embodiments, the genetically modified fapR gene comprises a polynucleotide sequence at least 60% (for example, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%) identical to SEQ ID NO:24. In some embodiments, the genetically modified fapR gene comprises the polynucleotide sequence of SEQ ID NO:24. In some embodiments, the genetically modified fapR gene encodes a fapR protein comprising an amino acid substitution at residue F99 when compared to SEQ ID NO:29. In some embodiments, the genetically modified fapR gene encodes a fapR protein comprising a F99L amino acid substitution when compared to SEQ ID NO:29. In some embodiments, the genetically modified fapR gene encodes a fapR protein comprising the amino acid sequence of SEQ ID NO:28.

In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises a polynucleotide sequence selected from SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, and SEQ ID NO:20. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:1. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:2. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:3. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:4. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:5. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:6. 
In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:7. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:8. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:9. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:10. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:11. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:12. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:13. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:14. 
In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:15. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:16. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:17. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:18. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:19. In some embodiments, the genetically modified fapR gene of the biosensor system disclosed herein comprises the polynucleotide sequence of SEQ ID NO:24 and the first ribosomal binding site comprises the polynucleotide sequence of SEQ ID NO:20.

In some embodiments, the genetically modified fapR gene confers detection of one or more polyketide synthase extender units. Polyketides are assembled by successive rounds of decarboxylative Claisen condensations between a thioesterified malonate derivative and an acyl thioester. The enzyme that catalyzes these condensations is a “polyketide synthase”. Polyketide synthases catalyze condensation reactions between an activated carboxylic acid (e.g. acetyl-CoA, which is the activated form of acetate) and an activated dicarboxylic acid (e.g. malonyl-CoA, which is the activated form of malonate). These condensation reactions take place through a decarboxylative Claisen condensation mechanism in which the activated carboxylic acid (e.g. acetyl-CoA) serves as “starter” or “primer” and the activated dicarboxylic acid (e.g. malonyl-CoA) serves as “extender” unit. This reaction involves the decarboxylation of the extender and results in the formation of a di- or polyketide that is two carbons longer than the starter.

In some embodiments, the one or more polyketide synthase extender units comprise malonyl-CoA or an α-substituted derivative thereof. The structure of malonyl-CoA is shown below:

In some embodiments, the one or more polyketide synthase extender units is an α-substituted derivative of malonyl-CoA. The structure is shown below:

In some embodiments, the α-substituted derivative of malonyl-CoA comprises a substituent selected from the group consisting of alkyl, alkynyl, and alkenyl. In some embodiments, the α-substituted derivative of malonyl-CoA comprises an alkyl substituent. In some embodiments, the α-substituted derivative of malonyl-CoA comprises an alkynyl substituent. In some embodiments, the α-substituted derivative comprises an alkenyl substituent.

The term “alkyl” as used herein is a branched or unbranched saturated hydrocarbon group of 1 to 24 carbon atoms, such as methyl, ethyl, n-propyl, isopropyl, n-butyl, isobutyl, t-butyl, pentyl, hexyl, heptyl, octyl, nonyl, decyl, dodecyl, tetradecyl, hexadecyl, eicosyl, tetracosyl, and the like. The alkyl group can also be substituted or unsubstituted. The alkyl group can be substituted with one or more groups including, but not limited to, alkyl, halogenated alkyl, alkoxy, alkenyl, alkynyl, aryl, heteroaryl, aldehyde, amino, carboxylic acid, ester, ether, halide, hydroxy, ketone, nitro, silyl, sulfo-oxo, sulfonyl, sulfone, sulfoxide, or thiol.

The term “alkenyl” as used herein is a hydrocarbon group of from 2 to 24 carbon atoms with a structural formula containing at least one carbon-carbon double bond. Asymmetric structures such as (A1A2)C=C(A3A4) are intended to include both the E and Z isomers. This may be presumed in structural formulae herein wherein an asymmetric alkene is present, or it may be explicitly indicated by the bond symbol C=C. The alkenyl group can be substituted with one or more groups including, but not limited to, alkyl, halogenated alkyl, alkoxy, alkenyl, alkynyl, aryl, heteroaryl, aldehyde, amino, carboxylic acid, ester, ether, halide, hydroxy, ketone, nitro, silyl, sulfo-oxo, sulfonyl, sulfone, sulfoxide, or thiol.

The term “alkynyl” as used herein is a hydrocarbon group of 2 to 24 carbon atoms with a structural formula containing at least one carbon-carbon triple bond. The alkynyl group can be substituted with one or more groups including, but not limited to, alkyl, halogenated alkyl, alkoxy, alkenyl, alkynyl, aryl, heteroaryl, aldehyde, amino, carboxylic acid, ester, ether, halide, hydroxy, ketone, nitro, silyl, sulfo-oxo, sulfonyl, sulfone, sulfoxide, or thiol.

The term “aryl” as used herein is a group that contains any carbon-based aromatic group including, but not limited to, benzene, naphthalene, phenyl, biphenyl, phenoxybenzene, and the like. The term “heteroaryl” is defined as a group that contains an aromatic group that has at least one heteroatom incorporated within the ring of the aromatic group. Examples of heteroatoms include, but are not limited to, nitrogen, oxygen, sulfur, and phosphorus. The term “non-heteroaryl,” which is included in the term “aryl,” defines a group that contains an aromatic group that does not contain a heteroatom. The aryl and heteroaryl group can be substituted or unsubstituted. The aryl and heteroaryl group can be substituted with one or more groups including, but not limited to, alkyl, halogenated alkyl, alkoxy, alkenyl, alkynyl, aryl, heteroaryl, aldehyde, amino, carboxylic acid, ester, ether, halide, hydroxy, ketone, nitro, silyl, sulfo-oxo, sulfonyl, sulfone, sulfoxide, or thiol as described herein. The term “biaryl” is a specific type of aryl group and is included in the definition of aryl. Biaryl refers to two aryl groups that are bound together via a fused ring structure, as in naphthalene, or are attached via one or more carbon-carbon bonds, as in biphenyl.

In some embodiments, the α-substituted derivative of malonyl-CoA comprises methylmalonyl-CoA or propargylmalonyl-CoA. In some embodiments, the α-substituted derivative of malonyl-CoA is methylmalonyl-CoA. In some embodiments, the α-substituted derivative of malonyl-CoA is propargylmalonyl-CoA. The structures of methylmalonyl-CoA and propargylmalonyl-CoA are shown below:

In some embodiments, the α-substituted derivative of malonyl-CoA comprises a C1-C6 alkyl, a C2-C6 alkynyl, or a C2-C6 alkenyl. The structure can be, but is not limited to, the examples shown below:

In some embodiments, the reporter gene (or marker gene) comprises a gene encoding for chloramphenicol acetyltransferase, beta-galactosidase, luciferase, or a fluorescent protein. In some embodiments, the fluorescent protein is Superfolder GFP.

In some embodiments, the biosensor system of any preceding aspect further comprises a gene encoding MatC and/or a gene encoding MatB. In some embodiments, the biosensor comprises a polynucleotide sequence at least 60% (for example, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%) identical to SEQ ID NO:26. In some embodiments, the biosensor comprises the polynucleotide of SEQ ID NO:26. In some embodiments, the biosensor comprises a polynucleotide sequence at least 60% (for example, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%) identical to SEQ ID NO:27. In some embodiments, the biosensor comprises the polynucleotide of SEQ ID NO:27.

In some aspects, disclosed herein is a recombinant DNA vector comprising the biosensor system of any preceding aspect.

In some aspects, disclosed herein is a method for making (manufacturing) a recombinant DNA vector comprising the biosensor system of any preceding aspect, said method comprising: constructing a first nucleic acid and a second nucleic acid into the DNA vector; wherein the first nucleic acid comprises a genetically modified fapR gene that has at least one genetic mutation when compared to the wild-type fapR gene, and wherein the first nucleic acid is operably linked to a first promoter; and wherein the second nucleic acid comprises a reporter gene whose transcription is under the control of a second promoter which is regulated by the fapR transcription factor.

In some aspects, disclosed herein is a method for detecting one or more polyketide synthase extender units, said method comprising: 1) introducing into a cell a recombinant DNA vector comprising: a first nucleic acid comprising a genetically modified fapR gene, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type fapR gene, and wherein the first nucleic acid is operably linked to a first promoter; and a second nucleic acid comprising a reporter gene whose transcription is under the control of a second promoter which is regulated by the fapR transcription factor; and 2) measuring the one or more polyketide synthase extender units based on the expression of the reporter gene in the cell.

In some embodiments, the cell comprises a bacterial cell, a mammalian cell, or a yeast cell. In some embodiments, the cell is a bacterial cell. In some embodiments, the bacterial cell comprises an E. coli cell.

In some embodiments, the vector comprises a plasmid based vector, an adenovirus vector, a vaccinia vector, or a retroviral vector. In some embodiments, the vector integrates into the genomic DNA of the cell. In some embodiments, the vector is an episomal vector.

In some aspects, disclosed herein is a cell-free biosensor system, comprising: a recombinant DNA vector comprising: a reporter gene whose transcription is under the control of a promoter, a fapO operator, and a ribosome binding site; a fapR transcription factor; and transcription-translation reagents comprising a polymerase and/or a ribosome. In some aspects, disclosed herein is a cell-free biosensor system, comprising: a recombinant DNA vector comprising: a reporter gene whose transcription is under the control of a promoter, a fapO operator, and a ribosome binding site; wherein the DNA vector does not comprise a lacO sequence.

In some embodiments, the transcription-translation reagents are provided from a reconstituted protein synthesis system that comprises all the necessary components for in vitro transcription and translation (for example, PURExpress In Vitro Protein Synthesis Kit from New England Biolabs).

In some embodiments, the promoter comprises a T7 promoter. In some embodiments, the fapR transcription factor confers detection of one or more polyketide synthase extender units.

In some aspects, disclosed herein is a cell-free method for detecting one or more polyketide synthase extender units, said method comprising: combining the cell-free biosensor system of any preceding aspect with one or more polyketide synthase extender units; and measuring the one or more polyketide synthase extender units based on the expression of the reporter gene.

EXAMPLES

The following examples are set forth below to illustrate the compounds, systems, methods, and results according to the disclosed subject matter. These examples are not intended to be inclusive of all aspects of the subject matter disclosed herein, but rather to illustrate representative methods and results. These examples are not intended to exclude equivalents and variations of the present invention which are apparent to one skilled in the art.

Example 1. Development of a genetically-encoded biosensor for detection of polyketide synthase extender units in Escherichia coli.

Refactoring the FapR Operon for an Efficient Prototype Malonyl-CoA Biosensor. Previously published E. coli malonyl-CoA biosensors were based on FapR as a repressor (with the addition of a second effector molecule, e.g. IPTG) or FapR as an activator. Thus, these biosensors have been limited in their ability to be coupled with other protein expression systems and unnecessarily increase the metabolic burden on the host strain. To address this constraint, a biosensor construct was desired that required no external inputs to function and that was robust enough to use as a template for directed evolution of FapR or other proteins of interest. Towards this end, a new plasmid, pSENSE2FF, was designed based on a previously described macrolide biosensor. Two constitutive lac promoters, ProsfGFP and ProfapR (pLacIQ), control the transcription of a fluorescent reporter gene (super-folder GFP, sfGFP) and a codon-optimized version of the fapR gene from B. subtilis, respectively. Downstream of ProsfGFP, a single copy of the 17 bp fapO operator was introduced to afford FapR control of reporter transcription (FIG. 2B and Table 1). sfGFP was selected due to its higher brightness, rapid folding, and low photobleaching compared with other GFP variants.

Table 1. Sequences of DNA Constructs. RBS 1A1 (CAAGGAGGT, SEQ ID NO:1) for fapR in pSENSE2FF is underlined. For RBS 2H8, the sequence was TACACAGGC (SEQ ID NO:10). The fapR_fapO fragment is bolded in pSENSE2FF.

In Vivo Engineering of the Prototype Malonyl-CoA Biosensor. Following the construction of the prototype refactored biosensor system, an in vivo assay utilizing cerulenin was developed to determine its fold activation. Because mCoA is not cell-permeable, it is not possible to directly manipulate its intracellular concentration. Cerulenin acts as an equimolar inhibitor of the β-ketoacyl-ACP synthase and causes a build-up of intracellular mCoA, which can no longer be processed to fatty acids. The linear response of [mCoA] to [cerulenin] has been previously established for the concentrations used here. The fold activation of the prototype biosensor strain (RBS 1A1, SEQ ID NO:1) in response to mCoA was determined by measuring sfGFP fluorescence in the presence and absence of cerulenin supplemented to the culture media, revealing a modest activation ratio (~5-fold, FIG. 3). Thus, the prototype biosensor required further engineering to enhance its sensitivity and fold-activation. Lower concentrations of FapR result in higher fluorescence output at a fixed concentration of mCoA. Therefore, the RBS of fapR was targeted for mutagenesis by designing an 18-member RBS variant library (Table 2) with a maximum calculated translation initiation rate (TIR) equal to that of the RBS of 1A1 (7,594 au). The fold-activation of ~300 clones from the fapR RBS library was determined by again leveraging cerulenin to enhance the intracellular concentration of mCoA (FIG. 3). Most RBS variants resulted in significantly higher activation than the prototype 1A1 construct, with some reaching >60-fold activation under the assay conditions (FIG. 3).

Table 2. RBS Sequences

FapR RBS 1A1: CAAGGAGGT (SEQ ID NO:1)

FapR RBS Library: TAVRCAGGH (SEQ ID NO:2) V = A, C, or G; R = A or G; H = A, C, or T
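The degenerate library definition above can be expanded mechanically. The following is a minimal sketch, assuming only the three degenerate codes defined in Table 2; the function name `expand_degenerate` is hypothetical and introduced here for illustration. It enumerates the members encoded by TAVRCAGGH and checks that the count agrees with the 18-member library size and that the 2H8 sequence (TACACAGGC, SEQ ID NO:10) is one of them.

```python
from itertools import product

# Degenerate base codes as defined in Table 2, plus the four plain bases.
DEGENERATE_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "V": "ACG",  # A, C, or G
    "R": "AG",   # A or G
    "H": "ACT",  # A, C, or T
}


def expand_degenerate(sequence):
    """Return every concrete sequence encoded by a degenerate sequence."""
    choices = [DEGENERATE_CODES[base] for base in sequence]
    return ["".join(combination) for combination in product(*choices)]


library = expand_degenerate("TAVRCAGGH")

print(len(library))            # 3 (V) x 2 (R) x 3 (H) = 18 members
print("TACACAGGC" in library)  # True: RBS 2H8 is a library member
```

The prototype RBS 1A1 (CAAGGAGGT) is not itself encoded by the degenerate sequence, since its first position is C rather than T.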

Several variants were selected from the FapR RBS library and their dose-response curves with cerulenin determined in E. coli DH5α. One notable variant, 2H8 (SEQ ID NO:10), predicted to have the weakest calculated RBS (TIR = 171 au), provided a higher fluorescence output than 1A1 across the entire range of cerulenin concentrations assayed. The variant 2H8 also displayed a robust activation ratio (~34, Table 3) and provided a maximal fluorescent response of 58,000 RFU (relative fluorescence units) that was more than 3-fold higher than that of 1A1 (FIG. 4 and Table 3). Notably, this advantage came with the side effect of a leaky OFF state of ~1,400 RFU, ~6-fold higher than that of 1A1, such that the K1/2 of 2H8 with cerulenin was indistinguishable from that of 1A1 in this strain (Table 3). Nevertheless, this RBS engineering approach quickly arrived at a variant biosensor with the desired improvement in detection ability across a range of cerulenin concentrations.

FapR-Based Detection of Malonyl-CoA Across Various E. coli Strains. The FapR-based biosensor inherently accounts for basal levels of ligand as mCoA is required for cell growth and is always present. However, the contribution of background mCoA to overall biosensor activity under other conditions cannot be easily subtracted because the levels of mCoA fluctuate significantly during growth of a culture. Moreover, the effect of the inhibitor cerulenin across various strains of E. coli may not be consistent. To determine whether the engineered FapR biosensor response is dependent on the host strain, the best performing variant 2H8 was also transformed into E. coli 10G and E. coli TOP10, and the cerulenin dose-response curves determined. Notably, there are differences in how the three E. coli strains respond to cerulenin and consequently, the dose-response curves for each strain are significantly different (FIG. 4 and Table 3). For example, the maximum fluorescence output of the FapR biosensor in E. coli TOP10, 10G, and DH5α is reached at ~30, ~50, and ~75 µM cerulenin, respectively (Table 3). In addition, the fluorescence output of the FapR biosensor in E. coli TOP10 at high concentrations of cerulenin (>50 µM) is significantly lower than that in E. coli DH5α or 10G (FIG. 4). Thus, the FapR biosensor performs differently across various E. coli strains during growth of the culture but crucially always out-performs the prototype 1A1 in E. coli DH5α (FIG. 4). The strain-specific dose-response curves are a consequence of several factors, including unique FapR expression levels, unique sensitivity to cerulenin, or strain-dependent levels of mCoA in each strain. However, given that too much mCoA is toxic to the cell, a culture with higher endogenous levels of mCoA can result in both improved sensitivity to cerulenin and a lower fluorescence at lower concentrations of cerulenin. Interestingly, the few genomic differences between these strains are insufficient to directly explain large differences in mCoA production levels (Table 4). Regardless, the overall robust performance of the FapR 2H8 biosensor across several strains of E. coli indicates that the engineered prototype biosensor is a good starting point for development of a tool for detection of other malonyl-CoA derivatives.

Table 3. Biosensor Parameters in Various E. coli Strains.

a Mutated nucleotides are underlined.

b Translation initiation rate (arbitrary units).

c Relative GFP fluorescence at 0 mM cerulenin.

d Highest measured relative GFP fluorescence.

e Concentration of cerulenin that produced the highest measured relative GFP fluorescence.

f Concentration of cerulenin at half-maximum relative GFP fluorescence, determined by fitting to the dose-response curve.

g GFP max / GFP min

Table 4. Strains and Genotypes.

Probing the Effector Promiscuity of FapR by Cell-Free Transcription-Translation. Aside from mCoA, the ability of other α-carboxyacylmalonyl-CoAs to activate FapR has not been previously reported. Analysis of the crystal structure of FapR reveals that the terminal carboxylate of mCoA is a major factor in effector-binding specificity, forming a salt-bridge with the side-chain of Arg106, and is facilitated by a nearby Phe99 side-chain and Glu73’ backbone carbonyl (FIG. 5A). In addition, the thioester of mCoA binds to the side-chain of Asn115’. Notably, while the side-chain of Phe99 orients almost directly towards the α-carbon of mCoA (distance between Phe99-C4 and mCoA-Cα is ~3.2 Å), there is room for substituents in place of the α-hydrogens of mCoA (especially in place of the pro-R hydrogen). Moreover, the loops that form the majority of the ligand binding pocket are disordered in the ligand-free FapR structure, which indicates this region is highly dynamic. The lack of residues to restrict potential mCoA α-substituents coupled with the conformational flexibility indicates that mCoA derivatives are FapR effectors. Accordingly, a series of mCoA derivatives with various alkyl and alkynyl functionalities at the C2-position was designed to probe the effector promiscuity of FapR and included methylmalonyl-CoA (mmCoA), ethylmalonyl-CoA (emCoA), propargylmalonyl-CoA (pgmCoA), and butylmalonyl-CoA (bmCoA) (FIG. 5B). Furthermore, the 3’-phospho-nucleoside moiety of mCoA is solvent exposed (FIG. 5A), showing that a large portion of CoA is not required for de-repression of FapR. Accordingly, the truncated analogue mmSNAC was synthesized to confirm this conclusion (FIG. 5B).

Assessing the effector promiscuity of FapR in vivo with the panel of mCoA analogues is challenging for several reasons. First, acyl-CoAs are not cell-permeable and cannot be introduced exogenously. Second, pathways for assembling most non-native mCoA derivatives in E. coli have yet to be developed, limiting the high-level in situ biosynthesis of the selected acyl-CoAs. To circumvent these issues, a cell-free transcription-translation (TX-TL) approach was developed that can accurately control the concentration of mCoA within the assay.

To this end, the reporter module of the FapR 2H8 biosensor was cloned into pET28a, replacing the typical lacO site controlling the T7 promoter with a single fapO binding site upstream of the sfGFP gene, producing pET28a-T7-fapO-sfGFP (FIG. 5C). Each α-carboxyacyl-CoA was generated enzymatically from the corresponding malonic acid and CoA using the wild-type malonyl-CoA synthetase MatB or the engineered MatB variant, T207G/M306I. Next, the response of the cell-free TX-TL biosensor assay was evaluated by using various concentrations of MatB-generated native effector, mCoA, along with a fixed amount of purified FapR and T7-fapO-sfGFP. The fluorescence output of the cell-free biosensor was linear up to 150 µM mCoA (FIG. 6) and is consistent with previously reported in vitro transcription activities and binding affinities of FapR and mCoA. In addition, the maximum fluorescence signal was ~15-fold greater than that in the absence of mCoA. Furthermore, to account for potential inhibition of the cell-free biosensor by the MatB reaction mixture, the FapR system was also assayed using a commercial standard of mCoA. Notably, there was no significant difference in the fluorescence output of the cell-free FapR biosensor measured in the presence of the MatB-generated or commercial mCoA (FIG. 6). Together, these data indicate that the cell-free TX-TL assay is suitable to probe the effector promiscuity of FapR with MatB-generated xmCoAs.
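One way to check a claim of linearity such as the one above is an ordinary least-squares line through the dose-response points. The sketch below is illustrative only: the function name `linear_fit` is hypothetical, and the data points are made-up placeholder values (chosen so that the top of the range is 15-fold above the zero-effector signal), not measurements from FIG. 6.

```python
def linear_fit(x_values, y_values):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(x_values, y_values))
    ss_tot = sum((y - mean_y) ** 2 for y in y_values)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Placeholder data (NOT from the source): effector concentration in the
# assay's own units, and fluorescence in arbitrary units.
concentrations = [0.0, 30.0, 60.0, 90.0, 120.0, 150.0]
signals = [100.0, 380.0, 660.0, 940.0, 1220.0, 1500.0]

slope, intercept, r_squared = linear_fit(concentrations, signals)
fold_over_background = signals[-1] / signals[0]

print(slope, intercept, r_squared, fold_over_background)
```

An r_squared close to 1 over the tested range supports a linear description; a systematic drop at the highest concentrations would instead point to saturation.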

Next, the cell-free FapR biosensor was assayed with each α-carboxyacyl-CoA at a fixed concentration (150 µM). Notably, the fluorescence response with mmCoA was statistically indistinguishable from that with the native ligand, mCoA (FIG. 5D). Furthermore, the ethyl (emCoA), propargyl (pgmCoA), and butyl (bmCoA) derivatives of mCoA were all also identified as effectors, as judged by the cell-free fluorescence data. The ability of pgmCoA to de-repress FapR/T7-fapO-sfGFP was indistinguishable from that of mCoA and mmCoA under the assay conditions used. The saturated alkyl derivatives emCoA and bmCoA were the poorest effectors, based on the fluorescence data. This robust activity of pgmCoA is due in part to the increased rigidity of the alkyne side-chain or additional interactions of the alkynyl π electrons with the FapR ligand-binding site, for example with the π-rich side-chain of the nearby Phe99 (FIG. 5A). Gratifyingly, de-repression was not observed in the presence of propionyl-CoA, whose response was indistinguishable from that of the (-) thioester control.

To provide further evidence that FapR is de-repressed by derivatives of mCoA, FapR was tested in vitro for DNA-binding and release by electrophoretic mobility shift assay (EMSA) in the presence of mCoA or mmCoA. Using purified wild-type FapR protein, EMSA was run with a 40 bp DNA fragment containing the fapO binding site (FIG. 7). Incubation of the DNA fragment with a two-fold excess of FapR resulted in near-complete binding and a corresponding gel shift. Next, to determine the concentration of mCoA at which half of the DNA would be bound, various concentrations of mCoA were titrated against FapR, revealing that 200 µM was required for this. Incubation of the FapR-fapO complex with mmCoA achieved a similar level of de-repression as that observed with mCoA (FIG. 7). Together, the in vitro data confirm the ability of mmCoA to de-repress the FapR system.

In addition to the full-length α-carboxyacyl-CoAs, the truncated analogue mmSNAC was determined to be an effector, although it provided ~50% of the de-repression activity of the corresponding CoA thioester, based upon the fluorescence assay (FIG. 5D). Thus, even though a large portion of CoA is solvent exposed, some of the amide portion of CoA is still required for optimal de-repression of FapR. Remarkably, these data indicate that the promiscuity of FapR extends beyond the previously-established native effector mCoA and includes non-native and non-natural malonyl-thioesters.

FapR-Based Detection of Methylmalonyl-CoA in vivo. The discovery that FapR is de-repressed by several non-native malonyl-CoA derivatives in vitro motivates an investigation of whether the effector promiscuity of FapR can be leveraged to detect non-native acyl-CoAs inside living cells. As an initial test for FapR detection of derivatives of mCoA in vivo, a previously established and engineered E. coli strain optimized for high-level production of methylmalonyl-CoA (mmCoA) was selected for analysis. E. coli K207-3 includes the propionyl-CoA carboxylase (PCC) genes pccB and accA1 from Streptomyces coelicolor under T7 polymerase/LacI control, allowing for production of mmCoA from propionate when induced with IPTG, while endogenous mCoA levels remain unchanged. Accordingly, the FapR biosensor 2H8 was introduced into the K207-3 strain, and the fluorescence output was determined in the presence or absence of IPTG/sodium propionate supplemented to the growth media. In the absence of IPTG, the fluorescence output was low (~2,000 RFU) and similar to the background fluorescence of 2H8 in E. coli TOP10, 10G, and DH5α in the absence of cerulenin (FIG. 8). Upon induction of the PCC genes with IPTG, the fluorescence output increased significantly (>15,000 RFU), even in the absence of supplemental propionate (FIG. 8). Furthermore, a small and statistically significant increase in fluorescence signal was observed when the culture media was supplemented with sodium propionate. This indicates that, in the absence of enzymatic machinery to otherwise consume mmCoA, there is sufficient endogenous propionate and/or propionyl-CoA in K207-3 to drive mmCoA biosynthesis to levels detectable by the engineered FapR biosensor.

Discussion

The FapR transcription factor has been used extensively to monitor its native effector, mCoA, and to regulate artificial genetic circuits that produce or consume mCoA. Notably though, applications of FapR-based biosensors have been limited due to the assumed strict specificity of FapR for mCoA, and genetically-encoded biosensors for derivatives of mCoA have yet to be reported. To address this deficiency, the invention disclosed herein first engineers the FapR biosensor by simplifying its circuit architecture and leveraging RBS mutagenesis to improve its ability to detect mCoA in E. coli. By testing the engineered biosensor across a series of E. coli strains, the impact of different cellular backgrounds was highlighted, revealing that some aspects of the FapR-based detection of mCoA are host-dependent. Next, the effector specificity of FapR beyond mCoA was probed, and a panel of various α-carboxylmalonyl-thioesters was prepared via enzymatic or chemical synthesis. Crucially, a cell-free approach was developed to determine effector promiscuity with ligands that are otherwise cell impermeable. Moreover, the cell-free assay allowed the concentrations of each effector to be precisely controlled in the absence of other processes that can otherwise lead to fluctuations in their concentration over time. In this way, four additional CoA-thioesters were identified as effectors of FapR, including mmCoA, a metabolite native to B. subtilis. The biosensor was also shown to be de-repressed by an α-carboxylmalonyl-SNAC, indicating that FapR can be harnessed to monitor intracellular levels of these chemically synthesized precursors that are frequently used as non-native and non-natural building blocks for polyketide biosynthesis. Remarkably, the native effector promiscuity of FapR was leveraged to detect mmCoA production in an engineered strain of E. coli, K207-3. It is notable that, even though the ability of mCoA and mmCoA to de-repress FapR is indistinguishable (FIG. 5D), the endogenous mCoA levels in K207-3 are not impacted by IPTG induction and/or propionate addition, so that the increase in mmCoA production upon induction is easily detected by the FapR biosensor in K207-3. This result demonstrates the ability of the FapR biosensor to detect fluctuations in mmCoA in engineered microbial strains and paves the way to utilize this ability to guide improvements in mmCoA production and to develop strategies for dynamic metabolic control of mmCoA-dependent pathways. Engineering of the FapR effector binding pocket enables the detection of additional ligands or even specificity towards individual ligands, especially given the precedent of manipulating the effector specificity of other transcription factor based biosensors.

Together, this work has identified FapR as a tool for regulating metabolic pathways that produce or consume mmCoA. Moreover, the hitherto unknown ability of FapR to detect various α-carboxylmalonyl-CoAs allows high-throughput engineering approaches such as directed evolution to be applied to enzymatic pathways responsible for the biosynthesis of natural or non-natural malonyl-CoA derivatives in diverse microbial hosts. In this way, FapR is an invaluable device to expand the scope and utility of a broad range of microbial based platforms for synthesis of products constructed from diverse polyketide extender units.

Materials and Methods

General. Materials and reagents were purchased from Sigma Aldrich (St. Louis, MO) unless otherwise noted. Isopropyl β-D-thiogalactoside (IPTG) was purchased from Calbiochem (Gibbstown, NJ). Primers were purchased from Integrated DNA Technologies (Coralville, IA). E. cloni 10G electrocompetent cells were purchased from Lucigen Corporation (Middleton, WI). Cerulenin was purchased from Cayman Chemical (Ann Arbor, MI), and stocks were dissolved in DMSO. Commercial malonyl-CoA and methylmalonyl-CoA were purchased from CoALA Biosciences (Austin, TX). All cultures were grown in LB media with 100 µg/mL ampicillin unless otherwise stated. Absorbance and fluorescence readings were taken in clear flat-bottom and black flat-bottom 96-well plates (Greiner Bio-One), respectively, in a BioTek Hybrid Synergy 4 plate reader, unless otherwise stated. All Sanger sequencing was performed by Genewiz, Inc. (South Plainfield, NJ).

Construction of Plasmids. The plasmid pSENSE2 was synthesized by Twist Bioscience (San Francisco, CA) in two fragments. The two pieces were assembled using standard restriction digestion and ligation procedures. The fapR_fapO fragment containing a codon-optimized fapR gene and a fapO operator was synthesized by Genewiz, Inc. and cloned into pSENSE2 between the KpnI and SpeI sites to give pSENSE2FF. FapR was amplified with a 5’ NcoI site and a 3’ XhoI site and cloned into pET28a to give fapR_pET28a (with a C-terminal His6 tag). T7_fapO_sfGFP was constructed using GenScript GenBuilder (GenScript, Piscataway, NJ) according to the manufacturer’s instructions with pET28a (amplified with T7forfapO.Gib1 and T7forfapO.Gib2, see entries 1-2, Table 5) and fapO_sfGFP (amplified from pSENSE2FF with fapOforT7.Gib1 and fapOforT7.Gib2, see entries 3-4, Table 5).

Table 5. Sequences of oligonucleotides.

RBS Library Construction and Screening. The original fapR RBS 1A1 was calculated to have a translation initiation rate (TIR) of 7,594 au using the Salis online RBS calculator. The calculator was used to design an 18-member RBS library with a maximum TIR of 7,594 au. Site-directed mutagenesis was used to produce the RBS library (VRGAGGH) using the QuikChange II Site-Directed Mutagenesis protocol with the pSENSE2FF template and primers FapR_RBSLib2.SDM1 and FapR_RBSLib2.SDM2 (see entries 5-6, Table 5). The reaction product was digested with DpnI for 3 h at 37 °C before electroporation into E. cloni® 10G electrocompetent cells (Lucigen). Transformed cells were plated on LB agar supplemented with 100 µg/mL ampicillin and incubated overnight at 37 °C. Individual colonies from the library were picked from LB agar plates and used to inoculate 93 wells of a 96-deepwell microplate with 500 µL LB media supplemented with 100 µg/mL ampicillin. The remaining 3 wells were inoculated with colonies from pSENSE2FF RBS 1A1 (SEQ ID NO:1) that had been transformed into E. coli DH5α. Cultures were grown overnight at 37 °C and 350 rpm, and 10 µL from each well was used to inoculate 440 µL LB media supplemented with 100 µg/mL ampicillin. The new plates were grown for one hour at 37 °C and 350 rpm. Plates were then treated with either 5 µM, 12.5 µM, or 25 µM cerulenin or the corresponding volume of DMSO. Plates were then grown an additional 15 h. Plates were centrifuged at 3,500 rpm for 7 min, and cell pellets were resuspended in 500 µL PBS. From each well, 100 µL was used for analyzing optical density at 600 nm and sfGFP fluorescence (ex 485 nm/em 509 nm). Fluorescence values were divided by OD600 to yield growth-corrected relative fluorescence values. In total, 279 individual colonies from the RBS library were screened at the different cerulenin concentrations.
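The growth correction described above, and the fold-activation figure derived from it, amount to two divisions. The following minimal sketch shows the arithmetic; the function names and the raw plate-reader readings are hypothetical placeholders, not values from the screen.

```python
def growth_corrected(fluorescence, od600):
    """Divide raw sfGFP fluorescence by OD600 to correct for cell density."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return fluorescence / od600


def fold_activation(with_cerulenin, without_cerulenin):
    """Ratio of growth-corrected fluorescence with and without cerulenin."""
    return with_cerulenin / without_cerulenin


# Placeholder readings for one well pair (NOT data from the source).
on_state = growth_corrected(fluorescence=10000.0, od600=0.5)   # 20000.0
off_state = growth_corrected(fluorescence=1000.0, od600=1.0)   # 1000.0

print(fold_activation(on_state, off_state))  # 20.0
```

Correcting before taking the ratio matters because cerulenin-treated cultures typically reach a lower density than the DMSO controls, so raw fluorescence alone would understate the activation.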

Expression and Purification of Mutant MatB. The expression and purification of MatB T207G/M306I has been previously described. Briefly, E. coli BL21(DE3) pLysS competent cells were transformed with plasmid and positive transformants were selected on LB agar supplemented with 30 µg/mL kanamycin. A single colony was transferred to LB (3 mL) supplemented with kanamycin (30 µg/mL) and grown at 37 °C and 250 rpm overnight. The culture was used to inoculate LB media (1 L) supplemented with kanamycin (30 µg/mL). One liter culture was incubated at 37 °C and 250 rpm to an OD 600 of 0.6, at which time protein synthesis was induced by the addition of IPTG to a final concentration of 1 mM. After incubation at 18 °C and 200 rpm for 18 h, cells were collected by centrifugation at 5,000 g for 20 min, and resuspended in 100 mM Tris-HCl pH 8.0 (20 mL) containing NaCl (300 mM) and then lysed by sonication. Following centrifugation at 10,000 g, the soluble extract was loaded onto a 1 mL HisTrap HP column (GE Healthcare, Piscataway, NJ) and purified by fast protein liquid chromatography using the following buffers: wash buffer [20 mM sodium phosphate (pH 7.4) containing 0.5 M NaCl and 20 mM imidazole] and elution buffer [20 mM sodium phosphate (pH 7.4) containing 0.5 M NaCl and 200 mM imidazole]. The purified protein was concentrated using an Amicon Ultra 30 kDa MWCO centrifugal filter (Millipore Corp., Billerica, MA) and stored as 10% glycerol stocks at -80 °C. Protein purity was verified by SDS-PAGE. Protein quantification was carried out using the Bradford Protein Assay Kit from Bio-Rad.

Synthesis of Acyl-CoAs by MatB. The MatB-catalyzed synthesis of extender units has been previously described. Briefly, reactions were performed in a 50 µL reaction mixture containing 100 mM sodium phosphate (pH 7), MgCl2 (2 mM), ATP (12 mM), coenzyme A (8 mM), malonate or corresponding analogue (16 mM) and wild-type or mutant MatB (10 µg) at 25 °C. Aliquots were removed after 3 h incubation, quenched with an equal volume of ice-cold methanol, and centrifuged at 10,000 g for 10 min, and the cleared supernatants were used for HPLC analysis on a Varian ProStar HPLC system. A series of linear gradients was developed from 0.1% TFA (A) in water to methanol (HPLC grade, B) using the following protocol: 0-32 min, 80% B; 32-35 min, 100% A. The flow rate was 1 mL/min, and the absorbance was monitored at 254 nm using a Pursuit XRs C18 column (250 x 4.6 mm, Varian Inc.). To ensure complete conversion, the malonate analog and the acyl-CoA product HPLC peak areas were integrated, and the conversion (%) calculated as a percent of the total peak area. Product elution times and LC-MS data (data not shown) were in complete agreement with those previously described.

Synthesis of mmSNAC. To access the previously reported mmSNAC, the α-carboxy group of the corresponding commercial methylmalonic acid was protected via esterification with tBu, which was then thioesterified with HSNAC, and deprotected via hydrolysis to afford the acyl-SNAC.

Synthesis of N-Acetyl Cysteamine (HSNAC). To an oven-dried round-bottom flask (250 mL) equipped with a magnetic stir bar was added cysteamine hydrochloride (6.88 g, 60 mmol, 1 eq) followed by deionized water (170 mL). Under vigorous stirring, potassium hydroxide (3.40 g, 61 mmol, 1.01 eq) and sodium bicarbonate (8.65 g, 103 mmol, 1.7 eq) were added. The flask was then cooled to 0 °C and acetic anhydride (5.75 mL, 61 mmol, 1.0 eq) was added dropwise (300 µL/min). After the addition, the reaction was allowed to warm to room temperature and stirred overnight. The reaction was then cooled again to 0 °C and quenched with concentrated HCl to pH ~ 1. The aqueous layer was extracted with ethyl acetate (4x50 mL). The organic layers were combined, washed with brine (40 mL), dried over magnesium sulphate and concentrated in vacuo to give the crude oil, which was then vacuum distilled to furnish N-acetyl cysteamine (4.76 g, 66%) as a colourless oil. 1H NMR (300 MHz, CDCl3) δ 6.91 (s, 1H), 3.38 (q, J = 6.5 Hz, 2H), 2.63 (dt, J = 8.4, 6.5 Hz, 2H), 2.00 (s, 3H), 1.41 (t, J = 8.4 Hz, 1H). 13C NMR (75 MHz, CDCl3) δ 171.0, 42.7, 24.4, 23.0.
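The quantities quoted for the HSNAC preparation can be cross-checked with a short stoichiometry calculation. This is a sketch under stated assumptions: the molar masses and the density of acetic anhydride are standard literature values supplied here, not figures given in the text, and the names used are hypothetical.

```python
# Molar masses (g/mol) and density (g/mL) are standard literature values
# supplied for this sketch; they are NOT stated in the source text.
MOLAR_MASS = {
    "cysteamine hydrochloride": 113.61,
    "potassium hydroxide": 56.11,
    "sodium bicarbonate": 84.01,
    "acetic anhydride": 102.09,
    "N-acetyl cysteamine": 119.19,
}
ACETIC_ANHYDRIDE_DENSITY = 1.082


def mmol_from_grams(grams, molar_mass):
    """Convert a mass in grams to an amount in millimoles."""
    return grams / molar_mass * 1000.0


reagent_mmol = {
    "cysteamine hydrochloride": mmol_from_grams(
        6.88, MOLAR_MASS["cysteamine hydrochloride"]),
    "potassium hydroxide": mmol_from_grams(
        3.40, MOLAR_MASS["potassium hydroxide"]),
    "sodium bicarbonate": mmol_from_grams(
        8.65, MOLAR_MASS["sodium bicarbonate"]),
    # Acetic anhydride is measured by volume, so convert mL to g first.
    "acetic anhydride": mmol_from_grams(
        5.75 * ACETIC_ANHYDRIDE_DENSITY, MOLAR_MASS["acetic anhydride"]),
}

product_mmol = mmol_from_grams(4.76, MOLAR_MASS["N-acetyl cysteamine"])
limiting_mmol = reagent_mmol["cysteamine hydrochloride"]
percent_yield = product_mmol / limiting_mmol * 100.0

for name, amount in reagent_mmol.items():
    print(f"{name}: {amount:.1f} mmol ({amount / limiting_mmol:.2f} eq)")
print(f"yield: {percent_yield:.0f}%")
```

With these values the isolated 4.76 g corresponds to roughly 40 mmol of product, in line with the stated 66% yield based on cysteamine hydrochloride.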

Synthesis of tBu-mm acid. To an oven-dried round-bottom flask (50 mL) equipped with a magnetic stir bar was added methylmalonic acid (0.5 mmol) followed by dry THF (5 mL). Under vigorous stirring, pyridine (2.17 mmol) and tBuOH (1.82 mmol) were added dropwise. The solution was then cooled to 0 °C and MsCl (1.02 mmol) was added dropwise. The reaction mixture was then warmed to room temperature and allowed to react for 3 h. It was then quenched with an ice-cold solution of NaOH (4 M, 5 mL), and the aqueous layer was washed with DCM (2x10 mL), acidified to pH ~ 2 and extracted with DCM (3x10 mL). The organic layers were combined, washed with brine (20 mL), dried over MgSO4 and concentrated in vacuo. The crude material was then purified by silica gel column chromatography to furnish the product. tBu-mm acid 1H NMR (300 MHz, CDCl3) δ 10.06 (s, 1H), 3.38 (q, J = 7.2 Hz, 1H), 1.44 (s, 9H), 1.39 (d, J = 7.2 Hz, 2H). 13C NMR (75 MHz, CDCl3) δ 176.3, 169.4, 82.5, 47.0, 28.0, 13.8.

Synthesis of the tBu ester of mmSNAC. To an oven-dried round-bottom flask (25 mL) equipped with a magnetic stir bar was added the mono tBu-ester of methylmalonic acid (1.0 mmol) followed by dry THF (2.5 mL). The solution was then cooled to 0 °C and, under vigorous stirring, diimidazolyl ketone (1.2 mmol) was added; the mixture was stirred at 0 °C for 30 min and then at room temperature for 2 h. 4-(Dimethylamino)pyridine (0.3 mmol) and HSNAC (1.3 mmol) were added to the reaction flask and the reaction was stirred overnight. The solvent was then removed in vacuo and the reaction mixture was dissolved in ethyl acetate (10 mL), washed with K2CO3 (2x5 mL, 1 M), HCl (2x5 mL, 1 M) and brine (5 mL), dried over MgSO4 and concentrated in vacuo. Silica gel column chromatography afforded the final compound. tBu-mmSNAC 1H NMR (300 MHz, CDCl3) δ 6.55 (s, 1H), 3.56 (q, J = 7.2 Hz, 1H), 3.52–3.35 (m, 2H), 3.18–2.96 (m, 2H), 2.02 (s, 3H), 1.45 (s, 9H), 1.38 (d, J = 7.2 Hz, 3H). 13C NMR (75 MHz, CDCl3) δ 196.3, 170.8, 168.3, 82.1, 55.1, 39.4, 28.4, 27.6, 22.6, 14.1.

Synthesis of mmSNAC. In an oven-dried round-bottom flask (10 mL) equipped with a magnetic stir bar was placed the tBu ester of mmSNAC (1 mmol), and TFA (5 mL) was added dropwise at 0 °C under vigorous stirring. The reaction was left overnight, the TFA was removed in vacuo and the residual TFA was then co-evaporated with toluene (3x5 mL). The ‘crude’ material (with a trace amount of toluene as seen by NMR) was recovered in 98% yield and thus used without further purification. mmSNAC 1H NMR (300 MHz, CDCl3) δ 9.95 (s, 1H), 6.48 (s, 1H), 3.75 (q, J = 7.2 Hz, 1H), 3.55 (q, J = 6.0 Hz, 2H), 3.21 (dt, J = 13.0, 6.0 Hz, 1H), 3.06 (dt, J = 13.0, 6.0 Hz, 1H), 2.08 (s, 3H), 1.49 (d, J = 7.2 Hz, 3H). 13C NMR (126 MHz, CDCl3) δ 196.6, 172.3, 171.1, 54.1, 39.7, 28.3, 22.2, 14.1. LCMS m/z 220.0 ([M+H]+).

Dose-response analysis of the cell-free TX-TL FapR assay. A premixed solution for cell-free transcription-translation reactions was purchased from New England Biolabs (PURExpress In Vitro Protein Synthesis Kit). To a PCR tube on ice were added (in the order shown): 4 µL Solution A, 3 µL Solution B, 2.5 µM FapR, 250 nM T7-fapO-sfGFP and 0-150 µM of MatB-generated or commercial mCoA in a total volume of 10 µL. In parallel, control reactions in the absence of thioester were assembled by using boiled MatB in the presence of malonic acid. The reaction was incubated for 16 h at 37 °C and then diluted with PBS buffer to give a final total volume of 50 µL. Then, 50 µL of the diluted mixture was used for determination of the sfGFP fluorescence (ex 485 nm/em 510 nm) using a Tecan infinite F200 plate reader.
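Assembling such a reaction comes down to the dilution relation C1·V1 = C2·V2 for each component that is specified by final concentration. The sketch below takes the reaction volume (10), the Solution A and B volumes (4 and 3) and the final concentrations (2.5, 250 and 150) from the protocol above; the stock concentrations and all names are hypothetical, chosen only so that the volumes add up. Each stock must be expressed in the same concentration unit as its final value, and all volumes share the reaction's volume unit.

```python
def stock_volume(final_conc, stock_conc, total_volume):
    """Volume of stock needed so that it reaches final_conc in total_volume.

    Uses C1 * V1 = C2 * V2; stock and final concentrations must share a unit.
    """
    if stock_conc < final_conc:
        raise ValueError("stock must be at least as concentrated as the target")
    return final_conc * total_volume / stock_conc


TOTAL_VOLUME = 10.0  # reaction volume from the protocol

volumes = {
    "Solution A": 4.0,  # fixed volume from the protocol
    "Solution B": 3.0,  # fixed volume from the protocol
    # Stock concentrations below are hypothetical 10x stocks.
    "FapR": stock_volume(2.5, 25.0, TOTAL_VOLUME),
    "T7-fapO-sfGFP DNA": stock_volume(250.0, 2500.0, TOTAL_VOLUME),
    "effector": stock_volume(150.0, 1500.0, TOTAL_VOLUME),
}

remaining = TOTAL_VOLUME - sum(volumes.values())  # made up with buffer/water
dilution_factor = 50.0 / TOTAL_VOLUME             # PBS dilution before reading

print(volumes, remaining, dilution_factor)
```

With 7 of the 10 volume units taken by Solutions A and B, only 3 remain for protein, DNA and effector, which is why relatively concentrated stocks are needed.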

Electrophoretic mobility shift assay. The fapO oligonucleotides 5’-CCTAGGACTATTAGTACCTAGTCTTAATTGTCCGGCATCC-3’ (SEQ ID NO:30) and 5’-GGATGCCGGACAATTAAGACTAGGTACTAATAGTCCTAGG-3’ (SEQ ID NO:31) were annealed per standard procedure. Briefly, the two oligos were added to a final concentration of 10 µM each in annealing buffer (19 mM Tris, pH 7.5, 50 mM NaCl, 1 mM EDTA) and heated to 95 °C for 2 minutes before cooling to 25 °C at a rate of 1 °C/min. Each gel shift assay was carried out in a total volume of 20 µL at 30 °C for 40 min using 2 pmol fapO and 6 pmol of FapR in reaction buffer (10 mM Tris-HCl, pH 7.5, 1 mM EDTA, 5% v/v glycerol, 100 mM KCl, 0.01 mg/mL BSA). To assess ligand-induced de-repression, 200 µM mCoA or mmCoA was also included. Reactions were then analyzed by electrophoresis on a 6% PAGE gel at 220 V. The gel was stained with SYBR Green I and imaged on a Typhoon FLA 7000.
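For two oligonucleotides to anneal into a blunt-ended duplex, one must be the exact reverse complement of the other. The short check below, a sketch with a hypothetical helper name, confirms that this holds for SEQ ID NO:30 and SEQ ID NO:31 and that the resulting fragment is 40 bp long, matching the fragment length used in the EMSA.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


seq_id_30 = "CCTAGGACTATTAGTACCTAGTCTTAATTGTCCGGCATCC"
seq_id_31 = "GGATGCCGGACAATTAAGACTAGGTACTAATAGTCCTAGG"

print(len(seq_id_30))                               # 40
print(reverse_complement(seq_id_30) == seq_id_31)   # True
```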

Cell-Free Characterization of FapR Promiscuity. A commercial kit for cell-free transcription-translation reactions was purchased from New England Biolabs (PURExpress In Vitro Protein Synthesis Kit). To a PCR tube on ice were added (in the order shown): 4 µL Solution A, 3 µL Solution B, 2.5 µM FapR, 250 nM T7-fapO-sfGFP, and 150 µM of MatB-generated acyl-CoA or mmSNAC in a total volume of 10 µL (FIG. 9). In parallel, control reactions lacking thioester were assembled using boiled MatB. The reactions were incubated for 16 h at 37 °C and then diluted with PBS to a final volume of 50 µL. The diluted mixture was then used to determine sfGFP fluorescence (ex 485 nm/em 510 nm) on a Tecan Infinite F200 microplate reader.

FapR-Bioassay of Methylmalonyl-CoA Production in E. coli K207-3. pSENSE2FF-2H8 was transformed into E. coli K207-3 and grown overnight in 1 mL volumes in a 96-deepwell microplate at 37 °C and 350 rpm. These cultures (10 µL) were used to inoculate wells of a fresh 96-deepwell plate containing 440 µL LB and 100 µg mL⁻¹ ampicillin, which were then incubated at 37 °C and 350 rpm for 2.5 h. Various combinations of IPTG (1 mM final concentration) and sodium propionate (1 mM final concentration) were then added, and the cultures were incubated for 16 h at 37 °C and 350 rpm. The microplate was centrifuged at 3,500 rpm for 7 min, and the supernatant was discarded. Cell pellets were resuspended in 1 mL PBS, and 100 µL of each cell suspension was transferred to a flat-bottom 96-well plate and used to determine the optical density at 600 nm (OD600) and sfGFP fluorescence (ex 485 nm/em 509 nm). The fluorescence intensity was divided by the OD600 to yield a relative GFP fluorescence value.
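
The normalization described above (fluorescence intensity divided by OD600) can be sketched as follows. This is an illustrative example only: the well names and readings are hypothetical, and the minimum-OD guard that skips wells with too little growth is an assumption of this sketch, not part of the described procedure.

```python
def relative_fluorescence(fluorescence, od600, min_od=0.05):
    """Return fluorescence / OD600, or None if the well shows too little growth.

    min_od is a hypothetical cutoff (an assumption of this sketch) that avoids
    dividing by a near-zero optical density.
    """
    if od600 < min_od:
        return None
    return fluorescence / od600


def normalize_plate(readings, min_od=0.05):
    """Map {well: (fluorescence, od600)} to {well: relative fluorescence}."""
    return {
        well: relative_fluorescence(fluor, od, min_od)
        for well, (fluor, od) in readings.items()
    }


# Hypothetical plate-reader values: (sfGFP fluorescence in a.u., OD600).
example_readings = {
    "A1": (12000.0, 0.50),  # e.g. induced culture
    "A2": (3000.0, 0.75),   # e.g. uninduced culture
    "A3": (500.0, 0.01),    # no growth; excluded by the min_od guard
}

if __name__ == "__main__":
    for well, value in sorted(normalize_plate(example_readings).items()):
        print(well, value)
```

Dividing by OD600 corrects for differences in cell density between wells, so that wells with more cells do not appear to respond more strongly for that reason alone.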

Purification of FapR. E. coli BL21(DE3) competent cells were transformed with the fapR_pET28a plasmid, and positive transformants were selected on LB agar supplemented with 30 µg/mL kanamycin. A single colony was transferred to LB (3 mL) supplemented with kanamycin (30 µg/mL) and grown at 37 °C and 250 rpm overnight. The culture was used to inoculate LB media (300 mL) supplemented with kanamycin (30 µg/mL). The culture was incubated at 37 °C and 250 rpm to an OD600 of 0.6, at which time protein synthesis was induced by the addition of IPTG to a final concentration of 1 mM. After incubation at 22 °C and 250 rpm for 20 h, cells were collected by centrifugation at 5,000 × g for 20 min, resuspended in 100 mM Tris-HCl pH 8.0 (8 mL) containing NaCl (300 mM), and then lysed by sonication. Following centrifugation at 10,000 × g, the soluble extract was loaded onto a 1 mL HisTrap HP column (GE Healthcare, Piscataway, NJ) and purified by fast protein liquid chromatography using the following buffers: wash buffer [20 mM sodium phosphate (pH 7.4) containing 0.5 M NaCl and 20 mM imidazole] and elution buffer [20 mM sodium phosphate (pH 7.4) containing 0.5 M NaCl and 200 mM imidazole]. The purified protein was concentrated using an Amicon Ultra 10 kDa MWCO centrifugal filter (Millipore Corp., Billerica, MA) and stored in storage buffer (50 mM HEPES, pH 7.5, 100 mM NaCl, and 10% glycerol) at -80 °C. Protein purity was verified by SDS-PAGE. Protein quantification was carried out using the Bradford Protein Assay Kit from Bio-Rad.

Example 2. In situ detection of de novo malonyl-CoA biosynthesis in E. coli.

The 2H8 FapR biosensor was used to detect de novo malonyl-CoA biosynthesis in E. coli via the combined action of a transport protein (MatC) and malonyl-CoA synthetase (MatB, via Psenseffrm, SEQ ID NO:26) (FIG. 10). Notably, the fluorescence readout of the biosensor provided a fast method of evaluating different pathway variants. For example, omitting the plasmid that carried MatC (pCDFDuet-MatC, SEQ ID NO:27) demonstrated that the transport protein was absolutely required for malonyl-CoA biosynthesis, and that the MatB from Rhizobium trifolii was not active, in contrast to the homolog from Rhodopseudomonas (FIG. 11).

Example 3. In situ detection of de novo methylmalonyl-CoA biosynthesis in E. coli using an engineered FapR mutant.

The 2H8 FapR biosensor was subjected to saturation mutagenesis at Phe99 to create a library of mutants. This position was predicted to dictate the selectivity of the FapR biosensor towards malonyl-CoAs substituted at the C2 side-chain position. E. coli K207-3 has been previously described to produce the non-native metabolite methylmalonyl-CoA upon feeding of propionate to the culture medium (FIG. 12). Accordingly, this strain was leveraged here to provide methylmalonyl-CoA de novo. The fluorescence output of the library members was analyzed in 96-well plates in the presence of IPTG and sodium propionate. One mutant in particular was chosen for further analysis. DNA sequencing of the plasmid revealed the mutation Phe99Leu. The wild-type (2H8) and F99L mutant FapR were compared by determining the GFP fluorescence of each strain in response to varying concentrations of propionate. The corresponding dose-response curves (FIG. 13) indicate that the F99L mutant is more sensitive than the wild-type sensor and can detect concentrations of methylmalonyl-CoA corresponding to propionate concentrations as low as 1 mM. These data indicate that Phe99 is a likely determinant of effector specificity and show that the effector specificity of FapR can be expanded towards non-native effectors through mutagenesis of the effector binding pocket.
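
One way to compare the sensitivity of two sensors from dose-response data is to estimate the dose that gives a half-maximal response. The sketch below is illustrative only: it generates synthetic curves from a Hill-type model with made-up parameters and recovers the half-maximal dose by log-linear interpolation. The sensor names, doses, and signals are hypothetical and are not taken from FIG. 13, and the Hill model is an assumption of this example rather than the analysis used in the disclosure.

```python
import math


def hill_response(dose, basal, maximum, ec50, n=1.0):
    """Hill-type activation curve; returns the basal signal at zero dose."""
    if dose <= 0:
        return basal
    return basal + (maximum - basal) * dose ** n / (ec50 ** n + dose ** n)


def estimate_half_max_dose(doses, signals):
    """Estimate the dose giving the midpoint between the lowest and highest signal.

    doses must be positive and ascending, and signals non-decreasing.
    Interpolation is linear in log10(dose) between the two bracketing points.
    """
    half = (min(signals) + max(signals)) / 2.0
    points = list(zip(doses, signals))
    for (d0, s0), (d1, s1) in zip(points, points[1:]):
        if s0 <= half <= s1:
            if s1 == s0:
                return d0
            frac = (half - s0) / (s1 - s0)
            log_dose = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    raise ValueError("half-maximal signal is not bracketed by the data")


# Synthetic doses (arbitrary units) and two hypothetical sensors.
DOSES = [0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0, 30.0, 100.0]
sensor_a = [hill_response(d, basal=100.0, maximum=1000.0, ec50=1.0) for d in DOSES]
sensor_b = [hill_response(d, basal=100.0, maximum=1000.0, ec50=0.1) for d in DOSES]

if __name__ == "__main__":
    print("sensor A half-max dose:", estimate_half_max_dose(DOSES, sensor_a))
    print("sensor B half-max dose:", estimate_half_max_dose(DOSES, sensor_b))
```

In this synthetic case, the sensor with the lower half-maximal dose responds at lower inducer concentrations, which is the sense in which one sensor is described as more sensitive than another. Because the midpoint is taken from the observed minimum and maximum, the estimate is only as good as the dose range covered.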

Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of skill in the art to which the disclosed invention belongs. Publications cited herein and the materials for which they are cited are specifically incorporated by reference.

Those skilled in the art will appreciate that numerous changes and modifications can be made to the preferred embodiments of the invention and that such changes and modifications can be made without departing from the spirit of the invention. It is, therefore, intended that the appended claims cover all such equivalent variations as fall within the true spirit and scope of the invention.