SKIN COMMENSAL BACTERIA ENGINEERED TO PRODUCE ANTHRANILATE OR DERIVATIVES THEREOF

Title:

SKIN COMMENSAL BACTERIA ENGINEERED TO PRODUCE ANTHRANILATE OR DERIVATIVES THEREOF

Document Type and Number:

WIPO Patent Application WO/2023/133484

Abstract:

Aspects of the disclosure relate to modified skin commensal bacterial host cells capable of producing anthranilate or derivatives thereof.

Inventors:

BALDERA AGUAYO PEDRO (US)
CHATZIVASILEIOU ALKIVIADIS (US)
FORMIGHIERI CINZIA (US)
KING JASON (US)

Application Number:

PCT/US2023/060205

Publication Date:

July 13, 2023

Filing Date:

January 06, 2023

Assignee:

GINKGO BIOWORKS INC (US)

International Classes:

C12N15/52; C12N1/20; C12N9/10; C12N9/88; C12N9/90; C12N15/74; C12N15/77; C12P13/00; C12R1/15; C12R1/45

Domestic Patent References:

WO2017123676A1	2017-07-20
WO2019164346A1	2019-08-29
WO2018095097A1	2018-05-31

Foreign References:

US20200270587A1	2020-08-27

Other References:

LUO ZI WEI, CHO JAE SUNG, LEE SANG YUP: "Microbial production of methyl anthranilate, a grape flavor compound", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES, NATIONAL ACADEMY OF SCIENCES, vol. 116, no. 22, 28 May 2019 (2019-05-28), pages 10749 - 10756, XP093078893, ISSN: 0027-8424, DOI: 10.1073/pnas.1903875116

Attorney, Agent or Firm:

MULSKI, Elizabeth, S. et al. (US)

Claims:

CLAIMS

1. A bacterial host cell comprising one or more heterologous polynucleotides, wherein the bacterial host cell is a skin commensal bacterium, and wherein the bacterial host cell is capable of producing anthranilate or derivatives thereof.

2. The bacterial host cell of claim 1, wherein the bacterial host cell is modified to have increased accumulation of a tryptophan precursor.

3. The bacterial host cell of claim 2, wherein the bacterial host cell comprises or produces a higher level of the tryptophan precursor, the anthranilate, or the derivative thereof relative to a bacterial cell that does not comprise the one or more heterologous polynucleotides.

4. The bacterial host of any one of claims 1-3, wherein the bacterial host cell has increased expression of a methyltransferase.

5. The bacterial host cell of any one of claims 1-4 wherein the bacterial host cell comprises: a) a heterologous polynucleotide encoding Tryptophan biosynthesis protein C (TrpC); b) a heterologous polynucleotide encoding Tryptophan biosynthesis protein B (TrpB); c) a heterologous polynucleotide encoding Tryptophan biosynthesis protein A (TrpA); d) a heterologous polynucleotide encoding feedback resistant Tryptophan biosynthesis protein TrpE comprising a M298R, S40R, or S40F substitution; e) a heterologous polynucleotide encoding Tryptophan biosynthesis protein G (TrpG); f) a heterologous polynucleotide encoding a feedback insensitive 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase (AroG) comprising a D146N substitution; or g) any combination of the foregoing.

6. The bacterial host cell of claim 5 wherein: a) the Tryptophan biosynthesis protein C (TrpC) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 2; b) the Tryptophan biosynthesis protein B (TrpB) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 4; c) the Tryptophan biosynthesis protein A (TrpA) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO:5; d) the feedback resistant Tryptophan biosynthesis protein TrpE comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 1, 6, 11, or 13; e) the Tryptophan biosynthesis protein G (TrpG) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 3 or 12; f) the feedback insensitive 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase (AroG) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 7 or 14; or g) any combination of the foregoing.

7. The bacterial host cell of any one of claims 1-6 wherein the bacterial host cell comprises: a) a heterologous polynucleotide encoding Tryptophan biosynthesis protein C (TrpC); b) a heterologous polynucleotide encoding Tryptophan biosynthesis protein B (TrpB); c) a heterologous polynucleotide encoding Tryptophan biosynthesis protein A (TrpA); d) a heterologous polynucleotide encoding feedback resistant Tryptophan biosynthesis protein TrpE comprising a M298R, S40R, or S40F substitution; e) a heterologous polynucleotide encoding Tryptophan biosynthesis protein G (TrpG); f) a heterologous polynucleotide encoding a feedback insensitive 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase (AroG) comprising a D146N substitution; g) a heterologous polynucleotide encoding Anthranilate O-methyltransferase 1 (AAMT1); h) a heterologous polynucleotide encoding Anthranilate N-methyltransferase (ANMT); i) a heterologous polynucleotide encoding Anthranilic acid methyltransferase (AAMT1); or j) any combination of the foregoing.

8. The bacterial host cell of claim 7 wherein: a) the Tryptophan biosynthesis protein C (TrpC) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 2; b) the Tryptophan biosynthesis protein B (TrpB) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 4; c) the Tryptophan biosynthesis protein A (TrpA) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO:5; d) the feedback resistant Tryptophan biosynthesis protein TrpE comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 1, 6, 11, or 13; e) the Tryptophan biosynthesis protein G (TrpG) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 3 or 12; and/or f) the feedback insensitive 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase (AroG) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 7 or 14; g) the Anthranilate O-methyltransferase 1 (AAMT1) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 15; h) the Anthranilate N-methyltransferase (ANMT) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 9 or 16; i) the Anthranilic acid methyltransferase (AAMT1) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 8; or j) any combination of the foregoing.

9. The bacterial host cell of any one of claims 1-8, wherein the derivatives thereof are alkylated derivatives.

10. The bacterial host cell of any one of claims 1-9, wherein the anthranilate or derivative thereof is anthranilate, methyl anthranilate, N,N-dimethyl anthranilate, ethyl anthranilate, butyl anthranilate or methyl N,N-dimethyl anthranilate.

11. The bacterial host cell of claim 10, wherein the anthranilate or derivative thereof is anthranilate.

12. The bacterial host cell of claim 10, wherein the anthranilate or derivative thereof is methyl anthranilate.

13. The bacterial host cell of any one of claims 1-12, which is Staphylococcus epidermidis bacteria.

14. The bacterial host cell of any one of claims 1-12, which is Corynebacterium spp. bacteria.

15. A composition comprising a bacterial host cell of any one of claims 1-14.

16. The composition of claim 15, wherein the composition is formulated for use as a fragrance or perfume.

17. The composition of claim 15, wherein the composition is formulated for use as a deodorant.

18. The composition of claim 15, wherein the composition is formulated for use as a sunscreen.

19. The composition of claim 15, wherein the composition is formulated for use in oil absorbance.

20. The composition of any one of claims 15-19, wherein the composition is in the form of a gel, cream, ointment, lotion, serum, powder, aerosol spray or two-component dispensing system.

21. The composition of any one of claims 15-20, wherein the composition further comprises a carrier, buffer or thickener.

22. The composition of any one of claims 15-21, wherein the bacterial host cell comprises a genetic kill switch that controls the growth of the bacterial host cell.

23. A method comprising administering the bacterial host cell of any one of claims 1-14 to a subject.

24. A method comprising administering the composition of any one of claims 15-23 to a subject.

25. The method of claim 23 or 24, wherein the subject is a human subject.

26. The method of claim 23 or 24, wherein the subject is a non-human subject.

Description:

SKIN COMMENSAL BACTERIA ENGINEERED TO PRODUCE ANTHRANILATE OR DERIVATIVES THEREOF

CROSS-REFERENCE TO RELATED APPLICATIONS

This Application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Application Serial No. 63/297,566, entitled "SKIN COMMENSAL BACTERIA ENGINEERED TO PRODUCE ANTHRANILATE OR DERIVATIVES THEREOF" filed on January 7, 2022, and U.S. Provisional Application Serial No. 63/336,922, entitled "SKIN COMMENSAL BACTERIA ENGINEERED TO PRODUCE ANTHRANILATE OR DERIVATIVES THEREOF" filed on April 29, 2022, the entire contents of each of which are incorporated herein by reference.

STATEMENT OF GOVERNMENT SUPPORT

This invention was made with government support under Contract No. HR001120C0073 awarded by the Defense Advanced Research Projects Agency (DARPA). The government has certain rights in the invention.

REFERENCE TO AN ELECTRONIC SEQUENCE LISTING

The contents of the electronic sequence listing (G091970093WO00-SEQ-EAS.xml; Size: 102,333 bytes; and Date of Creation: December 19, 2022) are herein incorporated by reference in their entirety.

FIELD OF INVENTION

The present disclosure relates to engineered skin commensal bacterial cells capable of producing anthranilate or derivatives thereof and methods of using the same.

BACKGROUND

The human skin contains billions of microorganisms organized in diverse communities. These communities of bacteria, fungi, mites and viruses, collectively known as the human skin microbiome (HSM), play a crucial role in maintaining human health (Verhulst et al., PLoS ONE. 5, e15829 2010). The human skin microbiome also plays a critical role in generating volatile compounds from human sweat and skin lipids that make up the complex blend of chemicals representing the human odor.

SUMMARY

Aspects of the present disclosure provide a bacterial host cell comprising one or more heterologous polynucleotides, wherein the bacterial host cell is a skin commensal bacterium, and wherein the bacterial host cell is capable of producing a compound, the presence of which is beneficial to the skin, such as anthranilate or derivatives thereof. In some embodiments, the bacterial host cell is modified to have increased accumulation of a tryptophan precursor. In some embodiments, the bacterial host cell comprises or produces a higher level of the tryptophan precursor, the anthranilate, or the derivative thereof relative to a bacterial cell that does not comprise the one or more heterologous polynucleotides. In some embodiments, the bacterial host cell has increased expression of a methyltransferase.

In some embodiments, the bacterial host cell comprises: a) a heterologous polynucleotide encoding Tryptophan biosynthesis protein C (TrpC); b) a heterologous polynucleotide encoding Tryptophan biosynthesis protein B (TrpB); c) a heterologous polynucleotide encoding Tryptophan biosynthesis protein A (TrpA); d) a heterologous polynucleotide encoding feedback resistant Tryptophan biosynthesis protein TrpE comprising a M298R, S40R, or S40F substitution; e) a heterologous polynucleotide encoding Tryptophan biosynthesis protein G (TrpG); f) a heterologous polynucleotide encoding a feedback insensitive 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase (AroG) comprising a D146N substitution; or g) any combination of the foregoing. In some embodiments, a) the Tryptophan biosynthesis protein C (TrpC) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 2; b) the Tryptophan biosynthesis protein B (TrpB) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 4; c) the Tryptophan biosynthesis protein A (TrpA) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO:5; d) the feedback resistant Tryptophan biosynthesis protein TrpE comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 1, 6, 11, or 13; e) the Tryptophan biosynthesis protein G (TrpG) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 3 or 12; f) the feedback insensitive 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase (AroG) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 7 or 14; or g) any combination of the foregoing.
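
The substitutions named above (M298R, S40R, S40F, D146N) use the common notation of wild-type residue, 1-based position, and replacement residue. As a minimal illustrative sketch, and not part of the disclosure, the following Python parses such a code and applies it to a sequence string. The function names and the toy sequence are hypothetical; the actual TrpE and AroG sequences are those of the sequence listing.

```python
import re


def parse_substitution(code):
    """Split a code such as 'S40R' into (wild-type residue, 1-based position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {code!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_substitution(sequence, code):
    """Return a copy of `sequence` carrying the substitution.

    The wild-type residue is checked first so that a position mismatch
    is reported instead of silently producing the wrong variant.
    """
    wild_type, position, replacement = parse_substitution(code)
    if position < 1 or position > len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    index = position - 1  # convert 1-based position to 0-based index
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Toy 5-residue sequence, purely illustrative (not from the sequence listing).
toy_sequence = "MKSAD"
print(parse_substitution("M298R"))
print(apply_substitution(toy_sequence, "S3R"))
```

Checking the wild-type residue before substituting is a design choice of this sketch: it catches numbering offsets, for example when a homolog's positions do not line up with the reference sequence.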

In some embodiments, the bacterial host cell comprises: a) a heterologous polynucleotide encoding Tryptophan biosynthesis protein C (TrpC); b) a heterologous polynucleotide encoding Tryptophan biosynthesis protein B (TrpB); c) a heterologous polynucleotide encoding Tryptophan biosynthesis protein A (TrpA); d) a heterologous polynucleotide encoding feedback resistant Tryptophan biosynthesis protein TrpE comprising a M298R, S40R, or S40F substitution; e) a heterologous polynucleotide encoding Tryptophan biosynthesis protein G (TrpG); f) a heterologous polynucleotide encoding a feedback insensitive 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase (AroG) comprising a D146N substitution; g) a heterologous polynucleotide encoding Anthranilate O-methyltransferase 1 (AAMT1); h) a heterologous polynucleotide encoding Anthranilate N-methyltransferase (ANMT); i) a heterologous polynucleotide encoding Anthranilic acid methyltransferase (AAMT1); or j) any combination of the foregoing. In some embodiments, a) the Tryptophan biosynthesis protein C (TrpC) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 2; b) the Tryptophan biosynthesis protein B (TrpB) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 4; c) the Tryptophan biosynthesis protein A (TrpA) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO:5; d) the feedback resistant Tryptophan biosynthesis protein TrpE comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 1, 6, 11, or 13; e) the Tryptophan biosynthesis protein G (TrpG) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 3 or 12; and/or f) the feedback insensitive 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase (AroG) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 7 or 14; g) the Anthranilate O-methyltransferase 1 (AAMT1) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 15; h) the Anthranilate N-methyltransferase (ANMT) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 9 or 16; i) the Anthranilic acid methyltransferase (AAMT1) comprises a sequence that is at least 90% identical to the sequence of SEQ ID NO: 8; or j) any combination of the foregoing.
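
To illustrate what an "at least 90% identical" threshold can mean in practice, the following Python is a minimal sketch that computes percent identity over two sequences that have already been aligned to equal length, with "-" marking gaps. The choice of denominator (alignment length) and the function names are assumptions made for this example; the disclosure may define identity differently, for instance by reference to a particular alignment algorithm, and the toy sequences are not taken from the sequence listing.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumes the inputs are pre-aligned, equal-length strings with '-' for gaps.
    Gap columns count toward the length but never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold=90.0):
    """True when the percent identity is at or above the threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned sequences, purely illustrative.
reference = "MKSADLLTRA"
variant = "MKSADLLTRG"   # one mismatch in ten columns
gapped = "MKSA-LLTRG"    # one gap and one mismatch in ten columns

print(percent_identity(reference, variant))
print(meets_identity_threshold(reference, variant))
print(meets_identity_threshold(reference, gapped))
```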

In some embodiments, the derivatives thereof are alkylated derivatives. In some embodiments, the anthranilate or derivative thereof is anthranilate, methyl anthranilate, N,N-dimethyl anthranilate, ethyl anthranilate, butyl anthranilate or methyl N,N-dimethyl anthranilate. In some embodiments, the anthranilate or derivative thereof is anthranilate. In some embodiments, the anthranilate or derivative thereof is methyl anthranilate.

In some embodiments, the bacterial host cell is Staphylococcus epidermidis bacteria. In some embodiments, the bacterial host cell is Corynebacterium spp. bacteria. Further aspects of the disclosure provide compositions comprising a bacterial host cell described in this application. In some embodiments, the composition is formulated for use as a fragrance, a perfume, a deodorant, a sunscreen, and/or for use in oil absorbance. In some embodiments, the composition is formulated for use as a fragrance or perfume. In some embodiments, the composition is formulated for use as a deodorant. In some embodiments, the composition is formulated for use as a sunscreen. In some embodiments, the composition is formulated for use as an insect repellent. In some embodiments, the composition is formulated for use in oil absorbance. In some embodiments, the composition is in the form of a gel, cream, ointment, lotion, serum, powder, aerosol spray or two-component dispensing system. In some embodiments, the composition further comprises a carrier, buffer or thickener. In some embodiments, the bacterial host cell comprises a genetic kill switch that controls the growth of the bacterial host cell.

Further aspects of the disclosure provide methods comprising administering the bacterial host cell described in the application to a subject. In some embodiments, the subject is a human subject. In some embodiments, the subject is a non-human subject.

Further aspects of the disclosure provide methods comprising administering the composition described in the application to a subject. In some embodiments, the subject is a human subject. In some embodiments, the subject is a non-human subject.

Each of the limitations of the invention can encompass various embodiments of the invention. It is, therefore, anticipated that each of the limitations of the invention involving any one element or combinations of elements can be included in each aspect of the invention. This disclosure is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the drawings. The invention is capable of other embodiments and of being practiced or of being carried out in various ways. Also, the phraseology and terminology used in this application is for the purpose of description and should not be regarded as limiting. The use of “including,” “comprising,” or “having,” “containing,” “involving,” and variations thereof, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items. The term “a” or “an” refers to one or more of an entity.

BRIEF DESCRIPTION OF DRAWINGS

The accompanying drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. In the drawings:

FIG. 1 provides a schematic showing the tryptophan biosynthesis pathway that results in accumulation of a tryptophan precursor for generating anthranilate and methyl anthranilate.

FIG. 2 is a graph depicting production of anthranilate (mg/L) by different test strains of S. epidermidis expressing combinations of the following heterologous genes: trpE fbr S40R, trpE fbr S40F, trpC, trpG, trpB, trpA, trpG, and aroG D146N on Day 1 and Day 2 relative to a control strain t869401 not harboring the heterologous genes expressed by the test strains.

FIG. 3 is a graph depicting production of anthranilate (mg/L) by different test strains of C. ammoniagenes expressing combinations of the following heterologous genes: trpE fbr S40R, trpE fbr S40F, trpG, and aroG D146N on Day 1 and Day 2 relative to a control strain t1027457 not harboring the heterologous genes expressed by the test strains.

FIG. 4 is a graph depicting production of anthranilate (mg/L) by different test strains of C. tuberculostearicum expressing combinations of the following heterologous genes: trpE fbr S40R, trpE fbr S40F, trpG, and aroG D146N on Day 1 and Day 2 relative to a control strain t1240034 not harboring the heterologous genes expressed by the test strains.

FIG. 5 is a graph depicting production of anthranilate (mg/L) by different test strains of C. mucifaciens expressing combinations of the following heterologous genes: trpE fbr S40R, trpE fbr S40F, trpG, and aroG D146N on Day 1 and Day 2 relative to a control strain t1240050 not harboring the heterologous genes expressed by the test strains.

FIG. 6 is a graph depicting production of anthranilate and methyl anthranilate (mg/L) by different test strains of C. ammoniagenes expressing combinations of the following heterologous genes: trpE fbr S40R, ANMT-Rg, AAMT1-Zm, AAMT1-Mt, AAMT1-Mt, and AAMT1-Zm on Day 1 and Day 2 relative to a control strain t1240028 not harboring the heterologous genes expressed by the test strains.

FIGs. 7A-7C are liquid chromatograms (LCMS) of: a sample from S. epidermidis strain t1208309 producing anthranilate (FIG. 7A) relative to a wildtype strain that does not produce anthranilate (FIG. 7B) and relative to a chemical standard (FIG. 7C).

FIGs. 8A-8C are liquid chromatograms (LCMS) of: a sample from Corynebacterium strain t1223296 producing anthranilate (FIG. 8A) relative to a wildtype strain that does not produce anthranilate (FIG. 8B) and relative to a chemical standard (FIG. 8C).

FIG. 9 is a graph depicting methyl anthranilate production by S. epidermidis strain t1254848 expressing heterologous genes trpE fbr S40F and ANMT-Rg on a pIMAY-Z plasmid backbone on Day 1 and Day 2 relative to S. epidermidis strain t1254846 expressing heterologous genes trpE fbr S40F and AAMT1-Mt on a pIMAY-Z plasmid backbone and S. epidermidis strain t1254849 expressing heterologous genes trpE fbr S40F and AAMT1-Zm on a pIMAY-Z plasmid backbone.

FIG. 10 is a graph depicting methyl anthranilate production by S. epidermidis strain t1275357 with genome integration of heterologous genes trpE fbr S40F and ANMT-Rg on Day 1 and Day 2 relative to a wildtype S. epidermidis strain t869401 not harboring the heterologous genes expressed by the test strain.

DETAILED DESCRIPTION

Compounds of the present disclosure (e.g., anthranilate and/or methyl anthranilate) that are not naturally part of the human odor have previously been produced by non-skin commensal microorganisms (Luo et al., 2019 PNAS 116:10749-10756). However, such non-skin commensal microorganisms were not compatible with the physiological skin microbiome and could not be safely administered to humans and animals due to pathogenic risk.

The present disclosure provides, in some aspects, engineered skin commensal bacterial host cells that produce a compound such as anthranilate, an anthranilate derivative such as a methyl anthranilate, or a tryptophan precursor, the presence of which is beneficial to the skin. Engineered skin commensal bacterial host cells associated with the disclosure are capable of producing compounds that are not naturally part of the human odor and that may comprise or produce a desirable property, such as a desirable fragrance or perfume, an insect repellent effect, and/or a health or wellness aid. It was surprisingly demonstrated in the Examples that skin commensal bacterial host cells could be modified to have increased accumulation of a tryptophan precursor and produce anthranilate. Further described in the Examples is the surprising demonstration that strains of skin commensal bacterial host cells could be modified to have both increased accumulation of a tryptophan precursor and could be modified to express one or more methyltransferases and produce methyl anthranilate.

Skin commensal bacterial host cells provided in this disclosure may be incorporated into compositions for administration to human and non-human subjects. Bacterial host cells associated with the disclosure produce anthranilate and/or derivatives thereof. In some embodiments, bacterial host cells associated with the disclosure produce anthranilate. In some embodiments, bacterial host cells associated with the disclosure produce methyl anthranilate. In some embodiments, bacterial host cells associated with the disclosure produce a derivative of anthranilic acid or anthranilate (e.g., alkylated derivative of anthranilic acid or anthranilate, for example, alkylated anthranilate derivative, anthranilate ester, anthranilate ester derivative). In some embodiments, bacterial host cells associated with the disclosure produce one or more compounds selected from: anthranilate, methyl anthranilate, N,N-dimethyl anthranilate, ethyl anthranilate, butyl anthranilate, methyl N,N-dimethyl anthranilate, and/or other insect repellent compounds.

In certain embodiments, the compound is a derivative of anthranilic acid or anthranilate (e.g., alkylated derivative of anthranilic acid or anthranilate, for example, alkylated anthranilate derivative, anthranilate ester, anthranilate ester derivative). In certain embodiments, the compound is an alkylated anthranilate derivative (e.g., N,N-dimethyl anthranilate, ethyl anthranilate, butyl anthranilate, and methyl N,N-dimethyl anthranilate). In certain embodiments, bacterial host cells associated with the disclosure produce a compound of Formula (I), which is an alkylated anthranilate derivative (e.g., N,N-dimethyl anthranilate, ethyl anthranilate, butyl anthranilate, and methyl N,N-dimethyl anthranilate). In certain embodiments, bacterial host cells associated with the disclosure produce a compound of Formula (I). In certain embodiments, the compound is a derivative of anthranilic acid or anthranilate of Formula (I): or a salt, solvate, hydrate, stereoisomer, polymorph, tautomer, or prodrug thereof, wherein:

R¹ is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, optionally substituted heterocyclyl, optionally substituted aryl, optionally substituted heteroaryl, or an oxygen protecting group; wherein each instance of R² is independently hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, optionally substituted heterocyclyl, optionally substituted aryl, optionally substituted heteroaryl, or a nitrogen protecting group; each instance of R³ is independently halogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, optionally substituted heterocyclyl, optionally substituted aryl, optionally substituted heteroaryl, -CN, -NO2, -OR^D1, -N(R^D1a)2, or -SR^D1, or optionally two instances of R³ are taken together with their intervening atoms to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring; wherein R^D1 is independently hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, optionally substituted heterocyclyl, optionally substituted aryl, optionally substituted heteroaryl, or an oxygen protecting group when attached to an oxygen atom, or a sulfur protecting group when attached to a sulfur atom; wherein each occurrence of R^D1a is independently hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, optionally substituted heterocyclyl, optionally substituted aryl, optionally substituted heteroaryl, or a nitrogen protecting group; or optionally two instances of R^D1a are taken together with their intervening atoms to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring; and n is 0, 1, 2, 3, or 4.
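
As an illustrative aid, and not part of the disclosure, the following Python sketch records how the named compounds could map onto the variables of Formula (I). Because the structure itself is not reproduced in this text, the sketch assumes that R¹ sits on the carboxyl oxygen, that the two R² groups sit on the ring nitrogen, and that the n instances of R³ are ring substituents; this reading is inferred from the protecting-group language above. The class and dictionary names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormulaI:
    """Substituent pattern for Formula (I) under the stated assumptions."""
    r1: str            # group on the carboxyl oxygen (assumed position)
    r2: tuple          # the two groups on the nitrogen (assumed position)
    r3: tuple = ()     # ring substituents; n is the number of entries

    def __post_init__(self):
        if len(self.r2) != 2:
            raise ValueError("Formula (I) has exactly two instances of R2")
        if len(self.r3) > 4:
            raise ValueError("n must be 0, 1, 2, 3, or 4")

    @property
    def n(self):
        return len(self.r3)


# Compounds named in the text, expressed with the assumed variable positions.
NAMED_COMPOUNDS = {
    "anthranilate": FormulaI("H", ("H", "H")),
    "methyl anthranilate": FormulaI("methyl", ("H", "H")),
    "N,N-dimethyl anthranilate": FormulaI("H", ("methyl", "methyl")),
    "ethyl anthranilate": FormulaI("ethyl", ("H", "H")),
    "butyl anthranilate": FormulaI("butyl", ("H", "H")),
    "methyl N,N-dimethyl anthranilate": FormulaI("methyl", ("methyl", "methyl")),
}


def is_alkylated(compound):
    """True when R1 or either R2 is something other than hydrogen.

    In this toy table every non-hydrogen group is an alkyl group, so the
    check is sufficient here; it is not a general alkyl test.
    """
    return compound.r1 != "H" or any(group != "H" for group in compound.r2)


for name, compound in NAMED_COMPOUNDS.items():
    print(name, "| n =", compound.n, "| alkylated:", is_alkylated(compound))
```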

In certain embodiments, R¹ is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, optionally substituted heterocyclyl, optionally substituted aryl, optionally substituted heteroaryl, or an oxygen protecting group. In certain embodiments, R¹ is hydrogen. In certain embodiments, R¹ is optionally substituted acyl (e.g., -C(=O)Me). In certain embodiments, R¹ is optionally substituted alkyl (e.g., substituted or unsubstituted C1-10 alkyl, substituted or unsubstituted C1-6 alkyl). In certain embodiments, R¹ is substituted or unsubstituted methyl. In certain embodiments, R¹ is substituted or unsubstituted ethyl. In certain embodiments, R¹ is substituted or unsubstituted propyl. In certain embodiments, R¹ is substituted or unsubstituted butyl (e.g., optionally substituted n-butyl, optionally substituted t-butyl). In certain embodiments, R¹ is optionally substituted alkenyl (e.g., substituted or unsubstituted C2-6 alkenyl). In certain embodiments, R¹ is optionally substituted alkynyl (e.g., substituted or unsubstituted C2-6 alkynyl). In certain embodiments, R¹ is optionally substituted carbocyclyl (e.g., substituted or unsubstituted, 3- to 7-membered, monocyclic carbocyclyl comprising zero, one, or two double bonds in the carbocyclic ring system). In certain embodiments, R¹ is optionally substituted heterocyclyl (e.g., substituted or unsubstituted, 5- to 10-membered monocyclic or bicyclic heterocyclic ring, wherein one or two atoms in the heterocyclic ring are independently nitrogen, oxygen, or sulfur). In certain embodiments, R¹ is optionally substituted aryl (e.g., substituted or unsubstituted, 6- to 10-membered aryl). In certain embodiments, R¹ is benzyl. In certain embodiments, R¹ is substituted or unsubstituted phenyl. In certain embodiments, R¹ is optionally substituted heteroaryl (e.g., substituted or unsubstituted, 5- to 6-membered, monocyclic heteroaryl, wherein one, two, three, or four atoms in the heteroaryl ring system are independently nitrogen, oxygen, or sulfur; or substituted or unsubstituted, 9- to 10-membered, bicyclic heteroaryl, wherein one, two, three, or four atoms in the heteroaryl ring system are independently nitrogen, oxygen, or sulfur). In certain embodiments, R¹ is an oxygen protecting group (e.g., silyl, TBDPS, TBDMS, TIPS, TES, TMS, MOM, THP, t-Bu, Bn, allyl, acetyl, pivaloyl, or benzoyl).

In certain embodiments, each instance of R² is independently hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, optionally substituted heterocyclyl, optionally substituted aryl, optionally substituted heteroaryl, or a nitrogen protecting group. In certain embodiments, at least one instance of R² is hydrogen. In certain embodiments, both instances of R² are hydrogen. In certain embodiments, R¹ is hydrogen, and both instances of R² are hydrogen. In certain embodiments, R¹ is optionally substituted alkyl (e.g., optionally substituted C1-6 alkyl), and at least one instance of R² is hydrogen. In certain embodiments, R¹ is optionally substituted alkyl (e.g., optionally substituted C1-6 alkyl), and both instances of R² are hydrogen. In certain embodiments, R¹ is optionally substituted acyl (e.g., -C(=O)(optionally substituted C1-6 alkyl)), and both instances of R² are hydrogen. In certain embodiments, at least one instance of R² is optionally substituted acyl (e.g., -C(=O)Me). In certain embodiments, at least one instance of R² is optionally substituted alkyl (e.g., substituted or unsubstituted C1-10 alkyl, substituted or unsubstituted C1-6 alkyl). In certain embodiments, R¹ is optionally substituted alkyl (e.g., optionally substituted C1-6 alkyl), and at least one instance of R² is optionally substituted alkyl (e.g., optionally substituted C1-6 alkyl). In certain embodiments, R¹ is optionally substituted alkyl (e.g., optionally substituted C1-6 alkyl), and both instances of R² are optionally substituted alkyl (e.g., optionally substituted C1-6 alkyl). In certain embodiments, at least one instance of R² is substituted or unsubstituted methyl. In certain embodiments, at least one instance of R² is substituted or unsubstituted ethyl. In certain embodiments, at least one instance of R² is substituted or unsubstituted propyl. In certain embodiments, at least one instance of R² is substituted or unsubstituted butyl (e.g., optionally substituted n-butyl, optionally substituted t-butyl). In certain embodiments, at least one instance of R² is optionally substituted alkenyl (e.g., substituted or unsubstituted C2-6 alkenyl). In certain embodiments, at least one instance of R² is optionally substituted alkynyl (e.g., substituted or unsubstituted C2-6 alkynyl). In certain embodiments, at least one instance of R² is optionally substituted carbocyclyl (e.g., substituted or unsubstituted, 3- to 7-membered, monocyclic carbocyclyl comprising zero, one, or two double bonds in the carbocyclic ring system). In certain embodiments, at least one instance of R² is optionally substituted heterocyclyl (e.g., substituted or unsubstituted, 5- to 10-membered monocyclic or bicyclic heterocyclic ring, wherein one or two atoms in the heterocyclic ring are independently nitrogen, oxygen, or sulfur). In certain embodiments, at least one instance of R² is optionally substituted aryl (e.g., substituted or unsubstituted, 6- to 10-membered aryl). In certain embodiments, at least one instance of R² is benzyl. In certain embodiments, at least one instance of R² is substituted or unsubstituted phenyl. In certain embodiments, at least one instance of R² is optionally substituted heteroaryl (e.g., substituted or unsubstituted, 5- to 6-membered, monocyclic heteroaryl, wherein one, two, three, or four atoms in the heteroaryl ring system are independently nitrogen, oxygen, or sulfur; or substituted or unsubstituted, 9- to 10-membered, bicyclic heteroaryl, wherein one, two, three, or four atoms in the heteroaryl ring system are independently nitrogen, oxygen, or sulfur). In certain embodiments, at least one instance of R² is a nitrogen protecting group (e.g., Bn, Boc, Cbz, Fmoc, trifluoroacetyl, triphenylmethyl, acetyl, or Ts).

In certain embodiments, both instances of R ² are hydrogen. In certain embodiments, R ¹ is hydrogen, both instances of R ² are hydrogen, and n is 0 or 1. In certain embodiments, R ¹ is optionally substituted alkyl (e.g., optionally substituted C1-6 alkyl), at least one instance of R ² is hydrogen, and n is 0 or 1. In certain embodiments, R ¹ is optionally substituted alkyl (e.g., optionally substituted C1-6 alkyl), both instances of R ² are hydrogen, and n is 0 or 1. In certain embodiments, R ¹ is optionally substituted acyl (e.g., -C(=O)(optionally substituted C1-6 alkyl)), both instances of R ² are hydrogen, and n is 0 or 1. In certain embodiments, R ¹ is optionally substituted alkyl (e.g., optionally substituted C1-6 alkyl), at least one instance of R ² is optionally substituted alkyl (e.g., optionally substituted C1-6 alkyl), and n is 0 or 1. In certain embodiments, R ¹ is optionally substituted alkyl (e.g., optionally substituted C1-6 alkyl), both instances of R ² are optionally substituted alkyl (e.g., optionally substituted C1-6 alkyl), and n is 0 or 1.

Compounds of Formula (I) include zero or more instances of R ³. In certain embodiments, n is 0. In certain embodiments, n is 1. In certain embodiments, n is 2. In certain embodiments, n is 3. In certain embodiments, n is 4. In certain embodiments, n is 0 or 1. In certain embodiments, at least one instance of R ³ is independently halogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, optionally substituted heterocyclyl, optionally substituted aryl, optionally substituted heteroaryl, -CN, -NO2, -OR ^D1, -N(R ^D1a)2, or -SR ^D1, or optionally two instances of R ³ are taken together with their intervening atoms to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring; wherein R ^D1 is independently hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, optionally substituted heterocyclyl, optionally substituted aryl, optionally substituted heteroaryl, or an oxygen protecting group when attached to an oxygen atom, or a sulfur protecting group when attached to a sulfur atom. In certain embodiments, at least one instance of R ³ is halogen (e.g., F, Cl, Br, or I). In certain embodiments, at least one instance of R ³ is optionally substituted acyl (e.g., -C(=O)Me). In certain embodiments, at least one instance of R ³ is optionally substituted alkyl (e.g., substituted or unsubstituted C1-6 alkyl). In certain embodiments, at least one instance of R ³ is substituted or unsubstituted methyl. In certain embodiments, at least one instance of R ³ is substituted or unsubstituted ethyl. In certain embodiments, at least one instance of R ³ is substituted or unsubstituted propyl. In certain embodiments, at least one instance of R ³ is optionally substituted alkenyl (e.g., substituted or unsubstituted C2-6 alkenyl).
In certain embodiments, at least one instance of R ³ is optionally substituted alkynyl (e.g., substituted or unsubstituted C2-6 alkynyl). In certain embodiments, at least one instance of R ³ is optionally substituted carbocyclyl (e.g., substituted or unsubstituted, 3- to 7-membered, monocyclic carbocyclyl comprising zero, one, or two double bonds in the carbocyclic ring system). In certain embodiments, at least one instance of R ³ is optionally substituted heterocyclyl (e.g., substituted or unsubstituted, 5- to 10-membered monocyclic or bicyclic heterocyclic ring, wherein one or two atoms in the heterocyclic ring are independently nitrogen, oxygen, or sulfur). In certain embodiments, at least one instance of R ³ is optionally substituted aryl (e.g., substituted or unsubstituted, 6- to 10-membered aryl). In certain embodiments, at least one instance of R ³ is benzyl. In certain embodiments, at least one instance of R ³ is substituted or unsubstituted phenyl. In certain embodiments, at least one instance of R ³ is optionally substituted heteroaryl (e.g., substituted or unsubstituted, 5- to 6-membered, monocyclic heteroaryl, wherein one, two, three, or four atoms in the heteroaryl ring system are independently nitrogen, oxygen, or sulfur; or substituted or unsubstituted, 9- to 10-membered, bicyclic heteroaryl, wherein one, two, three, or four atoms in the heteroaryl ring system are independently nitrogen, oxygen, or sulfur). 
In certain embodiments, two instances of R ³ are taken together with their intervening atoms to form an optionally substituted heterocyclic ring (e.g., substituted or unsubstituted, 5- to 10-membered monocyclic or bicyclic heterocyclic ring, wherein one or two atoms in the heterocyclic ring are independently nitrogen, oxygen, or sulfur) or optionally substituted heteroaryl ring (e.g., substituted or unsubstituted, 5- to 6-membered, monocyclic heteroaryl, wherein one, two, three, or four atoms in the heteroaryl ring system are independently nitrogen, oxygen, or sulfur; or substituted or unsubstituted, 9- to 10-membered, bicyclic heteroaryl, wherein one, two, three, or four atoms in the heteroaryl ring system are independently nitrogen, oxygen, or sulfur). In certain embodiments, at least one instance of R ³ is -N3. In certain embodiments, at least one instance of R ³ is -CN. In certain embodiments, at least one instance of R ³ is -NO2. In certain embodiments, at least one instance of R ³ is -OR ^D1 (e.g., -OH or -OMe). In certain embodiments, at least one instance of R ³ is -N(R ^D1a)2 (e.g., -NMe2). In certain embodiments, at least one instance of R ³ is -SR ^D1 (e.g., -SMe).

In certain embodiments, at least one instance of R ³ is -OR ^D1, -N(R ^D1a)2, or -SR ^D1, and R ^D1 and R ^D1a are as defined herein. In certain embodiments, R ^D1 is hydrogen. In certain embodiments, R ^D1 is optionally substituted acyl (e.g., -C(=O)Me). In certain embodiments, R ^D1 is optionally substituted alkyl (e.g., substituted or unsubstituted C1-6 alkyl). In certain embodiments, R ^D1 is substituted or unsubstituted methyl. In certain embodiments, R ^D1 is substituted or unsubstituted ethyl. In certain embodiments, R ^D1 is substituted or unsubstituted propyl. In certain embodiments, R ^D1 is optionally substituted alkenyl (e.g., substituted or unsubstituted C2-6 alkenyl). In certain embodiments, R ^D1 is optionally substituted alkynyl (e.g., substituted or unsubstituted C2-6 alkynyl). In certain embodiments, R ^D1 is optionally substituted carbocyclyl (e.g., substituted or unsubstituted, 3- to 7-membered, monocyclic carbocyclyl comprising zero, one, or two double bonds in the carbocyclic ring system). In certain embodiments, R ^D1 is optionally substituted heterocyclyl (e.g., substituted or unsubstituted, 5- to 10-membered monocyclic or bicyclic heterocyclic ring, wherein one or two atoms in the heterocyclic ring are independently nitrogen, oxygen, or sulfur). In certain embodiments, R ^D1 is optionally substituted aryl (e.g., substituted or unsubstituted, 6- to 10-membered aryl). In certain embodiments, R ^D1 is benzyl. In certain embodiments, R ^D1 is substituted or unsubstituted phenyl. In certain embodiments, R ^D1 is optionally substituted heteroaryl (e.g., substituted or unsubstituted, 5- to 6-membered, monocyclic heteroaryl, wherein one, two, three, or four atoms in the heteroaryl ring system are independently nitrogen, oxygen, or sulfur; or substituted or unsubstituted, 9- to 10-membered, bicyclic heteroaryl, wherein one, two, three, or four atoms in the heteroaryl ring system are independently nitrogen, oxygen, or sulfur).
In certain embodiments, R ^D1 is an oxygen protecting group when attached to an oxygen atom (e.g., silyl, TBDPS, TBDMS, TIPS, TES, TMS, MOM, THP, t-Bu, Bn, allyl, acetyl, pivaloyl, or benzoyl). In certain embodiments, R ^D1 is a sulfur protecting group when attached to a sulfur atom (e.g., acetamidomethyl, t-Bu, 3-nitro-2-pyridine sulfenyl, 2-pyridine-sulfenyl, or triphenylmethyl).

In certain embodiments, at least one instance of R ^D1a is hydrogen. In certain embodiments, at least one instance of R ^D1a is optionally substituted acyl (e.g., -C(=O)Me). In certain embodiments, at least one instance of R ^D1a is optionally substituted alkyl (e.g., substituted or unsubstituted C1-6 alkyl). In certain embodiments, at least one instance of R ^D1a is substituted or unsubstituted methyl. In certain embodiments, at least one instance of R ^D1a is substituted or unsubstituted ethyl. In certain embodiments, at least one instance of R ^D1a is substituted or unsubstituted propyl. In certain embodiments, at least one instance of R ^D1a is optionally substituted alkenyl (e.g., substituted or unsubstituted C2-6 alkenyl). In certain embodiments, at least one instance of R ^D1a is optionally substituted alkynyl (e.g., substituted or unsubstituted C2-6 alkynyl). In certain embodiments, at least one instance of R ^D1a is optionally substituted carbocyclyl (e.g., substituted or unsubstituted, 3- to 7-membered, monocyclic carbocyclyl comprising zero, one, or two double bonds in the carbocyclic ring system). In certain embodiments, at least one instance of R ^D1a is optionally substituted heterocyclyl (e.g., substituted or unsubstituted, 5- to 10-membered monocyclic or bicyclic heterocyclic ring, wherein one or two atoms in the heterocyclic ring are independently nitrogen, oxygen, or sulfur). In certain embodiments, at least one instance of R ^D1a is optionally substituted aryl (e.g., substituted or unsubstituted, 6- to 10-membered aryl). In certain embodiments, at least one instance of R ^D1a is benzyl. In certain embodiments, at least one instance of R ^D1a is substituted or unsubstituted phenyl.
In certain embodiments, at least one instance of R ^D1a is optionally substituted heteroaryl (e.g., substituted or unsubstituted, 5- to 6-membered, monocyclic heteroaryl, wherein one, two, three, or four atoms in the heteroaryl ring system are independently nitrogen, oxygen, or sulfur; or substituted or unsubstituted, 9- to 10-membered, bicyclic heteroaryl, wherein one, two, three, or four atoms in the heteroaryl ring system are independently nitrogen, oxygen, or sulfur). In certain embodiments, at least one instance of R ^D1a is a nitrogen protecting group (e.g., benzyl (Bn), t-butyl carbonate (BOC or Boc), benzyl carbamate (Cbz), 9-fluorenylmethyl carbonate (Fmoc), trifluoroacetyl, triphenylmethyl, acetyl, or p-toluenesulfonamide (Ts)). In certain embodiments, two instances of R ^D1a are taken together with their intervening atoms to form an optionally substituted heterocyclic ring (e.g., substituted or unsubstituted, 5- to 10-membered monocyclic or bicyclic heterocyclic ring, wherein one or two atoms in the heterocyclic ring are independently nitrogen, oxygen, or sulfur) or optionally substituted heteroaryl ring (e.g., substituted or unsubstituted, 5- to 6-membered, monocyclic heteroaryl, wherein one, two, three, or four atoms in the heteroaryl ring system are independently nitrogen, oxygen, or sulfur; or substituted or unsubstituted, 9- to 10-membered, bicyclic heteroaryl, wherein one, two, three, or four atoms in the heteroaryl ring system are independently nitrogen, oxygen, or sulfur).

In certain embodiments, the compound of Formula (I) is of the formula: or a salt, solvate, hydrate, stereoisomer, polymorph, tautomer, or prodrug thereof. In certain embodiments, the compound of Formula (I) is of the formula: anthranilate), or (methyl N,N-dimethyl anthranilate).

The term “alkyl” refers to a radical of, or a substituent that is, a straight-chain or branched saturated hydrocarbon group having from 1 to 20 carbon atoms (“C1-20 alkyl”). In certain embodiments, the term “alkyl” refers to a radical of, or a substituent that is, a straight-chain or branched saturated hydrocarbon group having from 1 to 10 carbon atoms (“C1-10 alkyl”). In some embodiments, an alkyl group has 1 to 9 carbon atoms (“C1-9 alkyl”). In some embodiments, an alkyl group has 1 to 8 carbon atoms (“C1-8 alkyl”). In some embodiments, an alkyl group has 1 to 7 carbon atoms (“C1-7 alkyl”). In some embodiments, an alkyl group has 2 to 7 carbon atoms (“C2-7 alkyl”). In some embodiments, an alkyl group has 3 to 7 carbon atoms (“C3-7 alkyl”). In some embodiments, an alkyl group has 1 to 6 carbon atoms (“C1-6 alkyl”). In some embodiments, an alkyl group has 2 to 6 carbon atoms (“C2-6 alkyl”). In some embodiments, an alkyl group has 3 to 5 carbon atoms (“C3-5 alkyl”). In some embodiments, an alkyl group has 5 carbon atoms (“C5 alkyl”). In some embodiments, the alkyl group has 3 carbon atoms (“C3 alkyl”). In some embodiments, the alkyl group has 7 carbon atoms (“C7 alkyl”). In some embodiments, an alkyl group has 1 to 5 carbon atoms (“C1-5 alkyl”). In some embodiments, an alkyl group has 1 to 4 carbon atoms (“C1-4 alkyl”). In some embodiments, an alkyl group has 1 to 3 carbon atoms (“C1-3 alkyl”). In some embodiments, an alkyl group has 1 to 2 carbon atoms (“C1-2 alkyl”). In some embodiments, an alkyl group has 1 carbon atom (“C1 alkyl”).

Examples of C1-6 alkyl groups include methyl (C1), ethyl (C2), propyl (C3) (e.g., n-propyl, isopropyl), butyl (C4) (e.g., n-butyl, tert-butyl, sec-butyl, iso-butyl), pentyl (C5) (e.g., n-pentyl, 3-pentanyl, amyl, neopentyl, 3-methyl-2-butanyl, tertiary amyl), and hexyl (C6) (e.g., n-hexyl). Additional examples of alkyl groups include n-heptyl (C7), n-octyl (C8), and the like. Unless otherwise specified, each instance of an alkyl group is independently unsubstituted (an “unsubstituted alkyl”) or substituted (a “substituted alkyl”) with one or more substituents (e.g., halogen, such as F). In certain embodiments, the alkyl group is an unsubstituted C1-10 alkyl (such as unsubstituted C1-6 alkyl, e.g., -CH3 (Me), unsubstituted ethyl (Et), unsubstituted propyl (Pr, e.g., unsubstituted n-propyl (n-Pr), unsubstituted isopropyl (i-Pr)), unsubstituted butyl (Bu, e.g., unsubstituted n-butyl (n-Bu), unsubstituted tert-butyl (tert-Bu or t-Bu), unsubstituted sec-butyl (sec-Bu), unsubstituted isobutyl (i-Bu))). In certain embodiments, the alkyl group is a substituted C1-10 alkyl (such as substituted C1-6 alkyl, e.g., -CF3, benzyl).
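
The “Cx-y” shorthand used above denotes an inclusive range of carbon atoms. As an illustration only, the following Python sketch parses such a label and checks whether a given carbon count falls within it. The function names (parse_carbon_range, in_range, alkyl_names_in) and the name table are hypothetical helpers introduced for this example and are not part of the disclosure; the names cover only the C1 to C8 groups listed in the text.

```python
import re

# Parent names for the C1 to C8 alkyl groups named in the text.
ALKYL_NAMES = {
    1: "methyl", 2: "ethyl", 3: "propyl", 4: "butyl",
    5: "pentyl", 6: "hexyl", 7: "heptyl", 8: "octyl",
}


def parse_carbon_range(label):
    """Parse a label such as 'C1-6' or 'C5' into (min_carbons, max_carbons)."""
    match = re.fullmatch(r"C(\d+)(?:-(\d+))?", label.strip())
    if match is None:
        raise ValueError(f"not a carbon-range label: {label!r}")
    low = int(match.group(1))
    # A single number such as 'C5' means exactly that many carbon atoms.
    high = int(match.group(2)) if match.group(2) else low
    if low > high:
        raise ValueError(f"range is reversed: {label!r}")
    return low, high


def in_range(label, n_carbons):
    """Return True if n_carbons lies within the inclusive range of the label."""
    low, high = parse_carbon_range(label)
    return low <= n_carbons <= high


def alkyl_names_in(label):
    """List the parent alkyl names (C1 to C8 only) covered by the label."""
    low, high = parse_carbon_range(label)
    return [ALKYL_NAMES[n] for n in range(low, high + 1) if n in ALKYL_NAMES]
```

For example, under these assumptions in_range("C1-6", 4) is true because butyl has four carbon atoms, while in_range("C1-6", 7) is false because heptyl falls outside the range.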

The term “acyl” refers to a group having the general formula -C(=O)R ^X1, -C(=S)N(R ^X1)₂, -C(=S)S(R ^X1), -C(=NR ^X1)R ^X1, -C(=NR ^X1)OR ^X1, -C(=NR ^X1)SR ^X1, and -C(=NR ^X1)N(R ^X1)₂, wherein R ^X1 is hydrogen; halogen; substituted or unsubstituted hydroxyl; substituted or unsubstituted thiol; substituted or unsubstituted amino; substituted or unsubstituted acyl; cyclic or acyclic, substituted or unsubstituted, branched or unbranched aliphatic; cyclic or acyclic, substituted or unsubstituted, branched or unbranched heteroaliphatic; cyclic or acyclic, substituted or unsubstituted, branched or unbranched alkyl; cyclic or acyclic, substituted or unsubstituted, branched or unbranched alkenyl; substituted or unsubstituted alkynyl; substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, aliphaticoxy, heteroaliphaticoxy, alkyloxy, heteroalkyloxy, aryloxy, heteroaryloxy, aliphaticthioxy, heteroaliphaticthioxy, alkylthioxy, heteroalkylthioxy, arylthioxy, heteroarylthioxy, mono- or di-aliphaticamino, mono- or di-heteroaliphaticamino, mono- or di-alkylamino, mono- or di-heteroalkylamino, mono- or di-arylamino, or mono- or di-heteroarylamino; or two R ^X1 groups taken together form a 5- to 6-membered heterocyclic ring. Exemplary acyl groups include aldehydes (-CHO), carboxylic acids (-CO₂H), ketones, acyl halides, esters, amides, imines, carbonates, carbamates, and ureas.
Acyl substituents include, but are not limited to, any of the substituents described in this application that result in the formation of a stable moiety (e.g., aliphatic, alkyl, alkenyl, alkynyl, heteroaliphatic, heterocyclic, aryl, heteroaryl, acyl, oxo, imino, thiooxo, cyano, isocyano, amino, azido, nitro, hydroxyl, thiol, halo, aliphaticamino, heteroaliphaticamino, alkylamino, heteroalkylamino, arylamino, heteroarylamino, alkylaryl, arylalkyl, aliphaticoxy, heteroaliphaticoxy, alkyloxy, heteroalkyloxy, aryloxy, heteroaryloxy, aliphaticthioxy, heteroaliphaticthioxy, alkylthioxy, heteroalkylthioxy, arylthioxy, heteroarylthioxy, acyloxy, and the like, each of which may or may not be further substituted).

“Alkenyl” refers to a radical of, or a substituent that is, a straight-chain or branched hydrocarbon group having from 2 to 20 carbon atoms, one or more carbon-carbon double bonds, and no triple bonds (“C2-20 alkenyl”). In some embodiments, an alkenyl group has 2 to 10 carbon atoms (“C2-10 alkenyl”). In some embodiments, an alkenyl group has 2 to 9 carbon atoms (“C2-9 alkenyl”). In some embodiments, an alkenyl group has 2 to 8 carbon atoms (“C2-8 alkenyl”). In some embodiments, an alkenyl group has 2 to 7 carbon atoms (“C2-7 alkenyl”). In some embodiments, an alkenyl group has 2 to 6 carbon atoms (“C2-6 alkenyl”). In some embodiments, an alkenyl group has 2 to 5 carbon atoms (“C2-5 alkenyl”). In some embodiments, an alkenyl group has 2 to 4 carbon atoms (“C2-4 alkenyl”). In some embodiments, an alkenyl group has 2 to 3 carbon atoms (“C2-3 alkenyl”). In some embodiments, an alkenyl group has 2 carbon atoms (“C2 alkenyl”). The one or more carbon-carbon double bonds can be internal (such as in 2-butenyl) or terminal (such as in 1-butenyl). Examples of C2-4 alkenyl groups include ethenyl (C2), 1-propenyl (C3), 2-propenyl (C3), 1-butenyl (C4), 2-butenyl (C4), butadienyl (C4), and the like. Examples of C2-6 alkenyl groups include the aforementioned C2-4 alkenyl groups as well as pentenyl (C5), pentadienyl (C5), hexenyl (C6), and the like. Additional examples of alkenyl include heptenyl (C7), octenyl (C8), octatrienyl (C8), and the like. Unless otherwise specified, each instance of an alkenyl group is independently optionally substituted, i.e., unsubstituted (an “unsubstituted alkenyl”) or substituted (a “substituted alkenyl”) with one or more substituents. In certain embodiments, the alkenyl group is unsubstituted C2-10 alkenyl. In certain embodiments, the alkenyl group is substituted C2-10 alkenyl.

“Alkynyl” refers to a radical of, or a substituent that is, a straight-chain or branched hydrocarbon group having from 2 to 20 carbon atoms, one or more carbon-carbon triple bonds, and optionally one or more double bonds (“C2-20 alkynyl”). In some embodiments, an alkynyl group has 2 to 10 carbon atoms (“C2-10 alkynyl”). In some embodiments, an alkynyl group has 2 to 9 carbon atoms (“C2-9 alkynyl”). In some embodiments, an alkynyl group has 2 to 8 carbon atoms (“C2-8 alkynyl”). In some embodiments, an alkynyl group has 2 to 7 carbon atoms (“C2-7 alkynyl”). In some embodiments, an alkynyl group has 2 to 6 carbon atoms (“C2-6 alkynyl”). In some embodiments, an alkynyl group has 2 to 5 carbon atoms (“C2-5 alkynyl”). In some embodiments, an alkynyl group has 2 to 4 carbon atoms (“C2-4 alkynyl”). In some embodiments, an alkynyl group has 2 to 3 carbon atoms (“C2-3 alkynyl”). In some embodiments, an alkynyl group has 2 carbon atoms (“C2 alkynyl”). The one or more carbon-carbon triple bonds can be internal (such as in 2-butynyl) or terminal (such as in 1-butynyl). Examples of C2-4 alkynyl groups include, without limitation, ethynyl (C2), 1-propynyl (C3), 2-propynyl (C3), 1-butynyl (C4), 2-butynyl (C4), and the like. Examples of C2-6 alkynyl groups include the aforementioned C2-4 alkynyl groups as well as pentynyl (C5), hexynyl (C6), and the like. Additional examples of alkynyl include heptynyl (C7), octynyl (C8), and the like. Unless otherwise specified, each instance of an alkynyl group is independently optionally substituted, i.e., unsubstituted (an “unsubstituted alkynyl”) or substituted (a “substituted alkynyl”) with one or more substituents. In certain embodiments, the alkynyl group is unsubstituted C2-10 alkynyl. In certain embodiments, the alkynyl group is substituted C2-10 alkynyl.
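
The alkyl, alkenyl, and alkynyl definitions differ only in which multiple bonds are permitted. A minimal Python sketch of that decision rule follows, using the broadest ranges stated in the definitions (1 to 20 carbon atoms for alkyl, 2 to 20 for alkenyl and alkynyl). The function classify_chain and its parameters are hypothetical names chosen for illustration; the sketch considers only acyclic hydrocarbon groups and does not check whether the stated bond counts are chemically feasible for the chain length.

```python
def classify_chain(n_carbons, n_double, n_triple):
    """Classify an acyclic hydrocarbon group by its multiple bonds.

    alkyl:   no double bonds and no triple bonds
    alkenyl: one or more double bonds and no triple bonds
    alkynyl: one or more triple bonds, optionally with double bonds
    """
    if n_double < 0 or n_triple < 0:
        raise ValueError("bond counts cannot be negative")
    if n_triple >= 1:
        kind = "alkynyl"
    elif n_double >= 1:
        kind = "alkenyl"
    else:
        kind = "alkyl"
    # Broadest ranges in the definitions: C1-20 alkyl, C2-20 alkenyl/alkynyl.
    min_carbons = 1 if kind == "alkyl" else 2
    if not min_carbons <= n_carbons <= 20:
        raise ValueError(f"{n_carbons} carbon atoms is outside the range for {kind}")
    return f"C{n_carbons} {kind}"
```

Under these assumptions, butadienyl (four carbon atoms, two double bonds) is classified as a C4 alkenyl, and a group containing both a double and a triple bond is classified as alkynyl, consistent with the wording “optionally one or more double bonds.”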

“Carbocyclyl” or “carbocyclic” refers to a radical of a non-aromatic cyclic hydrocarbon group having from 3 to 10 ring carbon atoms (“C3-10 carbocyclyl”) and zero heteroatoms in the non-aromatic ring system. In some embodiments, a carbocyclyl group has 3 to 8 ring carbon atoms (“C3-8 carbocyclyl”). In some embodiments, a carbocyclyl group has 3 to 6 ring carbon atoms (“C3-6 carbocyclyl”). In some embodiments, a carbocyclyl group has 5 to 10 ring carbon atoms (“C5-10 carbocyclyl”). Exemplary C3-6 carbocyclyl groups include, without limitation, cyclopropyl (C3), cyclopropenyl (C3), cyclobutyl (C4), cyclobutenyl (C4), cyclopentyl (C5), cyclopentenyl (C5), cyclohexyl (C6), cyclohexenyl (C6), cyclohexadienyl (C6), and the like. Exemplary C3-8 carbocyclyl groups include, without limitation, the aforementioned C3-6 carbocyclyl groups as well as cycloheptyl (C7), cycloheptenyl (C7), cycloheptadienyl (C7), cycloheptatrienyl (C7), cyclooctyl (C8), cyclooctenyl (C8), bicyclo[2.2.1]heptanyl (C7), bicyclo[2.2.2]octanyl (C8), and the like. Exemplary C3-10 carbocyclyl groups include, without limitation, the aforementioned C3-8 carbocyclyl groups as well as cyclononyl (C9), cyclononenyl (C9), cyclodecyl (C10), cyclodecenyl (C10), octahydro-1H-indenyl (C9), decahydronaphthalenyl (C10), spiro[4.5]decanyl (C10), and the like. As the foregoing examples illustrate, in certain embodiments, the carbocyclyl group is either monocyclic (“monocyclic carbocyclyl”) or contains a fused, bridged or spiro ring system such as a bicyclic system (“bicyclic carbocyclyl”) and can be saturated or can be partially unsaturated.
“Carbocyclyl” also includes ring systems wherein the carbocyclic ring, as defined above, is fused with one or more aryl or heteroaryl groups wherein the point of attachment is on the carbocyclic ring, and in such instances, the number of carbons continues to designate the number of carbons in the carbocyclic ring system. Unless otherwise specified, each instance of a carbocyclyl group is independently optionally substituted, i.e., unsubstituted (an “unsubstituted carbocyclyl”) or substituted (a “substituted carbocyclyl”) with one or more substituents. In certain embodiments, the carbocyclyl group is unsubstituted C3-10 carbocyclyl. In certain embodiments, the carbocyclyl group is a substituted C3-10 carbocyclyl.

In some embodiments, “carbocyclyl” is a monocyclic, saturated carbocyclyl group having from 3 to 10 ring carbon atoms (“C3-10 cycloalkyl”). In some embodiments, a cycloalkyl group has 3 to 8 ring carbon atoms (“C3-8 cycloalkyl”). In some embodiments, a cycloalkyl group has 3 to 6 ring carbon atoms (“C3-6 cycloalkyl”). In some embodiments, a cycloalkyl group has 5 to 6 ring carbon atoms (“C5-6 cycloalkyl”). In some embodiments, a cycloalkyl group has 5 to 10 ring carbon atoms (“C5-10 cycloalkyl”). Examples of C5-6 cycloalkyl groups include cyclopentyl (C5) and cyclohexyl (C6). Examples of C3-6 cycloalkyl groups include the aforementioned C5-6 cycloalkyl groups as well as cyclopropyl (C3) and cyclobutyl (C4). Examples of C3-8 cycloalkyl groups include the aforementioned C3-6 cycloalkyl groups as well as cycloheptyl (C7) and cyclooctyl (C8). Unless otherwise specified, each instance of a cycloalkyl group is independently unsubstituted (an “unsubstituted cycloalkyl”) or substituted (a “substituted cycloalkyl”) with one or more substituents. In certain embodiments, the cycloalkyl group is unsubstituted C3-10 cycloalkyl. In certain embodiments, the cycloalkyl group is substituted C3-10 cycloalkyl.

“Aryl” refers to a radical of a monocyclic or polycyclic (e.g., bicyclic or tricyclic) 4n+2 aromatic ring system (e.g., having 6, 10, or 14 pi electrons shared in a cyclic array) having 6-14 ring carbon atoms and zero heteroatoms provided in the aromatic ring system (“C6-14 aryl”). In some embodiments, an aryl group has six ring carbon atoms (“C6 aryl”; e.g., phenyl). In some embodiments, an aryl group has ten ring carbon atoms (“C10 aryl”; e.g., naphthyl such as 1-naphthyl and 2-naphthyl). In some embodiments, an aryl group has fourteen ring carbon atoms (“C14 aryl”; e.g., anthracyl). “Aryl” also includes ring systems wherein the aryl ring, as defined above, is fused with one or more carbocyclyl or heterocyclyl groups wherein the radical or point of attachment is on the aryl ring, and in such instances, the number of carbon atoms continues to designate the number of carbon atoms in the aryl ring system. Unless otherwise specified, each instance of an aryl group is independently optionally substituted, i.e., unsubstituted (an “unsubstituted aryl”) or substituted (a “substituted aryl”) with one or more substituents. In certain embodiments, the aryl group is unsubstituted C6-14 aryl. In certain embodiments, the aryl group is substituted C6-14 aryl.
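
The “4n+2” condition in the aryl definition can be checked arithmetically: a pi electron count qualifies when it equals 4n+2 for some non-negative integer n. The short Python sketch below is illustrative only; the function name satisfies_4n_plus_2 and the example table are hypothetical. The table assumes that each ring carbon atom of phenyl, naphthyl, and anthracyl contributes one pi electron, which matches the 6, 10, and 14 pi electron counts given in the text.

```python
def satisfies_4n_plus_2(pi_electrons):
    """Return True if the count equals 4n + 2 for some integer n >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# Aryl examples from the text, mapped to their pi electron counts
# (one pi electron per ring carbon atom, an assumption of this sketch).
ARYL_EXAMPLES = {"phenyl": 6, "naphthyl": 10, "anthracyl": 14}

for name, count in ARYL_EXAMPLES.items():
    n = (count - 2) // 4
    print(f"{name}: {count} pi electrons, n = {n}, "
          f"4n+2 satisfied: {satisfies_4n_plus_2(count)}")
```

Counts such as 4 or 8 fail the check, which is why only the 6, 10, and 14 electron systems appear among the aryl examples.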

“Aralkyl” is a subset of alkyl and aryl and refers to an optionally substituted alkyl group substituted by an optionally substituted aryl group. In certain embodiments, the aralkyl is optionally substituted benzyl. In certain embodiments, the aralkyl is benzyl. In certain embodiments, the aralkyl is optionally substituted phenethyl. In certain embodiments, the aralkyl is phenethyl. In certain embodiments, the aralkyl is 7-phenylheptanyl. In certain embodiments, the aralkyl is C7 alkyl substituted by an optionally substituted aryl group (e.g., phenyl). In certain embodiments, the aralkyl is a C7-C10 alkyl group substituted by an optionally substituted aryl group (e.g., phenyl).

“Partially unsaturated” refers to a group that includes at least one double or triple bond. A “partially unsaturated” ring system is further intended to encompass rings having multiple sites of unsaturation but is not intended to include aromatic groups (e.g., aryl or heteroaryl groups) as defined in this application. Likewise, “saturated” refers to a group that does not contain a double or triple bond, i.e., contains all single bonds.
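
The distinction drawn above among saturated, partially unsaturated, and aromatic groups can be expressed as a simple decision rule. The Python sketch below is one possible encoding, under the assumption that a group is described by a list of its bond orders (1, 2, or 3) together with a flag indicating whether it is aromatic; the function saturation_class and this input representation are hypothetical and chosen only for illustration.

```python
def saturation_class(bond_orders, aromatic=False):
    """Classify a group as aromatic, partially unsaturated, or saturated.

    bond_orders: iterable of integer bond orders (1, 2, or 3) in the group.
    aromatic:    True for aryl or heteroaryl groups, which the definition
                 excludes from "partially unsaturated".
    """
    if aromatic:
        return "aromatic"
    if any(order >= 2 for order in bond_orders):
        return "partially unsaturated"
    return "saturated"


# Ring bonds of three six-membered carbocycles named in the text.
print(saturation_class([1, 1, 1, 1, 1, 1]))   # cyclohexyl
print(saturation_class([2, 1, 1, 1, 1, 1]))   # cyclohexenyl
print(saturation_class([2, 1, 2, 1, 1, 1]))   # cyclohexadienyl
```

A ring with several double bonds, such as cyclohexadienyl, is still classed as partially unsaturated, whereas phenyl would be passed with aromatic=True and excluded from that class.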

The term “optionally substituted” means substituted or unsubstituted.

Alkyl, alkenyl, alkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl groups are optionally substituted (e.g., “substituted” or “unsubstituted” alkyl, “substituted” or “unsubstituted” alkenyl, “substituted” or “unsubstituted” alkynyl, “substituted” or “unsubstituted” carbocyclyl, “substituted” or “unsubstituted” heterocyclyl, “substituted” or “unsubstituted” aryl or “substituted” or “unsubstituted” heteroaryl group). In general, the term “substituted,” whether preceded by the term “optionally” or not, means that at least one hydrogen present on a group (e.g., a carbon or nitrogen atom) is replaced with a permissible substituent, e.g., a substituent which upon substitution results in a stable compound, e.g., a compound which does not spontaneously undergo transformation such as by rearrangement, cyclization, elimination, or other reaction. Unless otherwise indicated, a “substituted” group has a substituent at one or more substitutable positions of the group, and when more than one position in any given structure is substituted, the substituent is either the same or different at each position. The term “substituted” is contemplated to include substitution with all permissible substituents of organic compounds, and includes any of the substituents described in this application that results in the formation of a stable compound. The present invention contemplates any and all such combinations in order to arrive at a stable compound. For purposes of this invention, heteroatoms such as nitrogen may have hydrogen substituents and/or any suitable substituent as described in this application that satisfies the valencies of the heteroatoms and results in the formation of a stable moiety.

Exemplary carbon atom substituents include, but are not limited to, halogen, -CN, -NO₂, -N₃, -SO₂H, -SO₃H, -OH, -OR ^aa, -ON(R ^bb)₂, -N(R ^bb)₂, -N(R ^bb)₃⁺X⁻, -N(OR ^cc)R ^bb, -SH, -SR ^aa, -SSR ^cc, -C(=O)R ^aa, -CO₂H, -CHO, -C(OR ^cc)₂, -CO₂R ^aa, -OC(=O)R ^aa, -OCO₂R ^aa, -C(=O)N(R ^bb)₂, -OC(=O)N(R ^bb)₂, -NR ^bbC(=O)R ^aa, -NR ^bbCO₂R ^aa, -NR ^bbC(=O)N(R ^bb)₂, -C(=NR ^bb)R ^aa, -C(=NR ^bb)OR ^aa, -OC(=NR ^bb)R ^aa, -OC(=NR ^bb)OR ^aa, -C(=NR ^bb)N(R ^bb)₂, -OC(=NR ^bb)N(R ^bb)₂, -NR ^bbC(=NR ^bb)N(R ^bb)₂, -C(=O)NR ^bbSO₂R ^aa, -NR ^bbSO₂R ^aa, -SO₂N(R ^bb)₂, -SO₂R ^aa, -SO₂OR ^aa, -OSO₂R ^aa, -S(=O)R ^aa, -OS(=O)R ^aa, -Si(R ^aa)₃, -OSi(R ^aa)₃, -C(=S)N(R ^bb)₂, -C(=O)SR ^aa, -C(=S)SR ^aa, -SC(=S)SR ^aa, -SC(=O)SR ^aa, -OC(=O)SR ^aa, -SC(=O)OR ^aa, -SC(=O)R ^aa, -P(=O)(R ^aa)₂, -P(=O)(OR ^cc)₂, -NR ^bbP(=O)(OR ^cc)₂, -NR ^bbP(=O)(N(R ^bb)₂)₂, -P(R ^cc)₂, -P(OR ^cc)₂, -P(R ^cc)₃⁺X⁻, -P(OR ^cc)₃⁺X⁻, -P(R ^cc)₄, -P(OR ^cc)₄, -OP(R ^cc)₂, -OP(R ^cc)₃⁺X⁻, -OP(OR ^cc)₂, -OP(OR ^cc)₃⁺X⁻, -OP(R ^cc)₄, -OP(OR ^cc)₄, -B(R ^aa)₂, -B(OR ^cc)₂, -BR ^aa(OR ^cc), C1-10 alkyl, C1-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, heteroC1-10 alkyl, heteroC2-10 alkenyl, heteroC2-10 alkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl; wherein:

each instance of R ^aa is, independently, selected from C1-10 alkyl, C1-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, heteroC1-10 alkyl, heteroC2-10 alkenyl, heteroC2-10 alkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl, or two R ^aa groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R ^dd groups;

each instance of R ^bb is, independently, selected from hydrogen, -OH, -OR ^aa, -N(R ^cc)₂, -CN, -C(=O)R ^aa, -C(=O)N(R ^cc)₂, -CO₂R ^aa, -SO₂R ^aa, -C(=NR ^cc)OR ^aa, -C(=NR ^cc)N(R ^cc)₂, -SO₂N(R ^cc)₂, -SO₂R ^cc, -SO₂OR ^cc, -SOR ^aa, -C(=S)N(R ^cc)₂, -C(=O)SR ^cc, -C(=S)SR ^cc, -P(=O)(R ^aa)₂, -P(=O)(OR ^cc)₂, -P(=O)(N(R ^cc)₂)₂, C1-10 alkyl, C1-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, heteroC1-10 alkyl, heteroC2-10 alkenyl, heteroC2-10 alkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl, or two R ^bb groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R ^dd groups; wherein X⁻ is a counterion;

each instance of R ^cc is, independently, selected from hydrogen, C1-10 alkyl, C1-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, heteroC1-10 alkyl, heteroC2-10 alkenyl, heteroC2-10 alkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl, or two R ^cc groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R ^dd groups;

each instance of R ^dd is, independently, selected from halogen, -CN, -NO₂, -N₃, -SO₂H, -SO₃H, -OH, -OR ^ee, -ON(R ^ff)₂, -N(R ^ff)₂, -N(R ^ff)₃⁺X⁻, -N(OR ^ee)R ^ff, -SH, -OC(=O)N(R ^ff)₂, -NR ^ffC(=O)R ^ee, -NR ^ffCO₂R ^ee, -NR ^ffC(=O)N(R ^ff)₂, -C(=NR ^ff)OR ^ee, -OC(=NR ^ff)R ^ee, -OC(=NR ^ff)OR ^ee, -C(=NR ^ff)N(R ^ff)₂, -OC(=NR ^ff)N(R ^ff)₂, -NR ^ffC(=NR ^ff)N(R ^ff)₂, -NR ^ffSO₂R ^ee, -SO₂N(R ^ff)₂, -SO₂R ^ee, -SO₂OR ^ee, -OSO₂R ^ee, -S(=O)R ^ee, -Si(R ^ee)₃, -OSi(R ^ee)₃, -C(=S)N(R ^ff)₂, -C(=O)SR ^ee, -C(=S)SR ^ee, -SC(=S)SR ^ee, -P(=O)(OR ^ee)₂, -P(=O)(R ^ee)₂, -OP(=O)(R ^ee)₂, -OP(=O)(OR ^ee)₂, C1-6 alkyl, C1-6 perhaloalkyl, C2-6 alkenyl, C2-6 alkynyl, heteroC1-6 alkyl, heteroC2-6 alkenyl, heteroC2-6 alkynyl, C3-10 carbocyclyl, 3-10 membered heterocyclyl, C6-10 aryl, 5-10 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R ^gg groups, or two geminal R ^dd substituents can be joined to form =O or =S; wherein X⁻ is a counterion;

each instance of R ^ee is, independently, selected from C1-6 alkyl, C1-6 perhaloalkyl, C2-6 alkenyl, C2-6 alkynyl, heteroC1-6 alkyl, heteroC2-6 alkenyl, heteroC2-6 alkynyl, C3-10 carbocyclyl, C6-10 aryl, 3-10 membered heterocyclyl, and 3-10 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R ^gg groups;

each instance of R ^ff is, independently, selected from hydrogen, C1-6 alkyl, C1-6 perhaloalkyl, C2-6 alkenyl, C2-6 alkynyl, heteroC1-6 alkyl, heteroC2-6 alkenyl, heteroC2-6 alkynyl, C3-10 carbocyclyl, 3-10 membered heterocyclyl, C6-10 aryl and 5-10 membered heteroaryl, or two R ^ff groups are joined to form a 3-10 membered heterocyclyl or 5-10 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R ^gg groups; and

each instance of R ^gg is, independently, halogen, -CN, -NO₂, -N₃, -SO₂H, -SO₃H, -OH, -OC1-6 alkyl, -ON(C1-6 alkyl)₂, -N(C1-6 alkyl)₂, -N(C1-6 alkyl)₃⁺X⁻, -NH(C1-6 alkyl)₂⁺X⁻, -NH₂(C1-6 alkyl)⁺X⁻, -NH₃⁺X⁻, -N(OC1-6 alkyl)(C1-6 alkyl), -N(OH)(C1-6 alkyl), -NH(OH), -SH, -SC1-6 alkyl, -SS(C1-6 alkyl), -C(=O)(C1-6 alkyl), -CO₂H, -CO₂(C1-6 alkyl), -OC(=O)(C1-6 alkyl), -OCO₂(C1-6 alkyl), -C(=O)NH₂, -C(=O)N(C1-6 alkyl)₂, -OC(=O)NH(C1-6 alkyl), -NHC(=O)(C1-6 alkyl), -N(C1-6 alkyl)C(=O)(C1-6 alkyl), -NHCO₂(C1-6 alkyl), -NHC(=O)N(C1-6 alkyl)₂, -NHC(=O)NH(C1-6 alkyl), -NHC(=O)NH₂, -C(=NH)O(C1-6 alkyl), -OC(=NH)(C1-6 alkyl), -OC(=NH)OC1-6 alkyl, -C(=NH)N(C1-6 alkyl)₂, -C(=NH)NH(C1-6 alkyl), -C(=NH)NH₂, -OC(=NH)N(C1-6 alkyl)₂, -OC(NH)NH(C1-6 alkyl), -OC(NH)NH₂, -NHC(NH)N(C1-6 alkyl)₂, -NHC(=NH)NH₂, -NHSO₂(C1-6 alkyl), -SO₂N(C1-6 alkyl)₂, -SO₂NH(C1-6 alkyl), -SO₂NH₂, -SO₂C1-6 alkyl, -SO₂OC1-6 alkyl, -OSO₂C1-6 alkyl, -SOC1-6 alkyl, -Si(C1-6 alkyl)₃, -OSi(C1-6 alkyl)₃, -C(=S)N(C1-6 alkyl)₂, -C(=S)NH(C1-6 alkyl), -C(=S)NH₂, -C(=O)S(C1-6 alkyl), -C(=S)SC1-6 alkyl, -SC(=S)SC1-6 alkyl, -P(=O)(OC1-6 alkyl)₂, -P(=O)(C1-6 alkyl)₂, -OP(=O)(C1-6 alkyl)₂, -OP(=O)(OC1-6 alkyl)₂, C1-6 alkyl, C1-6 perhaloalkyl, C2-6 alkenyl, C2-6 alkynyl, heteroC1-6 alkyl, heteroC2-6 alkenyl, heteroC2-6 alkynyl, C3-10 carbocyclyl, C6-10 aryl, 3-10 membered heterocyclyl, 5-10 membered heteroaryl; or two geminal R ^gg substituents can be joined to form =O or =S; wherein X⁻ is a counterion.
Alternatively, two geminal hydrogens on a carbon atom are replaced with the group =0, =S, =NN(R ^bb) ₂, =NNR ^bbC(=O)R ^aa, =NNR ^bbC(=O)OR ^aa, =NNR ^bbS(=O)2R ^aa, =NR ^bb, or =NOR ^CC; wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R ^dd groups; wherein X- is a counterion; wherein: each instance of R ^aa is, independently, selected from Ci-io alkyl, Ci-io perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, heteroCi-10 alkyl, heteroC2-ioalkenyl, heteroC2-ioalkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, Ce-14 aryl, and 5-14 membered heteroaryl, or two R ^aa groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R ^dd groups; each instance of R ^bb is, independently, selected from hydrogen, -OH, -OR ^aa, -N(R ^CC) ₂, -CN, -C(=O)R ^aa, -C(=O)N(R ^CC) ₂, -CO ₂R ^aa, -SO ₂R ^aa, -C(=NR ^cc)OR ^aa, -C(=NR ^CC)N(R ^CC) ₂, -SO ₂N(R ^CC) ₂, -SO ₂R ^CC, -SO ₂OR ^CC, -SOR ^aa, -C(=S)N(R ^CC) ₂, -C(=O)SR ^CC, -C(=S)SR ^CC, -P(=O)(R ^aa) ₂, -P(=O)(OR ^CC) ₂, -P(=O)(N(R ^CC) ₂)2, Ci-10 alkyl, Ci-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, heteroCi-ioalkyl, heteroC2-ioalkenyl, heteroC2-ioalkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, Ce-14 aryl, and 5-14 membered heteroaryl, or two R ^bb groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R ^dd groups; wherein X- is a counterion; each instance of R ^cc is, independently, selected from hydrogen, Ci-10 alkyl, Ci-10 perhaloalkyl, C2-10 alkenyl, C2-10 
alkynyl, heteroCi-10 alkyl, heteroC2-io alkenyl, heteroC2-io alkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, Ce-14 aryl, and 5-14 membered heteroaryl, or two R ^cc groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R ^dd groups; each instance of R ^dd is, independently, selected from halogen, -CN, -NO2, -N3, -SO2H, -SO3H, -OH, -OR ^ee, -ON(R ^ff) ₂, -N(R ^ff) ₂, -N(R ^ff)3 ⁺X“, -N(OR ^ee)R ^ff, -SH, -SR ^ee, -P(=O)(OR ^ee) ₂, -P(=O)(R ^ee) ₂, -OP(=O)(R ^ee) ₂, -OP(=O)(OR ^ee) ₂, Ci- ₆ alkyl, Ci- ₆ perhaloalkyl, C2-6 alkenyl, C2-6 alkynyl, heteroCi-ealkyl, heteroC2-6alkenyl, heteroC2-6alkynyl, C3-10 carbocyclyl, 3-10 membered heterocyclyl, Ce-io aryl, 5-10 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R ^gg groups, or two geminal R ^dd substituents can be joined to form =0 or =S; wherein X- is a counterion; each instance of R ^ee is, independently, selected from Ci-6 alkyl, Ci-6 perhaloalkyl, C2-6 alkenyl, C2-6 alkynyl, heteroCi-6 alkyl, heteroC2-6alkenyl, heteroC2-6 alkynyl, C3-10 carbocyclyl, Ce-io aryl, 3-10 membered heterocyclyl, and 3-10 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R ^gg groups; each instance of R ^ff is, independently, selected from hydrogen, Ci-6 alkyl, Ci-6 perhaloalkyl, C2-6 alkenyl, C2-6 alkynyl, heteroCi-ealkyl, heteroC2-6alkenyl, heteroC2-6alkynyl, C3-10 carbocyclyl, 3-10 membered heterocyclyl, Ce-io aryl and 5-10 membered heteroaryl, or two R ^ff groups are joined to form a 3-10 membered heterocyclyl 
or 5-10 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R ^gg groups; and each instance of R ^gg is, independently, halogen, -CN, -NO2, -N3, -SO2H, -SO3H, -OH, -OC1-6 alkyl, -ON(CI- ₆ alkyl) ₂, -N(CI- ₆ alkyl) ₂, -N(CI- ₆ alkyl) ₃ ⁺X-, -NH(CI- ₆ alkyl) ₂ ⁺X-, -NH ₂(CI- ₆ alkyl) ⁺X“, -NH ₃ ⁺X“, -N(OCI- ₆ alkyl)(Ci- ₆ alkyl), -N(OH)(CI- ₆ alkyl), -NH(OH), -SH, -SC1-6 alkyl, -SS(Ci- ₆ alkyl), -C(=O)(Ci- ₆ alkyl), -CO ₂H, -CO ₂(Ci- ₆ alkyl), -OC(=O)(Ci- ₆ alkyl), -OCO ₂(Ci- ₆ alkyl), -C(=O)NH ₂, -C(=O)N(CI- ₆ alkyl) ₂, -OC(=O)NH(CI- ₆ alkyl), -NHC(=O)( Ci-6 alkyl), -N(CI- ₆ alkyl)C(=O)( Ci-6 alkyl), -NHCO ₂(CI- ₆ alkyl), -NHC(=O)N(CI- ₆ alkyl) ₂, -NHC(=O)NH(CI- ₆ alkyl), -NHC(=0)NH ₂, -C(=NH)O(CI- ₆ alkyl), -OC(=NH)(CI- ₆ alkyl), -OC(=NH)OCI- ₆ alkyl, -C(=NH)N(CI- ₆ alkyl) ₂, -C(=NH)NH(CI- ₆ alkyl), -C(=NH)NH ₂, -OC(=NH)N(CI- ₆ alkyl) ₂, -OC(NH)NH(Ci- 6 alkyl), -0C(NH)NH ₂, -NHC(NH)N(CI- ₆ alkyl) ₂, -NHC(=NH)NH ₂, -NHSO ₂(CI- ₆ alkyl), -SO ₂N(CI- ₆ alkyl) ₂, -SO ₂NH(CI- ₆ alkyl), -SO2NH2, -SO2C1-6 alkyl, -SO2OC1-6 alkyl, -OSO2C1-6 alkyl, -SOC1-6 alkyl, -Si(Ci- ₆ alkyl) ₃, -OSi(Ci- ₆ alkyl) ₃ -C(=S)N(CI- ₆ alkyl) ₂, C(=S)NH(CI- ₆ alkyl), C(=S)NH ₂, -C(=O)S(Ci- ₆ alkyl), -C(=S)SCi- ₆ alkyl, -SC(=S)SCi- ₆ alkyl, -P(=O)(OCi- ₆ alkyl) ₂, -P(=O)(Ci- ₆ alkyl) ₂, -OP(=O)(Ci- ₆ alkyl) ₂, -OP(=O)(OCi- ₆ alkyl) ₂, Ci-6 alkyl, Ci-6 perhaloalkyl, C ₂-6 alkenyl, C ₂-6 alkynyl, heteroCi-ealkyl, heteroC ₂. ealkenyl, heteroC ₂-6alkynyl, C3-10 carbocyclyl, Ce-io aryl, 3-10 membered heterocyclyl, 5-10 membered heteroaryl; or two geminal R ^gg substituents can be joined to form =0 or =S; wherein X- is a counterion.

A “counterion” or “anionic counterion” is a negatively charged group associated with a positively charged group in order to maintain electronic neutrality. An anionic counterion may be monovalent (i.e., including one formal negative charge). An anionic counterion may also be multivalent (i.e., including more than one formal negative charge), such as divalent or trivalent. Exemplary counterions include halide ions (e.g., F⁻, Cl⁻, Br⁻, I⁻), NO₃⁻, ClO₄⁻, OH⁻, H₂PO₄⁻, HCO₃⁻, HSO₄⁻, sulfonate ions (e.g., methanesulfonate, trifluoromethanesulfonate, p-toluenesulfonate, benzenesulfonate, 10-camphor sulfonate, naphthalene-2-sulfonate, naphthalene-1-sulfonic acid-5-sulfonate, ethan-1-sulfonic acid-2-sulfonate, and the like), carboxylate ions (e.g., acetate, propanoate, benzoate, glycerate, lactate, tartrate, glycolate, gluconate, and the like), BF₄⁻, PF₄⁻, PF₆⁻, AsF₆⁻, SbF₆⁻, B[3,5-(CF₃)₂C₆H₃]₄⁻, B(C₆F₅)₄⁻, BPh₄⁻, Al(OC(CF₃)₃)₄⁻, and carborane anions (e.g., CB₁₁H₁₂⁻ or (HCB₁₁Me₅Br₆)⁻). Exemplary counterions which may be multivalent include CO₃²⁻, HPO₄²⁻, PO₄³⁻, B₄O₇²⁻, SO₄²⁻, S₂O₃²⁻, carboxylate anions (e.g., tartrate, citrate, fumarate, maleate, malate, malonate, gluconate, succinate, glutarate, adipate, pimelate, suberate, azelate, sebacate, salicylate, phthalates, aspartate, glutamate, and the like), and carboranes.

As used herein, the term “salt” refers to any and all salts, and encompasses pharmaceutically acceptable salts. Salts include ionic compounds that result from the neutralization reaction of an acid and a base. A salt is composed of one or more cations (positively charged ions) and one or more anions (negative ions) so that the salt is electrically neutral (without a net charge). Salts of the compounds of this invention include those derived from inorganic and organic acids and bases.

The term “solvate” refers to forms of a compound that are associated with a solvent, usually by a solvolysis reaction. This physical association may include hydrogen bonding. Conventional solvents include water, methanol, ethanol, acetic acid, DMSO, THF, diethyl ether, and the like. The compounds described herein may be prepared, e.g., in crystalline form, and may be solvated. Suitable solvates include pharmaceutically acceptable solvates and further include both stoichiometric solvates and non-stoichiometric solvates. In certain instances, the solvate will be capable of isolation, for example, when one or more solvent molecules are incorporated in the crystal lattice of a crystalline solid. “Solvate” encompasses both solution-phase and isolable solvates. Representative solvates include hydrates, ethanolates, and methanolates.

The term “hydrate” refers to a compound that is associated with water. Typically, the number of the water molecules contained in a hydrate of a compound is in a definite ratio to the number of the compound molecules in the hydrate. Therefore, a hydrate of a compound may be represented, for example, by the general formula R·x H₂O, wherein R is the compound and wherein x is a number greater than 0. A given compound may form more than one type of hydrate, including, e.g., monohydrates (x is 1), lower hydrates (x is a number greater than 0 and smaller than 1, e.g., hemihydrates (R·0.5 H₂O)), and polyhydrates (x is a number greater than 1, e.g., dihydrates (R·2 H₂O) and hexahydrates (R·6 H₂O)).
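The classification of hydrates by the ratio x can be illustrated with a minimal Python sketch. The function names and the rounded molar mass of water (18.015 g/mol) are assumptions introduced here for illustration only and are not part of the disclosure.

```python
# Illustrative sketch only: names and constants below are assumptions,
# not taken from the disclosure.

WATER_MOLAR_MASS = 18.015  # g/mol, rounded value assumed for illustration


def classify_hydrate(x: float) -> str:
    """Classify a hydrate R·x H2O by its water-to-compound ratio x."""
    if x <= 0:
        raise ValueError("x must be greater than 0 for a hydrate")
    if x < 1:
        return "lower hydrate"   # e.g., hemihydrate, x = 0.5
    if x == 1:
        return "monohydrate"
    return "polyhydrate"         # e.g., dihydrate (x = 2), hexahydrate (x = 6)


def hydrate_molar_mass(anhydrous_molar_mass: float, x: float) -> float:
    """Molar mass of R·x H2O given the molar mass of the anhydrous compound R."""
    if x <= 0:
        raise ValueError("x must be greater than 0 for a hydrate")
    return anhydrous_molar_mass + x * WATER_MOLAR_MASS
```

For example, under these assumptions a compound with x = 0.5 would be reported as a lower hydrate and one with x = 6 as a polyhydrate.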

The term “tautomers” refers to compounds that are interchangeable forms of a particular compound structure, and that vary in the displacement of hydrogen atoms and electrons. Thus, two structures may be in equilibrium through the movement of π electrons and an atom (usually H). For example, enols and ketones are tautomers because they are rapidly interconverted by treatment with either acid or base. Another example of tautomerism is the aci- and nitro- forms of phenylnitromethane, which are likewise formed by treatment with acid or base. Tautomeric forms may be relevant to the attainment of the optimal chemical reactivity and biological activity of a compound of interest.

It is also to be understood that compounds that have the same molecular formula but differ in the nature or sequence of bonding of their atoms or the arrangement of their atoms in space are termed “isomers.” Isomers that differ in the arrangement of their atoms in space are termed “stereoisomers.”

Stereoisomers that are not mirror images of one another are termed “diastereomers” and those that are non-superimposable mirror images of each other are termed “enantiomers.” When a compound has an asymmetric center, for example, one that is bonded to four different groups, a pair of enantiomers is possible. An enantiomer can be characterized by the absolute configuration of its asymmetric center and described by the R- and S-sequencing rules of Cahn and Prelog. An enantiomer can also be characterized by the manner in which the molecule rotates the plane of polarized light, and designated as dextrorotatory or levorotatory (i.e., as (+) or (-)-isomers respectively). A chiral compound can exist as either an individual enantiomer or as a mixture of enantiomers. A mixture containing equal proportions of the enantiomers is called a “racemic mixture.”

The term “co-crystal” refers to a crystalline structure comprising at least two different components (e.g., a compound described in this application and an acid), wherein each of the components is independently an atom, ion, or molecule. In certain embodiments, none of the components is a solvent. In certain embodiments, at least one of the components is a solvent. A co-crystal of a compound and an acid is different from a salt formed from a compound and the acid. In the salt, a compound described in this application is complexed with the acid in a way that proton transfer (e.g., a complete proton transfer) from the acid to a compound described in this application easily occurs at room temperature. In the co-crystal, however, a compound described in this application is complexed with the acid in a way that proton transfer from the acid to a compound described in this application does not easily occur at room temperature. In certain embodiments, in the co-crystal, there is no proton transfer from the acid to a compound described in this application. In certain embodiments, in the co-crystal, there is partial proton transfer from the acid to a compound described in this application. Co-crystals may be useful to improve the properties (e.g., solubility, stability, and ease of formulation) of a compound described in this application.

The term “polymorphs” refers to a crystalline form of a compound (or a salt, hydrate, or solvate thereof) in a particular crystal packing arrangement. All polymorphs of the same compound have the same elemental composition. Different crystalline forms usually have different X-ray diffraction patterns, infrared spectra, melting points, density, hardness, crystal shape, optical and electrical properties, stability, and solubility. Recrystallization solvent, rate of crystallization, storage temperature, and other factors may cause one crystal form to dominate. Various polymorphs of a compound can be prepared by crystallization under different conditions.

The term “prodrug” refers to compounds, including derivatives of the compounds described herein, that have cleavable groups and become, by solvolysis or under physiological conditions, the compounds described herein.

In some embodiments, bacterial host cells associated with the disclosure produce one or more compounds from the International Fragrance Association (IFRA) list of fragrances. See, e.g., https://ifrafragrance.org/priorities/ingredients/ifra-transparency-list, which is incorporated by reference in this disclosure.

Skin commensal bacterial host cells

Aspects of the disclosure relate to engineered skin commensal bacterial cells. As used herein, a “skin commensal” bacterial cell refers to a bacterial cell that is a component of the skin microbiome and/or that is capable of colonizing the skin. A “skin commensal” bacterial cell would not include, e.g., a bacterial cell that transiently or temporarily comes into contact with the skin but that is not a component of the skin microbiome and is not capable of colonizing the skin.

The term “host cell” refers to a cell that can be used to express a polynucleotide, such as a polynucleotide that encodes an enzyme used in biosynthesis of anthranilate or derivatives thereof. The terms “genetically modified host cell,” “recombinant host cell,” and “recombinant strain” are used interchangeably and refer to host cells that have been genetically modified by, e.g., cloning and transformation methods, or by other methods known in the art (e.g., selective editing methods).

Suitable host cells include any skin commensal bacteria, bacterial skin symbiont, or any bacteria that are a component of the skin microbiome and/or that are capable of colonizing the skin known in the art, all of which are contemplated for use herein. Skin commensal bacteria include, but are not limited to, species of: Propionibacterium spp., Bifidobacterium spp., Dermabacter spp., Streptococcus spp., Micrococcus spp., Veillonella spp., Staphylococcus spp., Corynebacterium spp., Brevibacterium spp., Lactococcus spp., Lactobacillus spp., Enterococcus spp., Pediococcus spp., Leuconostoc spp., or Oenococcus spp., and mixtures thereof. In some embodiments, the host cell is a gram positive, gram negative, or gram-variable bacterial cell.

In some embodiments, a skin commensal bacterial cell compatible with aspects of the disclosure can be from a bacterial strain that is pathogenic or has been reported to be pathogenic. For example, in some embodiments, a bacterial strain that is pathogenic or has been reported to be pathogenic can be engineered to reduce pathogenicity to be compatible with aspects of the disclosure.

In some nonlimiting embodiments, suitable bacterial host cells include cells from one or more of the following species: Propionibacterium acnes, Corynebacterium tuberculostearicum, Staphylococcus hominis, Staphylococcus epidermidis, Streptococcus mitis, Staphylococcus warneri, Streptococcus oralis, Staphylococcus capitis, Streptococcus pseudopneumoniae, Corynebacterium simulans, Streptococcus sanguinis, Corynebacterium fastidiosum, Staphylococcus haemolyticus, Corynebacterium afermentans, Micrococcus luteus, Corynebacterium aurimucosum, Enhydrobacter aerosaccus, Corynebacterium kroppenstedtii, Veillonella parvula, Corynebacterium amycolatum, and Corynebacterium resistens cells. Other suitable host cells are further described in, and incorporated by reference from, Byrd, A. L., Belkaid, Y., & Segre, J. A. (2018). “The human skin microbiome.” Nature Reviews Microbiology, 16(3), 143-155.

In some nonlimiting embodiments, suitable bacterial host cells include cells from Corynebacterium spp. species, such as C. ammoniagenes, C. mucifaciens, C. singulare, C. tuberculostearicum, C. accolens, C. coyleae, C. lipophiloflavum, C. afermentans, C. amycolatum, C. simulans, C. xerosis, C. casei, C. variabile, and C. minutissimum.

The disclosed host cells are exemplified with Staphylococcus epidermidis and Corynebacterium spp., but are also applicable to other skin commensal host cells, as would be understood by one of ordinary skill in the art. Staphylococcus epidermidis is a Gram-positive bacterium that is ubiquitous in the human skin and mucosal flora. Staphylococcus epidermidis plays an important role in cutaneous immunity and in maintaining microbial community homeostasis. Corynebacterium is one of the three most abundant bacterial genera on human skin, found especially in moist sites. It has been observed to interact with the cutaneous immune system, and some strains (e.g., Corynebacterium mastitidis) have been found to protect against pathogens.

In some embodiments, the host cell is Staphylococcus epidermidis. In some embodiments, the host cell is Corynebacterium tuberculostearicum. In some embodiments, the host cell is Corynebacterium ammoniagenes. In some embodiments, the host cell is Corynebacterium mucifaciens. In some embodiments, the host cell is Corynebacterium singulare. In some embodiments, the host cell is Corynebacterium accolens. In some embodiments, the host cell is Corynebacterium mastitidis.

In various embodiments, bacterial cells that may be used as host cells in the practice of the disclosure may be readily accessible to the public from a number of culture collections such as American Type Culture Collection (ATCC), Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM), Centraalbureau Voor Schimmelcultures (CBS), and Agricultural Research Service Patent Culture Collection, Northern Regional Research Center (NRRL).

In various embodiments, bacterial cells that may be used as host cells in the practice of the disclosure may not be publicly available and may be derived from self-collection samples. The term “self-collection” as used in this application, refers to a swab-based method of collection used to collect various cells from a subject (e.g., the subject’s skin).

The term “cell,” as used in this application, may refer to a single cell or a population of cells, such as a population of cells belonging to the same cell line or strain. Use of the singular term “cell” should not be construed to refer exclusively to a single cell rather than a population of cells. The host cell may comprise genetic modifications relative to a wild-type counterpart.

Ability to control growth of the modified bacterial cells associated with the disclosure may be useful for methods of the present disclosure. For example, the modified bacteria may colonize the skin of a subject and could survive on the skin for longer than the desired timeframe. Methods of killing the modified bacterial cells may include, but are not limited to, application of salicylic acid to the affected skin, application of an agent that alters the pH of the affected skin, and/or application of an agent that alters the temperature of the affected skin.

Modified bacterial host cells of the present disclosure may be further modified to comprise one or more genetic kill switches. The term “genetic kill switch” as used in this application refers to a genetic circuit designed to prevent the engineered bacterial cells from surviving outside of their specific purpose and context.

Tryptophan biosynthesis genes

Bacterial host cells associated with the disclosure may be modified to express one or more genes in the tryptophan biosynthesis pathway to increase accumulation of a tryptophan precursor (FIG. 1). In some embodiments, anthranilate can be synthesized in bacterial host cells that overexpress one or more genes in the tryptophan biosynthesis pathway. In some embodiments, bacterial host cells are engineered to express one or more of: TrpC, TrpB, TrpA, TrpE(M298R, S40R, or S40F), TrpG, and AroG(D146N) (Table 1).

Table 1. Non-limiting Examples of Enzymes in the Tryptophan Biosynthesis Pathway

In some embodiments, bacterial host cells associated with the disclosure are modified to express a tryptophan biosynthesis protein C. In some embodiments, the tryptophan biosynthesis protein C is encoded by the TrpC gene. In some embodiments, a tryptophan biosynthesis protein C comprises a protein sequence or a nucleic acid sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to the protein sequence of SEQ ID NO: 2 or the nucleic acid sequence of SEQ ID NO: 19, respectively. In some embodiments, a tryptophan biosynthesis protein C comprises a sequence that is a conservatively substituted version of SEQ ID NO: 2.
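As a hedged illustration of how a percent identity threshold of this kind might be checked, the following Python sketch computes identity over two sequences that have already been aligned to equal length, with "-" marking gaps. The function names, the toy sequences in the usage note, and the choice of the full alignment length as the denominator are assumptions for illustration; other conventions for computing identity exist, and this sketch does not represent the method used in the disclosure.

```python
# Illustrative sketch only. Assumes the two sequences were aligned beforehand
# (equal length, "-" for gaps) and uses alignment length as the denominator.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity between two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    # A column counts as a match only if both residues are equal and not a gap.
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With the toy aligned pair "MKTA" and "MKTG", three of four columns match, so the pair would satisfy an "at least 70%" threshold but not an "at least 80%" threshold under these assumptions.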

In some embodiments, bacterial host cells associated with the disclosure are modified to express a tryptophan biosynthesis protein B. In some embodiments, the tryptophan biosynthesis protein B is encoded by the TrpB gene. In some embodiments, a tryptophan biosynthesis protein B comprises a protein sequence or a nucleic acid sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to the protein sequence of SEQ ID NO: 4 or the nucleic acid sequence of SEQ ID NO: 21, respectively. In some embodiments, a tryptophan biosynthesis protein B comprises a sequence that is a conservatively substituted version of SEQ ID NO: 4.

In some embodiments, bacterial host cells associated with the disclosure are modified to express a tryptophan biosynthesis protein A. In some embodiments, the tryptophan biosynthesis protein A is encoded by the TrpA gene. In some embodiments, a tryptophan biosynthesis protein A comprises a protein sequence or a nucleic acid sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to the protein sequence of SEQ ID NO: 5 or the nucleic acid sequence of SEQ ID NO: 22, respectively. In some embodiments, a tryptophan biosynthesis protein A comprises a sequence that is a conservatively substituted version of SEQ ID NO: 5.

In some embodiments, bacterial host cells associated with the disclosure are modified to express a feedback resistant tryptophan biosynthesis protein TrpE. In some embodiments, a feedback resistant tryptophan biosynthesis protein TrpE comprises a M298R, S40R, or S40F substitution. In some embodiments, the feedback resistant tryptophan biosynthesis protein TrpE is encoded by the TrpE^FBR (M298R, S40R, or S40F) gene. In some embodiments, the residue at position 298 of the TrpE protein is a Methionine (M). In some embodiments, the residue at position 298 of the TrpE protein is substituted. In some embodiments, the residue at position 298 of the TrpE protein is an Arginine (R). In some embodiments, the residue at position 40 of the TrpE protein is a Serine (S). In some embodiments, the residue at position 40 of the TrpE protein is substituted. In some embodiments, the residue at position 40 of the TrpE protein is an Arginine (R) or a Phenylalanine (F). In some embodiments, a feedback resistant tryptophan biosynthesis protein TrpE comprises a protein or nucleic acid sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to the protein sequence of SEQ ID NOs: 1, 6, 11, or 13, or the nucleic acid sequence of SEQ ID NOs: 18, 23, 28 or 30, respectively. In some embodiments, a feedback resistant tryptophan biosynthesis protein TrpE comprises a sequence that is a conservatively substituted version of SEQ ID NOs: 1, 6, 11, or 13.
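The substitution notation used above (e.g., M298R, S40R, S40F) names the original residue, its 1-based position, and the replacement residue. A minimal Python sketch of applying such a substitution to a protein sequence string is given below. The function name is hypothetical, and the sequence in the usage note is a toy placeholder rather than any SEQ ID NO of the disclosure.

```python
import re

# Illustrative sketch only: hypothetical helper, toy sequences.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(sequence: str, substitution: str) -> str:
    """Apply a point substitution such as 'S40R' (1-based position)."""
    match = _SUBSTITUTION.match(substitution)
    if match is None:
        raise ValueError(f"unrecognized substitution: {substitution!r}")
    original, position, replacement = (
        match.group(1), int(match.group(2)), match.group(3)
    )
    index = position - 1  # convert 1-based position to 0-based index
    if index < 0 or index >= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    # Check that the residue being replaced matches the notation.
    if sequence[index] != original:
        raise ValueError(
            f"expected {original} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]
```

For instance, on a toy 50-residue sequence with a serine at position 40, calling apply_substitution with "S40R" would return a sequence of the same length with arginine at that position; a mismatch between the notation and the actual residue raises an error instead of silently editing the sequence.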

In some embodiments, bacterial host cells associated with the disclosure are modified to increase expression of a tryptophan biosynthesis protein G. In some embodiments, the tryptophan biosynthesis protein G is encoded by the TrpG gene. In some embodiments, a tryptophan biosynthesis protein G comprises a protein sequence or a nucleic acid sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to the protein sequence of SEQ ID NOs: 3 or 12, or the nucleic acid sequence of SEQ ID NO: 20 or 29, respectively. In some embodiments, a tryptophan biosynthesis protein G comprises a sequence that is a conservatively substituted version of SEQ ID NO: 3 or 12.

In some embodiments, bacterial host cells associated with the disclosure are modified to increase expression of a feedback insensitive 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase (AroG). In some embodiments, the feedback insensitive AroG comprises a D146N substitution. In some embodiments, the feedback insensitive 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase (AroG) is encoded by the AroG (D146N) gene. In some embodiments, a feedback insensitive 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase (AroG) comprises a protein sequence or a nucleic acid sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to the protein sequence of SEQ ID NOs: 7 or 14, or the nucleic acid sequence of SEQ ID NO: 24 or 31, respectively. In some embodiments, a feedback insensitive 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase (AroG) comprises a sequence that is a conservatively substituted version of SEQ ID NO: 7 or 14.

Methyltransferases

Bacterial host cells associated with the disclosure may be modified to express a gene encoding a methyltransferase. In some embodiments, methyl anthranilate can be synthesized in bacterial host cells that are engineered to express one or more genes in the tryptophan biosynthesis pathway to increase accumulation of a tryptophan precursor and that are also engineered to express a gene encoding a methyltransferase (FIG. 1). In some embodiments, the gene in the tryptophan biosynthesis pathway is TrpC, TrpB, TrpA, TrpE(M298R, S40R, or S40F), TrpG, and/or AroG(D146N) (Table 1). In some embodiments, the gene encoding a methyltransferase is AAMT1, ANMT, and/or AAMT1 (Table 2).

Table 2. Non-limiting Examples of Methyltransferases

In some embodiments, bacterial host cells associated with the disclosure are modified to express an anthranilate O-methyltransferase 1 or anthranilic acid methyltransferase. In some embodiments, the anthranilate O-methyltransferase 1 or anthranilic acid methyltransferase is encoded by the AAMT1 gene. In some embodiments, an anthranilate O-methyltransferase 1 or anthranilic acid methyltransferase comprises a protein sequence or a nucleic acid sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to the protein sequence of SEQ ID NOs: 8, 10, 15, or 17, or the nucleic acid sequence of SEQ ID NOs: 25, 27, 32, or 34, respectively. In some embodiments, an anthranilate O-methyltransferase 1 or anthranilic acid methyltransferase comprises a sequence that is a conservatively substituted version of SEQ ID NO: 8, 10, 15, or 17.

In some embodiments, bacterial host cells associated with the disclosure are modified to express an anthranilate N-methyltransferase. In some embodiments, the anthranilate N-methyltransferase is encoded by the ANMT gene. In some embodiments, an anthranilate N-methyltransferase comprises a protein sequence or a nucleic acid sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to the protein sequence of SEQ ID NOs: 9 or 16, or the nucleic acid sequence of SEQ ID NOs: 26 or 33, respectively. In some embodiments, an anthranilate N-methyltransferase comprises a sequence that is a conservatively substituted version of SEQ ID NO: 9 or 16.

Expression of Genes of the Present Disclosure

The present disclosure encompasses methods comprising heterologous expression of nucleic acids in a host cell. The term “heterologous” with respect to a nucleic acid, such as a nucleic acid comprising a gene, or a nucleic acid comprising a regulatory region such as a promoter or ribosome binding site, is used interchangeably with the term “exogenous” and the term “recombinant” and refers to: a nucleic acid that has been artificially supplied to a biological system; a nucleic acid that has been modified within a biological system; or a nucleic acid whose expression or regulation has been manipulated within a biological system. A heterologous nucleic acid that is introduced into or expressed in a host cell may be a nucleic acid that comes from a different organism or species than the host cell, or may be a synthetic nucleic acid, or may be a nucleic acid that is also endogenously expressed in the same organism or species as the host cell. For example, a nucleic acid that is endogenously expressed in a host cell may be considered heterologous when it is: situated non-naturally in the host cell; expressed recombinantly in the host cell, either stably or transiently; modified within the host cell; selectively edited within the host cell; expressed in a non-natural copy number within the host cell; or expressed in a non-natural way within the host cell, such as by manipulating regulatory regions that control expression of the nucleic acid. In some embodiments, a heterologous nucleic acid is a nucleic acid that is endogenously expressed in a host cell but whose expression is driven by a promoter that does not naturally regulate expression of the nucleic acid. In other embodiments, a heterologous nucleic acid is a nucleic acid that is endogenously expressed in a host cell and whose expression is driven by a promoter that does naturally regulate expression of the nucleic acid, but the promoter or another regulatory region is modified. 
In some embodiments, the promoter is recombinantly activated or repressed. For example, gene-editing based techniques may be used to regulate expression of a nucleic acid, including an endogenous nucleic acid, from a promoter, including an endogenous promoter. See, e.g., Chavez et al., Nat Methods. 2016 Jul; 13(7): 563-567. A heterologous nucleic acid may comprise a wild-type sequence or a mutant sequence as compared with a reference nucleic acid sequence.

In some embodiments, a nucleic acid encoding any of the proteins described in this application is under the control of regulatory sequences. In some embodiments, a nucleic acid is expressed under the control of a promoter. In some embodiments, a promoter is heterologous. The promoter can be a native promoter, e.g., the promoter of the gene in its endogenous context, which provides normal regulation of expression of the gene. Alternatively, a promoter can be a promoter that is different from the native promoter of the gene, e.g., the promoter is different from the promoter of the gene in its endogenous context.

Aspects of the disclosure relate to expression of tryptophan biosynthetic genes under the control of synthetic promoters. As used in this application, a “synthetic promoter” refers to a promoter that is not known to occur in nature. In some embodiments, the promoter is a functional fragment of a full-length promoter. A fragment of a nucleic acid refers to a portion up to but not including the full-length nucleic acid molecule. A functional fragment of a nucleic acid of the disclosure refers to a biologically active portion of a nucleic acid. A biologically active portion of a genetic regulatory element such as a promoter may comprise a portion or fragment of a full-length genetic regulatory element and have the same type of activity as the full-length genetic regulatory element, although the level of activity of the biologically active portion of the genetic regulatory element may vary compared to the level of activity of the full-length genetic regulatory element.

Promoters for increasing expression of target genes can be heterologous, synthetic, or endogenous. Non-limiting examples include endogenous promoters from Staphylococcus spp., Corynebacterium spp., or other Gram-positive bacteria, such as pSarA, pSodA, pCap-lA, pCasSA, pMecA, pRpsL, pCfb, pAgr, pAsp, pEftu, pGapA, pIlvC, pPgk, and pTuf, or synthetic promoters as described for Corynebacterium glutamicum in Henke et al. (2021) Microorganisms 9, 204 and Giebelmann et al. (2018) Biotechnol. J., 14, 1800417, which are incorporated by reference in this application.

Non-limiting examples of synthetic promoters include: LacIq, TacI-pTrc-lacO, CP38, CP44, osmY, apFAB38, xthA, poxB, lacUV5, pLlacO1, galP, apFAB322, apFAB346, apFAB339, Bba_J23104, pLTetO1, apFAB76, apFAB56, Trc, apFAB45, apFAB70, apFAB71, apFAB92, T7A1, apFAB101, apFAB46, apFAB29, bad, and rha.

In some embodiments, a native promoter may be used to drive transcription of one or more MVA pathway genes, MEP pathway genes, and/or tryptophan biosynthetic genes.

In some embodiments, the promoter is a prokaryotic promoter (e.g., bacteriophage or bacterial promoter). Non-limiting examples of bacteriophage promoters include Pls1con, T3, T7, SP6, and PL. Non-limiting examples of bacterial promoters include Pbad, PmgrB, Ptrc2, Plac/ara, Ptac, CP6, CP25, CP38, CP44, CP43, CP31, CP24, CP18, CP27, CP37, CP17, CP2, CP4, CP45, CP1, CP22, CP19, CP34, CP20, CP11, CP26, CP3, CP14, CP13, CP40, CP8, CP28, CP10, CP32, CP30, CP9, CP46, CP23, CP39, CP35, CP33, CP15, CP29, CP12, CP41, CP16, CP42, CP7, Pm, PH207, PD/E20, PN25, PG25, PJ5, PA1, PA2, PL, Plac, PlacUV5, PtacI, and Pcon. Prokaryotic promoters are further described in, and incorporated by reference from, Jensen et al. (1998) Appl Environ Microbiol. 64:82-7, Kosuri et al. (2013) Proc Natl Acad Sci U S A. 110:14024-9, and Deuschle et al. (1986) EMBO J. 5:2987-94. In some embodiments, the promoter is an inducible promoter. As used in this application, an “inducible promoter” is a promoter controlled by the presence or absence of a molecule. This may be used, for example, to controllably induce the expression of an enzyme. In some embodiments, an inducible promoter may be used to regulate expression of one or more enzymes required for production of a compound in a host cell to finely control the production of the compound. Non-limiting examples of inducible promoters include chemically regulated promoters and physically regulated promoters. For chemically regulated promoters, the transcriptional activity can be regulated by one or more compounds, such as alcohol, tetracycline, galactose, a steroid, a metal, or other compounds (e.g., sugars such as xylose). For physically regulated promoters, transcriptional activity can be regulated by a phenomenon such as light or temperature. 
Non-limiting examples of tetracycline-regulated promoters include anhydrotetracycline (aTc)-responsive promoters and other tetracycline-responsive promoter systems (e.g., a tetracycline repressor protein (tetR), a tetracycline operator sequence (tetO) and a tetracycline transactivator fusion protein (tTA)). Non-limiting examples of steroid-regulated promoters include promoters based on the rat glucocorticoid receptor, human estrogen receptor, moth ecdysone receptors, and promoters from the steroid/retinoid/thyroid receptor superfamily. Non-limiting examples of metal-regulated promoters include promoters derived from metallothionein (proteins that bind and sequester metal ions) genes. Non-limiting examples of pathogenesis-regulated promoters include promoters induced by salicylic acid, ethylene or benzothiadiazole (BTH). Non-limiting examples of temperature/heat-inducible promoters include heat shock promoters. Non-limiting examples of light-regulated promoters include light responsive promoters from plant cells. In certain embodiments, the inducible promoter is a galactose-inducible promoter. In some embodiments, the inducible promoter is induced by one or more physiological conditions (e.g., pH, temperature, radiation, osmotic pressure, saline gradients, cell surface binding, or concentration of one or more extrinsic or intrinsic inducing agents). Non-limiting examples of an extrinsic inducer or inducing agent include amino acids and amino acid analogs, saccharides and polysaccharides, nucleic acids, protein transcriptional activators and repressors, cytokines, toxins, petroleum-based compounds, metal containing compounds, salts, ions, enzyme substrate analogs, hormones or any combination. In some embodiments, the promoter is a constitutive promoter. As used in this application, a “constitutive promoter” refers to an unregulated promoter that allows continuous transcription of a gene.

Other inducible promoters or constitutive promoters, including synthetic promoters, that may be known to one of ordinary skill in the art are also contemplated.

Expression of a nucleic acid encoding one or more tryptophan biosynthetic genes can be enhanced, at least in part, by the presence of an insulator ribozyme. In some embodiments of the disclosure, an insulator ribozyme is inserted downstream of a promoter and upstream of a ribosome binding site (RBS). In some embodiments, the insulator ribozyme increases expression of a bacterial operon. In some embodiments, an insulator ribozyme is LtsvJ, SccJ, RiboJ, SarJ, PlmJ, VtmoJ, ChmJ, ScvmJ, SltJ, or PlmvJ, as described in, and incorporated by reference from, Lou et al. (2012) Nat Biotechnol. November; 30(11): 1137-1142, doi: 10.1038/nbt.2401 and Clifton et al. (2018) J. Biol. Eng., 12:23, doi: 10.1186/s13036-018-0115-6. It should be appreciated that other insulator ribozymes known in the art may also be compatible with aspects of the disclosure.

Translation of one or more tryptophan biosynthetic genes can be enhanced, at least in part, by the presence of a ribosome binding site (RBS). As known in the art, an RBS is a regulatory nucleic acid region upstream of a start codon in an mRNA that is involved with recruitment of ribosomes. In some embodiments, an RBS is heterologous. In some embodiments, a native RBS can be used for expression of a gene or operon. An RBS associated with a gene or operon and used for the expression of the gene or operon may be an RBS that is the same as or different from the RBS that is natively associated with the gene or operon. An RBS can be synthetic. As used in this application, a “synthetic RBS” refers to an RBS that is not known to occur in nature. Synthetic RBSs are further described in, and incorporated by reference from, Salis et al. (2009) Nat. Biotechnol. 27, 946-950.

In some embodiments, the RBS is apFAB873, apFAB826, DeadRBS, apFAB871, BBa_J61133, BBa_J61139, apFAB843, BBa_J61124, apFAB864, apFAB964, BBa_J61101, BBa_J61131, salis-3-11, BBa_J61125, BBa_J61118, apFAB922, BBa_J61130, BBa_J61134, BBa_J61128, BBa_J61107, apFAB869, apFAB890, BBa_J61120, BBa_J61109, BBa_J61103, apFAB868, apFAB914, BBa_J61119, BBa_J61126, B0032_RBS, apFAB895, BBa_J61136, apFAB866, GSGV_RBS, apFAB918, BBa_J61129, apFAB867, apFAB903, apFAB872, BBa_J61137, BBa_J61111, apFAB821, apFAB844, BBa_J61110, BBa_J61112, BBa_J61104, BBa_J61122, apFAB854, BBa_J61127, BBa_J61113, GSG_RBS, apFAB892, BBa_J61115, apFAB927, BBa_J61108, Anderson_RBS, apFAB883, apFAB894, BBa_J61132, apFAB860, BBa_J61100, apFAB856, apFAB862, apFAB865, BBa_J61106, apFAB845, apFAB820, apFAB954, apFAB910, salis-4-10, apFAB901, salis-4-4, apFAB832, apFAB909, salis-4-7, apFAB861, apFAB876, apFAB827, salis-2-4, Alon_RBS, apFAB831, apFAB857, apFAB863, apFAB912, apFAB889, apFAB851, apFAB884, apFAB833, apFAB848, apFAB839, salis-1-21, apFAB923, Plotkin_RBS, apFAB842, salis-2-3, apFAB837, apFAB916, apFAB834, apFAB904, apFAB917, salis-1-10, Invitrogen_RBS, salis-1-1, salis-1-3, salis-3-3, salis-4-2, JBEI_RBS, salis-1-5, B0034_RBS, B0030_RBS, or Bujard_RBS, which are further described in and incorporated by reference from Kosuri et al. (2013) Proc Natl Acad Sci U S A. 110: 14024-9. In certain embodiments, the RBS is apFAB873 or apFAB826.

A nucleic acid described in this application may be incorporated into any appropriate vector through any method known in the art. For example, the vector may be an expression vector, including but not limited to a viral vector (e.g., a lentiviral, retroviral, adenoviral, or adeno-associated viral vector), any vector suitable for transient expression, any vector suitable for constitutive expression, or any vector suitable for inducible expression (e.g., a galactose-inducible or doxycycline-inducible vector). A vector described in this application may be introduced into a suitable host cell using any method known in the art.

In some embodiments, a vector replicates autonomously in the cell. In some embodiments, a vector integrates into a chromosome within a cell. A vector can contain one or more endonuclease restriction sites that are cut by a restriction endonuclease to insert and ligate a nucleic acid containing a gene described in this application to produce a recombinant vector that is able to replicate in a cell. Vectors can be composed of DNA or RNA. Cloning vectors include, but are not limited to: plasmids, fosmids, phagemids, virus genomes and artificial chromosomes. As used in this application, the terms “expression vector” or “expression construct” refer to a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a host cell. In some embodiments, the nucleic acid sequence of a gene described in this application is inserted into a cloning vector such that it is operably joined to regulatory sequences and, in some embodiments, expressed as an RNA transcript. In some embodiments, the vector contains one or more markers, such as a selectable marker as described in this application, to identify cells transformed or transfected with the recombinant vector. In some embodiments, the nucleic acid sequence of a gene described in this application is recoded. Recoding may increase expression of the gene product (e.g., by at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100%, including all values in between) relative to a reference sequence that is not recoded. The choice and design of one or more appropriate vectors suitable for inducing expression of one or more genes in a host cell are within the ability of one of ordinary skill in the art. Expression vectors containing the necessary elements for expression are commercially available and known to one of ordinary skill in the art (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Fourth Edition, Cold Spring Harbor Laboratory Press, 2012).

As used in this disclosure, the term “sequence” refers to an amino acid or a nucleic acid sequence.

The term “amino acid sequence” refers to a sequence of 3 or more amino acid residues covalently linked through peptide bonds as known in the art. The term “amino acid sequence” can refer to the sequence of a peptide, a protein, a polypeptide or an enzyme. Thus, an amino acid sequence can be a protein sequence, peptide sequence, a polypeptide sequence, or an enzyme sequence. The terms “protein,” “peptide,” and “polypeptide” are used interchangeably.

The term “nucleic acid sequence” refers to a sequence of 3 or more nucleic acid residues covalently linked through phosphodiester bonds as known in the art. The term “nucleic acid sequence” can refer to the sequence of a nucleic acid macromolecule, a polynucleotide, a gene, an expression vector, plasmid, a ribose nucleic acid macromolecule (RNA) or a deoxyribose nucleic acid macromolecule (DNA). Thus, a “nucleic acid sequence” can be a nucleotide sequence, a polynucleotide sequence, a gene sequence, an RNA, or a DNA sequence.

Unless otherwise noted, the term “sequence identity,” which is used interchangeably in this disclosure with the term “percent identity,” as known in the art, refers to a relationship between the sequences of two polypeptides or polynucleotides, as determined by sequence comparison (alignment). In some embodiments, sequence identity is determined across the entire length of a sequence. In some embodiments, sequence identity is determined over a region (e.g., a stretch of amino acids or nucleic acids, e.g., the sequence spanning an active site) of a sequence. For example, in some embodiments, sequence identity is determined over a region corresponding to at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of the length of the reference sequence.

Identity measures the percent of identical matches between two or more sequences with gap alignments (if any) addressed by a particular mathematical model, algorithm, or computer program.

Identity of related polypeptides or nucleic acid sequences can be readily calculated by any of the methods known to one of ordinary skill in the art. The percent identity of two sequences (e.g., nucleic acid or amino acid sequences) may, for example, be determined using the algorithm of Karlin and Altschul Proc. Natl. Acad. Sci. USA 87:2264-68, 1990, modified as in Karlin and Altschul Proc. Natl. Acad. Sci. USA 90:5873-77, 1993. Such an algorithm is incorporated into the NBLAST® and XBLAST® programs (version 2.0) of Altschul et al., J. Mol. Biol. 215:403-10, 1990. BLAST® protein searches can be performed, for example, with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins described in this application. Where gaps exist between two sequences, Gapped BLAST® can be utilized, for example, as described in Altschul et al., Nucleic Acids Res. 25(17):3389-3402, 1997. When utilizing BLAST® and Gapped BLAST® programs, the default parameters of the respective programs (e.g., XBLAST® and NBLAST®) can be used, or the parameters can be adjusted appropriately as would be understood by one of ordinary skill in the art.

Another local alignment technique which may be used, for example, is based on the Smith-Waterman algorithm (Smith, T.F. & Waterman, M.S. (1981) “Identification of common molecular subsequences.” J. Mol. Biol. 147:195-197). A general global alignment technique which may be used, for example, is the Needleman-Wunsch algorithm (Needleman, S.B. & Wunsch, C.D. (1970) “A general method applicable to the search for similarities in the amino acid sequences of two proteins.” J. Mol. Biol. 48:443-453), which is based on dynamic programming.
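As a non-limiting illustration of the dynamic-programming approach underlying global alignment, the following Python sketch implements a basic Needleman-Wunsch alignment. The function name and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for illustration only; they are not parameters specified in this application, and alignment programs used in practice typically apply substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (aligned_a, aligned_b, score) for one optimal global alignment.

    Scoring values are illustrative assumptions, not values from the disclosure.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]
```

Under these assumed scores, aligning "ACGT" with "AGT" places a single gap opposite the C. A local (Smith-Waterman) variant would differ mainly in flooring each cell at zero and tracing back from the highest-scoring cell.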

More recently, a Fast Optimal Global Sequence Alignment Algorithm (FOGSAA) was developed that purportedly produces global alignment of nucleic acid and amino acid sequences faster than other optimal global alignment methods, including the Needleman-Wunsch algorithm. In some embodiments, the identity of two polypeptides is determined by aligning the two amino acid sequences, calculating the number of identical amino acids, and dividing by the length of one of the amino acid sequences. In some embodiments, the identity of two nucleic acids is determined by aligning the two nucleotide sequences, calculating the number of identical nucleotides, and dividing by the length of one of the nucleic acids.
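As a non-limiting illustration of the calculation just described, the following Python sketch counts identical aligned residues and divides by the ungapped length of one of the two sequences. It assumes the sequences have already been aligned by some method and that gaps are written as "-"; the function name and the `denominator` argument are hypothetical and introduced only for this example.

```python
def percent_identity(aligned_a, aligned_b, denominator="a"):
    """Percent identity of two pre-aligned sequences (gaps written as '-').

    `denominator` selects which sequence's ungapped length is used as the
    divisor ("a" or "b"); this argument is an illustrative assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if denominator not in ("a", "b"):
        raise ValueError("denominator must be 'a' or 'b'")
    # Count alignment columns where both sequences carry the same residue.
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    chosen = aligned_a if denominator == "a" else aligned_b
    length = len(chosen.replace("-", ""))
    if length == 0:
        raise ValueError("the selected sequence has no residues")
    return 100.0 * identical / length
```

Because the divisor is the length of one sequence, the result can depend on which sequence is chosen: for the alignment "ACGT" over "A-GT", three residues are identical, giving 75% relative to the four-residue sequence and 100% relative to the three-residue sequence.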

For multiple sequence alignments, computer programs including Clustal Omega (Sievers et al., Mol Syst Biol. 2011 Oct 11;7:539) may be used.

In preferred embodiments, a sequence, including a nucleic acid or amino acid sequence, is found to have a specified percent identity to a reference sequence, such as a sequence disclosed in this application and/or recited in the claims when sequence identity is determined using the algorithm of Karlin and Altschul Proc. Natl. Acad. Sci. USA 87:2264-68, 1990, modified as in Karlin and Altschul Proc. Natl. Acad. Sci. USA 90:5873-77, 1993 (e.g., BLAST®, NBLAST®, XBLAST® or Gapped BLAST® programs, using default parameters of the respective programs).

In some embodiments, a sequence, including a nucleic acid or amino acid sequence, is found to have a specified percent identity to a reference sequence, such as a sequence disclosed in this application and/or recited in the claims when sequence identity is determined using the Smith-Waterman algorithm (Smith, T.F. & Waterman, M.S. (1981) “Identification of common molecular subsequences.” J. Mol. Biol. 147:195-197) or the Needleman-Wunsch algorithm (Needleman, S.B. & Wunsch, C.D. (1970) “A general method applicable to the search for similarities in the amino acid sequences of two proteins.” J. Mol. Biol. 48:443-453) using default parameters.

As used in this application, a residue (such as a nucleic acid residue or an amino acid residue) in sequence “X” is referred to as corresponding to a position or residue (such as a nucleic acid residue or an amino acid residue) “Z” in a different sequence “Y” when the residue in sequence “X” is at the counterpart position of “Z” in sequence “Y” when sequences X and Y are aligned using amino acid sequence alignment tools known in the art.
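As a non-limiting illustration of how corresponding positions can be read from an alignment, the following Python sketch walks two pre-aligned sequences X and Y and records, for each residue of X, the position of its counterpart in Y. The function name, the use of "-" for gaps, and the 1-based numbering are assumptions made for this example.

```python
def corresponding_positions(aligned_x, aligned_y):
    """Map 1-based residue positions in X to 1-based positions in Y.

    Inputs are pre-aligned sequences with gaps written as '-'. A residue of X
    that faces a gap in Y has no counterpart and maps to None.
    """
    if len(aligned_x) != len(aligned_y):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    pos_x = pos_y = 0
    for res_x, res_y in zip(aligned_x, aligned_y):
        if res_x != "-":
            pos_x += 1
        if res_y != "-":
            pos_y += 1
        if res_x != "-":
            mapping[pos_x] = pos_y if res_y != "-" else None
    return mapping
```

For example, if X is aligned as "MK-LV" against Y as "MKALV", residue 3 of X (L) corresponds to residue 4 of Y, because Y carries one additional residue upstream of that position.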

Sequences that are homologous to any of the sequences disclosed herein are also contemplated. Homologous sequences include but are not limited to paralogous or orthologous sequences. Paralogous sequences arise from duplication of a gene within a genome of a species, while orthologous sequences diverge after a speciation event.

In some embodiments, a coding sequence comprises a mutation at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100 or more than 100 positions relative to a reference coding sequence. In some embodiments, the coding sequence comprises a mutation in 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100 or more codons of the coding sequence relative to a reference coding sequence. As will be understood by one of ordinary skill in the art, a mutation within a codon may or may not change the amino acid that is encoded by the codon due to degeneracy of the genetic code. In some embodiments, the one or more mutations in the coding sequence do not alter the amino acid sequence encoded by the coding sequence relative to the amino acid sequence of a reference polypeptide.
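As a non-limiting illustration of codon degeneracy, the following Python sketch compares a variant coding sequence with a reference coding sequence codon by codon and labels each changed codon as synonymous (same encoded amino acid) or nonsynonymous. It assumes the standard genetic code, in-frame DNA sequences of equal length, and no insertions or deletions; the function name and the example sequences are hypothetical and are not sequences from this application.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_changes(reference_cds, variant_cds):
    """List (codon_number, ref_codon, var_codon, kind) for each changed codon.

    Codon numbers are 1-based; `kind` is "synonymous" or "nonsynonymous".
    Only substitutions are handled (equal-length, in-frame sequences).
    """
    if len(reference_cds) != len(variant_cds) or len(reference_cds) % 3:
        raise ValueError("sequences must be in frame and of equal length")
    changes = []
    for start in range(0, len(reference_cds), 3):
        ref_codon = reference_cds[start:start + 3].upper()
        var_codon = variant_cds[start:start + 3].upper()
        if ref_codon == var_codon:
            continue
        same_aa = CODON_TABLE[ref_codon] == CODON_TABLE[var_codon]
        kind = "synonymous" if same_aa else "nonsynonymous"
        changes.append((start // 3 + 1, ref_codon, var_codon, kind))
    return changes
```

For instance, changing GAT to GAC leaves aspartate unchanged, whereas changing AAA to AGA replaces lysine with arginine, so only the second change alters the encoded polypeptide.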

In some embodiments, the one or more mutations in a coding sequence do alter the amino acid sequence of the corresponding polypeptide relative to the amino acid sequence of a reference polypeptide. In some embodiments, the one or more mutations alter the amino acid sequence of the polypeptide relative to the amino acid sequence of a reference polypeptide and alter (enhance or reduce) an activity of the polypeptide relative to the reference polypeptide.

The activity (e.g., specific activity) of any of the recombinant polypeptides described in this application may be measured using routine methods. As a non-limiting example, a recombinant polypeptide’s activity may be determined by measuring its substrate specificity, product(s) produced, the concentration of product(s) produced, or any combination thereof. As used in this application, “specific activity” of a recombinant polypeptide refers to the amount (e.g., concentration) of a particular product produced for a given amount (e.g., concentration) of the recombinant polypeptide per unit time.
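As a non-limiting illustration of the definition above, specific activity can be computed as the amount of product formed divided by the amount of recombinant polypeptide and by the elapsed time. The Python sketch below assumes the caller supplies consistent units; the function name and the example units are hypothetical and are not taken from this application.

```python
def specific_activity(product_amount, polypeptide_amount, elapsed_time):
    """Product formed per unit of polypeptide per unit time.

    Units are whatever the caller supplies (for example, micromoles of
    product, milligrams of polypeptide, and minutes); these are illustrative
    assumptions rather than units specified in the disclosure.
    """
    if polypeptide_amount <= 0 or elapsed_time <= 0:
        raise ValueError("polypeptide amount and elapsed time must be positive")
    return product_amount / (polypeptide_amount * elapsed_time)
```

With these hypothetical units, 120 micromoles of product formed by 2 milligrams of polypeptide over 60 minutes corresponds to 1 micromole per milligram per minute.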

The skilled artisan will also realize that mutations in a coding sequence may result in conservative amino acid substitutions to provide functionally equivalent variants of the foregoing polypeptides, e.g., variants that retain the activities of the polypeptides. As used in this application, a “conservative amino acid substitution” refers to an amino acid substitution that does not alter the relative charge or size characteristics or functional activity of the protein in which the amino acid substitution is made. In some embodiments, a variant or mutant of any polypeptide sequence (e.g., TrpC, TrpB, TrpA, TrpE ^FBR (M298R, S40R, or S40F), TrpG, AroG (D146N), AAMT1, and/or ANMT) disclosed herein comprises one or more conservative or non-conservative substitutions.

In some instances, an amino acid is characterized by its R group (see, e.g., Table 6). For example, an amino acid may comprise a nonpolar aliphatic R group, a positively charged R group, a negatively charged R group, a nonpolar aromatic R group, or a polar uncharged R group. Non-limiting examples of an amino acid comprising a nonpolar aliphatic R group include alanine, glycine, valine, leucine, methionine, and isoleucine. Non-limiting examples of an amino acid comprising a positively charged R group include lysine, arginine, and histidine. Non-limiting examples of an amino acid comprising a negatively charged R group include aspartate and glutamate. Non-limiting examples of an amino acid comprising a nonpolar, aromatic R group include phenylalanine, tyrosine, and tryptophan. Non-limiting examples of an amino acid comprising a polar uncharged R group include serine, threonine, cysteine, proline, asparagine, and glutamine.
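As a non-limiting illustration, the R-group classes listed in the preceding paragraph can be encoded as a lookup table and used to check whether a substitution written in the form D146N keeps the residue within the same class. The Python sketch below uses hypothetical function names. Membership in the same R-group class is used here only as a simple proxy; it is not the definition of a conservative substitution in this application, which refers to Table 3.

```python
import re

# R-group classes as listed in the preceding paragraph (one-letter codes).
R_GROUPS = {
    "nonpolar aliphatic": "AGVLMI",
    "positively charged": "KRH",
    "negatively charged": "DE",
    "nonpolar aromatic": "FYW",
    "polar uncharged": "STCPNQ",
}
GROUP_OF = {aa: name for name, members in R_GROUPS.items() for aa in members}


def parse_substitution(notation):
    """Split notation such as 'D146N' into (original, position, replacement)."""
    found = re.fullmatch(r"([A-Z])(\d+)([A-Z])", notation)
    if found is None:
        raise ValueError(f"unrecognized substitution: {notation!r}")
    return found.group(1), int(found.group(2)), found.group(3)


def same_r_group(notation):
    """True if the original and replacement residues share an R-group class."""
    original, _, replacement = parse_substitution(notation)
    return GROUP_OF[original] == GROUP_OF[replacement]
```

Under this grouping, D146N moves the residue from the negatively charged class to the polar uncharged class, and S40R moves it from the polar uncharged class to the positively charged class, so neither stays within its original class.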

Non-limiting examples of functionally equivalent variants of polypeptides may include conservative amino acid substitutions in the amino acid sequences of proteins disclosed in this application. As used in this application “conservative substitution” is used interchangeably with “conservative amino acid substitution” and refers to any one of the amino acid substitutions provided in Table 3.

In some embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more than 20 residues can be changed when preparing variant polypeptides. In some embodiments, amino acids are replaced by conservative amino acid substitutions.

Table 3. Conservative Amino Acid Substitutions

Amino acid substitutions in the amino acid sequence of a polypeptide to produce a recombinant polypeptide (e.g., TrpC, TrpB, TrpA, TrpE ^FBR (M298R, S40R, or S40F), TrpG, AroG (D146N), AAMT1, and/or ANMT) variant having a desired property and/or activity can be made by alteration of the coding sequence of the polypeptide. Similarly, conservative amino acid substitutions in the amino acid sequence of a polypeptide to produce functionally equivalent variants of the polypeptide typically are made by alteration of the coding sequence of the recombinant polypeptide (e.g., TrpC, TrpB, TrpA, TrpE ^FBR (M298R, S40R, or S40F), TrpG, AroG (D146N), AAMT1, and/or ANMT).

Mutations (e.g., substitutions, insertions, additions, or deletions) can be made in a nucleic acid sequence by a variety of methods known to one of ordinary skill in the art. For example, mutations (e.g., substitutions, insertions, additions, or deletions) can be made by PCR-directed mutation, site-directed mutagenesis according to the method of Kunkel (Kunkel, Proc. Nat. Acad. Sci. U.S.A. 82: 488-492, 1985), by chemical synthesis of a gene encoding a polypeptide, by gene editing, or by insertions, such as insertion of a tag (e.g., a HIS tag or a GFP tag). Mutations can include, for example, substitutions, insertions, additions, deletions, and translocations, generated by any method known in the art. Methods for producing mutations may be found in references such as Molecular Cloning: A Laboratory Manual, J. Sambrook, et al., eds., Fourth Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 2012, or Current Protocols in Molecular Biology, F.M. Ausubel, et al., eds., John Wiley & Sons, Inc., New York, 2010.

In some embodiments, methods for producing variants include circular permutation (Yu and Lutz, Trends Biotechnol. 2011 Jan;29(1):18-25). In circular permutation, the linear primary sequence of a polypeptide can be circularized (e.g., by joining the N-terminal and C-terminal ends of the sequence) and the polypeptide can be severed (“broken”) at a different location. Thus, the linear primary sequence of the new polypeptide may have low sequence identity (e.g., less than 80%, less than 75%, less than 70%, less than 65%, less than 60%, less than 55%, less than 50%, less than 45%, less than 40%, less than 35%, less than 30%, less than 25%, less than 20%, less than 15%, less than 10%, or less than 5%, including all values in between) as determined by linear sequence alignment methods (e.g., Clustal Omega or BLAST). Topological analysis of the two polypeptides, however, may reveal that the tertiary structure of the two polypeptides is similar or dissimilar. Without being bound by a particular theory, a variant polypeptide created through circular permutation of a reference polypeptide and with a similar tertiary structure as the reference polypeptide can share similar functional characteristics (e.g., enzymatic activity, enzyme kinetics, substrate specificity or product specificity). In some instances, circular permutation may alter the secondary structure, tertiary structure or quaternary structure and produce a polypeptide with different functional characteristics (e.g., an enzyme with increased or decreased enzymatic activity, different substrate specificity, or different product specificity). See, e.g., Yu and Lutz, Trends Biotechnol. 2011 Jan;29(1):18-25.

It should be appreciated that in a polypeptide that has undergone circular permutation, the linear amino acid sequence of the polypeptide would differ from a reference polypeptide that has not undergone circular permutation. However, one of ordinary skill in the art would be able to determine which residues in the polypeptide that has undergone circular permutation correspond to residues in the reference polypeptide that has not undergone circular permutation by, for example, aligning the sequences and detecting conserved motifs, and/or by comparing the structures or predicted structures of the polypeptides, e.g., by homology modeling.

In some embodiments, an algorithm that determines the percent identity between a sequence of interest and a reference sequence described in this application accounts for the presence of circular permutation between the sequences. The presence of circular permutation may be detected using any method known in the art, including, for example, RASPODOM (Weiner et al., Bioinformatics. 2005 Apr 1;21(7):932-7). In some embodiments, the presence of circular permutation is corrected for (e.g., the domains in at least one sequence are rearranged) prior to calculation of the percent identity between a sequence of interest and a sequence described in this application. The claims of this application should be understood to encompass sequences for which percent identity to a reference sequence is calculated after taking into account potential circular permutation of the sequence.
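As a rough illustration of why such a correction matters, the following Python sketch builds a circular permutant of a toy sequence and compares a naive position-by-position identity before and after testing every rotation of the query. It is an assumption made for illustration only: it is not the RASPODOM algorithm and not a method described in this application, the function names and the toy sequence are hypothetical, and a real analysis would use a gapped alignment tool rather than ungapped matching.

```python
from typing import Tuple


def circular_permutant(seq: str, cut: int) -> str:
    """Join the N- and C-termini, then reopen the chain at position `cut`."""
    cut %= len(seq)
    return seq[cut:] + seq[:cut]


def positional_identity(a: str, b: str) -> float:
    """Naive ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_rotation_identity(query: str, reference: str) -> Tuple[int, float]:
    """Try every rotation of `query`; return (offset, identity) of the best one."""
    best = max(
        range(len(query)),
        key=lambda k: positional_identity(circular_permutant(query, k), reference),
    )
    return best, positional_identity(circular_permutant(query, best), reference)


reference = "MKTAYIAKQRQISFVKSHFSRQ"  # hypothetical toy sequence, not from the disclosure
permuted = circular_permutant(reference, 9)

linear = positional_identity(permuted, reference)
offset, corrected = best_rotation_identity(permuted, reference)
print(f"identity without correction: {linear:.1f}%")
print(f"identity after rotating the query by {offset}: {corrected:.1f}%")
```

In this toy case the permuted sequence is the same chain reopened at a different position, so rearranging it before the comparison recovers full identity, whereas the uncorrected linear comparison reports a much lower value.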

Compositions and kits

The present disclosure provides compositions including modified host cells associated with the disclosure. In some embodiments, the compositions comprise live bacterial cells. In some embodiments, a composition comprising live bacterial cells is provided in an effective amount. In certain embodiments, the effective amount is a therapeutically effective amount. In certain embodiments, the effective amount is a prophylactically effective amount. Compositions associated with the disclosure comprising live bacteria may be topically administered to the skin, for example as a cream or lotion. In some embodiments, the composition is for use as sunscreen. For example, in some embodiments, modified host cells that produce methyl anthranilate are contemplated for use in production of a sunscreen. In some embodiments, the composition is for use as a perfume or fragrance. For example, in some embodiments, modified host cells associated with the disclosure are contemplated for use in production of a fragrance or perfume. In some embodiments, the composition is for use as a deodorant. For example, in some embodiments, modified host cells that produce anthranilate and/or methyl anthranilate are contemplated for use as a deodorant. In some embodiments, a composition that is administered topically to the skin may be a cosmetic composition, an insect repellent, a health or wellness aid, or a pharmaceutical composition.

Compositions may be in any form known in the art, such as a gel, cream, ointment, lotion, serum, powder, aerosol spray or two-component dispensing system. Compositions may comprise a carrier or excipient, and may further comprise a buffer, a thickener, a cryoprotectant, or other ingredient(s) as further discussed below.

Compositions described in this application can be prepared by any method known in the art. In general, such preparatory methods include bringing a compound or modified host cell described in this application (i.e., the “active ingredient”) into association with a carrier or excipient, and/or one or more other ingredients, and then, if necessary and/or desirable, mixing, e.g., to form a homogenous mix, freeze-drying, shaping, packaging into a desired single-dose or multi-dose unit, or any combination thereof.

Compositions can be prepared, packaged, and/or sold in bulk, as a single unit dose, and/or as a plurality of single unit doses. A “unit dose” is a discrete amount of the composition comprising a predetermined amount of the active ingredient. The amount of the active ingredient is generally equal to the dosage of the active ingredient which would be administered to a subject and/or a convenient fraction of such a dosage, such as one-half or one-third of such a dosage.

Relative amounts of the active ingredient, the acceptable carrier or excipient, e.g., a pharmaceutically acceptable excipient, and/or any additional ingredients in a composition described in this application will vary, depending upon the identity, size, and/or condition of the subject treated and further depending upon the route by which the composition is to be administered. The composition may comprise between 0.1% and 100% (w/w) active ingredient.

Acceptable carriers or excipients, including pharmaceutically acceptable excipients used in the manufacture of compositions include inert diluents, dispersing and/or granulating agents, surface active agents and/or emulsifiers, disintegrating agents, binding agents, preservatives, buffering agents, lubricating agents, and/or oils. Excipients such as cocoa butter and suppository waxes, coloring agents, coating agents, sweetening, flavoring, and perfuming agents may also be present in the composition. Exemplary excipients include diluents, dispersing and/or granulating agents, surface active agents and/or emulsifiers, disintegrating agents, binding agents, preservatives, buffering agents, lubricating agents, and/or oils (e.g., synthetic oils, semi-synthetic oils) as disclosed in this application.

Exemplary diluents include calcium carbonate, sodium carbonate, calcium phosphate, dicalcium phosphate, calcium sulfate, calcium hydrogen phosphate, sodium phosphate, lactose, sucrose, cellulose, microcrystalline cellulose, kaolin, mannitol, sorbitol, inositol, sodium chloride, dry starch, cornstarch, powdered sugar, and mixtures thereof.

Exemplary granulating and/or dispersing agents include potato starch, corn starch, tapioca starch, sodium starch glycolate, clays, alginic acid, guar gum, citrus pulp, agar, bentonite, cellulose, and wood products, natural sponge, cation-exchange resins, calcium carbonate, silicates, sodium carbonate, cross-linked poly(vinyl-pyrrolidone) (crospovidone), sodium carboxymethyl starch (sodium starch glycolate), carboxymethyl cellulose, cross-linked sodium carboxymethyl cellulose (croscarmellose), methylcellulose, pregelatinized starch (starch 1500), microcrystalline starch, water insoluble starch, calcium carboxymethyl cellulose, magnesium aluminum silicate (Veegum), sodium lauryl sulfate, quaternary ammonium compounds, and mixtures thereof.

Exemplary surface active agents and/or emulsifiers include natural emulsifiers (e.g., acacia, agar, alginic acid, sodium alginate, tragacanth, chondrus, cholesterol, xanthan, pectin, gelatin, egg yolk, casein, wool fat, wax, and lecithin), colloidal clays (e.g., bentonite (aluminum silicate) and Veegum (magnesium aluminum silicate)), long chain amino acid derivatives, high molecular weight alcohols (e.g., stearyl alcohol, cetyl alcohol, oleyl alcohol, triacetin monostearate, ethylene glycol distearate, glyceryl monostearate, and propylene glycol monostearate, polyvinyl alcohol), carbomers (e.g., carboxy polymethylene, polyacrylic acid, acrylic acid polymer, and carboxyvinyl polymer), carrageenan, cellulosic derivatives (e.g., carboxymethylcellulose sodium, powdered cellulose, hydroxymethyl cellulose, hydroxypropyl cellulose, hydroxypropyl methylcellulose, methylcellulose), sorbitan fatty acid esters (e.g., polyoxyethylene sorbitan monolaurate (Tween® 20), polyoxyethylene sorbitan (Tween® 60), polyoxyethylene sorbitan monooleate (Tween® 80), sorbitan monopalmitate (Span® 40), sorbitan monostearate (Span® 60), sorbitan tristearate (Span® 65), glyceryl monooleate, sorbitan monooleate (Span® 80)), polyoxyethylene esters (e.g., polyoxyethylene monostearate (Myrj® 45), polyoxyethylene hydrogenated castor oil, polyethoxylated castor oil, polyoxymethylene stearate, and Solutol®), sucrose fatty acid esters, polyethylene glycol fatty acid esters (e.g., Cremophor®), polyoxyethylene ethers (e.g., polyoxyethylene lauryl ether (Brij® 30)), poly(vinyl-pyrrolidone), diethylene glycol monolaurate, triethanolamine oleate, sodium oleate, potassium oleate, ethyl oleate, oleic acid, ethyl laurate, sodium lauryl sulfate, Pluronic® F-68, poloxamer P-188, cetrimonium bromide, cetylpyridinium chloride, benzalkonium chloride, docusate sodium, and/or mixtures thereof.

Exemplary binding agents include starch (e.g., cornstarch and starch paste), gelatin, sugars (e.g., sucrose, glucose, dextrose, dextrin, molasses, lactose, lactitol, mannitol, etc.), natural and synthetic gums (e.g., acacia, sodium alginate, extract of Irish moss, panwar gum, ghatti gum, mucilage of isapol husks, carboxymethylcellulose, methylcellulose, ethylcellulose, hydroxyethylcellulose, hydroxypropyl cellulose, hydroxypropyl methylcellulose, microcrystalline cellulose, cellulose acetate, poly(vinyl-pyrrolidone), magnesium aluminum silicate (Veegum®), and larch arabinogalactan), alginates, polyethylene oxide, polyethylene glycol, inorganic calcium salts, silicic acid, polymethacrylates, waxes, water, alcohol, and/or mixtures thereof.

Exemplary preservatives include antioxidants, chelating agents, antimicrobial preservatives, antifungal preservatives, antiprotozoan preservatives, alcohol preservatives, acidic preservatives, and other preservatives. In certain embodiments, the preservative is an antioxidant. In other embodiments, the preservative is a chelating agent.

Exemplary antioxidants include alpha tocopherol, ascorbic acid, ascorbyl palmitate, butylated hydroxyanisole, butylated hydroxytoluene, monothioglycerol, potassium metabisulfite, propionic acid, propyl gallate, sodium ascorbate, sodium bisulfite, sodium metabisulfite, and sodium sulfite.

Exemplary chelating agents include ethylenediaminetetraacetic acid (EDTA) and salts and hydrates thereof (e.g., sodium edetate, disodium edetate, trisodium edetate, calcium disodium edetate, dipotassium edetate, and the like), citric acid and salts and hydrates thereof (e.g., citric acid monohydrate), fumaric acid and salts and hydrates thereof, malic acid and salts and hydrates thereof, phosphoric acid and salts and hydrates thereof, and tartaric acid and salts and hydrates thereof. Exemplary antimicrobial preservatives include benzalkonium chloride, benzethonium chloride, benzyl alcohol, bronopol, cetrimide, cetylpyridinium chloride, chlorhexidine, chlorobutanol, chlorocresol, chloroxylenol, cresol, ethyl alcohol, glycerin, hexetidine, imidurea, phenol, phenoxyethanol, phenylethyl alcohol, phenylmercuric nitrate, propylene glycol, and thimerosal.

Exemplary antifungal preservatives include butyl paraben, methyl paraben, ethyl paraben, propyl paraben, benzoic acid, hydroxybenzoic acid, potassium benzoate, potassium sorbate, sodium benzoate, sodium propionate, and sorbic acid.

Exemplary alcohol preservatives include ethanol, polyethylene glycol, phenol, phenolic compounds, bisphenol, chlorobutanol, hydroxybenzoate, and phenylethyl alcohol.

Exemplary acidic preservatives include vitamin A, vitamin C, vitamin E, beta-carotene, citric acid, acetic acid, dehydroacetic acid, ascorbic acid, sorbic acid, and phytic acid.

Other preservatives include tocopherol, tocopherol acetate, deteroxime mesylate, cetrimide, butylated hydroxyanisole (BHA), butylated hydroxytoluene (BHT), ethylenediamine, sodium lauryl sulfate (SLS), sodium lauryl ether sulfate (SLES), sodium bisulfite, sodium metabisulfite, potassium sulfite, potassium metabisulfite, Glydant® Plus, Phenonip®, methylparaben, Germall® 115, Germaben® II, Neolone®, Kathon®, and Euxyl®.

Exemplary buffering agents include citrate buffer solutions, acetate buffer solutions, phosphate buffer solutions, ammonium chloride, calcium carbonate, calcium chloride, calcium citrate, calcium glubionate, calcium gluceptate, calcium gluconate, D-gluconic acid, calcium glycerophosphate, calcium lactate, propanoic acid, calcium levulinate, pentanoic acid, dibasic calcium phosphate, phosphoric acid, tribasic calcium phosphate, calcium hydroxide phosphate, potassium acetate, potassium chloride, potassium gluconate, potassium mixtures, dibasic potassium phosphate, monobasic potassium phosphate, potassium phosphate mixtures, sodium acetate, sodium bicarbonate, sodium chloride, sodium citrate, sodium lactate, dibasic sodium phosphate, monobasic sodium phosphate, sodium phosphate mixtures, tromethamine, magnesium hydroxide, aluminum hydroxide, alginic acid, pyrogen-free water, isotonic saline, Ringer’s solution, ethyl alcohol, and mixtures thereof. Exemplary lubricating agents include magnesium stearate, calcium stearate, stearic acid, silica, talc, malt, glyceryl behenate, hydrogenated vegetable oils, polyethylene glycol, sodium benzoate, sodium acetate, sodium chloride, leucine, magnesium lauryl sulfate, sodium lauryl sulfate, and mixtures thereof.

Exemplary natural oils include almond, apricot kernel, avocado, babassu, bergamot, black currant seed, borage, cade, camomile, canola, caraway, carnauba, castor, cinnamon, cocoa butter, coconut, cod liver, coffee, corn, cotton seed, emu, eucalyptus, evening primrose, fish, flaxseed, geraniol, gourd, grape seed, hazel nut, hyssop, isopropyl myristate, jojoba, kukui nut, lavandin, lavender, lemon, litsea cubeba, macadamia nut, mallow, mango seed, meadowfoam seed, mink, nutmeg, olive, orange, orange roughy, palm, palm kernel, peach kernel, peanut, poppy seed, pumpkin seed, rapeseed, rice bran, rosemary, safflower, sandalwood, sasanqua, savoury, sea buckthorn, sesame, shea butter, silicone, soybean, sunflower, tea tree, thistle, tsubaki, vetiver, walnut, and wheat germ oils. Exemplary synthetic or semi-synthetic oils include, but are not limited to, butyl stearate, medium chain triglycerides (such as caprylic triglyceride and capric triglyceride), cyclomethicone, diethyl sebacate, dimethicone 360, isopropyl myristate, mineral oil, octyldodecanol, oleyl alcohol, silicone oil, and mixtures thereof. In certain embodiments, exemplary synthetic oils comprise medium chain triglycerides (such as caprylic triglyceride and capric triglyceride).

Specifically contemplated routes are direct administration to the skin. Dosage forms for topical and/or transdermal administration of a composition described in this application may include ointments, pastes, creams, lotions, gels, powders, solutions, sprays, inhalants, and/or patches. Generally, the active ingredient is admixed under sterile conditions with a pharmaceutically acceptable carrier or excipient and/or any needed preservatives and/or buffers as can be required. Additionally, the present disclosure contemplates the use of transdermal patches, which often have the added advantage of providing controlled delivery of an active ingredient to the body. Such dosage forms can be prepared, for example, by dissolving and/or dispensing the active ingredient in the proper medium. Alternatively or additionally, the rate can be controlled by either providing a rate controlling membrane and/or by dispersing the active ingredient in a polymer matrix and/or gel.

Suitable devices for use in delivering intradermal pharmaceutical compositions described in this application include short needle devices. Intradermal compositions can be administered by devices which limit the effective penetration length of a needle into the skin. Alternatively or additionally, conventional syringes can be used in the classical Mantoux method of intradermal administration. Jet injection devices which deliver liquid formulations to the dermis via a liquid jet injector and/or via a needle which pierces the stratum corneum and produces a jet which reaches the dermis are suitable. Ballistic powder/particle delivery devices which use compressed gas to accelerate the compound in powder form through the outer layers of the skin to the dermis are suitable.

Formulations suitable for topical administration include, but are not limited to, liquid and/or semi-liquid preparations such as liniments, lotions, oil-in-water and/or water-in-oil emulsions such as creams, ointments, and/or pastes, and/or solutions and/or suspensions. Topically administrable formulations may, for example, comprise from about 1% to about 10% (w/w) active ingredient, although the concentration of the active ingredient can be as high as the solubility limit of the active ingredient in the solvent. Formulations for topical administration may further comprise one or more of the additional ingredients described in this application.

Although the descriptions of compositions provided in this application are principally directed to compositions which are suitable for administration to humans, it will be understood by the skilled artisan that such compositions are generally suitable for administration to animals of all sorts. Modification of compositions suitable for administration to humans in order to render the compositions suitable for administration to various animals is well understood, and the ordinarily skilled veterinary pharmacologist can design and/or perform such modification with ordinary experimentation.

A composition as described in this application can be administered in combination with one or more additional agents (e.g., therapeutically and/or prophylactically active agents). The compounds or compositions can be administered in combination with additional agents that improve their activity, improve bioavailability, improve safety, reduce drug resistance, reduce and/or modify metabolism, inhibit excretion, and/or modify distribution in a subject or cell. In certain embodiments, a composition associated with the disclosure and an additional agent show a synergistic effect.

The composition can be administered concurrently with, prior to, or subsequent to one or more additional agents, such as pharmaceutical agents. Pharmaceutical agents include therapeutically active agents. Pharmaceutical agents also include prophylactically active agents. Pharmaceutical agents include small organic molecules such as drug compounds (e.g., compounds approved for human or veterinary use by the U.S. Food and Drug Administration as provided in the Code of Federal Regulations (CFR)), peptides, proteins, carbohydrates, monosaccharides, oligosaccharides, polysaccharides, nucleoproteins, mucoproteins, lipoproteins, synthetic polypeptides or proteins, small molecules linked to proteins, glycoproteins, steroids, nucleic acids, DNAs, RNAs, nucleotides, nucleosides, oligonucleotides, antisense oligonucleotides, lipids, hormones, vitamins, and cells.

In some embodiments, one or more of the compositions described in this application are administered to a subject. In certain embodiments, the subject is an animal. The animal may be of either sex and may be at any stage of development. In certain embodiments, the subject is a human. In other embodiments, the subject is a non-human animal. In certain embodiments, the subject is a mammal. In certain embodiments, the subject is a non-human mammal. In certain embodiments, the subject is a domesticated animal, such as a dog, cat, cow, pig, horse, sheep, or goat. In certain embodiments, the subject is a companion animal, such as a dog or cat. In certain embodiments, the subject is a livestock animal, such as a cow, pig, horse, sheep, or goat. In certain embodiments, the subject is a zoo animal. In another embodiment, the subject is a research animal, such as a rodent (e.g., mouse, rat), dog, pig, or non-human primate.

Also encompassed by the disclosure are kits. The kits provided may comprise a composition and a container (e.g., a vial, ampule, bottle, syringe, and/or dispenser package, or other suitable container). In some embodiments, kits may optionally further include a second container comprising a pharmaceutical excipient for dilution or suspension of a composition or compound described in this application. In some embodiments, compositions provided in a first container and a second container are combined to form one unit dosage form.

In certain embodiments, a kit described in this application further includes instructions for using the kit. A kit described in this application may also include information as required by a regulatory agency such as the U.S. Food and Drug Administration (FDA). In certain embodiments, the information included in the kits is prescribing information.

A kit described in this application may include one or more additional agents as a separate composition.

In some embodiments, compositions associated with the disclosure include consumer products, such as comestible, cosmetic, toiletry, potable, inhalable, and wellness products. Exemplary consumer products include salves, waxes, powdered concentrates, pastes, extracts, tinctures, powders, oils, capsules, skin patches, sublingual oral dose drops, mucous membrane oral spray doses, makeup, perfume, shampoos, cosmetic soaps, cosmetic creams, skin lotions, aromatic essential oils, massage oils, shaving preparations, oils for toiletry purposes, lip balm, cosmetic oils, facial washes, moisturizing creams, moisturizing body lotions, moisturizing face lotions, bath salts, bath gels, bath soaps in liquid form, shower gels, bath bombs, hair care preparations, shampoos, conditioner, and herbal infusions.

Culturing of host cells

Any of the cells disclosed in this application can be cultured in media of any type (rich or minimal) and any composition prior to, during, and/or after contact and/or integration of a nucleic acid. The conditions of the culture or culturing process can be optimized through routine experimentation as would be understood by one of ordinary skill in the art. In some embodiments, the selected media is supplemented with various components. In some embodiments, the concentration and amount of a supplemental component is optimized. In some embodiments, other aspects of the media and growth conditions (e.g., pH, temperature, etc.) are optimized through routine experimentation. In some embodiments, the frequency that the media is supplemented with one or more supplemental components, and the amount of time that the cell is cultured, is optimized.

Culturing of the cells described in this application can be performed in culture vessels known and used in the art. In some embodiments, an aerated reaction vessel (e.g., a stirred tank reactor) is used to culture the cells. In some embodiments, a bioreactor or fermenter is used to culture the cell. Thus, in some embodiments, the cells are used in fermentation. As used in this application, the terms “bioreactor” and “fermenter” are interchangeably used and refer to an enclosure, or partial enclosure, in which a biological, biochemical and/or chemical reaction takes place that involves a living organism, part of a living organism, and/or isolated or purified enzymes. A “large-scale bioreactor” or “industrial-scale bioreactor” is a bioreactor that is used to generate a product on a commercial or quasi-commercial scale. Large scale bioreactors typically have volumes in the range of liters, hundreds of liters, thousands of liters, or more.

Non-limiting examples of bioreactors include: stirred tank fermenters, bioreactors agitated by rotating mixing devices, chemostats, bioreactors agitated by shaking devices, airlift fermenters, packed-bed reactors, fixed-bed reactors, fluidized bed bioreactors, bioreactors employing wave induced agitation, centrifugal bioreactors, roller bottles, and hollow fiber bioreactors, roller apparatuses (for example benchtop, cart-mounted, and/or automated varieties), vertically-stacked plates, spinner flasks, stirring or rocking flasks, shaken multi-well plates, MD bottles, T-flasks, Roux bottles, multiple-surface tissue culture propagators, modified fermenters, and coated beads (e.g., beads coated with serum proteins, nitrocellulose, or carboxymethyl cellulose to prevent cell attachment).

In some embodiments, the bioreactor includes a cell culture system where the host cell is in contact with moving liquids and/or gas bubbles. In some embodiments, the cell or cell culture is grown in suspension. In other embodiments, the cell or cell culture is attached to a solid phase carrier. Non-limiting examples of a carrier system include microcarriers (e.g., polymer spheres, microbeads, and microdisks that can be porous or non-porous), crosslinked beads (e.g., dextran) charged with specific chemical groups (e.g., tertiary amine groups), 2D microcarriers including cells trapped in nonporous polymer fibers, 3D carriers (e.g., carrier fibers, hollow fibers, multicartridge reactors, and semi-permeable membranes that can comprise porous fibers), microcarriers having reduced ion exchange capacity, encapsulation cells, capillaries, and aggregates. In some embodiments, carriers are fabricated from materials such as dextran, gelatin, glass, or cellulose.

In some embodiments, industrial-scale processes are operated in continuous, semi-continuous or non-continuous modes. Non-limiting examples of operation modes are batch, fed batch, extended batch, repetitive batch, draw/fill, rotating-wall, spinning flask, and/or perfusion mode of operation. In some embodiments, a bioreactor allows continuous or semi-continuous replenishment of the substrate stock, for example a carbohydrate source and/or continuous or semi-continuous separation of the product, from the bioreactor.

In some embodiments, the bioreactor or fermenter includes a sensor and/or a control system to measure and/or adjust reaction parameters. Non-limiting examples of reaction parameters include biological parameters (e.g., growth rate, cell size, cell number, cell density, cell type, or cell state, etc.), chemical parameters (e.g., pH, redox-potential, concentration of reaction substrate and/or product, concentration of dissolved gases, such as oxygen concentration and CO2 concentration, nutrient concentrations, metabolite concentrations, concentration of an oligopeptide, concentration of an amino acid, concentration of a vitamin, concentration of a hormone, concentration of an additive, serum concentration, ionic strength, concentration of an ion, relative humidity, molarity, osmolarity, concentration of other chemicals, for example buffering agents, adjuvants, or reaction byproducts), physical/mechanical parameters (e.g., density, conductivity, degree of agitation, pressure, and flow rate, shear stress, shear rate, viscosity, color, turbidity, light absorption, mixing rate, conversion rate, as well as thermodynamic parameters, such as temperature, light intensity/quality, etc.). Sensors to measure the parameters described in this application are well known to one of ordinary skill in the relevant mechanical and electronic arts. Control systems to adjust the parameters in a bioreactor based on the inputs from a sensor described in this application are well known to one of ordinary skill in the art in bioreactor engineering.
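By way of illustration only, the relationship between a sensor reading and a control action can be sketched as a simple on/off (deadband) rule for pH. The sketch below is an assumption made for this example and is not a control scheme described in this application; the function name, the setpoint of 7.0, the deadband of 0.1 and the action labels are all hypothetical.

```python
def ph_control_action(measured_ph: float, setpoint: float, deadband: float) -> str:
    """Return a hypothetical control action for one pH sensor reading.

    Inside the deadband around the setpoint nothing is dosed, which avoids
    alternating acid and base additions in response to small fluctuations.
    """
    if deadband < 0:
        raise ValueError("deadband must be non-negative")
    if measured_ph < setpoint - deadband:
        return "add_base"
    if measured_ph > setpoint + deadband:
        return "add_acid"
    return "hold"


# Hypothetical sensor readings, evaluated against an assumed setpoint of pH 7.0.
readings = [6.7, 6.95, 7.3]
actions = [ph_control_action(r, setpoint=7.0, deadband=0.1) for r in readings]
print(actions)
```

The same pattern (measure, compare against a target, act) applies to other parameters listed above, such as dissolved oxygen or temperature, although practical systems commonly use more elaborate controllers.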

In some embodiments, the method involves batch fermentation (e.g., shake flask fermentation). General considerations for batch fermentation (e.g., shake flask fermentation) include the level of oxygen and glucose. For example, batch fermentation (e.g., shake flask fermentation) may be oxygen and glucose limited, so in some embodiments, the capability of a strain to perform in a well-designed fed-batch fermentation is underestimated.

In some embodiments, the cells of the present disclosure are adapted to produce anthranilate. In some embodiments, the cells are adapted to secrete anthranilate.

In some embodiments, the cells of the present disclosure are adapted to produce methyl anthranilate. In some embodiments, the cells are adapted to secrete methyl anthranilate.

The present invention is further illustrated by the following Examples, which should not be construed as limiting. The entire contents of all of the references (including literature references, issued patents, published patent applications, and co-pending patent applications) cited throughout this application are hereby expressly incorporated by reference. If a reference incorporated in this application contains a term whose definition is incongruous or incompatible with the definition of same term as defined in the present disclosure, the meaning ascribed to the term in this disclosure shall govern. Mention of any reference, article, publication, patent, patent publication, and patent application cited in this application is not, and should not be taken as, an acknowledgment or suggestion that they constitute valid prior art or form part of the common general knowledge of a skilled artisan.

EXAMPLES

In order that the invention described in this application may be more fully understood, the following examples are set forth. The examples described in this application are offered to illustrate the systems and methods provided in this application and are not to be construed as limiting their scope.

Example 1: Development of skin commensal bacterial strains (S. epidermidis and Corynebacterium) capable of producing anthranilate

To develop skin commensal bacterial cells that are capable of producing anthranilate, S. epidermidis cells and Corynebacterium cells were modified to express various combinations of tryptophan biosynthesis genes.

Materials and methods

Competent cell preparation - S. epidermidis

Cell competency for transformation via electroporation was achieved as follows. Precultures were started from freezer stocks or single colonies of S. epidermidis in B2 media (10 g/L of casamino acids, 25 g/L of yeast extract, 1 g/L of K2HPO4, 5 g/L of glucose, 25 g/L of NaCl, pH adjusted to 7.5 with NaOH). Cultures were grown at 37°C overnight. Cultures were then diluted 1:100 (final OD600 ~0.1) into (pre-warmed) B2 media and further grown at 37°C to OD600 ~0.8 - 1.0 (3 - 4 h). Cells were collected by centrifugation at 4,200 rpm at 4°C for 15 min. The cell pellets were washed 4 times with ice-cold 10% w/v glycerol and ultimately resuspended in ice-cold 10% w/v glycerol to 1 - 5 x 10¹⁰ cells/mL or in 1/200 - 1/400 of original culture volume.
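The volumes implied by the 1:100 dilution and the 1/200 - 1/400 resuspension can be worked through with a short Python sketch. The 200 mL culture volume and the function names are hypothetical and chosen only for illustration, and the sketch assumes that 1:100 means one part preculture in one hundred parts final volume.

```python
from typing import Tuple


def inoculum_volume_ml(final_volume_ml: float, fold: float = 100.0) -> float:
    """Preculture volume for a 1:fold dilution (1 part in `fold` parts final volume)."""
    return final_volume_ml / fold


def resuspension_range_ml(culture_volume_ml: float) -> Tuple[float, float]:
    """Glycerol resuspension volumes for 1/400 and 1/200 of the culture volume."""
    return culture_volume_ml / 400.0, culture_volume_ml / 200.0


culture_ml = 200.0  # hypothetical culture volume, not stated in the protocol
inoculum = inoculum_volume_ml(culture_ml)
low, high = resuspension_range_ml(culture_ml)

print(f"overnight preculture to add: {inoculum:.1f} mL")
print(f"final resuspension volume: {low:.1f} - {high:.1f} mL of 10% w/v glycerol")
```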

The protocol was also scaled-down or miniaturized for high-throughput competent cell preparation in 24 or 96-deep well culture plates.

S. epidermidis transformation via electroporation

Cell transformation via electroporation was performed as follows. Recombinant plasmid DNA was amplified and purified from an E. coli dcm- strain or B. subtilis. 5 µL of purified plasmid DNA corresponding to 0.5 - 1 µg were added to electro-competent cells, gently mixed and incubated at room temperature for 30 minutes. Electroporation was performed by transferring 50 - 55 µL of each of the cells-DNA mixtures into 0.2 cm cuvettes (2 mm gap) and by using a Bio-Rad Gene Pulser set to 2000 V, 100 Ω, 25 µF (time constant was between 2.5 - 2.7 ms). 300 µL of pre-warmed B2 media was immediately added to the cells for recovery post-electroporation.
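As a rough plausibility check on the pulse settings above, the nominal time constant of an exponential-decay pulse can be estimated as the product of resistance and capacitance. The short Python sketch below assumes the settings are read as 100 Ω and 25 µF with a 0.2 cm gap; the function names are illustrative only and are not part of the disclosure.

```python
# Nominal pulse parameters for exponential-decay electroporation.
# Assumption: settings are read as 2000 V, 100 ohm, 25 microfarad, 0.2 cm gap.

def time_constant_ms(resistance_ohm: float, capacitance_uf: float) -> float:
    """Return the nominal RC time constant (tau = R * C) in milliseconds."""
    return resistance_ohm * capacitance_uf * 1e-6 * 1e3


def field_strength_kv_per_cm(voltage_v: float, gap_cm: float) -> float:
    """Return the nominal field strength across the cuvette gap in kV/cm."""
    return voltage_v / 1000.0 / gap_cm


tau = time_constant_ms(100, 25)
field = field_strength_kv_per_cm(2000, 0.2)
print(f"nominal time constant: {tau:.2f} ms")
print(f"nominal field strength: {field:.1f} kV/cm")
```

Under these assumptions the nominal value is 2.5 ms, close to the 2.5 - 2.7 ms range reported above. The measured time constant also depends on the conductivity of the sample, so this is an estimate rather than a prediction.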

In an alternative method when higher throughput was needed, electroporation was performed using 96-well plates and a different electroporation apparatus compatible with the 96-well plate format. 50 - 55 µL of each of the cells-DNA mixtures was added to the desired columns of the 96-well electroporation plate and pulsed using a BTX Gemini Electroporator and HT-100 High Throughput Electroporation Plate Handler (Harvard Apparatus) set to 2000 V, 100 Ω, 25 µF (time constant was between 1.5 - 1.6 ms). 50 µL of pre-warmed B2 media was immediately added to the cells in the 96-well electroporation plate for recovery post-electroporation.

The post-electroporation recovery was performed at 30°C for 4 h for pIMAY-Z/pJB38-based plasmids or other plasmids with temperature-sensitive origins, or at 37°C for 2 h for pKK30K or pUBTR plasmids that are not temperature-sensitive.

Transformants were then plated under selective conditions on BHIA plates (37 g/L of Brain Heart Infusion Broth powder (Teknova B9505), 15 g/L of Bacto Agar). Selective conditions included addition of antibiotics, e.g., Cm¹⁰ - chloramphenicol (10 µg/mL) for the pIMAY-Z/pJB38 backbone plasmid, or Km³⁰ - kanamycin (30 µg/mL) for the pKK30K backbone plasmid. Transformants were grown for 2 - 3 days at 30°C for pIMAY-Z/pJB38-based plasmids or other plasmids with temperature-sensitive origins, or at 37°C for pKK30K or pUBTR plasmids, until colonies were visible on the plates.

Genome integration in human skin commensal strains

To enable genome integration of heterologous genes, the heterologous genes were flanked by sequences homologous to regions of the host cell genome (flanking homology regions) and were inserted into recombinant plasmids. Non-limiting examples of flanking homology regions used in the present disclosure are alr1, alr2, dat1, idsA, crt, and glu.

Transformants harboring a recombinant plasmid were further manipulated as follows to achieve stable integration of the recombinant construct into the genome.

Genome integration via allelic exchange from a temperature-sensitive plasmid (e.g., pIMAY/pIMAY-Z backbone) was performed by first inducing single crossover in BHI plus Chloramphenicol (10 µg/mL) at 37°C overnight (16 - 18 h). As a temperature-sensitive plasmid, pIMAY/pIMAY-Z was unable to replicate at the non-permissive temperature of 37°C, which triggered genome integration under selective conditions. Overnight cultures were 10-fold diluted to 10⁻⁵ - 10⁻⁷ cells/mL in BHI and plated on BHIA plus Chloramphenicol (10 µg/mL) and X-Gal (100 µg/mL) at 37°C overnight. Colonies with the plasmid integrated were resistant to chloramphenicol and showed a blue coloration in the presence of X-Gal. Blue single colonies from the integration plates were re-streaked onto BHIA plus Chloramphenicol (10 µg/mL) and X-Gal (100 µg/mL) plates and incubated at 37°C overnight. Potential chromosomal integrants were screened by colony PCR to determine loss of the extrachromosomal replicating plasmid and to identify genome integration sites. Blue colonies that had lost the replicating plasmid (now integrated in the genome) were grown at 30°C in BHI without antibiotics. Bacterial cultures were 10-fold diluted to 10⁻⁴ - 10⁻⁶ cells/mL in BHI. Resulting cell suspensions were plated onto BHIA and X-Gal (100 µg/mL) and incubated at 30°C for 2 - 3 days. Desired colonies were sensitive to chloramphenicol and white in color after double crossover and excision of the plasmid backbone. White colonies were patched onto: i) BHIA and X-Gal (100 µg/mL) and ii) BHIA plus Chloramphenicol (10 µg/mL) and X-Gal (100 µg/mL) plates and incubated at 37°C overnight to confirm loss of the plasmid backbone. Chloramphenicol-sensitive white colonies were screened by colony PCR to confirm the double crossover and appropriate allelic replacement, i.e., gene knock-out or recombinant cassette integration. Integrants were further confirmed by whole genome sequencing.
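The phenotypic screen described above reduces to a small decision table on two observations per colony: color on X-Gal and growth on chloramphenicol. The following Python sketch illustrates that logic only; the class and function names are hypothetical, and in the protocol above every candidate is still confirmed by colony PCR and whole genome sequencing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Colony:
    """Phenotype observed on X-Gal and chloramphenicol plates (illustrative)."""
    blue_on_xgal: bool
    chloramphenicol_resistant: bool


def classify(colony: Colony) -> str:
    """Map a colony phenotype to the allelic-exchange stage it suggests."""
    if colony.blue_on_xgal and colony.chloramphenicol_resistant:
        # Plasmid backbone still present, e.g. integrated by single crossover.
        return "single-crossover candidate"
    if not colony.blue_on_xgal and not colony.chloramphenicol_resistant:
        # Backbone lost; PCR must still distinguish replacement from reversion.
        return "double-crossover candidate"
    return "ambiguous: re-streak and re-test"


print(classify(Colony(blue_on_xgal=True, chloramphenicol_resistant=True)))
print(classify(Colony(blue_on_xgal=False, chloramphenicol_resistant=False)))
```

A white, chloramphenicol-sensitive phenotype only indicates that the backbone was excised, which is why the PCR confirmation of the allelic replacement remains a separate step.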

Competent cell preparation - Corynebacterium spp.

Cell competency for transformation via electroporation was achieved as follows. Precultures were started from freezer stocks or single colonies and grown in BHIS (37 g/L brain heart infusion broth powder, 91 g/L sorbitol) + 0.1% TWEEN80 media at 30°C for two days. Cultures were then diluted 1:100 (final OD660 ~0.1) into BHIS + 0.1% TWEEN80 media. When the OD reached ~0.4 - 0.6 (2 - 3 hrs), ampicillin was added to a final concentration of 1.25 µg/mL and cultures were left to grow for an additional 1 - 1.5 hrs at 30°C. Cultures were then put on ice for 15 - 30 min, and centrifuged at 4000 x g at 4°C for 10 min. The cell pellets were washed 3 times with ice-cold EPB1 (20 mM HEPES, 5% v/v glycerol). At the end, the cell pellets were resuspended in EPB2 (5 mM HEPES, 15% v/v glycerol) at 1/40 of the original culture volume and were ready to be used or stored at -80°C. The protocol was also scaled-down or miniaturized for high-throughput competent cell preparation in 24 or 96-deep well culture plates.

Corynebacterium spp. transformation via electroporation

Cell transformation via electroporation was performed as follows. Recombinant plasmid DNA was amplified and purified from an E. coli dcm- strain or B. subtilis. 5 µL of purified plasmid DNA corresponding to 0.5 - 1 µg were added to 100 µL electro-competent cells thawed on ice. Electroporation was performed by transferring 100 µL of each of the cells-DNA mixtures into 1 mm cuvettes and by using a Bio-Rad Gene Pulser set to 1.8 kV, 600 Ω, 25 µF (time constant was anywhere from 5 - 15 ms). 300 µL of pre-warmed BHIB (183.3 g/L Brain Heart Infusion Broth powder (Teknova B9505), 11 g/L Glucose)-0.1% TWEEN80 media were immediately added to the cells for recovery post-electroporation. Cells were incubated for ~3 hrs at 30°C before being plated on BHIS-Agar with appropriate antibiotics, e.g., Kanamycin (20 µg/mL). Transformant colonies were visible after 2 - 3 days of growth at 30°C.

In an alternative method when higher throughput was needed, electroporation was performed in 96-well plates using BTX Gemini Electroporator and HT-100 High Throughput Electroporation Plate Handler (Harvard Apparatus).

Culturing protocol for S. epidermidis

The following cultivation protocol was performed to grow the cells to sufficient biomass before analysis of metabolites of interest via GC-MS or LC-MS.

Genetically engineered strains were pre-grown in BHI (37 g/L of Brain Heart Infusion Broth powder (Teknova B9505)) for 48 hrs at 30°C. Antibiotics, e.g., Cm¹⁰ - chloramphenicol (10 µg/mL) for pIMAY-Z/pJB38, or Km³⁰ - kanamycin (30 µg/mL) for pKK30K, were supplemented for recombinant plasmid selection when appropriate. Antibiotics were not necessary when recombinant constructs were stably integrated into the genome of the host cell. Cultures were then diluted 1:50 in production media (see recipes below) and grown at 30°C for 24 or 48 hrs using either sealed 96-deep well plates or 1.8 mL glass vials in a 54VT vial tray. 500 µL ethyl acetate plus 100 µM tridecane were added to 500 µL of production culture for product extraction while shaking at 1600 rpm for 60 min at room temperature. The aqueous phase was separated from the ethyl acetate phase by centrifugation for 5 min at 2000 - 4000 rpm. 200 - 300 µL of the ethyl acetate phase was transferred to autosampler vials for GC-MS analysis. Alternatively, whole cell broths were extracted and analyzed by LC-MS.

Production media tested:

BHI (37 g/L of Brain Heart Infusion Broth powder (Teknova B9505));

TSB (30 g/L Bacto Tryptic soy broth);

Sweat media (20.9 g/L MOPS (Fisher, BP308500), 1 g/L yeast extract (Thermo Fisher, BP1422-2), 2 g/L NaCl (Fisher, BP3581), 0.65 mg/L cod methyl ester fatty acids (Sigma, C5650-5G), 0.1 g/L Tween 80 (Thermo Fisher, BP338-500), pH 6.0); and

Sweat-TSB (same as sweat media with the addition of 7.5 g/L of tryptic soy broth).
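For bench preparation at volumes other than one litre, the per-litre recipes above can be scaled linearly. The Python sketch below is only an illustration: the dictionary and function names are hypothetical, the concentrations are copied from the sweat media and Sweat-TSB entries above, and the pH adjustment to 6.0 is not modeled.

```python
# Component concentrations in g/L, copied from the recipes listed above.
SWEAT_MEDIA_G_PER_L = {
    "MOPS": 20.9,
    "yeast extract": 1.0,
    "NaCl": 2.0,
    "cod methyl ester fatty acids": 0.00065,  # listed above as 0.65 mg/L
    "Tween 80": 0.1,
}

# Sweat-TSB is the same recipe plus 7.5 g/L tryptic soy broth.
SWEAT_TSB_G_PER_L = {**SWEAT_MEDIA_G_PER_L, "tryptic soy broth": 7.5}


def scale_recipe(recipe_g_per_l: dict, volume_l: float) -> dict:
    """Return grams of each component needed for the requested volume."""
    if volume_l <= 0:
        raise ValueError("volume_l must be positive")
    return {name: conc * volume_l for name, conc in recipe_g_per_l.items()}


for name, grams in scale_recipe(SWEAT_TSB_G_PER_L, 0.25).items():
    print(f"{name}: {grams:.5f} g")
```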

Culturing protocol for Corynebacterium

The following cultivation protocol was performed to grow the cells to sufficient biomass before analysis of metabolites of interest via GC-MS or LC-MS.

Cultivation of Corynebacterium spp. strains was conducted using the same protocol as for S. epidermidis. BHI was replaced with BHIS (37 g/L of Brain Heart Infusion Broth powder, 91.67 g of sorbitol) and 0.1% Tween80. Antibiotics were supplemented for plasmid selection when appropriate, e.g., Km²⁰ - kanamycin (20 µg/mL). Antibiotics were not necessary when recombinant constructs were stably integrated into the genome of the host cell. For IPTG-inducible promoters, 1 mM of isopropylthio-β-galactoside (IPTG) was supplemented to the cultures to activate recombinant gene expression.

Production media tested:

BHIS (37 g/L of Brain Heart Infusion Broth powder (Teknova B9505), 91.67 g of sorbitol) plus 0.1% Tween80 (Sigma P4780); and

TSB (30 g/L Bacto Tryptic soy broth) plus 0.1% Tween80 (Sigma P4780).

Gas chromatography-mass spectrometry

From the ethyl acetate sample extracts, 2 µL of each sample was injected onto a Thermo Scientific TraceGOLD™ TG-WaxMS A GC column (0.25 mm diameter, 30 m length, 0.25 µm film thickness). Upon injection, the column was held at 70.0°C for 1.00 minute, before the temperature was increased to 100.0°C at a rate of 4.0°C/min. From there, the rate was increased to 50.0°C/min until reaching a temperature of 175.0°C, at which point the rate was decreased to 30.0°C/min until reaching a temperature of 250.0°C. The column was held at this temperature for 1.5 minutes, before cooling down to prepare for the next injection. Samples were detected using a single quadrupole mass spectrometer, in a mass range of 30 - 300 amu. Detector response was corrected to the response of the tridecane internal standard, and then concentrations were calculated by comparing the response rate to the response rate of a 50 µg/L to 50 mg/L calibration curve.
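The oven program above can be checked by summing the hold times and the duration of each ramp (temperature change divided by ramp rate). The Python sketch below does that arithmetic; the function name and argument layout are illustrative, and the cool-down before the next injection is not included because no duration is given for it.

```python
def program_minutes(start_temp_c, initial_hold_min, ramps, final_hold_min):
    """Sum hold times and ramp durations for a GC oven temperature program.

    ramps is a sequence of (rate in deg C per min, target temperature in deg C).
    """
    total = initial_hold_min
    temp = start_temp_c
    for rate_c_per_min, target_c in ramps:
        total += (target_c - temp) / rate_c_per_min
        temp = target_c
    return total + final_hold_min


# Values taken from the oven program described above.
ramps = [(4.0, 100.0), (50.0, 175.0), (30.0, 250.0)]
run_time = program_minutes(70.0, 1.0, ramps, 1.5)
print(f"oven program length: {run_time:.2f} min")
```

With the stated values the three ramps take 7.5, 1.5 and 2.5 minutes, giving a program length of 14.0 minutes per injection before cool-down.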

Liquid chromatography-mass spectrometry

Whole cell broth samples were extracted into a 50% methanol solution containing heavy labeled versions of tryptophan and anthranilate. The volume of the methanol solution added was equivalent to the volume of the whole cell broth. Samples were inverted to mix, and then placed in a -30°C freezer for an hour. The two phases were separated by centrifugation at 4000 rpm for 10 minutes. The top layer of this separation was diluted to place the expected concentration of anthranilate within the linear range of the calibration curve, 1 µg/L to 5 mg/L. If the expected concentration was unknown, then the top layer of the separation was diluted at least 5X. Samples were injected onto a liquid chromatography-mass spectrometry system with an Accucore™ PFP HPLC Column (100 mm length, 2.1 mm diameter, 2.6 µm particle size). At injection, the mobile phase was 90.0% water with 0.1% formic acid and 10.0% acetonitrile at a flow rate of 0.50 mL/min. This mobile phase was held for 0.50 minutes, at which time a 2.50 minute gradient was started, producing a final mixture of 50.0% aqueous and 50.0% organic. This mixture was held for 0.75 minutes, before returning to the starting conditions of 90.0% aqueous and 10.0% organic, which was held for 1.00 minute before the next injection. Detection occurred using an orbitrap mass spectrometer, detecting a mass range from 100 to 600 m/z. The response rates of the analytes were corrected to the response of the heavy labeled internal standards. This response was used to calculate the concentration of the analytes, by comparison to the response rates from the 1 µg/L to 5 mg/L standard curve.
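The internal-standard quantification described above can be sketched as a linear calibration on the ratio of analyte response to internal-standard response. The Python example below is a minimal illustration under stated assumptions: a straight-line fit is assumed for the calibration, the function names are hypothetical, and all peak areas and standard responses are invented numbers chosen for the example, not data from this disclosure.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(analyte_area, istd_area, slope, intercept, dilution_factor=1.0):
    """Convert a response ratio to a concentration in the undiluted sample."""
    ratio = analyte_area / istd_area
    return (ratio - intercept) / slope * dilution_factor


# Hypothetical standards spanning 1 ug/L to 5 mg/L (values in ug/L).
standard_conc = [1.0, 10.0, 100.0, 1000.0, 5000.0]
standard_ratio = [0.002, 0.02, 0.2, 2.0, 10.0]  # invented response ratios

slope, intercept = fit_line(standard_conc, standard_ratio)
# Hypothetical sample measured after a 5X dilution of the top layer.
conc = quantify(5000.0, 10000.0, slope, intercept, dilution_factor=5.0)
print(f"estimated anthranilate: {conc:.1f} ug/L")
```

Here `dilution_factor` is assumed to carry every dilution applied between the broth and the injected sample; how the 1:1 methanol extraction is accounted for is not specified above, so that choice is left to the caller.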

Results

In general, skin commensal bacteria were observed to be recalcitrant to genetic manipulation and to functional expression of heterologous genes and pathways, with many strains not transformable. Additionally, some strains were successfully transformed and sequence verified, but the target heterologous genes of interest were not expressed and no significant heterologous production was observed. However, despite these challenges associated with genetic manipulation of skin commensal strains, strains of S. epidermidis and Corynebacterium spp. were successfully modified to express the target heterologous genes of interest and were capable of producing anthranilate.

Transformation and cultivation protocols were miniaturized, as described in the Materials and methods section, for high-throughput and automated screening of both strain libraries and different recombinant constructs. This high-throughput and automated screening enabled the exploration of a large design space and provided a means for addressing the recalcitrant nature of skin commensal strains and the different performance across species/strains.

Constructs expressing different combinations of genes encoding enzymes, promoters, terminators and operon topologies were evaluated in different cultivation and product extraction conditions, producing approximately 1056 samples.

S. epidermidis strains were modified to express various combinations of tryptophan pathway genes on a pIMAY-Z backbone. Strain IDs and specific modifications within each strain are listed in Table 4 below.

Table 4: S. epidermidis strain modification for production of anthranilate

Strains corresponding to IDs tl208304, tl208305, tl208308, tl208309, and tl208310 produced anthranilate, as shown in Table 5 and FIG. 2. Anthranilate production in S. epidermidis strain tl208309 was determined by a liquid chromatogram relative to a wildtype strain that did not produce anthranilate and relative to a chemical standard mixture (FIGs. 7A-7C). Anthranilate production in Corynebacterium strain tl223296 was likewise determined by a liquid chromatogram relative to a wildtype strain that did not produce anthranilate and relative to a chemical standard mixture (FIGs. 8A-8C).

Table 5: Anthranilate titers in modified S. epidermidis strains

Corynebacterium strains were modified to express various combinations of tryptophan pathway genes on a pCgV1 backbone. Strain IDs and specific modifications are listed in Table 6. Anthranilate production in Corynebacterium tuberculostearicum strains is shown in Table 7 and FIG. 3, anthranilate production in Corynebacterium ammoniagenes strains is shown in Table 8 and FIG. 4, and anthranilate production in Corynebacterium mucifaciens strains is shown in Table 9 and FIG. 5.

Table 6: Corynebacterium strain modification for production of anthranilate

tl240060; Anthranilate; trpE fbr S40F + aroG D146N

Table 7: Anthranilate titers in modified Corynebacterium tuberculostearicum

Table 8: Anthranilate titers in modified C. ammoniagenes

Table 9: Anthranilate titers in modified C. mucifaciens

Example 2: Development of skin commensal bacterial strains (Corynebacterium and S. epidermidis) capable of producing methyl anthranilate

To develop a bacterial strain capable of producing methyl anthranilate, Corynebacterium strains were modified to express various combinations of tryptophan pathway genes on a pCgV1 backbone to accumulate a tryptophan precursor, and to express one or more anthranilate methyltransferases, and were cultured as described in Example 1.

To develop additional bacterial strains capable of producing methyl anthranilate, S. epidermidis strains were modified to express various combinations of tryptophan pathway genes on a pIMAY-Z plasmid backbone to accumulate tryptophan precursor, and to express one or more anthranilate methyltransferases and cultured as described in Example 1.

As discussed above in Example 1, in general, skin commensal bacteria were observed to be recalcitrant to genetic manipulation and to functional expression of heterologous genes and pathways, with many strains not transformable. Additionally, some strains were successfully transformed and sequence verified, but the target heterologous genes of interest were not expressed and no significant heterologous production was observed. However, despite these challenges associated with genetic manipulation of skin commensal strains, strains of Corynebacterium spp. and S. epidermidis were successfully modified to express the target heterologous genes of interest and were capable of producing methyl anthranilate.

Transformation and cultivation protocols were miniaturized, as described in Example 1, for high-throughput and automated screening of both strain libraries and different recombinant constructs. This high-throughput and automated screening enabled the exploration of a large design space and provided a means for addressing the recalcitrant nature of skin commensal strains and the different performance across species/strains.

Strain IDs and specific modifications for Corynebacterium spp. are listed in Table 10. Anthranilate production in Corynebacterium ammoniagenes strains is shown in Table 11 and FIG. 6, methyl anthranilate production in Corynebacterium ammoniagenes strains is shown in Table 12 and FIG. 6, and tryptophan production in Corynebacterium ammoniagenes strains is shown in Table 13 and FIG. 6. Strain tl240029, which expressed the genes TrpE fbr S40R and methyltransferase ANMT-Rg (Ruta graveolens) on a pCgV1 backbone, produced methyl anthranilate (Table 12 and FIG. 6).

Table 10: Corynebacterium strain modification for production of methyl anthranilate

Table 11: Anthranilate titers in modified C. ammoniagenes

Table 12: Methyl anthranilate titers in modified C. ammoniagenes

Table 13: Tryptophan titers in modified C. ammoniagenes

Strain IDs and specific modifications for S. epidermidis are listed in Table 14.

Strain tl254848 of S. epidermidis, which expressed the genes trpE fbr S40F and ANMT-Rg on a pIMAY-Z plasmid backbone, produced methyl anthranilate (Table 15 and FIG. 9).

Strain tl275357 of S. epidermidis, which expressed the genes trpE fbr S40F and ANMT-Rg from genomic integration at the alr1 locus, produced methyl anthranilate (Table 16 and FIG. 10).

Table 14: S. epidermidis strain modification for production of methyl anthranilate

Table 15: Methyl anthranilate titers in modified S. epidermidis

Table 16: Methyl anthranilate titers in modified S. epidermidis

Table 17: Sequences Associated with the Disclosure

It should be appreciated that sequences disclosed in this application may or may not contain signal sequences. The sequences disclosed in this application encompass versions with or without signal sequences. It should also be understood that protein sequences disclosed in this application may be depicted with or without a start codon (M). The sequences disclosed in this application encompass versions with or without start codons. Accordingly, in some instances amino acid numbering may correspond to protein sequences containing a start codon, while in other instances, amino acid numbering may correspond to protein sequences that do not contain a start codon. It should also be understood that sequences disclosed in this application may be depicted with or without a stop codon. The sequences disclosed in this application encompass versions with or without stop codons. Aspects of the disclosure encompass host cells comprising any of the sequences described in this application and fragments thereof.

ADDITIONAL REFERENCES

Grosser, Melinda R., and Anthony R. Richardson. "Method for preparation and electroporation of S. aureus and S. epidermidis." The Genetic Manipulation of Staphylococci. Humana Press, New York, NY, 2014. 51-57.

Augustin, Johannes, and Friedrich Götz. "Transformation of Staphylococcus epidermidis and other staphylococcal species with plasmid DNA by electroporation." FEMS microbiology letters 66.1-3 (1990): 203-207.

David Dodds, Jeffrey L. Bose, Ming-De Deng, Gilles R. Dube, Trudy H. Grossman, Alaina Kaiser, Kashmira Kulkarni, Roger Leger, Sara Mootien-Boyd, Azim Munivar, Julia Oh, Matthew Pestrak, Komal Rajpura, Alexander P. Tikhonov, Traci Turecek, Travis Whitfill. "Controlling the Growth of the Skin Commensal Staphylococcus epidermidis Using D-Alanine Auxotrophy." American Society for Microbiology (2020) 5:3.

EQUIVALENTS

Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described here. Such equivalents are intended to be encompassed by the following claims.

All references, including patent documents, are incorporated by reference in their entirety.
