Title:
INDUCIBLE PROMOTERS
Document Type and Number:
WIPO Patent Application WO/2023/205693
Kind Code:
A2
Abstract:
Provided herein are nucleic acid constructs that comprise an inducible promoter. Dual expression systems are provided comprising two nucleic acid constructs or a single nucleic acid construct with two inducible promoters. Also provided are methods of expressing transcripts by transforming a nucleic acid construct described herein into a prokaryotic cell and contacting the prokaryotic cell with an inducer. Methods of producing a carotenoid are also disclosed herein.

Inventors:
CARRILLO RINCÓN ANDRÉS (US)
FARNY NATALIE (US)
Application Number:
PCT/US2023/065954
Publication Date:
October 26, 2023
Filing Date:
April 19, 2023
Assignee:
WORCESTER POLYTECH INST (US)
International Classes:
C12P23/00; C12N15/75
Attorney, Agent or Firm:
FAYERBERG, Roman (US)
Claims:
CLAIMS

What is claimed is:

1. A nucleic acid construct comprising a modified inducible promoter, wherein the modified inducible promoter comprises:

(a) a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the promoter;

(b) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the promoter; and

(c) a nucleic acid sequence that encodes a bacterial ribosome binding sequence.
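
By way of illustration only, the two sequence elements recited in claim 1 can be located in a candidate promoter sequence with a short script. The following is a minimal sketch, not a method of the disclosure: the function name `find_core_promoter` and the 15 to 19 nucleotide spacer window between the two elements are assumptions introduced for this example and are not claim limitations.

```python
import re

MINUS_35 = "TTGACA"    # SEQ ID NO: 2, recited at position -35
MINUS_10 = "TATAATGT"  # SEQ ID NO: 44, recited at position -10


def find_core_promoter(seq, min_spacer=15, max_spacer=19):
    """Return 0-based start indices (minus_35, minus_10) of the first pair found.

    The spacer window is an illustrative assumption, not part of the claims.
    Returns None when no pair with an allowed spacer is present.
    """
    pattern = re.compile(
        f"({MINUS_35})[ACGT]{{{min_spacer},{max_spacer}}}({MINUS_10})"
    )
    match = pattern.search(seq.upper())
    if match is None:
        return None
    return match.start(1), match.start(2)


# Hypothetical test sequence: both elements separated by a 17-nt spacer.
example = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "CC"
print(find_core_promoter(example))
```

The returned indices are offsets into the input string; mapping them onto positions relative to a transcriptional start site would require knowing where that site lies in the sequence.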

2. The nucleic acid construct of claim 1, wherein the nucleic acid construct further comprises a transgene.

3. The nucleic acid construct of claim 1, wherein the modified inducible promoter, when operatively linked to a transgene, facilitates expression of the transgene when the nucleic acid construct is inserted into a cell.

4. The nucleic acid construct of claim 3, wherein the expression of the transgene in the absence of an inducer is lower than an amount of expression in the absence of the inducer of the transgene operatively coupled to a weak promoter in a comparable nucleic acid construct.

5. The nucleic acid construct of claim 3, wherein the expression of the transgene in the presence of an inducer is at least equal to an amount of expression in the presence of the inducer of the transgene operatively coupled to the strong promoter in the comparable nucleic acid construct.

6. The nucleic acid construct of claim 2, wherein the transgene is selected from the group consisting of: a crtW gene, a crtE gene, a crtY gene, a crtI gene, a crtZ gene, a crtEB gene, a crtEBI gene, a crtEBIY gene, a crtEBIYZ gene, a crtEBI-YZW gene, an ABA1 gene, an ABA2 gene, and a CocE gene.

7. The nucleic acid construct of claim 1, wherein the nucleic acid construct further comprises a sequence encoding a reporter gene.

8. The nucleic acid construct of claim 1, wherein the nucleic acid construct further comprises one or more regulatory elements.

9. The nucleic acid construct of claim 1, wherein the nucleic acid construct further comprises a terminator sequence.

10. A nucleic acid construct comprising a sequence that is at least 85% identical to a sequence comprising any one of: SEQ ID NOS: 1-14, 39-73.
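
As a non-limiting illustration of the "at least 85% identical" language of claim 10, percent identity can be computed over two sequences that have already been aligned to the same length. This is a simplified sketch under stated assumptions: the claim does not specify an alignment method, the helper names `percent_identity` and `meets_threshold` are hypothetical, and gaps are not handled.

```python
def percent_identity(a, b):
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_threshold(a, b, threshold=85.0):
    """True when the two sequences are at least `threshold` percent identical."""
    return percent_identity(a, b) >= threshold


# One mismatch in eight positions gives 87.5% identity.
print(percent_identity("TATAATGT", "TATAATGA"))
```

In practice the result depends on how the alignment is produced and on which sequence length is used as the denominator, so both choices would need to be fixed before comparing against the threshold.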

11. A nucleic acid construct comprising:

(a) a first modified inducible promoter sequence, wherein the first modified inducible promoter sequence comprises the modified inducible promoter of any one of claims 1 to 10; and

(b) a second modified inducible promoter sequence, wherein the second modified inducible promoter sequence comprises:

(i) a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the second modified inducible promoter sequence;

(ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the second modified inducible promoter sequence; and

(iii) a nucleic acid sequence that encodes a second bacterial ribosome binding sequence.

12. The nucleic acid construct of claim 11, wherein the nucleic acid construct further comprises a transgene.

13. The nucleic acid construct of claim 11, wherein the modified inducible promoter, when operatively linked to a transgene, facilitates expression of the transgene when the nucleic acid construct is in the presence of an inducer.

14. The nucleic acid construct of claim 11, further comprising a terminator sequence.

15. The nucleic acid construct of claim 11, further comprising one or more regulatory elements.

16. The nucleic acid construct of claim 11, wherein the transgene comprises a carotenoid biosynthesis pathway gene.

17. The nucleic acid construct of claim 16, wherein the transgene is selected from the group consisting of: a crtW gene, a crtE gene, a crtY gene, a crtI gene, a crtZ gene, a crtEB gene, a crtEBI gene, a crtEBIY gene, a crtEBIYZ gene, a crtEBI-YZW gene, an ABA1 gene, an ABA2 gene, and a CocE gene.

18. A composition comprising two or more of a nucleic acid construct of any one of claims 1 to 17.

19. A cell comprising the nucleic acid construct of any one of claims 1 to 17 or the composition of claim 18.

20. The cell of claim 19, wherein the cell is a prokaryotic cell.

21. The cell of claim 20, wherein the prokaryotic cell is a bacterial cell.

22. The cell of claim 21, wherein the bacterial cell is a Gram-negative bacterial cell.

23. The cell of claim 22, wherein the Gram-negative bacterial cell is a P. putida bacterium or a V. natriegens bacterium.

24. The cell of claim 21, wherein the bacterial cell is a Gram-positive bacterial cell.

25. The cell of claim 19, wherein the cell is a eukaryotic cell.

26. The cell of claim 19, wherein the cell is a mammalian cell.

27. A cell-free system comprising the nucleic acid construct of any one of claims 1 to 17 or the composition of claim 18; and an RNA polymerase.

28. The cell-free system of claim 27, further comprising a ribosome, an aminoacyl transfer RNA, a translation factor, and a buffer.

29. A method of expressing a protein encoded by a transgene in a cell, the method comprising:

(a) transforming a cell with a nucleic acid construct, wherein the nucleic acid construct comprises a modified inducible promoter that comprises:

(i) a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the modified inducible promoter;

(ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the modified inducible promoter;

(iii) a nucleic acid that encodes a bacterial ribosome binding sequence; and

(iv) a transgene; and

(b) contacting the cell with an inducer, thereby expressing the protein encoded by the transgene.

30. The method of claim 29, wherein the cell is a prokaryotic cell.

31. The method of claim 30, wherein the prokaryotic cell is a bacterium.

32. The method of claim 29, wherein the cell is a eukaryotic cell.

33. The method of claim 31, wherein the bacterium is a P. putida bacterium or a V. natriegens bacterium.

34. The method of any one of claims 29 to 33, wherein the cell does not comprise a T7 promoter or a T7 polymerase.

35. The method of claim 34, wherein when in the absence of the inducer, the expression of the protein is lower than the expression of the protein in a comparable cell comprising the transgene operably linked to a T7 promoter.

36. The method of any one of claims 31 to 35, wherein when in the presence of the inducer, the expression of the protein is greater than the expression of the protein in a comparable cell comprising the transgene operably linked to a T7 promoter.

37. A non-naturally occurring organism comprising: a nucleic acid construct comprising:

(a) a first inducible promoter sequence comprising:

(i) a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the first inducible promoter sequence;

(ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the first inducible promoter sequence; and

(iii) a nucleic acid sequence that encodes a first bacterial ribosome binding sequence;

(b) one or more of a biosynthesis pathway transgene; and

(c) a second modified inducible promoter sequence comprising:

(i) a TATAATGT (SEQ ID NO: 44) consensus sequence at position -10, relative to a transcriptional start site of the second modified inducible promoter sequence;

(ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the second modified inducible promoter sequence; and

(iii) a nucleic acid sequence that encodes a second bacterial ribosome binding sequence.

38. The non-naturally occurring organism of claim 37, wherein the one or more biosynthesis pathway transgene is selected from the group consisting of: a crtW gene, a crtE gene, a crtY gene, a crtI gene, a crtZ gene, a crtEB gene, a crtEBI gene, a crtEBIY gene, a crtEBIYZ gene, a crtEBI-YZW gene, and a CocE gene.

39. The non-naturally occurring organism of claim 37 or claim 38, wherein the non-naturally occurring organism comprises a bacterium.

40. The non-naturally occurring organism of claim 39, wherein the bacterium is a Gram-negative bacterium.

41. The non-naturally occurring organism of claim 40, wherein the Gram-negative bacterium is a P. putida bacterium or a V. natriegens bacterium.

42. The non-naturally occurring organism of claim 39, wherein the bacterium is a Gram-positive bacterium.

43. The non-naturally occurring organism of claim 37 or claim 38, wherein the non-naturally occurring organism comprises a eukaryotic cell.

44. The non-naturally occurring organism of any one of claims 37 to 43, wherein the nucleic acid construct comprises a terminator sequence.

45. The non-naturally occurring organism of any one of claims 37 to 44, wherein the nucleic acid construct comprises one or more regulatory elements.

46. The non-naturally occurring organism of any one of claims 37 to 45, wherein the nucleic acid construct comprises a reporter gene.

47. A composition comprising the non-naturally occurring organism of any one of claims 37 to 46; and an inducer.

48. The composition of claim 47, wherein the inducer comprises: anhydrotetracycline (aTc), isopropyl β-D-1-thiogalactopyranoside, cocaine, metallothionine, ecdysone, an antibiotic agent, galactose, a steroid, or a divalent cation.

49. A method for producing a protein, the method comprising: culturing the non-naturally occurring organism of any one of claims 37 to 46 for a period of time; and contacting the non-naturally occurring organism with an inducer, thereby producing the protein.

50. A method for producing a carotenoid, the method comprising: culturing the non-naturally occurring organism of any one of claims 37 to 46 for a period of time; and contacting the non-naturally occurring organism with an inducer, thereby producing the carotenoid.

51. A carotenoid produced by the method of claim 50, wherein the carotenoid is lycopene, beta-carotene, zeaxanthin, canthaxanthin, or astaxanthine.

52. A method for producing benzoic acid, the method comprising: culturing the cell of any one of claims 21 to 27 or the non-naturally occurring organism of any one of claims 37 to 46 for a period of time; and contacting the cell or the non-naturally occurring organism with an inducer, thereby producing benzoic acid.

53. A composition comprising benzoic acid, wherein the benzoic acid is produced by the method of claim 52.

54. A kit comprising the nucleic acid construct of any one of claims 1 to 19, packaging, buffers, and materials therefor.

55. A kit comprising the cell of any one of claims 21 to 27, packaging, culture medium, buffers, and materials therefor.

56. A kit comprising the non-naturally occurring organism of any one of claims 37 to 46, packaging, culture medium, buffers, and materials therefor.

57. A nucleic acid construct comprising a modified inducible promoter, wherein the modified inducible promoter comprises:

(a) a TATAAT consensus sequence at position -10, relative to a transcriptional start site of the promoter;

(b) a TTGACA sequence at position -35, relative to a transcriptional start site of the promoter;

(c) a bacterial ribosome binding sequence; wherein the modified inducible promoter, when operatively linked to a transgene, facilitates expression of the transgene when the nucleic acid construct is inserted into a prokaryotic cell in the presence of an inducer; and wherein:

(i) the expression of the transgene in the absence of the inducer is less than an amount of expression in the absence of the inducer of the transgene operatively coupled to a weak promoter in a comparable nucleic acid construct; and

(ii) the expression of the transgene in the presence of the inducer is at least equal to an amount of expression in the presence of the inducer of the transgene operatively coupled to the strong promoter in the comparable nucleic acid construct.
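
The two functional limitations of claim 57 amount to a pair of comparisons between measured expression levels. A minimal sketch of that comparison follows; the function name and the numeric values are hypothetical and serve only to show the logic, and the units of expression are left to whatever assay is used.

```python
def meets_expression_criteria(uninduced, induced, weak_uninduced, strong_induced):
    """Check the two comparisons recited in claim 57.

    (i)  uninduced expression is less than that of a weak-promoter comparator
         measured without inducer; and
    (ii) induced expression is at least equal to that of a strong-promoter
         comparator measured with inducer.
    """
    low_leak = uninduced < weak_uninduced
    full_output = induced >= strong_induced
    return low_leak and full_output


# Hypothetical measurements in arbitrary units.
print(meets_expression_criteria(2.0, 110.0, 5.0, 100.0))
```

The sketch treats each measurement as a single number; with replicate measurements, a statistical comparison would be needed instead of a direct inequality.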

58. An isolated prokaryotic cell comprising the nucleic acid construct of claim 57.

59. A method of expressing a transgene in a prokaryotic cell that does not comprise a T7 RNA polymerase, the method comprising:

(a) transforming a nucleic acid construct into the prokaryotic cell, wherein the nucleic acid construct comprises a modified inducible promoter that comprises:

(i) a TATAAT consensus sequence at position -10, relative to a transcriptional start site of the promoter;

(ii) a TTGACA sequence at position -35, relative to a transcriptional start site of the promoter; and

(iii) a bacterial ribosome binding sequence; and

(b) contacting the prokaryotic cell with an inducer, thereby expressing the transgene; wherein

(i) the expression of the transgene in the absence of the inducer is less than an amount of expression in the absence of the inducer of the transgene operatively coupled to a T7 promoter in a comparable nucleic acid construct; and

(ii) the expression of the transgene in the presence of the inducer is greater than an amount of expression in the presence of the inducer of the transgene operatively coupled to the T7 promoter in the comparable nucleic acid construct.

60. A nucleic acid construct comprising a modified inducible promoter, wherein the modified inducible promoter comprises:

(a) a TXTXXTGT (SEQ ID NO: 45) sequence at position -10, relative to a transcriptional start site of the promoter;

(b) a TXGXCX (SEQ ID NO: 46) sequence at position -35, relative to a transcriptional start site of the promoter; and

(c) a nucleic acid sequence that encodes a bacterial ribosome binding sequence.

61. A nucleic acid construct comprising a modified inducible promoter, wherein the modified inducible promoter comprises:

(a) a TXTXXT (SEQ ID NO: 47) sequence at position -10, relative to a transcriptional start site of the promoter;

(b) a TXGXCX (SEQ ID NO: 46) sequence at position -35, relative to a transcriptional start site of the promoter; and

(c) a nucleic acid sequence that encodes a bacterial ribosome binding sequence.
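
Claims 60 and 61 recite degenerate motifs, and the specification states that X is any nucleobase. As an illustrative sketch only, such a motif can be translated into a regular expression and tested against a candidate sequence. The helper names below are hypothetical, and restricting X to the four DNA bases A, C, G and T is an assumption made for this example.

```python
import re


def motif_to_regex(motif):
    """Compile a motif in which X stands for any of A, C, G or T (assumed)."""
    return re.compile(
        "".join("[ACGT]" if ch == "X" else ch for ch in motif.upper())
    )


MINUS_10_LONG = motif_to_regex("TXTXXTGT")  # SEQ ID NO: 45
MINUS_10_SHORT = motif_to_regex("TXTXXT")   # SEQ ID NO: 47
MINUS_35 = motif_to_regex("TXGXCX")         # SEQ ID NO: 46


def matches(motif_regex, candidate):
    """True when the whole candidate string fits the degenerate motif."""
    return motif_regex.fullmatch(candidate.upper()) is not None


# The fixed sequences of claim 1 are instances of the degenerate motifs.
print(matches(MINUS_10_LONG, "TATAATGT"), matches(MINUS_35, "TTGACA"))
```

Under this reading, TATAATGT (SEQ ID NO: 44) and TTGACA (SEQ ID NO: 2) both fall within the corresponding degenerate motifs, whereas a sequence such as TTTACA does not fit TXGXCX because its third base is not G.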

62. Any nucleic acid, nucleic acid construct, composition, cell, non-naturally occurring organism, cell-free system, kit, or method provided herein.

Description:
INDUCIBLE PROMOTERS

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of, and priority to, U.S. Provisional Patent Application No. 63/332,507 filed April 19, 2022, and U.S. Utility Patent Application No. 18/136,660 filed April 19, 2023, the entirety of each of which is incorporated herein by reference.

BACKGROUND

[0002] Inducible promoters are ubiquitous biotechnology tools for manufacturing proteins, providing molecular models of biosynthesis pathways, and as synthetic switches for a variety of environmental, physiological, and cellular tools. Inducible promoters have a consistent architecture including two key elements: the operator region recognized by transcriptional regulatory proteins and consensus sequences that recruit the sigma (σ) subunits of RNA polymerase to initiate transcription of the inducible gene. Despite their widespread use, leaky transcription in the “OFF” state remains a challenge for inducible promoters. Therefore, improved inducible promoters, cellular systems, and methods of generating proteins are needed to enable advances in protein production and various biotechnology applications.

BRIEF SUMMARY

[0003] Provided herein are nucleic acid constructs, wherein the nucleic acid constructs comprise: a modified inducible promoter, wherein the modified inducible promoter comprises: (a) a TXTXXTGT (SEQ ID NO: 45) sequence at position -10, relative to a transcriptional start site of the promoter; (b) a TXGXCX (SEQ ID NO: 46) sequence at position -35, relative to a transcriptional start site of the promoter; and (c) a nucleic acid sequence that encodes a bacterial ribosome binding sequence, wherein X is any nucleobase. Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs comprise: a modified inducible promoter, wherein the modified inducible promoter comprises: (a) a TATAAT (SEQ ID NO: 1) sequence at position -10, relative to a transcriptional start site of the promoter; (b) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the promoter; and (c) a nucleic acid sequence that encodes a bacterial ribosome binding sequence. In some embodiments, the nucleic acid construct further comprises a transgene. In some embodiments, the modified inducible promoter, when operatively linked to a transgene, facilitates expression of the transgene when the nucleic acid construct is inserted into a prokaryotic cell in the presence of an inducer; and wherein: (i) the expression of the transgene in the absence of the inducer is less than an amount of expression in the absence of the inducer of the transgene operatively coupled to a weak promoter in a comparable nucleic acid construct; and (ii) the expression of the transgene in the presence of the inducer is at least equal to an amount of expression in the presence of the inducer of the transgene operatively coupled to the strong promoter in the comparable nucleic acid construct. 
Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs comprise: a modified inducible promoter, wherein the modified inducible promoter comprises: (a) a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the promoter; (b) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the promoter; and (c) a nucleic acid sequence that encodes a bacterial ribosome binding sequence. In some embodiments, the nucleic acid construct further comprises a transgene. In some embodiments, the modified inducible promoter, when operatively linked to a transgene, facilitates expression of the transgene when the nucleic acid construct is inserted into a prokaryotic cell in the presence of an inducer; and wherein: (i) the expression of the transgene in the absence of the inducer is less than an amount of expression in the absence of the inducer of the transgene operatively coupled to a weak promoter in a comparable nucleic acid construct; and (ii) the expression of the transgene in the presence of the inducer is at least equal to an amount of expression in the presence of the inducer of the transgene operatively coupled to the strong promoter in the comparable nucleic acid construct. In some embodiments, a strong promoter increases the amount of expression of a transgene provided herein relative to a comparable inducible promoter that does not comprise SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 44, SEQ ID NO: 45, or SEQ ID NO: 46. In some embodiments, the strong promoter increases the amount of expression of a transgene provided herein by at least 10% relative to the expression of a transgene expressed by a comparable inducible promoter that does not comprise SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 44, SEQ ID NO: 45, or SEQ ID NO: 46.

[0004] Provided herein are methods of expressing a transgene in a cell, the methods comprising: (a) transforming a nucleic acid construct into the cell, wherein the nucleic acid construct comprises a modified inducible promoter, wherein the modified inducible promoter comprises: (i) a TATAAT (SEQ ID NO: 1) sequence or a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the promoter; (ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the promoter; (iii) a nucleic acid sequence encoding a bacterial ribosome binding sequence; and (iv) a transgene; and (b) contacting the cell with an inducer, thereby expressing the transgene. In some embodiments, when cloned into a comparable nucleic acid construct that comprises a strong promoter, the transgene inhibits growth of the cell prior to the contacting step (b) with the inducer, thereby preventing expression of the transgene via the comparable nucleic acid. In some embodiments, the cell is a prokaryotic cell. In some embodiments, the cell is a eukaryotic cell.

[0005] Provided herein are nucleic acid constructs, wherein the nucleic acid constructs comprise: (a) a first modified inducible promoter sequence, wherein the first modified inducible promoter comprises: (i) a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the first modified inducible promoter sequence; (ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the first modified inducible promoter sequence; and (iii) a nucleic acid sequence that encodes a first bacterial ribosome binding sequence; and (b) a second modified inducible promoter sequence, wherein the second modified inducible promoter sequence comprises: (i) a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the second modified inducible promoter sequence; (ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the second modified inducible promoter sequence; and (iii) a nucleic acid sequence that encodes a second bacterial ribosome binding sequence. In some embodiments, the nucleic acid constructs further comprise a transgene. In some embodiments, the first modified inducible promoter sequence, when operatively linked to a transgene, facilitates expression of the transgene when the nucleic acid construct is in the presence of an inducer.

[0006] Provided herein are compositions comprising two or more of the nucleic acid constructs provided herein.

[0007] Provided herein are isolated prokaryotic cells, wherein the isolated prokaryotic cells comprise a nucleic acid construct provided herein. Provided herein are isolated eukaryotic cells, wherein the isolated eukaryotic cells comprise a nucleic acid construct provided herein. Provided herein are cell-free systems, wherein the cell-free systems comprise a nucleic acid construct provided herein.

[0008] Provided herein are non-naturally occurring organisms, wherein the non-naturally occurring organisms comprise: a nucleic acid construct comprising: (a) a first inducible promoter sequence comprising: (i) a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the promoter; (ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the promoter; and (iii) a nucleic acid sequence that encodes a bacterial ribosome binding sequence; (b) one or more of a biosynthesis pathway transgene; and (c) a second modified inducible promoter sequence comprising: (i) a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the second modified inducible promoter sequence; (ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the second modified inducible promoter sequence; and (iii) a nucleic acid sequence that encodes a bacterial ribosome binding sequence. In some embodiments, the non-naturally occurring organisms comprise prokaryotic organisms. In some embodiments, the prokaryotic organisms comprise a population of bacteria. In some embodiments, the biosynthesis pathway transgene comprises a carotenoid synthesis gene.

[0009] Provided herein are compositions comprising a non-naturally occurring organism provided herein and an inducer.

[0010] Further provided herein are methods of expressing a transgene in a cell that does not comprise a T7 RNA polymerase, wherein the methods comprise: (a) transforming a nucleic acid construct into the cell, wherein the nucleic acid construct comprises a modified inducible promoter, wherein the modified inducible promoter comprises: (i) a TATAAT (SEQ ID NO: 1) sequence or a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the promoter; (ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the promoter; and (iii) a bacterial ribosome binding sequence; and a transgene; and (b) contacting the cell with an inducer, thereby expressing a protein encoded by the transgene. In some embodiments, the expression of the protein in the absence of the inducer is less than an amount of expression of the protein in the absence of the inducer of the transgene operatively coupled to a T7 promoter in a comparable nucleic acid construct. In some embodiments, the expression of the protein encoded by the transgene in the presence of the inducer is greater than an amount of expression of the protein in the presence of the inducer of the transgene operatively coupled to the T7 promoter in the comparable nucleic acid construct. In some embodiments, the transgene is toxic to a prokaryotic cell that does not express a modified inducible promoter. In some embodiments, the cell is a prokaryotic cell. In some embodiments, the cell is a eukaryotic cell.

[0011] Further provided herein are methods of expressing a transgene in a cell-free system, wherein the methods comprise: (a) transforming a nucleic acid construct into the cell-free system, wherein the nucleic acid construct comprises a modified inducible promoter, wherein the modified inducible promoter comprises: (i) a TATAAT (SEQ ID NO: 1) sequence or a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the promoter; (ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the promoter; and (iii) a bacterial ribosome binding sequence; and a transgene; and (b) contacting the cell-free system with an inducer, thereby expressing a protein encoded by the transgene. In some embodiments, the cell-free system further comprises: an RNA polymerase, ribonucleotides, and/or a buffer.

[0012] Provided herein are methods for producing a protein, the methods comprising: culturing the non-naturally occurring organism provided herein; and contacting the non-naturally occurring organism with an inducer, thereby producing the protein. Further provided herein are methods for producing a carotenoid, the methods comprising: culturing the non-naturally occurring organism provided herein; and contacting the non-naturally occurring organism with an inducer, thereby producing the protein. Provided herein are methods for producing an organic compound, the methods comprising: culturing the non-naturally occurring organism provided herein; and contacting the non-naturally occurring organism with an inducer, thereby producing the organic compound. Further provided herein are methods for producing a carotenoid, the methods comprising: culturing the non-naturally occurring organism provided herein; and contacting the non-naturally occurring organism with an inducer, thereby producing the carotenoid. In some embodiments, the organic compound or the protein comprises a polyketide, a terpene, a non-ribosomal peptide, or an enzyme protein. Provided herein are methods for producing benzoic acid, the methods comprising: culturing the non-naturally occurring organism provided herein; and contacting the non-naturally occurring organism with an inducer, thereby producing the benzoic acid. Further provided herein is a composition comprising benzoic acid made by a method provided herein. Further provided herein is a composition comprising a carotenoid made by a method provided herein.

[0013] Provided herein are kits, wherein the kits comprise a nucleic acid construct provided herein, a cell provided herein, a cell-free system provided herein, or a non-naturally occurring organism provided herein, packaging, and materials therefor.

BRIEF DESCRIPTION OF THE DRAWINGS

[0014] Novel features of exemplary embodiments are set forth with particularity in the appended claims. A better understanding of the features and advantages will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosed systems and methods are utilized, and the accompanying drawings of which:

[0015] FIGS. 1A-1B show schematic diagrams of exemplary plasmids produced in accordance with the teachings of the present disclosure. FIG. 1A shows a schematic of a constitutive version of the synthetic lac and tet expression systems. FIG. 1B shows a schematic of inducible versions of the synthetic lac and tet expression systems in pColE1 backbone. Plasmids contain the Kanamycin selection marker and the attP sites specific for Bxb1 integrase. # indicates the version of the promoter (e.g., V1Tc, V2lac, etc.).

[0016] FIGS. 2A-2C show the architecture of exemplary synthetic lac and tet promoters. The sequences of the operators lacO and tetO are overlined, in bold text. The -35 and -10 sequences of the promoters are underlined. Boxes represent the native -35 and -10 sequences, and consensus σ70 -35 and -10 sequences. Conserved regions of the original lac promoter and the original tacI promoter are shown. FIG. 2A shows the original lac (i) and synthetic lac (ii-vii) promoters. SEQ ID NOS: 3-9 are shown. FIG. 2B shows the original tet (i) and synthetic tet (ii-v) promoters. SEQ ID NOS: 10-14 are shown. FIG. 2C shows a schematic representation of the function of synthetic lac and tet promoters. Each transcriptional unit is insulated by terminators. LacI/TetR regulators bind to the operators lacO/tetO. Addition of the inducers (yellow ovals) IPTG or anhydrotetracycline (aTc) removes the repressor, allowing transcription from the promoters.

[0017] FIGS. 3A-3D show a graphical comparison of sfGFP and mCardinal emissions in Gram-negative bacteria. FIG. 3A shows absolute values of fluorescence measured in arbitrary units with far red wavelengths (excitation 604, emission 659, left panel), and green wavelengths (excitation 485, emission 510) in E. coli DH10B, P. putida and V. natriegens. Note that these strains are wild type and do not contain any fluorescent protein genes. N=3. Error bars +/- SD. FIGS. 3B-3D show graphs of the fluorescence signal-to-background ratio of recombinant strains expressing sfGFP and mCardinal from the constitutive tacI promoter (these represent “signal”) vs. wild type strains of (FIG. 3B) E. coli, (FIG. 3C) P. putida, and (FIG. 3D) V. natriegens (these wild type strains represent “background”). Signal-to-background measurement ratios are then plotted over time (left panels). The fluorescence values (in arbitrary units, AU) measured to create the signal-to-background ratios are shown in the middle (sfGFP) and right (mCardinal) panels. N=3. Error bars +/- SD.
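
For illustration, the signal-to-background ratio described for FIGS. 3B-3D can be computed per time point by dividing the fluorescence of the recombinant strain by that of the wild type strain. The following is a minimal sketch under that reading; the function name and the example values are hypothetical and are not data from the figures.

```python
def signal_to_background(signal, background):
    """Per-time-point ratio of recombinant (signal) to wild type (background) fluorescence."""
    if len(signal) != len(background):
        raise ValueError("both series must cover the same time points")
    if any(b == 0 for b in background):
        raise ValueError("background fluorescence must be non-zero")
    return [s / b for s, b in zip(signal, background)]


# Hypothetical fluorescence readings (arbitrary units) at two time points.
print(signal_to_background([200.0, 900.0], [100.0, 150.0]))
```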

[0018] FIGS. 4A-4C show graphs of the time course of mCardinal production in E. coli DH10B with synthetic lac and tet promoters. FIG. 4A shows the synthetic lac promoters, and FIG. 4B shows the synthetic tet promoters, with and without the transcriptional regulators tetR and lacI and induced after 3 hours with aTc or IPTG, respectively. FIG. 4C shows direct comparison of each promoter under evaluation against the strong constitutive promoter tacI. The recombinant DH10B carrying tacI-mCardinal was normalized to 100% mCardinal production and the wild type DH10B strain normalized to 0% mCardinal production. For all samples, the fluorescence mean of mCardinal signal (excitation 605, emission 659) was normalized by the cell density (OD600). N=4. Error bars +/- SD.

[0019] FIGS. 5A-5B show an SDS-PAGE gel and graph of the expression profile of CocE in E. coli. FIG. 5A shows gel electrophoresis of total crude extracts (top), the insoluble fraction analysis (middle panel), and the soluble fraction (bottom). The expected size of CocE is 63 kDa. E. coli expressing the pET and V2TcR expression systems are shown uninduced and induced (induced samples indicated with an asterisk). FIG. 5B shows a graph of benzoic acid production by CocE present in the soluble fraction, using cocaine as substrate. N=3. Error bars +/- SD. Key for X axis Groups: I: DH10B pσ70 V2TcR-CocE; II: DH10B pσ70 V2TcR-CocE aTc; III: DH10B pσ70 V2TcR19-CocE; IV: DH10B pσ70 V2TcR19-CocE aTc; V: BL21 pET21-cocE; VI: BL21 pET21-cocE IPTG.

[0020] FIGS. 6A-6C show graphs of the time course of mCardinal production in P. putida. Synthetic lac (FIG. 6A) and tet (FIG. 6B) promoters are shown in their constitutive, repressed and induced states. All constructs were integrated in a single copy into the same genomic locus. FIG. 6C shows direct comparison of each promoter under evaluation against the constitutive promoter tacI in P. putida. P. putida with a single-copy integration of tacI-mCardinal was normalized to 100% mCardinal production and the wild type P. putida strain normalized to 0% mCardinal production. For all samples, the fluorescence mean of mCardinal signal (excitation 605, emission 659) was normalized by the cell density (OD600). N=4. Error bars +/- SD.

[0021] FIGS. 7A-7C show graphs of the time course of mCardinal production in V. natriegens. Synthetic lac (FIG. 7A) and tet (FIG. 7B) promoters in their constitutive, repressed and induced states are represented. FIG. 7C shows direct comparison of each promoter under evaluation against the constitutive promoter tacI in V. natriegens. V. natriegens carrying tacI-mCardinal was normalized to 100% mCardinal production and the wild type V. natriegens strain normalized to 0% mCardinal production. For all samples, the fluorescence mean of mCardinal signal (excitation 605, emission 659) was normalized by the cell density (OD600). N=4. Error bars +/- SD.
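
By way of a non-limiting illustration, the two normalization steps described for FIGS. 4C, 6C and 7C (division of the fluorescence mean by OD600, followed by scaling so that the wild type strain is 0% and the strain carrying the constitutive reference promoter is 100%) can be sketched as follows. The function names and the numerical readings are hypothetical; the linear scaling between the two anchors is an assumption about how the percentages were computed.

```python
def per_od(fluorescence_au, od600):
    """Fluorescence (arbitrary units) normalized by cell density (OD600)."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return fluorescence_au / od600


def percent_of_reference(sample, wild_type, reference):
    """Linear rescaling: wild_type maps to 0% and reference maps to 100%.

    All three inputs are OD600-normalized fluorescence values.
    """
    span = reference - wild_type
    if span == 0:
        raise ValueError("reference and wild type must differ")
    return 100.0 * (sample - wild_type) / span


# Hypothetical readings, not data from the disclosure.
wild_type = per_od(50.0, 1.0)      # strain without a fluorescent protein gene
reference = per_od(2050.0, 1.0)    # strain carrying the constitutive reference promoter
sample = per_od(2100.0, 2.0)       # promoter under evaluation

print(percent_of_reference(sample, wild_type, reference))
```

Values above 100% on this scale would indicate a promoter that outperforms the constitutive reference under the same conditions.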

[0022] FIG. 8 shows a graphical comparison of the pET21a expression system in E. coli BL21 versus the synthetic V2Lac/lacI and V2Tc/tetR promoters in E. coli DH10B. Cultures were induced after 3 hours of cultivation and the mCardinal mean was normalized by the cell density (OD600). N=4. Error bars +/- SD.

[0023] FIGS. 9A-9F show schematic representations of exemplary expression systems. FIG. 9A shows the design of the V2TcR-V2(3)LacI system. FIG. 9B shows a dual expression system controlling expression of mCardinal. FIG. 9C shows a dual expression system controlling expression of sfGFP and mCardinal. FIG. 9D shows a V2TcR expression system controlling the LYC operon. FIG. 9E shows a V2TcR expression system controlling expression of crtEBIY. FIG. 9F shows a dual expression system controlling expression of crtEBI and crtY. Exemplary sequences tested include SEQ ID NOS: 69-73 and 76-79.

[0024] FIGS. 10A-10C show graphs of the time course of mCardinal production in E. coli (FIG. 10A), P. putida (FIG. 10B), and V. natriegens (FIG. 10C). Left graphs show the recombinant strains containing the V2TcR-mCardinal (SEQ ID NO: 56) and V2(3)LacI-mCardinal expression systems and the right graphs show the recombinant strains containing the dual V2TcR-mCardinal/V2(3)LacI-mCardinal expression system. For all samples, the fluorescence mean of mCardinal signal (excitation 605, emission 659) was normalized by the cell density (OD600). N=3. Error bars +/- SD.

[0025] FIGS. 11A-11C show graphs of the time course of the reporter systems mCardinal and sfGFP in E. coli (FIG. 11A), P. putida (FIG. 11B), and V. natriegens (FIG. 11C) containing the dual expression system V2TcR-sfGFP/V2(3)LacI-mCardinal. Left graphs measure mCardinal production and right graphs measure sfGFP production. For all samples, the fluorescence mean of mCardinal signal (excitation 605, emission 659) and sfGFP signal (excitation 485, emission 510) was normalized by the cell density (OD600). N=3. Error bars +/- SD.

[0026] FIGS. 12A-12B show graphs of the production of lycopene and β-carotene with the V2TcR expression system in E. coli, P. putida, and V. natriegens. Cultures were induced with aTc at an OD600 of 0.7 and cultured for 4 hours. FIG. 12A shows lycopene levels that were extracted and measured by UHPLC. FIG. 12B shows β-carotene levels that were extracted and measured by UHPLC. N=3. Error bars +/- SD.

[0027] FIGS. 13A-13B show graphs of the production of lycopene and beta (β)-carotene by the dual expression system in E. coli, P. putida, and V. natriegens. Cultures were induced with aTc, IPTG or both inducers at an OD600 of 0.7 and cultured for 4 hours. FIG. 13A shows lycopene levels that were extracted and measured by UHPLC. FIG. 13B shows β-carotene levels that were extracted and measured by UHPLC. N=3. Error bars +/- SD.

[0028] FIG. 14 shows a schematic of an exemplary dual expression system. Exemplary sequences include SEQ ID NO: 54 and SEQ ID NO: 55.

DETAILED DESCRIPTION

Overview

[0029] Disclosed herein are nucleic acid constructs that comprise a modified inducible promoter. Such constructs can be utilized to express a transgene operatively coupled to the promoter upon transformation into a cell or a cell-free system. As described herein, the modified inducible promoter does not substantially express the transgene in the absence of an inducer and in the presence of a specific transcriptional regulator. In some cases, a transgene is toxic to the cell when expressed utilizing the modified inducible promoter described herein. In some cases, a transgene is not toxic to the cell when expressed utilizing the modified inducible promoter described herein. Expression of toxic genes is one of many advantages to the modified inducible promoter systems provided herein. In addition, non-toxic gene product yields (e.g., proteins or organic compounds) are also higher as compared to inducible promoters that are not modified.

[0030] In contrast, many inducible promoters produce significant expression of the transgene even in the absence of an inducer. Such promoters, known as “leaky” promoters, cannot be used to express transgenes that are toxic to a cell, as the leaky expression of the transgene inhibits growth of the cell culture and thus prevents overexpression or leads to formation of inclusion bodies.

[0031] Further, nucleic acid constructs provided herein comprise a modified inducible promoter that, in the presence of an inducer, generates expression of a protein encoded by a transgene to an extent comparable to or greater than the expression of the protein encoded by the transgene when expressed using a different promoter (e.g., a T7 promoter such as in a pET vector). In some embodiments, expression of the transgene can be carried out in various prokaryotic cells, eukaryotic cells, or cell-free systems without the need for T7 lysogenization.

Definitions

[0032] The terminology used herein is for the purpose of describing particular cases only and is not intended to be limiting. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, to the extent that the terms “including”, “includes”, “having”, “has”, “with”, or variants thereof are used in either the detailed description and/or the claims, such terms are intended to be inclusive in a manner similar to the term “comprising”.

[0033] The term “about” or “approximately” means within an acceptable error range for the particular value as determined by one of ordinary skill in the art, such as plus or minus 10%. Where ranges and/or subranges of values are provided, the ranges and/or subranges include the endpoints of the ranges and/or subranges.

[0034] The term “substantially” as used herein refers to a value approaching 100% of a given value. For example, an expression system described herein that does not “substantially” express a transgene in the absence of an inducer can indicate that less than 10% of the transgene (e.g., less than 0.5%, less than 1%, less than 0.1%, or less than 0.01%) is expressed, relative to an amount of transgene expressed in the presence of the inducer.

[0035] As used herein, the term “operably linked” indicates that a promoter is in a correct functional location and/or orientation in relation to a nucleic acid sequence it regulates to control transcriptional initiation and/or expression of that sequence.

Modified Inducible Promoters

[0036] Inducible promoters often have a consistent architecture including two elements: (1) the operator region recognized by the transcriptional regulator proteins (e.g., lacI and tetR), and (2) the -10 and -35 consensus sequences that recruit the sigma (σ) subunits of RNA polymerase to initiate transcription. Improvements to the lac promoter can be made to increase its strength and improve its regulation. For example, the -10 and -35 boxes of the original lac promoter (FIG. 1Ai) can be updated to generate improved variants such as the lacUV5 (FIG. 1Aii) and tacI (FIG. 1Aiii) promoters. In the lacUV5 promoter, the -10 box of the original lac promoter can be replaced by the Pribnow box (e.g., TATAAT, SEQ ID NO: 1) or any one of SEQ ID NOS: 44, 47 to 51. The lacUV5 and trp promoters shown in the working examples can be combined to create the tacI promoter, which increased transcription 11-fold compared to its predecessor with the incorporation of the -35 consensus box TTGACA. The tet promoter (FIG. 1Bi) shares the highly conserved -35 hexamer with the tac promoter, but does not contain the Pribnow box.

[0037] Despite their ubiquitous use in biology, there remain problems with the current lac-based inducible expression systems. Leaky transcription in the OFF state remains a consistent challenge. Tight transcriptional control is indispensable to produce high yields of challenging recombinant proteins, including toxic genes and proteins that are impossible for a heterologous host to process or fold. For the pET system, target proteins are driven indirectly by controlling expression of the T7 polymerase under the lacUV5 promoter, and then driving the target gene transcription by the T7 promoter. Still, low level T7 transcription in the uninduced state leads to leaky transcription of the target gene. To combat this problem further, E. coli strains such as pLysS and pLysE with integration of the T7 lysozyme that inhibits low level T7 activity are used to obtain tighter transcriptional control. However, a host strain with T7 RNA polymerase under control of the lacUV5 promoter, and integration of T7 lysozyme, is required to tightly regulate target gene expression. Expression with the tet system has historically provided tighter transcriptional control than lac-derived promoters and does not require specialized strains (such as BL21), and full induction is achieved by anhydrotetracycline (aTc) at concentrations that do not cause a growth defect, due to the high affinity of aTc for TetR and its imperceptible antibiotic activity. Remarkably, in the uninduced state just one mRNA molecule per three cells is produced, and up to 5000-fold induction has been reported. However, the yields of recombinant protein obtained by the tet expression system remain low compared to the pET expression system when using E. coli as a heterologous host.

[0038] The repertoire of organisms used both in academic and industrial settings is rapidly expanding. To address challenges related to complex protein expression in E. coli, other chassis organisms such as Pseudomonas putida and Vibrio natriegens have been employed to produce challenging proteins (e.g., carotenoids) in a variety of biotechnological processes. Both lac and tet expression systems can be adapted to P. putida and V. natriegens, though in some instances with lower total protein yield than achieved in pET. However, previous systems for inducible expression were not configured for use across various cell types (e.g., prokaryotic cells). Accordingly, disclosed herein are universal expression systems that can be directly ported between different gram-negative species, yield high quantities of recombinant protein comparable to expression via a pET vector in E. coli, and maintain tight transcriptional repression in the uninduced state. The expression system of the current disclosure is a significant improvement over the pET system in providing tight OFF state control while achieving similar yields of recombinant protein, and with the advantage of direct portability to alternative host species. Further, additional modifications can be performed to enable the expression of the transgene to be turned OFF after induction, thus allowing for reversible control of expression.

[0039] A modified, inducible promoter of the current disclosure comprises modified architecture of the lac and tet expression systems to improve their strength, control, and portability. In some embodiments, the genetic architecture of the lac and tet expression systems was modified in three ways: (1) addition of the consensus -10 and -35 sequence boxes to be strongly targeted by σ70, (2) incorporation of a nucleic acid sequence that encodes a strong ribosome binding site recognized by a broad spectrum of gram-negative bacteria, and (3) independent control of the transcriptional regulators by appropriately-tuned constitutive promoters.

[0040] The nucleic acid constructs provided herein can comprise DNA, RNA, an artificial nucleic acid analog, or any combination thereof. In some embodiments, the nucleic acid constructs provided herein comprise a nucleic acid modification (e.g., a chemical modification), a nucleobase substitution, or a nucleotide substitution. In some embodiments, the sequence at position -10, relative to the transcriptional start site of the promoter, comprises one or more nucleobase substitutions. In some embodiments, the sequence at position -10, relative to the transcriptional start site of the promoter, comprises a sequence of TXTXXT (SEQ ID NO: 47), where X is any nucleobase. In some embodiments, the sequence at position -10, relative to the transcriptional start site of the promoter, comprises a sequence of TXTXXTGT (SEQ ID NO: 45), where X is any nucleobase. In some embodiments, the sequence at position -10, relative to the transcriptional start site of the promoter, comprises a sequence of TATXXTGT (SEQ ID NO: 48), where X is any nucleobase. In some embodiments, the sequence at position -10, relative to the transcriptional start site of the promoter, comprises a sequence of TATAXTGT (SEQ ID NO: 49), where X is any nucleobase. In some embodiments, the sequence at position -10, relative to the transcriptional start site of the promoter, comprises a sequence of TATAATGT (SEQ ID NO: 50), where X is any nucleobase. In some embodiments, the sequence at position -10, relative to the transcriptional start site of the promoter, comprises a sequence of TATAATGT (SEQ ID NO: 51), where X is any nucleobase.

[0041] In some embodiments, the sequence at position -35, relative to the transcriptional start site of the promoter comprises one or more nucleobase substitutions. In some embodiments, the sequence at position -35, relative to the transcriptional start site of the promoter comprises a sequence of TXGXCX (SEQ ID NO: 46), where X is any nucleobase. In some embodiments, the sequence at position -35, relative to the transcriptional start site of the promoter comprises a sequence of TTGXCX (SEQ ID NO: 52), where X is any nucleobase. In some embodiments, the sequence at position -35, relative to the transcriptional start site of the promoter comprises a sequence of TTGACX (SEQ ID NO: 53), where X is any nucleobase. In some embodiments, the sequence at position -35, relative to the transcriptional start site of the promoter comprises a sequence of TTGACA (SEQ ID NO: 2), where X is any nucleobase.
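
By way of a non-limiting illustration, the degenerate -10 and -35 sequences recited above can be checked programmatically against a candidate promoter box. The sketch below assumes a DNA construct in which “X” stands for any one of A, C, G or T (nucleotide analogues and RNA are outside its scope), and the function and dictionary names are hypothetical. The TATAATGT entry uses SEQ ID NO: 44 as recited in claim 1.

```python
import re

# Degenerate motifs from the text; "X" denotes any nucleobase.
MINUS_10_PATTERNS = {
    "SEQ ID NO: 47": "TXTXXT",
    "SEQ ID NO: 45": "TXTXXTGT",
    "SEQ ID NO: 48": "TATXXTGT",
    "SEQ ID NO: 49": "TATAXTGT",
    "SEQ ID NO: 44": "TATAATGT",
}
MINUS_35_PATTERNS = {
    "SEQ ID NO: 46": "TXGXCX",
    "SEQ ID NO: 52": "TTGXCX",
    "SEQ ID NO: 53": "TTGACX",
    "SEQ ID NO: 2": "TTGACA",
}


def to_regex(motif):
    """Translate a motif into a regex, treating X as any of A, C, G, T (assumption)."""
    return re.compile("".join("[ACGT]" if base == "X" else base for base in motif))


def matching_motifs(box, patterns):
    """Return the identifiers of every motif that the whole box sequence satisfies."""
    box = box.upper()
    return [name for name, motif in patterns.items() if to_regex(motif).fullmatch(box)]


print(matching_motifs("TATAATGT", MINUS_10_PATTERNS))
print(matching_motifs("TTGACA", MINUS_35_PATTERNS))
```

Because each motif in a series is a special case of the one before it, a fully specified box such as TATAATGT satisfies every eight-base -10 pattern listed, while a box with a mismatch at a fixed position satisfies none.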

[0042] In some embodiments, a nucleic acid construct provided herein comprises a nucleotide analogue. Nucleotide analogues include nucleotides having modifications in the chemical structure of the base, sugar and/or phosphate, including, but not limited to, 5-position pyrimidine modifications, 8-position purine modifications, modifications at cytosine exocyclic amines, substitution of 5-bromo-uracil, and the like; and 2'-position sugar modifications, including but not limited to, sugar-modified ribonucleotides in which the 2'-OH is replaced by a group selected from H, OR, R, halo, SH, SR, NH2, NHR, NR2, or CN. shRNAs also can comprise non-natural elements such as non-natural bases, e.g., inosine and xanthine, sugars, e.g., 2'-methoxy ribose, or non-natural phosphodiester linkages, e.g., methylphosphonates, phosphorothioates and peptides.

[0043] The nucleic acid construct provided herein can comprise a sequence encoding a ribosome binding site. Ribosome binding sites (RBSs) are nucleic acid sequences that promote efficient and accurate translation of mRNAs for protein synthesis, and are also provided for use in the inducible promoters provided herein to permit modulation of the efficiency and rates of synthesis of the proteins encoded by the system. An RBS affects the translation rate of an open reading frame in two main ways: i) the rate at which ribosomes are recruited to the mRNA and initiate translation is dependent on the sequence of the RBS, and ii) the RBS can also affect the stability of the mRNA, thereby affecting the number of proteins made over the lifetime of the mRNA. Accordingly, one or more nucleic acid sequences encoding a ribosome binding site (RBS) or an RBS mRNA can be added to the nucleic acid constructs described herein to control expression of proteins.

[0044] In some embodiments, a nucleic acid construct provided herein further comprises a terminator sequence. Terminators are sequences that usually occur at the end of a gene or operon and cause transcription to stop, and are also provided for use in the modules and engineered systems described herein to regulate transcription and prevent transcription from occurring in an unregulated fashion, i.e., a terminator sequence prevents activation of downstream modules by upstream promoters. A terminator or termination signal can include the DNA sequences involved in specific termination of an RNA transcript by an RNA polymerase. Thus, in certain embodiments a terminator that ends the production of an RNA transcript is contemplated. A terminator can be necessary for use in vivo to achieve desirable message levels.

[0045] The most commonly used type of terminator is a forward terminator. When placed downstream of a nucleic acid sequence that is usually transcribed, a forward transcriptional terminator will cause transcription to abort. In some embodiments, bidirectional transcriptional terminators are provided. Such terminators will usually cause transcription to terminate on both the forward and reverse strand. Finally, in some embodiments, reverse transcriptional terminators can be used to terminate transcription on the reverse strand only.

[0046] In some embodiments, a nucleic acid construct provided herein comprises additional regulatory elements that increase or decrease transgene expression in a cell or cell-free system depending on the absence or presence of a particular inducer or set of inducers. In some embodiments, the regulatory element is an enhancer. Additional non-limiting examples of regulatory elements include: lasR activator (e.g., from P. aeruginosa), cinR activator, toxicity gene activator (e.g., ToxR, from Vibrio cholerae), lacI (e.g., a wild-type, derivative, or variant thereof), lacI repressor, tetracycline repressor (e.g., TetR from transposon Tn10), mnt repressor, TP901, heat shock proteins, and any derivative or variant thereof.

[0047] Thus, in some embodiments, a nucleic acid construct provided herein can be part of a synthetic gene network. Synthetic gene networks can include an engineered composition that comprises at least one nucleic acid construct provided herein and can perform a function including, but not limited to, sensing the presence or absence of an analyte or inducer, a logic function, or a regulatory function. In some embodiments of a synthetic gene network comprising at least two nucleic acid constructs, the nucleic acid constructs can interact with each other directly or indirectly. A synthetic gene network can comprise a nucleic acid encoding a transgene operably linked to a modified inducible promoter provided herein.

[0048] The nucleic acid constructs provided herein can be used to visualize chemical, analyte, or protein production in the presence and absence of an inducer provided herein. In some embodiments, a nucleic acid construct provided herein further comprises a reporter gene. In some embodiments, the reporter gene is mCardinal or a green fluorescent protein (e.g., superfolder GFP or sfGFP). A reporter gene encoding any fluorescent protein can be applicable to the nucleic acid constructs and methods of use provided herein. Additional examples of genes encoding fluorescent proteins that can be used in accordance with the compositions and methods described herein include, without limitation, enhanced yellow fluorescent protein (EYFP), engineered cyan fluorescent protein (ECFP), mOrange, mCherry, Venus YFP, Cerulean, mBanana, orange fluorescent protein (OFP), and derivatives or variants thereof.

[0049] In some embodiments, the reporter gene encodes a colorimetric enzyme. Colorimetric enzymes can cleave a substrate (e.g., a chemical) to yield a color-changing product. In some embodiments, the colorimetric enzyme is chitinase (which cleaves colorless 4-nitrophenyl N,N'-diacetyl-beta-D-chitobioside substrate to yield a yellow p-nitrophenol product). In some embodiments, the reporter gene is LacZ (which encodes beta-galactosidase) or a fragment thereof. When LacZ is expressed, the enzyme cleaves the yellow chlorophenol red-β-D-galactopyranoside (CPRG) substrate to produce the purple chlorophenol red product.

[0050] In some embodiments, the reporter gene comprises a catalytic nucleic acid. Examples of catalytic nucleic acids include, but are not limited to, a ribozyme, an RNA-cleaving deoxyribozyme, a group I ribozyme, RNase P, a Hepatitis delta ribozyme, and DNA-zymes.

[0051] In some embodiments, the reporter gene comprises an antigen for which a specific antibody or antibody fragment is available. In some embodiments, a reporter gene comprises an antibody, which when expressed, binds to a complementary antigen.

Methods of Expression

[0052] Provided herein are methods for expressing a transgene or a protein encoded by a transgene sequence using a nucleic acid construct provided herein having a modified inducible promoter as described herein. In some embodiments, a transgene that is toxic to a host cell or a non-naturally occurring organism (e.g., a bacterium) can be expressed in the host cell or organism. Such transgenes can be difficult to express utilizing a pET or other vector containing a leaky promoter, because the leaky expression upon transformation can inhibit growth of the host cell and thereby either prevent expression or express the transgene as an inclusion body. Indeed, tight OFF state control can allow the host cell to reach mid-log phase growth prior to induction, thus allowing for improved expression of the toxic transgene in the prokaryotic cell relative to expression via a vector having a leaky promoter.

[0053] The nucleic acid constructs provided herein can comprise a transgene encoding a protein or a fragment thereof. In some embodiments, the transgene or a protein encoded by the transgene is not toxic to the host cell. A protein and/or peptide or fragment thereof can be any protein of interest, for example, but not limited to: biological proteins; mutated proteins; therapeutic proteins; truncated proteins, and the like. Proteins can also be selected from a group comprising: mutated proteins, genetically engineered proteins, peptides, synthetic peptides, recombinant proteins, chimeric proteins, antibodies, midibodies, tribodies, humanized proteins, humanized antibodies, chimeric antibodies, modified proteins and fragments thereof. A transgene provided herein can comprise a gene that is, for example, part of a biosynthesis pathway for the production of a protein, an organic compound, or a molecule of interest. The cells, systems, and organisms provided herein provide for a facile method of manufacturing such proteins.

[0054] The modified inducible promoters provided herein can be introduced to any cell type or any system that can transcribe and translate the protein encoded by the transgene downstream of the promoter. In some embodiments, a cell provided herein is a prokaryotic cell or a eukaryotic cell. In some embodiments, the system is a cell-free system that can be used to produce the protein or organic compound of interest. A cell-free system is a composition comprising a set of reagents capable of providing for or supporting a biosynthetic reaction (e.g., transcription reaction, translation reaction, or both) in vitro in the absence of cells. For example, to provide for a transcription reaction, a cell-free system comprises promoter-containing DNA, RNA polymerase, ribonucleotides, and a buffer system. Cell-free systems can be prepared using enzymes, coenzymes, and other subcellular components either isolated or purified from eukaryotic or prokaryotic cells, including recombinant cells, or prepared as extracts or fractions of such cells. A cell-free system can be derived from a variety of sources, including, but not limited to, eukaryotic and prokaryotic cells, such as bacteria including, but not limited to, E. coli, P. putida, V. natriegens, thermophilic bacteria and the like, wheat germ, rabbit reticulocytes, mouse L cells, Ehrlich's ascitic cancer cells, HeLa cells, CHO cells, or budding yeast. In some embodiments, the cell-free system comprises an RNA polymerase. In some embodiments, the cell-free system comprises components sufficient for the translation reaction. In some embodiments, the cell-free system comprises ribosomes, aminoacyl transfer RNAs, translation factors, and a buffer system. The components can also comprise amino acids or amino acids and aminoacyl tRNA synthetases.
Components of translation factors are disclosed, for example, in Shimizu and Ueda, “Pure Technology,” Cell-Free Protein Production: Methods and Protocols, Methods in Molecular Biology, Endo et al. (Eds), Humana 2010, the contents of which are incorporated herein by reference in their entirety. Exemplary translation factors responsible for protein biosynthesis include, but are not limited to, initiation factors (IF1, IF2, and IF3), elongation factors (EF-G, EF-Tu, and EF-Ts), and release factors (RF1, RF2, and RF3), as well as RRF for termination.

[0055] Prior to induction, the modified inducible promoter provided herein does not substantially express the transgene in the absence of an inducer (e.g., isopropyl β-D-1-thiogalactopyranoside (IPTG), anhydrotetracycline, and the like). In some embodiments, the amount of expressed transgene produced in the absence of inducer is less than 10%, less than 9%, less than 8%, less than 7%, less than 6%, less than 5%, less than 4%, less than 3%, less than 2%, less than 1%, less than 0.9%, less than 0.8%, less than 0.7%, less than 0.6%, less than 0.5%, less than 0.4%, less than 0.3%, less than 0.2%, less than 0.1%, less than 0.09%, less than 0.08%, less than 0.07%, less than 0.06%, less than 0.05%, less than 0.04%, less than 0.03%, less than 0.02%, or less than 0.01% of an amount of expressed transgene produced in the presence of the inducer. In some embodiments, the amount of expressed transgene produced in the absence of inducer is less than 10%, less than 9%, less than 8%, less than 7%, less than 6%, less than 5%, less than 4%, less than 3%, less than 2%, less than 1%, less than 0.9%, less than 0.8%, less than 0.7%, less than 0.6%, less than 0.5%, less than 0.4%, less than 0.3%, less than 0.2%, less than 0.1%, less than 0.09%, less than 0.08%, less than 0.07%, less than 0.06%, less than 0.05%, less than 0.04%, less than 0.03%, less than 0.02%, or less than 0.01% of an amount of expressed transgene produced by a comparable nucleic acid having a weak promoter (e.g., a lacI promoter) in the absence of the inducer.
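
By way of a non-limiting illustration, the comparison of uninduced to induced expression recited above can be sketched as a simple ratio test. The function names, the default threshold and the example values are hypothetical; the sketch assumes that both measurements are background-corrected and expressed in the same units (for example, OD600-normalized fluorescence).

```python
def uninduced_fraction(uninduced, induced):
    """Expression without inducer as a fraction of expression with inducer."""
    if induced <= 0:
        raise ValueError("induced expression must be positive")
    return uninduced / induced


def is_substantially_off(uninduced, induced, threshold=0.10):
    """True if uninduced expression is below the chosen fraction of induced expression.

    The default of 0.10 corresponds to the "less than 10%" level in the text;
    any of the lower recited levels can be passed instead.
    """
    return uninduced_fraction(uninduced, induced) < threshold


# Hypothetical measurements, not data from the disclosure.
off_state, on_state = 12.0, 2400.0
print(uninduced_fraction(off_state, on_state))
print(is_substantially_off(off_state, on_state, threshold=0.01))
```

The reciprocal of the same ratio gives the fold induction, so the hypothetical values above correspond to a 200-fold difference between the ON and OFF states.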

[0056] Expression of the transgene is performed by contacting the host cell or cell-free system provided herein with an inducer (e.g., isopropyl β-D-1-thiogalactopyranoside (IPTG), anhydrotetracycline, and the like). Non-limiting examples of inducers include: a chemical, a compound, an organic compound, a protein, an analyte, tetracycline and derivatives thereof, metallothionine, ecdysone, cocaine, hormones, steroids, and antibiotics (e.g., rapamycin, kanamycin). Exemplary environmental inducers include exposure to heat (i.e., thermal pulses or constant heat exposure), light (e.g., photoirradiation within the defined range of wavelengths), various steroidal compounds, divalent cations (including Cu2+ and Zn2+), galactose, tetracycline, IPTG (isopropyl-β-D-thiogalactoside), as well as other naturally occurring and synthetic inducing agents and gratuitous inducers.

[0057] In some embodiments, the amount of expressed transgene produced in the presence of inducer is at least equal to the amount of expressed transgene produced by a comparable nucleic acid having a lacUV5 promoter and a T7 polymerase in the presence of the inducer. In some embodiments, the amount of protein expressed in the presence of the inducer is at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 100%, at least 105%, at least 110%, at least 115%, at least 120%, at least 125%, at least 130%, at least 135%, at least 140%, at least 145%, at least 150%, at least 155%, at least 160%, at least 165%, at least 170%, at least 175%, at least 180%, at least 185%, at least 190%, at least 195%, at least 200%, at least 300%, at least 400%, at least 500%, at least 600%, at least 700%, at least 800%, at least 900%, or at least 1000% of an amount of expressed transgene produced by a comparable nucleic acid having a strong (e.g., T7) promoter in the presence of the inducer.

[0058] Such expression can be carried out in a variety of prokaryotic host cells. Indeed, while most expression is carried out in T7 lysogenized E. coli DE3 strains, the nucleic acid constructs of the present disclosure having a modified inducible promoter can be utilized in prokaryotic cells in the absence of T7 lysogenization.

[0059] The host cell for expression of a nucleic acid provided herein (e.g., a transgene) can comprise a prokaryotic cell or a eukaryotic cell. In some embodiments, the prokaryotic cell is a bacterial cell. In some embodiments, a prokaryotic cell can include any Gram-positive strain bacterial cell. In some embodiments, a prokaryotic cell can include any Gram-negative strain bacterial cell. Examples of bacteria that can be used include, but are not limited to: non-T7 lysogenized E. coli strains such as DH5α, DH10β, or W3110, P. putida, P. aeruginosa, H. influenzae, C. trachomatis, P. mirabilis, P. vulgaris, C. pneumoniae, K. pneumoniae, N. gonorrhoeae, H. pylori, V. cholerae, S. aureus, S. enterica, C. jejuni, B. fragilis, L. pneumophila, V. parahaemolyticus, and V. natriegens.

[0060] The host cell for expression can be a eukaryotic cell, e.g., a mammalian cell, an insect cell, a yeast cell, a fungal cell, and the like. A nucleic acid construct provided herein can be regulated in a cell-specific or tissue-specific manner such that it is only active in transcribing the associated coding region of a given transgene in a specific tissue type(s).

Exemplary biosynthesis pathways for producing gene products

[0061] Provided herein are methods of producing a protein, an organic compound, or a molecule using the inducible promoters provided herein, a cell expressing an inducible promoter provided herein, a non-naturally occurring organism provided herein, or kits provided herein. In some embodiments, the non-naturally occurring organism provided herein comprises a nucleic acid construct comprising one or more inducible promoters provided herein and one or more sequences encoding a transgene. In some embodiments, the one or more transgenes comprise a sequence encoding a protein in a biosynthesis pathway. In some embodiments, the nucleic acid construct provided herein comprises a biosynthesis pathway transgene. In some embodiments, the biosynthesis pathway is a polyketide synthesis pathway, a terpene synthesis pathway, a non-ribosomal peptide biosynthesis pathway, or a carotenoid biosynthesis pathway. In some embodiments, the cell expressing an inducible promoter provided herein or the non-naturally occurring organism provided herein is cultured for a period of time. In some embodiments, the period of time is for at least 4 hours, 6 hours, 10 hours, 12 hours, or more. In some embodiments, the period of time is for at least 24 hours. In some embodiments, the period of time is for at least 96 hours.

[0062] In some embodiments, the biosynthesis pathway is a carotenoid synthesis pathway. Industrially useful carotenoids are generally produced by chemical synthesis processes, for which the possibility of undesired effects, such as contamination by synthesis auxiliary materials, is a major concern for the quality of the product. In addition, consumer tastes tend to lean toward naturally occurring carotenoids. However, there is a limit to the extraction of carotenoids from plants and natural products, and an effective industrial process is not entirely established. Microbial fermentation methods have been used as a production method for naturally occurring carotenoids. However, none of these cases enables production of carotenoids in an amount sufficient for economical industrial production. In many cases, wild-type carotenoid-producing microorganisms, even through classical mutation and breeding, do not generate enough carotenoid product for large-scale manufacturing. The carotenoid biosynthesis pathway is made up of various enzymes, and the genes encoding such enzymes have been analyzed by many researchers. In a typical pathway, for example, a carotenoid is synthesized in its early stage by an isoprenoid biosynthesis pathway that is shared with steroids and terpenoids, starting from mevalonic acid, which is a basic metabolite. Farnesyl pyrophosphate having 15 carbons (C15), generated through the isoprenoid basic synthesis system, is condensed with isopentenyl diphosphate (IPP) (C5) to give geranylgeranyl diphosphate (GGPP) (C20). Then, through condensation of two molecules of GGPP, colorless phytoene, which is the first carotenoid, is synthesized. The phytoene is then converted into lycopene through a series of desaturation reactions, and the lycopene is then converted into β-carotene through a cyclization reaction. Then, a hydroxyl group and a keto group are introduced into the β-carotene, which leads to the synthesis of various xanthophylls represented by astaxanthin.
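
The carbon bookkeeping of the pathway described in this paragraph can be checked with a short sketch. The Python below is illustrative only: the function and variable names are hypothetical, and the C40 counts for phytoene, lycopene, and β-carotene are derived from the condensation of two C20 units rather than stated in the text.

```python
# Carbon counts for the precursors named in the paragraph above.
FPP_CARBONS = 15   # farnesyl pyrophosphate (C15)
IPP_CARBONS = 5    # isopentenyl diphosphate (C5)


def condense(*carbon_counts: int) -> int:
    """Return the carbon count of a condensation product.

    Assumption: these condensations keep every carbon atom (only
    pyrophosphate groups leave), so the carbon counts simply add.
    """
    return sum(carbon_counts)


ggpp = condense(FPP_CARBONS, IPP_CARBONS)   # geranylgeranyl diphosphate
phytoene = condense(ggpp, ggpp)             # two GGPP molecules condense

# Desaturation (phytoene -> lycopene) and cyclization (lycopene -> beta-carotene)
# rearrange bonds without adding or removing carbon atoms.
lycopene = phytoene
beta_carotene = lycopene

print(f"GGPP: C{ggpp}, phytoene: C{phytoene}, beta-carotene: C{beta_carotene}")
```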

[0063] Provided herein are methods for producing a carotenoid, the methods comprising culturing the non-naturally occurring organism provided herein or a plurality of non-naturally occurring organisms under conditions and for a sufficient period of time and contacting the non-naturally occurring organism with an inducer, thereby producing the carotenoid. In some embodiments, the cells or non-naturally occurring organisms provided herein are cultured in a cell culture medium (e.g., a lysogeny broth, also called LB broth or Luria Broth). In some embodiments, the cells or non-naturally occurring organisms are cultured in a bioreactor, a spinning flask, or a vessel suitable for cell growth and survival.

[0064] In some embodiments, the transgene encodes a polypeptide having an enzymatic activity that converts a methylene group at the 4-position of a β-ionone ring into a keto group. In some embodiments, the transgene comprises a crtW gene. In some embodiments, the transgene encodes a polypeptide having an enzymatic activity that adds one hydroxyl group to a carbon at the 3-position of a 4-keto-β-ionone ring and/or at the 3-position of a β-ionone ring. In some embodiments, the transgene comprises a crtZ gene sequence. In some embodiments, the transgene encodes a polypeptide having an enzymatic activity that converts lycopene into β-carotene. In some embodiments, the transgene comprises a crtY gene sequence. In some embodiments, the transgene encodes a polypeptide having an enzymatic activity that converts phytoene into lycopene. In some embodiments, the transgene comprises a crtI gene sequence. In some embodiments, the transgene encodes a polypeptide having prephytoene synthase activity. In some embodiments, the transgene comprises a crtB gene sequence. In some embodiments, the transgene encodes a polypeptide having geranylgeranyl diphosphate synthase activity. In some embodiments, the transgene comprises a crtE gene sequence. In some embodiments, the transgene encodes a polypeptide having a lycopene elongase/hydratase activity. In some embodiments, the transgene comprises a crtEB gene sequence. In some embodiments, the transgene comprises a crtEBI gene sequence. In some embodiments, the transgene comprises a crtEBIY gene sequence. In some embodiments, the transgene comprises a crtEBIYZ gene sequence. In some embodiments, the transgene comprises a crtEBI-YZW gene sequence. In some embodiments, the transgene comprises an ABA1 gene sequence. In some embodiments, the transgene comprises an ABA2 gene sequence. In some embodiments, the transgene comprises a sequence that is at least 85% identical to any one of SEQ ID NOS: 69-73, 76-79. In some embodiments, the transgene comprises a sequence that is at least 90% identical to any one of SEQ ID NOS: 69-73, 76-79. In some embodiments, the transgene comprises a sequence that is at least 95% identical to any one of SEQ ID NOS: 69-73, 76-79. In some embodiments, the transgene comprises a sequence that is at least 99% identical to any one of SEQ ID NOS: 69-73, 76-79. In some embodiments, the transgene comprises any one of SEQ ID NOS: 69-73, 76-79.

[0065] Further provided herein are methods of producing benzoic acid. In some embodiments, the biosynthetic pathway gene is a CocE gene. In some embodiments, the transgene comprises a sequence that is at least 85% identical to SEQ ID NO: 74. In some embodiments, the transgene comprises a sequence that is at least 90% identical to SEQ ID NO: 74. In some embodiments, the transgene comprises a sequence that is at least 95% identical to SEQ ID NO: 74. In some embodiments, the transgene comprises a sequence that is at least 99% identical to SEQ ID NO: 74. In some embodiments, the transgene comprises SEQ ID NO: 74.
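
The identity thresholds recited above (85%, 90%, 95%, 99%) can be illustrated with a minimal sketch. The disclosure does not specify how percent identity is computed, so the Python below assumes the simplest case: two sequences that are already aligned to the same length, compared position by position without gaps. The function names and the toy sequences are hypothetical and are not taken from any SEQ ID NO.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions carrying the same base in two pre-aligned,
    equal-length sequences (ungapped comparison; an assumption of this sketch)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a: str, seq_b: str, threshold: float = 85.0) -> bool:
    """True when the two sequences are at least `threshold` percent identical."""
    return percent_identity(seq_a, seq_b) >= threshold


# Toy 20-nt sequences that differ at a single position.
reference = "ACGTACGTACGTACGTACGT"
variant = "ACGTACGTACGAACGTACGT"

print(percent_identity(reference, variant))
print(meets_threshold(reference, variant, threshold=99.0))
```

Sequences of different lengths would first need a pairwise alignment, which is outside the scope of this sketch.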

Exemplary embodiments:

[0066] Provided herein are nucleic acid constructs, wherein the nucleic acid constructs comprise: a modified inducible promoter, wherein the modified inducible promoter comprises: (a) a TXTXXTGT (SEQ ID NO: 45) sequence at position -10, relative to a transcriptional start site of the promoter; (b) a TXGXCX (SEQ ID NO: 46) sequence at position -35, relative to a transcriptional start site of the promoter; and (c) a nucleic acid sequence that encodes a bacterial ribosome binding sequence, wherein X is any nucleobase. Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs comprise: a modified inducible promoter, wherein the modified inducible promoter comprises: (a) a TATAAT (SEQ ID NO: 1) sequence at position -10, relative to a transcriptional start site of the promoter; (b) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the promoter; and (c) a nucleic acid sequence that encodes a bacterial ribosome binding sequence. Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs further comprise a transgene. Further provided herein are nucleic acid constructs, wherein the modified inducible promoter, when operatively linked to a transgene, facilitates expression of the transgene when the nucleic acid construct is inserted into a prokaryotic cell in the presence of an inducer; and wherein: (i) the expression of the transgene in the absence of the inducer is less than an amount of expression in the absence of the inducer of the transgene operatively coupled to a weak promoter in a comparable nucleic acid construct; and (ii) the expression of the transgene in the presence of the inducer is at least equal to an amount of expression in the presence of the inducer of the transgene operatively coupled to the strong promoter in the comparable nucleic acid construct. 
Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs comprise: a modified inducible promoter, wherein the modified inducible promoter comprises: (a) a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the promoter; (b) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the promoter; and (c) a nucleic acid sequence that encodes a bacterial ribosome binding sequence. Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs further comprise a transgene. Further provided herein are nucleic acid constructs, wherein the transgene is selected from the group consisting of: a crtW gene, a crtE gene, a crtY gene, a crtI gene, a crtZ gene, a crtEB gene, a crtEBI gene, a crtEBIY gene, a crtEBIYZ gene, a crtEBI-YZW gene, an ABA1 gene, an ABA2 gene, and a CocE gene. Further provided herein are nucleic acid constructs, wherein the modified inducible promoter, when operatively linked to a transgene, facilitates expression of the transgene when the nucleic acid construct is inserted into a cell in the presence of an inducer. Further provided herein are nucleic acid constructs, wherein the expression of the transgene in the absence of the inducer is less than an amount of expression in the absence of the inducer of the transgene operatively coupled to a weak promoter in a comparable nucleic acid construct. Further provided herein are nucleic acid constructs, wherein the expression of the transgene in the presence of the inducer is at least equal to an amount of expression in the presence of the inducer of the transgene operatively coupled to the strong promoter in the comparable nucleic acid construct. 
Further provided herein are nucleic acid constructs, wherein a strong promoter increases the amount of expression of a transgene provided herein relative to a comparable inducible promoter that does not comprise SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 44, SEQ ID NO: 45, or SEQ ID NO: 46. Further provided herein are nucleic acid constructs, wherein the strong promoter increases the amount of expression of a transgene provided herein by at least 10% relative to the expression of a transgene expressed by a comparable inducible promoter that does not comprise SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 44, SEQ ID NO: 45, or SEQ ID NO: 46. Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs further comprise a sequence encoding a reporter gene. Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs further comprise one or more regulatory element. Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs further comprise a terminator sequence.

[0067] Provided herein are nucleic acid constructs comprising: a sequence that is at least 85% identical to a sequence comprising any one of: SEQ ID NOS: 1-14, 39-73. Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs comprise a sequence that is at least 90% identical to a sequence comprising any one of: SEQ ID NOS: 1-14, 39-73. Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs comprise a sequence that is at least 95% identical to a sequence comprising any one of: SEQ ID NOS: 1-14, 39-73. Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs comprise a sequence that is at least 99% identical to a sequence comprising any one of: SEQ ID NOS: 1-14, 39-73. Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs comprise a sequence comprising any one of: SEQ ID NOS: 1-14, 39-73.
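
The degenerate -10 and -35 sequences recited in paragraph [0066], in which X is any nucleobase, can be read as simple patterns. The following Python sketch is one possible way to test a candidate box against them. It assumes that "any nucleobase" means A, C, G, or T in a DNA construct; the function and constant names are hypothetical.

```python
import re


def degenerate_to_regex(motif: str) -> re.Pattern:
    """Compile a motif in which 'X' stands for any nucleobase.

    Assumption: 'any nucleobase' is restricted to the DNA bases A, C, G, T.
    """
    pattern = "".join("[ACGT]" if base == "X" else base for base in motif.upper())
    return re.compile(pattern)


def matches(motif: str, candidate: str) -> bool:
    """True when the whole candidate sequence fits the degenerate motif."""
    return degenerate_to_regex(motif).fullmatch(candidate.upper()) is not None


MINUS_10_DEGENERATE = "TXTXXTGT"   # SEQ ID NO: 45
MINUS_35_DEGENERATE = "TXGXCX"     # SEQ ID NO: 46

print(matches(MINUS_10_DEGENERATE, "TATAATGT"))  # SEQ ID NO: 44
print(matches(MINUS_35_DEGENERATE, "TTGACA"))    # SEQ ID NO: 2
```

Under these assumptions, the specific boxes of SEQ ID NO: 44 and SEQ ID NO: 2 are instances of the degenerate sequences of SEQ ID NO: 45 and SEQ ID NO: 46, respectively.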

[0068] Provided herein are nucleic acid constructs comprising: (a) a first modified inducible promoter sequence provided herein; and (b) a second modified inducible promoter sequence, wherein the second modified inducible promoter sequence comprises: (i) a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the second modified inducible promoter sequence; (ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the second modified inducible promoter sequence; and (iii) a nucleic acid sequence that encodes a second bacterial ribosome binding sequence. Further provided herein are nucleic acid constructs, wherein the nucleic acid construct further comprises a transgene. Further provided herein are nucleic acid constructs, wherein the modified inducible promoter, when operatively linked to a transgene, facilitates expression of the transgene when the nucleic acid construct is in the presence of an inducer. Further provided herein are nucleic acid constructs, wherein the expression of the transgene in the absence of the inducer is less than an amount of expression in the absence of the inducer of the transgene operatively coupled to a weak promoter in a comparable nucleic acid construct. Further provided herein are nucleic acid constructs, wherein the expression of the transgene in the presence of the inducer is at least equal to an amount of expression in the presence of the inducer of the transgene operatively coupled to the strong promoter in the comparable nucleic acid construct. Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs further comprise a terminator sequence. Further provided herein are nucleic acid constructs, wherein the nucleic acid constructs further comprise one or more regulatory elements. Further provided herein are nucleic acid constructs, wherein the transgene comprises a carotenoid biosynthesis pathway gene. 
Further provided herein are nucleic acid constructs, wherein the transgene is selected from the group consisting of: a crtW gene, a crtE gene, a crtY gene, a crtI gene, a crtZ gene, a crtEB gene, a crtEBI gene, a crtEBIY gene, a crtEBIYZ gene, a crtEBI-YZW gene, an ABA1 gene, an ABA2 gene, and a CocE gene.

[0069] Provided herein are compositions, wherein the compositions comprise two or more nucleic acid constructs provided herein.

[0070] Provided herein are cells comprising a nucleic acid construct or a composition provided herein. Further provided herein are cells, wherein the cells are prokaryotic cells. Further provided herein are cells, wherein the cells are bacterial cells. Further provided herein are cells, wherein the cells are Gram-negative bacterial cells. Further provided herein are cells, wherein the cells are P. putida bacterial cells or V. natriegens bacterial cells. Further provided herein are cells, wherein the cells are eukaryotic cells. Further provided herein are cells, wherein the cells are mammalian cells.

[0071] Provided herein are cell-free systems comprising a nucleic acid construct provided herein, a composition provided herein and/or an RNA polymerase. Further provided herein are cell-free systems, wherein the cell-free systems further comprise ribosomes, aminoacyl transfer RNAs, translation factors, and a buffer.

[0072] Provided herein are methods of expressing a protein encoded by a transgene in a cell, the methods comprising: (a) transforming a cell with a nucleic acid construct, wherein the nucleic acid construct comprises a modified inducible promoter that comprises: (i) a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the promoter; (ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the promoter; (iii) a nucleic acid that encodes a bacterial ribosome binding sequence; and a transgene; and then (b) contacting the cell with an inducer, thereby expressing the protein encoded by the transgene. Further provided herein are methods, wherein the transgene when cloned into a comparable nucleic acid construct that comprises a strong promoter inhibits growth of the cell prior to the contacting with the inducer, thereby preventing expression of the transgene via the comparable nucleic acid. Further provided herein are methods, wherein the cell is a prokaryotic cell. Further provided herein are methods, wherein the prokaryotic cell is a bacterium. Further provided herein are methods, wherein the cell does not comprise a T7 promoter or a T7 polymerase. Further provided herein are methods, wherein when in the absence of the inducer, the expression of the protein encoded by the transgene is lower relative to the expression of the protein in a comparable cell comprising the transgene operably linked to a T7 promoter. Further provided herein are methods, wherein when in the presence of the inducer the expression of the protein encoded by the transgene is greater relative to the expression of the protein in a comparable cell comprising the transgene operably linked to a T7 promoter. 
[0073] Provided herein are non-naturally occurring organisms comprising a nucleic acid construct comprising: (a) a first inducible promoter sequence comprising: (i) a TATAATGT (SEQ ID NO: 44) sequence at position -10, relative to a transcriptional start site of the promoter; (ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the promoter; and (iii) a nucleic acid sequence that encodes a bacterial ribosome binding sequence; (b) one or more of a biosynthesis pathway transgene; and (c) a second modified inducible promoter sequence comprising: (i) a TATAATGT (SEQ ID NO: 44) consensus sequence at position -10, relative to a transcriptional start site of the promoter; (ii) a TTGACA (SEQ ID NO: 2) sequence at position -35, relative to a transcriptional start site of the promoter; and (iii) a nucleic acid sequence that encodes a bacterial ribosome binding sequence. Further provided herein are non-naturally occurring organisms wherein the one or more biosynthesis pathway transgene is selected from the group consisting of: a crtW gene, a crtE gene, a crtY gene, a crtI gene, a crtZ gene, a crtEB gene, a crtEBI gene, a crtEBIY gene, a crtEBIYZ gene, a crtEBI-YZW gene, and a CocE gene. Further provided herein are non-naturally occurring organisms, wherein the non-naturally occurring organism comprises a bacterium. Further provided herein are non-naturally occurring organisms wherein the bacterium is a Gram-negative bacterium. Further provided herein are non-naturally occurring organisms, wherein the Gram-negative bacterium is a P. putida bacterium or a V. natriegens bacterium. Further provided herein are non-naturally occurring organisms, wherein the bacterium is a Gram-positive bacterium. Further provided herein are non-naturally occurring organisms, wherein the nucleic acid construct comprises a terminator sequence. 
Further provided herein are non-naturally occurring organisms, wherein the nucleic acid construct comprises one or more regulatory element. Further provided herein are non-naturally occurring organisms, wherein the nucleic acid construct comprises a reporter gene.

[0074] Provided herein are compositions, wherein the compositions comprise: the non-naturally occurring organism provided herein; and an inducer. Further provided herein are compositions, wherein the inducer comprises: anhydrotetracycline (aTc), isopropyl β-D-1-thiogalactopyranoside, cocaine, metallothionine, ecdysone, an antibiotic agent, galactose, a steroid, or a divalent cation.

[0075] Provided herein are methods for producing a carotenoid, the methods comprising: culturing the non-naturally occurring organism provided herein; and contacting the non-naturally occurring organism with an inducer, thereby producing the carotenoid. Further provided herein are methods, wherein the non-naturally occurring organism comprises a nucleic acid comprising a gene selected from the group consisting of: a crtW gene, a crtE gene, a crtY gene, a crtI gene, a crtZ gene, a crtEB gene, a crtEBI gene, a crtEBIY gene, a crtEBIYZ gene, and a crtEBI-YZW gene.

[0076] Provided herein is a carotenoid produced by the methods provided herein, wherein the carotenoid is lycopene, beta-carotene, zeaxanthin, canthaxanthin, or astaxanthin.

[0077] Provided herein are methods for producing benzoic acid, the methods comprising: culturing the cell provided herein or the non-naturally occurring organism provided herein; and contacting the cell or the non-naturally occurring organism with an inducer, thereby producing benzoic acid. Further provided herein are methods, wherein the cell or the non-naturally occurring organism comprises a nucleic acid comprising a CocE gene.

[0078] Provided herein are compositions comprising benzoic acid, wherein the benzoic acid is produced by the methods provided herein.

[0079] Provided herein are kits comprising the nucleic acid construct provided herein, packaging, buffers, and materials therefor.

[0080] Provided herein are kits comprising the cell provided herein, packaging, culture medium, buffers, and materials therefor.

[0081] Provided herein are kits comprising the non-naturally occurring organism provided herein, packaging, culture medium, buffers, and materials therefor.

[0082] Provided herein are nucleic acid constructs comprising a modified inducible promoter, wherein the modified inducible promoter comprises: a TATAAT consensus sequence at position -10, relative to a transcriptional start site of the promoter; a TTGACA sequence at the -35 position, relative to a transcriptional start site of the promoter; a bacterial ribosome binding sequence; wherein the modified inducible promoter, when operatively linked to a transgene, facilitates expression of the transgene when the nucleic acid construct is inserted into a cell in the presence of an inducer; and wherein: the expression of the transgene in the absence of the inducer is less than an amount of expression in the absence of the inducer of the transgene operatively coupled to a weak promoter in a comparable nucleic acid construct; and the expression of the transgene in the presence of the inducer is at least equal to an amount of expression in the presence of the inducer of the transgene operatively coupled to the strong promoter in the comparable nucleic acid construct.

[0083] Provided herein are isolated prokaryotic cells comprising a nucleic acid construct provided herein.

[0084] Provided herein are methods of expressing a transgene toxic to a prokaryotic cell, the methods comprising: transforming a nucleic acid construct into the prokaryotic cell, wherein the nucleic acid construct comprises a modified inducible promoter that comprises: a TATAAT consensus sequence at position -10, relative to a transcriptional start site of the promoter; a TTGACA sequence at the -35 position, relative to a transcriptional start site of the promoter; a bacterial ribosome binding sequence; and contacting the prokaryotic cell with an inducer, thereby expressing the transgene; wherein the transgene when cloned into a comparable nucleic acid construct that comprises a strong promoter inhibits growth of the prokaryotic cell prior to the contacting with the inducer, thereby preventing expression of the transgene via the comparable nucleic acid.

[0085] Provided herein are methods of expressing a transgene in a prokaryotic cell that does not comprise a T7 RNA polymerase, the methods comprising: transforming a nucleic acid construct into the prokaryotic cell, wherein the nucleic acid construct comprises a modified inducible promoter that comprises: a TATAAT consensus sequence at position -10, relative to a transcriptional start site of the promoter; a TTGACA sequence at the -35 position, relative to a transcriptional start site of the promoter; a bacterial ribosome binding sequence; and contacting the prokaryotic cell with an inducer, thereby expressing the transgene; wherein the expression of the transgene in the absence of the inducer is less than an amount of expression in the absence of the inducer of the transgene operatively coupled to a T7 promoter in a comparable nucleic acid construct; and the expression of the transgene in the presence of the inducer is greater than an amount of expression in the presence of the inducer of the transgene operatively coupled to the T7 promoter in the comparable nucleic acid construct.

[0086] Provided herein are nucleic acids, nucleic acid constructs, compositions, cells, non-naturally occurring organisms, cell-free systems, kits, or methods provided herein.

[0087] For a better understanding of the present disclosure and of its many advantages, the following examples are given by way of illustration and without limiting the scope of this disclosure.

EXAMPLES

EXAMPLE 1: INDUCIBLE PROMOTER ARCHITECTURE.

[0088] In exemplary embodiments, the -10 and -35 boxes of the original lac promoter (FIG. 1Ai), which are targeted by sigma 70 (σ70), were modified to generate improved variants such as the lacUV5 (FIG. 1Aii) and tacI (FIG. 1Aiii) promoters. In the lacUV5 promoter, the -10 box of the original lac promoter was replaced by the Pribnow box (TATAAT). Later, the lacUV5 and trp promoters were combined to create the tacI promoter, which increased transcription 11-fold compared to its predecessor with the incorporation of the -35 consensus box TTGACA. The tet promoter (FIG. 1Bi) shares the highly conserved -35 hexamer with the tac promoter; however, it does not contain the Pribnow box.

[0089] Provided herein are exemplary lac and tet expression systems with improved strength, control, and portability.

[0090] The genetic architecture of the lac and tet expression systems was modified in three ways:

(1) addition of the consensus -10 and -35 sequence boxes to be strongly targeted by σ70;

(2) incorporation of a strong ribosome binding site recognized by a broad spectrum of Gram-negative bacteria; and

(3) independent control of the transcriptional regulators by appropriately tuned constitutive promoters.

[0091] The results were validated with the reporter protein mCardinal, which significantly improves the dynamic range of promoter measurements over more commonly used green fluorescent proteins. The improvement seen in the mCardinal dynamic range is due to intrinsic fluorescence of many bacterial species that interferes with measurements in the green wavelengths, and the bacteria provided herein have reduced autofluorescence in the far red spectrum. Additionally, the inducible promoters provided herein were compared with the pET system with the production of the cocaine esterase CocE, a thermosensitive enzyme capable of metabolizing cocaine into benzoic acid. CocE, which is prone to form inclusion bodies in leaky E. coli expression systems, is expressed as an inducible and soluble protein using the promoters provided herein. The results and assays provided in Example 2 further support that the expression system provided herein is a significant improvement over available expression systems in providing tight OFF state control while achieving high yields of recombinant protein, and with the advantage of direct portability to alternative host species.
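
The effect of host autofluorescence on the measurable dynamic range can be illustrated with a small sketch. The Python below is a simplified model, not a description of how the measurements in this disclosure were made: it assumes that autofluorescence adds the same offset to the induced and uninduced readings, and all numbers are hypothetical arbitrary units chosen only for illustration.

```python
def measured_ratio(true_on: float, true_off: float, autofluorescence: float) -> float:
    """Apparent ON/OFF ratio when a constant autofluorescence offset is added
    to both the induced (ON) and uninduced (OFF) reporter signals."""
    if true_off + autofluorescence <= 0:
        raise ValueError("uninduced reading must be positive")
    return (true_on + autofluorescence) / (true_off + autofluorescence)


# Hypothetical reporter signals (arbitrary units), identical in both channels.
TRUE_ON = 1000.0
TRUE_OFF = 10.0

# Hypothetical offsets: high in a green channel, low in a far-red channel.
green_channel = measured_ratio(TRUE_ON, TRUE_OFF, autofluorescence=90.0)
far_red_channel = measured_ratio(TRUE_ON, TRUE_OFF, autofluorescence=5.0)

print(f"apparent dynamic range, green channel:   {green_channel:.1f}-fold")
print(f"apparent dynamic range, far-red channel: {far_red_channel:.1f}-fold")
```

In this model the same promoter appears to have a much smaller dynamic range in the channel with the larger offset, which is consistent with the rationale given above for using a far-red reporter.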

EXAMPLE 2: GENERATION OF INDUCIBLE PROMOTERS IN GRAM-NEGATIVE BACTERIA.

Bacterial strains

[0092] E. coli DH10B, P. putida JE90 (a derivative of KT2440 with BxB1int-attB), and V. natriegens Vmax X2 (Codex DNA, Inc.) were used to evaluate the synthetic lac and tet promoters. The plasmid pET21a-mCardinal was evaluated in the E. coli BL21 strain. The selective markers kanamycin (50 µg/mL for E. coli and P. putida, and 400 µg/mL for V. natriegens), ampicillin (100 µg/mL for E. coli), and spectinomycin (60 µg/mL for E. coli and P. putida, and 250 µg/mL for V. natriegens) were added to LB medium when required. E. coli strains were transformed by electroporation and V. natriegens by chemical transformation. Integration of plasmids into the P. putida chromosome was performed by electroporation following the protocol described, for example, in Elmore et al. Metab Eng Commun, 5: 1-8 (2017), the contents of which are incorporated herein by reference in their entirety.
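
For convenience, the selection conditions listed in paragraph [0092] can be captured in a small lookup table. The Python sketch below is only an illustration: the data structure and function name are hypothetical, and the concentrations are taken to be in micrograms per millilitre.

```python
# Selection concentrations from paragraph [0092], assumed to be in µg/mL.
SELECTION_UG_PER_ML = {
    "kanamycin": {"E. coli": 50, "P. putida": 50, "V. natriegens": 400},
    "ampicillin": {"E. coli": 100},
    "spectinomycin": {"E. coli": 60, "P. putida": 60, "V. natriegens": 250},
}


def selection_concentration(antibiotic: str, host: str) -> int:
    """Return the listed concentration (µg/mL) for an antibiotic/host pair.

    Raises KeyError when the paragraph lists no value for that combination.
    """
    try:
        return SELECTION_UG_PER_ML[antibiotic][host]
    except KeyError:
        raise KeyError(f"no concentration listed for {antibiotic} in {host}") from None


print(selection_concentration("kanamycin", "V. natriegens"))
print(selection_concentration("spectinomycin", "P. putida"))
```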

[0093] Plasmid information is listed in Table 1 below. Any plasmid containing any version of the σ70-based promoters as shown in FIG. 2A and FIG. 2B has been labeled “pσ70” followed by information about the construct and the specific promoter. “Lac” and “Tc” constructs are constitutive and do not express a repressor protein; “LacI” and “TcR” constructs are inducible and include expression of LacI or TetR, respectively. Numbers in parentheses indicate a vector backbone other than pJH0204 ((13) for pJH0228 with the pCloDF13 origin of replication, and (19) for pUC19).

Table 1. List of Plasmids and Vectors.

Molecular cloning:

[0094] Schematic diagrams of plasmid construction are shown in FIGS. 1A-1B. Polymerase chain reactions (PCR) were carried out with Phusion DNA polymerase (ThermoFisher Scientific, USA). Digestion of DNA was performed with fast digest restriction enzymes, and DNA fragments were joined by T4 DNA ligase or a Gibson assembly kit (ThermoFisher Scientific, USA). Oligonucleotides and DNA synthesis were ordered from IDT (IDT, USA). DNA sequencing was performed by QuintaraBio (USA). Shuttle vectors pJH0204 and pJH0228 were used. The synthetic Vlac and VTc promoters were synthesized as gBlocks and incorporated into the MCS of pJH0204 using BamHI and XhoI. Further, the sfGFP and mCardinal ORFs were codon optimized for P. putida and inserted downstream of the synthetic promoters using NdeI and HindIII. The transcriptional regulator tetR gene was synthesized with the PJ23119 promoter in a single gBlock; tetR was codon optimized for P. putida using the IDT DNA codon optimization tool (available on the world wide web at https://www.idtdna.com/pages/tools/codon-optimization-tool) and inserted after mCardinal with EcoRI and XhoI. The lacI gene was PCR amplified from the pET21a vector and placed under the control of PJ23119 using NcoI and XhoI, but this combination proved toxic to E. coli; therefore, lacI was PCR amplified together with its native promoter and ligated after mCardinal using EcoRI and XhoI. The pJH0228-V2lac-mCardinal and pJH0228-V2lac/lacI-mCardinal vectors were produced using the same cloning strategy. mCardinal was inserted into pET21a using NdeI and XhoI. The transcriptional unit V2Tc/tetR-mCardinal was inserted into the pUC19 vector by Gibson assembly, removing the NdeI RE nucleotide sequence in the plasmid and incorporating terminators insulating the transcriptional unit. The DNA sequence of CocE was codon optimized for P. putida using the IDT DNA codon optimization tool and cloned into the pET21a and V2Tc-tetR expression systems using NdeI and HindIII. A complete list of plasmids generated in this study is shown in Table 1. Primers used in this study are listed in Table 2.
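
The restriction-enzyme cloning steps described above rely on inserts that carry the expected recognition sites at their ends. A minimal Python sketch of such a check is shown below. The recognition sequences are the standard ones for these enzymes and are not recited in the text; the function names and the toy insert are hypothetical.

```python
# Standard recognition sequences for enzymes named in the cloning strategy.
RESTRICTION_SITES = {
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "NcoI": "CCATGG",
    "NdeI": "CATATG",
    "XhoI": "CTCGAG",
}


def clean(sequence: str) -> str:
    """Remove whitespace (e.g., extraction spaces) and upper-case the sequence."""
    return "".join(sequence.split()).upper()


def has_flanking_sites(insert: str, five_prime: str, three_prime: str) -> bool:
    """True when the insert starts and ends with the given enzymes' sites."""
    seq = clean(insert)
    return seq.startswith(RESTRICTION_SITES[five_prime]) and seq.endswith(
        RESTRICTION_SITES[three_prime]
    )


def find_sites(insert: str, enzyme: str) -> list:
    """Return 0-based start positions of every occurrence of the enzyme's site."""
    seq, site = clean(insert), RESTRICTION_SITES[enzyme]
    positions, start = [], seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# Toy insert: NdeI site, a short coding stretch, HindIII site.
toy_insert = "CATATG GTGGACGGTAAT AAGCTT"
print(has_flanking_sites(toy_insert, "NdeI", "HindIII"))
print(find_sites(toy_insert, "NdeI"), find_sites(toy_insert, "HindIII"))
```

A site that appears anywhere other than the intended end would show up as an extra position in `find_sites` and would need to be removed before cloning with that enzyme.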

Table 2. List of Primers.

[0095] Primers were used as sequencing primers (e.g., placI upstream, tetR upstream). 204F primers were used to incorporate BamHI, NdeI, HindIII, KpnI, EcoRI, PstI, XbaI, NcoI, and XhoI RE sites into the MCS of pJH0204 and to remove NcoI from the kanamycin cassette. The pUC19 modified FW XhoI primers were used to incorporate terminators and BamHI/XhoI sites into the pUC19 vector.

[0096] Sequences of cocE, tetR, and mCardinal, codon optimized for P. putida, and the PJ23119 promoter sequence, are provided below.

[0097] DNA sequence of mCardinal codon optimized for P. putida:

ATGGTGAGTAAGGGTGAGGAGCTCATTAAGGAGAACATGCACATGAAGCTGTATATGGAGGGCACCGTAAACAACCA

CCACTTCAAGTGTACCACCGAGGGTGAAGGTAAACCCTACGAGGGGACGCAGACCCA ACGCATCAAGGTCGTGGAGG

GCGGCCCGCTGCCTTTCGCATTCGACATTCTGGCGACCTGTTTTATGTACGGCTCGA AGACCTTCATCAACCACACC

CAAGGCATCCCGGACTTCTTCAAGCAGAGCTTCCCTGAGGGCTTCACCTGGGAGCGC GTCACCACGTATGAAGACGG

TGGGGTGCTCACCGTGACCCAGGACACGAGCTTGCAGGATGGCTGCTTGATTTACAA CGTCAAGCTGCGCGGGGTGA

ACTTCCCTAGCAACGGGCCAGTGATGCAGAAAAAGACGCTGGGTTGGGAGGCCACCA CCGAGACCCTGTACCCGGCC

GACGGGGGGCTGGAAGGGCGGTGCGATATGGCCCTGAAATTGGTCGGCGGCGGTCAT TTGCACTGCAATCTCAAGAC

CACGTACCGCTCCAAGAAACCCGCCAAAAACCTGAAGATGCCTGGTGTTTATTTTGT CGACCGGCGCCTGGAGCGCA

TCAAGGAAGCGGACAATGAGACGTACGTGGAACAGCACGAAGTGGCCGTGGCTCGTT ATTGCGATCTGCCGTCGAAG

CTGGGTCACAAACTGAACGGCATGGATGAGCTGTACAAAGATTATAAGGATGATGACGACAAGTAA (SEQ ID NO: 39)
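
As a quick sanity check on a listed coding sequence, its first and last codons can be translated with the standard genetic code. The Python sketch below is illustrative only: the function name is hypothetical, and the two fragments are copied from the beginning and end of the mCardinal sequence above with extraction spaces removed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA fragment codon by codon (stop codons as '*')."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("fragment length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First 21 and last 27 nucleotides of the mCardinal sequence listed above.
five_prime_fragment = "ATGGTGAGTAAGGGTGAGGAG"
three_prime_fragment = "GATTATAAGGATGATGACGACAAGTAA"

print(translate(five_prime_fragment))
print(translate(three_prime_fragment))
```

The first fragment should begin with a methionine (ATG) start codon and the second should end in a stop codon if the listed open reading frame is intact.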

[0098] DNA sequence of sfGFP codon optimized for P. putida:

ATGTCCAAAGGTGAAGAGCTGTTTACCGGCGTCGTGCCCATTCTGGTGGAGCTGGAT GGCGACGTCAACGGGCACAA

GTTTAGCGTCCGTGGCGAAGGTGAGGGCGACGCCACGAACGGTAAGCTGACGCTGAA ATTCATTTGCACCACCGGCA

AATTGCCTGTACCCTGGCCCACCCTGGTGACCACGCTCACCTACGGCGTACAGTGCT TCAGCCGTTACCCGGACCAC

ATGAAGCGTCACGACTTCTTCAAAAGCGCCATGCCGGAGGGTTACGTGCAGGAGCGT ACGATTAGTTTCAAGGACGA

CGGCACCTATAAGACCCGTGCCGAAGTGAAGTTCGAAGGCGATACGTTGGTGAACCG TATCGAGTTGAAGGGTATCG

ACTTTAAGGAAGACGGCAACATCCTGGGCCATAAGCTGGAGTACAATTTCAACAGCC ATAACGTTTACATCACCGCC

GATAAACAGAAGAACGGCATTAAAGCCAACTTTAAGATCCGCCACAACGTCGAAGAC GGCTCGGTGCAGCTGGCCGA

CCATTATCAGCAAAACACCCCCATCGGTGATGGGCCCGTGCTGCTGCCGGATAACCA TTATCTGAGCACGCAGTCGG

TGCTCAGCAAGGACCCTAACGAAAAGCGCGATCACATGGTGCTGCTGGAGTTCGTCA CGGCGGCGGGGATCACCCAT

GGGATGGACGAGCTCTACAAAGACTATAAAGATGACGATGACAAGTAAA (SEQ ID NO: 40)

[0099] DNA sequence of tetR codon optimized for P. putida:

ATGTCCCGCCTGGATAAATCGAAAGTGATTAACTCGGCCCTCGAATTGCTGAATGAA GTCGGTATCGAGGGGCTGAC

GACCCGTAAATTGGCACAAAAGTTGGGGGTGGAGCAACCCACGTTGTATTGGCACGT CAAAAATAAGCGGGCATTGC

TGGATGCCCTCGCTATTGAAATGTTGGATCGCCACCATACCCATTTCTGTCCACTGG AGGGCGAGTCCTGGCAGGAC

TTTCTCCGCAACAACGCGAAATCCTTTCGCTGTGCACTCTTGTCCCATCGGGACGGT GCTAAGGTGCACTTGGGCAC

CCGTCCCACCGAAAAACAATACGAAACCTTGGAAAATCAATTGGCGTTTTTGTGCCA GCAAGGGTTTAGCTTGGAGA

ATGCTCTCTATGCGCTCTCGGCTGTCGGGCACTTTACGTTGGGGTGCGTGTTGGAGG ACCAGGAGCATCAAGTCGCA

AAAGAGGAGCGTGAAACCCCAACCACGGACTCGATGCCACCTCTGCTCCGCCAAGCT ATCGAACTCTTCGATCATCA

GGGCGCGGAGCCAGCCTTCCTCTTTGGGCTGGAGCTGATTATCTGCGGTTTGGAAAA ACAACTCAAGTGTGAAAGCG

GGTCCTAA (SEQ ID NO: 41)

[00100] DNA sequence of cocE codon optimized for P. putida:

CATATGGTGGACGGTAATTATTCGGTAGCGTCCAACGTTATGGTGCCGATGCGCGAC GGGGTGCGCTTGGCTGTAGA

TCTGTACCGCCCGGACGCAGATGGCCCTGTACCGGTCCTGCTGGTCCGCAACCCCTA CGACAAATTCGACGTGTTCG

CTTGGAGTACGCAGAGCACGAACTGGCTGGAATTTGTGCGCGATGGGTACGCCGTCG TCATCCAAGACACCCGGGGC CTCTTTGCATCCGAAGGTGAGTTCGTTCCACATGTTGATGACGAGGCGGATGCGGAAGAC ACGCTGAGCTGGATCTT

GGAACAAGCATGGTGCGACGGCAATGTGGGTATGTTCGGTGTAAGCTACCTGGGCGT TACGCAGTGGCAAGCTGCTG TTAGCGGTGTGGGTGGTTTGAAGGCAATCGCCCCGAGCATGGCGAGCGCGGATCTGTACC GTGCCCCCTGGTACGGT CCTGGCGGCGCCCTGAGCGTGGAAGCACTCCTGGGCTGGAGCGCATTGATCGGTACGGGC CTGATTACCAGCCGTAG CGATGCCCGCCCGGAAGACGCAGCCGACTTCGTACAGCTGGCAGCCATCCTGAACGATGT GGCCGGTGCCGCAAGCG TGACCCCTCTGGCCGAACAGCCCTTGTTGGGCCGCCTGATCCCTTGGGTGATCGACCAGG TGGTGGACCATCCAGAC AACGACGAGTCGTGGCAGAGCATCTCGCTCTTTGAACGTTTGGGTGGGCTCGCTACCCCG GCCTTGATTACCGCCGG GTGGTACGATGGCTTCGTGGGCGAGAGCCTCCGTACCTTCGTAGCTGTGAAGGACAACGC GGATGCGCGTCTGGTGG TGGGGCCGTGGAGCCACAGCAATCTGACCGGCCGTAATGCCGACCGTAAGTTTGGGATCG CCGCGACCTACCCCATC CAGGAGGCGACGACCATGCACAAGGCTTTTTTCGACCGGCACCTCCGTGGCGAGACCGAT GCCCTGGCAGGGGTGCC CAAGGTGCGCCTCTTCGTAATGGGTATCGATGAGTGGCGCGACGAGACCGACTGGCCATT GCCAGATACCGCTTACA CGCCTTTTTACCTCGGGGGCTCCGGTGCGGCCAACACGAGCACGGGTGGTGGGACCCTGT CGACCTCGATCAGCGGC ACGGAGTCGGCGGACACCTACCTGTATGATCCTGCCGACCCCGTGCCAAGTCTGGGCGGC ACCCTCCTCTTCCATAA TGGGGACAACGGTCCAGCTGACCAGCGCCCGATTCACGATCGCGACGACGTGCTGTGCTA CTCCACCGAGGTGTTGA CCGACCCCGTGGAAGTAACGGGGACGGTTTCGGCTCGCCTGTTCGTGTCCTCGTCGGCCG TGGATACCGATTTTACC GCCAAGTTGGTCGACGTGTTCCCCGATGGTCGGGCAATCGCTCTCTGCGACGGCATCGTG CGTATGCGCTACCGGGA GACCTTGGTAAATCCTACGCTCATTGAGGCCGGTGAGATTTACGAGGTGGCTATTGATAT GCTGGCCACCAGCAACG TGTTTTTGCCGGGCCACCGCATCATGGTGCAAGTTAGCAGCTCGAACTTCCCGAAGTACG ACCGCAACTCCAACACC GGCGGCGTCATCGCTCGCGAGCAACTGGAGGAAATGTGCACCGCCGTAAACCGCATTCAC CGCGGCCCCGAACACCC GTCCCATATCGTGCTGCCGATCATTAAGCGCGACTATAAGGACGACGACGATAAGTGAAA GCTT ( SEQ ID NO : 42 )

[00101] DNA sequence of pJ23119 promoter:

TTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGCCGCAGTAAGAGAGGAATGTACAC (SEQ ID NO: 43)
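As an illustration of how the σ70 consensus elements discussed in this document can be located in a promoter sequence, the following minimal Python sketch scans the PJ23119 sequence listed above for the TTGACA (-35) and TATAAT (-10) hexamers and reports the spacer length between them. The function name `find_boxes` and the returned dictionary keys are hypothetical helpers introduced only for this example; the sketch does a plain exact-match search and is not a promoter prediction method.

```python
# Illustrative sketch: locate the sigma-70 consensus hexamers in a promoter
# sequence and measure the spacer between them (exact string matching only).
MINUS_35 = "TTGACA"  # -35 consensus hexamer named in the text
MINUS_10 = "TATAAT"  # -10 (Pribnow box) consensus hexamer named in the text

# PJ23119 promoter sequence (SEQ ID NO: 43) with whitespace removed.
PJ23119 = "TTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGCCGCAGTAAGAGAGGAATGTACAC"


def find_boxes(seq: str) -> dict:
    """Return 0-based start positions of the first -35 box, the first -10 box
    downstream of it, and the number of nucleotides between the two boxes."""
    seq = "".join(seq.split()).upper()  # tolerate spaces and lower case
    p35 = seq.find(MINUS_35)
    if p35 < 0:
        raise ValueError("no -35 consensus hexamer found")
    p10 = seq.find(MINUS_10, p35 + len(MINUS_35))
    if p10 < 0:
        raise ValueError("no -10 consensus hexamer found downstream of -35")
    spacer = p10 - (p35 + len(MINUS_35))
    return {"minus35_start": p35, "minus10_start": p10, "spacer_nt": spacer}


boxes = find_boxes(PJ23119)
print(boxes)
```

For this sequence the search finds both hexamers separated by a 17-nucleotide spacer. The same helper could be applied to any candidate promoter string, with the caveat that real promoters often carry near-consensus boxes that an exact match would miss.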

SDS-PAGE

[00102] SDS-PAGE was carried out on a 4-12% Bis-Tris Midi Protein Gel in an XCell4 SureLock Midi system (Invitrogen, USA). E. coli, P. putida and V. natriegens cell extracts were obtained by diluting an overnight culture 1:50 in 10 mL of fresh LB medium and growing it at 220 rpm (37°C for E. coli; 30°C for P. putida and V. natriegens) until the culture reached an optical density at 600 nm (OD600) of 0.7, measured spectrophotometrically; cultures were then induced with 0.2 mM IPTG or 0.1 µg/mL aTc accordingly. Induced cultures were grown for 5 hours, and 2 mL of culture was spun down at 14,000 rpm and 4°C for 10 minutes and frozen at -20°C for further analysis. Cells were lysed with Complete Bacterial Protein Extraction Reagent (ThermoFisher Scientific), and crude extracts were analyzed by SDS-PAGE. Total protein concentration was estimated with a Thermo Scientific NanoDrop One, and ~10 mg total protein of each cell extract was loaded into the SDS gel.

Expression of CocE and measurement of benzoic acid production

[00103] Recombinant E. coli BL21 carrying pET21a-cocE and DH10B containing either the pσ70 V2TcR-cocE or the pσ70 V2TcR-cocE19 plasmid were cultivated overnight at 30°C and 220 rpm. Fresh cultures were started with 5% of the overnight culture and induced at OD600 ~0.7 with 0.2 mM IPTG or 0.1 µg/mL aTc accordingly. 1 mL samples were collected each hour for 6 hours by centrifugation at 4°C and 14,000 rpm and stored at -80°C. Cell pellets were disrupted using 150 µL of B-PER Complete Bacterial Protein Extraction Reagent (ThermoFisher Scientific) for 25 minutes at room temperature, and soluble fractions were collected after centrifugation for 25 minutes at 4°C and 14,000 rpm. Soluble fractions (~3 mg/mL) were incubated with 0.015 mg of cocaine for 20 minutes at 28°C, and benzoic acid production was estimated using the benzoic acid detection kit for feed (Atogene, EZ2013-03).
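For orientation, the theoretical ceiling on benzoic acid in this assay can be estimated from simple stoichiometry. The sketch below assumes, as an illustration only, that the 0.015 mg of cocaine is the free base (molar mass about 303.35 g/mol) and that hydrolysis proceeds 1:1 to benzoic acid (about 122.12 g/mol); neither the salt form nor the conversion efficiency is stated in the protocol, and the function name `max_benzoic_acid_ug` is hypothetical.

```python
# Back-of-the-envelope stoichiometry for the CocE assay (illustrative only).
# Assumptions (not stated in the protocol): cocaine is the free base and
# hydrolysis is 1:1, one benzoic acid released per cocaine molecule.
MW_COCAINE = 303.35       # g/mol, C17H21NO4 free base (assumed form)
MW_BENZOIC_ACID = 122.12  # g/mol, C7H6O2


def max_benzoic_acid_ug(cocaine_mg: float) -> float:
    """Theoretical maximum benzoic acid, in micrograms, for complete hydrolysis."""
    mmol = cocaine_mg / MW_COCAINE           # mg divided by g/mol gives mmol
    return mmol * MW_BENZOIC_ACID * 1000.0   # mmol * g/mol = mg, then mg -> ug


cocaine_mg = 0.015                             # cocaine per reaction, from the protocol
nmol_cocaine = cocaine_mg / MW_COCAINE * 1e6   # mmol -> nmol
ceiling_ug = max_benzoic_acid_ug(cocaine_mg)

print(f"{nmol_cocaine:.1f} nmol cocaine -> at most {ceiling_ug:.2f} ug benzoic acid")
```

Under these assumptions the reaction contains roughly 49 nmol of substrate, so about 6 µg of benzoic acid is the most the detection kit could report per reaction; a hydrochloride salt would lower that ceiling.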

EXAMPLE 3: DESIGN OF SYNTHETIC LAC AND TET PROMOTERS.

[00104] A schematic diagram of plasmid construction is shown in FIGS. 1A-1B. All plasmids used in the assay are listed in Table 1. The original lac promoter (FIG. 2Ai) lacks the -35 conserved box recognized by the σ70 subunit of RNA polymerase. Original lac and tet (FIG. 2Bi) promoters lack the Pribnow (-10) box. Three synthetic variants of each promoter were constructed containing the highly conserved σ70 consensus hexamers at positions -10 and -35. Version 1 (V1lac, FIG. 2Aiv, and V1Tc, FIG. 2Bii) of each promoter resembles the original lac and tet promoters, with the incorporation of a strong RBS from P. putida. Version 2 (V2lac, FIG. 2Av, and V2Tc, FIG. 2Biii) replaces the -35 and -10 boxes of Version 1 with the highly conserved σ70 consensus hexamers TTGACA and TATAAT, respectively. Version 3 (V3lac, FIG. 2Avi, and V3Tc, FIG. 2Biv) incorporates an additional lacO operator and the RBS of the tacI promoter. Version 4 (V4lac, FIG. 2Avii, and V4Tc, FIG. 2Bv) displaces the location of the -10 and -35 boxes (V4lac) and incorporates an additional -35 box in V4Tc. The synthetic promoters were cloned into the MCS of the shuttle vector pJH0204 containing the origin of replication pColE1 with 25-30 copies per cell in E. coli. The design allowed the incorporation of the transcriptional regulators lacI and tetR after a terminator L3S2P21 and under the control of the constitutive promoter PJ23119 (FIG. 2C).

[00105] No aberrant phenotypes were observed in the E. coli strains carrying the PJ23119-tetR construct. However, E. coli failed to maintain the PJ23119-lacI construct. The plasmids containing the lacI repressor gene in this configuration produced slow-growing colonies and yielded plasmids with aberrant restriction patterns, indicating gross plasmid rearrangements. Therefore, the strong PJ23119 promoter was replaced with the native lacI promoter, after which no further toxicity was observed.

[00106] The reporter protein mCardinal was incorporated downstream of the synthetic promoters and the plasmids were transformed into E. coli DH10B, P. putida JE90 (a KT2440 derivative), and V. natriegens Vmax X2 strains. The V2lac-mCardinal constructs could not be maintained in E. coli, likely due to the high strength of the V2lac promoter. To avoid this abnormal phenotype, the genetic circuit V2lac-mCardinal was produced in the pJH0228 vector containing the CloDF13 origin of replication at about 10 copies per cell, allowing stable maintenance of the V2lac promoter.

EXAMPLE 4: SELECTION OF THE FLUORESCENT REPORTER SYSTEM.

[00107] GFP and other green fluorescent protein derivatives are widely used as reporters in bacterial expression systems. However, the intrinsic green fluorescence produced by endogenous molecules, such as proteins containing aromatic amino acids, negatively impacts the interpretation of the exogenous green fluorescence generated by the reporter system. This autofluorescence is exacerbated in P. putida, a member of the fluorescent Pseudomonas species. Under iron-limited conditions, P. putida secretes the siderophore pyoverdine, a soluble fluorescent yellow-green pigment. Far-red fluorescent proteins were therefore explored, specifically mCardinal, a monomeric far-red-shifted derivative of mKate, as wavelengths between 600 and 900 nm are not absorbed by cells and organic molecules, thus reducing the noise of endogenous autofluorescence.

[00108] To quantify the impact of autofluorescence on the selected Gram-negative strains, the endogenous fluorescence of these strains cultivated in LB was measured over time. All three species emit fluorescence in the green spectrum, as expected (FIG. 3A, right panel). This fluorescence is more or less constant over time in E. coli; however, P. putida and V. natriegens show variable production of molecules that absorb green light at different cell densities. In contrast, when measured in the far-red spectrum specific for mCardinal, approximately 400 times less autofluorescence was detected, in arbitrary units on the same instrument, in all strains as compared to the measurements in the green spectrum (FIG. 3A, left panel). Again, E. coli displayed rather consistent far-red autofluorescence over time, whereas P. putida and V. natriegens autofluorescence varied with cell density.

[00109] To further validate the benefit of using mCardinal instead of sfGFP, the inherent noise of each reporter system was quantified by measuring the fluorescence signal-to-background ratio of both reporter systems expressed under the control of the constitutive tacI promoter in the three species. After 16 hours of growth, the recombinant E. coli, P. putida and V. natriegens expressing sfGFP emitted 64, 11 and 5 times more green fluorescence than their respective wild type strains (not expressing a fluorescent protein). Meanwhile, mCardinal-expressing strains displayed 294-, 34- and 13-fold higher red-light emission in E. coli, P. putida and V. natriegens compared to their wild type controls (FIGS. 3B-3D). Overall, the use of mCardinal rather than sfGFP significantly improves the dynamic range and facilitates measurement of promoter strength, as there is little endogenous autofluorescence in the far-red spectrum.
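The signal-to-background comparison above can be restated numerically. The following Python sketch takes the fold-over-wild-type values reported in the text and computes how much larger the mCardinal ratio is than the sfGFP ratio in each species. The helper names `signal_to_background` and `reporter_gain` are hypothetical, and the "gain" is simply a ratio of the two reported ratios, not a quantity defined in this document.

```python
# Fold-over-wild-type fluorescence after 16 hours, as reported in the text.
FOLD_OVER_WILD_TYPE = {
    "E. coli": {"sfGFP": 64, "mCardinal": 294},
    "P. putida": {"sfGFP": 11, "mCardinal": 34},
    "V. natriegens": {"sfGFP": 5, "mCardinal": 13},
}


def signal_to_background(signal: float, background: float) -> float:
    """Ratio of reporter-strain fluorescence to wild-type fluorescence."""
    if background <= 0:
        raise ValueError("background fluorescence must be positive")
    return signal / background


def reporter_gain(folds: dict) -> float:
    """How many times larger the mCardinal ratio is than the sfGFP ratio."""
    return folds["mCardinal"] / folds["sfGFP"]


gains = {species: round(reporter_gain(folds), 1)
         for species, folds in FOLD_OVER_WILD_TYPE.items()}

for species, gain in gains.items():
    print(f"{species}: mCardinal ratio is {gain}x the sfGFP ratio")
```

With the reported values, the far-red reporter improves the signal-to-background ratio by roughly 4.6-fold in E. coli, 3.1-fold in P. putida and 2.6-fold in V. natriegens.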

EXAMPLE 5: STRENGTH AND REGULATION OF SYNTHETIC LAC PROMOTERS IN ESCHERICHIA COLI.

[00110] The lac promoter and its derivatives are constitutively active in the absence of the transcriptional regulator lacI. The strength of the synthetic lac promoters was evaluated without lacI and compared against the strong tacI promoter, which drives high levels of transcription and can result in recombinant protein production of up to 30% of total protein. The V1Lac promoter was 10-fold weaker than the tacI promoter (FIG. 4A), consistent with previous data showing an 11-fold difference between these two promoters. The V3lac promoter matched the tacI promoter strength, while the V4lac promoter produced 5 times less mCardinal than tacI (FIG. 4A). The V2lac promoter had to be tested in the low copy plasmid pCloDF13 because the medium copy pColE1-derived plasmid could not be maintained in E. coli. Even in the low copy configuration, the V2Lac promoter surpassed the tacI promoter by ~1.4-fold and was therefore the strongest constitutive promoter despite its expression from a low copy plasmid. The result suggests that maintenance of a promoter of this strength within a medium copy plasmid exceeds the sustainable metabolic burden of E. coli (FIG. 4A). These results indicate that the incorporation of σ70 consensus sequences at positions -35 and -10 efficiently increases the strength of the lac promoter.

[00111] Next, the transcriptional regulator lacI was incorporated into these circuits to quantify the efficiency of the OFF state. The V1Lac and V4lac constructs containing the lacI repressor produced a red fluorescence signal comparable to wild type E. coli (FIG. 4A), indicating tight transcriptional repression. The V3lac promoter could not be totally repressed by lacI, showing a slight tendency to leak even in the presence of two lacO operators. The V2lac promoter with lacI was stably maintained by E. coli in both pColE1- and CloDF13-derived vectors. However, the repressed state of the V2lac promoter in the low and medium copy plasmids showed rather constitutive behavior. LacI repressed the V2Lac promoter ~9-fold in the CloDF13-derived plasmid as compared to its constitutive construct (FIG. 4A), while the medium copy version of the V2lac-lacI system (which was not sustainable as a constitutive circuit) exhibited strong constitutive expression ~1.3-fold higher than the tacI promoter. Thus, one lacO operator region is not sufficient for LacI to block transcription of V2Lac containing both σ70 consensus sequences, but incorporation of a second lacO reduces the leakage, as observed in the V3Lac promoter (FIG. 4A).

[00112] To verify the inducibility of the synthetic lac promoters, E. coli was exposed to IPTG at the beginning of exponential phase. The V1lac and V4lac promoters showed minimal induction of mCardinal (FIG. 4A). The V3lac promoter, which mimics the V2lac promoter with an additional lacO operator, demonstrated a dynamic range of ~17-fold versus the uninduced state, but could only produce 24% of its full constitutive potential (FIG. 4A). The V2lac promoter in the low copy plasmid produced ~2-fold more mCardinal upon induction; similar behavior was observed in the medium copy version, where production of mCardinal reached the maximum value of all promoters tested at ~2.6-fold stronger than the tacI promoter (FIG. 4A). Overall, the results demonstrate that incorporation of σ70 consensus sequences into the lac promoter improves its strength (V2Lac), producing more mCardinal than the strong tacI promoter in both the constitutive LacI-less version in the low copy plasmid and the repressed and induced states in the medium copy plasmid. However, LacI could not repress the transcription of the synthetic V2Lac promoter. Thus, additional lacO operators are necessary to turn OFF a lac promoter containing the perfect σ70 boxes, as observed in V3Lac, though full induction then becomes impossible in this architecture.
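The two figures of merit used in this example, dynamic range (induced versus uninduced output) and fraction of constitutive potential (induced versus repressor-less output), can be made explicit with a short calculation. The Python sketch below uses the V3lac values quoted above and, as an assumption made only for illustration, expresses all outputs in units where the constitutive tacI promoter equals 1.0 (the text reports that constitutive V3lac matched tacI). The function names are hypothetical.

```python
def fold_induction(induced: float, uninduced: float) -> float:
    """Dynamic range: output with inducer divided by output without inducer."""
    if uninduced <= 0:
        raise ValueError("uninduced output must be positive")
    return induced / uninduced


def fraction_of_constitutive(induced: float, constitutive: float) -> float:
    """Share of the repressor-less (constitutive) output recovered on induction."""
    if constitutive <= 0:
        raise ValueError("constitutive output must be positive")
    return induced / constitutive


# Assumed normalization: constitutive tacI = 1.0 (arbitrary units).
v3lac_constitutive = 1.0                     # V3lac matched tacI without lacI
v3lac_induced = 0.24 * v3lac_constitutive    # 24% of constitutive potential
v3lac_uninduced = v3lac_induced / 17         # implied by the ~17-fold range

print(f"implied V3lac leak: {v3lac_uninduced:.3f} tacI units")
print(f"dynamic range: {fold_induction(v3lac_induced, v3lac_uninduced):.0f}-fold")
print(f"fraction of constitutive: "
      f"{fraction_of_constitutive(v3lac_induced, v3lac_constitutive):.0%}")
```

Read this way, the reported numbers imply that repressed V3lac leaks at roughly 1.4% of tacI output, which is a derived estimate rather than a value stated in the results.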

EXAMPLE 6: STRENGTH AND REGULATION OF SYNTHETIC TET PROMOTERS IN ESCHERICHIA COLI.

[00113] The tet promoter also drives transcription constitutively without TetR. The expression profile of the constitutive synthetic tet promoters was evaluated and compared against the tacI promoter. As observed for the lac promoter, replacement of the -10 and -35 sequences with the σ70 consensus boxes in the tet promoter also increased the constitutive efficiency of the V2Tc promoter over the original V1Tc, in this case by 20%, and reached the yields obtained with the tacI promoter (FIG. 4B). The V4Tc promoter showed similar results to V2Tc, while V3Tc lost any capability to initiate transcription (FIG. 4B). Further, incorporation of the tetR repressor into each of the four circuits under the control of the strong constitutive PJ23119 promoter completely silenced mCardinal production from all the synthetic tet promoters, even from the strong V2Tc promoter, which harbors the optimal combination of consensus sequences recognized by σ70 (FIG. 4B).

[00114] To confirm the induction of each promoter and determine its functional dynamic range, each circuit was induced with anhydrotetracycline (aTc). The original tet promoter (V1Tc) achieved only ~37% of its full constitutive potential, reaching only 30% of the mCardinal produced by the tacI promoter and confirming the middle-level strength of the tet promoter (FIG. 4B). The V2Tc promoter achieved maximal induction, reaching the full potential dynamic range, and produced similar yields to the strong tacI promoter (FIG. 4B). Interestingly, the V4Tc promoter could not be de-repressed by aTc and showed a weak induction of only 5%, and, as expected, the V3Tc promoter showed no activity (FIG. 4B). Together, these results confirm that in E. coli the incorporation of the consensus boxes targeted by σ70 is key to boosting the expression profile of promoters recognized by this transcription initiation factor and achieving the maximal transcriptional capacity of these promoters upon induction.

EXAMPLE 7: DIRECT COMPARISON OF SYNTHETIC LAC AND TET EXPRESSION SYSTEMS TO PET IN ESCHERICHIA COLI.

[00115] In E. coli, the pET series of expression plasmids are the most popular systems for recombinant protein production. The efficiency of the synthetic lac and tet promoters was compared directly against pET21a. Because the pET expression system in the BL21 strain can produce the target protein at more than 50% of the total protein per cell, mCardinal production was measured in BL21. As expected, the pET system induced by IPTG produced ~1.5-fold more mCardinal than the strong constitutive tacI promoter, which is expected to accumulate up to 30% of total cell protein. Despite the massive production of mCardinal by the pET system, its OFF or uninduced state can be considered a medium-high constitutive expression system, yielding 30% of the constitutive tacI promoter as evidenced by mCardinal production (FIG. 4B, FIG. 4C). This transcriptional leakage in the pET system is well known in the scientific community.

[00116] Among the synthetic lac and tet promoters, all tested in E. coli DH10B, only the V2Lac promoter surpassed the pET system in the medium copy plasmid pColE1. The IPTG-induced V2Lac expression system produced ~1.8 times more recombinant protein than the pET system; however, the leakage of the V2Lac promoter equals the induced state of the pET system (FIG. 4A and FIG. 8), cancelling the advantages of an inducible expression system. The V3Lac expression system showed tighter control of the OFF state; therefore, to increase its strength, the V3Lac-lacI construct containing mCardinal was inserted into the pUC19 vector, thus increasing the copy number from 30 to 250 copies per cell. E. coli failed to maintain the V3lac-lacI construct in the pUC19 vector, likely due to the toxicity of LacI. The V2Tc-tetR expression system was also evaluated in the pUC19 vector. E. coli stably maintained the V2Tc-tetR construct, and none of the phenotypic or genotypic abnormalities seen with pUC19-V3Lac-lacI were observed. mCardinal production by pUC19-V2Tc-tetR matched the pET21a expression system, with one exceptional difference: pUC19-V2Tc-tetR maintained complete repression in the uninduced state after 12 hours (FIG. 4B and FIG. 8). Thus, pUC19-V2Tc-tetR shows significant improvement in transcriptional control and dynamic range over pET21a (FIG. 4B, C). Further, equivalent protein production and tighter transcriptional control were achieved without the obligatory use of the BL21(DE3) strain, suggesting the system provided herein is readily portable to other strains and species.
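The comparison in this example can be summarized by placing each system on a common scale. The Python sketch below is a rough illustration that normalizes outputs to the constitutive tacI promoter (set to 1.0) using only the ratios quoted in the text: pET induced at ~1.5, pET uninduced at 0.30, V2Lac induced at ~1.8 times pET, V2Lac leak equal to induced pET, and pUC19-V2Tc-tetR induced equal to pET. Treating "complete repression" as zero detectable leak, and comparing values measured in different strains (BL21 and DH10B) on one scale, are simplifying assumptions. The class `ExpressionSystem` is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionSystem:
    name: str
    induced: float    # output with inducer, in units of constitutive tacI
    uninduced: float  # output without inducer (leak), same units

    def dynamic_range(self) -> float:
        """Induced / uninduced; infinite when no leak is detectable."""
        if self.uninduced == 0:
            return math.inf
        return self.induced / self.uninduced


PET_INDUCED = 1.5  # ~1.5-fold over constitutive tacI, from the text

systems = [
    ExpressionSystem("pET21a", PET_INDUCED, 0.30),
    ExpressionSystem("pColE1 V2Lac-lacI", 1.8 * PET_INDUCED, PET_INDUCED),
    # "Complete repression" is modeled here as zero leak (an assumption).
    ExpressionSystem("pUC19-V2Tc-tetR", PET_INDUCED, 0.0),
]

ranges = {s.name: s.dynamic_range() for s in systems}
for name, value in ranges.items():
    print(f"{name}: dynamic range = {value:.1f}")
```

On this scale pET21a has about a 5-fold window between its OFF and ON states and medium copy V2Lac less than 2-fold, whereas V2Tc-tetR reaches the same induced output as pET with no measurable OFF-state signal. The induced V2Lac value of about 2.7 tacI units is also in line with the ~2.6-fold figure reported in Example 5.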

EXAMPLE 8: EXPRESSION OF THE COCAINE ESTERASE COCE AND PRODUCTION OF BENZOIC ACID.

[00117] To further validate the advantage of the expression system provided herein over the pET expression system, the production of a functional protein product, the cocaine esterase CocE, was assayed. This enzyme hydrolyzes cocaine into benzoic acid and could expand the use of the narcotic compound as a raw source for the production of this carboxylic acid, which is widely used as a precursor and preservative in the food and pharmaceutical industries. Expression of CocE has previously been shown to be a difficult and laborious task with the pET expression system, because CocE forms inclusion bodies; consequently, long incubation methods at low temperatures are required to isolate sufficient yields of functional CocE. E. coli could not support CocE expression in the repressor-less variants of the synthetic promoters, thus confirming that this is a toxic protein for E. coli. Next, CocE in the V2Tc-tetR expression system was assessed in both the medium and high copy plasmid backbones, pJH0204 and pUC19 respectively (pσ70 V2TcR-cocE and pσ70 V2TcR-cocE19, see Table 1), and compared against the pET expression system for the production of benzoic acid. Benzoic acid production is an indication that CocE was correctly folded and properly hydrolyzing cocaine.

[00118] The soluble fraction recovered from the recombinant E. coli strain containing the pET21a-cocE expression system showed equivalent benzoic acid production in the presence of cocaine in both the uninduced and induced states (FIG. 5B), with a clear tendency to form inclusion bodies, as evidenced by the accumulation of protein in the insoluble fraction upon activation by IPTG (FIG. 5A). Thus, while induction of CocE by IPTG in the pET system does result in the production of additional protein, the additional protein is insoluble and non-functional. Functionally speaking, the expression of CocE in pET21a is not inducible, but instead behaves constitutively, as there is no difference in cocaine hydrolysis between the uninduced and induced states (FIG. 5B).

[00119] The soluble fractions from the recombinant E. coli V2Tc-tetR-cocE expression system, in both the pJH0204 and pUC19 plasmids, showed no benzoic acid production in the absence of the inducer. The addition of aTc triggered production of CocE with protein production and function roughly equivalent to the soluble pET21a-cocE plasmid (FIG. 5A, FIG. 5B). In both plasmids, benzoic acid production was observed after only one hour of aTc induction, and reached maximal production by 3 hours (FIG. 5B). These results confirmed the advantage of incorporating the σ70 boxes into the tetracycline promoter, keeping the toxic enzyme CocE OFF in the uninduced state and producing sufficient amounts of mature CocE after 3 hours of induction. This is significantly shorter than the 16-hour induction previously recorded (data not shown). Further, the expression system provided herein produced soluble and functional protein without evidence of massive toxicity or inclusion body formation. This method facilitates the scalability of this bioprocess, which has potential as an alternative, environmentally friendly method to obtain benzoic acid by replacing petroleum-based starting materials.

EXAMPLE 9: STRENGTH AND REGULATION OF SYNTHETIC LAC AND TET PROMOTERS IN PSEUDOMONAS PUTIDA.

[00120] Genetic control in the soil-dwelling species P. putida is of great interest for biotechnological applications due to the ability of this microorganism to synthesize complex natural products and metabolize a variety of xenobiotic compounds. The constitutive strength, repression, and inducibility of the synthetic promoters were assayed in P. putida. The lac-based promoter systems (which include the tacI promoter) are reported to have poor dynamic range and instability in high-copy plasmids when harbored in P. putida as replicative plasmids. To address this problem, the genetic circuits were integrated into the P. putida chromosome. The constitutive strength of the V1lac, V2lac, V3lac and V4lac promoters was measured using mCardinal, and tacI was found to be stronger than all the synthetic lac variants, by ~3-, 1.5-, 10- and 33-fold respectively (FIG. 6A, FIG. 6C). The incorporation of the transcriptional regulator lacI efficiently repressed the V1lac, V3lac and V4lac promoters, but as seen in E. coli, V2lac showed a tendency to leak, though not as egregiously as in E. coli (FIG. 6A, FIG. 6C). Induction by IPTG was barely detected in the V1lac, V3lac and V4Lac promoters, indicating the weakness of the original lac promoter in P. putida (FIG. 6A, FIG. 6C). Interestingly, and analogously to E. coli, the V2lac promoter showed a ~27-fold increase in mCardinal production over the uninduced state and surpassed its theoretical dynamic range as compared to the constitutive version, ultimately achieving similar yields to the strong tacI promoter (FIG. 6A, FIG. 6C). These results confirm that the incorporation of the σ70 consensus sequences into the lac promoter (V2Lac) significantly improves the strength of this promoter in the P. putida host.

[00121] The tet promoters were also evaluated in P. putida. The original tet promoter with a strong RBS (V1Tc) was the strongest among all synthetic tet promoters in the absence of tetR, followed by V4Tc, V2Tc and V3Tc (FIG. 6B, FIG. 6C). Incorporation of the tetR repressor under the control of the PJ23119 promoter silenced all the synthetic tet promoters, as in E. coli (FIG. 6B, 6C). Addition of aTc activated the four tet promoters and, remarkably, the induced V2Tc promoter showed the highest mCardinal induction, exceeding its constitutive expression by ~10-fold and matching the tacI promoter. The V2Tc promoter was ~2- and 3-fold more efficient than the V1Tc and V4Tc promoters after induction, respectively, while V3Tc proved to be inefficient in P. putida (FIG. 6B, FIG. 6C). These results highlight the importance of the consensus sequences targeted by σ70 for amplifying the strength of inducible promoters in P. putida as well.

EXAMPLE 10: STRENGTH AND REGULATION OF SYNTHETIC LAC AND TET PROMOTERS IN VIBRIO NATRIEGENS.

[00122] The circuits were next evaluated in the marine bacterium V. natriegens, which has gained popularity for routine molecular biology applications due to its ability to double in less than 10 min. In this host, previous studies indicated that the tacI promoter produced GFP upon activation with IPTG, while induction of the tet promoter resulted in low GFP yields. The constitutive and inducible versions of the synthetic lac and tet promoters were evaluated in the pColE1-derived plasmid with ~300 copies per cell, except for the constitutive V2Lac, which was evaluated in the pCloDF13-derived vector with ~64 copies per cell in this strain. The constitutive V2Lac and V4Tc promoters outperformed tacI by ~16- and ~2-fold respectively, while V3Lac, V1Tc and V2Tc produced similar levels to tacI (FIGS. 7A-7C). V1Lac and V3Lac underperformed tacI by ~2.7-fold, and V3Tc showed no activity (FIGS. 7A-7C). Incorporation of the transcriptional regulators lacI and tetR turned OFF all the synthetic promoters except for V2Lac, which continued to show leakage, as in E. coli and P. putida (FIG. 7A). Induction of the synthetic lac promoters by IPTG revealed that only the V2Lac promoter was fully activated, surpassing its constitutive version by ~1.3-fold and showing a potential dynamic range of ~25-fold (FIG. 7A). V3Lac reached 30% of its full potential, and V1Lac and V4Lac showed no induction (FIG. 7A). While addition of aTc activated all four synthetic tet promoters, the V3Tc and V4Tc promoters showed rather weak mCardinal production, and V1Tc produced 2-fold more mCardinal than its constitutive variant (FIG. 7B). The V2Tc promoter, which harbors the two consensus σ70 boxes, demonstrated strong inducibility, producing 11-fold more mCardinal than the tacI promoter and displaying the full potential of its dynamic range by producing 100-fold more mCardinal than its OFF version (FIG. 7B, C). These results confirmed that the adaptation of the two consensus boxes recognized by σ70 in the V2lac and V2Tc promoters improved the performance of the inducible lac and tet promoters in V. natriegens.

EXAMPLE 11: ADDITIONAL CONSIDERATIONS FOR INDUCIBLE PROMOTER DESIGN.

[00123] Production of recombinant proteins is and will continue to be one of the main tools scientists use to understand biological processes and transfer academic results to industrial applications. The pET system is well known to induce the formation of inclusion bodies, a major drawback in the production of soluble proteins. pET requires specific strains that carry the T7 RNA polymerase, lacks real tuneability, and fails to keep the target gene OFF. Leakiness in the OFF state negates the main advantage of an inducible system, which is intended to permit time- or context-specific control of gene expression.

[00124] In contrast to pET, the results provided in Examples 2-9 show that inclusion of the σ70 consensus sequences in the lac and tet inducible promoters improves both repression and induction. The tight transcriptional control does not require any particular strain background, and permits rapid expression of soluble proteins, including toxic proteins such as CocE. The inducible promoters provided herein can be easily optimized for use in a variety of Gram-negative hosts, including P. putida and V. natriegens, thus widening the applicability of these tools to a broad spectrum of bacteria. The inducible promoters provided herein offer a significant advance to the biotechnology industry by providing additional platforms for exogenous protein expression and purification.

[00125] The V2lac promoter expressed very highly across three Gram-negative species. Indeed, in V. natriegens the V2lac promoter is the strongest promoter yet described, as it surpasses the widely used tacI promoter by 16-20-fold (FIG. 7). Additional factors can affect promoter performance; for instance, the length and sequence of the spacer between the -10 and -35 elements can have dramatic effects on promoter strength.

[00126] The results in FIG. 5 suggest that complete OFF state control is key to avoiding common problems with the expression of challenging proteins, such as the formation of inclusion bodies. Future promoter configurations could be aided by the combination of additional mechanisms to achieve complete OFF state control, such as the addition of small RNA transcriptional and translational regulators. These approaches have increased appeal over the pET system with pLysE, for example, as they permit facile portability among strains and species. OFF state control of the lac promoters was less of an issue in P. putida and V. natriegens, where better repression of the strong V2lac promoter was achieved (FIGS. 6A-7C). Without being bound by a particular theory, it is likely that species-specific differences in RNA polymerase binding and promoter clearance may affect the behavior of inducible promoters.

[00127] The challenges of leakiness were entirely eliminated by the use of tet-based promoters in the system provided herein. An often-cited drawback of tet promoters for protein expression was the limitation of protein production as compared to lac-based systems. Incorporation of the σ70 consensus sequence into the tet promoter (V2Tc) significantly increased expression above that of the original tet promoter (V1Tc) to the point where the promoter output equaled (and in the case of V. natriegens, exceeded) that of the tacI promoter. As a complementary approach to inducible promoter manufacturing and design, the sigma factor can be replaced with σ70 sequences to assess promoter performance. Promoter performance was quantified herein by using far-red reporters, which significantly decreased background fluorescence and increased dynamic range in the bacterial species assayed.

[00128] Different yields of recombinant protein can be achieved based on promoter and host selection. Overall, the incorporation of the consensus -35 sequence and Pribnow (-10) box unlocks the strength of the lac and tet promoters in Gram-negative bacteria, facilitating the production of any given target gene in different hosts with the same set of plasmids.

[00129] The V2Tc promoter provided herein offers an improvement over the pET-based protein expression systems that remain very widely used. V2Tc is tightly regulated, robustly inducible, and drives protein production comparable to or better than the tacI promoter in all three species that were examined. No specific strain backgrounds were necessary, as it does not rely on the presence of the T7 polymerase.

EXAMPLE 12: DEVELOPMENT OF THE DUAL EXPRESSION SYSTEM V2TCR/V2(3)LACI.

[00130] The σ70-adapted expression systems V2TcR and V3LacI have tight regulation and strong induction in E. coli in the presence of anhydrotetracycline (aTc) and IPTG, respectively, while in P. putida and V. natriegens the V2TcR and V2LacI systems showed the best performance as inducible promoters. A dual expression system was developed using as backbone the pJH0204 vector containing the origin of replication pColE1 and the attB sites specific for Bxb1 recombinase. Each transcriptional unit of the dual expression system was insulated by terminators, and the transcriptional regulators tetR and lacI were located in between, and in the opposite direction of, the inducible promoters V2Tc and V2/3Lac to block undesired transcription of the open reading frames (ORF). Further, two different multicloning sites were located after the inducible promoters V2Tc and V2/3Lac to facilitate the incorporation of target genes (FIGS. 9A-9F, and FIG. 14).

[00131] The dual expression systems were tested in E. coli, P. putida and V. natriegens to evaluate i) synergy, ii) expression of different recombinant proteins, and iii) expression of the biosynthetic gene cluster (BGC) leading to the biosynthesis of lycopene and β-carotene. Synergy was evaluated with the reporter mCardinal, which was placed under both inducible promoters of the dual expression system (FIG. 9B). Expression of different recombinant proteins was performed by placing sfGFP under the control of V2TcR and mCardinal under the control of V2/3LacI (FIG. 9C). Finally, production of the terpenoids lycopene and β-carotene by the dual expression system was achieved by the assembly of the crtEBIY genes from Pantoea ananatis. The crtEBIY genes were synthesized with the codon usage for E. coli. The crtEBI operon was assembled in the pUC19 vector via Gibson assembly, adapting the ribosome binding site (RBS) of J23119 to crtB and the RBS pJL1 to crtI; the synthetic construct was then transferred to the V2TcR expression system to evaluate lycopene production (FIG. 9D). β-carotene production was evaluated by incorporating the crtY gene with the J23119 RBS into V2TcR-crtEBI, yielding V2TcR-crtEBIY (FIG. 9E). Dual control of the crtEBI operon together with the crtY gene to produce β-carotene was achieved by adapting crtEBI to the V2TcR expression system and the crtY gene to the V2(3)LacI expression system present in the dual vector (FIG. 9F). All measurements were performed using LB as growth medium, as optimizing the production of terpenoids is outside the scope of this work. After transforming the plasmids listed in Table 3 into the three bacterial species, no difference relative to the wild-type strains was observed, thereby confirming the stability of the dual expression systems.

Table 3. List of Plasmids.

Note: SEQ ID NOS: are listed in the SEQUENCES section of this paper.

EXAMPLE 13: SYNERGY OF THE DUAL EXPRESSION SYSTEM V2TCR/V2(3)LACI.

[00132] Increasing the copy number in E. coli efficiently boosted production of mCardinal by the pσ70 V2TcR expression system but had a negative impact on the pσ70 V3LacI expression system due to the toxicity of the transcriptional regulator LacI. Therefore, the synergistic effect of the duet expression system pσ70 V2TcR/V3LacI, with both promoters controlling mCardinal, was evaluated, as different studies have demonstrated that combined promoters enhance production of the target protein. The E. coli strain containing the duet expression system V2TcR/V3LacI produced 2-fold and 4-fold less red fluorescence than the strains containing V2TcR and V3LacI alone when induced with aTc and IPTG, respectively (FIG. 10A). Synergy was observed upon addition of both IPTG and aTc to the V2TcR/V3LacI strain, improving mCardinal production by 1.5-fold over the induction obtained with aTc alone (FIG. 10A). However, even with this synergy, the duet system only reached mCardinal amounts equal to those of the single V2TcR expression system (FIG. 10A). These results indicate that the tandem version of the synthetic promoters V2TcR/V3LacI offers an alternative to co-express different genes at high levels using different inducers.

[00133] In P. putida the V2TcR and V2LacI expression systems showed the best performance in activating transcription of heterologous proteins. Consequently, both promoters controlling expression of mCardinal were integrated into the chromosome of this host. Single induction of the dual system by aTc resulted in induction of mCardinal equivalent to that of the V2TcR system; single induction of the dual system with IPTG resulted in a 3-fold decrease relative to V2LacI alone (FIG. 10B). However, the addition of both inducers (aTc + IPTG) increased mCardinal by 1.1-fold over the V2TcR and V2LacI expression systems, thus showing a minimal synergistic effect of the dual promoters in P. putida (FIG. 10B).

[00134] The duet expression system V2TcR/V2LacI substantially decreased the efficiency of the synthetic promoters in V. natriegens compared to the single expression system counterparts, by 12-fold and 2-fold for V2LacI and V2TcR after induction with IPTG and aTc, respectively (FIG. 10C). Co-induction with IPTG + aTc did not have a synergistic effect on mCardinal production, but reached yields similar to those obtained by induction with aTc alone (FIG. 10C). Together, this information demonstrates that synergy cannot be achieved with the dual expression system V2TcR/V2(3)LacI in medium copy plasmids, as is the case for E. coli and V. natriegens. On the contrary, the duet expression system diminishes the activity of each promoter, probably due to the constitutive expression of the transcriptional regulators TetR and LacI and competition between both promoters for σ70. In P. putida the duet expression system was directly integrated into the chromosome, which reduced the quantities of the transcriptional regulators TetR and LacI. Surprisingly, in this host the V2TcR expression system gained strength in the dual version, the V2LacI expression system lost efficiency, and synergy was observed, thus indicating that the duet expression system gains performance as a single copy in the chromosome of the host bacterium.

EXAMPLE 14: EXPRESSION OF DIFFERENT RECOMBINANT PROTEINS WITH THE DUAL EXPRESSION SYSTEM V2TCR/V2(3)LACI.

[00135] Prior to the expression systems provided herein, there was no universal dual expression system that allows differential expression of distinct sets of genes via two independent inducible promoters and that is portable among different Gram-negative species. The V2TcR/V2(3)LacI dual expression system demonstrated that it can control the co-expression of mCardinal, although synergy was only observed in P. putida. Two different reporter proteins were expressed to evaluate their production in E. coli, P. putida and V. natriegens using the V2TcR/V2(3)LacI expression system. The transcriptional unit V2TcR controlled the expression of sfGFP, while the V2(3)LacI unit controlled mCardinal production.

[00136] The three bacterial species under study containing the V2TcR-sfGFP/V2(3)LacI-mCardinal expression system showed no green fluorescence in the uninduced state, thus confirming the tight regulation of the V2TcR expression system. However, V2(3)LacI-mCardinal is prone to leak, and red fluorescence was observed in the absence of IPTG (FIGS. 11A-11C). Addition of aTc induced the production of sfGFP, and a similar result was observed with the addition of IPTG, which triggered production of mCardinal. Interestingly, when the cultures were co-induced with aTc + IPTG, the green and red fluorescence reached levels similar to those of the cultures induced with only one inducer, thus indicating that the two promoters (V2TcR/V2(3)LacI) are unaffected by the activation of each other, reaching their maximum activity either in their sole activation or in their independent combined activation (FIGS. 11A-11C).

EXAMPLE 15: PRODUCTION OF LYCOPENE AND β-CAROTENE IN E. COLI, P. PUTIDA AND V. NATRIEGENS WITH THE EXPRESSION SYSTEM V2TCR.

[00137] To determine the capability of the dual expression system V2TcR/V2(3)LacI to produce natural products, the ability of the V2TcR expression system to control the expression of multiple genes was tested first. To that end, the production of lycopene and β-carotene was evaluated in E. coli, P. putida and V. natriegens. Lycopene production has already been demonstrated in E. coli and P. putida via heterologous expression of the lycopene-producing operon (LYC) containing the crtEBI genes from Pantoea ananatis. Lycopene is the biosynthetic precursor of β-carotene, thus expression of the LYC operon together with crtY yields β-carotene production, which has also been demonstrated in E. coli and V. natriegens. The LYC and the crtEBIY operons were placed in the V2TcR expression system, and lycopene and β-carotene production were measured by UHPLC after 4 hours of induction with aTc. In E. coli, production of both lycopene and β-carotene was achieved, but the induced culture produced only 1.7-fold and 1.2-fold more lycopene and β-carotene, respectively, than the uninduced culture, thus demonstrating that the tight control of V2TcR cannot totally repress the expression of the crtEBI and crtEBIY genes (FIGS. 12A-12B). Lycopene biosynthesis in E. coli using the pET expression system also demonstrated abundant leakage, with production of 170 mg/L lycopene after 40 hours of induction, 1.1-fold more than the uninduced culture. Although the V2TcR expression system produced 94 mg/L, this was achieved after 4 hours of induction, against the 40 hours reported with the pET expression system. This result confirms the advantage of the V2TcR expression system over the pET expression system as described in Examples 2-10. In P. putida and V. natriegens containing the V2TcR-crtEBI (SEQ ID NO: 70) and V2TcR-crtEBIY (SEQ ID NO: 71) transcriptional units, no lycopene or β-carotene production was observed in the uninduced state, and activation by aTc yielded production of the terpenoids (FIGS. 12A-12B). P. putida was reported to produce 1.22 ng/mL of lycopene after 24 hours of growth with the pSEVA421 vector; with the V2TcR expression system, lycopene production reached 17 mg/L after 4 hours of cultivation, which is a substantial increase (approximately 13,000x), and β-carotene production by P. putida reached 840 mg/L, the highest among the three species under study (FIG. 12). V. natriegens was reported to produce 2.93 mg/L of β-carotene, while the V2TcR expression system achieved 597 mg/L of β-carotene and 31 mg/L of lycopene. These results demonstrate that V2TcR can be efficiently programmed for expression of biosynthetic gene clusters in Gram-negative bacteria.
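
The comparisons in the preceding paragraph mix units (ng/mL and mg/L) and induction times (4 hours versus 40 hours). The following Python sketch reproduces the arithmetic from the titers quoted above; it is only a worked example, and the helper names (`ng_per_ml_to_mg_per_l`, `fold_change`, `volumetric_productivity`) are hypothetical and not part of the disclosure.

```python
# Worked arithmetic for the titers quoted in the text above.
# Helper names are hypothetical and chosen for this example only.

def ng_per_ml_to_mg_per_l(value_ng_per_ml):
    """Convert ng/mL to mg/L (1 ng/mL = 1 ug/L = 0.001 mg/L)."""
    return value_ng_per_ml * 1e-3


def fold_change(new_value, reference_value):
    """Ratio of a new titer to a reference titer (same units)."""
    if reference_value <= 0:
        raise ValueError("reference value must be positive")
    return new_value / reference_value


def volumetric_productivity(titer_mg_per_l, hours):
    """Average productivity in mg/L per hour over the induction period."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return titer_mg_per_l / hours


# P. putida lycopene: 17 mg/L (V2TcR) vs. reported 1.22 ng/mL (pSEVA421).
p_putida_lycopene_fold = fold_change(17.0, ng_per_ml_to_mg_per_l(1.22))

# V. natriegens beta-carotene: 597 mg/L (V2TcR) vs. reported 2.93 mg/L.
v_natriegens_carotene_fold = fold_change(597.0, 2.93)

# E. coli lycopene: 94 mg/L in 4 h (V2TcR) vs. 170 mg/L in 40 h (pET).
v2tcr_rate = volumetric_productivity(94.0, 4.0)
pet_rate = volumetric_productivity(170.0, 40.0)

if __name__ == "__main__":
    print(f"P. putida lycopene fold increase: {p_putida_lycopene_fold:.0f}")
    print(f"V. natriegens beta-carotene fold increase: {v_natriegens_carotene_fold:.0f}")
    print(f"V2TcR productivity: {v2tcr_rate:.2f} mg/L/h")
    print(f"pET productivity: {pet_rate:.2f} mg/L/h")
```

Expressing the E. coli lycopene results as average productivity per hour makes the comparison between the two systems independent of the different induction times.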

EXAMPLE 16: COORDINATED EXPRESSION OF THE LYCOPENE AND β-CAROTENE BIOSYNTHETIC GENE CLUSTERS (BGC) WITH THE DUAL EXPRESSION SYSTEM V2TCR/V2(3)LACI.

[00138] Natural products are encoded by BGCs, and the ability to activate each gene at a different time point with a different inducer represents an advantage when the genes are toxic or the metabolites produced by the BGC affect the growth of the heterologous host. As provided herein, the V2TcR expression system can activate the production of lycopene and β-carotene. In order to control the production of β-carotene with two different inducers, aTc and IPTG were used. The LYC operon was incorporated into the V2TcR expression system, and the crtY gene was controlled by the V2(3)LacI expression system. The plasmid pσ70 V2TcR-crtEBI/V3LacI-crtY (SEQ ID NO: 73) was transformed into E. coli, and the plasmid pσ70 V2TcR-crtEBI/V2LacI-crtY (SEQ ID NO: 72) was transformed into P. putida and V. natriegens. In the three bacterial species no lycopene production was observed, even though the V2TcR expression system was controlling the crtEBI genes and activation by aTc should trigger production of lycopene (FIG. 13A). This result is explained in E. coli by the leakage of the crtEBIY genes under the control of the synthetic promoters V2TcR and V3LacI, demonstrating the high processivity of the β-carotene BGC in metabolizing the endogenous isopentenyl diphosphate (IPP) directly into β-carotene. However, V2TcR-crtEBI showed no leakage in P. putida and V. natriegens, whereas the V2LacI expression system was reported to leak; consequently, the lycopene produced after addition of aTc is directly transformed into β-carotene by the CrtY present in the cells due to its minimal expression and high processivity.

[00139] β-carotene production in E. coli was observed regardless of the absence or presence of the inducers aTc and IPTG (FIG. 13B), thus demonstrating that the V2TcR/V3LacI expression system failed to keep the β-carotene BGC repressed in this host. In P. putida and V. natriegens, β-carotene production was only achieved when the cultures were induced with aTc or with aTc + IPTG (FIG. 13A). IPTG induction alone did not lead to β-carotene production because the crtEBI genes were not transcribed by the V2TcR expression system. Addition of aTc resulted in β-carotene levels similar to those obtained by induction with aTc + IPTG (FIG. 13A), thus indicating that the minimal leakage of V2LacI-crtY is sufficient to metabolize the lycopene produced by V2TcR-crtEBI into β-carotene.

[00140] Importantly, the yields of β-carotene reached by E. coli were always lower than the production achieved by P. putida and V. natriegens. Thus, P. putida can be used to express complex biosynthetic gene clusters, and the dual expression system developed in this study expands the genetic tool kit of P. putida to produce complex proteins. V. natriegens also outperformed E. coli, and the dual expression system allowed 1.2-fold higher production of β-carotene compared with V2TcR alone. Together, these results demonstrate the potential of the dual expression system presented here to fully capitalize on the potential of heterologous hosts in industrial applications.
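
The induction logic described in Example 16 can be summarized as a small Boolean model. The sketch below is a simplification offered only to make the reasoning explicit: it assumes that a transcriptional unit is either "on" (induced or leaky) or "off", that lycopene is fully converted to β-carotene whenever any CrtY is present, and it ignores quantitative yields. The function name `predicted_products` and its parameters are hypothetical.

```python
# Boolean sketch of the two-inducer carotenoid pathway described above.
# Assumptions (not from the source): expression is all-or-nothing and any
# CrtY present converts all lycopene into beta-carotene.

def predicted_products(atc, iptg, tet_unit_leaky, lac_unit_leaky):
    """Predict which carotenoid accumulates for a given induction state.

    atc, iptg       -- whether each inducer was added
    tet_unit_leaky  -- whether V2TcR-crtEBI is expressed without aTc
    lac_unit_leaky  -- whether V2(3)LacI-crtY is expressed without IPTG
    """
    crt_ebi_expressed = atc or tet_unit_leaky
    crt_y_expressed = iptg or lac_unit_leaky
    lycopene_synthesized = crt_ebi_expressed
    return {
        # Lycopene only accumulates if it is made and not consumed by CrtY.
        "lycopene": lycopene_synthesized and not crt_y_expressed,
        "beta_carotene": lycopene_synthesized and crt_y_expressed,
    }


if __name__ == "__main__":
    # P. putida / V. natriegens as described: tight V2TcR, leaky V2LacI.
    for atc in (False, True):
        for iptg in (False, True):
            outcome = predicted_products(atc, iptg,
                                         tet_unit_leaky=False,
                                         lac_unit_leaky=True)
            print(f"aTc={atc!s:5} IPTG={iptg!s:5} -> {outcome}")
```

Under these assumptions, setting both units to leaky reproduces the E. coli observation (β-carotene in every condition), while a tight V2TcR unit combined with a leaky V2LacI unit reproduces the P. putida and V. natriegens observations (β-carotene only when aTc is added, and no accumulated lycopene).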

SEQUENCES

FIG. 2A Sequences:

Lac (SEQ ID NO: 3)

AGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAAC AATTTCACACAGGAAACAGC TATG

lacUV5 (SEQ ID NO: 4)

AGGCTTTACACTTTATGCTTCCGGCTCGTATAATGTGTGGAATTGTGAGCGGATAAC AATTTCACACAGGAAACAGC TATG

tacI (SEQ ID NO: 5)

GAGCTGTTGACAATTAATCATCGGCTCGTATAATGTGTGGAATTGTGAGCGGATAAC AATTTCACACAGGAAACAGA ATCATATG

V1lac (SEQ ID NO: 6)

AGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAAC AACGCAGTAAGAGAGGAATG TACATATG

V2lac (SEQ ID NO: 7)

GAGCTGTTGACACTTTATGCTTCCGGCTCGTATAATGTGTGTGGAATTGTGAGCGGA TAACAACGCAGTAAGAGAGG AATGTACATATG

V3lac (SEQ ID NO: 8)

AGCTGTTGACACTTTATGCTTCCGGCTCGTATAATGTGTGTGGAATTGTGAGCGGAT AACAAGTGGAATTGTGAGCG GATAACAATTTCACACAGGAAACAGAATCATATG

V4lac (SEQ ID NO: 9)

CTTTATGCTTCCGGCTCGTTGACAGTGTGGAATTGTGAGCGGATAACAATATAATGT GTGGAATTGTGAGCGGATAA CAATTTCACACAGGAAACAGAATCATATG

FIG. 2B Sequences:

Tet (SEQ ID NO: 10)

GGATCCTTGACACTCTATCATTGATAGAGTTATTTTACCACTCCCTATCAGTGATAG AGAAAAGTGAAATG

V1Tc (SEQ ID NO: 11)

GGATCCTTGACACTCTATCATTGATAGAGTTATTTTACCACTCCCTATCAGTGATAG AGACGCAGTAAGAGAGGAAT

GTACATATG

V2Tc (SEQ ID NO: 12)

GAGCTGTTGACAACTCTATCATTGATAGAGTTATAATGTTCCCTATCAGTGATAGAG ACGCAGTAAGAGAGGAATGT ACATATG

V3Tc (SEQ ID NO: 13)

GAGCTGTTGACAACTCTATCATTGATAGAGTTATAATGTTCCCTATCAGTGATAGAG AGTGGAATTGTGAGCGGATA ACAATTTCACACAGGAAACAGAATCATATG

V4Tc (SEQ ID NO: 14)

TTGACACTCTATCATTGATAGAGTTTGACATCCCTATCAGTGATAGAGATATAATGT GTGGAATTGTGAGCGGATAA

CAATTTCACACAGGAAACAGAATCATATG

SEQ ID NOS: 15-38 are provided in Table 2.

SEQ ID NOS: 39-43 are provided in Example 2.

SEQ ID NOS: 44-53 are provided in the Detailed Description.

FIG. 14 Full-length Sequence:

V2Tc/tetR-V2lac/lacI (SEQ ID NO: 54)

GAGCTGTTGACAACTCTATCATTGATAGAGTTATAATGTTCCCTATCAGTGATAGAG ACGCAGTAAGAGAGGAATGT ACATATGAAGCTTCTCGGTACCAAATTCCAGAAAAGAGGCCTCCCGAAAGGGGGGCCTTT TTTCGTTTTGGTCCGAA

TTCTTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGCCGCAGTAAGAGAGGAATGT ACACATGTCCCGCCTGGATA

AATCGAAAGTGATTAACTCGGCCCTCGAATTGCTGAATGAAGTCGGTATCGAGGGGC TGACGACCCGTAAATTGGCA CAAAAGTTGGGGGTGGAGCAACCCACGTTGTATTGGCACGTCAAAAATAAGCGGGCATTG CTGGATGCCCTCGCTAT

TGAAATGTTGGATCGCCACCATACCCATTTCTGTCCACTGGAGGGCGAGTCCTGGCA GGACTTTCTCCGCAACAACG

CGAAATCCTTTCGCTGTGCACTCTTGTCCCATCGGGACGGTGCTAAGGTGCACTTGG GCACCCGTCCCACCGAAAAA

CAATACGAAACCTTGGAAAATCAATTGGCGTTTTTGTGCCAGCAAGGGTTTAGCTTG GAGAATGCTCTCTATGCGCT

CTCGGCTGTCGGGCACTTTACGTTGGGGTGCGTGTTGGAGGACCAGGAGCATCAAGT CGCAAAAGAGGAGCGTGAAA

CCCCAACCACGGACTCGATGCCACCTCTGCTCCGCCAAGCTATCGAACTCTTCGATC ATCAGGGCGCGGAGCCAGCC

TTCCTCTTTGGGCTGGAGCTGATTATCTGCGGTTTGGAAAAACAACTCAAGTGTGAA AGCGGGTCCTAACTGCAGTC

ACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCA ACGCGCGGGGAGAGGCGGTT

TGCGTATTGGGCGCCAGGGTGGTTTTTCTTTTCACCAGTGAGACGGGCAACAGCTGA TTGCCCTTCACCGCCTGGCC

CTGAGAGAGTTGCAGCAAGCGGTCCACGCTGGTTTGCCCCAGCAGGCGAAAATCCTG TTTGATGGTGGTTAACGGCG

GGATATAACATGAGCTGTCTTCGGTATCGTCGTATCCCACTACCGAGATATCCGCAC CAACGCGCAGCCCGGACTCG

GTAATGGCGCGCATTGCGCCCAGCGCCATCTGATCGTTGGCAACCAGCATCGCAGTG GGAACGATGCCCTCATTCAG

CATTTGCATGGTTTGTTGAAAACCGGACATGGCACTCCAGTCGCCTTCCCGTTCCGC TATCGGCTGAATTTGATTGC GAGTGAGATATTTATGCCAGCCAGCCAGACGCAGACGCGCCGAGACAGAACTTAATGGGC CCGCTAACAGCGCGATT TGCTGGTGACCCAATGCGACCAGATGCTCCACGCCCAGTCGCGTACCGTCTTCATGGGAG AAAATAATACTGTTGAT GGGTGTCTGGTCAGAGACATCAAGAAATAACGCCGGAACATTAGTGCAGGCAGCTTCCAC AGCAATGGCATCCTGGT

CATCCAGCGGATAGTTAATGATCAGCCCACTGACGCGTTGCGCGAGAAGATTGTGCA CCGCCGCTTTACAGGCTTCG

ACGCCGCTTCGTTCTACCATCGACACCACCACGCTGGCACCCAGTTGATCGGCGCGA GATTTAATCGCCGCGACAAT

TTGCGACGGCGCGTGCAGGGCCAGACTGGAGGTGGCAACGCCAATCAGCAACGACTG TTTGCCCGCCAGTTGTTGTG

CCACGCGGTTGGGAATGTAATTCAGCTCCGCCATCGCCGCTTCCACTTTTTCCCGCG TTTTCGCAGAAACGTGGCTG

GCCTGGTTCACCACGCGGGAAACGGTCTGATAAGAGACACCGGCATACTCTGCGACA TCGTATAACGTTACTGGTTT

CACATTCACCACCCTGAATTGACTCTCTTCCGGGCGCTATCATGCCATACCGCGAAA GGTTTTGCGCCATTCGATGG

TGTCCGGGATCTCGACGCTCTCCCTTATGCGACTCCTGCATTAGGAAGCAGCCCAGT AGTAGGTTGAGGCCGTTGAG

CACCGCCGCCGCAAGGAATGGTGCATGCAAGGAGATGGCGCCCAACAGTCCCCCGGC CACGGGGagtcaaaagcctc cggtcggaggcttttgactTCTAGAGAGCTGTTGACACTTTATGCTTCCGGCTCGTATAA TGTGTGTGGAATTGTGA GCGGATAACAACGCAGTAAGAGAGGAATGTACCCATGGCCATGGCTCGAGgacgaacaat aaggcctccctaacggg gggccttttttattgataacaaaa

Additional sequences:

V2Tc/tetR-V3lac/lacI (SEQ ID NO: 55)

GAGCTGTTGACAACTCTATCATTGATAGAGTTATAATGTTCCCTATCAGTGATAGAG ACGCAGTAAGAGAGGAATGT

ACATATGAAGCTTCTCGGTACCAAATTCCAGAAAAGAGGCCTCCCGAAAGGGGGGCC TTTTTTCGTTTTGGTCCGAA

TTCTTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGCCGCAGTAAGAGAGGAATGT ACACATGTCCCGCCTGGATA

AATCGAAAGTGATTAACTCGGCCCTCGAATTGCTGAATGAAGTCGGTATCGAGGGGC TGACGACCCGTAAATTGGCA

CAAAAGTTGGGGGTGGAGCAACCCACGTTGTATTGGCACGTCAAAAATAAGCGGGCA TTGCTGGATGCCCTCGCTAT

TGAAATGTTGGATCGCCACCATACCCATTTCTGTCCACTGGAGGGCGAGTCCTGGCA GGACTTTCTCCGCAACAACG

CGAAATCCTTTCGCTGTGCACTCTTGTCCCATCGGGACGGTGCTAAGGTGCACTTGG GCACCCGTCCCACCGAAAAA

CAATACGAAACCTTGGAAAATCAATTGGCGTTTTTGTGCCAGCAAGGGTTTAGCTTG GAGAATGCTCTCTATGCGCT

CTCGGCTGTCGGGCACTTTACGTTGGGGTGCGTGTTGGAGGACCAGGAGCATCAAGT CGCAAAAGAGGAGCGTGAAA

CCCCAACCACGGACTCGATGCCACCTCTGCTCCGCCAAGCTATCGAACTCTTCGATC ATCAGGGCGCGGAGCCAGCC

TTCCTCTTTGGGCTGGAGCTGATTATCTGCGGTTTGGAAAAACAACTCAAGTGTGAA AGCGGGTCCTAACTGCAGTC

ACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCA ACGCGCGGGGAGAGGCGGTT

TGCGTATTGGGCGCCAGGGTGGTTTTTCTTTTCACCAGTGAGACGGGCAACAGCTGA TTGCCCTTCACCGCCTGGCC

CTGAGAGAGTTGCAGCAAGCGGTCCACGCTGGTTTGCCCCAGCAGGCGAAAATCCTG TTTGATGGTGGTTAACGGCG

GGATATAACATGAGCTGTCTTCGGTATCGTCGTATCCCACTACCGAGATATCCGCAC CAACGCGCAGCCCGGACTCG

GTAATGGCGCGCATTGCGCCCAGCGCCATCTGATCGTTGGCAACCAGCATCGCAGTG GGAACGATGCCCTCATTCAG

CATTTGCATGGTTTGTTGAAAACCGGACATGGCACTCCAGTCGCCTTCCCGTTCCGC TATCGGCTGAATTTGATTGC

GAGTGAGATATTTATGCCAGCCAGCCAGACGCAGACGCGCCGAGACAGAACTTAATG GGCCCGCTAACAGCGCGATT

TGCTGGTGACCCAATGCGACCAGATGCTCCACGCCCAGTCGCGTACCGTCTTCATGG GAGAAAATAATACTGTTGAT

GGGTGTCTGGTCAGAGACATCAAGAAATAACGCCGGAACATTAGTGCAGGCAGCTTC CACAGCAATGGCATCCTGGT

CATCCAGCGGATAGTTAATGATCAGCCCACTGACGCGTTGCGCGAGAAGATTGTGCA CCGCCGCTTTACAGGCTTCG

ACGCCGCTTCGTTCTACCATCGACACCACCACGCTGGCACCCAGTTGATCGGCGCGA GATTTAATCGCCGCGACAAT

TTGCGACGGCGCGTGCAGGGCCAGACTGGAGGTGGCAACGCCAATCAGCAACGACTG TTTGCCCGCCAGTTGTTGTG

CCACGCGGTTGGGAATGTAATTCAGCTCCGCCATCGCCGCTTCCACTTTTTCCCGCG TTTTCGCAGAAACGTGGCTG GCCTGGTTCACCACGCGGGAAACGGTCTGATAAGAGACACCGGCATACTCTGCGACATCG TATAACGTTACTGGTTT

CACATTCACCACCCTGAATTGACTCTCTTCCGGGCGCTATCATGCCATACCGCGAAA GGTTTTGCGCCATTCGATGG TGTCCGGGATCTCGACGCTCTCCCTTATGCGACTCCTGCATTAGGAAGCAGCCCAGTAGT AGGTTGAGGCCGTTGAG

CACCGCCGCCGCAAGGAATGGTGCATGCAAGGAGATGGCGCCCAACAGTCCCCCGGC CACGGGGagtcaaaagcctc cggtcggaggcttttgactTCTAGAGAGCTGTTGACACTTTATGCTTCCGGCTCGTATAA TGTGTGTGGAATTGTGA GCGGATAACAAGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGAATCCCATG GCTCGAGgacgaacaat aaggcctccctaacggggggccttttttattgataacaaaa

Table 3. List of Plasmids

pσ70 V2TcR-mCardinal (SEQ ID NO: 56)

ggatccGAGCTGTTGACAACTCTATCATTGATAGAGTTATAATGTTCCCTATCAGTGATA GAGACGCAGTAAGAGAG

GAATGTACATATGGTGAGTAAGGGTGAGGAGCTCATTAAGGAGAACATGCACATGAA GCTGTATATGGAGGGCACCG

TAAACAACCACCACTTCAAGTGTACCACCGAGGGTGAAGGTAAACCCTACGAGGGGA CGCAGACCCAACGCATCAAG

GTCGTGGAGGGCGGCCCGCTGCCTTTCGCATTCGACATTCTGGCGACCTGTTTTATG TACGGCTCGAAGACCTTCAT

CAACCACACCCAAGGCATCCCGGACTTCTTCAAGCAGAGCTTCCCTGAGGGCTTCAC CTGGGAGCGCGTCACCACGT

ATGAAGACGGTGGGGTGCTCACCGTGACCCAGGACACGAGCTTGCAGGATGGCTGCT TGATTTACAACGTCAAGCTG

CGCGGGGTGAACTTCCCTAGCAACGGGCCAGTGATGCAGAAAAAGACGCTGGGTTGG GAGGCCACCACCGAGACCCT

GTACCCGGCCGACGGGGGGCTGGAAGGGCGGTGCGATATGGCCCTGAAATTGGTCGG CGGCGGTCATTTGCACTGCA

ATCTCAAGACCACGTACCGCTCCAAGAAACCCGCCAAAAACCTGAAGATGCCTGGTG TTTATTTTGTCGACCGGCGC

CTGGAGCGCATCAAGGAAGCGGACAATGAGACGTACGTGGAACAGCACGAAGTGGCC GTGGCTCGTTATTGCGATCT

GCCGTCGAAGCTGGGTCACAAACTGAACGGCATGGATGAGCTGTACAAAGATTATAA GGATGATGACGACAAGTAAA

AGCTTCTCGGTACCAAATTCCAGAAAAGAGGCCTCCCGAAAGGGGGGCCTTTTTTCG TTTTGGTCCGAATTCTTGAC

AGCTAGCTCAGTCCTAGGTATAATGCTAGCCGCAGTAAGAGAGGAATGTACACATGT CCCGCCTGGATAAATCGAAA

GTGATTAACTCGGCCCTCGAATTGCTGAATGAAGTCGGTATCGAGGGGCTGACGACC CGTAAATTGGCACAAAAGTT

GGGGGTGGAGCAACCCACGTTGTATTGGCACGTCAAAAATAAGCGGGCATTGCTGGA TGCCCTCGCTATTGAAATGT

TGGATCGCCACCATACCCATTTCTGTCCACTGGAGGGCGAGTCCTGGCAGGACTTTC TCCGCAACAACGCGAAATCC

TTTCGCTGTGCACTCTTGTCCCATCGGGACGGTGCTAAGGTGCACTTGGGCACCCGT CCCACCGAAAAACAATACGA

AACCTTGGAAAATCAATTGGCGTTTTTGTGCCAGCAAGGGTTTAGCTTGGAGAATGC TCTCTATGCGCTCTCGGCTG

TCGGGCACTTTACGTTGGGGTGCGTGTTGGAGGACCAGGAGCATCAAGTCGCAAAAG AGGAGCGTGAAACCCCAACC

ACGGACTCGATGCCACCTCTGCTCCGCCAAGCTATCGAACTCTTCGATCATCAGGGC GCGGAGCCAGCCTTCCTCTT

TGGGCTGGAGCTGATTATCTGCGGTTTGGAAAAACAACTCAAGTGTGAAAGCGGGTC CTAACTGCAGTCTAGACCAT GGctcgag

V2LacI-mCardinal (SEQ ID NO: 57)

GGATCCGAGCTGTTGACACTTTATGCTTCCGGCTCGTATAATGTGTGTGGAATTGTG AGCGGATAACAACGCAGTAA

GAGAGGAATGTACATATGGTGAGTAAGGGTGAGGAGCTCATTAAGGAGAACATGCAC ATGAAGCTGTATATGGAGGG

CACCGTAAACAACCACCACTTCAAGTGTACCACCGAGGGTGAAGGTAAACCCTACGA GGGGACGCAGACCCAACGCA

TCAAGGTCGTGGAGGGCGGCCCGCTGCCTTTCGCATTCGACATTCTGGCGACCTGTT TTATGTACGGCTCGAAGACC

TTCATCAACCACACCCAAGGCATCCCGGACTTCTTCAAGCAGAGCTTCCCTGAGGGC TTCACCTGGGAGCGCGTCAC

CACGTATGAAGACGGTGGGGTGCTCACCGTGACCCAGGACACGAGCTTGCAGGATGG CTGCTTGATTTACAACGTCA

AGCTGCGCGGGGTGAACTTCCCTAGCAACGGGCCAGTGATGCAGAAAAAGACGCTGG GTTGGGAGGCCACCACCGAG

ACCCTGTACCCGGCCGACGGGGGGCTGGAAGGGCGGTGCGATATGGCCCTGAAATTG GTCGGCGGCGGTCATTTGCA

CTGCAATCTCAAGACCACGTACCGCTCCAAGAAACCCGCCAAAAACCTGAAGATGCC TGGTGTTTATTTTGTCGACC GGCGCCTGGAGCGCATCAAGGAAGCGGACAATGAGACGTACGTGGAACAGCACGAAGTGG CCGTGGCTCGTTATTGC

GATCTGCCGTCGAAGCTGGGTCACAAACTGAACGGCATGGATGAGCTGTACAAAGAT TATAAGGATGATGACGACAA

GTAAAAGCTTCTCGGTACCAAATTCCAGAAAAGAGGCCTCCCGAAAGGGGGGCCTTT TTTCGTTTTGGTCCGAATTC

CCCCGTGGCCGGGGGACTGTTGGGCGCCATCTCCTTGCATGCACCATTCCTTGCGGC GGCGGTGCTCAACGGCCTCA

ACCTACTACTGGGCTGCTTCCTAATGCAGGAGTCGCATAAGGGAGAGCGTCGAGATC CCGGACACCATCGAATGGCG

CAAAACCTTTCGCGGTATGGCATGATAGCGCCCGGAAGAGAGTCAATTCAGGGTGGT GAATGTGAAACCAGTAACGT

TATACGATGTCGCAGAGTATGCCGGTGTCTCTTATCAGACCGTTTCCCGCGTGGTGA ACCAGGCCAGCCACGTTTCT

GCGAAAACGCGGGAAAAAGTGGAAGCGGCGATGGCGGAGCTGAATTACATTCCCAAC CGCGTGGCACAACAACTGGC

GGGCAAACAGTCGTTGCTGATTGGCGTTGCCACCTCCAGTCTGGCCCTGCACGCGCC GTCGCAAATTGTCGCGGCGA

TTAAATCTCGCGCCGATCAACTGGGTGCCAGCGTGGTGGTGTCGATGGTAGAACGAA GCGGCGTCGAAGCCTGTAAA

GCGGCGGTGCACAATCTTCTCGCGCAACGCGTCAGTGGGCTGATCATTAACTATCCG CTGGATGACCAGGATGCCAT

TGCTGTGGAAGCTGCCTGCACTAATGTTCCGGCGTTATTTCTTGATGTCTCTGACCA GACACCCATCAACAGTATTA

TTTTCTCCCATGAAGACGGTACGCGACTGGGCGTGGAGCATCTGGTCGCATTGGGTC ACCAGCAAATCGCGCTGTTA

GCGGGCCCATTAAGTTCTGTCTCGGCGCGTCTGCGTCTGGCTGGCTGGCATAAATAT CTCACTCGCAATCAAATTCA

GCCGATAGCGGAACGGGAAGGCGACTGGAGTGCCATGTCCGGTTTTCAACAAACCAT GCAAATGCTGAATGAGGGCA

TCGTTCCCACTGCGATGCTGGTTGCCAACGATCAGATGGCGCTGGGCGCAATGCGCG CCATTACCGAGTCCGGGCTG

CGCGTTGGTGCGGATATCTCGGTAGTGGGATACGACGATACCGAAGACAGCTCATGT TATATCCCGCCGTTAACCAC

CATCAAACAGGATTTTCGCCTGCTGGGGCAAACCAGCGTGGACCGCTTGCTGCAACT CTCTCAGGGCCAGGCGGTGA

AGGGCAATCAGCTGTTGCCCGTCTCACTGGTGAAAAGAAAAACCACCCTGGCGCCCA ATACGCAAACCGCCTCTCCC

CGCGCGTTGGCCGATTCATTAATGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGC GGGCAGTGACTGCAGCTCGA G

pσ70 V3LacI-mCardinal (SEQ ID NO: 60)

GGATCCGAGCTGTTGACACTTTATGCTTCCGGCTCGTATAATGTGTGTGGAATTGTG AGCGGATAACAAGTGGAATT

GTGAGCGGATAACAATTTCACACAGGAAACAGAATCATATGGTGAGTAAGGGTGAGG AGCTCATTAAGGAGAACATG

CACATGAAGCTGTATATGGAGGGCACCGTAAACAACCACCACTTCAAGTGTACCACC GAGGGTGAAGGTAAACCCTA

CGAGGGGACGCAGACCCAACGCATCAAGGTCGTGGAGGGCGGCCCGCTGCCTTTCGC ATTCGACATTCTGGCGACCT

GTTTTATGTACGGCTCGAAGACCTTCATCAACCACACCCAAGGCATCCCGGACTTCT TCAAGCAGAGCTTCCCTGAG

GGCTTCACCTGGGAGCGCGTCACCACGTATGAAGACGGTGGGGTGCTCACCGTGACC CAGGACACGAGCTTGCAGGA

TGGCTGCTTGATTTACAACGTCAAGCTGCGCGGGGTGAACTTCCCTAGCAACGGGCC AGTGATGCAGAAAAAGACGC

TGGGTTGGGAGGCCACCACCGAGACCCTGTACCCGGCCGACGGGGGGCTGGAAGGGC GGTGCGATATGGCCCTGAAA

TTGGTCGGCGGCGGTCATTTGCACTGCAATCTCAAGACCACGTACCGCTCCAAGAAA CCCGCCAAAAACCTGAAGAT

GCCTGGTGTTTATTTTGTCGACCGGCGCCTGGAGCGCATCAAGGAAGCGGACAATGA GACGTACGTGGAACAGCACG

AAGTGGCCGTGGCTCGTTATTGCGATCTGCCGTCGAAGCTGGGTCACAAACTGAACG GCATGGATGAGCTGTACAAA

GATTATAAGGATGATGACGACAAGTAAAAGCTTCTCGGTACCAAATTCCAGAAAAGA GGCCTCCCGAAAGGGGGGCC

TTTTTTCGTTTTGGTCCGAATTCCCCCGTGGCCGGGGGACTGTTGGGCGCCATCTCC TTGCATGCACCATTCCTTGC

GGCGGCGGTGCTCAACGGCCTCAACCTACTACTGGGCTGCTTCCTAATGCAGGAGTC GCATAAGGGAGAGCGTCGAG

ATCCCGGACACCATCGAATGGCGCAAAACCTTTCGCGGTATGGCATGATAGCGCCCG GAAGAGAGTCAATTCAGGGT

GGTGAATGTGAAACCAGTAACGTTATACGATGTCGCAGAGTATGCCGGTGTCTCTTA TCAGACCGTTTCCCGCGTGG

TGAACCAGGCCAGCCACGTTTCTGCGAAAACGCGGGAAAAAGTGGAAGCGGCGATGG CGGAGCTGAATTACATTCCC

AACCGCGTGGCACAACAACTGGCGGGCAAACAGTCGTTGCTGATTGGCGTTGCCACC TCCAGTCTGGCCCTGCACGC GCCGTCGCAAATTGTCGCGGCGATTAAATCTCGCGCCGATCAACTGGGTGCCAGCGTGGT GGTGTCGATGGTAGAAC GAAGCGGCGTCGAAGCCTGTAAAGCGGCGGTGCACAATCTTCTCGCGCAACGCGTCAGTG GGCTGATCATTAACTAT CCGCTGGATGACCAGGATGCCATTGCTGTGGAAGCTGCCTGCACTAATGTTCCGGCGTTA TTTCTTGATGTCTCTGA CCAGACACCCATCAACAGTATTATTTTCTCCCATGAAGACGGTACGCGACTGGGCGTGGA GCATCTGGTCGCATTGG GTCACCAGCAAATCGCGCTGTTAGCGGGCCCATTAAGTTCTGTCTCGGCGCGTCTGCGTC TGGCTGGCTGGCATAAA TATCTCACTCGCAATCAAATTCAGCCGATAGCGGAACGGGAAGGCGACTGGAGTGCCATG TCCGGTTTTCAACAAAC CATGCAAATGCTGAATGAGGGCATCGTTCCCACTGCGATGCTGGTTGCCAACGATCAGAT GGCGCTGGGCGCAATGC GCGCCATTACCGAGTCCGGGCTGCGCGTTGGTGCGGATATCTCGGTAGTGGGATACGACG ATACCGAAGACAGCTCA TGTTATATCCCGCCGTTAACCACCATCAAACAGGATTTTCGCCTGCTGGGGCAAACCAGC GTGGACCGCTTGCTGCA ACTCTCTCAGGGCCAGGCGGTGAAGGGCAATCAGCTGTTGCCCGTCTCACTGGTGAAAAG AAAAACCACCCTGGCGC CCAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAGCTGGCACGAC AGGTTTCCCGACTGGAA AGCGGGCAGTGACTGCAGCTCGAG pa V2TcR-sfGFP (SEQ ID NO: 61)

GGATCCGAGCTGTTGACAACTCTATCATTGATAGAGTTATAATGTTCCCTATCAGTG ATAGAGACGCAGTAAGAGAG GAATGTACATATGTCCAAAGGTGAAGAGCTGTTTACCGGCGTCGTGCCCATTCTGGTGGA GCTGGATGGCGACGTCA ACGGGCACAAGTTTAGCGTCCGTGGCGAAGGTGAGGGCGACGCCACGAACGGTAAGCTGA CGCTGAAATTCATTTGC ACCACCGGCAAATTGCCTGTACCCTGGCCCACCCTGGTGACCACGCTCACCTACGGCGTA CAGTGCTTCAGCCGTTA CCCGGACCACATGAAGCGTCACGACTTCTTCAAAAGCGCCATGCCGGAGGGTTACGTGCA GGAGCGTACGATTAGTT TCAAGGACGACGGCACCTATAAGACCCGTGCCGAAGTGAAGTTCGAAGGCGATACGTTGG TGAACCGTATCGAGTTG AAGGGTATCGACTTTAAGGAAGACGGCAACATCCTGGGCCATAAGCTGGAGTACAATTTC AACAGCCATAACGTTTA CATCACCGCCGATAAACAGAAGAACGGCATTAAAGCCAACTTTAAGATCCGCCACAACGT CGAAGACGGCTCGGTGC AGCTGGCCGACCATTATCAGCAAAACACCCCCATCGGTGATGGGCCCGTGCTGCTGCCGG ATAACCATTATCTGAGC ACGCAGTCGGTGCTCAGCAAGGACCCTAACGAAAAGCGCGATCACATGGTGCTGCTGGAG TTCGTCACGGCGGCGGG GAT CACC CAT GGGAT GGAC GAGCT CTACAAAGACTATAAAGAT GAG GAT GACAAGTAAAAGCT T CT C GGTACCAAAT TCCAGAAAAGAGGCCTCCCGAAAGGGGGGCCTTTTTTCGTTTTGGTCCGAATTCTTGACA GCTAGCTCAGTCCTAGG TATAATGCTAGCCGCAGTAAGAGAGGAATGTACACATGTCCCGCCTGGATAAATCGAAAG TGATTAACTCGGCCCTC GAATTGCTGAATGAAGTCGGTATCGAGGGGCTGACGACCCGTAAATTGGCACAAAAGTTG GGGGTGGAGCAACCCAC GTTGTATTGGCACGTCAAAAATAAGCGGGCATTGCTGGATGCCCTCGCTATTGAAATGTT GGATCGCCACCATACCC ATTTCTGTCCACTGGAGGGCGAGTCCTGGCAGGACTTTCTCCGCAACAACGCGAAATCCT TTCGCTGTGCACTCTTG TCCCATCGGGACGGTGCTAAGGTGCACTTGGGCACCCGTCCCACCGAAAAACAATACGAA ACCTTGGAAAATCAATT GGCGTTTTTGTGCCAGCAAGGGTTTAGCTTGGAGAATGCTCTCTATGCGCTCTCGGCTGT CGGGCACTTTACGTTGG GGTGCGTGTTGGAGGACCAGGAGCATCAAGTCGCAAAAGAGGAGCGTGAAACCCCAACCA CGGACTCGATGCCACCT CTGCTCCGCCAAGCTATCGAACTCTTCGATCATCAGGGCGCGGAGCCAGCCTTCCTCTTT GGGCTGGAGCTGATTAT CTGCGGTTTGGAAAAACAACTCAAGTGTGAAAGCGGGTCCTAACTGCAGCTCGAG pa V21acI-sfGFP (SEQ ID NO: 62)

GGATCCGAGCTGTTGACACTTTATGCTTCCGGCTCGTATAATGTGTGTGGAATTGTG AGCGGATAACAACGCAGTAA GAGAGGAATGTACATATGTCCAAAGGTGAAGAGCTGTTTACCGGCGTCGTGCCCATTCTG GTGGAGCTGGATGGCGA CGTCAACGGGCACAAGTTTAGCGTCCGTGGCGAAGGTGAGGGCGACGCCACGAACGGTAA GCTGACGCTGAAATTCA TTTGCACCACCGGCAAATTGCCTGTACCCTGGCCCACCCTGGTGACCACGCTCACCTACG GCGTACAGTGCTTCAGC

CGTTACCCGGACCACATGAAGCGTCACGACTTCTTCAAAAGCGCCATGCCGGAGGGT TACGTGCAGGAGCGTACGAT

TAGTTTCAAGGACGACGGCACCTATAAGACCCGTGCCGAAGTGAAGTTCGAAGGCGA TACGTTGGTGAACCGTATCG

AGTTGAAGGGTATCGACTTTAAGGAAGACGGCAACATCCTGGGCCATAAGCTGGAGT ACAATTTCAACAGCCATAAC

GTTTACATCACCGCCGATAAACAGAAGAACGGCATTAAAGCCAACTTTAAGATCCGC CACAACGTCGAAGACGGCTC

GGTGCAGCTGGCCGACCATTATCAGCAAAACACCCCCATCGGTGATGGGCCCGTGCT GCTGCCGGATAACCATTATC

TGAGCACGCAGTCGGTGCTCAGCAAGGACCCTAACGAAAAGCGCGATCACATGGTGC TGCTGGAGTTCGTCACGGCG

GCGGGGATCACCCATGGGATGGACGAGCTCTACAAAGACTATAAAGATGACGATGAC AAGTAAAAGCTTCTCGGTAC

CAAATTCCAGAAAAGAGGCCTCCCGAAAGGGGGGCCTTTTTTCGTTTTGGTCCGAAT TCCCCCGTGGCCGGGGGACT

GTTGGGCGCCATCTCCTTGCATGCACCATTCCTTGCGGCGGCGGTGCTCAACGGCCT CAACCTACTACTGGGCTGCT

TCCTAATGCAGGAGTCGCATAAGGGAGAGCGTCGAGATCCCGGACACCATCGAATGG CGCAAAACCTTTCGCGGTAT

GGCATGATAGCGCCCGGAAGAGAGTCAATTCAGGGTGGTGAATGTGAAACCAGTAAC GTTATACGATGTCGCAGAGT

ATGCCGGTGTCTCTTATCAGACCGTTTCCCGCGTGGTGAACCAGGCCAGCCACGTTT CTGCGAAAACGCGGGAAAAA

GTGGAAGCGGCGATGGCGGAGCTGAATTACATTCCCAACCGCGTGGCACAACAACTG GCGGGCAAACAGTCGTTGCT

GATTGGCGTTGCCACCTCCAGTCTGGCCCTGCACGCGCCGTCGCAAATTGTCGCGGC GATTAAATCTCGCGCCGATC

AACTGGGTGCCAGCGTGGTGGTGTCGATGGTAGAACGAAGCGGCGTCGAAGCCTGTA AAGCGGCGGTGCACAATCTT

CTCGCGCAACGCGTCAGTGGGCTGATCATTAACTATCCGCTGGATGACCAGGATGCC ATTGCTGTGGAAGCTGCCTG

CACTAATGTTCCGGCGTTATTTCTTGATGTCTCTGACCAGACACCCATCAACAGTAT TATTTTCTCCCATGAAGACG

GTACGCGACTGGGCGTGGAGCATCTGGTCGCATTGGGTCACCAGCAAATCGCGCTGT TAGCGGGCCCATTAAGTTCT

GTCTCGGCGCGTCTGCGTCTGGCTGGCTGGCATAAATATCTCACTCGCAATCAAATT CAGCCGATAGCGGAACGGGA

AGGCGACTGGAGTGCCATGTCCGGTTTTCAACAAACCATGCAAATGCTGAATGAGGG CATCGTTCCCACTGCGATGC

TGGTTGCCAACGATCAGATGGCGCTGGGCGCAATGCGCGCCATTACCGAGTCCGGGC TGCGCGTTGGTGCGGATATC

TCGGTAGTGGGATACGACGATACCGAAGACAGCTCATGTTATATCCCGCCGTTAACC ACCATCAAACAGGATTTTCG

CCTGCTGGGGCAAACCAGCGTGGACCGCTTGCTGCAACTCTCTCAGGGCCAGGCGGT GAAGGGCAATCAGCTGTTGC

CCGTCTCACTGGTGAAAAGAAAAACCACCCTGGCGCCCAATACGCAAACCGCCTCTC CCCGCGCGTTGGCCGATTCA

TTAATGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGACTGCAGCTC GAG

pσ70 V3LacI-sfGFP (SEQ ID NO: 63)

GGATCCGAGCTGTTGACACTTTATGCTTCCGGCTCGTATAATGTGTGTGGAATTGTG AGCGGATAACAAGTGGAATT

GTGAGCGGATAACAATTTCACACAGGAAACAGAATCATATGTCCAAAGGTGAAGAGC TGTTTACCGGCGTCGTGCCC

ATTCTGGTGGAGCTGGATGGCGACGTCAACGGGCACAAGTTTAGCGTCCGTGGCGAA GGTGAGGGCGACGCCACGAA

CGGTAAGCTGACGCTGAAATTCATTTGCACCACCGGCAAATTGCCTGTACCCTGGCC CACCCTGGTGACCACGCTCA

CCTACGGCGTACAGTGCTTCAGCCGTTACCCGGACCACATGAAGCGTCACGACTTCT TCAAAAGCGCCATGCCGGAG

GGTTACGTGCAGGAGCGTACGATTAGTTTCAAGGACGACGGCACCTATAAGACCCGT GCCGAAGTGAAGTTCGAAGG

CGATACGTTGGTGAACCGTATCGAGTTGAAGGGTATCGACTTTAAGGAAGACGGCAA CATCCTGGGCCATAAGCTGG

AGTACAATTTCAACAGCCATAACGTTTACATCACCGCCGATAAACAGAAGAACGGCA TTAAAGCCAACTTTAAGATC

CGCCACAACGTCGAAGACGGCTCGGTGCAGCTGGCCGACCATTATCAGCAAAACACC CCCATCGGTGATGGGCCCGT

GCTGCTGCCGGATAACCATTATCTGAGCACGCAGTCGGTGCTCAGCAAGGACCCTAA CGAAAAGCGCGATCACATGG

TGCTGCTGGAGTTCGTCACGGCGGCGGGGATCACCCATGGGATGGACGAGCTCTACA AAGACTATAAAGATGACGAT

GACAAGTAAAAGCTTCTCGGTACCAAATTCCAGAAAAGAGGCCTCCCGAAAGGGGGG CCTTTTTTCGTTTTGGTCCG

AATTCCCCCGTGGCCGGGGGACTGTTGGGCGCCATCTCCTTGCATGCACCATTCCTT GCGGCGGCGGTGCTCAACGG CCTCAACCTACTACTGGGCTGCTTCCTAATGCAGGAGTCGCATAAGGGAGAGCGTCGAGA TCCCGGACACCATCGAA

TGGCGCAAAACCTTTCGCGGTATGGCATGATAGCGCCCGGAAGAGAGTCAATTCAGG GTGGTGAATGTGAAACCAGT AACGTTATACGATGTCGCAGAGTATGCCGGTGTCTCTTATCAGACCGTTTCCCGCGTGGT GAACCAGGCCAGCCACG TTTCTGCGAAAACGCGGGAAAAAGTGGAAGCGGCGATGGCGGAGCTGAATTACATTCCCA ACCGCGTGGCACAACAA CTGGCGGGCAAACAGTCGTTGCTGATTGGCGTTGCCACCTCCAGTCTGGCCCTGCACGCG CCGTCGCAAATTGTCGC GGCGATTAAATCTCGCGCCGATCAACTGGGTGCCAGCGTGGTGGTGTCGATGGTAGAACG AAGCGGCGTCGAAGCCT GTAAAGCGGCGGTGCACAATCTTCTCGCGCAACGCGTCAGTGGGCTGATCATTAACTATC CGCTGGATGACCAGGAT GCCATTGCTGTGGAAGCTGCCTGCACTAATGTTCCGGCGTTATTTCTTGATGTCTCTGAC CAGACACCCATCAACAG TATTATTTTCTCCCATGAAGACGGTACGCGACTGGGCGTGGAGCATCTGGTCGCATTGGG TCACCAGCAAATCGCGC TGTTAGCGGGCCCATTAAGTTCTGTCTCGGCGCGTCTGCGTCTGGCTGGCTGGCATAAAT ATCTCACTCGCAATCAA ATTCAGCCGATAGCGGAACGGGAAGGCGACTGGAGTGCCATGTCCGGTTTTCAACAAACC ATGCAAATGCTGAATGA GGGCATCGTTCCCACTGCGATGCTGGTTGCCAACGATCAGATGGCGCTGGGCGCAATGCG CGCCATTACCGAGTCCG GGCTGCGCGTTGGTGCGGATATCTCGGTAGTGGGATACGACGATACCGAAGACAGCTCAT GTTATATCCCGCCGTTA ACCACCATCAAACAGGATTTTCGCCTGCTGGGGCAAACCAGCGTGGACCGCTTGCTGCAA CTCTCTCAGGGCCAGGC GGTGAAGGGCAATCAGCTGTTGCCCGTCTCACTGGTGAAAAGAAAAACCACCCTGGCGCC CAATACGCAAACCGCCT CTCCCCGCGCGTTGGCCGATTCATTAATGCAGCTGGCACGACAGGTTTCCCGACTGGAAA GCGGGCAGTGACTGCAG CTCGAG gaacacggcggcatcagagcagccgattgzctgttgtgcccagtcazagccgaatagcct ctccacccaagcggccg gagaacctgcgtgcaatccatcttgttcaatcatgcgaaacgatcczcatcctgtctctt gatcagatcttgatccc ctgcgccatcagatccttggcggcaagaaagccatccagtttacttagcagggcttccca accttaccagagggcgc cccagctggcaattccggttcgcttgctgaccataaaaccgcccagactagctatcgcca tgtaagcccactgcaag pa V2TcR-mCardinal/V2LacI-mCardiiial (SEQ ID NO: 65) ggatccGAGCTGTTGACAACTCTATCATTGATAGAGTTATAATGTTCCCTATCAGTGATA GAGACGCAGTAAGAGAG GAATGTACATATGGTGAGTAAGGGTGAGGAGCTCATTAAGGAGAACATGCACATGAAGCT GTATATGGAGGGCACCG

TAAACAACCACCACTTCAAGTGTACCACCGAGGGTGAAGGTAAACCCTACGAGGGGA CGCAGACCCAACGCATCAAG

GTCGTGGAGGGCGGCCCGCTGCCTTTCGCATTCGACATTCTGGCGACCTGTTTTATG TACGGCTCGAAGACCTTCAT CAACCACACCCAAGGCATCCCGGACTTCTTCAAGCAGAGCTTCCCTGAGGGCTTCACCTG GGAGCGCGTCACCACGT ATGAAGACGGTGGGGTGCTCACCGTGACCCAGGACACGAGCTTGCAGGATGGCTGCTTGA TTTACAACGTCAAGCTG CGCGGGGTGAACTTCCCTAGCAACGGGCCAGTGATGCAGAAAAAGACGCTGGGTTGGGAG GCCACCACCGAGACCCT GTACCCGGCCGACGGGGGGCTGGAAGGGCGGTGCGATATGGCCCTGAAATTGGTCGGCGG CGGTCATTTGCACTGCA ATCTCAAGACCACGTACCGCTCCAAGAAACCCGCCAAAAACCTGAAGATGCCTGGTGTTT ATTTTGTCGACCGGCGC CTGGAGCGCATCAAGGAAGCGGACAATGAGACGTACGTGGAACAGCACGAAGTGGCCGTG GCTCGTTATTGCGATCT GCCGTCGAAGCTGGGTCACAAACTGAACGGCATGGATGAGCTGTACAAAGATTATAAGGA TGATGACGACAAGTAAA AGCTTCTCGGTACCAAATTCCAGAAAAGAGGCCTCCCGAAAGGGGGGCCTTTTTTCGTTT TGGTCCGAATTCTTGAC AGCTAGCTCAGTCCTAGGTATAATGCTAGCCGCAGTAAGAGAGGAATGTACACATGTCCC GCCTGGATAAATCGAAA GTGATTAACTCGGCCCTCGAATTGCTGAATGAAGTCGGTATCGAGGGGCTGACGACCCGT AAATTGGCACAAAAGTT GGGGGTGGAGCAACCCACGTTGTATTGGCACGTCAAAAATAAGCGGGCATTGCTGGATGC CCTCGCTATTGAAATGT TGGATCGCCACCATACCCATTTCTGTCCACTGGAGGGCGAGTCCTGGCAGGACTTTCTCC GCAACAACGCGAAATCC TTTCGCTGTGCACTCTTGTCCCATCGGGACGGTGCTAAGGTGCACTTGGGCACCCGTCCC ACCGAAAAACAATACGA AACCTTGGAAAATCAATTGGCGTTTTTGTGCCAGCAAGGGTTTAGCTTGGAGAATGCTCT CTATGCGCTCTCGGCTG TCGGGCACTTTACGTTGGGGTGCGTGTTGGAGGACCAGGAGCATCAAGTCGCAAAAGAGG AGCGTGAAACCCCAACC ACGGACTCGATGCCACCTCTGCTCCGCCAAGCTATCGAACTCTTCGATCATCAGGGCGCG GAGCCAGCCTTCCTCTT TGGGCTGGAGCTGATTATCTGCGGTTTGGAAAAACAACTCAAGTGTGAAAGCGGGTCCTA ACTGCAGTCACTGCCCG CTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGA GAGGCGGTTTGCGTATT GGGCGCCAGGGTGGTTTTTCTTTTCACCAGTGAGACGGGCAACAGCTGATTGCCCTTCAC CGCCTGGCCCTGAGAGA GTTGCAGCAAGCGGTCCACGCTGGTTTGCCCCAGCAGGCGAAAATCCTGTTTGATGGTGG TTAACGGCGGGATATAA CATGAGCTGTCTTCGGTATCGTCGTATCCCACTACCGAGATATCCGCACCAACGCGCAGC CCGGACTCGGTAATGGC GCGCATTGCGCCCAGCGCCATCTGATCGTTGGCAACCAGCATCGCAGTGGGAACGATGCC CTCATTCAGCATTTGCA

TGGTTTGTTGAAAACCGGACATGGCACTCCAGTCGCCTTCCCGTTCCGCTATCGGCT GAATTTGATTGCGAGTGAGA

TATTTATGCCAGCCAGCCAGACGCAGACGCGCCGAGACAGAACTTAATGGGCCCGCTAAC AGCGCGATTTGCTGGTG

ACCCAATGCGACCAGATGCTCCACGCCCAGTCGCGTACCGTCTTCATGGGAGAAAAT AATACTGTTGATGGGTGTCT

GGTCAGAGACATCAAGAAATAACGCCGGAACATTAGTGCAGGCAGCTTCCACAGCAA TGGCATCCTGGTCATCCAGC

GGATAGTTAATGATCAGCCCACTGACGCGTTGCGCGAGAAGATTGTGCACCGCCGCT TTACAGGCTTCGACGCCGCT

TCGTTCTACCATCGACACCACCACGCTGGCACCCAGTTGATCGGCGCGAGATTTAAT CGCCGCGACAATTTGCGACG

GCGCGTGCAGGGCCAGACTGGAGGTGGCAACGCCAATCAGCAACGACTGTTTGCCCG CCAGTTGTTGTGCCACGCGG

TTGGGAATGTAATTCAGCTCCGCCATCGCCGCTTCCACTTTTTCCCGCGTTTTCGCA GAAACGTGGCTGGCCTGGTT

CACCACGCGGGAAACGGTCTGATAAGAGACACCGGCATACTCTGCGACATCGTATAA CGTTACTGGTTTCACATTCA

CCACCCTGAATTGACTCTCTTCCGGGCGCTATCATGCCATACCGCGAAAGGTTTTGC GCCATTCGATGGTGTCCGGG

ATCTCGACGCTCTCCCTTATGCGACTCCTGCATTAGGAAGCAGCCCAGTAGTAGGTT GAGGCCGTTGAGCACCGCCG

CCGCAAGGAATGGTGCATGCAAGGAGATGGCGCCCAACAGTCCCCCGGCCACGGGGa gtcaaaagcctccggtcgga ggcttttgactTCTAGAGAGCTGTTGACACTTTATGCTTCCGGCTCGTATAATGTGTGTG GAATTGTGAGCGGATAA

CAACGCAGTAAGAGAGGAATGTACCCATGGTGAGTAAGGGTGAGGAGCTCATTAAGG AGAACATGCACATGAAGCTG

TATATGGAGGGCACCGTAAACAACCACCACTTCAAGTGTACCACCGAGGGTGAAGGT AAACCCTACGAGGGGACGCA

GACCCAACGCATCAAGGTCGTGGAGGGCGGCCCGCTGCCTTTCGCATTCGACATTCT GGCGACCTGTTTTATGTACG

GCTCGAAGACCTTCATCAACCACACCCAAGGCATCCCGGACTTCTTCAAGCAGAGCT TCCCTGAGGGCTTCACCTGG

GAGCGCGTCACCACGTATGAAGACGGTGGGGTGCTCACCGTGACCCAGGACACGAGC TTGCAGGATGGCTGCTTGAT

TTACAACGTCAAGCTGCGCGGGGTGAACTTCCCTAGCAACGGGCCAGTGATGCAGAA AAAGACGCTGGGTTGGGAGG

CCACCACCGAGACCCTGTACCCGGCCGACGGGGGGCTGGAAGGGCGGTGCGATATGG CCCTGAAATTGGTCGGCGGC

GGTCATTTGCACTGCAATCTCAAGACCACGTACCGCTCCAAGAAACCCGCCAAAAAC CTGAAGATGCCTGGTGTTTA

TTTTGTCGACCGGCGCCTGGAGCGCATCAAGGAAGCGGACAATGAGACGTACGTGGA ACAGCACGAAGTGGCCGTGG

CTCGTTATTGCGATCTGCCGTCGAAGCTGGGTCACAAACTGAACGGCATGGATGAGC TGTACAAAGATTATAAGGATGATGACGACAAGTAACTCGAG

pa V2TcR-mCardinal/V3LacI-mCardinal (SEQ ID NO: 66)

ggatccGAGCTGTTGACAACTCTATCATTGATAGAGTTATAATGTTCCCTATCAGTGATA GAGACGCAGTAAGAGAG

GAATGTACATATGGTGAGTAAGGGTGAGGAGCTCATTAAGGAGAACATGCACATGAA GCTGTATATGGAGGGCACCG

TAAACAACCACCACTTCAAGTGTACCACCGAGGGTGAAGGTAAACCCTACGAGGGGA CGCAGACCCAACGCATCAAG

GTCGTGGAGGGCGGCCCGCTGCCTTTCGCATTCGACATTCTGGCGACCTGTTTTATG TACGGCTCGAAGACCTTCAT

CAACCACACCCAAGGCATCCCGGACTTCTTCAAGCAGAGCTTCCCTGAGGGCTTCAC CTGGGAGCGCGTCACCACGT

ATGAAGACGGTGGGGTGCTCACCGTGACCCAGGACACGAGCTTGCAGGATGGCTGCT TGATTTACAACGTCAAGCTG

CGCGGGGTGAACTTCCCTAGCAACGGGCCAGTGATGCAGAAAAAGACGCTGGGTTGG GAGGCCACCACCGAGACCCT

GTACCCGGCCGACGGGGGGCTGGAAGGGCGGTGCGATATGGCCCTGAAATTGGTCGG CGGCGGTCATTTGCACTGCA

ATCTCAAGACCACGTACCGCTCCAAGAAACCCGCCAAAAACCTGAAGATGCCTGGTG TTTATTTTGTCGACCGGCGC

CTGGAGCGCATCAAGGAAGCGGACAATGAGACGTACGTGGAACAGCACGAAGTGGCC GTGGCTCGTTATTGCGATCT

GCCGTCGAAGCTGGGTCACAAACTGAACGGCATGGATGAGCTGTACAAAGATTATAA GGATGATGACGACAAGTAAA

AGCTTCTCGGTACCAAATTCCAGAAAAGAGGCCTCCCGAAAGGGGGGCCTTTTTTCG TTTTGGTCCGAATTCTTGAC

AGCTAGCTCAGTCCTAGGTATAATGCTAGCCGCAGTAAGAGAGGAATGTACACATGT CCCGCCTGGATAAATCGAAA

GTGATTAACTCGGCCCTCGAATTGCTGAATGAAGTCGGTATCGAGGGGCTGACGACC CGTAAATTGGCACAAAAGTT

GGGGGTGGAGCAACCCACGTTGTATTGGCACGTCAAAAATAAGCGGGCATTGCTGGATGC CCTCGCTATTGAAATGT

TGGATCGCCACCATACCCATTTCTGTCCACTGGAGGGCGAGTCCTGGCAGGACTTTC TCCGCAACAACGCGAAATCC

TTTCGCTGTGCACTCTTGTCCCATCGGGACGGTGCTAAGGTGCACTTGGGCACCCGT CCCACCGAAAAACAATACGA

AACCTTGGAAAATCAATTGGCGTTTTTGTGCCAGCAAGGGTTTAGCTTGGAGAATGC TCTCTATGCGCTCTCGGCTG

TCGGGCACTTTACGTTGGGGTGCGTGTTGGAGGACCAGGAGCATCAAGTCGCAAAAG AGGAGCGTGAAACCCCAACC

ACGGACTCGATGCCACCTCTGCTCCGCCAAGCTATCGAACTCTTCGATCATCAGGGC GCGGAGCCAGCCTTCCTCTT

TGGGCTGGAGCTGATTATCTGCGGTTTGGAAAAACAACTCAAGTGTGAAAGCGGGTC CTAACTGCAGTCACTGCCCG

CTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGG GGAGAGGCGGTTTGCGTATT

GGGCGCCAGGGTGGTTTTTCTTTTCACCAGTGAGACGGGCAACAGCTGATTGCCCTT CACCGCCTGGCCCTGAGAGA

GTTGCAGCAAGCGGTCCACGCTGGTTTGCCCCAGCAGGCGAAAATCCTGTTTGATGG TGGTTAACGGCGGGATATAA

CATGAGCTGTCTTCGGTATCGTCGTATCCCACTACCGAGATATCCGCACCAACGCGC AGCCCGGACTCGGTAATGGC

GCGCATTGCGCCCAGCGCCATCTGATCGTTGGCAACCAGCATCGCAGTGGGAACGAT GCCCTCATTCAGCATTTGCA

TGGTTTGTTGAAAACCGGACATGGCACTCCAGTCGCCTTCCCGTTCCGCTATCGGCT GAATTTGATTGCGAGTGAGA

TATTTATGCCAGCCAGCCAGACGCAGACGCGCCGAGACAGAACTTAATGGGCCCGCT AACAGCGCGATTTGCTGGTG

ACCCAATGCGACCAGATGCTCCACGCCCAGTCGCGTACCGTCTTCATGGGAGAAAAT AATACTGTTGATGGGTGTCT

GGTCAGAGACATCAAGAAATAACGCCGGAACATTAGTGCAGGCAGCTTCCACAGCAA TGGCATCCTGGTCATCCAGC

GGATAGTTAATGATCAGCCCACTGACGCGTTGCGCGAGAAGATTGTGCACCGCCGCT TTACAGGCTTCGACGCCGCT

TCGTTCTACCATCGACACCACCACGCTGGCACCCAGTTGATCGGCGCGAGATTTAAT CGCCGCGACAATTTGCGACG

GCGCGTGCAGGGCCAGACTGGAGGTGGCAACGCCAATCAGCAACGACTGTTTGCCCG CCAGTTGTTGTGCCACGCGG

TTGGGAATGTAATTCAGCTCCGCCATCGCCGCTTCCACTTTTTCCCGCGTTTTCGCA GAAACGTGGCTGGCCTGGTT

CACCACGCGGGAAACGGTCTGATAAGAGACACCGGCATACTCTGCGACATCGTATAA CGTTACTGGTTTCACATTCA

CCACCCTGAATTGACTCTCTTCCGGGCGCTATCATGCCATACCGCGAAAGGTTTTGC GCCATTCGATGGTGTCCGGG

ATCTCGACGCTCTCCCTTATGCGACTCCTGCATTAGGAAGCAGCCCAGTAGTAGGTT GAGGCCGTTGAGCACCGCCG

CCGCAAGGAATGGTGCATGCAAGGAGATGGCGCCCAACAGTCCCCCGGCCACGGGGa gtcaaaagcctccggtcgga ggcttttgactTCTAGAGAGCTGTTGACACTTTATGCTTCCGGCTCGTATAATGTGTGTG GAATTGTGAGCGGATAA

CAAGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGAATCCCATGGTGAG TAAGGGTGAGGAGCTCATTA

AGGAGAACATGCACATGAAGCTGTATATGGAGGGCACCGTAAACAACCACCACTTCA AGTGTACCACCGAGGGTGAA

GGTAAACCCTACGAGGGGACGCAGACCCAACGCATCAAGGTCGTGGAGGGCGGCCCG CTGCCTTTCGCATTCGACAT

TCTGGCGACCTGTTTTATGTACGGCTCGAAGACCTTCATCAACCACACCCAAGGCAT CCCGGACTTCTTCAAGCAGA

GCTTCCCTGAGGGCTTCACCTGGGAGCGCGTCACCACGTATGAAGACGGTGGGGTGC TCACCGTGACCCAGGACACG

AGCTTGCAGGATGGCTGCTTGATTTACAACGTCAAGCTGCGCGGGGTGAACTTCCCT AGCAACGGGCCAGTGATGCA

GAAAAAGACGCTGGGTTGGGAGGCCACCACCGAGACCCTGTACCCGGCCGACGGGGG GCTGGAAGGGCGGTGCGATA

TGGCCCTGAAATTGGTCGGCGGCGGTCATTTGCACTGCAATCTCAAGACCACGTACC GCTCCAAGAAACCCGCCAAA

AACCTGAAGATGCCTGGTGTTTATTTTGTCGACCGGCGCCTGGAGCGCATCAAGGAA GCGGACAATGAGACGTACGT

GGAACAGCACGAAGTGGCCGTGGCTCGTTATTGCGATCTGCCGTCGAAGCTGGGTCA CAAACTGAACGGCATGGATG

AGCTGTACAAAGATTATAAGGATGATGACGACAAGTAACTCGAG

pa V2TcR-sfGFP/V2LacI-mCardinal (SEQ ID NO: 67)

ggatccGAGCTGTTGACAACTCTATCATTGATAGAGTTATAATGTTCCCTATCAGTGATA GAGACGCAGTAAGAGAG

GAATGTACATATGTCCAAAGGTGAAGAGCTGTTTACCGGCGTCGTGCCCATTCTGGTGGA GCTGGATGGCGACGTCA

ACGGGCACAAGTTTAGCGTCCGTGGCGAAGGTGAGGGCGACGCCACGAACGGTAAGC TGACGCTGAAATTCATTTGC

ACCACCGGCAAATTGCCTGTACCCTGGCCCACCCTGGTGACCACGCTCACCTACGGCGTA CAGTGCTTCAGCCGTTA

CCCGGACCACATGAAGCGTCACGACTTCTTCAAAAGCGCCATGCCGGAGGGTTACGT GCAGGAGCGTACGATTAGTT

TCAAGGACGACGGCACCTATAAGACCCGTGCCGAAGTGAAGTTCGAAGGCGATACGT TGGTGAACCGTATCGAGTTG

AAGGGTATCGACTTTAAGGAAGACGGCAACATCCTGGGCCATAAGCTGGAGTACAAT TTCAACAGCCATAACGTTTA

CATCACCGCCGATAAACAGAAGAACGGCATTAAAGCCAACTTTAAGATCCGCCACAA CGTCGAAGACGGCTCGGTGC

AGCTGGCCGACCATTATCAGCAAAACACCCCCATCGGTGATGGGCCCGTGCTGCTGC CGGATAACCATTATCTGAGC

ACGCAGTCGGTGCTCAGCAAGGACCCTAACGAAAAGCGCGATCACATGGTGCTGCTG GAGTTCGTCACGGCGGCGGG

GATCACCCATGGGATGGACGAGCTCTACAAAGACTATAAAGATGACGATGACAAGTA AAAGCTTCTCGGTACCAAAT

TCCAGAAAAGAGGCCTCCCGAAAGGGGGGCCTTTTTTCGTTTTGGTCCGAATTCTTG ACAGCTAGCTCAGTCCTAGG

TATAATGCTAGCCGCAGTAAGAGAGGAATGTACACATGTCCCGCCTGGATAAATCGA AAGTGATTAACTCGGCCCTC

GAATTGCTGAATGAAGTCGGTATCGAGGGGCTGACGACCCGTAAATTGGCACAAAAG TTGGGGGTGGAGCAACCCAC

GTTGTATTGGCACGTCAAAAATAAGCGGGCATTGCTGGATGCCCTCGCTATTGAAAT GTTGGATCGCCACCATACCC

ATTTCTGTCCACTGGAGGGCGAGTCCTGGCAGGACTTTCTCCGCAACAACGCGAAAT CCTTTCGCTGTGCACTCTTG

TCCCATCGGGACGGTGCTAAGGTGCACTTGGGCACCCGTCCCACCGAAAAACAATAC GAAACCTTGGAAAATCAATT

GGCGTTTTTGTGCCAGCAAGGGTTTAGCTTGGAGAATGCTCTCTATGCGCTCTCGGC TGTCGGGCACTTTACGTTGG

GGTGCGTGTTGGAGGACCAGGAGCATCAAGTCGCAAAAGAGGAGCGTGAAACCCCAA CCACGGACTCGATGCCACCT

CTGCTCCGCCAAGCTATCGAACTCTTCGATCATCAGGGCGCGGAGCCAGCCTTCCTC TTTGGGCTGGAGCTGATTAT

CTGCGGTTTGGAAAAACAACTCAAGTGTGAAAGCGGGTCCTAACTGCAGTCACTGCC CGCTTTCCAGTCGGGAAACC

TGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTA TTGGGCGCCAGGGTGGTTTT

TCTTTTCACCAGTGAGACGGGCAACAGCTGATTGCCCTTCACCGCCTGGCCCTGAGA GAGTTGCAGCAAGCGGTCCA

CGCTGGTTTGCCCCAGCAGGCGAAAATCCTGTTTGATGGTGGTTAACGGCGGGATAT AACATGAGCTGTCTTCGGTA

TCGTCGTATCCCACTACCGAGATATCCGCACCAACGCGCAGCCCGGACTCGGTAATG GCGCGCATTGCGCCCAGCGC

CATCTGATCGTTGGCAACCAGCATCGCAGTGGGAACGATGCCCTCATTCAGCATTTG CATGGTTTGTTGAAAACCGG

ACATGGCACTCCAGTCGCCTTCCCGTTCCGCTATCGGCTGAATTTGATTGCGAGTGA GATATTTATGCCAGCCAGCC

AGACGCAGACGCGCCGAGACAGAACTTAATGGGCCCGCTAACAGCGCGATTTGCTGG TGACCCAATGCGACCAGATG

CTCCACGCCCAGTCGCGTACCGTCTTCATGGGAGAAAATAATACTGTTGATGGGTGT CTGGTCAGAGACATCAAGAA

ATAACGCCGGAACATTAGTGCAGGCAGCTTCCACAGCAATGGCATCCTGGTCATCCA GCGGATAGTTAATGATCAGC

CCACTGACGCGTTGCGCGAGAAGATTGTGCACCGCCGCTTTACAGGCTTCGACGCCG CTTCGTTCTACCATCGACAC

CACCACGCTGGCACCCAGTTGATCGGCGCGAGATTTAATCGCCGCGACAATTTGCGA CGGCGCGTGCAGGGCCAGAC

TGGAGGTGGCAACGCCAATCAGCAACGACTGTTTGCCCGCCAGTTGTTGTGCCACGC GGTTGGGAATGTAATTCAGC

TCCGCCATCGCCGCTTCCACTTTTTCCCGCGTTTTCGCAGAAACGTGGCTGGCCTGG TTCACCACGCGGGAAACGGT

CTGATAAGAGACACCGGCATACTCTGCGACATCGTATAACGTTACTGGTTTCACATT CACCACCCTGAATTGACTCT

CTTCCGGGCGCTATCATGCCATACCGCGAAAGGTTTTGCGCCATTCGATGGTGTCCG GGATCTCGACGCTCTCCCTT

ATGCGACTCCTGCATTAGGAAGCAGCCCAGTAGTAGGTTGAGGCCGTTGAGCACCGC CGCCGCAAGGAATGGTGCAT

GCAAGGAGATGGCGCCCAACAGTCCCCCGGCCACGGGGagtcaaaagcctccggtcg gaggcttttgactTCTAGAG

AGCTGTTGACACTTTATGCTTCCGGCTCGTATAATGTGTGTGGAATTGTGAGCGGAT AACAACGCAGTAAGAGAGGA

ATGTACCCATGGTGAGTAAGGGTGAGGAGCTCATTAAGGAGAACATGCACATGAAGC TGTATATGGAGGGCACCGTA

AACAACCACCACTTCAAGTGTACCACCGAGGGTGAAGGTAAACCCTACGAGGGGACG CAGACCCAACGCATCAAGGT

CGTGGAGGGCGGCCCGCTGCCTTTCGCATTCGACATTCTGGCGACCTGTTTTATGTA CGGCTCGAAGACCTTCATCA

ACCACACCCAAGGCATCCCGGACTTCTTCAAGCAGAGCTTCCCTGAGGGCTTCACCT GGGAGCGCGTCACCACGTAT

GAAGACGGTGGGGTGCTCACCGTGACCCAGGACACGAGCTTGCAGGATGGCTGCTTG ATTTACAACGTCAAGCTGCG

CGGGGTGAACTTCCCTAGCAACGGGCCAGTGATGCAGAAAAAGACGCTGGGTTGGGAGGC CACCACCGAGACCCTGT

ACCCGGCCGACGGGGGGCTGGAAGGGCGGTGCGATATGGCCCTGAAATTGGTCGGCG GCGGTCATTTGCACTGCAAT CTCAAGACCACGTACCGCTCCAAGAAACCCGCCAAAAACCTGAAGATGCCTGGTGTTTAT TTTGTCGACCGGCGCCT GGAGCGCATCAAGGAAGCGGACAATGAGACGTACGTGGAACAGCACGAAGTGGCCGTGGC TCGTTATTGCGATCTGC CGTCGAAGCTGGGTCACAAACTGAACGGCATGGATGAGCTGTACAAAGATTATAAGGATG ATGACGACAAGTAACTC GAG pa V2TcR- sfGFP /V3LacI- mCardinal (SEQ ID NO: 68) ggatccGAGCTGTTGACAACTCTATCATTGATAGAGTTATAATGTTCCCTATCAGTGATA GAGACGCAGTAAGAGAG GAATGTACATATGTCCAAAGGTGAAGAGCTGTTTACCGGCGTCGTGCCCATTCTGGTGGA GCTGGATGGCGACGTCA ACGGGCACAAGTTTAGCGTCCGTGGCGAAGGTGAGGGCGACGCCACGAACGGTAAGCTGA CGCTGAAATTCATTTGC ACCACCGGCAAATTGCCTGTACCCTGGCCCACCCTGGTGACCACGCTCACCTACGGCGTA CAGTGCTTCAGCCGTTA CCCGGACCACATGAAGCGTCACGACTTCTTCAAAAGCGCCATGCCGGAGGGTTACGTGCA GGAGCGTACGATTAGTT TCAAGGACGACGGCACCTATAAGACCCGTGCCGAAGTGAAGTTCGAAGGCGATACGTTGG TGAACCGTATCGAGTTG AAGGGTATCGACTTTAAGGAAGACGGCAACATCCTGGGCCATAAGCTGGAGTACAATTTC AACAGCCATAACGTTTA CATCACCGCCGATAAACAGAAGAACGGCATTAAAGCCAACTTTAAGATCCGCCACAACGT CGAAGACGGCTCGGTGC AGCTGGCCGACCATTATCAGCAAAACACCCCCATCGGTGATGGGCCCGTGCTGCTGCCGG ATAACCATTATCTGAGC ACGCAGTCGGTGCTCAGCAAGGACCCTAACGAAAAGCGCGATCACATGGTGCTGCTGGAG TTCGTCACGGCGGCGGG GAT CACC CAT GGGAT GGAC GAGCT CTACAAAGACTATAAAGAT GAC GAT GACAAGTAAAAGCT T CT C GGTACCAAAT TCCAGAAAAGAGGCCTCCCGAAAGGGGGGCCTTTTTTCGTTTTGGTCCGAATTCTTGACA GCTAGCTCAGTCCTAGG TATAATGCTAGCCGCAGTAAGAGAGGAATGTACACATGTCCCGCCTGGATAAATCGAAAG TGATTAACTCGGCCCTC GAATTGCTGAATGAAGTCGGTATCGAGGGGCTGACGACCCGTAAATTGGCACAAAAGTTG GGGGTGGAGCAACCCAC GTTGTATTGGCACGTCAAAAATAAGCGGGCATTGCTGGATGCCCTCGCTATTGAAATGTT GGATCGCCACCATACCC ATTTCTGTCCACTGGAGGGCGAGTCCTGGCAGGACTTTCTCCGCAACAACGCGAAATCCT TTCGCTGTGCACTCTTG TCCCATCGGGACGGTGCTAAGGTGCACTTGGGCACCCGTCCCACCGAAAAACAATACGAA ACCTTGGAAAATCAATT GGCGTTTTTGTGCCAGCAAGGGTTTAGCTTGGAGAATGCTCTCTATGCGCTCTCGGCTGT CGGGCACTTTACGTTGG GGTGCGTGTTGGAGGACCAGGAGCATCAAGTCGCAAAAGAGGAGCGTGAAACCCCAACCA CGGACTCGATGCCACCT CTGCTCCGCCAAGCTATCGAACTCTTCGATCATCAGGGCGCGGAGCCAGCCTTCCTCTTT GGGCTGGAGCTGATTAT 
CTGCGGTTTGGAAAAACAACTCAAGTGTGAAAGCGGGTCCTAACTGCAGTCACTGCCCGC TTTCCAGTCGGGAAACC TGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTG GGCGCCAGGGTGGTTTT TCTTTTCACCAGTGAGACGGGCAACAGCTGATTGCCCTTCACCGCCTGGCCCTGAGAGAG TTGCAGCAAGCGGTCCA CGCTGGTTTGCCCCAGCAGGCGAAAATCCTGTTTGATGGTGGTTAACGGCGGGATATAAC ATGAGCTGTCTTCGGTA

TCGTCGTATCCCACTACCGAGATATCCGCACCAACGCGCAGCCCGGACTCGGTAATG GCGCGCATTGCGCCCAGCGC CATCTGATCGTTGGCAACCAGCATCGCAGTGGGAACGATGCCCTCATTCAGCATTTGCAT GGTTTGTTGAAAACCGG ACATGGCACTCCAGTCGCCTTCCCGTTCCGCTATCGGCTGAATTTGATTGCGAGTGAGAT ATTTATGCCAGCCAGCC AGACGCAGACGCGCCGAGACAGAACTTAATGGGCCCGCTAACAGCGCGATTTGCTGGTGA CCCAATGCGACCAGATG CTCCACGCCCAGTCGCGTACCGTCTTCATGGGAGAAAATAATACTGTTGATGGGTGTCTG GTCAGAGACATCAAGAA ATAACGCCGGAACATTAGTGCAGGCAGCTTCCACAGCAATGGCATCCTGGTCATCCAGCG GATAGTTAATGATCAGC CCACTGACGCGTTGCGCGAGAAGATTGTGCACCGCCGCTTTACAGGCTTCGACGCCGCTT CGTTCTACCATCGACAC CACCACGCTGGCACCCAGTTGATCGGCGCGAGATTTAATCGCCGCGACAATTTGCGACGG CGCGTGCAGGGCCAGAC TGGAGGTGGCAACGCCAATCAGCAACGACTGTTTGCCCGCCAGTTGTTGTGCCACGCGGT TGGGAATGTAATTCAGC TCCGCCATCGCCGCTTCCACTTTTTCCCGCGTTTTCGCAGAAACGTGGCTGGCCTGGTTC ACCACGCGGGAAACGGT

CTGATAAGAGACACCGGCATACTCTGCGACATCGTATAACGTTACTGGTTTCACATT CACCACCCTGAATTGACTCT CTTCCGGGCGCTATCATGCCATACCGCGAAAGGTTTTGCGCCATTCGATGGTGTCCGGGA TCTCGACGCTCTCCCTT ATGCGACTCCTGCATTAGGAAGCAGCCCAGTAGTAGGTTGAGGCCGTTGAGCACCGCCGC CGCAAGGAATGGTGCAT GCAAGGAGATGGCGCCCAACAGTCCCCCGGCCACGGGGagtcaaaagcctccggtcggag gcttttgactTCTAGAG AGCTGTTGACACTTTATGCTTCCGGCTCGTATAATGTGTGTGGAATTGTGAGCGGATAAC AAGTGGAATTGTGAGCG GATAACAATTT CACACAGGAAACAGAAT CCCATGGT GAGTAAGGGT GAGGAGCTCATTAAGGAGAACAT GCACAT GA AGCTGTATATGGAGGGCACCGTAAACAACCACCACTTCAAGTGTACCACCGAGGGTGAAG GTAAACCCTACGAGGGG ACGCAGACCCAACGCATCAAGGTCGTGGAGGGCGGCCCGCTGCCTTTCGCATTCGACATT CTGGCGACCTGTTTTAT GTACGGCTCGAAGACCTTCATCAACCACACCCAAGGCATCCCGGACTTCTTCAAGCAGAG CTTCCCTGAGGGCTTCA CCTGGGAGCGCGTCACCACGTATGAAGACGGTGGGGTGCTCACCGTGACCCAGGACACGA GCTTGCAGGATGGCTGC TTGATTTACAACGTCAAGCTGCGCGGGGTGAACTTCCCTAGCAACGGGCCAGTGATGCAG AAAAAGACGCTGGGTTG GGAGGCCACCACCGAGACCCTGTACCCGGCCGACGGGGGGCTGGAAGGGCGGTGCGATAT GGCCCTGAAATTGGTCG GCGGCGGTCATTTGCACTGCAATCTCAAGACCACGTACCGCTCCAAGAAACCCGCCAAAA ACCTGAAGATGCCTGGT GTTTATTTTGTCGACCGGCGCCTGGAGCGCATCAAGGAAGCGGACAATGAGACGTACGTG GAACAGCACGAAGTGGC CGTGGCTCGTTATTGCGATCTGCCGTCGAAGCTGGGTCACAAACTGAACGGCATGGATGA GCTGTACAAAGATTATA AGGAT GAT GAG GACAAGTAACT CGAG pUC19-crtEBI (SEQ ID NO: 69)

CATATGTATCCGTTTATAAGGACAGCCCGAATGACGGTCTGCGCAAAAAAACACGTT CATCTCACTCGCGATGCTGC GGAGCAGTTACTGGCTGATATTGATCGACGCCTTGATCAGTTATTGCCCGTGGAGGGAGA ACGGGATGTTGTGGGTG CCGCGATGCGTGAAGGTGCGCTGGCACCGGGAAAACGTATTCGCCCCATGTTGCTGTTGC TGACCGCCCGCGATCTG GGTTGCGCTGTCAGCCATGACGGATTACTGGATTTGGCCTGTGCGGTGGAAATGGTCCAC GCGGCTTCGCTGATCCT TGACGATATGCCCTGCATGGACGATGCGAAGCTGCGGCGCGGACGCCCTACCATTCATTC TCATTACGGAGAGCATG TGGCAATACTGGCGGCGGTTGCCTTGCTGAGTAAAGCCTTTGGCGTAATTGCCGATGCAG ATGGCCTCACGCCGCTG GCAAAAAATCGGGCGGTTTCTGAACTGTCAAACGCCATCGGCATGCAAGGATTGGTTCAG GGTCAGTTCAAGGATCT GTCTGAAGGGGATAAGCCGCGCAGCGCTGAAGCTATTTTGATGACGAATCACTTTAAAAC CAGCACGCTGTTTTGTG CCTCCATGCAGATGGCCTCGATTGTTGCGAATGCCTCCAGCGAAGCGCGTGATTGCCTGC ATCGTTTTTCACTTGAT CTTGGTCAGGCATTTCAACTGCTGGACGATTTGACCGATGGCATGACCGACACCGGTAAG GATAGCAATCAGGACGC CGGTAAATCGACGCTGGTCAATCTGTTAGGCCCGAGGGCGGTTGAAGAACGTCTGAGACA ACATCTTCAGCTTGCCA GTGAGCATCTCTCTGCGGCCTGCCAACACGGGCACGCCACTCAACATTTTATTCAGGCCT GGTTTGACAAAAAACTC GCTGCCGTCAGTTAACGCAGTAAGAGAGGAATGTAGATATGAATAATCCGTCGTTACTCA ATCATGCGGTCGAAACG ATGGCAGTTGGCTCGAAAAGTTTTGCGACAGCCTCAAAGTTATTTGATGCAAAAACCCGG CGCAGCGTACTGATGCT CTACGCCTGGTGCCGCCATTGTGACGATGTTATTGACGATCAGACGCTGGGCTTTCAGGC CCGGCAGCCTGCCTTAC AAACGCCCGAACAACGTCTGATGCAACTTGAGATGAAAACGCGCCAGGCCTATGCAGGAT CGCAGATGCACGAACCG GCGTTTGCGGCTTTTCAGGAAGTGGCTATGGCTCATGATATCGCCCCGGCTTACGCGTTT GATCATCTGGAAGGCTT CGCCATGGATGTACGCGAAGCGCAATACAGCCAACTGGATGATACGCTGCGCTATTGCTA TCACGTTGCAGGCGTTG TCGGCTTGATGATGGCGCAAATCATGGGCGTGCGGGATAACGCCACGCTGGACCGCGCCT GTGACCTTGGGCTGGCA TTTCAGTTGACCAATATTGCTCGCGATATTGTGGACGATGCGCATGCGGGCCGCTGTTAT CTGCCGGCAAGCTGGCT GGAGCATGAAGGTCTGAACAAAGAGAATTATGCGGCACCTGAAAACCGTCAGGCGCTGAG CCGTATCGCCCGTCGTT TGGTGCAGGAAGCAGAACCTTACTATTTGTCTGCCACAGCCGGCCTGGCAGGGTTGCCCC TGCGTTCCGCCTGGGCA ATCGCTACGGCGAAGCAGGTTTACCGGAAAATAGGTGTCAAAGTTGAACAGGCCGGTCAG CAAGCCTGGGATCAGCG

GCAGTCAACGACCACGCCCGAAAAATTAACGCTGCTGCTGGCCGCCTCTGGTCAGGC CCTTACTTCCCGGATGCGGG

CTCATCCTCCCCGCCCTGCGCATCTCTGGCAGCGCCCGCTCTGAAATAATTTTGTTT AACTTTAAGAAGGAGATATA

ATGAAACCAACTACGGTAATTGGTGCAGGCTTCGGTGGCCTGGCACTGGCAATTCGT CTACAAGCTGCGGGGATTCC

CGTCTTACTGCTTGAACAACGTGATAAACCCGGCGGTCGGGCTTATGTCTACGAGGA TCAGGGGTTTACCTTTGATG

CAGGCCCGACGGTTATCACCGATCCCAGTGCCATTGAAGAACTGTTTGCACTGGCAG GAAAACAGTTAAAAGAGTAT

GTCGAACTGCTGCCGGTTACGCCGTTTTACCGCCTGTGTTGGGAGTCAGGGAAGGTC TTTAATTACGATAACGATCA

AACCCGGCTCGAAGCGCAGATTCAGCAGTTTAATCCCCGCGATGTCGAAGGTTATCG TCAGTTTCTGGACTATTCAC

GCGCGGTGTTTAAAGAAGGCTATCTAAAGCTCGGTACTGTCCCTTTTTTATCGTTCA GAGACATGCTTCGCGCCGCA

CCTCAACTGGCGAAACTGCAGGCATGGAGAAGCGTTTACAGTAAGGTTGCCAGTTAC ATCGAAGATGAACATCTGCG

CCAGGCGTTTTCTTTCCACTCGCTGTTGGTGGGCGGCAATCCCTTCGCCACCTCATC CATTTATACGTTGATACACG

CGCTGGAGCGTGAGTGGGGCGTCTGGTTTCCGCGTGGCGGCACCGGCGCATTAGTTC AGGGGATGATAAAGCTGTTT

CAGGATCTGGGTGGCGAAGTCGTGTTAAACGCCAGAGTCAGCCACATGGAAACGACA GGAAACAAGATTGAAGCCGT

GCATTTAGAGGACGGTCGCAGGTTCCTGACGCAAGCCGTCGCGTCAAATGCAGATGT GGTTCATACCTATCGCGACC

TGTTAAGCCAGCACCCTGCCGCGGTTAAGCAGTCCAACAAACTGCAGACTAAGCGCA TGAGTAACTCTCTGTTTGTG

CTCTATTTTGGTTTGAATCACCATCATGATCAGCTCGCGCATCACACGGTTTGTTTC GGCCCGCGTTACCGCGAGCT

GATTGACGAAATTTTTAATCATGATGGCCTCGCAGAGGACTTCTCACTTTATCTGCA CGCGCCCTGTGTCACGGATT

CGTCACTGGCGCCTGAAGGTTGCGGCAGTTACTATGTGTTGGCGCCGGTGCCGCATT TAGGCACCGCGAACCTCGAC

TGGACGGTTGAGGGGCCAAAACTACGCGACCGTATTTTTGCGTACCTTGAGCAGCAT TACATGCCTGGCTTACGGAG

TCAGCTGGTCACGCACCGGATGTTTACGCCGTTTGATTTTCGCGACCAGCTTAATGC CTATCATGGCTCAGCCTTTT

CTGTGGAGCCCGTTCTTACCCAGAGCGCCTGGTTTCGGCCGCATAACCGCGATAAAA CCATTACTAATCTCTACCTG

GTCGGCGCAGGCACGCATCCCGGCGCAGGCATTCCTGGCGTCATCGGCTCGGCAAAA GCGACAGCAGGTTTGATGCT

GGAGGATCTGATTTGA

pa V2TcR-crtEBI (SEQ ID NO: 70)

GGATCCGAGCTGTTGACAACTCTATCATTGATAGAGTTATAATGTTCCCTATCAGTG ATAGAGACGCAGTAAGAGAG

GAATGTACATATGTATCCGTTTATAAGGACAGCCCGAATGACGGTCTGCGCAAAAAA ACACGTTCATCTCACTCGCG

ATGCTGCGGAGCAGTTACTGGCTGATATTGATCGACGCCTTGATCAGTTATTGCCCG TGGAGGGAGAACGGGATGTT

GTGGGTGCCGCGATGCGTGAAGGTGCGCTGGCACCGGGAAAACGTATTCGCCCCATG TTGCTGTTGCTGACCGCCCG

CGATCTGGGTTGCGCTGTCAGCCATGACGGATTACTGGATTTGGCCTGTGCGGTGGA AATGGTCCACGCGGCTTCGC

TGATCCTTGACGATATGCCCTGCATGGACGATGCGAAGCTGCGGCGCGGACGCCCTA CCATTCATTCTCATTACGGA

GAGCATGTGGCAATACTGGCGGCGGTTGCCTTGCTGAGTAAAGCCTTTGGCGTAATT GCCGATGCAGATGGCCTCAC

GCCGCTGGCAAAAAATCGGGCGGTTTCTGAACTGTCAAACGCCATCGGCATGCAAGG ATTGGTTCAGGGTCAGTTCA

AGGATCTGTCTGAAGGGGATAAGCCGCGCAGCGCTGAAGCTATTTTGATGACGAATC ACTTTAAAACCAGCACGCTG

TTTTGTGCCTCCATGCAGATGGCCTCGATTGTTGCGAATGCCTCCAGCGAAGCGCGT GATTGCCTGCATCGTTTTTC

ACTTGATCTTGGTCAGGCATTTCAACTGCTGGACGATTTGACCGATGGCATGACCGA CACCGGTAAGGATAGCAATC

AGGACGCCGGTAAATCGACGCTGGTCAATCTGTTAGGCCCGAGGGCGGTTGAAGAAC GTCTGAGACAACATCTTCAG

CTTGCCAGTGAGCATCTCTCTGCGGCCTGCCAACACGGGCACGCCACTCAACATTTT ATTCAGGCCTGGTTTGACAA

AAAACTCGCTGCCGTCAGTTAACGCAGTAAGAGAGGAATGTAGATATGAATAATCCG TCGTTACTCAATCATGCGGT

CGAAACGATGGCAGTTGGCTCGAAAAGTTTTGCGACAGCCTCAAAGTTATTTGATGC AAAAACCCGGCGCAGCGTAC

TGATGCTCTACGCCTGGTGCCGCCATTGTGACGATGTTATTGACGATCAGACGCTGG GCTTTCAGGCCCGGCAGCCT

GCCTTACAAACGCCCGAACAACGTCTGATGCAACTTGAGATGAAAACGCGCCAGGCCTAT GCAGGATCGCAGATGCA

CGAACCGGCGTTTGCGGCTTTTCAGGAAGTGGCTATGGCTCATGATATCGCCCCGGC TTACGCGTTTGATCATCTGG

AAGGCTTCGCCATGGATGTACGCGAAGCGCAATACAGCCAACTGGATGATACGCTGC GCTATTGCTATCACGTTGCA

GGCGTTGTCGGCTTGATGATGGCGCAAATCATGGGCGTGCGGGATAACGCCACGCTG GACCGCGCCTGTGACCTTGG

GCTGGCATTTCAGTTGACCAATATTGCTCGCGATATTGTGGACGATGCGCATGCGGG CCGCTGTTATCTGCCGGCAA

GCTGGCTGGAGCATGAAGGTCTGAACAAAGAGAATTATGCGGCACCTGAAAACCGTC AGGCGCTGAGCCGTATCGCC

CGTCGTTTGGTGCAGGAAGCAGAACCTTACTATTTGTCTGCCACAGCCGGCCTGGCA GGGTTGCCCCTGCGTTCCGC

CTGGGCAATCGCTACGGCGAAGCAGGTTTACCGGAAAATAGGTGTCAAAGTTGAACA GGCCGGTCAGCAAGCCTGGG

ATCAGCGGCAGTCAACGACCACGCCCGAAAAATTAACGCTGCTGCTGGCCGCCTCTG GTCAGGCCCTTACTTCCCGG

ATGCGGGCTCATCCTCCCCGCCCTGCGCATCTCTGGCAGCGCCCGCTCTGAAATAAT TTTGTTTAACTTTAAGAAGG

AGATATAATGAAACCAACTACGGTAATTGGTGCAGGCTTCGGTGGCCTGGCACTGGC AATTCGTCTACAAGCTGCGG

GGATTCCCGTCTTACTGCTTGAACAACGTGATAAACCCGGCGGTCGGGCTTATGTCT ACGAGGATCAGGGGTTTACC

TTTGATGCAGGCCCGACGGTTATCACCGATCCCAGTGCCATTGAAGAACTGTTTGCA CTGGCAGGAAAACAGTTAAA

AGAGTATGTCGAACTGCTGCCGGTTACGCCGTTTTACCGCCTGTGTTGGGAGTCAGG GAAGGTCTTTAATTACGATA

ACGATCAAACCCGGCTCGAAGCGCAGATTCAGCAGTTTAATCCCCGCGATGTCGAAG GTTATCGTCAGTTTCTGGAC

TATTCACGCGCGGTGTTTAAAGAAGGCTATCTAAAGCTCGGTACTGTCCCTTTTTTA TCGTTCAGAGACATGCTTCG

CGCCGCACCTCAACTGGCGAAACTGCAGGCATGGAGAAGCGTTTACAGTAAGGTTGC CAGTTACATCGAAGATGAAC

ATCTGCGCCAGGCGTTTTCTTTCCACTCGCTGTTGGTGGGCGGCAATCCCTTCGCCA CCTCATCCATTTATACGTTG

ATACACGCGCTGGAGCGTGAGTGGGGCGTCTGGTTTCCGCGTGGCGGCACCGGCGCA TTAGTTCAGGGGATGATAAA

GCTGTTTCAGGATCTGGGTGGCGAAGTCGTGTTAAACGCCAGAGTCAGCCACATGGA AACGACAGGAAACAAGATTG

AAGCCGTGCATTTAGAGGACGGTCGCAGGTTCCTGACGCAAGCCGTCGCGTCAAATG CAGATGTGGTTCATACCTAT

CGCGACCTGTTAAGCCAGCACCCTGCCGCGGTTAAGCAGTCCAACAAACTGCAGACT AAGCGCATGAGTAACTCTCT

GTTTGTGCTCTATTTTGGTTTGAATCACCATCATGATCAGCTCGCGCATCACACGGT TTGTTTCGGCCCGCGTTACC

GCGAGCTGATTGACGAAATTTTTAATCATGATGGCCTCGCAGAGGACTTCTCACTTT ATCTGCACGCGCCCTGTGTC

ACGGATTCGTCACTGGCGCCTGAAGGTTGCGGCAGTTACTATGTGTTGGCGCCGGTG CCGCATTTAGGCACCGCGAA

CCTCGACTGGACGGTTGAGGGGCCAAAACTACGCGACCGTATTTTTGCGTACCTTGA GCAGCATTACATGCCTGGCT

TACGGAGTCAGCTGGTCACGCACCGGATGTTTACGCCGTTTGATTTTCGCGACCAGC TTAATGCCTATCATGGCTCA

GCCTTTTCTGTGGAGCCCGTTCTTACCCAGAGCGCCTGGTTTCGGCCGCATAACCGC GATAAAACCATTACTAATCT

CTACCTGGTCGGCGCAGGCACGCATCCCGGCGCAGGCATTCCTGGCGTCATCGGCTC GGCAAAAGCGACAGCAGGTT

TGATGCTGGAGGATCTGATTTGAAAGCTTCTCGGTACCAAATTCCAGAAAAGAGGCC TCCCGAAAGGGGGGCCTTTT

TTCGTTTTGGTCCGAATTCTTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGCCGC AGTAAGAGAGGAATGTACAC

ATGTCCCGCCTGGATAAATCGAAAGTGATTAACTCGGCCCTCGAATTGCTGAATGAA GTCGGTATCGAGGGGCTGAC

GACCCGTAAATTGGCACAAAAGTTGGGGGTGGAGCAACCCACGTTGTATTGGCACGT CAAAAATAAGCGGGCATTGC

TGGATGCCCTCGCTATTGAAATGTTGGATCGCCACCATACCCATTTCTGTCCACTGG AGGGCGAGTCCTGGCAGGAC

TTTCTCCGCAACAACGCGAAATCCTTTCGCTGTGCACTCTTGTCCCATCGGGACGGT GCTAAGGTGCACTTGGGCAC

CCGTCCCACCGAAAAACAATACGAAACCTTGGAAAATCAATTGGCGTTTTTGTGCCA GCAAGGGTTTAGCTTGGAGA

ATGCTCTCTATGCGCTCTCGGCTGTCGGGCACTTTACGTTGGGGTGCGTGTTGGAGG ACCAGGAGCATCAAGTCGCA

AAAGAGGAGCGTGAAACCCCAACCACGGACTCGATGCCACCTCTGCTCCGCCAAGCT ATCGAACTCTTCGATCATCA

GGGCGCGGAGCCAGCCTTCCTCTTTGGGCTGGAGCTGATTATCTGCGGTTTGGAAAA ACAACTCAAGTGTGAAAGCG

GGTCCTAACTGCAGCTCGAG

pa V2TcR-crtEBIY (SEQ ID NO: 71)

GGATCCGAGCTGTTGACAACTCTATCATTGATAGAGTTATAATGTTCCCTATCAGTG ATAGAGACGCAGTAAGAGAG

GAATGTACATATGTATCCGTTTATAAGGACAGCCCGAATGACGGTCTGCGCAAAAAA ACACGTTCATCTCACTCGCG

ATGCTGCGGAGCAGTTACTGGCTGATATTGATCGACGCCTTGATCAGTTATTGCCCG TGGAGGGAGAACGGGATGTT

GTGGGTGCCGCGATGCGTGAAGGTGCGCTGGCACCGGGAAAACGTATTCGCCCCATG TTGCTGTTGCTGACCGCCCG

CGATCTGGGTTGCGCTGTCAGCCATGACGGATTACTGGATTTGGCCTGTGCGGTGGA AATGGTCCACGCGGCTTCGC

TGATCCTTGACGATATGCCCTGCATGGACGATGCGAAGCTGCGGCGCGGACGCCCTA CCATTCATTCTCATTACGGA

GAGCATGTGGCAATACTGGCGGCGGTTGCCTTGCTGAGTAAAGCCTTTGGCGTAATT GCCGATGCAGATGGCCTCAC

GCCGCTGGCAAAAAATCGGGCGGTTTCTGAACTGTCAAACGCCATCGGCATGCAAGG ATTGGTTCAGGGTCAGTTCA

AGGATCTGTCTGAAGGGGATAAGCCGCGCAGCGCTGAAGCTATTTTGATGACGAATC ACTTTAAAACCAGCACGCTG

TTTTGTGCCTCCATGCAGATGGCCTCGATTGTTGCGAATGCCTCCAGCGAAGCGCGT GATTGCCTGCATCGTTTTTC

ACTTGATCTTGGTCAGGCATTTCAACTGCTGGACGATTTGACCGATGGCATGACCGA CACCGGTAAGGATAGCAATC

AGGACGCCGGTAAATCGACGCTGGTCAATCTGTTAGGCCCGAGGGCGGTTGAAGAAC GTCTGAGACAACATCTTCAG

CTTGCCAGTGAGCATCTCTCTGCGGCCTGCCAACACGGGCACGCCACTCAACATTTT ATTCAGGCCTGGTTTGACAA

AAAACTCGCTGCCGTCAGTTAACGCAGTAAGAGAGGAATGTAGATATGAATAATCCG TCGTTACTCAATCATGCGGT

CGAAACGATGGCAGTTGGCTCGAAAAGTTTTGCGACAGCCTCAAAGTTATTTGATGC AAAAACCCGGCGCAGCGTAC

TGATGCTCTACGCCTGGTGCCGCCATTGTGACGATGTTATTGACGATCAGACGCTGG GCTTTCAGGCCCGGCAGCCT

GCCTTACAAACGCCCGAACAACGTCTGATGCAACTTGAGATGAAAACGCGCCAGGCC TATGCAGGATCGCAGATGCA

CGAACCGGCGTTTGCGGCTTTTCAGGAAGTGGCTATGGCTCATGATATCGCCCCGGC TTACGCGTTTGATCATCTGG

AAGGCTTCGCCATGGATGTACGCGAAGCGCAATACAGCCAACTGGATGATACGCTGC GCTATTGCTATCACGTTGCA

GGCGTTGTCGGCTTGATGATGGCGCAAATCATGGGCGTGCGGGATAACGCCACGCTG GACCGCGCCTGTGACCTTGG

GCTGGCATTTCAGTTGACCAATATTGCTCGCGATATTGTGGACGATGCGCATGCGGG CCGCTGTTATCTGCCGGCAA

GCTGGCTGGAGCATGAAGGTCTGAACAAAGAGAATTATGCGGCACCTGAAAACCGTC AGGCGCTGAGCCGTATCGCC

CGTCGTTTGGTGCAGGAAGCAGAACCTTACTATTTGTCTGCCACAGCCGGCCTGGCA GGGTTGCCCCTGCGTTCCGC

CTGGGCAATCGCTACGGCGAAGCAGGTTTACCGGAAAATAGGTGTCAAAGTTGAACA GGCCGGTCAGCAAGCCTGGG

ATCAGCGGCAGTCAACGACCACGCCCGAAAAATTAACGCTGCTGCTGGCCGCCTCTG GTCAGGCCCTTACTTCCCGG

ATGCGGGCTCATCCTCCCCGCCCTGCGCATCTCTGGCAGCGCCCGCTCTGAAATAAT TTTGTTTAACTTTAAGAAGG

AGATATAATGAAACCAACTACGGTAATTGGTGCAGGCTTCGGTGGCCTGGCACTGGC AATTCGTCTACAAGCTGCGG

GGATTCCCGTCTTACTGCTTGAACAACGTGATAAACCCGGCGGTCGGGCTTATGTCT ACGAGGATCAGGGGTTTACC

TTTGATGCAGGCCCGACGGTTATCACCGATCCCAGTGCCATTGAAGAACTGTTTGCA CTGGCAGGAAAACAGTTAAA

AGAGTATGTCGAACTGCTGCCGGTTACGCCGTTTTACCGCCTGTGTTGGGAGTCAGG GAAGGTCTTTAATTACGATA

ACGATCAAACCCGGCTCGAAGCGCAGATTCAGCAGTTTAATCCCCGCGATGTCGAAG GTTATCGTCAGTTTCTGGAC

TATTCACGCGCGGTGTTTAAAGAAGGCTATCTAAAGCTCGGTACTGTCCCTTTTTTA TCGTTCAGAGACATGCTTCG

CGCCGCACCTCAACTGGCGAAACTGCAGGCATGGAGAAGCGTTTACAGTAAGGTTGC CAGTTACATCGAAGATGAAC

ATCTGCGCCAGGCGTTTTCTTTCCACTCGCTGTTGGTGGGCGGCAATCCCTTCGCCA CCTCATCCATTTATACGTTG

ATACACGCGCTGGAGCGTGAGTGGGGCGTCTGGTTTCCGCGTGGCGGCACCGGCGCA TTAGTTCAGGGGATGATAAA

GCTGTTTCAGGATCTGGGTGGCGAAGTCGTGTTAAACGCCAGAGTCAGCCACATGGA AACGACAGGAAACAAGATTG

AAGCCGTGCATTTAGAGGACGGTCGCAGGTTCCTGACGCAAGCCGTCGCGTCAAATG CAGATGTGGTTCATACCTAT

CGCGACCTGTTAAGCCAGCACCCTGCCGCGGTTAAGCAGTCCAACAAACTGCAGACT AAGCGCATGAGTAACTCTCT

GTTTGTGCTCTATTTTGGTTTGAATCACCATCATGATCAGCTCGCGCATCACACGGT TTGTTTCGGCCCGCGTTACC

GCGAGCTGATTGACGAAATTTTTAATCATGATGGCCTCGCAGAGGACTTCTCACTTT ATCTGCACGCGCCCTGTGTC

ACGGATTCGTCACTGGCGCCTGAAGGTTGCGGCAGTTACTATGTGTTGGCGCCGGTGCCG CATTTAGGCACCGCGAA

CCTCGACTGGACGGTTGAGGGGCCAAAACTACGCGACCGTATTTTTGCGTACCTTGA GCAGCATTACATGCCTGGCT

TACGGAGTCAGCTGGTCACGCACCGGATGTTTACGCCGTTTGATTTTCGCGACCAGC TTAATGCCTATCATGGCTCA

GCCTTTTCTGTGGAGCCCGTTCTTACCCAGAGCGCCTGGTTTCGGCCGCATAACCGC GATAAAACCATTACTAATCT

CTACCTGGTCGGCGCAGGCACGCATCCCGGCGCAGGCATTCCTGGCGTCATCGGCTC GGCAAAAGCGACAGCAGGTT

TGATGCTGGAGGATCTGATTTGACGCAGTAAGAGAGGAATGTAGATATGGGAGCGGC TATGCAACCGCATTATGATC

TGATTCTCGTGGGGGCTGGACTCGCGAATGGCCTTATCGCCCTGCGTCTCCAGCAGC AGCAACCTGATATGCGTATT

TTGCTTATCGACGCCGCACCCCAGGCGGGCGGGAATCATACGTGGTCATTTCACCAC GATGATTTGACTGAGAGCCA

ACATCGTTGGATAGCTCCGCTGGTGGTTCATCACTGGCCCGACTATCAGGTACGCTT TCCCACACGCCGTCGTAAGC

TGAACAGCGGCTACTTTTGTATTACTTCTCAGCGTTTCGCTGAGGTTTTACAGCGAC AGTTTGGCCCGCACTTGTGG

ATGGATACCGCGGTCGCAGAGGTTAATGCGGAATCTGTTCGGTTGAAAAAGGGTCAG GTTATCGGTGCCCGCGCGGT

GATTGACGGGCGGGGTTATGCGGCAAATTCAGCACTGAGCGTGGGCTTCCAGGCGTT TATTGGCCAGGAATGGCGAT

TGAGCCACCCGCATGGTTTATCGTCTCCCATTATCATGGATGCCACGGTCGATCAGC AAAATGGTTATCGCTTCGTG

TACAGCCTGCCGCTCTCGCCGACCAGATTGTTAATTGAAGATACGCACTATATTGAT AATGCGACATTAGATCCTGA

ATGCGCGCGGCAAAATATTTGCGACTATGCCGCGCAACAGGGTTGGCAGCTTCAGAC ACTGCTGCGAGAAGAACAGG

GCGCCTTACCCATTACTCTGTCGGGCAATGCCGACGCATTCTGGCAGCAGCGCCCCC TGGCCTGTAGTGGATTACGT

GCCGGTCTGTTCCATCCTACCACCGGCTATTCACTGCCGCTGGCGGTTGCCGTGGCC GACCGCCTGAGTGCACTTGA

TGTCTTTACGTCGGCCTCAATTCACCATGCCATTACGCATTTTGCCCGCGAGCGCTG GCAGCAGCAGGGCTTTTTCC

GCATGCTGAATCGCATGCTGTTTTTAGCCGGACCCGCCGATTCACGCTGGCGGGTTA TGCAGCGTTTTTATGGTTTA

CCTGAAGATTTAATTGCCCGTTTTTATGCGGGAAAACTCACGCTGACCGATCGGCTA CGTATTCTGAGCGGCAAGCC

GCCTGTTCCGGTATTAGCAGCATTGCAAGCCATTATGACGACTCATCGTTGAAAGCT TCTCGGTACCAAATTCCAGA

AAAGAGGCCTCCCGAAAGGGGGGCCTTTTTTCGTTTTGGTCCGAATTCTTGACAGCT AGCTCAGTCCTAGGTATAAT

GCTAGCCGCAGTAAGAGAGGAATGTACACATGTCCCGCCTGGATAAATCGAAAGTGA TTAACTCGGCCCTCGAATTG

CTGAATGAAGTCGGTATCGAGGGGCTGACGACCCGTAAATTGGCACAAAAGTTGGGG GTGGAGCAACCCACGTTGTA

TTGGCACGTCAAAAATAAGCGGGCATTGCTGGATGCCCTCGCTATTGAAATGTTGGA TCGCCACCATACCCATTTCT

GTCCACTGGAGGGCGAGTCCTGGCAGGACTTTCTCCGCAACAACGCGAAATCCTTTC GCTGTGCACTCTTGTCCCAT

CGGGACGGTGCTAAGGTGCACTTGGGCACCCGTCCCACCGAAAAACAATACGAAACC TTGGAAAATCAATTGGCGTT

TTTGTGCCAGCAAGGGTTTAGCTTGGAGAATGCTCTCTATGCGCTCTCGGCTGTCGG GCACTTTACGTTGGGGTGCG

TGTTGGAGGACCAGGAGCATCAAGTCGCAAAAGAGGAGCGTGAAACCCCAACCACGG ACTCGATGCCACCTCTGCTC

CGCCAAGCTATCGAACTCTTCGATCATCAGGGCGCGGAGCCAGCCTTCCTCTTTGGG CTGGAGCTGATTATCTGCGG

TTTGGAAAAACAACTCAAGTGTGAAAGCGGGTCCTAACTGCAGCTCGAG

pa V2TcR-crtEBI/V2LacI-crtY (SEQ ID NO: 72)

ggatccGAGCTGTTGACAACTCTATCATTGATAGAGTTATAATGTTCCCTATCAGTGATA GAGACGCAGTAAGAGAG

GAATGTACATATGTATCCGTTTATAAGGACAGCCCGAATGACGGTCTGCGCAAAAAA ACACGTTCATCTCACTCGCG

ATGCTGCGGAGCAGTTACTGGCTGATATTGATCGACGCCTTGATCAGTTATTGCCCG TGGAGGGAGAACGGGATGTT

GTGGGTGCCGCGATGCGTGAAGGTGCGCTGGCACCGGGAAAACGTATTCGCCCCATG TTGCTGTTGCTGACCGCCCG

CGATCTGGGTTGCGCTGTCAGCCATGACGGATTACTGGATTTGGCCTGTGCGGTGGA AATGGTCCACGCGGCTTCGC

TGATCCTTGACGATATGCCCTGCATGGACGATGCGAAGCTGCGGCGCGGACGCCCTA CCATTCATTCTCATTACGGA

GAGCATGTGGCAATACTGGCGGCGGTTGCCTTGCTGAGTAAAGCCTTTGGCGTAATT GCCGATGCAGATGGCCTCAC

GCCGCTGGCAAAAAATCGGGCGGTTTCTGAACTGTCAAACGCCATCGGCATGCAAGG ATTGGTTCAGGGTCAGTTCA

AGGATCTGTCTGAAGGGGATAAGCCGCGCAGCGCTGAAGCTATTTTGATGACGAATCACT TTAAAACCAGCACGCTG

TTTTGTGCCTCCATGCAGATGGCCTCGATTGTTGCGAATGCCTCCAGCGAAGCGCGT GATTGCCTGCATCGTTTTTC

ACTTGATCTTGGTCAGGCATTTCAACTGCTGGACGATTTGACCGATGGCATGACCGA CACCGGTAAGGATAGCAATC

AGGACGCCGGTAAATCGACGCTGGTCAATCTGTTAGGCCCGAGGGCGGTTGAAGAAC GTCTGAGACAACATCTTCAG

CTTGCCAGTGAGCATCTCTCTGCGGCCTGCCAACACGGGCACGCCACTCAACATTTT ATTCAGGCCTGGTTTGACAA

AAAACTCGCTGCCGTCAGTTAACGCAGTAAGAGAGGAATGTAGATATGAATAATCCG TCGTTACTCAATCATGCGGT

CGAAACGATGGCAGTTGGCTCGAAAAGTTTTGCGACAGCCTCAAAGTTATTTGATGC AAAAACCCGGCGCAGCGTAC

TGATGCTCTACGCCTGGTGCCGCCATTGTGACGATGTTATTGACGATCAGACGCTGG GCTTTCAGGCCCGGCAGCCT

GCCTTACAAACGCCCGAACAACGTCTGATGCAACTTGAGATGAAAACGCGCCAGGCC TATGCAGGATCGCAGATGCA

CGAACCGGCGTTTGCGGCTTTTCAGGAAGTGGCTATGGCTCATGATATCGCCCCGGC TTACGCGTTTGATCATCTGG

AAGGCTTCGCCATGGATGTACGCGAAGCGCAATACAGCCAACTGGATGATACGCTGC GCTATTGCTATCACGTTGCA

GGCGTTGTCGGCTTGATGATGGCGCAAATCATGGGCGTGCGGGATAACGCCACGCTG GACCGCGCCTGTGACCTTGG

GCTGGCATTTCAGTTGACCAATATTGCTCGCGATATTGTGGACGATGCGCATGCGGG CCGCTGTTATCTGCCGGCAA

GCTGGCTGGAGCATGAAGGTCTGAACAAAGAGAATTATGCGGCACCTGAAAACCGTC AGGCGCTGAGCCGTATCGCC

CGTCGTTTGGTGCAGGAAGCAGAACCTTACTATTTGTCTGCCACAGCCGGCCTGGCA GGGTTGCCCCTGCGTTCCGC

CTGGGCAATCGCTACGGCGAAGCAGGTTTACCGGAAAATAGGTGTCAAAGTTGAACA GGCCGGTCAGCAAGCCTGGG

ATCAGCGGCAGTCAACGACCACGCCCGAAAAATTAACGCTGCTGCTGGCCGCCTCTG GTCAGGCCCTTACTTCCCGG

ATGCGGGCTCATCCTCCCCGCCCTGCGCATCTCTGGCAGCGCCCGCTCTGAAATAAT TTTGTTTAACTTTAAGAAGG

AGATATAATGAAACCAACTACGGTAATTGGTGCAGGCTTCGGTGGCCTGGCACTGGC AATTCGTCTACAAGCTGCGG

GGATTCCCGTCTTACTGCTTGAACAACGTGATAAACCCGGCGGTCGGGCTTATGTCT ACGAGGATCAGGGGTTTACC

TTTGATGCAGGCCCGACGGTTATCACCGATCCCAGTGCCATTGAAGAACTGTTTGCA CTGGCAGGAAAACAGTTAAA

AGAGTATGTCGAACTGCTGCCGGTTACGCCGTTTTACCGCCTGTGTTGGGAGTCAGG GAAGGTCTTTAATTACGATA

ACGATCAAACCCGGCTCGAAGCGCAGATTCAGCAGTTTAATCCCCGCGATGTCGAAG GTTATCGTCAGTTTCTGGAC

TATTCACGCGCGGTGTTTAAAGAAGGCTATCTAAAGCTCGGTACTGTCCCTTTTTTA TCGTTCAGAGACATGCTTCG

CGCCGCACCTCAACTGGCGAAACTGCAGGCATGGAGAAGCGTTTACAGTAAGGTTGC CAGTTACATCGAAGATGAAC

ATCTGCGCCAGGCGTTTTCTTTCCACTCGCTGTTGGTGGGCGGCAATCCCTTCGCCA CCTCATCCATTTATACGTTG

ATACACGCGCTGGAGCGTGAGTGGGGCGTCTGGTTTCCGCGTGGCGGCACCGGCGCA TTAGTTCAGGGGATGATAAA

GCTGTTTCAGGATCTGGGTGGCGAAGTCGTGTTAAACGCCAGAGTCAGCCACATGGA AACGACAGGAAACAAGATTG

AAGCCGTGCATTTAGAGGACGGTCGCAGGTTCCTGACGCAAGCCGTCGCGTCAAATG CAGATGTGGTTCATACCTAT

CGCGACCTGTTAAGCCAGCACCCTGCCGCGGTTAAGCAGTCCAACAAACTGCAGACT AAGCGCATGAGTAACTCTCT

GTTTGTGCTCTATTTTGGTTTGAATCACCATCATGATCAGCTCGCGCATCACACGGT TTGTTTCGGCCCGCGTTACC

GCGAGCTGATTGACGAAATTTTTAATCATGATGGCCTCGCAGAGGACTTCTCACTTT ATCTGCACGCGCCCTGTGTC

ACGGATTCGTCACTGGCGCCTGAAGGTTGCGGCAGTTACTATGTGTTGGCGCCGGTG CCGCATTTAGGCACCGCGAA

CCTCGACTGGACGGTTGAGGGGCCAAAACTACGCGACCGTATTTTTGCGTACCTTGA GCAGCATTACATGCCTGGCT

TACGGAGTCAGCTGGTCACGCACCGGATGTTTACGCCGTTTGATTTTCGCGACCAGC TTAATGCCTATCATGGCTCA

GCCTTTTCTGTGGAGCCCGTTCTTACCCAGAGCGCCTGGTTTCGGCCGCATAACCGC GATAAAACCATTACTAATCT

CTACCTGGTCGGCGCAGGCACGCATCCCGGCGCAGGCATTCCTGGCGTCATCGGCTC GGCAAAAGCGACAGCAGGTT

TGATGCTGGAGGATCTGATTTGAAAGCTTCTCGGTACCAAATTCCAGAAAAGAGGCC TCCCGAAAGGGGGGCCTTTT

TTCGTTTTGGTCCGAATTCTTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGCCGC AGTAAGAGAGGAATGTACAC

ATGTCCCGCCTGGATAAATCGAAAGTGATTAACTCGGCCCTCGAATTGCTGAATGAA GTCGGTATCGAGGGGCTGAC

GACCCGTAAATTGGCACAAAAGTTGGGGGTGGAGCAACCCACGTTGTATTGGCACGT CAAAAATAAGCGGGCATTGC TGGATGCCCTCGCTATTGAAATGTTGGATCGCCACCATACCCATTTCTGTCCACTGGAGG GCGAGTCCTGGCAGGAC

TTTCTCCGCAACAACGCGAAATCCTTTCGCTGTGCACTCTTGTCCCATCGGGACGGT GCTAAGGTGCACTTGGGCAC

CCGTCCCACCGAAAAACAATACGAAACCTTGGAAAATCAATTGGCGTTTTTGTGCCA GCAAGGGTTTAGCTTGGAGA

ATGCTCTCTATGCGCTCTCGGCTGTCGGGCACTTTACGTTGGGGTGCGTGTTGGAGG ACCAGGAGCATCAAGTCGCA

AAAGAGGAGCGTGAAACCCCAACCACGGACTCGATGCCACCTCTGCTCCGCCAAGCT ATCGAACTCTTCGATCATCA

GGGCGCGGAGCCAGCCTTCCTCTTTGGGCTGGAGCTGATTATCTGCGGTTTGGAAAA ACAACTCAAGTGTGAAAGCG

GGTCCTAACTGCAGTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGC ATTAATGAATCGGCCAACGC

GCGGGGAGAGGCGGTTTGCGTATTGGGCGCCAGGGTGGTTTTTCTTTTCACCAGTGA GACGGGCAACAGCTGATTGC

CCTTCACCGCCTGGCCCTGAGAGAGTTGCAGCAAGCGGTCCACGCTGGTTTGCCCCA GCAGGCGAAAATCCTGTTTG

ATGGTGGTTAACGGCGGGATATAACATGAGCTGTCTTCGGTATCGTCGTATCCCACT ACCGAGATATCCGCACCAAC

GCGCAGCCCGGACTCGGTAATGGCGCGCATTGCGCCCAGCGCCATCTGATCGTTGGC AACCAGCATCGCAGTGGGAA

CGATGCCCTCATTCAGCATTTGCATGGTTTGTTGAAAACCGGACATGGCACTCCAGT CGCCTTCCCGTTCCGCTATC

GGCTGAATTTGATTGCGAGTGAGATATTTATGCCAGCCAGCCAGACGCAGACGCGCC GAGACAGAACTTAATGGGCC

CGCTAACAGCGCGATTTGCTGGTGACCCAATGCGACCAGATGCTCCACGCCCAGTCG CGTACCGTCTTCATGGGAGA

AAATAATACTGTTGATGGGTGTCTGGTCAGAGACATCAAGAAATAACGCCGGAACAT TAGTGCAGGCAGCTTCCACA

GCAATGGCATCCTGGTCATCCAGCGGATAGTTAATGATCAGCCCACTGACGCGTTGC GCGAGAAGATTGTGCACCGC

CGCTTTACAGGCTTCGACGCCGCTTCGTTCTACCATCGACACCACCACGCTGGCACC CAGTTGATCGGCGCGAGATT

TAATCGCCGCGACAATTTGCGACGGCGCGTGCAGGGCCAGACTGGAGGTGGCAACGC CAATCAGCAACGACTGTTTG

CCCGCCAGTTGTTGTGCCACGCGGTTGGGAATGTAATTCAGCTCCGCCATCGCCGCT TCCACTTTTTCCCGCGTTTT

CGCAGAAACGTGGCTGGCCTGGTTCACCACGCGGGAAACGGTCTGATAAGAGACACC GGCATACTCTGCGACATCGT

ATAACGTTACTGGTTTCACATTCACCACCCTGAATTGACTCTCTTCCGGGCGCTATC ATGCCATACCGCGAAAGGTT

TTGCGCCATTCGATGGTGTCCGGGATCTCGACGCTCTCCCTTATGCGACTCCTGCAT TAGGAAGCAGCCCAGTAGTA

GGTTGAGGCCGTTGAGCACCGCCGCCGCAAGGAATGGTGCATGCAAGGAGATGGCGC CCAACAGTCCCCCGGCCACG

GGGagtcaaaagcctccggtcggaggcttttgactTCTAGAGAGCTGTTGACACTTT ATGCTTCCGGCTCGTATAAT

GTGTGTGGAATTGTGAGCGGATAACAACGCAGTAAGAGAGGAATGTACCCATGGAGC GGCTATGCAACCGCATTATG

ATCTGATTCTCGTGGGGGCTGGACTCGCGAATGGCCTTATCGCCCTGCGTCTCCAGC AGCAGCAACCTGATATGCGT

ATTTTGCTTATCGACGCCGCACCCCAGGCGGGCGGGAATCATACGTGGTCATTTCAC CACGATGATTTGACTGAGAG

CCAACATCGTTGGATAGCTCCGCTGGTGGTTCATCACTGGCCCGACTATCAGGTACG CTTTCCCACACGCCGTCGTA

AGCTGAACAGCGGCTACTTTTGTATTACTTCTCAGCGTTTCGCTGAGGTTTTACAGC GACAGTTTGGCCCGCACTTG

TGGATGGATACCGCGGTCGCAGAGGTTAATGCGGAATCTGTTCGGTTGAAAAAGGGT CAGGTTATCGGTGCCCGCGC

GGTGATTGACGGGCGGGGTTATGCGGCAAATTCAGCACTGAGCGTGGGCTTCCAGGC GTTTATTGGCCAGGAATGGC

GATTGAGCCACCCGCATGGTTTATCGTCTCCCATTATCATGGATGCCACGGTCGATC AGCAAAATGGTTATCGCTTC

GTGTACAGCCTGCCGCTCTCGCCGACCAGATTGTTAATTGAAGATACGCACTATATT GATAATGCGACATTAGATCC

TGAATGCGCGCGGCAAAATATTTGCGACTATGCCGCGCAACAGGGTTGGCAGCTTCA GACACTGCTGCGAGAAGAAC

AGGGCGCCTTACCCATTACTCTGTCGGGCAATGCCGACGCATTCTGGCAGCAGCGCC CCCTGGCCTGTAGTGGATTA

CGTGCCGGTCTGTTCCATCCTACCACCGGCTATTCACTGCCGCTGGCGGTTGCCGTG GCCGACCGCCTGAGTGCACT

TGATGTCTTTACGTCGGCCTCAATTCACCATGCCATTACGCATTTTGCCCGCGAGCG CTGGCAGCAGCAGGGCTTTT

TCCGCATGCTGAATCGCATGCTGTTTTTAGCCGGACCCGCCGATTCACGCTGGCGGG TTATGCAGCGTTTTTATGGT

TTACCTGAAGATTTAATTGCCCGTTTTTATGCGGGAAAACTCACGCTGACCGATCGG CTACGTATTCTGAGCGGCAA

GCCGCCTGTTCCGGTATTAGCAGCATTGCAAGCCATTATGACGACTCATCGTTGACT CGAG

pa V2TcR-crtEBI/V3LacI-crtY (SEQ ID NO: 73)

ggatccGAGCTGTTGACAACTCTATCATTGATAGAGTTATAATGTTCCCTATCAGTGATA GAGACGCAGTAAGAGAG

GAATGTACATATGTATCCGTTTATAAGGACAGCCCGAATGACGGTCTGCGCAAAAAA ACACGTTCATCTCACTCGCG

ATGCTGCGGAGCAGTTACTGGCTGATATTGATCGACGCCTTGATCAGTTATTGCCCG TGGAGGGAGAACGGGATGTT

GTGGGTGCCGCGATGCGTGAAGGTGCGCTGGCACCGGGAAAACGTATTCGCCCCATG TTGCTGTTGCTGACCGCCCG

CGATCTGGGTTGCGCTGTCAGCCATGACGGATTACTGGATTTGGCCTGTGCGGTGGA AATGGTCCACGCGGCTTCGC

TGATCCTTGACGATATGCCCTGCATGGACGATGCGAAGCTGCGGCGCGGACGCCCTA CCATTCATTCTCATTACGGA

GAGCATGTGGCAATACTGGCGGCGGTTGCCTTGCTGAGTAAAGCCTTTGGCGTAATT GCCGATGCAGATGGCCTCAC

GCCGCTGGCAAAAAATCGGGCGGTTTCTGAACTGTCAAACGCCATCGGCATGCAAGG ATTGGTTCAGGGTCAGTTCA

AGGATCTGTCTGAAGGGGATAAGCCGCGCAGCGCTGAAGCTATTTTGATGACGAATC ACTTTAAAACCAGCACGCTG

TTTTGTGCCTCCATGCAGATGGCCTCGATTGTTGCGAATGCCTCCAGCGAAGCGCGT GATTGCCTGCATCGTTTTTC

ACTTGATCTTGGTCAGGCATTTCAACTGCTGGACGATTTGACCGATGGCATGACCGA CACCGGTAAGGATAGCAATC

AGGACGCCGGTAAATCGACGCTGGTCAATCTGTTAGGCCCGAGGGCGGTTGAAGAAC GTCTGAGACAACATCTTCAG

CTTGCCAGTGAGCATCTCTCTGCGGCCTGCCAACACGGGCACGCCACTCAACATTTT ATTCAGGCCTGGTTTGACAA

AAAACTCGCTGCCGTCAGTTAACGCAGTAAGAGAGGAATGTAGATATGAATAATCCG TCGTTACTCAATCATGCGGT

CGAAACGATGGCAGTTGGCTCGAAAAGTTTTGCGACAGCCTCAAAGTTATTTGATGC AAAAACCCGGCGCAGCGTAC

TGATGCTCTACGCCTGGTGCCGCCATTGTGACGATGTTATTGACGATCAGACGCTGG GCTTTCAGGCCCGGCAGCCT

GCCTTACAAACGCCCGAACAACGTCTGATGCAACTTGAGATGAAAACGCGCCAGGCC TATGCAGGATCGCAGATGCA

CGAACCGGCGTTTGCGGCTTTTCAGGAAGTGGCTATGGCTCATGATATCGCCCCGGC TTACGCGTTTGATCATCTGG

AAGGCTTCGCCATGGATGTACGCGAAGCGCAATACAGCCAACTGGATGATACGCTGC GCTATTGCTATCACGTTGCA

GGCGTTGTCGGCTTGATGATGGCGCAAATCATGGGCGTGCGGGATAACGCCACGCTG GACCGCGCCTGTGACCTTGG

GCTGGCATTTCAGTTGACCAATATTGCTCGCGATATTGTGGACGATGCGCATGCGGG CCGCTGTTATCTGCCGGCAA

GCTGGCTGGAGCATGAAGGTCTGAACAAAGAGAATTATGCGGCACCTGAAAACCGTC AGGCGCTGAGCCGTATCGCC

CGTCGTTTGGTGCAGGAAGCAGAACCTTACTATTTGTCTGCCACAGCCGGCCTGGCA GGGTTGCCCCTGCGTTCCGC

CTGGGCAATCGCTACGGCGAAGCAGGTTTACCGGAAAATAGGTGTCAAAGTTGAACA GGCCGGTCAGCAAGCCTGGG

ATCAGCGGCAGTCAACGACCACGCCCGAAAAATTAACGCTGCTGCTGGCCGCCTCTG GTCAGGCCCTTACTTCCCGG

ATGCGGGCTCATCCTCCCCGCCCTGCGCATCTCTGGCAGCGCCCGCTCTGAAATAAT TTTGTTTAACTTTAAGAAGG

AGATATAATGAAACCAACTACGGTAATTGGTGCAGGCTTCGGTGGCCTGGCACTGGC AATTCGTCTACAAGCTGCGG

GGATTCCCGTCTTACTGCTTGAACAACGTGATAAACCCGGCGGTCGGGCTTATGTCT ACGAGGATCAGGGGTTTACC

TTTGATGCAGGCCCGACGGTTATCACCGATCCCAGTGCCATTGAAGAACTGTTTGCA CTGGCAGGAAAACAGTTAAA

AGAGTATGTCGAACTGCTGCCGGTTACGCCGTTTTACCGCCTGTGTTGGGAGTCAGG GAAGGTCTTTAATTACGATA

ACGATCAAACCCGGCTCGAAGCGCAGATTCAGCAGTTTAATCCCCGCGATGTCGAAG GTTATCGTCAGTTTCTGGAC

TATTCACGCGCGGTGTTTAAAGAAGGCTATCTAAAGCTCGGTACTGTCCCTTTTTTA TCGTTCAGAGACATGCTTCG

CGCCGCACCTCAACTGGCGAAACTGCAGGCATGGAGAAGCGTTTACAGTAAGGTTGC CAGTTACATCGAAGATGAAC

ATCTGCGCCAGGCGTTTTCTTTCCACTCGCTGTTGGTGGGCGGCAATCCCTTCGCCA CCTCATCCATTTATACGTTG

ATACACGCGCTGGAGCGTGAGTGGGGCGTCTGGTTTCCGCGTGGCGGCACCGGCGCA TTAGTTCAGGGGATGATAAA

GCTGTTTCAGGATCTGGGTGGCGAAGTCGTGTTAAACGCCAGAGTCAGCCACATGGA AACGACAGGAAACAAGATTG

AAGCCGTGCATTTAGAGGACGGTCGCAGGTTCCTGACGCAAGCCGTCGCGTCAAATG CAGATGTGGTTCATACCTAT

CGCGACCTGTTAAGCCAGCACCCTGCCGCGGTTAAGCAGTCCAACAAACTGCAGACT AAGCGCATGAGTAACTCTCT

GTTTGTGCTCTATTTTGGTTTGAATCACCATCATGATCAGCTCGCGCATCACACGGT TTGTTTCGGCCCGCGTTACC

GCGAGCTGATTGACGAAATTTTTAATCATGATGGCCTCGCAGAGGACTTCTCACTTT ATCTGCACGCGCCCTGTGTC

ACGGATTCGTCACTGGCGCCTGAAGGTTGCGGCAGTTACTATGTGTTGGCGCCGGTG CCGCATTTAGGCACCGCGAA CCTCGACTGGACGGTTGAGGGGCCAAAACTACGCGACCGTATTTTTGCGTACCTTGAGCA GCATTACATGCCTGGCT

TACGGAGTCAGCTGGTCACGCACCGGATGTTTACGCCGTTTGATTTTCGCGACCAGC TTAATGCCTATCATGGCTCA

GCCTTTTCTGTGGAGCCCGTTCTTACCCAGAGCGCCTGGTTTCGGCCGCATAACCGC GATAAAACCATTACTAATCT

CTACCTGGTCGGCGCAGGCACGCATCCCGGCGCAGGCATTCCTGGCGTCATCGGCTC GGCAAAAGCGACAGCAGGTT

TGATGCTGGAGGATCTGATTTGAAAGCTTCTCGGTACCAAATTCCAGAAAAGAGGCC TCCCGAAAGGGGGGCCTTTT

TTCGTTTTGGTCCGAATTCTTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGCCGC AGTAAGAGAGGAATGTACAC

ATGTCCCGCCTGGATAAATCGAAAGTGATTAACTCGGCCCTCGAATTGCTGAATGAA GTCGGTATCGAGGGGCTGAC

GACCCGTAAATTGGCACAAAAGTTGGGGGTGGAGCAACCCACGTTGTATTGGCACGT CAAAAATAAGCGGGCATTGC

TGGATGCCCTCGCTATTGAAATGTTGGATCGCCACCATACCCATTTCTGTCCACTGG AGGGCGAGTCCTGGCAGGAC

TTTCTCCGCAACAACGCGAAATCCTTTCGCTGTGCACTCTTGTCCCATCGGGACGGT GCTAAGGTGCACTTGGGCAC

CCGTCCCACCGAAAAACAATACGAAACCTTGGAAAATCAATTGGCGTTTTTGTGCCA GCAAGGGTTTAGCTTGGAGA

ATGCTCTCTATGCGCTCTCGGCTGTCGGGCACTTTACGTTGGGGTGCGTGTTGGAGG ACCAGGAGCATCAAGTCGCA

AAAGAGGAGCGTGAAACCCCAACCACGGACTCGATGCCACCTCTGCTCCGCCAAGCT ATCGAACTCTTCGATCATCA

GGGCGCGGAGCCAGCCTTCCTCTTTGGGCTGGAGCTGATTATCTGCGGTTTGGAAAA ACAACTCAAGTGTGAAAGCG

GGTCCTAACTGCAGTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGC ATTAATGAATCGGCCAACGC

GCGGGGAGAGGCGGTTTGCGTATTGGGCGCCAGGGTGGTTTTTCTTTTCACCAGTGA GACGGGCAACAGCTGATTGC

CCTTCACCGCCTGGCCCTGAGAGAGTTGCAGCAAGCGGTCCACGCTGGTTTGCCCCA GCAGGCGAAAATCCTGTTTG

ATGGTGGTTAACGGCGGGATATAACATGAGCTGTCTTCGGTATCGTCGTATCCCACT ACCGAGATATCCGCACCAAC

GCGCAGCCCGGACTCGGTAATGGCGCGCATTGCGCCCAGCGCCATCTGATCGTTGGC AACCAGCATCGCAGTGGGAA

CGATGCCCTCATTCAGCATTTGCATGGTTTGTTGAAAACCGGACATGGCACTCCAGT CGCCTTCCCGTTCCGCTATC

GGCTGAATTTGATTGCGAGTGAGATATTTATGCCAGCCAGCCAGACGCAGACGCGCC GAGACAGAACTTAATGGGCC

CGCTAACAGCGCGATTTGCTGGTGACCCAATGCGACCAGATGCTCCACGCCCAGTCG CGTACCGTCTTCATGGGAGA

AAATAATACTGTTGATGGGTGTCTGGTCAGAGACATCAAGAAATAACGCCGGAACAT TAGTGCAGGCAGCTTCCACA

GCAATGGCATCCTGGTCATCCAGCGGATAGTTAATGATCAGCCCACTGACGCGTTGC GCGAGAAGATTGTGCACCGC

CGCTTTACAGGCTTCGACGCCGCTTCGTTCTACCATCGACACCACCACGCTGGCACC CAGTTGATCGGCGCGAGATT

TAATCGCCGCGACAATTTGCGACGGCGCGTGCAGGGCCAGACTGGAGGTGGCAACGC CAATCAGCAACGACTGTTTG

CCCGCCAGTTGTTGTGCCACGCGGTTGGGAATGTAATTCAGCTCCGCCATCGCCGCT TCCACTTTTTCCCGCGTTTT

CGCAGAAACGTGGCTGGCCTGGTTCACCACGCGGGAAACGGTCTGATAAGAGACACC GGCATACTCTGCGACATCGT

ATAACGTTACTGGTTTCACATTCACCACCCTGAATTGACTCTCTTCCGGGCGCTATC ATGCCATACCGCGAAAGGTT

TTGCGCCATTCGATGGTGTCCGGGATCTCGACGCTCTCCCTTATGCGACTCCTGCAT TAGGAAGCAGCCCAGTAGTA

GGTTGAGGCCGTTGAGCACCGCCGCCGCAAGGAATGGTGCATGCAAGGAGATGGCGC CCAACAGTCCCCCGGCCACG

GGGagtcaaaagcctccggtcggaggcttttgactTCTAGAGAGCTGTTGACACTTT ATGCTTCCGGCTCGTATAAT

GTGTGTGGAATTGTGAGCGGATAACAAGTGGAATTGTGAGCGGATAACAATTTCACA CAGGAAACAGAATCCCATGG

AGCGGCTATGCAACCGCATTATGATCTGATTCTCGTGGGGGCTGGACTCGCGAATGG CCTTATCGCCCTGCGTCTCC

AGCAGCAGCAACCTGATATGCGTATTTTGCTTATCGACGCCGCACCCCAGGCGGGCG GGAATCATACGTGGTCATTT

CACCACGATGATTTGACTGAGAGCCAACATCGTTGGATAGCTCCGCTGGTGGTTCAT CACTGGCCCGACTATCAGGT

ACGCTTTCCCACACGCCGTCGTAAGCTGAACAGCGGCTACTTTTGTATTACTTCTCA GCGTTTCGCTGAGGTTTTAC

AGCGACAGTTTGGCCCGCACTTGTGGATGGATACCGCGGTCGCAGAGGTTAATGCGG AATCTGTTCGGTTGAAAAAG

GGTCAGGTTATCGGTGCCCGCGCGGTGATTGACGGGCGGGGTTATGCGGCAAATTCA GCACTGAGCGTGGGCTTCCA

GGCGTTTATTGGCCAGGAATGGCGATTGAGCCACCCGCATGGTTTATCGTCTCCCAT TATCATGGATGCCACGGTCG

ATCAGCAAAATGGTTATCGCTTCGTGTACAGCCTGCCGCTCTCGCCGACCAGATTGT TAATTGAAGATACGCACTAT ATTGATAATGCGACATTAGATCCTGAATGCGCGCGGCAAAATATTTGCGACTATGCCGCG CAACAGGGTTGGCAGCT

TCAGACACTGCTGCGAGAAGAACAGGGCGCCTTACCCATTACTCTGTCGGGCAATGC CGACGCATTCTGGCAGCAGC

GCCCCCTGGCCTGTAGTGGATTACGTGCCGGTCTGTTCCATCCTACCACCGGCTATT CACTGCCGCTGGCGGTTGCC

GTGGCCGACCGCCTGAGTGCACTTGATGTCTTTACGTCGGCCTCAATTCACCATGCC ATTACGCATTTTGCCCGCGA

GCGCTGGCAGCAGCAGGGCTTTTTCCGCATGCTGAATCGCATGCTGTTTTTAGCCGG ACCCGCCGATTCACGCTGGC

GGGTTATGCAGCGTTTTTATGGTTTACCTGAAGATTTAATTGCCCGTTTTTATGCGG GAAAACTCACGCTGACCGAT

CGGCTACGTATTCTGAGCGGCAAGCCGCCTGTTCCGGTATTAGCAGCATTGCAAGCC ATTATGACGACTCATCGTTG ACTCGAG

cocE DNA sequence (SEQ ID NO: 74)

ATGGTGGACGGTAATTATTCGGTAGCGTCCAACGTTATGGTGCCGATGCGCGACGGG GTGCGCTTGGCTGTAGATCT

GTACCGCCCGGACGCAGATGGCCCTGTACCGGTCCTGCTGGTCCGCAACCCCTACGA CAAATTCGACGTGTTCGCTT

GGAGTACGCAGAGCACGAACTGGCTGGAATTTGTGCGCGATGGGTACGCCGTCGTCA TCCAAGACACCCGGGGCCTC

TTTGCATCCGAAGGTGAGTTCGTTCCACATGTTGATGACGAGGCGGATGCGGAAGAC ACGCTGAGCTGGATCTTGGA

ACAAGCATGGTGCGACGGCAATGTGGGTATGTTCGGTGTAAGCTACCTGGGCGTTAC GCAGTGGCAAGCTGCTGTTA

GCGGTGTGGGTGGTTTGAAGGCAATCGCCCCGAGCATGGCGAGCGCGGATCTGTACC GTGCCCCCTGGTACGGTCCT

GGCGGCGCCCTGAGCGTGGAAGCACTCCTGGGCTGGAGCGCATTGATCGGTACGGGC CTGATTACCAGCCGTAGCGA

TGCCCGCCCGGAAGACGCAGCCGACTTCGTACAGCTGGCAGCCATCCTGAACGATGT GGCCGGTGCCGCAAGCGTGA

CCCCTCTGGCCGAACAGCCCTTGTTGGGCCGCCTGATCCCTTGGGTGATCGACCAGG TGGTGGACCATCCAGACAAC

GACGAGTCGTGGCAGAGCATCTCGCTCTTTGAACGTTTGGGTGGGCTCGCTACCCCG GCCTTGATTACCGCCGGGTG

GTACGATGGCTTCGTGGGCGAGAGCCTCCGTACCTTCGTAGCTGTGAAGGACAACGC GGATGCGCGTCTGGTGGTGG

GGCCGTGGAGCCACAGCAATCTGACCGGCCGTAATGCCGACCGTAAGTTTGGGATCG CCGCGACCTACCCCATCCAG

GAGGCGACGACCATGCACAAGGCTTTTTTCGACCGGCACCTCCGTGGCGAGACCGAT GCCCTGGCAGGGGTGCCCAA

GGTGCGCCTCTTCGTAATGGGTATCGATGAGTGGCGCGACGAGACCGACTGGCCATT GCCAGATACCGCTTACACGC

CTTTTTACCTCGGGGGCTCCGGTGCGGCCAACACGAGCACGGGTGGTGGGACCCTGT CGACCTCGATCAGCGGCACG

GAGTCGGCGGACACCTACCTGTATGATCCTGCCGACCCCGTGCCAAGTCTGGGCGGC ACCCTCCTCTTCCATAATGG

GGACAACGGTCCAGCTGACCAGCGCCCGATTCACGATCGCGACGACGTGCTGTGCTA CTCCACCGAGGTGTTGACCG

ACCCCGTGGAAGTAACGGGGACGGTTTCGGCTCGCCTGTTCGTGTCCTCGTCGGCCG TGGATACCGATTTTACCGCC

AAGTTGGTCGACGTGTTCCCCGATGGTCGGGCAATCGCTCTCTGCGACGGCATCGTG CGTATGCGCTACCGGGAGAC

CTTGGTAAATCCTACGCTCATTGAGGCCGGTGAGATTTACGAGGTGGCTATTGATAT GCTGGCCACCAGCAACGTGT

TTTTGCCGGGCCACCGCATCATGGTGCAAGTTAGCAGCTCGAACTTCCCGAAGTACG ACCGCAACTCCAACACCGGC

GGCGTCATCGCTCGCGAGCAACTGGAGGAAATGTGCACCGCCGTAAACCGCATTCAC CGCGGCCCCGAACACCCGTC

CCATATCGTGCTGCCGATCATTAAGCGCGACTATAAGGACGACGACGATAAGTGA

cocE Amino acid sequence (SEQ ID NO: 75)

MVDGNYSVASNVMVPMRDGVRLAVDLYRPDADGPVPVLLVRNPYDKFDVFAWSTQSTN WLEFVRDGYAVVIQDTRGL

FASEGEFVPHVDDEADAEDTLSWI LEQAWCDGNVGMFGVSYLGVTQWQAAVSGVGGLKAIAPSMASADLYRAPWYGP

GGALSVEALLGWSALIGTGLITSRSDARPEDAADFVQLAAILNDVAGAASVTPLAEQ PLLGRLI PWVIDQVVDHPDN

DESWQSI SLFERLGGLATPALITAGWYDGFVGESLRTFVAVKDNADARLVVGPWSHSNLTGRNADRKF GIAATYPIQ EATTMHKAFFDRHLRGETDALAGVPKVRLFVMGIDEWRDETDWPLPDTAYTPFYLGGSGA ANTSTGGGTLSTS I SGT

ESADTYLYDPADPVPSLGGTLLFHNGDNGPADQRPIHDRDDVLCYSTEVLTDPVEVT GTVSARLFVS SSAVDTDFTA

KLVDVFPDGRAIALCDGIVRMRYRETLVNPTLIEAGEIYEVAIDMLATSNVFLPGHR IMVQVS SSNFPKYDRNSNTG

GVIAREQLEEMCTAVNRIHRGPEHPSHIVLPI IKRDYKDDDDK-

crtE (SEQ ID NO: 76)

ATGTATCCGTTTATAAGGACAGCCCGAATGACGGTCTGCGCAAAAAAACACGTTCAT CTCACTCGCGATGCTGCGGA

GCAGTTACTGGCTGATATTGATCGACGCCTTGATCAGTTATTGCCCGTGGAGGGAGA ACGGGATGTTGTGGGTGCCG

CGATGCGTGAAGGTGCGCTGGCACCGGGAAAACGTATTCGCCCCATGTTGCTGTTGC TGACCGCCCGCGATCTGGGT

TGCGCTGTCAGCCATGACGGATTACTGGATTTGGCCTGTGCGGTGGAAATGGTCCAC GCGGCTTCGCTGATCCTTGA

CGATATGCCCTGCATGGACGATGCGAAGCTGCGGCGCGGACGCCCTACCATTCATTC TCATTACGGAGAGCATGTGG

CAATACTGGCGGCGGTTGCCTTGCTGAGTAAAGCCTTTGGCGTAATTGCCGATGCAG ATGGCCTCACGCCGCTGGCA

AAAAATCGGGCGGTTTCTGAACTGTCAAACGCCATCGGCATGCAAGGATTGGTTCAG GGTCAGTTCAAGGATCTGTC

TGAAGGGGATAAGCCGCGCAGCGCTGAAGCTATTTTGATGACGAATCACTTTAAAAC CAGCACGCTGTTTTGTGCCT

CCATGCAGATGGCCTCGATTGTTGCGAATGCCTCCAGCGAAGCGCGTGATTGCCTGC ATCGTTTTTCACTTGATCTT

GGTCAGGCATTTCAACTGCTGGACGATTTGACCGATGGCATGACCGACACCGGTAAG GATAGCAATCAGGACGCCGG

TAAATCGACGCTGGTCAATCTGTTAGGCCCGAGGGCGGTTGAAGAACGTCTGAGACA ACATCTTCAGCTTGCCAGTG

AGCATCTCTCTGCGGCCTGCCAACACGGGCACGCCACTCAACATTTTATTCAGGCCT GGTTTGACAAAAAACTCGCT GCCGTCAGTTAA

crtB (SEQ ID NO: 77)

ATGAATAATCCGTCGTTACTCAATCATGCGGTCGAAACGATGGCAGTTGGCTCGAAA AGTTTTGCGACAGCCTCAAA

GTTATTTGATGCAAAAACCCGGCGCAGCGTACTGATGCTCTACGCCTGGTGCCGCCA TTGTGACGATGTTATTGACG

ATCAGACGCTGGGCTTTCAGGCCCGGCAGCCTGCCTTACAAACGCCCGAACAACGTC TGATGCAACTTGAGATGAAA

ACGCGCCAGGCCTATGCAGGATCGCAGATGCACGAACCGGCGTTTGCGGCTTTTCAG GAAGTGGCTATGGCTCATGA

TATCGCCCCGGCTTACGCGTTTGATCATCTGGAAGGCTTCGCCATGGATGTACGCGA AGCGCAATACAGCCAACTGG

ATGATACGCTGCGCTATTGCTATCACGTTGCAGGCGTTGTCGGCTTGATGATGGCGC AAATCATGGGCGTGCGGGAT

AACGCCACGCTGGACCGCGCCTGTGACCTTGGGCTGGCATTTCAGTTGACCAATATT GCTCGCGATATTGTGGACGA

TGCGCATGCGGGCCGCTGTTATCTGCCGGCAAGCTGGCTGGAGCATGAAGGTCTGAA CAAAGAGAATTATGCGGCAC

CTGAAAACCGTCAGGCGCTGAGCCGTATCGCCCGTCGTTTGGTGCAGGAAGCAGAAC CTTACTATTTGTCTGCCACA

GCCGGCCTGGCAGGGTTGCCCCTGCGTTCCGCCTGGGCAATCGCTACGGCGAAGCAG GTTTACCGGAAAATAGGTGT

CAAAGTTGAACAGGCCGGTCAGCAAGCCTGGGATCAGCGGCAGTCAACGACCACGCC CGAAAAATTAACGCTGCTGC

TGGCCGCCTCTGGTCAGGCCCTTACTTCCCGGATGCGGGCTCATCCTCCCCGCCCTG CGCATCTCTGGCAGCGCCCG CTCTAG

crtI (SEQ ID NO: 78)

ATGAAACCAACTACGGTAATTGGTGCAGGCTTCGGTGGCCTGGCACTGGCAATTCGT CTACAAGCTGCGGGGATTCC

CGTCTTACTGCTTGAACAACGTGATAAACCCGGCGGTCGGGCTTATGTCTACGAGGA TCAGGGGTTTACCTTTGATG

CAGGCCCGACGGTTATCACCGATCCCAGTGCCATTGAAGAACTGTTTGCACTGGCAG GAAAACAGTTAAAAGAGTAT

GTCGAACTGCTGCCGGTTACGCCGTTTTACCGCCTGTGTTGGGAGTCAGGGAAGGTC TTTAATTACGATAACGATCA AACCCGGCTCGAAGCGCAGATTCAGCAGTTTAATCCCCGCGATGTCGAAGGTTATCGTCA GTTTCTGGACTATTCAC GCGCGGTGTTTAAAGAAGGCTATCTAAAGCTCGGTACTGTCCCTTTTTTATCGTTCAGAG ACATGCTTCGCGCCGCA CCTCAACTGGCGAAACTGCAGGCATGGAGAAGCGTTTACAGTAAGGTTGCCAGTTACATC GAAGATGAACATCTGCG CCAGGCGTTTTCTTTCCACTCGCTGTTGGTGGGCGGCAATCCCTTCGCCACCTCATCCAT TTATACGTTGATACACG CGCTGGAGCGTGAGTGGGGCGTCTGGTTTCCGCGTGGCGGCACCGGCGCATTAGTTCAGG GGATGATAAAGCTGTTT CAGGATCTGGGTGGCGAAGTCGTGTTAAACGCCAGAGTCAGCCACATGGAAACGACAGGA AACAAGATTGAAGCCGT GCATTTAGAGGACGGTCGCAGGTTCCTGACGCAAGCCGTCGCGTCAAATGCAGATGTGGT TCATACCTATCGCGACC TGTTAAGCCAGCACCCTGCCGCGGTTAAGCAGTCCAACAAACTGCAGACTAAGCGCATGA GTAACTCTCTGTTTGTG CTCTATTTTGGTTTGAATCACCATCATGATCAGCTCGCGCATCACACGGTTTGTTTCGGC CCGCGTTACCGCGAGCT GATTGACGAAATTTTTAATCATGATGGCCTCGCAGAGGACTTCTCACTTTATCTGCACGC GCCCTGTGTCACGGATT CGTCACTGGCGCCTGAAGGTTGCGGCAGTTACTATGTGTTGGCGCCGGTGCCGCATTTAG GCACCGCGAACCTCGAC TGGACGGTTGAGGGGCCAAAACTACGCGACCGTATTTTTGCGTACCTTGAGCAGCATTAC ATGCCTGGCTTACGGAG TCAGCTGGTCACGCACCGGATGTTTACGCCGTTTGATTTTCGCGACCAGCTTAATGCCTA TCATGGCTCAGCCTTTT CTGTGGAGCCCGTTCTTACCCAGAGCGCCTGGTTTCGGCCGCATAACCGCGATAAAACCA TTACTAATCTCTACCTG GTCGGCGCAGGCACGCATCCCGGCGCAGGCATTCCTGGCGTCATCGGCTCGGCAAAAGCG ACAGCAGGTTTGATGCT GGAGGATCTGATTTGA

crtY (SEQ ID NO: 79)

ATGGGAGCGGCTATGCAACCGCATTATGATCTGATTCTCGTGGGGGCTGGACTCGCG AATGGCCTTATCGCCCTGCG TCTCCAGCAGCAGCAACCTGATATGCGTATTTTGCTTATCGACGCCGCACCCCAGGCGGG CGGGAATCATACGTGGT CATTTCACCACGATGATTTGACTGAGAGCCAACATCGTTGGATAGCTCCGCTGGTGGTTC ATCACTGGCCCGACTAT CAGGTACGCTTTCCCACACGCCGTCGTAAGCTGAACAGCGGCTACTTTTGTATTACTTCT CAGCGTTTCGCTGAGGT

TTTACAGCGACAGTTTGGCCCGCACTTGTGGATGGATACCGCGGTCGCAGAGGTTAA TGCGGAATCTGTTCGGTTGA AAAAGGGTCAGGTTATCGGTGCCCGCGCGGTGATTGACGGGCGGGGTTATGCGGCAAATT CAGCACTGAGCGTGGGC TTCCAGGCGTTTATTGGCCAGGAATGGCGATTGAGCCACCCGCATGGTTTATCGTCTCCC ATTATCATGGATGCCAC GGTCGATCAGCAAAATGGTTATCGCTTCGTGTACAGCCTGCCGCTCTCGCCGACCAGATT GTTAATTGAAGATACGC

ACTATATTGATAATGCGACATTAGATCCTGAATGCGCGCGGCAAAATATTTGCGACT ATGCCGCGCAACAGGGTTGG CAGCTTCAGACACTGCTGCGAGAAGAACAGGGCGCCTTACCCATTACTCTGTCGGGCAAT GCCGACGCATTCTGGCA GCAGCGCCCCCTGGCCTGTAGTGGATTACGTGCCGGTCTGTTCCATCCTACCACCGGCTA TTCACTGCCGCTGGCGG TTGCCGTGGCCGACCGCCTGAGTGCACTTGATGTCTTTACGTCGGCCTCAATTCACCATG CCATTACGCATTTTGCC

CGCGAGCGCTGGCAGCAGCAGGGCTTTTTCCGCATGCTGAATCGCATGCTGTTTTTA GCCGGACCCGCCGATTCACG CTGGCGGGTTATGCAGCGTTTTTATGGTTTACCTGAAGATTTAATTGCCCGTTTTTATGC GGGAAAACTCACGCTGA CCGATCGGCTACGTATTCTGAGCGGCAAGCCGCCTGTTCCGGTATTAGCAGCATTGCAAG CCATTATGACGACTCAT CGTTGA

[00141] While exemplary embodiments have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will occur to those skilled in the art. It should be understood that various alternatives to the embodiments described herein may be employed. It is intended that the following claims define the scope of the disclosure and that methods and structures within the scope of these claims and their equivalents be covered thereby.