Title:
EXTRACELLULAR TRANSPORT OF BIOSYNTHETIC HYDROCARBONS AND OTHER MOLECULES
Document Type and Number:
WIPO Patent Application WO/2013/096475
Kind Code:
A1
Abstract:
Disclosed are methods and compositions for modifying photoautotrophic organisms as hosts, such that the organisms efficiently convert carbon dioxide and light into hydrocarbons, e.g., n-alkanes and n-alkenes, wherein the n-alkanes are secreted into the culture medium via recombinantly expressed transporter proteins. The use of such organisms for the commercial production of n-alkanes and related molecules is contemplated.

Inventors:
SMITH KEVIN M (US)
RIDLEY CHRISTIAN PERRY (US)
Application Number:
PCT/US2012/070666
Publication Date:
June 27, 2013
Filing Date:
December 19, 2012
Assignee:
JOULE UNLTD TECHNOLOGIES INC (US)
International Classes:
C12P5/02
Domestic Patent References:
WO2007136762A2 2007-11-29
Foreign References:
US20110072714A1 2011-03-31
US20110117618A1 2011-05-19
US20070099277A1 2007-05-03
US20110111470A1 2011-05-12
US20100189777A1 2010-07-29
Other References:
FRIGAARD ET AL.: "Seeing green bacteria in a new light: genomics-enabled studies of the photosynthetic apparatus in green sulfur bacteria and filamentous anoxygenic phototrophic bacteria.", ARCH MICROBIOL., vol. 182, no. 4, 1 September 2004 (2004-09-01), pages 265 - 276
Attorney, Agent or Firm:
ULLSPERGER, Christian, J. et al. (Silicon Valley Center, 801 California Street, Mountain View, CA, US)
Claims:
What is claimed is:

1. An engineered microorganism, wherein said engineered microorganism comprises (i) one or more recombinant genes encoding enzymes which catalyze the production of alkanes and/or alkenes, and (ii) one or more recombinant genes encoding one or more tripartite transporter proteins selected from the group consisting of Emr proteins comprising EmrA and EmrB, EmrA, EmrB, AcrE, AcrF, EmrK, EmrY, MacA, MacB, MdtA, MdtB, MdtC, MdtE, MdtF, SdsR, SdsQ, and SdsP.

2. The engineered microorganism of claim 1, wherein said one or more tripartite transporter proteins is EmrA and EmrB, wherein said microorganism is a Synechococcus species, and wherein said enzymes which catalyze the production of alkanes and/or alkenes comprise recombinant acyl-ACP reductase ("AAR") and recombinant alkanal deformylative monooxygenase ("ADM").

3. The engineered microorganism of any one of claims 1-2, wherein said one or more tripartite transporter proteins is EmrA and/or EmrB.

4. The engineered microorganism of any one of claims 1-3, wherein said one or more tripartite transporter proteins is EmrA and EmrB.

5. The engineered microorganism of any of claims 1-4, wherein said microorganism is a bacterium.

6. The engineered microorganism of any of claims 1-5, wherein said microorganism is a gram-negative bacterium.

7. The engineered microorganism of any of claims 1-6, wherein said microorganism is E. coli.

8. The engineered microorganism of any one of claims 1-7, wherein expression of an operon comprising the one or more recombinant genes encoding one or more tripartite transporter proteins is controlled by a recombinant promoter, and wherein the promoter is constitutive or inducible.

9. The engineered microorganism of claim 8, wherein said operon is integrated into the genome of said microorganism.

10. The engineered microorganism of claim 8, wherein said operon is extrachromosomal.

11. The engineered microorganism of any of claims 1-10, wherein said microorganism is a photosynthetic microorganism.

12. The engineered photosynthetic microorganism of any one of claims 1-11, wherein said microorganism is a cyanobacterium.

13. The engineered photosynthetic microorganism of any one of claims 1-12, wherein said microorganism is a Synechococcus species.

14. The engineered photosynthetic microorganism of any of claims 1-13, wherein said one or more tripartite transporter proteins are selected from the group consisting of EmrA and EmrB, and wherein the native leader sequence of said one or more proteins is replaced with a leader sequence native to said photosynthetic microorganism.

15. The engineered microorganism of any one of claims 1-14, wherein said one or more recombinant genes encoding one or more tripartite transporter proteins are E. coli genes.

16. The engineered microorganism of any one of claims 1-15, wherein the expression of said one or more tripartite transporter proteins is increased relative to an otherwise identical microorganism lacking said one or more tripartite transporter proteins.

17. The engineered microorganism of any one of claims 1-16, wherein the activity of said one or more tripartite transporter proteins is increased relative to an otherwise identical microorganism lacking said one or more tripartite transporter proteins.

18. The engineered microorganism of any one of claims 1-17, wherein said microorganism is a photosynthetic organism, and wherein said recombinant genes encoding enzymes which catalyze the production of alkanes and/or alkenes comprise recombinant acyl-ACP reductase ("AAR") and recombinant alkanal deformylative monooxygenase ("ADM").

19. The engineered microorganism of any one of claims 1-18, wherein the expression of said one or more tripartite transporter proteins is driven by a T5 promoter or wherein said one or more recombinant genes encoding one or more tripartite transporter proteins are operably linked to a T5 promoter.

20. The engineered microorganism of any one of claims 1-19, wherein said microorganism comprises a recombinant TolC gene.

21. The engineered microorganism of any one of claims 1-20, wherein said enzymes which catalyze the production of alkanes and/or alkenes are selected from the group consisting of a recombinant AAR enzyme and a recombinant ADM enzyme or wherein said enzymes which catalyze the production of alkanes and/or alkenes comprise a recombinant AAR enzyme and a recombinant ADM enzyme.

22. The engineered microorganism of any one of claims 1-21, wherein said one or more recombinant genes encoding enzymes which catalyze the production of alkanes and/or alkenes are Synechococcus AAR and ADM genes.

23. The engineered microorganism of any one of claims 1-22, wherein said one or more recombinant genes encoding enzymes which catalyze the production of alkanes and/or alkenes are Synechococcus elongatus AAR and ADM genes.

24. The engineered microorganism of any one of claims 1-23, wherein said engineered photosynthetic microorganism comprises 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 or more recombinant genes encoding the one or more tripartite transporter protein(s).

25. The engineered microorganism of any one of claims 1-24, wherein said recombinant tripartite transporter protein is at least 90% or at least 95% identical to a recombinant tripartite transporter protein sequence shown in Table 5.

26. A cell culture comprising a culture medium and the microorganism of any one of claims 1-25.

27. A method for producing hydrocarbons, comprising:

culturing an engineered microorganism of any of claims 1-25 in a culture medium, wherein said engineered microorganism secretes increased amounts of n-alkanes or n-alkenes into the culture medium relative to an otherwise identical microorganism, cultured under identical conditions, but lacking said recombinant genes.

28. The method of claim 27, wherein said culture medium does not include a surfactant.

29. The method of any one of claims 27-28, wherein said culture medium does not include EDTA.

30. The method of any one of claims 27-29, wherein said culture medium does not include Tris buffer.

31. The method of any one of claims 27-30, wherein said engineered microorganism secretes at least 2.5-fold more n-alkanes and n-alkenes relative to an otherwise identical microorganism, cultured under identical conditions, but lacking said recombinant genes for efflux of n-alkanes or n-alkenes.

32. The method of any one of claims 27-31, wherein said engineered microorganism secretes 2-4 fold more n-alkanes and n-alkenes relative to an otherwise identical microorganism, cultured under identical conditions, but lacking said recombinant genes for efflux of n-alkanes or n-alkenes.

33. The method of any one of claims 27-32, wherein said engineered microorganism is an engineered E. coli, and wherein at least 70% of said n-alkanes or n-alkenes are secreted into the culture medium.

34. The method of any one of claims 27-33, wherein said engineered microorganism is an engineered E. coli, and wherein at least 95% of said n-alkanes or n-alkenes are secreted into the culture medium.

35. The method of any one of claims 27-34, wherein said engineered microorganism produces at least 0.1 mg/L/OD/hr of said n-alka/enes.

36. The method of any one of claims 27-35, further comprising allowing n-alkanes or n-alkenes to accumulate in the culture or in the organism.

37. The method of any one of claims 27-36, further comprising isolating at least a portion of the n-alkanes or n-alkenes from said culture.

38. The method of any one of claims 27-37, further comprising processing the isolated n-alkanes or n-alkenes to produce a processed material.

39. A composition comprising n-alkanes or n-alkenes, wherein said n-alkanes or n-alkenes is produced by the method of any one of claims 27-38.

40. The composition of claim 39, wherein the composition comprises at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 99% n-alkanes or n-alkenes.

41. A method for producing hydrocarbons, comprising:

(i) culturing an engineered microorganism of any of claims 1-25 in a culture medium; and

(ii) exposing said engineered microorganism to light and inorganic carbon, wherein said exposure results in the conversion of said carbon dioxide by said microorganism into n-alkanes, wherein said n-alkanes are secreted into said culture medium in an amount greater than that secreted by an otherwise identical microorganism, cultured under identical conditions, but lacking said recombinant genes.

42. The method of claim 41, wherein said engineered photosynthetic microorganism further produces at least one n-alkene or n-alkanol.

43. The method of any one of claims 41-42, wherein said engineered photosynthetic microorganism produces at least one n-alkene or n-alkanol selected from the group consisting of n-pentadecene, n-heptadecene, and 1-octadecanol.

44. The method of any one of claims 41-43, wherein said n-alkanes comprise predominantly n-heptadecane, n-pentadecane or a combination thereof.

45. The method of any one of claims 41-44, further comprising isolating at least one n-alkane, n-alkene or n-alkanol from said culture medium.

46. The method of any one of claims 41-45, wherein at least one of said recombinant genes is encoded on a plasmid.

47. The method of any one of claims 41-46, wherein at least one of said recombinant genes is incorporated into the genome of said engineered photosynthetic microorganism.

48. The method of any one of claims 41-47, wherein at least one of said recombinant genes is present in multiple copies in said engineered photosynthetic microorganism.

49. The method of any one of claims 41-48, wherein at least two of said recombinant genes are part of an operon, and wherein the expression of said genes is controlled by a single promoter.

50. The method of any one of claims 41-49, wherein at least 95% of said n-alkanes are n-pentadecane and n-heptadecane.

51. The method of any one of claims 41-50, wherein the expression of at least one of said recombinant genes is controlled by one or more inducible promoters.

52. The method of claim 51, wherein at least one promoter is a urea-repressible, nitrate-inducible promoter.

53. The method of any one of claims 41-52, wherein the inorganic carbon is carbon dioxide.

54. The method of any one of claims 41-53, further comprising allowing n-alkanes or n-alkenes to accumulate in the culture or in the organism.

55. The method of any one of claims 41-54, further comprising isolating at least a portion of the n-alkanes or n-alkenes from said culture.

56. The method of any one of claims 41-55, further comprising processing the isolated n-alkanes or n-alkenes to produce a processed material.

57. A composition comprising n-alkanes or n-alkenes, wherein said n-alkanes or n-alkenes is produced by the method of any one of claims 41-56.

58. The composition of claim 57, wherein the composition comprises at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 99% n-alkanes or n-alkenes.

59. A method of identifying a cyanobacterium capable of effluxing n-alkane and/or n-alkene, comprising determining that the cyanobacterium is resistant to an antibiotic applied to the cyanobacterium, wherein the antibiotic resistance is conferred by the presence of a tripartite transporter complex comprising EmrA, EmrB, and TolC, and wherein the complex is capable of mediating efflux of n-alkane and/or n-alkene.

60. The method of claim 59, wherein the antibiotic is sodium dodecyl sulfate (SDS), deoxycholate, carbonylcyanide-m-chlorophenylhydrazone (CCCP), Rhodamine 6G, nalidixic acid, tetrachlorosalicyl anilide, 2-chlorophenylhydrazine, thiolactomycin, or methylviologen.

61. A method of producing an engineered cyanobacterium capable of effluxing n-alkane and/or n-alkene, comprising contacting the cyanobacterium with an engineered outer membrane protein capable of forming a tripartite transporter complex with endogenous EmrA and EmrB, wherein the complex is capable of mediating efflux of n-alkane and/or n-alkene.

62. A method of producing an engineered cyanobacterium capable of effluxing n-alkane and/or n-alkene, comprising contacting the cyanobacterium with an engineered EmrA and/or an engineered EmrB protein capable of forming a tripartite transporter complex with one or more endogenous outer membrane proteins, wherein the complex is capable of mediating efflux of n-alkane and/or n-alkene.

63. An engineered photosynthetic microbe, wherein said engineered microbe comprises a recombinant nucleic acid or recombinant protein comprising a sequence selected from SEQ ID NO: 15-16, 19-47, or 51-52.

64. The engineered microbe of claim 63, wherein said engineered microbe is a photosynthetic microbe.

65. The engineered microbe of claim 64, wherein said engineered photosynthetic microbe is a cyanobacterium.

Description:
EXTRACELLULAR TRANSPORT OF BIOSYNTHETIC HYDROCARBONS AND OTHER MOLECULES

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application No. 61/577,243, filed December 19, 2011, the entire disclosure of which is hereby incorporated by reference in its entirety for all purposes.

[0002] This application incorporates by reference, for all purposes, the entire disclosures of U.S. Provisional Patent Application No. 61/382,917, filed September 14, 2010, U.S. Provisional Patent Application No. 61/414,877, filed November 17, 2010, U.S. Provisional Patent Application No. 61/416,713, filed November 23, 2010, U.S. Provisional Patent Application No. 61/478,045, filed April 21, 2011, U.S. Provisional Patent Application No. 61/224,463, filed July 9, 2009, U.S. Provisional Patent Application No. 61/228,937, filed July 27, 2009, U.S. utility application 12/759,657, filed April 13, 2010 (now U.S. Pat. No. 7,794,969), U.S. utility application 12/833,821, filed July 9, 2010, U.S. utility application 13/232,961, filed September 14, 2011, and U.S. utility application 13/232,945, filed September 14, 2011.

SEQUENCE LISTING

[0003] This application includes a Sequence Listing submitted electronically as a text file named X_PCT_sequencelisting.txt, created on X, with a size of X bytes. The sequence listing is incorporated by reference.

BACKGROUND

[0004] Recombinant photosynthetic microorganisms have been engineered to produce hydrocarbons, including alkanes, in amounts that exceed the levels produced naturally by the organism. A need exists for engineered photosynthetic microorganisms which have enhanced secretion capabilities such that greater amounts of the biosynthetic hydrocarbon products are excreted into the culture medium, thereby minimizing downstream processing steps.

SUMMARY

[0005] Disclosed herein are compositions and methods for increasing the amount of hydrocarbons (e.g., n-alkanes and n-alkenes) that are secreted by engineered microorganisms which have been modified to biosynthetically produce such hydrocarbons. In certain aspects, engineered microorganisms comprising recombinant enzymes for producing hydrocarbons are provided, wherein said microorganisms are further modified to secrete said hydrocarbons in greater amounts than otherwise identical hydrocarbon-producing microorganisms lacking the modifications.

[0006] Also disclosed herein is an engineered microorganism, wherein said engineered microorganism comprises (i) one or more recombinant genes encoding enzymes which catalyze the production of alkanes and/or alkenes, and (ii) one or more recombinant genes encoding one or more tripartite transporter proteins selected from the group consisting of EmrA, EmrB, AcrE, AcrF, EmrK, EmrY, MacA, MacB, MdtA, MdtB, MdtC, MdtE, MdtF, SdsR, SdsQ, and SdsP.

[0007] In some aspects, said one or more recombinant genes encoding one or more tripartite transporter proteins is EmrA and/or EmrB or a homologue thereof. In some aspects, said one or more recombinant genes encoding one or more tripartite transporter proteins is EmrA and EmrB.

[0008] In some aspects, said microorganism is a bacterium. In some aspects, said microorganism is a gram-negative bacterium. In some aspects, said microorganism is E. coli.

[0009] In some aspects, expression of an operon comprising the one or more recombinant genes encoding tripartite transporter proteins is controlled by a recombinant promoter, and wherein the promoter is constitutive or inducible. In some aspects, said operon is integrated into the genome of said microorganism. In some aspects, said operon is extrachromosomal.

[0010] In some aspects, said microorganism is a photosynthetic microorganism. In some aspects, said microorganism is a cyanobacterium. In some aspects, said microorganism is a Synechococcus species.

[0011] In some aspects, said one or more tripartite transporter proteins are selected from the group consisting of EmrA and EmrB, and wherein the native leader sequences of said proteins are replaced with leader sequences native to said photosynthetic microorganism.

[0012] In some aspects, said one or more recombinant genes encoding one or more tripartite transporter proteins are E. coli genes. In some aspects, the expression of said one or more tripartite transporter proteins is increased relative to an otherwise identical microorganism lacking said one or more tripartite transporter proteins. In some aspects, the activity of said one or more tripartite transporter proteins is increased relative to an otherwise identical microorganism lacking said one or more tripartite transporter proteins.

[0013] In some aspects, said microorganism is a photosynthetic organism, wherein said recombinant genes encoding enzymes which catalyze the production of alkanes and/or alkenes comprise recombinant AAR and recombinant ADM, and wherein said one or more recombinant genes encoding one or more tripartite transporter proteins comprise EmrA and EmrB. In some aspects, the expression of said one or more tripartite transporter proteins is driven by a T5 promoter. In some aspects, said microorganism comprises a recombinant TolC gene or homologue thereof.

[0014] In some aspects, said enzymes which catalyze the production of alkanes and/or alkenes are selected from the group consisting of a recombinant acyl-ACP reductase (AAR) enzyme and a recombinant alkanal deformylative monooxygenase (ADM) enzyme. In some aspects, said one or more recombinant genes encoding enzymes which catalyze the production of alkanes and/or alkenes are Synechococcus AAR and ADM genes. In some aspects, said one or more recombinant genes encoding enzymes which catalyze the production of alkanes and/or alkenes are Synechococcus elongatus AAR and ADM genes.

[0015] Also described herein is a method for producing hydrocarbons, comprising:

culturing an engineered microorganism disclosed herein in a culture medium, wherein said engineered microorganism secretes increased amounts of n-alkanes or n-alkenes into the culture medium relative to an otherwise identical microorganism, cultured under identical conditions, but lacking said recombinant genes.

[0016] In some aspects, said culture medium does not include a surfactant. In some aspects, said culture medium does not include EDTA. In some aspects, said culture medium does not include Tris buffer.

[0017] In some aspects, said engineered microorganism secretes at least 2.5-fold more n-alkanes and n-alkenes relative to an otherwise identical microorganism, cultured under identical conditions, but lacking said recombinant genes for efflux of n-alkanes or n-alkenes. In some aspects, said engineered microorganism secretes 2-4 fold more n-alkanes and n-alkenes relative to an otherwise identical microorganism, cultured under identical conditions, but lacking said recombinant genes for efflux of n-alkanes or n-alkenes. In some aspects, said engineered microorganism is an engineered E. coli, and wherein at least 70% of said n-alkanes or n-alkenes are secreted into the culture medium. In some aspects, said engineered microorganism is an engineered E. coli, and wherein at least 95% of said n-alkanes or n-alkenes are secreted into the culture medium. In some aspects, said engineered microorganism produces at least 0.1 mg/L/OD/hr of said n-alka/enes.
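As an illustration of how the thresholds in the preceding paragraph could be checked, the following Python sketch computes a fold increase in secretion, the fraction of product secreted, and a specific productivity in mg/L/OD/hr. The function names and every numeric input are hypothetical and chosen only for the example; they are not data from this disclosure.

```python
def fold_increase(secreted_engineered, secreted_control):
    """Secreted titer of the engineered strain divided by that of the control."""
    if secreted_control <= 0:
        raise ValueError("control titer must be positive")
    return secreted_engineered / secreted_control


def secreted_fraction(medium_mg_per_l, cell_mg_per_l):
    """Fraction of total product found in the culture medium."""
    total = medium_mg_per_l + cell_mg_per_l
    if total <= 0:
        raise ValueError("total product must be positive")
    return medium_mg_per_l / total


def specific_productivity(total_mg_per_l, od, hours):
    """Productivity normalized to optical density and time (mg/L/OD/hr)."""
    if od <= 0 or hours <= 0:
        raise ValueError("OD and hours must be positive")
    return total_mg_per_l / od / hours


# Hypothetical measurements (mg/L), for illustration only.
engineered_medium, engineered_cells = 30.0, 10.0
control_medium = 10.0

fold = fold_increase(engineered_medium, control_medium)
fraction = secreted_fraction(engineered_medium, engineered_cells)
productivity = specific_productivity(engineered_medium + engineered_cells, od=5.0, hours=24.0)

print(f"fold increase: {fold:.2f} (>= 2.5: {fold >= 2.5})")
print(f"secreted fraction: {fraction:.0%} (>= 70%: {fraction >= 0.70})")
print(f"productivity: {productivity:.3f} mg/L/OD/hr (>= 0.1: {productivity >= 0.1})")
```

With these assumed inputs, the engineered strain would meet the 2.5-fold and 70% thresholds but not the 95% threshold.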

[0018] Also described herein is a method for producing hydrocarbons, comprising:

culturing an engineered photosynthetic microorganism disclosed herein in a culture medium; and exposing said engineered photosynthetic microorganism to light and carbon dioxide, wherein said exposure results in the conversion of said carbon dioxide by said microorganism into n-alkanes, wherein said n-alkanes are secreted into said culture medium in an amount greater than that secreted by an otherwise identical microorganism, cultured under identical conditions, but lacking said recombinant genes.

[0019] In some aspects, said engineered photosynthetic microorganism further produces at least one n-alkene or n-alkanol. In some aspects, said engineered photosynthetic microorganism produces at least one n-alkene or n-alkanol selected from the group consisting of n-pentadecene, n-heptadecene, and 1-octadecanol. In some aspects, said n-alkanes comprise predominantly n-heptadecane, n-pentadecane or a combination thereof.

[0020] In some aspects, the above method further comprises isolating at least one n-alkane, n-alkene or n-alkanol from said culture medium.

[0021] In some aspects, at least one of said recombinant genes is encoded on a plasmid. In some aspects, at least one of said recombinant genes is incorporated into the genome of said engineered photosynthetic microorganism. In some aspects, at least one of said recombinant genes is present in multiple copies in said engineered photosynthetic microorganism. In some aspects, at least two of said recombinant genes are part of an operon, and wherein the expression of said genes is controlled by a single promoter.

[0022] In some aspects, at least 50 to 95% of said n-alkanes are n-pentadecane, n-heptadecane, and/or n-heptadecene. In some aspects, at least 95% of said n-alkanes are n-pentadecane and n-heptadecane.

[0023] In some aspects, the expression of at least one of said recombinant genes is controlled by one or more inducible promoters. In some aspects, at least one promoter is a urea-repressible, nitrate-inducible promoter.

[0024] Also described herein is a method of identifying a cyanobacterium capable of effluxing n-alkane and/or n-alkene, comprising determining that the cyanobacterium is resistant to an antibiotic applied to the cyanobacterium, wherein the antibiotic resistance is conferred by the presence of a tripartite transporter complex comprising EmrA, EmrB, and TolC, and wherein the complex is capable of mediating efflux of n-alkane and/or n-alkene. In some aspects, the antibiotic is SDS, deoxycholate, CCCP, Rhodamine 6G, nalidixic acid, tetrachlorosalicyl anilide, 2-chlorophenylhydrazine, thiolactomycin or methylviologen.

[0025] Also described herein is a method of producing an engineered cyanobacterium capable of effluxing n-alkane and/or n-alkene, comprising contacting the cyanobacterium with an engineered outer membrane protein capable of forming a tripartite transporter complex with endogenous EmrA and EmrB, wherein the complex is capable of mediating efflux of n-alkane and/or n-alkene.

[0026] Also described herein is a method of producing an engineered cyanobacterium capable of effluxing n-alkane and/or n-alkene, comprising contacting the cyanobacterium with an engineered EmrA and/or an engineered EmrB protein capable of forming a tripartite transporter complex with one or more endogenous outer membrane proteins, wherein the complex is capable of mediating efflux of n-alkane and/or n-alkene.

[0027] Also described herein is an engineered photosynthetic microbe comprising a recombinant nucleic acid or recombinant protein comprising a sequence selected from SEQ ID NOs shown in the sequence listing. In some aspects, said engineered microbe is a photosynthetic microbe. In some aspects, said engineered photosynthetic microbe is a cyanobacterium.

DETAILED DESCRIPTION

[0028] Unless otherwise defined herein or in the above-mentioned utility applications, e.g., U.S. Pat. App. No. 12/833,821, filed July 9, 2010, scientific and technical terms shall have the meanings that are commonly understood by those of ordinary skill in the art. Further, unless otherwise required by context, singular terms shall include the plural and plural terms shall include the singular. Generally, nomenclatures used in connection with, and techniques of, biochemistry, enzymology, molecular and cellular biology, microbiology, genetics and protein and nucleic acid chemistry and hybridization described herein are those well known and commonly used in the art.

[0029] Cyanobacteria contain not only a plasma membrane (PM) like non-photosynthetic prokaryotic hosts (as well as an outer membrane like their Gram-negative non-photosynthetic counterparts), but also, typically, an intracellular thylakoid membrane (TM) system that serves as the site for photosynthetic electron transfer and proton pumping. Given that both the plasma membrane and thylakoid membrane are typically loaded with proteins, both integral and peripheral, and, further, that a significant fraction of experimentally detected membrane proteins, both integral and peripheral, appear to be uniquely localized in each membrane, the question arises as to how differential localization of membrane proteins between the PM and TM is achieved in cyanobacteria (Rajalahti T et al. (2007) J Proteome Res 6:2420-2434). This question is of relevance to cyanobacterial metabolic engineering because certain heterologous enzymatic functions that may be desirable to engineer into said photosynthetic hosts are encoded by heterologous integral plasma membrane proteins (HIPMPs), both prokaryotic and eukaryotic in origin, that must be targeted to the plasma membrane of the cyanobacterial host in order to function as desired. The HIPMPs of interest in this respect comprise proteins that mediate transport, typically efflux, of substrates across the cyanobacterial plasma membrane. HIPMPs of particular interest with respect to the efflux of n-alkanes and n-alkenes are the integral plasma membrane subunits, EmrA and EmrB.

[0030] The methods described herein can be extended to integral membrane proteins that are not HIPMPs, i.e., proteins that are derived from membranes other than the plasma membrane. Such alternative membranes include: the thylakoid membrane, the endoplasmic reticulum membrane, the chloroplast inner membrane, and the mitochondrial inner membrane.

[0031] In one aspect, the disclosure provides methods for designing a protein comprising a pseudo-leader sequence (PLS) of defined sequence fused to the N-terminus of an HIPMP of interest, wherein the resulting chimeric protein is expressed in a cyanobacterial host cell, e.g., JCC138 (Synechococcus sp. PCC 7002) or an engineered derivative thereof. The expression of the chimeric protein will increase the amount of hydrocarbon products of interest (e.g., alkanes, alkenes, alkyl alkanoates, etc.) exported from the cyanobacterial host cell. The PLS encodes a contiguous polypeptide sub-fragment of a protein from a different thylakoid-membrane-containing cyanobacterial host, e.g., JCC160 (Synechocystis sp. PCC 6803), that localizes as uniquely as possible to the plasma membrane of that host. The mechanism that this non-JCC138 host natively employs to effect the localization of the protein to the plasma membrane (rather than the thylakoid membrane) should be conserved in order for the localization to occur in the recipient host.

[0032] While PLSs are designed to ensure, or at least bias, the targeting of HIPMPs to the plasma membrane of the heterologous cyanobacterial host, they may not always be required. This is because sufficient levels of functional HIPMP may become embedded in the plasma membrane if the cyanobacterial host does, in fact, mechanistically recognize the protein as a native plasma membrane protein, even if some fraction of the protein is targeted to the thylakoid membrane or ends up in neither membrane (e.g., as inclusion bodies).

[0033] For HIPMPs with cytoplasmic N-termini (N_in), (i) the PLS is derived from a plasma-membrane-resident protein that is naturally anchored in the membrane of a different cyanobacterial species (i.e., different than the species into which the PLS will be functionally expressed) via two transmembrane α helices, and (ii) said plasma-membrane-resident protein naturally has its N-terminus within the cytoplasm and its C-terminus within the cytoplasm (N_in/C_in), spanning the plasma membrane via an in-to-out transmembrane α helix, followed by an (ideally short) periplasmic loop sequence, followed by an out-to-in transmembrane α helix. Correspondingly, for HIPMPs with periplasmic N-termini (N_out), (i) the PLS is derived from a plasma-membrane-resident protein that is naturally anchored in the membrane of a different cyanobacterial species via one transmembrane α helix, and (ii) said plasma-membrane-resident protein naturally has its N-terminus within the cytoplasm and its C-terminus within the periplasm (N_in/C_out).

[0034] In some aspects, PLSs are derived from host proteins that have most of their mass in either the periplasmic and/or cytoplasmic spaces. In another aspect, said PLSs should contain only two α helices with N_in/C_in topology (for creating N_in HIPMPs) and only one α helix with N_in/C_out topology (for creating N_out HIPMPs). In a related aspect, the potential for intermolecular homomultimerization among the transmembrane helices of the PLSs is minimized.
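The topology rule stated above can be sketched as a small lookup: given whether the N-terminus of the HIPMP is cytoplasmic ("in") or periplasmic ("out"), return the PLS topology that would place the fusion junction on the matching side of the plasma membrane. This is a minimal Python sketch; the class and function names are hypothetical and are not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlsTopology:
    """Topology of a pseudo-leader sequence (PLS)."""
    n_terminus: str             # "in" = cytoplasm
    c_terminus: str             # "in" = cytoplasm, "out" = periplasm
    transmembrane_helices: int  # number of membrane-spanning helices


def required_pls_topology(hipmp_n_terminus: str) -> PlsTopology:
    """Return the PLS topology matching the HIPMP N-terminus location.

    The C-terminus of the PLS is fused to the N-terminus of the HIPMP,
    so both must lie on the same side of the plasma membrane.
    """
    if hipmp_n_terminus == "in":
        # Two helices: in-to-out, periplasmic loop, out-to-in (N_in/C_in).
        return PlsTopology("in", "in", 2)
    if hipmp_n_terminus == "out":
        # One helix: in-to-out (N_in/C_out).
        return PlsTopology("in", "out", 1)
    raise ValueError("hipmp_n_terminus must be 'in' or 'out'")


for side in ("in", "out"):
    print(side, required_pls_topology(side))
```

In both cases the number of helices determines the side on which the PLS ends: an even count returns the chain to the cytoplasm, an odd count leaves it in the periplasm.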

[0035] The terms "fused", "fusion" or "fusing" used herein in the context of chimeric proteins refers to the joining of one functional protein or protein subunit (e.g. , a pseudo- leader sequence) to another functional protein or protein subunit (e.g. , an integral plasma membrane protein). Fusing can occur by any method which results in the covalent attachment of the C-terminus of one such protein molecule to the N-terminus of another. For example, one skilled in the art will recognize that fusing occurs when the two proteins to be fused are encoded by a recombinant nucleic acid under control of a promoter and expressed as a single structural gene in vivo or in vitro.

[0036] As used herein, the term "non-target" refers to a protein or nucleic acid that is native to a species that is different than the species that will be used to recombinantly express the protein or nucleic acid.

[0037] Alkanes, also known as paraffins, are chemical compounds that consist only of the elements carbon (C) and hydrogen (H) (i.e., hydrocarbons), wherein these atoms are linked together exclusively by single bonds (i.e., they are saturated compounds) without any cyclic structure. n-Alkanes are linear, i.e., unbranched, alkanes.

[0038] Genes encoding AAR or ADM enzymes are referred to herein as Aar genes (aar) or Adm genes (adm), respectively. Together, AAR and ADM enzymes function to synthesize n-alkanes from acyl-ACP molecules. As used herein, an AAR enzyme refers to an enzyme with the amino acid sequence of the SYNPCC7942_1594 protein or a homolog thereof, wherein a SYNPCC7942_1594 homolog is a protein whose BLAST alignment (i) covers >90% of the length of SYNPCC7942_1594, (ii) covers >90% of the length of the matching protein, and (iii) has >50% identity with SYNPCC7942_1594 (when optimally aligned using the parameters provided herein), and retains the functional activity of SYNPCC7942_1594, i.e., the conversion of an acyl-ACP (acyl-acyl carrier protein) to an n-alkanal. An ADM enzyme refers to an enzyme with the amino acid sequence of the SYNPCC7942_1593 protein or a homolog thereof, wherein a SYNPCC7942_1593 homolog is defined as a protein whose amino acid sequence alignment (i) covers >90% of the length of SYNPCC7942_1593, (ii) covers >90% of the length of the matching protein, and (iii) has >50% identity with SYNPCC7942_1593 (when aligned using the parameters provided herein), and retains the functional activity of SYNPCC7942_1593, i.e., the conversion of an n-alkanal to an (n-X)-alkane. Exemplary AAR and ADM enzymes are listed in Table 1 and Table 2, respectively, of U.S. utility application 12/759,657, filed April 13, 2010 (now U.S. Pat. No. 7,794,969), and U.S. utility application 12/833,821, filed July 9, 2010. Other ADM activities are described in U.S. Pat. App. No. 12/620,328, filed November 17, 2009. Applicants note that in previous related applications, this enzyme was referred to as an alkanal decarboxylative monooxygenase. The protein is referred to herein as an alkanal deformylative monooxygenase or abbreviated as ADM; to be clear, it is the same protein referred to in the related applications.

[0039] In some aspects, the parameters for BLASTp are: Expectation value: 10 (default); Filter: none; Cost to open a gap: 11 (default); Cost to extend a gap: 1 (default); Maximum alignments: 100 (default); Word size: 11 (default); No. of descriptions: 100 (default); Penalty Matrix: BLOSUM62.
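The three sequence-based homolog criteria stated above (more than 90% coverage of the reference protein, more than 90% coverage of the matching protein, and more than 50% identity) can be sketched as a simple check. This is a minimal illustration, not part of the disclosure: the `Alignment` record, its field names, and the choice of alignment columns as the identity denominator are all assumptions, and the numbers in the usage note are made up. The functional-activity requirement cannot be checked from sequence and is outside the sketch.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """Summary of one pairwise protein alignment (hypothetical field names)."""
    reference_length: int    # residues in the reference, e.g. SYNPCC7942_1594
    candidate_length: int    # residues in the matching (candidate) protein
    reference_aligned: int   # reference residues covered by the alignment
    candidate_aligned: int   # candidate residues covered by the alignment
    identical: int           # identical positions within the alignment
    alignment_columns: int   # total alignment columns, gaps included


def meets_sequence_criteria(hit: Alignment) -> bool:
    """Return True if the hit passes criteria (i) to (iii) of the homolog definition."""
    reference_coverage = hit.reference_aligned / hit.reference_length
    candidate_coverage = hit.candidate_aligned / hit.candidate_length
    # Assumption: identity is computed over alignment columns.
    identity = hit.identical / hit.alignment_columns
    return reference_coverage > 0.90 and candidate_coverage > 0.90 and identity > 0.50
```

For example, a hypothetical hit `Alignment(340, 350, 330, 335, 200, 340)` passes all three thresholds, whereas the same hit against a 500-residue candidate fails criterion (ii) because only 67% of the candidate is covered.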

[0040] Functional homologs of other proteins described herein (e.g., TolC homologs, EmrA homologs, and EmrB homologs) may share significant amino acid identity (e.g., >50%) with the named proteins whose sequences are presented herein. Such homologs may be obtained from other organisms where the proteins are known to share structural and functional characteristics with the named proteins. For example, an outer membrane protein that is at least 95% identical to E. coli TolC is considered a TolC homolog. Likewise, a functional outer membrane protein that is at least 95% identical to TolC except for the replacement/addition of leader sequences, C-terminal sequences or other modifications intended to increase its functionality in a particular environment (e.g., a non-native host) is also considered a functional homolog of TolC. The same definitions apply to other protein homologs referred to herein.

[0041] EmrA, EmrB, and their homologs are members of the Major Facilitator Superfamily (MFS) of proteins (Law CJ et al. (2008). Ins and Outs of Major Facilitator Superfamily Antiporters. Ann. Rev. Microbiol. 62:289-305). The criteria used to identify EmrAB homologs are based on analysis of the sequence of the entire protein(s): (1) the EmrAB homolog is >30% identical to E. coli EmrB or EmrA; (2) the EmrAB homolog classifies, based on its amino acid sequence, as a member of the drug:proton antiporter-2 subfamily (containing 14 transmembrane alpha helices) of the MFS based on the Transporter Classification Database (TCDB 2.A.1.3); and/or (3) the gene encoding the EmrB homolog protein is found in an operon also containing a gene encoding an EmrA homolog protein that classifies, based on its amino acid sequence, as a member of the Membrane Fusion Protein (MFP) family of proteins (TCDB 8.A.1).

[0042] The methods and techniques of the present disclosure are generally performed according to conventional methods well known in the art and as described in various general and more specific references that are cited and discussed throughout the present specification unless otherwise indicated. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989); Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing Associates (1992, and Supplements to 2002); Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1990); Taylor and Drickamer, Introduction to Glycobiology, Oxford Univ. Press (2003); Worthington Enzyme Manual, Worthington Biochemical Corp., Freehold, N.J.; Handbook of Biochemistry: Section A Proteins, Vol I, CRC Press (1976); Handbook of Biochemistry: Section A Proteins, Vol II, CRC Press (1976); Essentials of Glycobiology, Cold Spring Harbor Laboratory Press (1999).

[0043] One skilled in the art will also recognize, in light of the teachings herein, that the methods and compositions described herein for use in particular organisms, e.g., cyanobacteria, are also applicable to other organisms, e.g., gram-negative bacteria such as E. coli. For example, a chimeric integral plasma membrane protein for facilitating alkane efflux in E. coli could be designed by fusing a pseudo-leader sequence derived from E. coli or a related bacterium to a heterologous integral plasma membrane protein.

[0044] The following terms, unless otherwise indicated, shall be understood to have the following meanings:

[0045] The term "polynucleotide" or "nucleic acid molecule" refers to a polymeric form of nucleotides of at least 10 bases in length. The term includes DNA molecules (e.g., cDNA or genomic or synthetic DNA) and RNA molecules (e.g., mRNA or synthetic RNA), as well as analogs of DNA or RNA containing non-natural nucleotide analogs, non-native internucleoside bonds, or both. The nucleic acid can be in any topological conformation. For instance, the nucleic acid can be single-stranded, double-stranded, triple-stranded, quadruplexed, partially double-stranded, branched, hairpinned, circular, or in a padlocked conformation.

[0046] Unless otherwise indicated, and as an example for all sequences described herein under the general format "SEQ ID NO:", "nucleic acid comprising SEQ ID NO: 1" refers to a nucleic acid, at least a portion of which has either (i) the sequence of SEQ ID NO: 1, or (ii) a sequence complementary to SEQ ID NO: 1. The choice between the two is dictated by the context. For instance, if the nucleic acid is used as a probe, the choice between the two is dictated by the requirement that the probe be complementary to the desired target.
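The "either the sequence or its complement" reading of "comprising SEQ ID NO:" can be illustrated with a short sketch. It assumes plain DNA strings over A, C, G and T, and it assumes that "a sequence complementary to" means the reverse complement written 5' to 3'; the function names and the toy sequences in the usage note are hypothetical.

```python
# Translation table for Watson-Crick complements of unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def comprises(nucleic_acid: str, reference: str) -> bool:
    """True if a portion of nucleic_acid has the reference sequence
    or (by assumption) its reverse complement."""
    target = nucleic_acid.upper()
    ref = reference.upper()
    return ref in target or reverse_complement(ref) in target
```

With the toy reference `ATGC`, the molecule `TTGCATTT` qualifies under reading (ii) because it contains `GCAT`, the reverse complement of the reference, while `AAAA` qualifies under neither reading.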

[0047] An "isolated" RNA, DNA or a mixed polymer is one which is substantially separated from other cellular components that naturally accompany the native polynucleotide in its natural host cell, e.g., ribosomes, polymerases and genomic sequences with which it is naturally associated.

[0048] As used herein, an "isolated" organic molecule (e.g., an alkane, alkene, or alkanal) is one which is substantially separated from the cellular components (membrane lipids, chromosomes, proteins) of the host cell from which it originated, or from the medium in which the host cell was cultured. The term does not require that the biomolecule has been separated from all other chemicals, although certain isolated biomolecules may be purified to near homogeneity.

[0049] The term "recombinant" refers to a biomolecule, e.g., a gene or protein, that (1) has been removed from its naturally occurring environment, (2) is not associated with all or a portion of a polynucleotide in which the gene is found in nature, (3) is operatively linked to a polynucleotide which it is not linked to in nature, or (4) does not occur in nature. The term "recombinant" can be used in reference to cloned DNA isolates, chemically synthesized polynucleotide analogs, or polynucleotide analogs that are biologically synthesized by heterologous systems, as well as proteins and/or mRNAs encoded by such nucleic acids.

[0050] As used herein, an endogenous nucleic acid sequence in the genome of an organism (or the encoded protein product of that sequence) is deemed "recombinant" herein if a heterologous sequence is placed adjacent to the endogenous nucleic acid sequence, such that the expression of this endogenous nucleic acid sequence is altered. In this context, a heterologous sequence is a sequence that is not naturally adjacent to the endogenous nucleic acid sequence, whether or not the heterologous sequence is itself endogenous (originating from the same host cell or progeny thereof) or exogenous (originating from a different host cell or progeny thereof). By way of example, a promoter sequence can be substituted (e.g., by homologous recombination) for the native promoter of a gene in the genome of a host cell, such that this gene has an altered expression pattern. This gene would now become "recombinant" because it is separated from at least some of the sequences that naturally flank it.

[0051] A nucleic acid is also considered "recombinant" if it contains any modifications that do not naturally occur to the corresponding nucleic acid in a genome. For instance, an endogenous coding sequence is considered "recombinant" if it contains an insertion, deletion or a point mutation introduced artificially, e.g., by human intervention. A "recombinant nucleic acid" also includes a nucleic acid integrated into a host cell chromosome at a heterologous site and a nucleic acid construct present as an episome.

[0052] As used herein, the phrase "degenerate variant" of a reference nucleic acid sequence encompasses nucleic acid sequences that can be translated, according to the standard genetic code, to provide an amino acid sequence identical to that translated from the reference nucleic acid sequence. The term "degenerate oligonucleotide" or "degenerate primer" is used to signify an oligonucleotide capable of hybridizing with target nucleic acid sequences that are not necessarily identical in sequence but that are homologous to one another within one or more particular segments.

[0053] The term "percent sequence identity" or "identical" in the context of nucleic acid sequences refers to the residues in the two sequences which are the same when aligned for maximum correspondence. The length of sequence identity comparison may be over a stretch of at least about nine nucleotides, usually at least about 20 nucleotides, more usually at least about 24 nucleotides, typically at least about 28 nucleotides, more typically at least about 32 nucleotides, and in some instances at least about 36 or more nucleotides. There are a number of different algorithms known in the art which can be used to measure nucleotide sequence identity. For instance, polynucleotide sequences can be compared using FASTA, Gap or Bestfit, which are programs in Wisconsin Package Version 10.0, Genetics Computer Group (GCG), Madison, Wis. FASTA provides alignments and percent sequence identity of the regions of the best overlap between the query and search sequences. Pearson, Methods Enzymol. 183:63-98 (1990) (hereby incorporated by reference in its entirety). For instance, percent sequence identity between nucleic acid sequences can be determined using FASTA with its default parameters (a word size of 6 and the NOPAM factor for the scoring matrix) or using Gap with its default parameters as provided in GCG Version 6.1, herein incorporated by reference. Alternatively, sequences can be compared using the computer program BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990); Gish and States, Nature Genet. 3:266-272 (1993); Madden et al., Meth. Enzymol. 266:131-141 (1996); Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997); Zhang and Madden, Genome Res. 7:649-656 (1997)), especially blastp or tblastn (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)).
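To make the notion of percent sequence identity concrete, the sketch below counts matching residues in two sequences that have already been aligned. It is not FASTA, Gap, Bestfit or BLAST: it assumes the alignment was produced elsewhere, that gaps are written as `-`, and that the denominator is the number of alignment columns (real programs differ in how they choose the denominator). The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are '-'. Columns where both sequences have a gap are ignored;
    every other column counts toward the denominator (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # empty column, carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment has no informative columns")
    return 100.0 * matches / columns
```

For the toy alignment `ACGT-ACGT` against `ACGTTACGA`, seven of nine columns match, so the function returns roughly 77.8.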

[0054] The term "substantial homology" or "substantial similarity," when referring to a nucleic acid or fragment thereof, indicates that, when optimally aligned with appropriate nucleotide insertions or deletions with another nucleic acid (or its complementary strand), there is nucleotide sequence identity in at least about 76%, 80%, 85%, preferably at least about 90%, and more preferably at least about 95%, 96%, 97%, 98% or 99% of the nucleotide bases, as measured by any well-known algorithm of sequence identity, such as FASTA, BLAST or Gap, as discussed above.

[0055] Alternatively, substantial homology or similarity exists when a nucleic acid or fragment thereof hybridizes to another nucleic acid, to a strand of another nucleic acid, or to the complementary strand thereof, under stringent hybridization conditions. "Stringent hybridization conditions" and "stringent wash conditions" in the context of nucleic acid hybridization experiments depend upon a number of different physical parameters. Nucleic acid hybridization will be affected by such conditions as salt concentration, temperature, solvents, the base composition of the hybridizing species, length of the complementary regions, and the number of nucleotide base mismatches between the hybridizing nucleic acids, as will be readily appreciated by those skilled in the art. One having ordinary skill in the art knows how to vary these parameters to achieve a particular stringency of hybridization.

[0056] In general, "stringent hybridization" is performed at about 25°C below the thermal melting point (T_m) for the specific DNA hybrid under a particular set of conditions. "Stringent washing" is performed at temperatures about 5°C lower than the T_m for the specific DNA hybrid under a particular set of conditions. The T_m is the temperature at which 50% of the target sequence hybridizes to a perfectly matched probe. See Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989), page 9.51, hereby incorporated by reference. For purposes herein, "stringent conditions" are defined for solution phase hybridization as aqueous hybridization (i.e., free of formamide) in 6xSSC (where 20xSSC contains 3.0 M NaCl and 0.3 M sodium citrate), 1% SDS at 65°C for 8-12 hours, followed by two washes in 0.2xSSC, 0.1% SDS at 65°C for 20 minutes. It will be appreciated by the skilled worker that hybridization at 65°C will occur at different rates depending on a number of factors including the length and percent identity of the sequences which are hybridizing.

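The arithmetic behind the stringent conditions defined in paragraph [0056] can be worked through in a few lines. The 20xSSC composition and the 25°C and 5°C offsets from T_m are taken from that paragraph; the function names and the example T_m of 85°C are hypothetical and serve only as an illustration.

```python
# Composition of 20xSSC in mol/L, as stated in the text.
SSC_20X = {"NaCl": 3.0, "sodium citrate": 0.3}


def ssc_molarity(strength: float) -> dict:
    """Molar concentrations in an SSC buffer of the given strength (e.g. 6 for 6xSSC)."""
    return {solute: molar * strength / 20.0 for solute, molar in SSC_20X.items()}


def stringent_temperatures(tm_celsius: float) -> dict:
    """General-rule temperatures: hybridize about 25 C below T_m, wash about 5 C below T_m."""
    return {"hybridization": tm_celsius - 25.0, "wash": tm_celsius - 5.0}
```

Under these definitions, 6xSSC works out to about 0.9 M NaCl and 0.09 M sodium citrate, and 0.2xSSC to about 0.03 M NaCl and 0.003 M sodium citrate. For a hybrid with an assumed T_m of 85°C, the general rule gives hybridization near 60°C and washing near 80°C.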
[0057] The nucleic acids (also referred to as polynucleotides) of the present disclosure may include both sense and antisense strands of RNA, cDNA, genomic DNA, and synthetic forms and mixed polymers of the above. They may be modified chemically or biochemically or may contain non-natural or derivatized nucleotide bases, as will be readily appreciated by those of skill in the art. Such modifications include, for example, labels, methylation, substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide modifications such as uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, etc.), charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), pendent moieties (e.g., polypeptides), intercalators (e.g., acridine, psoralen, etc.), chelators, alkylators, and modified linkages (e.g., alpha anomeric nucleic acids, etc.). Also included are synthetic molecules that mimic polynucleotides in their ability to bind to a designated sequence via hydrogen bonding and other chemical interactions. Such molecules are known in the art and include, for example, those in which peptide linkages substitute for phosphate linkages in the backbone of the molecule. Other modifications can include, for example, analogs in which the ribose ring contains a bridging moiety or other structure such as the modifications found in "locked" nucleic acids.

[0058] The term "mutated" when applied to nucleic acid sequences means that nucleotides in a nucleic acid sequence may be inserted, deleted or changed compared to a reference nucleic acid sequence. A single alteration may be made at a locus (a point mutation) or multiple nucleotides may be inserted, deleted or changed at a single locus. In addition, one or more alterations may be made at any number of loci within a nucleic acid sequence. A nucleic acid sequence may be mutated by any method known in the art including but not limited to mutagenesis techniques such as "error-prone PCR" (a process for performing PCR under conditions where the copying fidelity of the DNA polymerase is low, such that a high rate of point mutations is obtained along the entire length of the PCR product; see, e.g., Leung et al., Technique, 1:11-15 (1989) and Caldwell and Joyce, PCR Methods Applic. 2:28-33 (1992)); and "oligonucleotide-directed mutagenesis" (a process which enables the generation of site-specific mutations in any cloned DNA segment of interest; see, e.g., Reidhaar-Olson and Sauer, Science 241:53-57 (1988)).

[0059] The term "attenuate" as used herein generally refers to a functional deletion, including a mutation, partial or complete deletion, insertion, or other variation made to a gene sequence or a sequence controlling the transcription of a gene sequence, which reduces or inhibits production of the gene product, or renders the gene product non-functional. In some instances a functional deletion is described as a knockout mutation. Attenuation also includes amino acid sequence changes by altering the nucleic acid sequence, placing the gene under the control of a less active promoter, down-regulation, expressing interfering RNA, ribozymes or antisense sequences that target the gene of interest, or through any other technique known in the art. In one example, the sensitivity of a particular enzyme to feedback inhibition or inhibition caused by a composition that is not a product or a reactant (non-pathway specific feedback) is lessened such that the enzyme activity is not impacted by the presence of a compound. In other instances, an enzyme that has been altered to be less active can be referred to as attenuated.

[0060] The term "deletion" refers to the removal of one or more nucleotides from a nucleic acid molecule or one or more amino acids from a protein, the regions on either side being joined together.

[0061] The term "knock out" refers to a gene whose level of expression or activity has been reduced to zero. In some examples, a gene is knocked-out via deletion of some or all of its coding sequence. In other examples, a gene is knocked-out via introduction of one or more nucleotides into its open reading frame, which results in translation of a non-sense or otherwise non-functional protein product.

[0062] The term "vector" as used herein is intended to refer to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a "plasmid," which generally refers to a circular double stranded DNA loop into which additional DNA segments may be ligated, but also includes linear double-stranded molecules such as those resulting from amplification by the polymerase chain reaction (PCR) or from treatment of a circular plasmid with a restriction enzyme. Other vectors include cosmids, bacterial artificial chromosomes (BAC) and yeast artificial chromosomes (YAC). Another type of vector is a viral vector, wherein additional DNA segments may be ligated into the viral genome (discussed in more detail below). Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., vectors having an origin of replication which functions in the host cell). Other vectors can be integrated into the genome of a host cell upon introduction into the host cell, and are thereby replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as "recombinant expression vectors" (or simply "expression vectors").

[0063] "Operatively linked" or "operably linked" expression control sequences refers to a linkage in which the expression control sequence is contiguous with the gene of interest to control the gene of interest, as well as expression control sequences that act in trans or at a distance to control the gene of interest.

[0064] The term "expression control sequence" as used herein refers to polynucleotide sequences which are necessary to affect the expression of coding sequences to which they are operatively linked. Expression control sequences are sequences which control the transcription, post-transcriptional events and translation of nucleic acid sequences. Expression control sequences include appropriate transcription initiation, termination, promoter and enhancer sequences; efficient RNA processing signals such as splicing and polyadenylation signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (e.g., ribosome binding sites); sequences that enhance protein stability; and when desired, sequences that enhance protein secretion. The nature of such control sequences differs depending upon the host organism; in prokaryotes, such control sequences generally include promoter, ribosomal binding site, and transcription termination sequence. The term "control sequences" is intended to include, at a minimum, all components whose presence is essential for expression, and can also include additional components whose presence is advantageous, for example, leader sequences and fusion partner sequences.

[0065] The term "recombinant host cell" (or simply "host cell"), as used herein, is intended to refer to a cell into which a recombinant vector has been introduced. It should be understood that such terms are intended to refer not only to the particular subject cell but to the progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term "host cell" as used herein. A recombinant host cell may be an isolated cell or cell line grown in culture or may be a cell which resides in a living tissue or organism.

[0066] The term "peptide" as used herein refers to a short polypeptide, e.g., one that is typically less than about 50 amino acids long and more typically less than about 30 amino acids long. The term as used herein encompasses analogs and mimetics that mimic structural and thus biological function.

[0067] The term "polypeptide" encompasses both naturally-occurring and non-naturally- occurring proteins, and fragments, mutants, derivatives and analogs thereof. A polypeptide may be monomeric or polymeric. Further, a polypeptide may comprise a number of different domains each of which has one or more distinct activities.

[0068] The term "isolated protein" or "isolated polypeptide" is a protein or polypeptide that by virtue of its origin or source of derivation (1) is not associated with naturally associated components that accompany it in its native state, (2) exists in a purity not found in nature, where purity can be adjudged with respect to the presence of other cellular material (e.g., is free of other proteins from the same species), (3) is expressed by a cell from a different species, or (4) does not occur in nature (e.g., it is a fragment of a polypeptide found in nature or it includes amino acid analogs or derivatives not found in nature or linkages other than standard peptide bonds). Thus, a polypeptide that is chemically synthesized or synthesized in a cellular system different from the cell from which it naturally originates will be "isolated" from its naturally associated components. A polypeptide or protein may also be rendered substantially free of naturally associated components by isolation, using protein purification techniques well known in the art. As thus defined, "isolated" does not necessarily require that the protein, polypeptide, peptide or oligopeptide so described has been physically removed from its native environment.

[0069] The term "polypeptide fragment" as used herein refers to a polypeptide that has a deletion, e.g., an amino-terminal and/or carboxy-terminal deletion compared to a full-length polypeptide. In one aspect, the polypeptide fragment is a contiguous sequence in which the amino acid sequence of the fragment is identical to the corresponding positions in the naturally-occurring sequence. Fragments typically are at least 5, 6, 7, 8, 9 or 10 amino acids long, preferably at least 12, 14, 16 or 18 amino acids long, more preferably at least 20 amino acids long, more preferably at least 25, 30, 35, 40 or 45 amino acids, even more preferably at least 50 or 60 amino acids long, and even more preferably at least 70 amino acids long.

[0070] A "modified derivative" refers to polypeptides or fragments thereof that are substantially homologous in primary structural sequence but which include, e.g., in vivo or in vitro chemical and biochemical modifications or which incorporate amino acids that are not found in the native polypeptide. Such modifications include, for example, acetylation, carboxylation, phosphorylation, glycosylation, ubiquitination, labeling, e.g., with radionuclides, and various enzymatic modifications, as will be readily appreciated by those skilled in the art. A variety of methods for labeling polypeptides and of substituents or labels useful for such purposes are well known in the art, and include radioactive isotopes such as ¹²⁵I, ³²P, ³⁵S, and ³H, ligands which bind to labeled antiligands (e.g., antibodies), fluorophores, chemiluminescent agents, enzymes, and antiligands which can serve as specific binding pair members for a labeled ligand. The choice of label depends on the sensitivity required, ease of conjugation with the primer, stability requirements, and available instrumentation. Methods for labeling polypeptides are well known in the art. See, e.g., Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing Associates (1992, and Supplements to 2002) (hereby incorporated by reference).

[0071] The term "fusion protein" refers to a polypeptide comprising a polypeptide or fragment coupled to heterologous amino acid sequences. Fusion proteins are useful because they can be constructed to contain two or more desired functional elements from two or more different proteins. A fusion protein comprises at least 10 contiguous amino acids from a polypeptide of interest, more preferably at least 20 or 30 amino acids, even more preferably at least 40, 50 or 60 amino acids, yet more preferably at least 75, 100 or 125 amino acids. Fusions that include the entirety of the proteins of the present disclosure have particular utility. The heterologous polypeptide included within the fusion protein of the present disclosure is at least 6 amino acids in length, often at least 8 amino acids in length, and usefully at least 15, 20, and 25 amino acids in length. Fusions that include larger polypeptides, such as an IgG Fc region, and even entire proteins, such as the green fluorescent protein ("GFP") chromophore-containing proteins, have particular utility. Fusion proteins can be produced recombinantly by constructing a nucleic acid sequence which encodes the polypeptide or a fragment thereof in frame with a nucleic acid sequence encoding a different protein or peptide and then expressing the fusion protein. Alternatively, a fusion protein can be produced chemically by crosslinking the polypeptide or a fragment thereof to another protein.

[0072] As used herein, the term "antibody" refers to a polypeptide, at least a portion of which is encoded by at least one immunoglobulin gene, or fragment thereof, and that can bind specifically to a desired target molecule. The term includes naturally-occurring forms, as well as fragments and derivatives.

[0073] Fragments within the scope of the term "antibody" include those produced by digestion with various proteases, those produced by chemical cleavage and/or chemical dissociation and those produced recombinantly, so long as the fragment remains capable of specific binding to a target molecule. Among such fragments are Fab, Fab', Fv, F(ab')2, and single chain Fv (scFv) fragments.

[0074] Derivatives within the scope of the term include antibodies (or fragments thereof) that have been modified in sequence, but remain capable of specific binding to a target molecule, including: interspecies chimeric and humanized antibodies; antibody fusions; heteromeric antibody complexes and antibody fusions, such as diabodies (bispecific antibodies), single-chain diabodies, and intrabodies (see, e.g., Intracellular Antibodies: Research and Disease Applications, (Marasco, ed., Springer-Verlag New York, Inc., 1998), the disclosure of which is incorporated herein by reference in its entirety).

[0075] As used herein, antibodies can be produced by any known technique, including harvest from cell culture of native B lymphocytes, harvest from culture of hybridomas, recombinant expression systems and phage display.

[0076] The term "non-peptide analog" refers to a compound with properties that are analogous to those of a reference polypeptide. A non-peptide compound may also be termed a "peptide mimetic" or a "peptidomimetic." See, e.g., Jones, Amino Acid and Peptide Synthesis, Oxford University Press (1992); Jung, Combinatorial Peptide and Nonpeptide Libraries: A Handbook, John Wiley (1997); Bodanszky et al., Peptide Chemistry: A Practical Textbook, Springer Verlag (1993); Synthetic Peptides: A Users Guide, (Grant, ed., W. H. Freeman and Co., 1992); Evans et al., J. Med. Chem. 30:1229 (1987); Fauchere, J. Adv. Drug Res. 15:29 (1986); Veber and Freidinger, Trends Neurosci., 8:392-396 (1985); and references cited in each of the above, which are incorporated herein by reference. Such compounds are often developed with the aid of computerized molecular modeling. Peptide mimetics that are structurally similar to useful peptides of the present disclosure may be used to produce an equivalent effect and are therefore envisioned to be part of the present disclosure.

[0077] A "polypeptide mutant" or "mutein" refers to a polypeptide whose sequence contains an insertion, duplication, deletion, rearrangement or substitution of one or more amino acids compared to the amino acid sequence of a native or wild-type protein. A mutein may have one or more amino acid point substitutions, in which a single amino acid at a position has been changed to another amino acid, one or more insertions and/or deletions, in which one or more amino acids are inserted or deleted, respectively, in the sequence of the naturally-occurring protein, and/or truncations of the amino acid sequence at either or both the amino or carboxy termini. A mutein may have the same but preferably has a different biological activity compared to the naturally-occurring protein.

[0078] A mutein has at least 85% overall sequence homology to its wild-type counterpart. Even more preferred are muteins having at least 90% overall sequence homology to the wild- type protein.

[0079] In an even more preferred aspect, a mutein exhibits at least 95% sequence identity, even more preferably 98%, even more preferably 99% and even more preferably 99.9% overall sequence identity.

[0080] Sequence homology may be measured by any common sequence analysis algorithm, such as Gap or Bestfit.

[0081] Amino acid substitutions can include those which: (1) reduce susceptibility to proteolysis, (2) reduce susceptibility to oxidation, (3) alter binding affinity for forming protein complexes, (4) alter binding affinity or enzymatic activity, and (5) confer or modify other physicochemical or functional properties of such analogs.

[0082] As used herein, the twenty conventional amino acids and their abbreviations follow conventional usage. See Immunology - A Synthesis (Golub and Gren eds., Sinauer Associates, Sunderland, Mass., 2nd ed. 1991), which is incorporated herein by reference. Stereoisomers (e.g., D-amino acids) of the twenty conventional amino acids, unnatural amino acids such as α,α-disubstituted amino acids, N-alkyl amino acids, and other unconventional amino acids may also be suitable components for polypeptides of the present disclosure. Examples of unconventional amino acids include: 4-hydroxyproline, γ-carboxyglutamate, ε-N,N,N-trimethyllysine, ε-N-acetyllysine, O-phosphoserine, N-acetylserine, N-formylmethionine, 3-methylhistidine, 5-hydroxylysine, N-methylarginine, and other similar amino acids and imino acids (e.g., 4-hydroxyproline). In the polypeptide notation used herein, the left-hand end corresponds to the amino terminal end and the right-hand end corresponds to the carboxy-terminal end, in accordance with standard usage and convention.

[0083] A protein has "homology" or is "homologous" to a second protein if the nucleic acid sequence that encodes the protein has a similar sequence to the nucleic acid sequence that encodes the second protein. Alternatively, a protein has homology to a second protein if the two proteins have "similar" amino acid sequences. (Thus, the term "homologous proteins" is defined to mean that the two proteins have similar amino acid sequences.) As used herein, homology between two regions of amino acid sequence (especially with respect to predicted structural similarities) is interpreted as implying similarity in function.

[0084] When "homologous" is used in reference to proteins or peptides, it is recognized that residue positions that are not identical often differ by conservative amino acid substitutions. A "conservative amino acid substitution" is one in which an amino acid residue is substituted by another amino acid residue having a side chain (R group) with similar chemical properties (e.g., charge or hydrophobicity). In general, a conservative amino acid substitution will not substantially change the functional properties of a protein. In cases where two or more amino acid sequences differ from each other by conservative substitutions, the percent sequence identity or degree of homology may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well known to those of skill in the art. See, e.g., Pearson, 1994, Methods Mol. Biol. 24:307-31 and 25:365-89 (herein incorporated by reference).

[0085] The following six groups each contain amino acids that are conservative substitutions for one another: 1) Serine (S), Threonine (T); 2) Aspartic Acid (D), Glutamic Acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Alanine (A), Valine (V), and 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
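
As an illustration only, the six groups listed above can be written as a small lookup. The following Python sketch is not part of the disclosed methods; the names CONSERVATIVE_GROUPS and is_conservative are hypothetical and are introduced here solely for the example, which assumes one-letter residue codes.

```python
# Six conservative-substitution groups as listed in paragraph [0085],
# written with one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    set("ST"),     # 1) Serine, Threonine
    set("DE"),     # 2) Aspartic Acid, Glutamic Acid
    set("NQ"),     # 3) Asparagine, Glutamine
    set("RK"),     # 4) Arginine, Lysine
    set("ILMAV"),  # 5) Isoleucine, Leucine, Methionine, Alanine, Valine
    set("FYW"),    # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative(a: str, b: str) -> bool:
    """Return True if residues a and b differ and share one of the six groups.

    Hypothetical helper for illustration; identical residues are treated
    as "no substitution" and therefore return False.
    """
    a, b = a.upper(), b.upper()
    if a == b:
        return False
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

Under this sketch, a serine-to-threonine change is reported as conservative, whereas a serine-to-aspartic-acid change is not. Residues that appear in none of the six groups (e.g., glycine) are never reported as conservative.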

[0086] Sequence homology for polypeptides, which is also referred to as percent sequence identity, is typically measured using sequence analysis software. See, e.g., the Sequence Analysis Software Package of the Genetics Computer Group (GCG), University of Wisconsin Biotechnology Center, 910 University Avenue, Madison, Wis. 53705. Protein analysis software matches similar sequences using a measure of homology assigned to various substitutions, deletions and other modifications, including conservative amino acid substitutions. For instance, GCG contains programs such as "Gap" and "Bestfit" which can be used with default parameters to determine sequence homology or sequence identity between closely related polypeptides, such as homologous polypeptides from different species of organisms or between a wild-type protein and a mutein thereof. See, e.g., GCG Version 6.1.

[0087] An algorithm that can be used when comparing a particular polypeptide sequence to a database containing a large number of sequences from different organisms is the computer program BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990); Gish and States, Nature Genet. 3:266-272 (1993); Madden et al., Meth. Enzymol. 266:131-141 (1996); Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997); Zhang and Madden, Genome Res. 7:649-656 (1997)), especially blastp or tblastn (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)).

[0088] The length of polypeptide sequences compared for homology will generally be at least about 16 amino acid residues, usually at least about 20 residues, more usually at least about 24 residues, typically at least about 28 residues, and preferably more than about 35 residues. When searching a database containing sequences from a large number of different organisms, it can be preferable to compare amino acid sequences. Database searching using amino acid sequences can be measured by algorithms other than blastp known in the art. For instance, polypeptide sequences can be compared using FASTA, a program in GCG Version 6.1. FASTA provides alignments and percent sequence identity of the regions of the best overlap between the query and search sequences. Pearson, Methods Enzymol. 183:63-98 (1990) (incorporated by reference herein). For example, percent sequence identity between amino acid sequences can be determined using FASTA with its default parameters (a word size of 2 and the PAM250 scoring matrix), as provided in GCG Version 6.1, herein incorporated by reference.
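
By way of a simplified illustration, percent sequence identity over an alignment that has already been computed can be sketched as follows. This Python example is an assumption-laden sketch and does not reproduce the scoring performed by Gap, Bestfit, FASTA or BLAST: it assumes the two sequences are supplied pre-aligned with "-" as the gap character, and it assumes that columns containing a gap are excluded from the denominator (other conventions exist). The function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumptions (not taken from the specification): the inputs are already
    aligned to equal length, and gapped columns are skipped entirely.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # compared columns where the residues are identical
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

For example, aligning "MKTA-IAK" against "MKTAYLAK" leaves seven ungapped columns, six of which match, so the sketch returns six sevenths expressed as a percentage. A value computed this way could then be compared against thresholds such as the 85%, 90% or 95% figures recited above.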

[0089] "Specific binding" refers to the ability of two molecules to bind to each other in preference to binding to other molecules in the environment. Typically, "specific binding" discriminates over adventitious binding in a reaction by at least two-fold, more typically by at least 10-fold, often at least 100-fold. Typically, the affinity or avidity of a specific binding reaction, as quantified by a dissociation constant, is about 10⁻⁷ M or stronger (e.g., about 10⁻⁸ M, 10⁻⁹ M or even stronger).

[0090] "Percent dry cell weight" refers to a measurement of hydrocarbon production obtained as follows: a defined volume of culture is centrifuged to pellet the cells. Cells are washed then dewetted by at least one cycle of microcentrifugation and aspiration. Cell pellets are lyophilized overnight, and the tube containing the dry cell mass is weighed again such that the mass of the cell pellet can be calculated within ±0.1 mg. At the same time cells are processed for dry cell weight determination, a second sample of the culture in question is harvested, washed, and dewetted. The resulting cell pellet, corresponding to 1-3 mg of dry cell weight, is then extracted by vortexing in approximately 1 ml acetone plus butylated hydroxytoluene (BHT) as antioxidant and an internal standard, e.g., n-eicosane. Cell debris is then pelleted by centrifugation and the supernatant (extractant) is taken for analysis by GC. For accurate quantitation of n-alkanes, flame ionization detection (FID) is used as opposed to MS total ion count. n-Alkane concentrations in the biological extracts are calculated using calibration relationships between GC-FID peak area and known concentrations of authentic n-alkane standards. Knowing the volume of the extractant, the resulting concentrations of the n-alkane species in the extractant, and the dry cell weight of the cell pellet extracted, the percentage of dry cell weight that comprised n-alkanes can be determined.
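
The final arithmetic step of the above definition can be sketched as follows. This Python example is illustrative only: the function name percent_dcw_alkanes and the numerical values in the usage note are hypothetical, and the sketch assumes that the n-alkane concentrations have already been derived from the GC-FID calibration, with any internal-standard correction applied upstream.

```python
def percent_dcw_alkanes(concentrations_mg_per_ml, extractant_volume_ml,
                        dry_cell_weight_mg):
    """Percentage of dry cell weight accounted for by n-alkanes.

    concentrations_mg_per_ml: concentration of each n-alkane species in
        the extractant (mg/ml), from the GC-FID calibration.
    extractant_volume_ml: volume of the extractant (ml).
    dry_cell_weight_mg: dry cell weight of the extracted pellet (mg),
        as determined from the parallel dry-cell-weight sample.
    """
    if dry_cell_weight_mg <= 0:
        raise ValueError("dry cell weight must be positive")
    # Total n-alkane mass recovered in the extractant (mg).
    total_alkane_mg = sum(concentrations_mg_per_ml) * extractant_volume_ml
    return 100.0 * total_alkane_mg / dry_cell_weight_mg
```

As a hypothetical usage example, two n-alkane species measured at 0.010 mg/ml and 0.040 mg/ml in 1 ml of extractant, from a pellet corresponding to 2 mg of dry cell weight, work out to 0.05 mg of n-alkanes, i.e., 2.5% of dry cell weight.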

[0091] The term "region" as used herein refers to a physically contiguous portion of the primary structure of a biomolecule. In the case of proteins, a region is defined by a contiguous portion of the amino acid sequence of that protein.

[0092] The term "domain" as used herein refers to a structure of a biomolecule that contributes to a known or suspected function of the biomolecule. Domains may be coextensive with regions or portions thereof; domains may also include distinct, non-contiguous regions of a biomolecule. Examples of protein domains include, but are not limited to, an Ig domain, an extracellular domain, a transmembrane domain, and a cytoplasmic domain.

[0093] As used herein, the term "molecule" means any compound, including, but not limited to, a small molecule, peptide, protein, sugar, nucleotide, nucleic acid, lipid, etc., and such a compound can be natural or synthetic.

[0094] "Carbon-based Products of Interest" include alcohols such as ethanol, propanol, isopropanol, butanol, fatty alcohols, fatty acid esters, wax esters; hydrocarbons and alkanes such as propane, octane, diesel, Jet Propellant 8 (JP8); polymers such as terephthalate, 1,3-propanediol, 1,4-butanediol, polyols, Polyhydroxyalkanoates (PHA), poly-beta-hydroxybutyrate (PHB), acrylate, adipic acid, ε-caprolactone, isoprene, caprolactam, rubber; commodity chemicals such as lactate, docosahexaenoic acid (DHA), 3-hydroxypropionate, γ-valerolactone, lysine, serine, aspartate, aspartic acid, sorbitol, ascorbate, ascorbic acid, isopentenol, lanosterol, omega-3 DHA, lycopene, itaconate, 1,3-butadiene, ethylene, propylene, succinate, citrate, citric acid, glutamate, malate, 3-hydroxypropionic acid (HPA), lactic acid, THF, gamma butyrolactone, pyrrolidones, hydroxybutyrate, glutamic acid, levulinic acid, acrylic acid, malonic acid; specialty chemicals such as carotenoids, isoprenoids, itaconic acid; pharmaceuticals and pharmaceutical intermediates such as 7-aminodeacetoxycephalosporanic acid (7-ADCA)/cephalosporin, erythromycin, polyketides, statins, paclitaxel, docetaxel, terpenes, peptides, steroids, omega fatty acids and other such suitable products of interest. Such products are useful in the context of biofuels, industrial and specialty chemicals, as intermediates used to make additional products, such as nutritional supplements, nutraceuticals, polymers, paraffin replacements, personal care products and pharmaceuticals.

[0095] Biofuel: A biofuel refers to any fuel that derives from a biological source. Biofuel can refer to one or more hydrocarbons, one or more alcohols, one or more fatty esters or a mixture thereof.

[0096] Hydrocarbon: The term generally refers to a chemical compound that consists of the elements carbon (C), hydrogen (H) and optionally oxygen (O). There are essentially three types of hydrocarbons, e.g., aromatic hydrocarbons, saturated hydrocarbons and unsaturated hydrocarbons such as alkenes, alkynes, and dienes. The term also includes fuels, biofuels, plastics, waxes, solvents and oils. Hydrocarbons encompass biofuels, as well as plastics, waxes, solvents and oils.

[0097] Throughout this specification and claims, the word "comprise" or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.

[0098] In another aspect, the nucleic acid molecule of the present disclosure encodes a polypeptide having the amino acid sequence of any of the protein sequences provided in the SEQ ID NOs of the sequence listing. In some aspects, the nucleic acid molecule of the present disclosure encodes a polypeptide sequence of at least 50%, 60%, 70%, 80%, 85%, 90% or 95% identity to one of the protein sequences shown in the SEQ ID NOs in the sequence listing and the identity can even more preferably be 96%, 97%, 98%, 99%, 99.9% or even higher.

[0099] In some aspects, a nucleic acid molecule has at least 50%, 60%, 70%, 80%, 85%, 90% or 95% identity to one of the sequences shown in SEQ ID NOs 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 21, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, or 58 and the identity can even more preferably be 96%, 97%, 98%, 99%, 99.9% or even higher.

[0100] In some aspects, a nucleic acid molecule of the present disclosure encodes a polypeptide sequence of at least 50%, 60%, 70%, 80%, 85%, 90% or 95% identity to one of the protein sequences shown in SEQ ID NOs 19, 20, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, or 47 in the sequence listing and the identity can even more preferably be 96%, 97%, 98%, 99%, 99.9% or even higher.

[0101] The present disclosure also provides nucleic acid molecules that hybridize under stringent conditions to the above-described nucleic acid molecules. As defined above, and as is well known in the art, stringent hybridizations are performed at about 25°C below the thermal melting point (Tm) for the specific DNA hybrid under a particular set of conditions, where the Tm is the temperature at which 50% of the target sequence hybridizes to a perfectly matched probe. Stringent washing is performed at temperatures about 5°C lower than the Tm for the specific DNA hybrid under a particular set of conditions.
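
The two temperature offsets recited above can be expressed as a short calculation. The following Python sketch is illustrative only; it assumes the thermal melting point of the hybrid has already been determined for the conditions in question and is supplied as an input, and the function name stringent_temperatures is hypothetical.

```python
def stringent_temperatures(tm_celsius: float) -> dict:
    """Approximate stringent hybridization and wash temperatures.

    Applies the offsets recited in paragraph [0101]: hybridization at
    about 25 degrees C below the thermal melting point, and washing at
    about 5 degrees C below it. The melting point itself is an input;
    this sketch does not estimate it from sequence or salt conditions.
    """
    return {
        "hybridization_c": tm_celsius - 25.0,
        "wash_c": tm_celsius - 5.0,
    }
```

For a hybrid with a hypothetical melting point of 70°C, the sketch gives a hybridization temperature of about 45°C and a wash temperature of about 65°C.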

[0102] Nucleic acid molecules comprising a fragment of any one of the above-described nucleic acid sequences are also provided. These fragments preferably contain at least 20 contiguous nucleotides. More preferably the fragments of the nucleic acid sequences contain at least 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 or even more contiguous nucleotides.

[0103] The nucleic acid sequence fragments of the present disclosure display utility in a variety of systems and methods. For example, the fragments may be used as probes in various hybridization techniques. Depending on the method, the target nucleic acid sequences may be either DNA or RNA. The target nucleic acid sequences may be fractionated (e.g., by gel electrophoresis) prior to the hybridization, or the hybridization may be performed on samples in situ. One of skill in the art will appreciate that nucleic acid probes of known sequence find utility in determining chromosomal structure (e.g., by Southern blotting) and in measuring gene expression (e.g., by Northern blotting). In such experiments, the sequence fragments are preferably detectably labeled, so that their specific hybridization to target sequences can be detected and optionally quantified. One of skill in the art will appreciate that the nucleic acid fragments of the present disclosure may be used in a wide variety of blotting techniques not specifically described herein.

[0104] It should also be appreciated that the nucleic acid sequence fragments disclosed herein also find utility as probes when immobilized on microarrays. Methods for creating microarrays by deposition and fixation of nucleic acids onto support substrates are well known in the art. Such methods are reviewed in DNA Microarrays: A Practical Approach (Practical Approach Series), Schena (ed.), Oxford University Press (1999) (ISBN: 0199637768); Nature Genet. 21(1)(suppl):1-60 (1999); Microarray Biochip: Tools and Technology, Schena (ed.), Eaton Publishing Company/BioTechniques Books Division (2000) (ISBN: 1881299376), the disclosures of which are incorporated herein by reference in their entireties. Analysis of, for example, gene expression using microarrays comprising nucleic acid sequence fragments, such as the nucleic acid sequence fragments disclosed herein, is a well-established utility for sequence fragments in the field of cell and molecular biology. Other uses for sequence fragments immobilized on microarrays are described in Gerhold et al., Trends Biochem. Sci. 24:168-173 (1999) and Zweiger, Trends Biotechnol. 17:429-436 (1999), as well as in the three references cited earlier in this paragraph, the disclosure of each of which is incorporated herein by reference in its entirety.

[0105] As is well known in the art, enzyme activities can be measured in various ways. For example, the pyrophosphorolysis of OMP may be followed spectroscopically (Grubmeyer et al., (1993) J. Biol. Chem. 268:20299-20304). Alternatively, the activity of the enzyme can be followed using chromatographic techniques, such as by high performance liquid chromatography (Chung and Sloan, (1986) J. Chromatogr. 371:71-81). As another alternative, the activity can be indirectly measured by determining the levels of product made from the enzyme activity. These levels can be measured with techniques including aqueous chloroform/methanol extraction as known and described in the art (Cf. M. Kates (1986) Techniques of Lipidology; Isolation, analysis and identification of Lipids. Elsevier Science Publishers, New York (ISBN: 0444807322)). More modern techniques include using gas chromatography linked to mass spectrometry (Niessen, W. M. A. (2001). Current practice of gas chromatography-mass spectrometry. New York, NY: Marcel Dekker. (ISBN: 0824704738)). Additional modern techniques for identification of recombinant protein activity and products, including liquid chromatography-mass spectrometry (LCMS), high performance liquid chromatography (HPLC), capillary electrophoresis, Matrix-Assisted Laser Desorption Ionization time of flight-mass spectrometry (MALDI-TOF MS), nuclear magnetic resonance (NMR), near-infrared (NIR) spectroscopy, viscometry (Knothe, G (1997) Am. Chem. Soc. Symp. Series, 666:172-208), titration for determining free fatty acids (Komers (1997) Fett/Lipid, 99(2):52-54), enzymatic methods (Bailer (1991) Fresenius J. Anal. Chem. 340(3):186), physical property-based methods, wet chemical methods, etc., can be used to analyze the levels and the identity of the product produced by the organisms of the present disclosure. Other methods and techniques may also be suitable for the measurement of enzyme activity, as would be known by one of skill in the art.

[0106] Also provided by the present disclosure are vectors, including expression vectors, which comprise the above nucleic acid molecules of the present disclosure, as described further herein. In a first aspect, the vectors include the isolated nucleic acid molecules described above. In an alternative aspect, the vectors of the present disclosure include the above-described nucleic acid molecules operably linked to one or more expression control sequences. The vectors of the instant disclosure may thus be used to express an Aar and/or Adm polypeptide contributing to n-alkane producing activity by a host cell, and/or a chimeric efflux protein for effluxing n-alkanes and other hydrocarbons out of the cell.

[0107] In another aspect of the present disclosure, host cells transformed with the nucleic acid molecules or vectors of the present disclosure, and descendants thereof, are provided. In some aspects of the present disclosure, these cells carry the nucleic acid sequences of the present disclosure on vectors, which may but need not be freely replicating vectors. In other aspects of the present disclosure, the nucleic acids have been integrated into the genome of the host cells.

[0108] In a preferred aspect, the host cell comprises one or more AAR or ADM encoding nucleic acids which express AAR or ADM in the host cell.

[0109] In an alternative aspect, the host cells of the present disclosure can be mutated by recombination with a disruption, deletion or mutation of the isolated nucleic acid of the present disclosure so that the activity of the AAR and/or ADM protein(s) in the host cell is reduced or eliminated compared to a host cell lacking the mutation.

[0110] The term "microorganism" includes prokaryotic and eukaryotic microbial species from the Domains Archaea, Bacteria and Eucarya, the latter including yeast and filamentous fungi, protozoa, algae, or higher Protista. The terms "microbial cells" and "microbes" are used interchangeably with the term microorganism.

[0111] A variety of host organisms can be transformed to produce a product of interest. Photoautotrophic organisms include eukaryotic plants and algae, as well as prokaryotic cyanobacteria, green-sulfur bacteria, green non-sulfur bacteria, purple sulfur bacteria, and purple non-sulfur bacteria.

[0112] Extremophiles are also contemplated as suitable organisms. Such organisms withstand various environmental parameters such as temperature, radiation, pressure, gravity, vacuum, desiccation, salinity, pH, oxygen tension, and chemicals. They include hyperthermophiles, which grow at or above 80°C, such as Pyrolobus fumarii; thermophiles, which grow between 60-80°C, such as Synechococcus lividus; mesophiles, which grow between 15-60°C; and psychrophiles, which grow at or below 15°C, such as Psychrobacter and some insects. Radiation-tolerant organisms include Deinococcus radiodurans. Pressure-tolerant organisms include piezophiles, which tolerate pressure of 130 MPa. Weight-tolerant organisms include barophiles. Hypergravity (e.g., >1g) and hypogravity (e.g., <1g) tolerant organisms are also contemplated. Vacuum-tolerant organisms include tardigrades, insects, microbes and seeds. Desiccation-tolerant and anhydrobiotic organisms include xerophiles such as Artemia salina; nematodes, microbes, fungi and lichens. Salt-tolerant organisms include halophiles (e.g., 2-5 M NaCl) such as Halobacteriaceae and Dunaliella salina. pH-tolerant organisms include alkaliphiles such as Natronobacterium, Bacillus firmus OF4, Spirulina spp. (e.g., pH > 9) and acidophiles such as Cyanidium caldarium, Ferroplasma sp. (e.g., low pH). Anaerobes, which cannot tolerate O2, such as Methanococcus jannaschii; microaerophiles, which tolerate some O2, such as Clostridium; and aerobes, which require O2, are also contemplated. Gas-tolerant organisms, which tolerate pure CO2, include Cyanidium caldarium, and metal-tolerant organisms include metallotolerants such as Ferroplasma acidarmanus (e.g., Cu, As, Cd, Zn) and Ralstonia sp. CH34 (e.g., Zn, Co, Cd, Hg, Pb). See Gross, Michael. Life on the Edge: Amazing Creatures Thriving in Extreme Environments. New York: Plenum (1998) and Seckbach, J. "Search for Life in the Universe with Terrestrial Microbes Which Thrive Under Extreme Conditions." In Cristiano Batalli Cosmovici, Stuart Bowyer, and Dan Wertheimer, eds., Astronomical and Biochemical Origins and the Search for Life in the Universe, p. 511. Milan: Editrice Compositori (1997).

[0113] Plants include but are not limited to the following genera: Arabidopsis, Beta, Glycine, Jatropha, Miscanthus, Panicum, Phalaris, Populus, Saccharum, Salix, Simmondsia and Zea.

[0114] Algae and cyanobacteria include but are not limited to the following genera: Acanthoceras, Acanthococcus, Acaryochloris, Achnanthes, Achnanthidium, Actinastrum, Actinochloris, Actinocyclus, Actinotaenium, Amphichrysis, Amphidinium, Amphikrikos, Amphipleura, Amphiprora, Amphithrix, Amphora, Anabaena, Anabaenopsis, Aneumastus, Ankistrodesmus, Ankyra, Anomoeoneis, Apatococcus, Aphanizomenon, Aphanocapsa, Aphanochaete, Aphanothece, Apiocystis, Apistonema, Arthrodesmus, Arthrospira, Ascochloris, Asterionella, Asterococcus, Audouinella, Aulacoseira, Bacillaria, Balbiania, Bambusina, Bangia, Basichlamys, Batrachospermum, Binuclearia, Bitrichia, Blidingia, Botrydiopsis, Botrydium, Botryococcus, Botryosphaerella, Brachiomonas, Brachysira, Brachytrichia, Brebissonia, Bulbochaete, Bumilleria, Bumilleriopsis, Caloneis, Calothrix, Campylodiscus, Capsosiphon, Carteria, Catena, Cavinula, Centritractus, Centronella, Ceratium, Chaetoceros, Chaetochloris, Chaetomorpha, Chaetonella, Chaetonema, Chaetopeltis, Chaetophora, Chaetosphaeridium, Chamaesiphon, Chara, Characiochloris, Characiopsis, Characium, Charales, Chilomonas, Chlainomonas, Chlamydoblepharis, Chlamydocapsa, Chlamydomonas, Chlamydomonopsis, Chlamydomyxa, Chlamydonephris, Chlorangiella, Chlorangiopsis, Chlorella, Chlorobotrys, Chlorobrachis, Chlorochytrium, Chlorococcum, Chlorogloea, Chlorogloeopsis, Chlorogonium, Chlorolobion, Chloromonas, Chlorophysema, Chlorophyta, Chlorosaccus, Chlorosarcina, Choricystis, Chromophyton, Chromulina, Chroococcidiopsis, Chroococcus, Chroodactylon, Chroomonas, Chroothece, Chrysamoeba, Chrysapsis, Chrysidiastrum, Chrysocapsa, Chrysocapsella, Chrysochaete, Chrysochromulina, Chrysococcus, Chrysocrinus, Chrysolepidomonas, Chrysolykos, Chrysonebula, Chrysophyta, Chrysopyxis, Chrysosaccus, Chrysosphaerella, Chrysostephanosphaera, Cladophora, Clastidium, Closteriopsis, Closterium, Coccomyxa, Cocconeis, Coelastrella, Coelastrum, Coelosphaerium, Coenochloris, Coenococcus, Coenocystis, Colacium, Coleochaete, Collodictyon, Compsogonopsis, Compsopogon, Conjugatophyta, Conochaete, Coronastrum, Cosmarium, Cosmioneis, Cosmocladium, Crateriportula, Craticula, Crinalium, Crucigenia, Crucigeniella, Cryptoaulax, Cryptomonas, Cryptophyta, Ctenophora, Cyanodictyon, Cyanonephron, Cyanophora, Cyanophyta, Cyanothece, Cyanothomonas, Cyclonexis, Cyclostephanos, Cyclotella, Cylindrocapsa, Cylindrocystis, Cylindrospermum, Cylindrotheca, Cymatopleura, Cymbella, Cymbellonitzschia, Cystodinium, Dactylococcopsis, Debarya, Denticula, Dermatochrysis, Dermocarpa, Dermocarpella, Desmatractum, Desmidium, Desmococcus, Desmonema, Desmosiphon, Diacanthos, Diacronema, Diadesmis, Diatoma, Diatomella, Dicellula, Dichothrix, Dichotomococcus, Dicranochaete, Dictyochloris, Dictyococcus, Dictyosphaerium, Didymocystis, Didymogenes, Didymosphenia, Dilabifilum, Dimorphococcus, Dinobryon, Dinococcus, Diplochloris, Diploneis, Diplostauron, Distrionella, Docidium, Draparnaldia, Dunaliella, Dysmorphococcus, Ecballocystis, Elakatothrix, Ellerbeckia, Encyonema, Enteromorpha, Entocladia, Entomoneis, Entophysalis, Epichrysis, Epipyxis, Epithemia, Eremosphaera, Euastropsis, Euastrum, Eucapsis, Eucocconeis, Eudorina, Euglena, Euglenophyta, Eunotia, Eustigmatophyta, Eutreptia, Fallacia, Fischerella, Fragilaria, Fragilariforma, Franceia, Frustulia, Furcilla, Geminella, Genicularia, Glaucocystis, Glaucophyta, Glenodiniopsis, Glenodinium, Gloeocapsa, Gloeochaete, Gloeochrysis, Gloeococcus, Gloeocystis, Gloeodendron, Gloeomonas, Gloeoplax, Gloeothece, Gloeotila, Gloeotrichia, Gloiodictyon, Golenkinia, Golenkiniopsis, Gomontia, Gomphocymbella, Gomphonema, Gomphosphaeria, Gonatozygon, Gongrosia, Gongrosira, Goniochloris, Gonium, Gonyostomum, Granulochloris, Granulocystopsis, Groenbladia, Gymnodinium, Gymnozyga, Gyrosigma, Haematococcus, Hafniomonas, Hallassia, Hammatoidea, Hannaea, Hantzschia, Hapalosiphon, Haplotaenium, Haptophyta, Haslea, Hemidinium, Hemitoma, Heribaudiella, Heteromastix, Heterothrix, Hibberdia, Hildenbrandia, Hillea, Holopedium, Homoeothrix, Hormanthonema, Hormotila, Hyalobrachion, Hyalocardium, Hyalodiscus, Hyalogonium, Hyalotheca, Hydrianum, Hydrococcus, Hydrocoleum, Hydrocoryne, Hydrodictyon, Hydrosera, Hydrurus, Hyella, Hymenomonas, Isthmochloron, Johannesbaptistia, Juranyiella, Karayevia, Kathablepharis, Katodinium, Kephyrion, Keratococcus, Kirchneriella, Klebsormidium, Kolbesia, Koliella, Komarekia, Korshikoviella, Kraskella, Lagerheimia, Lagynion, Lamprothamnium, Lemanea, Lepocinclis, Leptosira, Lobococcus, Lobocystis, Lobomonas, Luticola, Lyngbya, Malleochloris, Mallomonas, Mantoniella, Marssoniella, Martyana, Mastigocoleus, Mastogloia, Melosira, Merismopedia, Mesostigma, Mesotaenium, Micractinium, Micrasterias, Microchaete, Microcoleus, Microcystis, Microglena, Micromonas, Microspora, Microthamnion, Mischococcus, Monochrysis, Monodus, Monomastix, Monoraphidium, Monostroma, Mougeotia, Mougeotiopsis, Myochloris, Myrmecia, Myxosarcina, Naegeliella, Nannochloris, Nautococcus, Navicula, Neglectella, Neidium, Nephrochlamys, Nephrocytium, Nephrodiella, Nephroselmis, Netrium, Nitella, Nitellopsis, Nitzschia, Nodularia, Nostoc, Ochromonas, Oedogonium, Oligochaetophora, Onychonema, Oocardium, Oocystis, Opephora, Ophiocytium, Orthoseira, Oscillatoria, Oxyneis, Pachycladella, Palmella, Palmodictyon, Pandorina, Pannus, Paralia, Pascherina, Paulschulzia, Pediastrum, Pedinella, Pedinomonas, Pedinopera, Pelagodictyon, Penium, Peranema, Peridiniopsis, Peridinium, Peronia, Petroneis, Phacotus, Phacus, Phaeaster, Phaeodermatium, Phaeophyta, Phaeosphaera, Phaeothamnion, Phormidium, Phycopeltis, Phyllariochloris, Phyllocardium, Phyllomitas, Pinnularia, Pitophora, Placoneis, Planctonema, Planktosphaeria, Planothidium, Plectonema, Pleodorina, Pleurastrum, Pleurocapsa, Pleurocladia, Pleurodiscus, Pleurosigma, Pleurosira, Pleurotaenium, Pocillomonas, Podohedra, Polyblepharides, Polychaetophora, Polyedriella, Polyedriopsis, Polygoniochloris, Polylepidomonas, Polytaenia, Polytoma, Polytomella, Porphyridium, Poteriochromonas, Prasinochloris, Prasinocladus, Prasinophyta, Prasiola, Prochlorophyta, Prochlorothrix, Protoderma, Protosiphon, Provasoliella, Prymnesium, Psammodictyon, Psammothidium, Pseudanabaena, Pseudendoclonium, Pseudocarteria, Pseudochaete, Pseudocharacium, Pseudococcomyxa, Pseudodictyosphaerium, Pseudokephyrion, Pseudoncobyrsa, Pseudoquadrigula, Pseudosphaerocystis, Pseudostaurastrum, Pseudostaurosira, Pseudotetrastrum, Pteromonas, Punctastriata, Pyramichlamys, Pyramimonas, Pyrrophyta, Quadrichloris, Quadricoccus, Quadrigula, Radiococcus, Radiofilum, Raphidiopsis, Raphidocelis, Raphidonema, Raphidophyta, Reimeria, Rhabdoderma, Rhabdomonas, Rhizoclonium, Rhodomonas, Rhodophyta, Rhoicosphenia, Rhopalodia, Rivularia, Rosenvingiella, Rossithidium, Roya, Scenedesmus, Scherffelia, Schizochlamydella, Schizochlamys, Schizomeris, Schizothrix, Schroederia, Scolioneis, Scotiella, Scotiellopsis, Scourfieldia, Scytonema, Selenastrum, Selenochloris, Sellaphora, Semiorbis, Siderocelis, Siderocystopsis, Simonsenia, Siphononema, Sirocladium, Sirogonium, Skeletonema, Sorastrum, Spermatozopsis, Sphaerellocystis, Sphaerellopsis, Sphaerodinium, Sphaeroplea, Sphaerozosma, Spiniferomonas, Spirogyra, Spirotaenia, Spirulina, Spondylomorum, Spondylosium, Sporotetras, Spumella, Staurastrum, Staurodesmus, Stauroneis, Staurosira, Staurosirella, Stenopterobia, Stephanocostis, Stephanodiscus, Stephanoporos, Stephanosphaera, Stichococcus, Stichogloea, Stigeoclonium, Stigonema, Stipitococcus, Stokesiella, Strombomonas, Stylochrysalis, Stylodinium, Stylopyxis, Stylosphaeridium, Surirella, Sykidion, Symploca, Synechococcus, Synechocystis, Synedra, Synochromonas, Synura, Tabellaria, Tabularia, Teilingia, Temnogametum, Tetmemorus, Tetrachlorella, Tetracyclus, Tetradesmus, Tetraedriella, Tetraedron, Tetraselmis, Tetraspora, Tetrastrum, Thalassiosira, Thamniochaete, Thorakochloris, Thorea, Tolypella, Tolypothrix, Trachelomonas, Trachydiscus, Trebouxia, Trentepohlia, Treubaria, Tribonema, Trichodesmium, Trichodiscus, Trochiscia, Tryblionella, Ulothrix, Uroglena, Uronema, Urosolenia, Urospora, Ulva, Vacuolaria, Vaucheria, Volvox, Volvulina, Westella, Woloszynskia, Xanthidium, Xanthophyta, Xenococcus, Zygnema, Zygnemopsis, and Zygonium. A partial list of cyanobacteria that can be engineered to express the recombinant genes described herein includes members of the genera Chamaesiphon, Chroococcus, Cyanobacterium, Cyanobium, Cyanothece, Dactylococcopsis, Gloeobacter, Gloeocapsa, Gloeothece, Microcystis, Prochlorococcus, Prochloron, Synechococcus, Synechocystis, Cyanocystis, Dermocarpella, Stanieria, Xenococcus, Chroococcidiopsis, Myxosarcina, Arthrospira, Borzia, Crinalium, Geitlerinema, Leptolyngbya, Limnothrix, Lyngbya, Microcoleus, Oscillatoria, Planktothrix, Prochlorothrix, Pseudanabaena, Spirulina, Starria, Symploca, Trichodesmium, Tychonema, Anabaena, Anabaenopsis, Aphanizomenon, Cyanospira, Cylindrospermopsis, Cylindrospermum, Nodularia, Nostoc, Scytonema, Calothrix, Rivularia, Tolypothrix, Chlorogloeopsis, Fischerella, Geitleria, Iyengariella, Nostochopsis, Stigonema and Thermosynechococcus.

[0115] Green non-sulfur bacteria include but are not limited to the following genera: Chloroflexus, Chloronema, Oscillochloris, Heliothrix, Herpetosiphon, Roseiflexus, and Thermomicrobium.

[0116] Green sulfur bacteria include but are not limited to the following genera:

[0117] Chlorobium, Clathrochloris, and Prosthecochloris.

[0118] Purple sulfur bacteria include but are not limited to the following genera: Allochromatium, Chromatium, Halochromatium, Isochromatium, Marichromatium, Rhodovulum, Thermochromatium, Thiocapsa, Thiorhodococcus, and Thiocystis.

[0119] Purple non-sulfur bacteria include but are not limited to the following genera: Phaeospirillum, Rhodobaca, Rhodobacter, Rhodomicrobium, Rhodopila, Rhodopseudomonas, Rhodothalassium, Rhodospirillum, Rhodovibrio, and Roseospira.

[0120] Aerobic chemolithotrophic bacteria include but are not limited to nitrifying bacteria such as Nitrobacteraceae sp., Nitrobacter sp., Nitrospina sp., Nitrococcus sp., Nitrospira sp., Nitrosomonas sp., Nitrosococcus sp., Nitrosospira sp., Nitrosolobus sp., Nitrosovibrio sp.; colorless sulfur bacteria such as Thiovulum sp., Thiobacillus sp., Thiomicrospira sp., Thiosphaera sp., Thermothrix sp.; obligately chemolithotrophic hydrogen bacteria such as Hydrogenobacter sp.; iron and manganese-oxidizing and/or depositing bacteria such as Siderococcus sp.; and magnetotactic bacteria such as Aquaspirillum sp.

[0121] Archaeobacteria include but are not limited to methanogenic archaeobacteria such as Methanobacterium sp., Methanobrevibacter sp., Methanothermus sp., Methanococcus sp., Methanomicrobium sp., Methanospirillum sp., Methanogenium sp., Methanosarcina sp., Methanolobus sp., Methanothrix sp., Methanococcoides sp., Methanoplanus sp.; extremely thermophilic S-Metabolizers such as Thermoproteus sp., Pyrodictium sp., Sulfolobus sp., Acidianus sp.; and other microorganisms such as Bacillus subtilis, Saccharomyces cerevisiae, Streptomyces sp., Ralstonia sp., Rhodococcus sp., Corynebacteria sp., Brevibacteria sp., Mycobacteria sp., and oleaginous yeast.

[0122] Preferred organisms for the manufacture of n-alkanes according to the methods disclosed herein include: Arabidopsis thaliana, Panicum virgatum, Miscanthus giganteus, and Zea mays (plants); Botryococcus braunii, Chlamydomonas reinhardtii and Dunaliella salina (algae); Synechococcus sp. PCC 7002, Synechococcus sp. PCC 7942, Synechocystis sp. PCC 6803, Thermosynechococcus elongatus BP-1 (cyanobacteria); Chlorobium tepidum (green sulfur bacteria); Chloroflexus aurantiacus (green non-sulfur bacteria); Chromatium tepidum and Chromatium vinosum (purple sulfur bacteria); Rhodospirillum rubrum, Rhodobacter capsulatus, and Rhodopseudomonas palustris (purple non-sulfur bacteria).

[0123] Yet other suitable organisms include synthetic cells or cells produced by synthetic genomes as described in Venter et al. US Pat. Pub. No. 2007/0264688, and cell-like systems or synthetic cells as described in Glass et al. US Pat. Pub. No. 2007/0269862.

[0124] Still other suitable organisms include microorganisms that can be engineered to fix carbon dioxide, including bacteria, yeast and fungi such as Escherichia coli, Acetobacter aceti, Bacillus subtilis, Clostridium ljungdahlii, Clostridium thermocellum, Penicillium chrysogenum, Pichia pastoris, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pseudomonas fluorescens, or Zymomonas mobilis.

[0125] A suitable trait for selecting or engineering an organism is autotrophic fixation of CO2 to products. This covers photosynthesis and methanogenesis. Acetogenesis, encompassing the three types of CO2 fixation (Calvin cycle, acetyl-CoA pathway and reductive TCA pathway), is also covered. The capability to use carbon dioxide as the sole source of cell carbon (autotrophy) is found in almost all major groups of prokaryotes. The CO2 fixation pathways differ between groups, and there is no clear distribution pattern of the four presently-known autotrophic pathways. See, e.g., Fuchs, G. 1989. Alternative pathways of autotrophic CO2 fixation, p. 365-382. In H. G. Schlegel, and B. Bowien (ed.), Autotrophic bacteria. Springer-Verlag, Berlin, Germany. The reductive pentose phosphate cycle (Calvin-Bassham-Benson cycle) represents the CO2 fixation pathway in almost all aerobic autotrophic bacteria, for example, the cyanobacteria.

[0126] For producing n-alkanes via the recombinant expression of Aar and/or Adm enzymes, an engineered cyanobacterium, e.g., a Synechococcus or Thermosynechococcus species, is preferred. Other preferred organisms include Synechocystis, Klebsiella oxytoca, Escherichia coli or Saccharomyces cerevisiae. Other prokaryotic, archaeal and eukaryotic host cells are also encompassed within the scope of the present disclosure.

[0127] In various aspects of the disclosure, desired hydrocarbons and/or alcohols of certain chain length or a mixture thereof can be produced. In certain aspects, the host cell produces at least one of the following carbon-based products of interest: 1-dodecanol, 1-tetradecanol, 1-pentadecanol, n-tridecane, n-tetradecane, 15:1 n-pentadecene, n-pentadecane, 16:1 n-hexadecene, n-hexadecane, 17:1 n-heptadecene, n-heptadecane, 16:1 n-hexadecen-ol, n-hexadecan-1-ol and n-octadecen-1-ol, as shown in the Examples herein. In other aspects, the carbon chain length ranges from C10 to C20. Accordingly, the disclosure provides production of various chain lengths of alkanes, alkenes and alkanols suitable for use as fuels and chemicals.
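The chain-length range above can be related to molecular formulas with a short calculation. The following is a minimal sketch, not part of the disclosure: it assumes the standard general formulas CnH2n+2 for an n-alkane and CnH2n for an n-alkene with one double bond (the "15:1" style notation is read here as chain length:number of double bonds), and uses standard atomic masses. The function names are hypothetical.

```python
# Illustrative only: general formulas and atomic masses are standard chemistry,
# not values taken from the disclosure. Function names are hypothetical.
ATOMIC_MASS = {"C": 12.011, "H": 1.008}  # g/mol

def n_alkane_formula(n_carbons: int) -> tuple:
    """Return (C, H) atom counts for an n-alkane, CnH2n+2."""
    return (n_carbons, 2 * n_carbons + 2)

def n_alkene_formula(n_carbons: int) -> tuple:
    """Return (C, H) atom counts for an n-alkene with one double bond, CnH2n."""
    return (n_carbons, 2 * n_carbons)

def molar_mass(carbons: int, hydrogens: int) -> float:
    """Molar mass in g/mol of a hydrocarbon with the given atom counts."""
    return carbons * ATOMIC_MASS["C"] + hydrogens * ATOMIC_MASS["H"]

if __name__ == "__main__":
    # Walk the C10 to C20 range mentioned in the text.
    for n in range(10, 21):
        c, h = n_alkane_formula(n)
        print(f"C{c}H{h}: {molar_mass(c, h):.2f} g/mol")
```

For example, n-pentadecane corresponds to `n_alkane_formula(15)`, and 17:1 n-heptadecene to `n_alkene_formula(17)`.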

[0128] In some aspects, the methods of the present disclosure include culturing host cells for direct product secretion, allowing easy recovery without the need to extract biomass. These carbon-based products of interest are secreted directly into the medium. Since the disclosure enables production of hydrocarbons and alcohols of various defined chain lengths, the secreted products are easily recovered or separated. The products of the disclosure, therefore, can be used directly or used with minimal processing.

[0129] In some aspects, the methods disclosed herein include producing a material from the microorganisms disclosed herein such as a hydrocarbon (e.g., an alkane or an alkene). In some aspects, the methods disclosed herein include recovering the material. In some aspects, the methods disclosed herein include extracting the material. In some aspects, the methods disclosed herein include processing the material, e.g., the hydrocarbon. In some aspects, the processing of the material produces a processed material. The term "processed material" refers to a carbon-based material produced using one or more hydrocarbons as a raw starting material, wherein the one or more hydrocarbons are produced via one or more methods disclosed herein. Such processed materials can include fuel, biodiesel, plastic, rubber, a cosmetic, a pharmaceutical agent, a specialty chemical, and a surfactant. Other processed materials are generally known to one of skill in the art. Various methods for making processed materials from hydrocarbons are generally known to one of skill in the art. Such methods can include, e.g., cracking, distillation, hydrotreating, reforming, and resid processing.

[0130] In some aspects, disclosed herein is a method for producing hydrocarbons, comprising: culturing one or more engineered microorganisms disclosed herein in a culture medium, wherein said engineered microorganism secretes increased amounts of n-alkanes or n-alkenes into the culture medium relative to an otherwise identical microorganism, cultured under identical conditions, but lacking said recombinant genes; and processing of the n-alkanes or n-alkenes to produce a processed material.

[0131] In various aspects, compositions produced by the methods of the disclosure are used as fuels. Such fuels comply with ASTM standards, for instance, standard specifications for diesel fuel oils D 975-09b, and Jet A, Jet A-1 and Jet B as specified in ASTM Specification D 1655-68. Fuel compositions may require blending of several products to produce a uniform product. The blending process is relatively straightforward, but the determination of the amount of each component to include in a blend is much more difficult. Fuel compositions may, therefore, include aromatic and/or branched hydrocarbons, for instance, 75% saturated and 25% aromatic, wherein some of the saturated hydrocarbons are branched and some are cyclic. Preferably, the methods of the disclosure produce an array of hydrocarbons, such as C13-C17 or C10-C15, to alter cloud point. Furthermore, the compositions may comprise fuel additives, which are used to enhance the performance of a fuel or engine. For example, fuel additives can be used to alter the freezing/gelling point, cloud point, lubricity, viscosity, oxidative stability, ignition quality, octane level, and flash point. Fuel compositions may also comprise, among others, antioxidants, static dissipaters, corrosion inhibitors, icing inhibitors, biocides, metal deactivators and thermal stability improvers.

[0132] In addition to many environmental advantages of the disclosure such as CO2 conversion and renewable source, other advantages of the fuel compositions disclosed herein include low sulfur content, low emissions, being free or substantially free of alcohol and having high cetane number.

[0133] The following examples are for illustrative purposes and are not intended to limit the scope of the disclosure.

EXAMPLES

Example 1: Construction of pJB2166 for tetracycline-inducible hydrocarbon production in E. coli

[0134] pJB2166 was constructed using molecular biology techniques. Table 1 shows the regions and sequences of pJB2166.

Example 2: Construction of pJB1849 for IPTG-inducible emrAB expression in E. coli

[0135] To construct pJB1849, emrAB was PCR amplified from E. coli MG1655 genomic DNA using primers KS313 (5' aataCATATGAGCGCAAATGCGGAGACTCAAA 3'; SEQ ID NO:21) and KS314 (5' aataGCATGCTTAGTGCGCACCGCCTCCGC 3'; SEQ ID NO:22) and Phusion HF DNA polymerase (NEB). The resulting emrAB PCR product was digested with NdeI and SphI, and ligated into pJB1719 (|p15A_ori|P(T5)>|acrAB>|<cat|) cut with the same restriction enzymes to remove acrAB. The resulting plasmid, pJB1849 (|p15A_ori|P(T5)>|emrAB>|<cat|), is an IPTG-inducible construct. Table 2 shows the regions and sequences of pJB1849.
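The cloning design in Example 2 can be checked by locating the restriction sites within the two primers. The sketch below is illustrative only: the primer sequences are copied from the text, the recognition sequences (CATATG for NdeI, GCATGC for SphI) are the commonly documented ones, and the helper names `find_sites` and `reverse_complement` are hypothetical.

```python
# Recognition sequences for the two enzymes named in Example 2.
RECOGNITION_SITES = {"NdeI": "CATATG", "SphI": "GCATGC"}

# Primer sequences as printed in the text (5' to 3').
PRIMERS = {
    "KS313": "aataCATATGAGCGCAAATGCGGAGACTCAAA",
    "KS314": "aataGCATGCTTAGTGCGCACCGCCTCCGC",
}

def find_sites(primer: str) -> dict:
    """Return {enzyme: 0-based start index} for each recognition site in the primer."""
    seq = primer.upper()
    return {
        enzyme: seq.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in seq
    }

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving case."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]

if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, find_sites(primer))
    # The portion of the reverse primer after the SphI site, read on the
    # coding strand, shows how the amplified fragment ends.
    tail = PRIMERS["KS314"][4 + len(RECOGNITION_SITES["SphI"]):]
    print(reverse_complement(tail))
```

Each primer carries a four-base lowercase 5' extension before its site, so both sites are found at index 4, and the coding-strand reading of the reverse primer tail ends in a TAA stop codon.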

Example 3: Hydrocarbon secretion in E. coli by emrAB overexpression

[0136] A plasmid for anhydrotetracycline inducible expression of adm-aar, pJB2166, and another plasmid for expression of emrAB were introduced into an E. coli strain (JCC2264) lacking the only other hydrocarbon transporter found to date (YbhGFSR). emrAB expression rescued cell growth, and promoted hydrocarbon production and secretion in this strain.

[0137] E. coli strain JCC2264 (ΔfadE ΔybhGFSR) was constructed using a gene knockout vector system obtained from Yale (See Coli Genetic Stock Center website on November 9, 2011). JCC2264 was co-transformed with pJB2166 + pJB1849 and pJB2166 + pJB1720 (pJB1720 does not have any genes downstream of the T5 promoter). The latter strain served as a negative control. Transformants were isolated on LB plates containing carbenicillin (100 μg/ml) and chloramphenicol (25 μg/ml), picked from colonies into 3 ml LB seed cultures with carbenicillin (100 μg/ml) and chloramphenicol (25 μg/ml) and incubated for 16 hours at 37°C, 260 rpm.

[0138] Alkane and alkene production and efflux of each strain were tested in 125 ml shake flasks containing 10 ml M9f media (M9 minimal media + 30 g/L glucose + 30 mg/L FeCl3·6H2O + A5 metals (27 mg/L FeCl3·6H2O, 2 mg/L ZnCl2·4H2O, 2 mg/L CaCl2·2H2O, 2 mg/L Na2MoO4·2H2O, 1.9 mg/L CuSO4·5H2O, 0.5 mg/L H3BO3)) with carbenicillin (100 μg/ml), chloramphenicol (25 μg/ml), and a 2.5 ml DBE (25 mg/L butylated hydroxytoluene + 25 mg/L eicosane in dodecane) overlay for extraction from the aqueous phase of hydrocarbons secreted by the cells. Cells were harvested from LB seed cultures and used to inoculate (1% v/v) the shake flask cultures. Following inoculation, all cultures were incubated at 37°C, 260 rpm for 2.25 hours, at which point 0.1 mM IPTG and 100 ng/ml anhydrotetracycline (aTc) were added to each culture to induce gene expression from the T5 and P(LtetO1) promoters, respectively. After induction with IPTG and aTc, all cultures were overlaid with 2.5 ml DBE and incubated at 30°C, 260 rpm. Alkane and alkene production and efflux of each strain were tested at 7.25 hours and 25.25 hours post inoculation by extraction of de-wetted cell pellets with ABE (acetone + 25 mg/L butylated hydroxytoluene + 25 mg/L eicosane) and direct measurement of secreted hydrocarbons present in the DBE overlay. The results (Table 3) show that overexpression of emrAB allows for continuous cell growth following induction of the alkane/alkene pathway. Further, overexpression of emrAB dramatically increases total hydrocarbon production and % secretion in comparison with the negative control strain that does not overexpress a hydrocarbon transporter (JCC2264/pJB2166/pJB1720).
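The timing and the secretion metric in this protocol involve simple arithmetic that can be sketched as follows. This is a hedged illustration: the induction time, sampling times, culture volume and inoculum fraction come from the text, but the definition of % secretion as overlay hydrocarbon divided by overlay plus cell-pellet hydrocarbon is an assumption, and all function names are hypothetical. No values from Table 3 are used.

```python
# Values taken from the protocol text.
INDUCTION_H = 2.25               # hours post inoculation when IPTG and aTc were added
SAMPLE_TIMES_H = (7.25, 25.25)   # hours post inoculation when samples were taken
CULTURE_ML = 10.0                # shake flask culture volume
INOCULUM_FRACTION = 0.01         # 1% v/v inoculum

def hours_post_induction(sample_h: float, induction_h: float = INDUCTION_H) -> float:
    """Convert a sampling time (post inoculation) to hours post induction."""
    return sample_h - induction_h

def inoculum_volume_ml(culture_ml: float = CULTURE_ML,
                       fraction: float = INOCULUM_FRACTION) -> float:
    """Volume of seed culture needed for the stated v/v inoculum."""
    return culture_ml * fraction

def percent_secreted(overlay_amount: float, pellet_amount: float) -> float:
    """ASSUMED definition: secreted (overlay) share of total measured hydrocarbon."""
    total = overlay_amount + pellet_amount
    if total == 0:
        return 0.0
    return 100.0 * overlay_amount / total

if __name__ == "__main__":
    for t in SAMPLE_TIMES_H:
        print(f"{t} h post inoculation = {hours_post_induction(t)} h post induction")
    print(f"inoculum: {inoculum_volume_ml():.2f} ml")
```

Under these assumptions the two sampling points fall 5 and 23 hours after induction, and `percent_secreted` takes any pair of measurements in the same units.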

[0139] Table 3 shows the effect of emrAB overexpression on hydrocarbon production and secretion in E. coli strain JCC2264 harboring pJB2166.

Example 4: Hydrocarbon secretion by emrAB overexpression

[0140] In some instances, vectors disclosed herein (e.g., pJB1849 and pJB2166) are re-engineered for use in cyanobacteria. For example, to alter vectors pJB1849 and pJB2166 for expression of EmrAB and Adm-Aar in cyanobacteria, regions of DNA (homologous to PCC 7002 genomic DNA) are introduced upstream and downstream (UHR and DHR) of the genetic elements of interest. This allows homologous recombination to occur once the vector is transformed into a bacterial strain (e.g., a cyanobacteria strain, e.g., JCC138 and/or JCC2055), resulting in integration of the foreign DNA. For example, the genetic elements of interest can include a promoter that is known to function in PCC 7002, EmrAB-TolC (or other functional outer membrane protein), and/or adm-aar, and an antibiotic marker. In some instances, a panel of promoters is tested and whichever provides the best phenotype is selected for use. See, e.g., Huang H-H et al. (2010). Design and characterization of molecular tools for a Synthetic Biology approach towards developing cyanobacterial biotechnology. NAR 38:2577-2593; and Dexter J and Fu P (2009). Metabolic engineering of cyanobacteria for ethanol production. Energy and Environ. Sci. 8:857-864; and Lan EI and Liao JC (2011). Metabolic engineering of cyanobacteria for 1-butanol production from carbon dioxide. Met. Engineer. 13:353-363.

[0141] A photosynthetic bacterial strain (e.g., a cyanobacteria strain such as JCC138) is transformed with one or more plasmids (e.g., pJB1849 vector re-engineered for use in cyanobacteria) expressing recombinant EmrA and/or EmrB (e.g., SEQ ID NO:15-16 and 19-20) or homologues thereof. The strain can also be transformed with one or more plasmids expressing recombinant TolC (e.g., SEQ ID NO:21-22) or homologues thereof. The strain can also be transformed with one or more plasmids (e.g., pJB2166 vector re-engineered for use in cyanobacteria) expressing ADM and/or AAR (e.g., SEQ ID NO:7-8) or homologues thereof. Negative controls include the photosynthetic bacterial strain transformed with identical plasmids as above but without emrAB (e.g., pJB1720 vector re-engineered for use in cyanobacteria). Transformants are isolated on plates with appropriate antibiotics, picked from colonies into seed cultures with appropriate antibiotics, and incubated.

[0142] Hydrocarbon production and efflux of each strain is tested. Cells are harvested from the seed cultures and used to inoculate shake cultures. Following inoculation, cultures are incubated. In some instances, the hydrocarbon pathway is activated as described in the Examples above. Hydrocarbon production and efflux of each strain can be determined at various time points under various conditions. Overexpression of emrAB in the test strain(s) allows for continuous cell growth and increases total hydrocarbon production and % secretion in comparison with the negative control strain(s).

[0143] In some instances, the hydrocarbons are recovered. In some instances, the hydrocarbons are extracted. In some instances, the hydrocarbons are processed to produce a processed material.

Example 5: Hydrocarbon secretion by tripartite transporter protein overexpression

[0144] In some instances, vectors disclosed herein (e.g., pJB1849 and pJB2166) are re-engineered for use in cyanobacteria as described above. A photosynthetic bacterial strain (e.g., a cyanobacteria strain, such as JCC138) is transformed with one or more plasmids (e.g., pJB1849 vector re-engineered for use in cyanobacteria) expressing a recombinant tripartite transporter protein such as EmrA, EmrB (e.g., SEQ ID NO:15-16 and 19-20), AcrE, AcrF, EmrK, EmrY, MacA, MacB, MdtA, MdtB, MdtC, MdtE, MdtF, SdsR, SdsQ, and SdsP or homologues thereof. The amino acid sequences of AcrE, AcrF, EmrK, EmrY, MacA, MacB, MdtA, MdtB, MdtC, MdtE, MdtF, SdsR, SdsQ, and SdsP are shown in SEQ ID NOs 34-47. The strain can also be transformed with one or more plasmids expressing recombinant TolC (e.g., SEQ ID NO:21-22) or homologues thereof. The strain can also be transformed with one or more plasmids (e.g., pJB2166 vector re-engineered for use in cyanobacteria) expressing ADM and/or AAR (e.g., SEQ ID NO:7-8) or homologues thereof. Negative controls include the photosynthetic bacterial strain transformed with identical plasmids as above but without the recombinant tripartite transporter protein(s) (e.g., pJB1720 vector re-engineered for use in cyanobacteria). Transformants are isolated on plates with appropriate antibiotics, picked from colonies into seed cultures with appropriate antibiotics, and incubated.

[0145] Hydrocarbon production and efflux of each strain is tested. Cells are harvested from the seed cultures and used to inoculate shake cultures. Following inoculation, cultures are incubated. In some instances, the hydrocarbon pathway is activated as described in the Examples above. Hydrocarbon production and efflux of each strain can be determined at various time points under various conditions. Overexpression of the recombinant tripartite transporter protein(s) in the test strain(s) allows for continuous cell growth and increases total hydrocarbon production and % secretion in comparison with the negative control strain(s).

[0146] In some instances, the hydrocarbons are recovered. In some instances, the hydrocarbons are extracted. In some instances, the hydrocarbons are processed to produce a processed material.

Example 6: Hydrocarbon secretion by EmrAB homologue overexpression

[0147] The following four construct types are expressed in a JCC138 alkanogen, such as JCC2055, to permit pentadecane secretion from such a strain using non-JCC138 cyanobacterial (e.g., Nostoc punctiforme PCC 73102 (Npun) or Cyanothece PCC 7822 (Cyan7822)) EmrA and EmrB homologs.

1. P(xxx)-Cyan7822_2244-Cyan7822_2243-Cyan7822_2189 (emrA-emrB-omp)

2. P(xxx)-Npun_F3670-Npun_F3671-Npun_F3672-Npun_R2885 (emrA-entS-emrB-ompl)

3. P(xxx)-Npun_F3670-Npun_F3671-Npun_F3672-Npun_F6385 (emrA-entS-emrB-ompl)

4. P(xxx)-Npun_F6382-Npun_F6383-Npun_F6384-Npun_F638 (ybhG-macB1-macB2-omp)

where P(xxx) indicates a multiplicity of promoters operably linked to the indicated operons, and the sequences in parentheses indicate the corresponding gene identities, omp representing a gene encoding an outer membrane protein. Dashes indicate separations between promoter and/or coding sequences. emrA, emrB, entS, ybhG, macB1, and macB2 sequences from Npun and/or Cyan7822 are shown in the sequence listing at SEQ ID NOs:23-33.

[0148] Transformants are isolated on plates with appropriate antibiotics, picked from colonies into seed cultures with appropriate antibiotics, and incubated.
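The dash-separated construct notation in Example 6 can be read mechanically by pairing each locus tag with the gene identity given in parentheses. The following is a minimal sketch under that reading; the function name `parse_construct` is hypothetical, and the input shown is the first construct type with its locus tags written as Cyan7822_2244, Cyan7822_2243 and Cyan7822_2189.

```python
def parse_construct(notation: str, gene_ids: str) -> dict:
    """Split a construct string into its promoter and (locus tag, gene) pairs.

    notation: dash-separated promoter and locus tags, e.g. "P(xxx)-TagA-TagB"
    gene_ids: dash-separated gene identities in parentheses, e.g. "(emrA-emrB)"
    """
    promoter, *loci = notation.split("-")
    genes = gene_ids.strip("()").split("-")
    if len(loci) != len(genes):
        raise ValueError(
            f"{len(loci)} locus tags but {len(genes)} gene identities"
        )
    return {"promoter": promoter, "genes": list(zip(loci, genes))}

if __name__ == "__main__":
    construct = parse_construct(
        "P(xxx)-Cyan7822_2244-Cyan7822_2243-Cyan7822_2189",
        "(emrA-emrB-omp)",
    )
    print(construct["promoter"])
    for locus, gene in construct["genes"]:
        print(f"{locus}: {gene}")
```

The length check is a design choice for catching a construct whose locus tags and gene identities do not line up one-to-one; it does not reflect any rule stated in the disclosure.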

[0149] Hydrocarbon production and efflux of each strain is tested. Cells are harvested from the seed cultures and used to inoculate shake cultures. Following inoculation, cultures are incubated. In some instances, the hydrocarbon pathway is activated as described in the Examples above. Hydrocarbon production and efflux of each strain can be determined at various time points under various conditions. Overexpression of the recombinant tripartite transporter protein(s) in the test strain(s) allows for continuous cell growth and increases total hydrocarbon production and % secretion in comparison with the negative control strain(s).

[0150] In some instances, the hydrocarbons are recovered. In some instances, the hydrocarbons are extracted. In some instances, the hydrocarbons are processed to produce a processed material.

Example 7: Hydrocarbon secretion by emrAB overexpression

[0151] pJB2302 was constructed using molecular biology techniques. Table 4 shows the regions and sequences of pJB2302. This vector integrates into the A2208 locus.

[0152] pJB2303 was constructed using molecular biology techniques. This vector included the same sequences as pJB2302, except that the P{psaA} promoter of pJB2302 was exchanged for the P{tsr2142} promoter from cyanobacteria BP-1 shown in SEQ ID NO:55. This vector integrates into the A2208 locus.

[0153] pJB2304 was constructed using molecular biology techniques. This vector included the same sequences as pJB2302, except that the P{psaA} promoter of pJB2302 was exchanged for the P{aphII} promoter shown in SEQ ID NO:56. This vector integrates into the A2208 locus.

[0154] pJB2305 was constructed using molecular biology techniques. This vector included the same sequences as pJB2302, except that the P{psaA} promoter of pJB2302 was exchanged for the P{ompR} promoter shown in SEQ ID NO:57. This vector integrates into the A2208 locus.

[0155] pJB2306 was constructed using molecular biology techniques. This vector included the same sequences as pJB2302, except that the P{psaA} promoter of pJB2302 was exchanged for the P{nir_07_PnirA_PCC7942 v2} promoter shown in SEQ ID NO:58. This vector integrates into the A2208 locus.
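The five vectors described above differ only in their promoter, which lends itself to a small lookup table. The sketch below simply restates the plasmid names, promoters, SEQ ID NOs and integration locus from the preceding paragraphs; the dictionary layout and the helper `plasmids_with_listed_promoter_sequence` are hypothetical, and `None` marks the P{psaA} promoter, for which no separate SEQ ID NO is given in this passage.

```python
# Integration locus shared by all five vectors, as stated in the text.
INTEGRATION_LOCUS = "A2208"

# Promoter driving expression in each vector; seq_id_no is the SEQ ID NO
# cited for the promoter, or None where this passage does not cite one.
PROMOTER_VARIANTS = {
    "pJB2302": {"promoter": "P{psaA}", "seq_id_no": None},
    "pJB2303": {"promoter": "P{tsr2142}", "seq_id_no": 55},
    "pJB2304": {"promoter": "P{aphII}", "seq_id_no": 56},
    "pJB2305": {"promoter": "P{ompR}", "seq_id_no": 57},
    "pJB2306": {"promoter": "P{nir_07_PnirA_PCC7942 v2}", "seq_id_no": 58},
}

def plasmids_with_listed_promoter_sequence() -> list:
    """Return plasmid names whose promoter has a SEQ ID NO cited in this passage."""
    return sorted(
        name
        for name, info in PROMOTER_VARIANTS.items()
        if info["seq_id_no"] is not None
    )

if __name__ == "__main__":
    for name, info in PROMOTER_VARIANTS.items():
        print(f"{name}: {info['promoter']} -> integrates at {INTEGRATION_LOCUS}")
```

A table like this makes it straightforward to iterate over the promoter panel when comparing strains.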

[0156] A photosynthetic bacterial strain (e.g., a cyanobacteria strain such as JCC138 or JCC2055) is transformed with one or more plasmids (e.g. pJB2302, pJB2303, pJB2304, pJB2305, or pJB2306) expressing recombinant EmrA and EmrB (e.g., SEQ ID NO: 15-16 and 19-20 and 51-52). The strain can also be transformed with one or more plasmids expressing recombinant TolC (e.g., SEQ ID NO:21-22) or homologues thereof. The strain can also be transformed with one or more plasmids (e.g., pJB2166 vector re-engineered for use in cyanobacteria) expressing ADM and/or AAR (e.g., SEQ ID NO:7-8) or homologues thereof. Negative controls include the photosynthetic bacterial strain transformed with identical plasmids as above but without emrAB (e.g., pJB1720 vector re-engineered for use in cyanobacteria). Transformants are isolated on plates with appropriate antibiotics, picked from colonies into seed cultures with appropriate antibiotics, and incubated.

[0157] Hydrocarbon production and efflux of each strain is tested. Cells are harvested from the seed cultures and used to inoculate shake cultures. Following inoculation, cultures are incubated. In some instances, the hydrocarbon pathway is activated as described in the Examples above. Hydrocarbon production and efflux of each strain can be determined at various time points under various conditions. Overexpression of emrAB in the test strain(s) allows for continuous cell growth and increases total hydrocarbon production and % secretion in comparison with the negative control strain(s).

[0158] In some instances, the hydrocarbons are recovered. In some instances, the hydrocarbons are extracted. In some instances, the hydrocarbons are processed to produce a processed material.

TABLE 5

SEQ ID NO   DESCRIPTION   SEQUENCE (ALL NUCLEOTIDE SEQUENCES ARE 5'-3')

1 pUC origin of replication TCATGACCAAAATCCCTTAACGTGAGTTACGCGCGCGTCGTTCCACTGAGCGTCAGAC

CCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCT GCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCT ACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTT CTTCTAGTGTAGCCGTAGTTAGCCCACCACTTCAAGAACTCTGTAGCACCGCCTACAT ACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCT TACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACG GGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACC TACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTA TCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAAC GCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTT TGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTT ACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCT GATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACCGCTCG

rpn txn terminator CTGGGCGGTTCTGATAACGAGTAATCGTTAATCCGCAAATAACGTAAAAACCCGCTTC GGCGGGTTTTTTTATGGGGGGAGTTTAGGGAAAGAGCATTTGTCAGAATATTTAAGGG

CGCCTGTCACTTTGCTTGATATATGAGAATTATTTAACCTTATAAATGAGAAAAAAGC AACGCACTTTAAATAAGATACGTTGCTTTTTCGATTGATGAACACCTATAATTAAACT ATTCATCTATTATTTATGATTTTTTG A A ACAA ATTTCTAGTTTGT AAAGAGAA TTAAGAAAATAAATCTCGAAAATAATAAAGGGAAAATCAG

bla txn terminator CTGGGCGGTTCTGATAACGAGTAATCGTTAATCCGCAAATAACGTAAAAACCCGCTTC GGCGGGTTTTTTTATGGGGGGAGTTTAGGGAAAGAGCATTTGTCAGAATATTTAAGGG

CGCCTGTCACTTTGCTTGATATATGAGAATTATTTAACCTTATAAATGAGAAAAAAGC AACGCACTTTAAATAAGATACGTTGCTTTTTCGATTGATGAACACCTATAATTAAACT ATTCATCTATTATTTATGATTTTTTGTATATACAATATTTCTAGTTTGTTAAAGAGAA TTAAGAAAATAAATCTCGAAAATAATAAAGGGAAAATCAG

tetR gene TTAAGACCCACTTTCACATTTAAGTTGTTTTTCTAATCCGCAAATGATCAATTCAAGG

CCGAATAAGAAGGCTGGCTCTGCACCTTGGTGTTCAAATAATTCGATAGCTTGTCGTA ATAATGCTGGCATACTATCAGTAGTAGGTGTTTCCCTTTCTTCTTTAGCGACTTGATG CTCTTGATCTTCCAATACGCAACCTAAAGTAAGATGCCCCACAGCGCTGAGTGCATAT AATGCATTCTCTAGTGAAAAACCTTGTTGGCATAAAAAGGCTAATTGATTTTCGAGAG TTTCATACTGTTTTTCTGTAGGCCGTGTACCTAAATGTACTTTTGCTCCATCGCGATG ACTTAGTAAAGCACATCTAAAACTTTTAGCGTTATTACGTAAAAAATCTTGCCAGCTT TCCCCTTCTAAAGGGCAAAAGTGAGTATGGTGCCTATCTAACATCTCAATGGCTAAGG CGTCGAGCAAAGCCCGCTTATTTTTTACATGCCAATACAATGTAGGCTGCTCTACACC CAGCTTCTGGGCGAGTTTACGGGTTTTTAAACCTTCGATTCCGACCTCATTAAGCAGC TCTAATGCGCTGTTAATCACTTTACTTTTATCTAATCTGGACATCAT

P(cl) promoter gcaaccattatcaccgccagaggtaaaatagtcaacacgcacggtgtta

P(LtetO1) promoter TCCCTATCAGTGATAGAGATTGACATCCCTATCAGTGATAGAGATACTGAGCACATCA GCAGGACGCACTGACC

adm gene 1593 ATGCCGCAGCTTGAAGCCAGCCTTGAACTGGACTTTCAAAGCGAGTCCTACAAAGACG

CTTACAGCCGCATCAACGCGATCGTGATTGAAGGCGAACAAGAGGCGTTCGACAACTA CAATCGCCTTGCTGAGATGCTGCCCGACCAGCGGGATGAGCTTCACAAGCTAGCCAAG ATGGAACAGCGCCACATGAAAGGCTTTATGGCCTGTGGCAAAAATCTCTCCGTCACTC CTGACATGGGTTTTGCCCAGAAATTTTTCGAGCGCTTGCACGAGAACTTCAAAGCGGC GGCTGCGGAAGGCAAGGTCGTCACCTGCCTACTGATTCAATCGCTAATCATCGAGTGC TTTGCGATCGCGGCTTACAACATCTACATCCCAGTGGCGGATGCTTTTGCCCGCAAAA TCACGGAGGGGGTCGTGCGCGACGAATACCTGCACCGCAACTTCGGTGAAGAGTGGCT GAAGGCGAATTTTGATGCTTCCAAAGCCGAACTGGAAGAAGCCAATCGTCAGAACCTG CCCTTGGTTTGGCTAATGCTCAACGAAGTGGCCGATGATGCTCGCGAACTCGGGATGG AGCGTGAGTCGCTCGTCGAGGACTTTATGATTGCCTACGGTGAAGCTCTGGAAAACAT CGGCTTCACAACGCGCGAAATCATGCGTATGTCCGCCTATGGCCTTGCGGCCGTTTGA

aar gene 1594 ATGTTCGGTCTTATCGGTCATCTCACCAGTTTGGAGCAGGCCCGCGACGTTTCTCGCA

GGATGGGCTACGACGAATACGCCGATCAAGGATTGGAGTTTTGGAGTAGCGCTCCTCC TCAAATCGTTGATGAAATCACAGTCACCAGTGCCACAGGCAAGGTGATTCACGGTCGC TACATCGAATCGTGTTTCTTGCCGGAAATGCTGGCGGCGCGCCGCTTCAAAACAGCCA CGCGCAAAGTTCTCAATGCCATGTCCCATGCCCAAAAACACGGCATCGACATCTCGGC CTTGGGGGGCTTTACCTCGATTATTTTCGAGAATTTCGATTTGGCCAGTTTGCGGCAA GTGCGCGACACTACCTTGGAGTTTGAACGGTTCACCACCGGCAATACTCACACGGCCT ACGTAATCTGTAGACAGGTGGAAGCCGCTGCTAAAACGCTGGGCATCGACATTACCCA AGCGACAGTAGCGGTTGTCGGCGCGACTGGCGATATCGGTAGCGCTGTCTGCCGCTGG CTCGACCTCAAACTGGGTGTCGGTGATTTGATCCTGACGGCGCGCAATCAGGAGCGTT TGGATAACCTGCAGGCTGAACTCGGCCGGGGCAAGATTCTGCCCTTGGAAGCCGCTCT GCCGGAAGCTGACTTTATCGTGTGGGTCGCCAGTATGCCTCAGGGCGTAGTGATCGAC CCAGCAACCCTGAAGCAACCCTGCGTCCTAATCGACGGGGGCTACCCCAAAAACTTGG GCAGCAAAGTCCAAGGTGAGGGCATCTATGTCCTCAATGGCGGGGTAGTTGAACATTG CTTCGACATCGACTGGCAGATCATGTCCGCTGCAGAGATGGCGCGGCCCGAGCGCCAG ATGTTTGCCTGCTTTGCCGAGGCGATGCTCTTGGAATTTGAAGGCTGGCATACTAACT TCTCCTGGGGCCGCAACCAAATCACGATCGAGAAGATGGAAGCGATCGGTGAGGCATC GGTGCGCCACGGCTTCCAACCCTTGGCATTGGCAATTTGA

rrnBl B2 Tl txn terminator AATTAGCCCGGGCGAAAGGCCCAGTCTTTCGACTGAGCCTTTCGTTTTATTTGATGCC TGGCAGTTCCCTACTCTCGCATGGGGAGTCCCCACACTACCATCGGCGCTACGGCGTT TCACTTCTGAGTTCGGCATGGGGTCAGGTGGGACCACCGCGCTACTGCCGCCAGGCAA

A

bla gene ATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTC

CTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGG TGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTT CGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGG TATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCA GAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACA GTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTAC TTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGA TCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGAC GAGCGTGACACCACGATGCCTGTAGCGATGGCAACAACGTTGCGCAAACTATTAACTG GCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAA AGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAA TCCGGAGCCGGTGAGCGTGGTTCTCGCGGTATCATCGCAGCGCTGGGGCCAGATGGTA AGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACG AAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAA

lacI gene ATGAAACCAGTAACGTTATACGATGTCGCAGAGTATGCCGGTGTCTCTTATCAGACCG

TTTCCCGCGTGGTGAACCAGGCCAGCCACGTTTCTGCGAAAACGCGGGAAAAAGTGGA AGCGGCGATGGCGGAGCTGAATTACATTCCCAACCGCGTGGCACAACAACTGGCGGGC AAACAGTCGTTGCTGATTGGCGTTGCCACCTCCAGTCTGGCCCTGCACGCGCCGTCGC AAATTGTCGCGGCGATTAAATCTCGCGCCGATCAACTGGGTGCCAGCGTGGTGGTGTC GATGGTAGAACGAAGCGGCGTCGAAGCCTGTAAAGCGGCGGTGCACAATCTTCTCGCG CAACGCGTCAGTGGGCTGATCATTAACTATCCGCTGGATGACCAGGATGCCATTGCTG TGGAAGCTGCCTGCACTAATGTTCCGGCGTTATTTCTTGATGTCTCTGACCAGACACC CATCAACAGTATTATTTTCTCCCATGAGGACGGTACGCGACTGGGCGTGGAGCATCTG GTCGCATTGGGTCACCAGCAAATCGCGCTGTTAGCGGGCCCATTAAGTTCTGTCTCGG CGCGTCTGCGTCTGGCTGGCTGGCATAAATATCTCACTCGCAATCAAATTCAGCCGAT AGCGGAACGGGAAGGCGACTGGAGTGCCATGTCCGGTTTTCAACAAACCATGCAAATG CTGAATGAGGGCATCGTTCCCACTGCGATGCTGGTTGCCAACGATCAGATGGCGCTGG GCGCAATGCGCGCCATTACCGAGTCCGGGCTGCGCGTTGGTGCGGATATCTCGGTAGT GGGATACGACGATACCGAAGATAGCTCATGTTATATCCCGCCGTTAACCACCATCAAA CAGGATTTTCGCCTGCTGGGGCAAACCAGCGTGGACCGCTTGCTGCAACTCTCTCAGG GCCAGGCGGTGAAGGGCAATCAGCTGTTGCCAGTCTCACTGGTGAAAAGAAAAACCAC CCTGGCGCCCAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAG CTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGA

pJB2166 CTCATGACCAAAATCCCTTAACGTGAGTTACGCGCGCGTCGTTCCACTGAGCGTCAGA

CCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGC TGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGC TACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGT TCTTCTAGTGTAGCCGTAGTTAGCCCACCACTTCAAGAACTCTGTAGCACCGCCTACA TACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTC TTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAAC GGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATAC CTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGT ATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAA CGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTT TTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTT TACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCC TGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACCGCTCGCCGCAGC CGAACGACCGAGCGCAGCGAGTCAGTGAGCGAGGAAGCGGAAGGCGAGAGTAGGGAAC TGCCAGGCATCAAACTAAGCAGAAGGCCCCTGACGGATGGCCTTTTTGCGTTTCTACA AACTCTTTCTGTGTTGTAAAACGACGGCCAGTCTTAAGCTCGGGCCCCCTGGGCGGTT CTGATAACGAGTAATCGTTAATCCGCAAATAACGTAAAAACCCGCTTCGGCGGGTTTT TTTATGGGGGGAGTTTAGGGAAAGAGCATTTGTCAGAATATTTAAGGGCGCCTGTCAC T T T G C T T GA A AT G AG AAT AT T AAC C T A AAAT G AG AAAAAAG C AAC G C AC T T T AAATAAGATACGTTGCTTTTTCGATTGATGAACACCTATAATTAAACTATTCATCTAT T AT T T AT GAT T T T T T GT AT AT AC AAT AT TTCTAGTT T GT T AAAG AG AAT T AAG AAAAT AAATCTCGAAAATAATAAAGGGAAAATCAGTTTTTGATATCAAAATTATACATGTCAA CGATAATACAAAATATAATACAAACTATAAGATGTTATCAGTATTTATTATGCAGccg CTTAAGACCCACTTTCACATTTAAGTTGTTTTTCTAATCCGCAAATGATCAATTCAAG GCCGAATAAGAAGGCTGGCTCTGCACCTTGGTGTTCAAATAATTCGATAGCTTGTCGT AATAATGCTGGCATACTATCAGTAGTAGGTGTTTCCCTTTCTTCTTTAGCGACTTGAT GCTCTTGATCTTCCAATACGCAACCTAAAGTAAGATGCCCCACAGCGCTGAGTGCATA TAATGCATTCTCTAGTGAAAAACCTTGTTGGCATAAAAAGGCTAATTGATTTTCGAGA GTTTCATACTGTTTTTCTGTAGGCCGTGTACCTAAATGTACTTTTGCTCCATCGCGAT GACTTAGTAAAGCACATCTAAAACTTTTAGCGTTATTACGTAAAAAATCTTGCCAGCT TTCCCCTTCTAAAGGGCAAAAGTGAGTATGGTGCCTATCTAACATCTCAATGGCTAAG GCGTCGAGCAAAGCCCGCTTATTTTTTACATGCCAATACAATGTAGGCTGCTCTACAC CCAGCTTCTGGGCGAGTTTACGGGTTTTTAAACCTTCGATTCCGACCTCATTAAGCAG 
CTCTAATGCGCTGTTAATCACTTTACTTTTATCTAATCTGGACATCATttggttttCC tccagcaaaatgtacagcaaccattatcaccgccagaggtaaaatagtcaacacgcac ggtgttagagctcTCCCTATCAGTGA AGAGATTGACATCCC ATCAGTGA AGAGAT ACTGAGCACATCAGCAGGACGCACTGACCCAATTCATTAAAGAGGAGAAAGGTCATAT GCCGCAGCTTGAAGCCAGCCTTGAACTGGACTTTCAAAGCGAGTCCTACAAAGACGCT TACAGCCGCATCAACGCGATCGTGATTGAAGGCGAACAAGAGGCGTTCGACAACTACA ATCGCCTTGCTGAGATGCTGCCCGACCAGCGGGATGAGCTTCACAAGCTAGCCAAGAT GGAACAGCGCCACATGAAAGGCTTTATGGCCTGTGGCAAAAATCTCTCCGTCACTCCT GACATGGGTTTTGCCCAGAAATTTTTCGAGCGCTTGCACGAGAACTTCAAAGCGGCGG CTGCGGAAGGCAAGGTCGTCACCTGCCTACTGATTCAATCGCTAATCATCGAGTGCTT TGCGATCGCGGCTTACAACATCTACATCCCAGTGGCGGATGCTTTTGCCCGCAAAATC ACGGAGGGGGTCGTGCGCGACGAATACCTGCACCGCAACTTCGGTGAAGAGTGGCTGA AGGCGAATTTTGATGCTTCCAAAGCCGAACTGGAAGAAGCCAATCGTCAGAACCTGCC CTTGGTTTGGCTAATGCTCAACGAAGTGGCCGATGATGCTCGCGAACTCGGGATGGAG CGTGAGTCGCTCGTCGAGGACTTTATGATTGCCTACGGTGAAGCTCTGGAAAACATCG GCTTCACAACGCGCGAAATCATGCGTATGTCCGCCTATGGCCTTGCGGCCGTTTGAtc caggaaatctgaATGTTCGGTCTTATCGGTCATCTCACCAGTTTGGAGCAGGCCCGCG ACGTTTCTCGCAGGATGGGCTACGACGAATACGCCGATCAAGGATTGGAGTTTTGGAG TAGCGCTCCTCCTCAAATCGTTGATGAAATCACAGTCACCAGTGCCACAGGCAAGGTG ATTCACGGTCGCTACATCGAATCGTGTTTCTTGCCGGAAATGCTGGCGGCGCGCCGCT TCAAAACAGCCACGCGCAAAGTTCTCAATGCCATGTCCCATGCCCAAAAACACGGCAT CGACATCTCGGCCTTGGGGGGCTTTACCTCGATTATTTTCGAGAATTTCGATTTGGCC AGTTTGCGGCAAGTGCGCGACACTACCTTGGAGTTTGAACGGTTCACCACCGGCAATA CTCACACGGCCTACGTAATCTGTAGACAGGTGGAAGCCGCTGCTAAAACGCTGGGCAT CGACATTACCCAAGCGACAGTAGCGGTTGTCGGCGCGACTGGCGATATCGGTAGCGCT GTCTGCCGCTGGCTCGACCTCAAACTGGGTGTCGGTGATTTGATCCTGACGGCGCGCA ATCAGGAGCGTTTGGATAACCTGCAGGCTGAACTCGGCCGGGGCAAGATTCTGCCCTT GGAAGCCGCTCTGCCGGAAGCTGACTTTATCGTGTGGGTCGCCAGTATGCCTCAGGGC GTAGTGATCGACCCAGCAACCCTGAAGCAACCCTGCGTCCTAATCGACGGGGGCTACC CCAAAAACTTGGGCAGCAAAGTCCAAGGTGAGGGCATCTATGTCCTCAATGGCGGGGT AGTTGAACATTGCTTCGACATCGACTGGCAGATCATGTCCGCTGCAGAGATGGCGCGG CCCGAGCGCCAGATGTTTGCCTGCTTTGCCGAGGCGATGCTCTTGGAATTTGAAGGCT GGCATACTAACTTCTCCTGGGGCCGCAACCAAATCACGATCGAGAAGATGGAAGCGAT 
CGGTGAGGCATCGGTGCGCCACGGCTTCCAACCCTTGGCATTGGCAATTTGAgaattc AAAacgtttcaattggctaataggatccTAGACGTCgcTAAtacggccggccaccctt ttttaggtagcGCTAGCatagggcccTAACTCGAGCCCCAAGGGCGACACCCCATAAT TAGCCCGGGCGAAAGGCCCAGTCTTTCGACTGAGCCTTTCGTTTTATTTGATGCCTGG CAGTTCCCTACTCTCGCATGGGGAGTCCCCACACTACCATCGGCGCTACGGCGTTTCA CTTCTGAGTTCGGCATGGGGTCAGGTGGGACCACCGCGCTACTGCCGCCAGGCAAACA AGGGGTGTTATGAGCCATATTCAGGTATAAATGGGCTCGCGATAATGTTCAGAATTGG TTAATTGGTTGTAACACTGACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATG TATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGA ATATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCT TCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTG GGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTT TTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGC GGTATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCT CAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGA CAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTT ACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGG GATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACG ACGAGCGTGACACCACGATGCCTGTAGCGATGGCAACAACGTTGCGCAAACTATTAAC TGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGAT AAAGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATA AATCCGGAGCCGGTGAGCGTGGTTCTCGCGGTATCATCGCAGCGCTGGGGCCAGATGG TAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAA CGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAAGCGGCGC GCCATCGAATGGCGCAAAACCTTTCGCGGTATGGCATGATAGCGCCCGGAAGAGAGTC AATTCAGGGTGGTGAATATGAAACCAGTAACGTTATACGATGTCGCAGAGTATGCCGG TGTCTCTTATCAGACCGTTTCCCGCGTGGTGAACCAGGCCAGCCACGTTTCTGCGAAA ACGCGGGAAAAAGTGGAAGCGGCGATGGCGGAGCTGAATTACATTCCCAACCGCGTGG CACAACAACTGGCGGGCAAACAGTCGTTGCTGATTGGCGTTGCCACCTCCAGTCTGGC CCTGCACGCGCCGTCGCAAATTGTCGCGGCGATTAAATCTCGCGCCGATCAACTGGGT GCCAGCGTGGTGGTGTCGATGGTAGAACGAAGCGGCGTCGAAGCCTGTAAAGCGGCGG TGCACAATCTTCTCGCGCAACGCGTCAGTGGGCTGATCATTAACTATCCGCTGGATGA CCAGGATGCCATTGCTGTGGAAGCTGCCTGCACTAATGTTCCGGCGTTATTTCTTGAT 
GTCTCTGACCAGACACCCATCAACAGTATTATTTTCTCCCATGAGGACGGTACGCGAC TGGGCGTGGAGCATCTGGTCGCATTGGGTCACCAGCAAATCGCGCTGTTAGCGGGCCC ATTAAGTTCTGTCTCGGCGCGTCTGCGTCTGGCTGGCTGGCATAAATATCTCACTCGC AATCAAATTCAGCCGATAGCGGAACGGGAAGGCGACTGGAGTGCCATGTCCGGTTTTC AACAAACCATGCAAATGCTGAATGAGGGCATCGTTCCCACTGCGATGCTGGTTGCCAA CGATCAGATGGCGCTGGGCGCAATGCGCGCCATTACCGAGTCCGGGCTGCGCGTTGGT GCGGATATCTCGGTAGTGGGATACGACGATACCGAAGATAGCTCATGTTATATCCCGC CGTTAACCACCATCAAACAGGATTTTCGCCTGCTGGGGCAAACCAGCGTGGACCGCTT GCTGCAACTCTCTCAGGGCCAGGCGGTGAAGGGCAATCAGCTGTTGCCAGTCTCACTG GTGAAAAGAAAAACCACCCTGGCGCCCAATACGCAAACCGCCTCTCCCCGCGCGTTGG CCGATTCATTAATGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGA

P15A origin of replication gcgctagcggagtgtatactggcttactatgttggcactgatgagggtgtcagtgaag tgcttcatgtggcaggagaaaaaaggctgcaccggtgcgtcagcagaatatgtgatac aggatatattccgcttcctcgctcactgactcgctacgctcggtcgttcgactgcggc gagcggaaatggcttacgaacggggcggagatttcctggaagatgccaggaagatact taacagggaagtgagagggccgcggcaaagccgtttttccataggctccgcccccctg acaagcatcacgaaatctgacgctcaaatcagtggtggcgaaacccgacaggactata aagataccaggcgtttccccctggcggctccctcgtgcgctctcctgttcctgccttt cggtttaccggtgtcattccgctgttatggccgcgtttgtctcattccacgcctgaca ctcagttccgggtaggcagttcgctccaagctggactgtatgcacgaaccccccgttc agtccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggaaagaca tgcaaaagcaccactggcagcagccactggtaattgatttagaggagttagtcttgaa gtcatgcgccggttaaggctaaactgaaaggacaagttttggtgactgcgctcctcca agccagttacctcggttcaaagagttggtagctcagagaaccttcgaaaaaccgccct gcaaggcggttttttcgttttcagagcaagagattacgcgcagaccaaaacgatctca agaagatcatcttattaatcagataaaatatttctagatttcagtgcaatttatctct tcaaatgtagcacctgaagtcagccccatacgatataagttgt

T5 promoter AATTGTGAGCGGATAACAATTACGAGCTTCATGCACAGTGAAATCATGAAAAATTTAT

TTGCTTTGTGAGCGGATAACAATTATAATATGTGGAATTGTGAGCGCTCACAATTCCA CAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTTTTAGGAGGTAAAA

emrA gene ATGAGCGCAAATGCGGAGACTCAAACCCCGCAGCAACCGGTAAAGAAGAGCGGCAAAC

GTAAGCGTCTGCTCCTCCTTCTCACCTTGCTCTTTATAATTATTGCCGTAGCGATAGG GATTTATTGGTTTTTGGTACTGCGTCACTTCGAAGAAACCGATGACGCATACGTGGCA GGGAATCAAATTCAAATTATGTCTCAGGTGTCTGGCAGCGTGACGAAAGTCTGGGCCG ATAACACCGATTTTGTAAAAGAAGGCGACGTGCTGGTCACTCTCGACCCGACAGATGC TCGCCAGGCGTTTGAAAAAGCCAAAACTGCACTGGCTTCCAGCGTTCGCCAAACCCAC CAGCTGATGATTAACAGCAAGCAGTTGCAGGCGAATATTGAGGTGCAGAAAATCGCCC TCGCGAAAGCACAAAGCGACTACAACCGCCGTGTGCCGCTGGGCAATGCCAACCTGAT TGGTCGCGAAGAGCTGCAACACGCCCGCGACGCCGTCACCAGTGCCCAGGCGCAACTG GACGTCGCGATTCAACAATACAATGCCAATCAGGCGATGATTCTGGGGACTAAACTGG AAGATCAGCCAGCCGTGCAACAGGCTGCCACCGAAGTACGTAACGCCTGGCTGGCGCT GGAGCGTACTCGTATTATCAGTCCGATGACCGGTTATGTCTCCCGCCGCGCGGTACAG CCTGGGGCGCAAATTAGCCCAACGACGCCGCTGATGGCGGTCGTTCCAGCCACCAATA TGTGGGTGGATGCCAACTTTAAAGAGACGCAGATTGCCAATATGCGTATCGGTCAGCC GGTCACTATCACCACGGATATTTACGGCGATGATGTGAAATACACCGGTAAAGTGGTT GGTCTGGATATGGGCACAGGTAGCGCGTTCTCACTGCTTCCAGCGCAAAATGCGACCG GTAACTGGATCAAAGTCGTTCAGCGTCTGCCTGTGCGTATCGAACTGGACCAGAAACA GCTGGAGCAATATCCGCTGCGTATCGGTTTGTCCACGCTGGTGAGCGTCAATACCACT AACCGTGACGGTCAGGTACTGGCAAATAAAGTACGTTCCACTCCGGTAGCGGTAAGCA CCGCGCGTGAAATCAGCCTGGCACCTGTCAATAAACTGATCGACGATATCGTAAAAGC TAACGCTGGCtaa

emrB gene ATGCAACAGCAAAAACCGCTGGAAGGCGCGCAACTGGTCATTATGACGATTGCGCTGT

CACTGGCGACATTCATGCAGGTGCTGGACTCCACCATTGCTAACGTGGCGATCCCCAC TATCGCCGGGAATCTGGGCTCATCGCTCAGCCAGGGAACGTGGGTAATCACTTCTTTC GGGGTGGCGAATGCCATCTCGATCCCGCTTACCGGCTGGCTGGCAAAGCGCGTCGGGG AAGTGAAACTGTTCCTTTGGTCCACCATCGCCTTTGCTATTGCGTCGTGGGCGTGTGG TGTCTCCAGCAGCCTGAATATGCTGATCTTCTTCCGCGTGATTCAGGGGATTGTCGCC GGGCCGTTGATCCCGCTTTCGCAAAGTCTATTGCTGAATAACTACCCGCCAGCCAAAC GCTCGATCGCGCTGGCGTTGTGGTCGATGACGGTGATTGTCGCGCCAATTTGCGGCCC GATCCTCGGCGGTTATATCAGCGATAATTACCACTGGGGCTGGATATTCTTCATCAAC GTGCCGATTGGCGTGGCGGTGGTGTTGATGACACTGCAAACTCTGCGCGGACGTGAAA CCCGCACCGAACGGCGGCGGATTGATGCCGTGGGGCTGGCACTGCTGGTTATTGGTAT CGGCAGCCTGCAGATTATGCTCGACCGCGGTAAAGAGCTGGACTGGTTTTCATCACAG GAAATTATCATCCTTACCGTGGTGGCGGTGGTGGCTATCTGCTTCCTGATTGTCTGGG AGCTGACCGACGATAACCCGATAGTCGATCTGTCGTTGTTTAAGTCGCGCAACTTCAC CATCGGCTGCTTGTGTATCAGCCTCGCGTATATGCTCTACTTCGGCGCTATTGTTCTG CTGCCGCAGTTGTTGCAGGAGGTCTACGGTTACACGGCGACCTGGGCAGGTTTGGCCT CTGCGCCGGTAGGGATTATTCCGGTGATCCTGTCGCCGATTATCGGCCGCTTCGCGCA TAAACTGGATATGCGGCGGCTGGTAACCTTCAGCTTTATTATGTATGCCGTCTGCTTC TACTGGCGTGCCTATACCTTTGAACCAGGTATGGATTTTGGCGCGTCGGCCTGGCCGC AGTTTATCCAGGGGTTTGCGGTGGCCTGCTTCTTTATGCCGCTGACCACCATTACGCT GTCTGGTTTGCCACCGGAACGACTGGCGGCGGCATCGAGCCTCTCTAACTTTACGCGA ACGCTGGCGGGGTCTATCGGCACGTCGATAACCACGACCATGTGGACCAACCGCGAGT CGATGCACCATGCGCAGTTGACTGAGTCGGTAAACCCGTTCAACCCGAATGCCCAGGC GATGTACAGTCAACTGGAAGGGCTTGGGATGACGCAACAGCAGGCATCAGGCTGGATT GCCCAGCAGATCACCAATCAGGGGCTGATTATTTCCGCCAATGAGATCTTCTGGATGT CAGCCGGGATATTCCTCGTCCTGCTGGGGCTGGTGTGGTTTGCTAAACCGCCATTTGG CGCAGGTGGCGGCGGAGGCGGTGCGCACTAA

cat gene ttacgccccgccctgccactcatcgcagtactgttgtaattcattaagcattctgccg acatggaagccatcacagacggcatgatgaacctgaatcgccagcggcatcagcacct tgtcgccttgcgtataatatttgcccatggtgaaaacgggggcgaagaagttgtccat attggccacgtttaaatcaaaactggtgaaactcacccagggattggctgagacgaaa aacatattctcaataaaccctttagggaaataggccaggttttcaccgtaacacgcca catcttgcgaatatatgtgtagaaactgccggaaatcgtcgtggtattcactccagag cgatgaaaacgtttcagtttgctcatggaaaacggtgtaacaagggtgaacactatcc catatcaccagctcaccgtctttcattgccatacggaattccggatgagcattcatca ggcgggcaagaatgtgaataaaggccggataaaacttgtgcttatttttctttacggt ctttaaaaaggccgtaatatccagctgaacggtctggttataggtacattgagcaact gactgaaatgcctcaaaatgttctttacgatgccattgggatatatcaacggtggtat atccagtgatttttttctccat

pJB1849 ggtgatgctgccaacttactgatttagtgtatgatggtgtttttgaggtgctccagtg gcttctgtttctatcagctgtccctcctgttcagctactgacggggtggtgcgtaacg gcaaaagcaccgccggacatcagcgctagcggagtgtatactggcttactatgttggc actgatgagggtgtcagtgaagtgcttcatgtggcaggagaaaaaaggctgcaccggt gcgtcagcagaatatgtgatacaggatatattccgcttcctcgctcactgactcgcta cgctcggtcgttcgactgcggcgagcggaaatggcttacgaacggggcggagatttcc tggaagatgccaggaagatacttaacagggaagtgagagggccgcggcaaagccgttt ttccataggctccgcccccctgacaagcatcacgaaatctgacgctcaaatcagtggt ggcgaaacccgacaggactataaagataccaggcgtttccccctggcggctccctcgt gcgctctcctgttcctgcctttcggtttaccggtgtcattccgctgttatggccgcgt ttgtctcattccacgcctgacactcagttccgggtaggcagttcgctccaagctggac tgtatgcacgaaccccccgttcagtccgaccgctgcgccttatccggtaactatcgtc ttgagtccaacccggaaagacatgcaaaagcaccactggcagcagccactggtaattg atttagaggagttagtcttgaagtcatgcgccggttaaggctaaactgaaaggacaag ttttggtgactgcgctcctccaagccagttacctcggttcaaagagttggtagctcag agaaccttcgaaaaaccgccctgcaaggcggttttttcgttttcagagcaagagatta cgcgcagaccaaaacgatctcaagaagatcatcttattaatcagataaaatatttcta gatttcagtgcaatttatctcttcaaatgtagcacctgaagtcagccccatacgatat aagttgtaattctcatgtttgacagcttatcatcgataAGCTTCCTTAATTGTGAGCG GATAACAATTACGAGCTTCATGCACAGTGAAATCATGAAAAATTTATTTGCTTTGTGA GCGGATAACAATTATAATATGTGGAATTGTGAGCGCTCACAATTCCACAACGGTTTCC CTCTAGAAATAATTTTGTTTAACTTTTAGGAGGTAAAACATATGAGCGCAAATGCGGA GACTCAAACCCCGCAGCAACCGGTAAAGAAGAGCGGCAAACGTAAGCGTCTGCTCCTC CTTCTCACCTTGCTCTTTATAATTATTGCCGTAGCGATAGGGATTTATTGGTTTTTGG TACTGCGTCACTTCGAAGAAACCGATGACGCATACGTGGCAGGGAATCAAATTCAAAT TATGTCTCAGGTGTCTGGCAGCGTGACGAAAGTCTGGGCCGATAACACCGATTTTGTA AAAGAAGGCGACGTGCTGGTCACTCTCGACCCGACAGATGCTCGCCAGGCGTTTGAAA AAGCCAAAACTGCACTGGCTTCCAGCGTTCGCCAAACCCACCAGCTGATGATTAACAG CAAGCAGTTGCAGGCGAATATTGAGGTGCAGAAAATCGCCCTCGCGAAAGCACAAAGC GACTACAACCGCCGTGTGCCGCTGGGCAATGCCAACCTGATTGGTCGCGAAGAGCTGC AACACGCCCGCGACGCCGTCACCAGTGCCCAGGCGCAACTGGACGTCGCGATTCAACA ATACAATGCCAATCAGGCGATGATTCTGGGGACTAAACTGGAAGATCAGCCAGCCGTG CAACAGGCTGCCACCGAAGTACGTAACGCCTGGCTGGCGCTGGAGCGTACTCGTATTA 
TCAGTCCGATGACCGGTTATGTCTCCCGCCGCGCGGTACAGCCTGGGGCGCAAATTAG CCCAACGACGCCGCTGATGGCGGTCGTTCCAGCCACCAATATGTGGGTGGATGCCAAC TTTAAAGAGACGCAGATTGCCAATATGCGTATCGGTCAGCCGGTCACTATCACCACGG ATATTTACGGCGATGATGTGAAATACACCGGTAAAGTGGTTGGTCTGGATATGGGCAC AGGTAGCGCGTTCTCACTGCTTCCAGCGCAAAATGCGACCGGTAACTGGATCAAAGTC GTTCAGCGTCTGCCTGTGCGTATCGAACTGGACCAGAAACAGCTGGAGCAATATCCGC TGCGTATCGGTTTGTCCACGCTGGTGAGCGTCAATACCACTAACCGTGACGGTCAGGT ACTGGCAAATAAAGTACGTTCCACTCCGGTAGCGGTAAGCACCGCGCGTGAAATCAGC CTGGCACCTGTCAATAAACTGATCGACGATATCGTAAAAGCTAACGCTGGCtaaTCCA GAGGTGCGTGTGATGCAACAGCAAAAACCGCTGGAAGGCGCGCAACTGGTCATTATGA CGATTGCGCTGTCACTGGCGACATTCATGCAGGTGCTGGACTCCACCATTGCTAACGT GGCGATCCCCACTATCGCCGGGAATCTGGGCTCATCGCTCAGCCAGGGAACGTGGGTA ATCACTTCTTTCGGGGTGGCGAATGCCATCTCGATCCCGCTTACCGGCTGGCTGGCAA AGCGCGTCGGGGAAGTGAAACTGTTCCTTTGGTCCACCATCGCCTTTGCTATTGCGTC GTGGGCGTGTGGTGTCTCCAGCAGCCTGAATATGCTGATCTTCTTCCGCGTGATTCAG GGGATTGTCGCCGGGCCGTTGATCCCGCTTTCGCAAAGTCTATTGCTGAATAACTACC CGCCAGCCAAACGCTCGATCGCGCTGGCGTTGTGGTCGATGACGGTGATTGTCGCGCC AATTTGCGGCCCGATCCTCGGCGGTTATATCAGCGATAATTACCACTGGGGCTGGATA TTCTTCATCAACGTGCCGATTGGCGTGGCGGTGGTGTTGATGACACTGCAAACTCTGC GCGGACGTGAAACCCGCACCGAACGGCGGCGGATTGATGCCGTGGGGCTGGCACTGCT GGTTATTGGTATCGGCAGCCTGCAGATTATGCTCGACCGCGGTAAAGAGCTGGACTGG TTTTCATCACAGGAAATTATCATCCTTACCGTGGTGGCGGTGGTGGCTATCTGCTTCC TGATTGTCTGGGAGCTGACCGACGATAACCCGATAGTCGATCTGTCGTTGTTTAAGTC GCGCAACTTCACCATCGGCTGCTTGTGTATCAGCCTCGCGTATATGCTCTACTTCGGC GCTATTGTTCTGCTGCCGCAGTTGTTGCAGGAGGTCTACGGTTACACGGCGACCTGGG CAGGTTTGGCCTCTGCGCCGGTAGGGATTATTCCGGTGATCCTGTCGCCGATTATCGG CCGCTTCGCGCATAAACTGGATATGCGGCGGCTGGTAACCTTCAGCTTTATTATGTAT GCCGTCTGCTTCTACTGGCGTGCCTATACCTTTGAACCAGGTATGGATTTTGGCGCGT CGGCCTGGCCGCAGTTTATCCAGGGGTTTGCGGTGGCCTGCTTCTTTATGCCGCTGAC CACCATTACGCTGTCTGGTTTGCCACCGGAACGACTGGCGGCGGCATCGAGCCTCTCT AACTTTACGCGAACGCTGGCGGGGTCTATCGGCACGTCGATAACCACGACCATGTGGA CCAACCGCGAGTCGATGCACCATGCGCAGTTGACTGAGTCGGTAAACCCGTTCAACCC GAATGCCCAGGCGATGTACAGTCAACTGGAAGGGCTTGGGATGACGCAACAGCAGGCA 
TCAGGCTGGATTGCCCAGCAGATCACCAATCAGGGGCTGATTATTTCCGCCAATGAGA TCTTCTGGATGTCAGCCGGGATATTCCTCGTCCTGCTGGGGCTGGTGTGGTTTGCTAA ACCGCCATTTGGCGCAGGTGGCGGCGGAGGCGGTGCGCACTAAaGCATGCcatgcacc attccttgcggcggcggtgctcaacggcctcaacctactactgggctgcttcctaatg caggagtcgcataagggagagcgtcgaccgatgcccttgagagccttcaacccagtca gctccttccggtgggcgcggggcatgactatcgtcgccgcacttatgactgtcttctt tatcatgcaactcgtaggacaggtgccggcagcgctctgggtcattttcggcgaggac cgctttcgctggagcgcgacgatgatcggcctgtcgcttgcggtattcggaatcttgc acgccctcgctcaagccttcgtcactggtcccgccaccaaacgtttcggcgagaagca ggccattatcgccggcatggcggccgacgcgctgggctacgtcttgctggcgttcgcg acgcgaggctggatggccttccccattatgattcttctcgcttccggcggcatcggga tgcccgcgttgcaggccatgctgtccaggcaggtagatgacgaccatcagggacagct tcaaggatcgctcgcggctcttaccagcctaacttcgatcactggaccgctgatcgtc acggcgatttatgccgcctcggcgagcacatggaacgggttggcatggattgtaggcg ccgccctataccttgtctgcctccccgcgttgcgtcgcggtgcatggagccgggccac ctcgacctgaatggaagccggcggcacctcgctaacggattcaccactccaagaattg gagccaatcaattcttgcggagaactgtgaatgcgcaaaccaacccttggcagaacat atccatcgcgtccgccatctccagcagccgcacgcggcgcatctcgggcagcgttggg tcctggccacgggtgcgcatgatcgtgctcctgtcgttgaggacccggctaggctggc ggggttgccttactggttagcagaatgaatcaccgatacgcgagcgaacgtgaagcga ctgctgctgcaaaacgtctgcgacctgagcaacaacatgaatggtcttcggtttccgt gtttcgtaaagtctggaaacgcggaagtcccctacgtgctgctgaagttgcccgcaac agagagtggaaccaaccggtgataccacgatactatgactgagagtcaacgccatgag cggcctcatttcttattctgagttacaacagtccgcaccgctgtccggtagctccttc cggtgggcgcggggcatgactatcgtcgccgcacttatgactgtcttctttatcatgc aactcgtaggacaggtgccggcagcgcccaacagtcccccggccacggggcctgccac catacccacgccgaaacaagcgccctgcaccattatgttccggatctgcatcgcagga tgctgctggctaccctgtggaacacctacatctgtattaacgaagcgctaaccgtttt tatcaggctctgggaggcagaataaatgatcatatcgtcaattattacctccacgggg agagcctgagcaaactggcctcaggcatttgagaagcacacggtcacactgcttccgg tagtcaataaaccggtaaaccagcaatagacataagcggctatttaacgaccctgccc tgaaccgacgaccgggtcgaatttgctttcgaatttctgccattcatccgcttattat cacttattcaggcgtagcaccaggcgtttaagggcaccaataactgccttaaaaaaat 
tacgccccgccctgccactcatcgcagtactgttgtaattcattaagcattctgccga catggaagccatcacagacggcatgatgaacctgaatcgccagcggcatcagcacctt gtcgccttgcgtataatatttgcccatggtgaaaacgggggcgaagaagttgtccata ttggccacgtttaaatcaaaactggtgaaactcacccagggattggctgagacgaaaa acatattctcaataaaccctttagggaaataggccaggttttcaccgtaacacgccac atcttgcgaatatatgtgtagaaactgccggaaatcgtcgtggtattcactccagagc gatgaaaacgtttcagtttgctcatggaaaacggtgtaacaagggtgaacactatccc atatcaccagctcaccgtctttcattgccatacggaattccggatgagcattcatcag gcgggcaagaatgtgaataaaggccggataaaacttgtgcttatttttctttacggtc tttaaaaaggccgtaatatccagctgaacggtctggttataggtacattgagcaactg actgaaatgcctcaaaatgttctttacgatgccattgggatatatcaacggtggtata tccagtgatttttttctccattttagcttccttagctcctgaaaatctcgataactca aaaaatacgcccggtagtgatcttatttcattatggtgaaagttggaacctcttacgt gccgatcaacgtctcattttcgccaaaagttggcccagggcttcccggtatcaacagg gacaccaggatttatttattctgcgaagtgatcttccgtcacaggtatttattcggcg caaagtgcgtcg

EmrA protein MSANAETQTP QQPVKKSGKR KRLLLLLTLL FIIIAVAIGI YWFLVLRHFE

ETDDAYVAGN QIQIMSQVSG SVTKVWADNT DFVKEGDVLV TLDPTDARQA FEKAKTALAS SVRQTHQLMI NSKQLQANIE VQKIALAKAQ SDYNRRVPLG NANLIGREEL QHARDAVTSA QAQLDVAIQQ YNANQAMILG TKLEDQPAVQ QAATEVRNAW LALERTRIIS PMTGYVSRRA VQPGAQISPT TPLMAVVPAT NMWVDANFKE TQIANMRIGQ PVTITTDIYG DDVKYTGKVV GLDMGTGSAF SLLPAQNATG NWIKVVQRLP VRIELDQKQL EQYPLRIGLS TLVSVNTTNR DGQVLANKVR STPVAVSTAR EISLAPVNKL IDDIVKANAG

EmrB protein MQQQKPLEGA QLVIMTIALS LATFMQVLDS TIANVAIPTI AGNLGSSLSQ GTWVITSFGV ANAISIPLTG WLAKRVGEVK LFLWSTIAFA IASWACGVSS SLNMLIFFRV IQGIVAGPLI PLSQSLLLNN YPPAKRSIAL ALWSMTVIVA PICGPILGGY ISDNYHWGWI FFINVPIGVA VVLMTLQTLR GRETRTERRR IDAVGLALLV IGIGSLQIML DRGKELDWFS SQEIIILTVV AVVAICFLIV WELTDDNPIV DLSLFKSRNF TIGCLCISLA YMLYFGAIVL LPQLLQEVYG YTATWAGLAS APVGIIPVIL SPIIGRFAHK LDMRRLVTFS FIMYAVCFYW RAYTFEPGMD FGASAWPQFI QGFAVACFFM PLTTITLSGL PPERLAAASS LSNFTRTLAG SIGTSITTTM WTNRESMHHA QLTESVNPFN PNAQAMYSQL EGLGMTQQQA SGWIAQQITN QGLIISANEI FWMSAGIFLV LLGLVWFAKP PFGAGGGGGG AH

tolC gene ATGAAGAAATTGCTCCCCATTCTTATCGGCCTGAGCCTTTCTGGGTTCAGTTCGTTGA

GCCAGGCCGAGAACCTGATGCAAGTTTATCAGCAAGCACGCCTTAGTAACCCGGAATT GCGTAAGTCTGCCGCCGATCGTGATGCTGCCTTTGAAAAAATTAATGAAGCGCGCAGT CCATTACTGCCACAGCTAGGTTTAGGTGCAGATTACACCTATAGCAACGGCTACCGCG ACGCGAACGGCATCAACTCTAACGCGACCAGTGCGTCCTTGCAGTTAACTCAATCCAT TTTTGATATG

TCGAAATGGCGTGCGTTAACGCTGCAGGAAAAAGCAGCAGGGATTCAGGACGTCACGT ATCAGACCGATCAGCAAACCTTGATCCTCAACACCGCGACCGCTTATTTCAACGTGTT GAATGCTATTGACGTTCTTTCCTATACACAGGCACAAAAAGAAGCGATCTACCGTCAA TTAGATCAAACCACCCAACGTTTTAACGTGGGCCTGGTAGCGATCACCGACGTGCAGA ACGCCCGCGCACAGTACGATACCGTGCTGGCGAACGAAGTGACCGCACGTAATAACCT TGATAACGCG

GTAGAGCAGCTGCGCCAGATCACCGGTAACTACTATCCGGAACTGGCTGCGCTGAATG TCGAAAACTTTAAAACCGACAAACCACAGCCGGTTAACGCGCTGCTGAAAGAAGCCGA AAAACGCAACCTGTCGCTGTTACAGGCACGCTTGAGCCAGGACCTGGCGCGCGAGCAA ATTCGCCAGGCGCAGGATGGTCACTTACCGACTCTGGATTTAACGGCTTCTACCGGGA TTTCTGACACCTCTTATAGCGGTTCGAAAACCCGTGGTGCCGCTGGTACCCAGTATGA CGATAGCAAT

ATGGGCCAGAACAAAGTTGGCCTGAGCTTCTCGCTGCCGATTTATCAGGGCGGAATGG TTAACTCGCAGGTGAAACAGGCACAGTACAACTTTGTCGGTGCCAGCGAGCAACTGGA AAGTGCCCATCGTAGCGTCGTGCAGACCGTGCGTTCCTCCTTCAACAACATTAATGCA TCTATCAGTAGCATTAACGCCTACAAACAAGCCGTAGTTTCCGCTCAAAGCTCATTAG ACGCGATGGAAGCGGGCTACTCGGTCGGTACGCGTACCATTGTTGATGTGTTGGATGC GACCACCACG

TTGTACAACGCCAAGCAAGAGCTGGCGAATGCGCGTTATAACTACCTGATTAATCAGC TGAATATTAAGTCAGCTCTGGGTACGTTGAACGAGCAGGATCTGCTGGCACTGAACAA TGCGCTGAGCAAACCGGTTTCCACTAATCCGGAAAACGTTGCACCGCAAACGCCGGAA CAGAATGCTATTGCTGATGGTTATGCGCCTGATAGCCCGGCACCAGTCGTTCAGCAAA CATCCGCACGCACTACCACCAGTAACGGTCATAACCCTTTCCGTAACTGA

TolC protein MKKLLPILIGLSLSGFSSLSQAENLMQVYQQARLSNPELRKSAADRDAAFE

KINEARSPLLPQLGLGADYTYSNGYRDANGINSNATSASLQLTQSIFDMSK WRALTLQEKAAGIQDVTYQTDQQTLILNTATAYFNVLNAIDVLSYTQAQKE AIYRQLDQTTQRFNVGLVAITDVQNARAQYDTVLANEVTARNNLDNAVEQL RQITGNYYPELAALNVENFKTDKPQPVNALLKEAEKRNLSLLQARLSQDLA REQIRQAQDGHLPTLDLTASTGISDTSYSGSKTRGAAGTQYDDSNMGQNKV GLSFSLPIYQGGMVNSQVKQAQYNFVGASEQLESAHRSVVQTVRSSFNNIN ASISSINAYKQAVVSAQSSLDAAGYSVGTRTIVDVLDATTTLYNAKQELAN ARYNYLINQLNKSALGTLNEQDLLALNNALSKPVSTNPENVAPQTPEQNAI ADGYAPDSPAPVVQQTSARTTTSNGHNPFRN

Cyan7822_2244 MKPEPLTNNSIQLESSPVSNGKKTQLERNPQQSSQTTYEDQIESTSSIASEGNETKLK PNEQKNLPIKNEDTVTFKDPGLEEDPQQNQEKNPLKSLVKRGLMLLITLSLFGVGGLW

GWRWWQFGQTHVTTDNAQIQGHLSPISAKISATVQQVLIEDGDPVEAGQILIILEDQD LNLKIQQAQANLRAAQAHLKTARDTVSVTRQTNPTQVQQAQSKLASSQSAVSAEQANV HQMQAKVETEQANVAQAQTLVNKTLADFRRYEFLYEQGAVSAQQFDTARAAYEDARSH LAATNKTVAQAQAEVKNAQAQLKQAQAQVEAARGQVAETQVSGQTVTVQTDQQQEAQA QVEQAKAALALARQQLKYTLIKSPIKGTIGQLTAQMGQKVQPEQPLLSVVPLQTERVY VQANFKETQLGKLHIGQPADIEVDAYPQEKFHATIAGISPATGASFALIPPDNATGNF NKVVQWVPVRLVFNRNADPQHKLRPGLNVKVTVNTAKDKR

Cyan7822_2243 MATNQFSSKTKLSKKSKDLVYDAGGYVQGPRKWAIAVTASLGAILEVIDTSIINVALT DIQTTLGATITEIAWVATGYAIANVILIPLSAWLGDFFGKKTYFVFSMVGFTFASVLC

GLSPNLAILVIARILQGLFGGGLLAKAQAILFESFPPAEQGVAQSVFGVGVIAGPAIG PTLGGYLTDNLGWRWIFFINLPIGLVAIAMSMIFLRPDPPKVNKTQQPVDWWGIALLA VSVGSCQAFLEEGERNDWFESSFITTLAITGVVGLVLFIWRELNIKAPAVDLRVLKHK SLAAGSLYSGILGMGLYGALFAVPLFAQGVLGFSATQTGDLLAPGALASAIVMIILGK MSGKVDARILIGLGAVGTSAVMFNLATITPQTGTDDLFWPLVWRGATTVLMFLPLSLA CLGPIPKQDISAASGFYNLSRQLGGSIGIAILTTLLQQREAFHQTILLAKLTPYDLET HERLKVLTSLFMSRGSDAGSAHQQALASLQQLVNIQAAILSFADIFRFVGLIFLCSLP LLLFLGKGGASSKAPAAH

Cyan7822_2189 MVLSWSSELLNNHRQHLTQLKAMLSFPKWMTFSTGILAFVIYGASVNAQMIPTAGPSQ VNLESSELFPVPSKAADLMEPPETTPVPRVSEMLKLAQEPSPSNTQQTSPQQTSPQQT

SPQTPQPSSQSAPGNAYPPPPVPPYLNPGANPLLIPNSPNQVDIGRVQPITLEQAVEL AINNNEEIKQARFALQRAGAQLREAQSLQFPTVNTNFDFERQSDPSIDQSRQRANENN VPLDFNFLAQTTNNGQIELNYNVYSAGERPARILAARKEVNRNQLEVERVAEEVRFNT TEAYYLLQRADAQVAIAQAAVEDASVSLRDARLLEQAGLGTRFSVLQAEVDLARANQD LTRAIADQRTARRRLAQVLNVGQ IELTAADEIREAGTWPISLEQ IVQAYHNRAELE QQILQREIYQQTSYAELASVKPQVDFIARYLYTDTFEDRLSVADGYAFIARLRWLIFD GGRAEARARQNYRNMDIANAEFARLRNEIRFNVEQAYYDLIANQENIQTANTNVI AT ESLRLARLRFQAGVGTQIDVINSQRDLTQARSDYLQAVIDYNRSLNQLQRQVSNWPAN NLFDKF

Npun_F3670 MKTNTFNGRNQNKTIVIEKELIPNSEALKAVTAEAPVVTPEIEKEVPPKRKKPTGLIL

AGIGVGAIVAGTFGYNYWQYASTHQETDNATVAGNIHQVSSRIPGTVSQVLVNDNQLV QPGQLLVKLDPRDYESKVQQAQAALENARGQAQAAQANIALTSQTTTGKTTQAQGDVS GAVAAISTAQAAVQEAQAGIPAAQAEVRLAEAGIPAAQAQVAQANANLENAQADYNRY NELYKNGAIARQQLDAAKAAFNVATAQRNAAVQGVEQAQAKLASARVGVAKAQSQLAQ AQENVTNAQAKLAASKGGLQQATAGGQDTTVKRSQYEAAKAAIAQSEASLKDAQLQLS YANVTAPSAGRVGRKNVEVGNRVAVGTPLMAIVDNEYWVIANFKETQLEKMRPGEMAE IKLDAFPHHTFVGRVESLSPASGAQFALLPPDNATGNFTKVVQRIPVKVVFDQKSIQG YESRITPGMSAEVAVEVK

Npun_F3671 MKKPLLPALRSKNYQLFFAGQGVSLVGTWMTQLATIWLVYNLTNSPLMLGVVGFSSQI

PSFFLAPFGGVFVDRFSRYRTLIGTQVLAMIQSLTLAVLALTGVIQVWHIIALSLCQG FINALDAPARQAFVPELVERREDLANAIAINSTMINGARLIGPAIGGLLIARVGTGYC FLIDGLSYIAVLAALLAMKVKPWKNVVTDGNPLQKVKEGFVYAFSFPPIRSILLLSTL VSFMGLQNTILVPVIAEQVLKGGAESLGFLMAASGVGALTGGIYLATRQTILGIGKLI ALAPAILGVALIAFSLSRYLPLSLFTMLFVGLGTILQIAASNTFLQTIVEDDKRGRLM SLYTMSFLGMIPVGNLLGGVLASRIGAPNTLIIDGIACILGSIIFSRQLPALRQIMRP IYEQKGIVMSKRA

Npun_F3672 MANTCVIHQQAPPERVPLRTWIGVLASMLGAFMAVLDIQITNASLQDIQASLGATLEE

GSWISTAYLVAEIVVIPLTGWLSRVFSLQRYLLVNTALFIFFSICCAWSWNLNSMILF RALQGFSGGVLIPTAMTVVLTTLPQSKQSVGLAAFGFSAVFAPSIGPTLGGWLTENFG WEYNFYINVIPGALMLAGVWYGIKQEKPQINLLKQGDWWGIIAMAIGLGSLQVVLEEG SRKDWFSSALIVRLSAIAVIFLAIFFFIELTRKQPFINLRLVFRRNFGLASIVNVSLG VGLYGSIYILPLYLAQIQKYNALQIGEVLIWAGIPQLFIIPLIPKLMQRIDVRLMVAV GVTLFSISAFMNSGMTNQTGLDQLRWSQFVRAMGQPLIMVPLTSIATAGLSPKEAGSA SGLFNMMRNMGGSIGIASLATLLTNREQFHSNRLGDGISLYNPETQQRIDQMTQYFVS KGADLSTAQNQAIASISNIVRREAYVMAFNDCFYFIGIALLLSGLAVLFFKKVKPSGN AVAH

Npun_R2885 MKGQQLFYSFLPGVTAAVLTTQPAWAGTVKLTGVQLASSPSVLTSTYGQNSVVDMMNT

QLPHGANVSVTTLLPGFGFTKLSMKPLSHNSIPVFTAGNTVVPIKQVLKKDEGRFVSL TPTSNASQQLDGSRSAQNNQKQSNSSISGQKSESIVVPNYTAKPSSVQRKIFPLSSAQ QPVVQRKNAVTELQAFLQTSATGGESAKLLSAPRCLKESGKSKTDSSAALLLASNTCL QQNAIGRIAQNDTSIPANSTPVPTVPGTVTPAPSGPVQPSTVPRTITPAPSGPVQIPG NLIPSSNPLQFPTKPEEVRLQGNQPITLAQALELARRNNRDLQVSLLELERNRAALRE AQAALLPTLGISADITRSQSASSQLSSKLQEQQTGISSPDEAGTSFSGQAQLSYNIYT SGRVQASIRAAEEQVRFNELAVETQSETIRLNVATDYYNLQQADEQVRIAQSAVQNSE ASLRDAEALERAGVGTRFDVLRSQVNLANAQQDLTNARSQQAISRRQLATRISLPQGI NISAADPVQLAGLWNPTLEQSIVLAFQNRPELQQQLAQRNISEQQRRQALAELGPQVS LVGSYNLLDQFDDSVSVTDGYSLGVRATINLYDGGAARARAAQSSVNIAIAETQFAEQ RNQIRFQVEQAYSTQQSSLENVQTSNTALEQAREALRLARLRFQAGVGTQTDVINSEN DLTRAEGNRVTAILDYNRALAQLQRSVTLRALR

Npun_F6385 MNFSLLFVHSTWVGVGFAILFPALASAVTPPKPQNNSSSVQVPDYLNPNPNPLQFPTK

PQEVRIQGTVPITLAQALELARRNNRDLQVAILQLERSRSALRESQAALFPTLGINSN LTNSGNGFTNNSSQSSTSFNGSAQLNYNLYTSGNRQGAIQAAEEQLRVDELNVESQSL TIRLNATTQYYDLQQADEQVRINRAAVENAQASLRDTQAREQAGVGTRFDVLQAQVNL ANAQQQLTNAISQQQIARRQLATTLSLSQSVDISTADPVQLAGLWSQTVEQSIVQAFQ NRPELQQQLAQRNISEQQRRQALSQLGPQISLAGNYNLLDRYDDGVSITDGYSVGLQG NLTLFDGGAARARAAQSRTNIAIAESQFATQRDQIRFDVEQFYSQLQSNLNNVQTSSV ALNQAREALNLARLRFQAGVGTQTEVISAENDLTRAEGNRVTAILDYNRALANLQRSV TSRGSR

Npun_F6382 MDMKIDTQNAVDSSVLVPEVKKKKGKRNWLSWLIAFCLLGGIGYAVYYQVAVVSQQQA

SRRVLTRPVQRQSLTITVSANGTVKPERSINLSPKNSGILKTLLAKEGDLVKQGQIVA YMDDSNLRGQLTSAQGQLAQAEANLQKAIAGNRPQDIAQAQGVLDEAQANLQKVQAGN RSQDIAQAQARLQSAQATLRQAEDDFVRNQQLYNAGGISLQTLNQNRATRDSAQASVN EAQQALGLQKAGSRPEDIEQAQAVVKQRQQALALLKAGTRQEDINAARAQVTSARGSL QNIQAEINDTIIRAPFDGVVTKKFADPGAFVTPTTASSEVASSSSSSILSLASTNEVV ANLAETNISKISLGQKVSITADAYPGKTFEGKVSQIAAQAIVEQNVTSFEVRVSLSDP QRLLRSGMNAEVDFQVGQVENVLVVPTASVVRQQNATGVYVTGADNKPVFTRIETGVT ANNFTEVKSGLTGNERVLLSFPPGSRPQSTPRGGVFPGLGGGGGTGGGGGRSGGGGSR SGGGGGGRSGGSSGGGSP

Npun_F6383 MFKIFKGFYKAKKTRTVPLLEILTMAAETLWSNKLRTGLTMLGVI IGISSVIAI SVG QGVQKGVEQQIQALGTDVIQILAGAARSGNVRQGVGSSSTLTWEDAKAIATQAPSAQM VSAYLQRTAQVVYAGQNTST IYGTDLNYPEVRNTHPQQGRYFTQEELDTVAQVAILG PTVQTTLFGQGVNPIGEKIRIQGEAYQVIGVMEPKGSQGPMDRDDQVFIPLTSMSKRL VGNNALVGVSVNGILVKGANQEQLEAAQFQVTNLLRLRHNIYPPQADDFRLTNQADIV STFTSVVGLFTVMVVAIAGISLVVGGIGIANIMLVSVVERTREIGIRKAVGATNSAIL NQFLAEAIVISIVGGGIGMATGILLAFIASSIFKFPFIISFLSIIAGFVLSLSVGLVA GVIPARNASKLDPINALRSD

Npun_F6384 MPTMIWMESITKTYHLGEVSVPILKGIQLSIEEGEYVSIMGASGSGKSTLMNILGCLD

RPTTGDYIFEGRNLTTFDDDELAYIRNQRIGFVFQQFNLLARATALENVMLPMVYANL PKPKRRERALEALEKVGLQGRIANRPSQLSGGQQQRVAIARALVNRPALVLADEPTGA LDTETSYEVMNLLTELNDQGITIVIVTHEPDIAAQTKRI IRVQDGLIVG

AcrE MTKHARFFLL PSFILISAAL IAGCNDKGEE KAHVGEPQVT VHIVKTAPLE

VKTELPGRTN AYRIAEVRPQ VSGIVLNRNF TEGSDVQAGQ SLYQIDPATY QANYDSAKGE LAKSEAAAAI AHLTVKRYVP LVGTKYISQQ EYDQAIADAR QADAAVIAAK ATVESARINL AYTKV APIS GRIGKSTVTE GALVTNGQTT ELATVQQLDP IYVDVTQSSN DFMRLKQSVE QGNLHKENAT SNVELVMENG QTYPLKGTLQ FSDVTVDEST GSITLRAVFP NPQHTLLPGM FVRARIDEGV QPDAILIPQQ GVSRTPRGDA TVLIVNDKSQ VEARPVVASQ AIGDKWLISE GLKSGDQVIV SGLQKARPGE QVKATTDTPA DTASK

AcrF MANFFIRRPI FAWVLAI ILM MAGALAILQL PVAQYP IAP PAVSVSANYP

GADAQTVQDT VTQVIEQNMN GIDNLMYMSS TSDSAGSVTI TLTFQSGTDP DIAQVQVQNK LQLATPLLPQ EVQQQGISVE KSSSSYLMVA GFVSDNPGTT QDDISDYVAS NVKDTLSRLN GVGDVQLFGA QYAMRIWLDA DLLNKYKLTP VDVINQLKVQ NDQIAAGQLG GTPALPGQQL NASI IAQTRF K PEEFGKVT LRVNSDGSVV RLKDVARVEL GGENYNVIAR INGKPAAGLG IKLA GANAL DTAKAIKAKL AELQPFFPQG MKVLYPYDTT PFVQLSIHEV VKTLFEAIML VFLVMYLFLQ NMRATLIP I AVPVVLLGTF AILAAFGYSI NTLTMFGMVL AIGLLVDDAI VVVENVERVM MEDKLPPKEA TEKSMSQIQG ALVGIAMVLS AVFIPMAFFG GSTGAIYRQF SITIVSAMAL SVLVALILTP ALCATLLKPV SAEHHENKGG FFGWFNTTFD HSVNHYTNSV GKILGSTGRY LLIYALIVAG MVVLFLRLPS SFLPEEDQGV FLTMIQLPAG ATQERTQKVL DQVTDYYLKN EKANVESVFT VNGFSFSGQA QNAGMAFVSL KPWEERNGDE NSAEAVIHRA KMELGKIRDG FVIPFNMPAI VELGTATGFD FELIDQAGLG HDALTQARNQ LLGMAAQHPA SLVSVRPNGL EDTAQFKLEV DQEKAQALGV SLSDINQTIS TALGGTYVND FIDRGRVKKL YVQADAKFRM LPEDVDKLYV RSANGEMVPF SAFTTSHWVY GSPRLERYNG LPSMEIQGEA APGTSSGDAM ALMENLASKL PAGIGYDWTG MSYQERLSGN QAPALVAISF VVVFLCLAAL YESWSIPVSV MLVVPLGIVG VLLAATLFNQ KNDVYFMVGL LTTIGLSAKN AILIVEFAKD LMEKEGKGVV EATLMAVRMR LRPILMTSLA FILGVLPLAI SNGAGSGAQN AVGIGVMGGM VSATLLAIFF VPVFFVVIRR CFKG

EmrK MEQINSNKKH SNRRKYFSLL AVVLFIAFSG AYAYWSMELE DMISTDDAYV

TGNADPISAQ VSGSVTVVNH KDTNYVRQGD ILVSLDKTDA IALNKAKNN LANIVRQTNK LYLQDKQYSA EVASARIQYQ QSLEDYNRRV PLAKQGVISK ETLEHTKDTL ISSKAALNAA IQAYKANKAL VMNTPLNRQP QVVEAADATK EAWLALKRTD IKSPVTGYIA QRSVQVGETV SPGQSLMAVV PARQMWVNAN FKETQLTDVR IGQSVNIISD LYGENVVFHG RVTGINMGTG NAFSLLPAQN ATGNWIKIVQ RVPVEVSLDP KELMEHPLRI GLSMTATIDT KNEDIAEMPE LASTVTSMPA YTSKALVIDT SPIEKEISNI ISHNGQL

EmrY MAI KSTPAP LTGGTLWCVT IALSLATFMQ MLDS ISNVA IP ISGFLGA

STDEGTWVIT SFGVANAIAI PVTGRLAQRI GELRLFLLSV TFFSLSSLMC SLSTNLDVLI FFRVVQGLMA GPLIPLSQSL LLRNYPPEKR TFALALWSMT VIIAPICGPI LGGYICDNFS WGWIFLINVP MGI IVLTLCL TLLKGRETET SPVKMNLPGL TLLVLGVGGL QIMLDKGRDL DWFNSSTIII LTVVSVISLI SLVIWESTSE NPILDLSLFK SRNFTIGIVS ITCAYLFYSG AIVLMPQLLQ ETMGYNAIWA GLAYAPIGIM PLLISPLIGR YGNKIDMRLL VTFSFLMYAV CYYWRSVTFM PTIDFTGIIL PQFFQGFAVA CFFLPLTTIS FSGLPDNKFA NASSMSNFFR TLSGSVGTSL TMTLWGRRES LHHSQLTATI DQFNPVFNSS SQIMDKYYGS LSGVLNEINN EITQQSLSIS ANEIFRMAAI AFILLTVLVW FAKPPFTAKG VG

MacA MKKRKTVKKR YVIALVIVIA GLITLWRILN APVPTYQTLI VRPGDLQQSV

LATGKLDALR KVDVGAQVSG QLKTLSVAIG DKVKKDQLLG VIDPEQAENQ IKEVEATLME LRAQRQQAEA ELKLARVTYS RQQRLAQTKA VSQQDLDTAA TEMAVKQAQI G IDAQIKRN QASLDTAKTN LDYTRIVAPM AGEVTQI TL QGQTVIAAQQ APNILTLADM SAMLVKAQVS EADVIHLKPG QKAWFTVLGD PLTRYEGQIK DVLPTPEKVN DAIFYYARFE VPNPNGLLRL DMTAQVHIQL TDVK VL IP LSALGDPVGD NRYKVKLLRN GETREREVTI GARNDTDVEI VKGLEAGDEV VIGEAKPGAA Q

MacB MTPLLELKDI RRSYPAGDEQ VEVLKGISLD IYAGEMVAIV GASGSGKSTL

MNILGCLDKA TSGTYRVAGQ DVATLDADAL AQLRREHFGF IFQRYHLLSH LTAEQNVEVP AVYAGLERKQ RLLRAQELLQ RLGLEDRTEY YPAQLSGGQQ QRVSIARALM NGGQVILADE PTGALDSHSG EEVMAILHQL RDRGHTVI IV THDPQVAAQA ERVIEIRDGE IVRNPPAIEK VNVTGGTEPV VNTVSGWRQF VSGFNEALTM AWRALAANKM RTLLTMLGI I IGIASVVSIV VVGDAAKQMV LADIRSIGTN TIDVYPGKDF GDDDPQYQQA LKYDDLIAIQ KQPWVASA P AVSQNLRLRY NNVDVAASAN GVSGDYFNVY GMTFSEGNTF NQEQLNGRAQ VVVLDSNTRR QLFPHKADVV GEVILVGNMP ARVIGVAEEK QSMFGSSKVL RVWLPYSTMS GRVMGQSWLN SI VRVKEGF DSAEAEQQLT RLLSLRHGKK DFFTWNMDGV LKTVEKTTRT LQLFLTLVAV ISLVVGGIGV MNIMLVSVTE RTREIGIRMA VGARASDVLQ QFLIEAVLVC LVGGALGI L SLLIAFTLQL FLPGWEIGFS PLALLLAFLC STVTGILFGW LPARNAARLD PVDALARE

MdtA MKGSYKSRWV IVIVVVIAAI AAFWFWQGRN DSRSAAPGAT KQAQQSPAGG

RRGMRSGPLA PVQAATAVEQ AVPRYLTGLG TITAANTVTV RSRVDGQLIA LHFQEGQQVK AGDLLAEIDP SQFKVALAQA QGQLAKDKAT LANARRDLAR YQQLAKTNLV SRQELDAQQA LVSETEGTIK ADEASVASAQ LQLDWSRITA PVDGRVGLKQ VDVGNQISSG DTTGIVVITQ THPIDLVFTL PESDIATVVQ AQKAGKPLVV EAWDRTNSKK LSEGTLLSLD NQIDATTGTI KVKARFNNQD DALFPNQFVN ARMLVDTEQN AVVIPTAALQ MGNEGHFVWV LNSENKVSKH LVTPGIQDSQ KVVIRAGISA GDRVVTDGID RLTEGAKVEV VEAQSATTPE EKATSREYAK KGARS

MdtB MQVLPPSSTG GPSRLFIMRP VATTLLMVAI LLAGI IGYRA LPVSALPEVD

YPTIQVVTLY PGASPDVMTS AVTAPLERQF GQMSGLKQMS SQSSGGASVI TLQFQLTLPL DVAEQEVQAA INAATNLLPS DLPNPPVYSK VNPADPPIMT LAVTSTAMPM TQVEDMVETR VAQKISQISG VGLVTLSGGQ RPAVRVKLNA QAIAALGLTS ETVRTAITGA NVNSAKGSLD GPSRAVTLSA NDQMQSAEEY RQLI IAYQNG APIRLGDVAT VEQGAENSWL GAWANKEQAI VMNVQRQPGA NIISTADSIR QMLPQLTESL PKSVKVTVLS DRTTNIRASV DDTQFELMMA IALVVMIIYL FLRNIPATI I PGVAVPLSLI GTFAVMVFLD FSINNLTLMA LTIATGFVVD DAIVVIENIS RYIEKGEKPL AAALKGAGEI GFTI ISLTFS LIAVLIPLLF MGDIVGRLFR EFAITLAVAI LISAVVSLTL TPMMCARMLS QESLRKQNRF SRASEKMFDR I IAAYGRGLA KVLNHPWLTL SVALSTLLLS VLLWVFIPKG FFPVQDNGI I QGTLQAPQSS SFANMAQRQR QVADVILQDP AVQSLTSFVG VDGTNPSLNS ARLQINLKPL DERDDRVQKV lARLQTAVDK VPGVDLFLQP TQDLTIDTQV SRTQYQFTLQ ATSLDALSTW VPQLMEKLQQ LPQLSDVSSD WQDKGLVAYV NVDRDSASRL GISMADVDNA LYNAFGQRLI STIYTQANQY RVVLEHNTEN TPGLAALDTI RLTSSDGGVV PLSSIAKIEQ RFAPLSINHL DQFPVTTISF NVPDNYSLGD AVQAIMDTEK TLNLPVDITT QFQGSTLAFQ SALGSTVWLI VAAVVAMYIV LGILYESFIH PITILSTLPT AGVGALLALL IAGSELDVIA IIGIILLIGI VKKNAIMMID FALAAEREQG MSPREAIYQA CLLRFRPILM TTLAALLGAL PLMLSTGVGA ELRRPLGIGM VGGLIVSQVL TLFTTPVIYL LFDRLALWTK SRFARHEEEA

MdtC MKFFALFIYR PVATILLSVA ITLCGILGFR MLPVAPLPQV DFPVIIVSAS

LPGASPETMA SSVATPLERS LGRIAGVSEM TSSSSLGSTR I ILQFDFDRD INGAARDVQA AINAAQSLLP SGMPSRPTYR KANPSDAPIM ILTLTSDTYS QGELYDFAST QLAPTISQID GVGDVDVGGS SLPAVRVGLN PQALFNQGVS LDDVRTAVSN ANVRKPQGAL EDGTHRWQIQ TNDELKTAAE YQPLI IHYNN GGAVRLGDVA TVTDSVQDVR NAGMTNAKPA ILLMIRKLPE ANIIQTVDSI RAKLPELQET IPAAIDLQIA QDRSPTIRAS LEEVEQTLI I SVALVILVVF LFLRSGRA I IPAVSVPVSL IGTFAAMYLC GFSLNNLSLM AL IATGFVV DDAIVVLENI ARHLEAGMKP LQAALQGTRE VGFTVLSMSL SLVAVFLPLL LMGGLPGRLL REFAVTLSVA IGISLLVSLT LTPMMCGWML KASKPREQKR LRGFGRMLVA LQQGYGKSLK WVLNHTRLVG VVLLG IALN IWLYISIPKT FFPEQDTGVL MGGIQADQSI SFQAMRGKLQ DFMKI IRDDP AVDNVTGFTG GSRVNSGMMF ITLKPRDERS ETAQQI IDRL RVKLAKEPGA NLFLMAVQDI RVGGRQSNAS YQYTLLSDDL AALREWEPKI RKKLATLPEL ADVNSDQQDN GAEMNLVYDR DTMARLGIDV QAANSLLNNA FGQRQISTIY QPMNQYKVVM EVDPRYTQDI SALEKMFVIN NEGKAIPLSY FAKWQPANAP LSVNHQGLSA ASTISFNLPT GKSLSDASAA IDRAMTQLGV PSTVRGSFAG TAQVFQETMN SQVILIIAAI ATVYIVLGIL YESYVHPLTI LSTLPSAGVG ALLALELFNA PFSLIALIGI MLLIGIVKKN AIMMVDFALE AQRHGNLTPQ EAIFQACLLR FRPIMMTTLA ALFGALPLVL SGGDGSELRQ PLGITIVGGL VMSQLLTLYT TPVVYLFFDR LRLRFSRKPK QTVTE

MdtE MNRRRKLLIP LLFCGAMLTA CDDKSAENAA AMTPEVGVVT LSPGSVNVLS

ELPGRTVPYE VAEIRPQVGG I I IKRNFIEG DKVNQGDSLY QIDPAPLQAE LNSAKGSLAK ALSTASNARI TFNRQASLLK TNYVSRQDYD TARTQLNEAE ANVTVAKAAV EQATINLQYA NVTSPITGVS GKSSVTVGAL VTANQADSLV TVQRLDPIYV DLTQSVQDFL RMKEEVASGQ IKQVQGSTPV QLNLENGKRY SQTGTLKFSD PTVDETTGSV TLRAIFPNPN GDLLPGMYVT ALVDEGSRQN VLLVPQEGVT HNAQGKATAL ILDKDDVVQL REIEASKAIG DQWVVTSGLQ AGDRVIVSGL QRIRPGIKAR AISSSQENAS TESKQ

MdtF MANYFIDRPV FAWVLAI IMM LAGGLAIMNL PVAQYPQIAP PTITVSATYP

GADAQTVEDS VTQVIEQNMN GLDGLMYMSS TSDAAGNASI TLTFETGTSP DIAQVQVQNK LQLAMPSLPE AVQQQGISVD KSSSNILMVA AFISDNGSLN QYDIADYVAS NIKDPLSRTA GVGSVQLFGS EYAMRIWLDP QKLNKYNLVP SDVISQIKVQ NNQISGGQLG GMPQAADQQL NASI IVQTRL QTPEEFGKIL LKVQQDGSQV LLRDVARVEL GAEDYSTVAR YNGKPAAGIA IKLAAGANAL DTSRAVKEEL NRLSAYFPAS LKTVYPYDTT PFIEISIQEV FKTLVEAIIL VFLVMYLFLQ NFRATI IPTI AVPVVILGTF AILSAVGFTI NTLTMFGMVL AIGLLVDDAI VVVENVERVI AEDKLPPKEA THKSMGQIQR ALVGIAVVLS AVFMPMAFMS GATGEIYRQF SITLISSMLL SVFVAMSLTP ALCATILKAA PEGGHKPNAL FARFNTLFEK STQHYTDSTR SLLRCTGRYM VVYLLICAGM AVLFLRTPTS FLPEEDQGVF MTTAQLPSGA TMVNTTKVLQ QVTDYYLTKE KDNVQSVFTV GGFGFSGQGQ NNGLAFISLK PWSERVGEEN SVTAI IQRAM IALSSINKAV VFPFNLPAVA ELGTASGFDM ELLDNGNLGH EKLTQARNEL LSLAAQSPNQ VTGVRPNGLE DTPMFKVNVN AAKAEAMGVA LSDINQTIST AFGSSYVNDF LNQGRVKKVY VQAGTPFRML PDNINQWYVR NASGTMAPLS AYSSTEWTYG SPRLERYNGI PSMEILGEAA AGKSTGDAMK FMADLVAKLP AGVGYSWTGL SYQEALSSNQ APALYAISLV VVFLALAALY ESWSIPFSVM LVVPLGVVGA LLATDLRGLS NDVYFQVGLL TTIGLSAKNA ILIVEFAVEM MQKEGKTPIE AIIEAARMRL RPILMTSLAF ILGVLPLVIS HGAGSGAQNA VGTGVMGGMF AATVLAIYFV PVFFVVVEHL FARFKKA

SdsR MESTPKKAPR SKFPALLVVA LALVALVFVI WRVDSAPSTN DAYASADTID

VVPEVSGRIV ELAVTDNQAV KQGDLLFRID PRPYEANLAK AEASLAALDK QIMLTQRSVD AQQFGADSVN ATVEKARAAA KQATDTLRRT EPLLKEGFVS AEDVDRARTA QRAAEADLNA VLLQAQSAAS AVSGVDALVA QRAAVEADIA LTKLHLEMAT VRAPFDGRVI SLKTSVGQFA SAMRPIFTLI DTRHWYVIAN FRETDLKNIR SGTPATIRLM SDSGKTFEGK VDSIGYGVLP DDGGLVLGGL PKVSRSINWV RVAQRFPVKI MVDKPDPEMF RIGASAVANL EPQ

SdsQ MSALNSLPLP VVRLLAFFHE ELSERRPGRV PQTVQLWVGC LLVILISMTF

EIPFVALSLA VLFYGIQSNA FYTKFVAILF VVATVLEIGS LFLIYKWSYG EPLIRLIIAG PILMGCMFLM RTHRLGLVFF AVAIVAIYGQ TFPAMLDYPE VVVRLTLWCI VVGLYPTLLM TLIGVLWFPS RAISQMHQAL NDRLDDAISH LTDSLAPLPE TRIEREALAL QKLNVFCLAD DANWRTQNAW WQSCVATVTY IYSTLNRYDP TSFADSQAII EFRQKLASEI NKLQHAVAEG QCWQSDWRIS ESEAMAAREC NLENICQTLL QLGQMDPNTP PTPAAKPPSM AADAFTNPDY MRYAVKTLLA CLICYTFYSG VDWEGIHTCM LTCVIVANPN VGSSYQKMVL RFGGAFCGAI LALLFTLLVM PWLDNIVELL FVLAPIFLLG AWIATSSERS SYIGTQMVVT FALATLENVF GPVYDLVEIR DRALGIIIGT VVSAVIYTFV WPESEARTLP QKLAGTLGML SKVMRIPRQQ EVTALRTYLQ IRIGLHAAFN ACEEMCQRVA LERQLDSEER ALLIERSQTV IRQGRDLLHA WDATWNSAQA LDNALQPDRA GQFADALEKY AAGLATALSR SPQITLEETP ASQAILPTLL KQEQHVCQLF ARLPDWTAPA LTPATEQAQG ATQ

SdsP MINRQLSRLL LCSILGSTTL ISGCALVRKD SAPHQQLKPE QIKLADDIHL

ASSGWPQAQW WKQLNDPQLD ALIQRTLSGS HTLAEAKLRE EKAQSQADLL DAGSQLQVAA LGMLNRQRVS ANGFLSPYSM DAPALGMDGP YYTEATVGLF AGLDLDLWGV HRSAVAAAIG AHNAALAE A AVELSLATGV AQLYYSMQAS YQMLDLLEQT HDVIDYAVKA HQSKVAHGLE AQVPFHGARA QILAVDKQIV AVKGQI ETR ESLRALIGAG ASDMPEIRPV ALPQVQTGIP ATLSYELLAR RPDLQAMRWY VQASLDQVDS ARALFYPSFD IKAFFGLDSI HLHTLFKKTS RQFNFIPGLK LPLFDGGRLN ANLEGTRAAS NMMIERYNQS VLNAVRDVAV NGTRLQTLND EREMQAERVE ATRFTQRAAE AAYQRGLTSR LQATEARLPV LAEEMSLLML DSRRVIQSIQ LMKSLGGGYQ AGPVVEKK

A2208 UHR gtgggtgctgcagtagtcgggcctcgcctcggcaaataccgtgatggtcaagtccacg ccattcctggtcacaacatgagtattgcgaccttaggctgtctaattctttggattgg ctggtttggttttaaccccggttctcaattggcagcagatgctgcggtgccttacatc gcaatcactacaaacctttcggctgcagctgggggaatcaccgcaaccgcaacctctt ggatcaaagatgggaagccagacctgtctatgattattaacggtattttggctggtct cgttgggattacagccggttgtgatggcgtcagtttcttttctgctgtgatcatcggg gcgatcgccggtgtactcgtcgtcttctctgtggccttcttcgatgctattaaaatcg atgaccccgttggtgcgacctctgtGcacctcgtctgcggtatctggggaactcttgc cgttggtctgttcaagatggatgggggtttattcactggcggtggcatccaacagctg attgcccaaatcgtcggaatCctttccattggtggctttaccgtcgcctttagcttta ttgtttggtatgccctatcggcagtccttggtggCattcgcgtcgaaaaagacgagga actccggggtctcgacattggtgagcacggcatggaagcttacagcggctttgttaaa gagtccgatgttatcttccgagggactgccactggttccgaaaccgaaggataa

Kan ATGATTGAACAAGATGGCCTGCATGCTGGTTCTCCGGCTGCTTGGGTGGAACGCCTGT

TTGGTTACGACTGGGCTCAGCTGACTATTGGCTGTAGCGATGCAGCGGTTTTCCGTCT GTCTGCACAGGGTCGTCCGGTTCTGTTTGTGAAAACCGACCTGTCCGGCGCACTGAAC GAACTGCAGGACGAAGCGGCCCGTCTGTCCTGGCTCGCGACGACTGGTGTTCCGTGCG CGGCAGTTCTGGACGTAGTTACTGAAGCCGGTCGCGATTGGCTGCTGCTGGGTGAAGT TCCGGGTCAGGATCTGCTGAGCAGCCACCTCGCTCCGGCAGAAAAAGTTTCCATCATG GCGGACGCGATGCGCCGTCTGCACACCCTGGACCCGGCAACTTGCCCGTTTGACCATC AGGCTAAACACCGTATTGAACGTGCACGCACTCGTATGGAAGCGGGTCTGGTTGATCA GGACGACCTGGATGAAGAGCACCAGGGCCTCGCACCGGCGGAACTGTTTGCACGTCTG AAAGCCCGCATGCCGGACGGCGAAGACCTGGTGGTAACGCATGGCGACGCTTGTCTGC CAAACATTATGGTGGAAAACGGCCGCTTCTCTGGTTTTATTGACTGTGGCCGTCTGGG TGTAGCTGATCGCTATCAGGATATCGCCCTCGCTACCCGCGATATTGCAGAAGAACTG GGTGGTGAATGGGCTGACCGTTTCCTGGTGCTGTACGGTATCGCAGCGCCGGATTCTC AGCGCATTGCCTTCTACCGTCTGCTGGATGAGTTCTTCTAA

promoter: P{psaA} gcccctatattatgcatttatacccccacaatcatgtcaagaattcaagcatcttaaa taatgttaattatcggcaaagtctgtgctccccttctataatgctgaattgagcattc gcctcctgaacggtctttattcttccattgtgggtctttagattcacgattcttcaca atcattgatctaaggatctttgtagattctcTGTACA

emrA atgagcgcaaatgcggagactcaaaccccgcagcaaccggtaaagaagagcggcaaac gtaagcgtctgctcctccttctcaccttgctctttataattattgccgtagcgatagg gatttattggtttttggtactgcgtcacttcgaagaaaccgatgacgcatacgtggca gggaatcaaattcaaattatgtctcaggtgtctggcagcgtgacgaaagtctgggccg ataacaccgattttgtaaaagaaggcgacgtgctggtcactctcgacccgacagatgc tcgccaggcgtttgaaaaagccaaaactgcactggcttccagcgttcgccaaacccac cagctgatgattaacagcaagcagttgcaggcgaatattgaggtgcagaaaatcgccc tcgcgaaagcacaaagcgactacaaccgccgtgtgccgctgggcaatgccaacctgat tggtcgcgaagagctgcaacacgcccgcgacgccgtcaccagtgcccaggcgcaactg gacgtcgcgattcaacaatacaatgccaatcaggcgatgattctggggactaaactgg aagatcagccagccgtgcaacaggctgccaccgaagtacgtaacgcctggctggcgct ggagcgtactcgtattatcagtccgatgaccggttatgtctcccgccgcgcggtacag cctggggcgcaaattagcccaacgacgccgctgatggcggtcgttccagccaccaata tgtgggtggatgccaactttaaagagacgcagattgccaatatgcgtatcggtcagcc ggtcactatcaccacggatatttacggcgatgatgtgaaatacaccggtaaagtggtt ggtctggatatgggcacaggtagcgcgttctcactgcttccagcgcaaaatgcgaccg gtaactggatcaaagtcgttcagcgtctgcctgtgcgtatcgaactggaccagaaaca gctggagcaatatccgctgcgtatcggtttgtccacgctggtgagcgtcaataccact aaccgtgacggtcaggtactggcaaataaagtacgttccactccggtagcggtaagca ccgcgcgtgaaatcagcctggcacctgtcaataaactgatcgacgatatcgtaaaagc taacgctggctaa

emrB atgcaacagcaaaaaccgctggaaggcgcgcaactggtcattatgacgattgcgctgt cactggcgacattcatgcaggtgctggactccaccattgctaacgtggcgatccccac tatcgccgggaatctgggctcatcgctcagccagggaacgtgggtaatcacttctttc ggggtggcgaatgccatctcgatcccgcttaccggctggctggcaaagcgcgtcgggg aagtgaaactgttcctttggtccaccatcgcctttgctattgcgtcgtgggcgtgtgg tgtctccagcagcctgaatatgctgatcttcttccgcgtgattcaggggattgtcgcc gggccgttgatcccgctttcgcaaagtctattgctgaataactacccgccagccaaac gctcgatcgcgctggcgttgtggtcgatgacggtgattgtcgcgccaatttgcggccc gatcctcggcggttatatcagcgataattaccactggggctggatattcttcatcaac gtgccgattggcgtggcggtggtgttgatgacactgcaaactctgcgcggacgtgaaa cccgcaccgaacggcggcggattgatgccgtggggctggcactgctggttattggtat cggcagcctgcagattatgctcgaccgcggtaaagagctggactggttttcatcacag gaaattatcatccttaccgtggtggcggtggtggctatctgcttcctgattgtctggg agctgaccgacgataacccgatagtcgatctgtcgttgtttaagtcgcgcaacttcac catcggctgcttgtgtatcagcctcgcgtatatgctctacttcggcgctattgttctg ctgccgcagttgttgcaggaggtctacggttacacggcgacctgggcaggtttggcct ctgcgccggtagggattattccggtgatcctgtcgccgattatcggccgcttcgcgca taaactggatatgcggcggctggtaaccttcagctttattatgtatgccgtctgcttc tactggcgtgcctatacctttgaaccaggtatggattttggcgcgtcggcctggccgc agtttatccaggggtttgcggtggcctgcttctttatgccgctgaccaccattacgct gtctggtttgccaccggaacgactggcggcggcatcgagcctctctaactttacgcga acgctggcggggtctatcggcacgtcgataaccacgaccatgtggaccaaccgcgagt cgatgcaccatgcgcagttgactgagtcggtaaacccgttcaacccgaatgcccaggc gatgtacagtcaactggaagggcttgggatgacgcaacagcaggcatcaggctggatt gcccagcagatcaccaatcaggggctgattatttccgccaatgagatcttctggatgt cagccgggatattcctcgtcctgctggggctggtgtggtttgctaaaccgccatttgg cgcaggtggcggcggaggcggtgcgcactaa

A2208 DHR atcctcccaggaaatccttaaaacaatctaaagaaatttttcctaaccttccttaccc aagggaggttttttatgtgagttcacattttgttacgttacccaatcaatacttgagc cgctcaaaaagtctgacctagagcagaaagtccctgagtatatcgactcattaatccg gtctttccgcttggtttcttgagttgattttctgcgaaattttggaaattcagagatG taaccttagggggagtccacttaaaaacggctctgctcaaccttgcaaatgccctact cttcttctgtctagcccaagcactccctgagaaaattagcggcgatcgcctataaaca tgaagttttatgacagatcAttttacaagatgtaatgtttaaatg

pJB2302 gtgggtgctgcagtagtcgggcctcgcctcggcaaataccgtgatggtcaagtccacg ccattcctggtcacaacatgagtattgcgaccttaggctgtctaattctttggattgg ctggtttggttttaaccccggttctcaattggcagcagatgctgcggtgccttacatc gcaatcactacaaacctttcggctgcagctgggggaatcaccgcaaccgcaacctctt ggatcaaagatgggaagccagacctgtctatgattattaacggtattttggctggtct cgttgggattacagccggttgtgatggcgtcagtttcttttctgctgtgatcatcggg gcgatcgccggtgtactcgtcgtcttctctgtggccttcttcgatgctattaaaatcg atgaccccgttggtgcgacctctgtGcacctcgtctgcggtatctggggaactcttgc cgttggtctgttcaagatggatgggggtttattcactggcggtggcatccaacagctg attgcccaaatcgtcggaatCctttccattggtggctttaccgtcgcctttagcttta ttgtttggtatgccctatcggcagtccttggtggCattcgcgtcgaaaaagacgagga actccggggtctcgacattggtgagcacggcatggaagcttacagcggctttgttaaa gagtccgatgttatcttccgagggactgccactggttccgaaaccgaaggataaTTAA TTAAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGA CAA AACCCTGA AAATGCTTCAA AA ATTGAAAAAGGAAGAG ATGATTGAACAAG ATGGCCTGCATGCTGGTTCTCCGGCTGCTTGGGTGGAACGCCTGTTTGGTTACGACTG GGCTCAGCTGACTATTGGCTGTAGCGATGCAGCGGTTTTCCGTCTGTCTGCACAGGGT CGTCCGGTTCTGTTTGTGAAAACCGACCTGTCCGGCGCACTGAACGAACTGCAGGACG AAGCGGCCCGTCTGTCCTGGCTCGCGACGACTGGTGTTCCGTGCGCGGCAGTTCTGGA CGTAGTTACTGAAGCCGGTCGCGATTGGCTGCTGCTGGGTGAAGTTCCGGGTCAGGAT CTGCTGAGCAGCCACCTCGCTCCGGCAGAAAAAGTTTCCATCATGGCGGACGCGATGC GCCGTCTGCACACCCTGGACCCGGCAACTTGCCCGTTTGACCATCAGGCTAAACACCG TATTGAACGTGCACGCACTCGTATGGAAGCGGGTCTGGTTGATCAGGACGACCTGGAT GAAGAGCACCAGGGCCTCGCACCGGCGGAACTGTTTGCACGTCTGAAAGCCCGCATGC CGGACGGCGAAGACCTGGTGGTAACGCATGGCGACGCTTGTCTGCCAAACATTATGGT GGAAAACGGCCGCTTCTCTGGTTTTATTGACTGTGGCCGTCTGGGTGTAGCTGATCGC
TATCAGGATATCGCCCTCGCTACCCGCGATATTGCAGAAGAACTGGGTGGTGAATGGG CTGACCGTTTCCTGGTGCTGTACGGTATCGCAGCGCCGGATTCTCAGCGCATTGCCTT CTACCGTCTGCTGGATGAGTTCTTCTAAGGCGCGCCgagcatctcttcgaagtattcc aggcatcaaataaaacgaaaggctcagtcgaaagactgggcctttcgttttatctgtt gtttgtcggtgaacgctctctactagagtcacactggctcaccttcgggtgggccttt ctgcgtttataAAGCTTgcccctatattatgcatttatacccccacaatcatgtcaag aattcaagcatcttaaataatgttaattatcggcaaagtctgtgctccccttctataa tgctgaattgagcattcgcctcctgaacggtctttattcttccattgtgggtctttag attcacgattcttcacaatcattgatctaaggatctttgtagattctcTGTACATatg agcgcaaatgcggagactcaaaccccgcagcaaccggtaaagaagagcggcaaacgta agcgtctgctcctccttctcaccttgctctttataattattgccgtagcgatagggat ttattggtttttggtactgcgtcacttcgaagaaaccgatgacgcatacgtggcaggg aatcaaattcaaattatgtctcaggtgtctggcagcgtgacgaaagtctgggccgata acaccgattttgtaaaagaaggcgacgtgctggtcactctcgacccgacagatgctcg ccaggcgtttgaaaaagccaaaactgcactggcttccagcgttcgccaaacccaccag ctgatgattaacagcaagcagttgcaggcgaatattgaggtgcagaaaatcgccctcg cgaaagcacaaagcgactacaaccgccgtgtgccgctgggcaatgccaacctgattgg tcgcgaagagctgcaacacgcccgcgacgccgtcaccagtgcccaggcgcaactggac gtcgcgattcaacaatacaatgccaatcaggcgatgattctggggactaaactggaag atcagccagccgtgcaacaggctgccaccgaagtacgtaacgcctggctggcgctgga gcgtactcgtattatcagtccgatgaccggttatgtctcccgccgcgcggtacagcct ggggcgcaaattagcccaacgacgccgctgatggcggtcgttccagccaccaatatgt gggtggatgccaactttaaagagacgcagattgccaatatgcgtatcggtcagccggt cactatcaccacggatatttacggcgatgatgtgaaatacaccggtaaagtggttggt ctggatatgggcacaggtagcgcgttctcactgcttccagcgcaaaatgcgaccggta actggatcaaagtcgttcagcgtctgcctgtgcgtatcgaactggaccagaaacagct ggagcaatatccgctgcgtatcggtttgtccacgctggtgagcgtcaataccactaac cgtgacggtcaggtactggcaaataaagtacgttccactccggtagcggtaagcaccg cgcgtgaaatcagcctggcacctgtcaataaactgatcgacgatatcgtaaaagctaa cgctggctaatccagaggtgcgtgtgatgcaacagcaaaaaccgctggaaggcgcgca actggtcattatgacgattgcgctgtcactggcgacattcatgcaggtgctggactcc accattgctaacgtggcgatccccactatcgccgggaatctgggctcatcgctcagcc agggaacgtgggtaatcacttctttcggggtggcgaatgccatctcgatcccgcttac 
cggctggctggcaaagcgcgtcggggaagtgaaactgttcctttggtccaccatcgcc tttgctattgcgtcgtgggcgtgtggtgtctccagcagcctgaatatgctgatcttct tccgcgtgattcaggggattgtcgccgggccgttgatcccgctttcgcaaagtctatt gctgaataactacccgccagccaaacgctcgatcgcgctggcgttgtggtcgatgacg gtgattgtcgcgccaatttgcggcccgatcctcggcggttatatcagcgataattacc actggggctggatattcttcatcaacgtgccgattggcgtggcggtggtgttgatgac actgcaaactctgcgcggacgtgaaacccgcaccgaacggcggcggattgatgccgtg gggctggcactgctggttattggtatcggcagcctgcagattatgctcgaccgcggta aagagctggactggttttcatcacaggaaattatcatccttaccgtggtggcggtggt ggctatctgcttcctgattgtctgggagctgaccgacgataacccgatagtcgatctg tcgttgtttaagtcgcgcaacttcaccatcggctgcttgtgtatcagcctcgcgtata tgctctacttcggcgctattgttctgctgccgcagttgttgcaggaggtctacggtta cacggcgacctgggcaggtttggcctctgcgccggtagggattattccggtgatcctg tcgccgattatcggccgcttcgcgcataaactggatatgcggcggctggtaaccttca gctttattatgtatgccgtctgcttctactggcgtgcctatacctttgaaccaggtat ggattttggcgcgtcggcctggccgcagtttatccaggggtttgcggtggcctgcttc tttatgccgctgaccaccattacgctgtctggtttgccaccggaacgactggcggcgg catcgagcctctctaactttacgcgaacgctggcggggtctatcggcacgtcgataac cacgaccatgtggaccaaccgcgagtcgatgcaccatgcgcagttgactgagtcggta aacccgttcaacccgaatgcccaggcgatgtacagtcaactggaagggcttgggatga cgcaacagcaggcatcaggctggattgcccagcagatcaccaatcaggggctgattat ttccgccaatgagatcttctggatgtcagccgggatattcctcgtcctgctggggctg gtgtggtttgctaaaccgccatttggcgcaggtggcggcggaggcggtgcgcactaaG CCGGCtcaggtatccggtacgccgcCGCAAAAAACCCCGCTTCGGCGGGGTTTTTTCG Cggcgcgccatcctcccaggaaatccttaaaacaatctaaagaaatttttcctaacct tccttacccaagggaggttttttatgtgagttcacattttgttacgttacccaatcaa tacttgagccgctcaaaaagtctgacctagagcagaaagtccctgagtatatcgactc attaatccggtctttccgcttggtttcttgagttgattttctgcgaaattttggaaat tcagagatGtaaccttagggggagtccacttaaaaacggctctgctcaaccttgcaaa tgccctactcttcttctgtctagcccaagcactccctgagaaaattagcggcgatcgc ctataaacatgaagttttatgacagatcAttttacaagatgtaatgtttaaatg

P{tsr2142} ccaaggtggctacttcaacgatagcttaaacttcgctgctccagcgaggggatttcac tggtttgaatgcttcaatgcttgccaaaagagtgctactggaacttacaagagtgacc ctgcgtcaggggagctagcactcaaaaaagactcctcc TGTACA

P{aphII} gggggggggggggaaagccacgttgtgtctcaaaatctctgatgttacattgcacaag ataaaaatatatcatcatgaacaataaaactgtctgcttacataaacagtaatacaag TGTACA

P{ompR} tagtacaaaaagacgattaaccccatgggtaaaagcaggggagccactaaagttcaca ggtttacaccgaattttccatttgaaaagtagtaaatcatacagaaaacaatcatgta aaaattgaatactctaatggtttgatgtccgaaaaagtctagtttcttctattcttcg accaaatctatggcagggcactatcacagagctggcttaataatttgggagaaatggg tgggggcggactttcgtagaacaatgtagattaaagtacTGTACA

P{nir 07 PnirA PCC7942 v2} gcTTGTAgCAAttgcTACtAAAaactgcgatcgctgctgaaatgagctggaattTtgt ccctctcagctcaAAaAGTAtCAAtgAtTACttAAtGTTTGttctgcgCAAACttctT GCAgaacaTGCAtgatttacaaaaAgTTGTAgtttctGtTACcaATTgcgaatcgaga actgccTAatcTgccgagtaTgcgatcctttAgcAGGAGGaTGTACA

Various aspects are described in the claims.