

Title:
LIPOXYGENASE-CATALYZED PRODUCTION OF UNSATURATED C10-ALDEHYDES FROM POLYUNSATURATED FATTY ACIDS (PUFA)
Document Type and Number:
WIPO Patent Application WO/2020/079223
Kind Code:
A1
Abstract:
The present invention provides novel methods for the lipoxygenase (LOX)-catalyzed production of aliphatic unsaturated C10-aldehyde compounds from polyunsaturated fatty acid (PUFA) sources. The present invention also relates to the isolation and characterization of novel, preferably bifunctional LOXs from different algae sources and the identification of structurally and/or functionally related LOXs from different bacterial sources. The present invention also relates to the provision of enzyme mutants derived from said newly identified enzymes. A further aspect of the present invention relates to corresponding coding sequences of said enzymes, recombinant vectors, and recombinant host cells suitable for the production of such LOXs and for performing the novel production methods of aliphatic unsaturated C10-aldehyde compounds. Another aspect of the invention relates to the use of particular aldehydes or aldehyde mixtures, as obtained according to the present invention, as flavor ingredient or ingredient for food or feed compositions.

Inventors:
HAN LEI (CN)
WANG QI (CN)
HAEFLIGER OLIVIER (CH)
BELORGEY DIDIER (CN)
CERNY CHRISTOPH (CN)
Application Number:
PCT/EP2019/078370
Publication Date:
April 23, 2020
Filing Date:
October 18, 2019
Assignee:
FIRMENICH & CIE (CH)
International Classes:
C12P7/24
Domestic Patent References:
WO2001039614A12001-06-07
WO2008056291A22008-05-15
Foreign References:
CN104293805A2015-01-21
CN104293837A2015-01-21
EP1921134A12008-05-14
EP1149849A12001-10-31
EP1069183A22001-01-17
DE10019373A12001-10-31
Other References:
DATABASE UniProt [online] 2013, N.N.: "Lipoxygenase family protein", XP002797244, Database accession no. UniProtKB - L7E8T2
ÖZKAYA ET AL: "Characterization of the free and glycosidically bound aroma potential of two important tomato cultivars grown in Turkey", JOURNAL OF FOOD SCIENCE AND TECHNOLOGY, vol. 55, 28 August 2018 (2018-08-28), pages 4440 - 4449, XP036607594
ALSUFYANI ET AL: "Prevalence and mechanism of polyunsaturated aldehydes production in the green tide forming macroalgal genus Ulva (Ulvales, Chlorophyta)", CHEMISTRY AND PHYSICS OF LIPIDS, vol. 183, 2014, pages 100 - 109, XP002797243
MANDAL ET AL: "In vitro kinetics of soybean lipoxygenase with combinatorial fatty substrates and its functional significance in off flavour development", FOOD CHEMISTRY, vol. 146, 2014, pages 394 - 403, XP028758524
ADOLPH ET AL: "Synthesis and biological activity of alpha, beta, gamma, delta-unsaturated aldehydes from diatoms", TETRAHEDRON, vol. 59, 2003, pages 3003 - 3008, XP004420654
ANDRIANARISON ET AL: "Oxodiene formation during the Vicia sativa lipoxygenase-catalyzed reaction: occurrence of dioxygenase and fatty acid lyase activities associated in a single protein", BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, vol. 180, 1991, pages 1002 - 1009, XP029118879
ST. ANGELO ET AL: "Identification of lipoxygenase-linoleate decomposition products by direct gas chromatography-mass spectrometry", LIPIDS, vol. 15, 1980, pages 45 - 49, XP035177584
BOONPRAB ET AL: "11-hydroperoxide eicosanoid-mediated 2(E),4(E)-decadienal production from archidonic acid in the brown algae, Saccharina angustata", JOURNAL OF APPLIED PHYCOLOGY, vol. 31, 9 March 2019 (2019-03-09), pages 2719 - 2727, XP036853921
ALSUFYANI, T. ET AL., CHEMISTRY AND PHYSICS OF LIPIDS, vol. 183, 2014, pages 100 - 109
LEE, J. ET AL., ENVIRONMENTAL POLLUTION, vol. 227, 2017, pages 252 - 262
ZHU, Z-J. ET AL., JOURNAL OF AGRICULTURE AND FOOD CHEMISTRY, vol. 66, no. 5, 2018, pages 1233 - 1241
CHEN, HAI-MIN ET AL., ALGAL RESEARCH, vol. 12, 2015, pages 316 - 327
ZHU ET AL., PLOS ONE, vol. 10, no. 2, 2015, pages e0117351
PEARSON; LIPMAN, PROC. NATL. ACAD. SCI. USA, vol. 85, no. 8, 1988, pages 2444 - 2448
SAMBROOK, J.; FRITSCH, E.F.; MANIATIS, T.: "Molecular cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY PRESS, pages: 6.3.1 - 6.3.6
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR PRESS
J. LALONDE; A. MARGOLIN; K. DRAUZ; H. WALDMANN: "Enzyme Catalysis in Organic Synthesis", vol. III, 2002, COLD SPRING HARBOR LAB PRESS, article "Immobilization of Enzymes", pages: 991 - 1032
TATIANA ET AL., FEMS MICROBIOL LETT., vol. 174, 1999, pages 247 - 250
SAMBROOK; RUSSELL: "Molecular Cloning", 2001, COLD SPRING HARBOR LABORATORY PRESS, pages: 896 - 897
SRIRAM KOSURI; GEORGE M CHURCH, NATURE METHODS, vol. 11, 2014, pages 499 - 507
GREENER A; CALLAHAN M; JERPSETH B: "In vitro mutagenesis protocols", 1996, HUMANA PRESS, article "An efficient random mutagenesis technique using an E.coli mutator strain"
BARETTINO D; FEIGENBUTZ M; VALCAREL R; STUNNENBERG HG, NUCLEIC ACIDS RES, vol. 22, 1994, pages 1593
BARIK S, MOL BIOTECHNOL, vol. 3, 1995, pages 1
ECKERT KA; KUNKEL TA, NUCLEIC ACIDS RES, vol. 18, 1990, pages 3739
SCHENK ET AL., BIOSPEKTRUM, vol. 3, 2006, pages 277 - 279
STEMMER WPC, NATURE, vol. 370, 1994, pages 389
STEMMER WPC, PROC NATL ACAD SCI USA, vol. 91, 1994, pages 10747
ZHAO H; MOORE JC; VOLKOV AA; ARNOLD FH: "Manual of industrial microbiology and biotechnology", vol. 200, 1999, AMERICAN SOCIETY FOR MICROBIOLOGY, article "Methods for optimizing industrial polypeptides by directed evolution", pages: 31
GOEDDEL: "Gene Expression Technology: Methods in Enzymology", vol. 185, 1990, ACADEMIC PRESS
"cloning vectors", 1985, ELSEVIER
T.J. SILHAVY; M.L. BERMAN; L.W. ENQUIST: "Biotechnologie - Lehrbuch der angewandten Mikrobiologie", 1984, COLD SPRING HARBOR LABORATORY
AUSUBEL, F.M. ET AL.: "Applied Microbiol. Physiology, A Practical Approach", 1997, GREENE PUBLISHING ASSOC. AND WILEY INTERSCIENCE, pages: 53 - 73
"Bioprozesstechnik 1. Einfuhrung in die Bioverfahrenstechnik [Bioprocess technology 1. Introduction to bioprocess technology]", 1991, GUSTAV FISCHER VERLAG
PATEK ET AL., APPL. ENVIRON. MICROBIOL., vol. 60, 1994, pages 133 - 140
"Manual of Methods for General Bacteriology", 1981, AMERICAN SOCIETY FOR BACTERIOLOGY
HARLOW, E.; LANE, D.: "Biochemische Arbeitsmethoden [Biochemical processes]", 1988, VERLAG WALTER DE GRUYTER
MALAKHOVA ET AL., BIOTEKHNOLOGIYA, vol. 11, 1996, pages 27 - 32
SCHMIDT ET AL., BIOPROCESS ENGINEER, vol. 19, 1998, pages 67 - 70
"Ullmann's Encyclopedia of Industrial Chemistry", vol. A27, 1996, VCH
MICHAL, G: "Biochemical Pathways: An Atlas of Biochemistry and Molecular Biology", 1999, JOHN WILEY AND SONS
FALLON, A. ET AL., APPLICATIONS OF HPLC IN BIOCHEMISTRY IN: LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, vol. 17, 1987
TORALF SENGER ET AL., J. BIOL. CHEM., vol. 280, 2005, pages 7588 - 7596
ALEXANDRA ANDREOU ET AL., J. BIOL. CHEM., 2010
Attorney, Agent or Firm:
BAUMGARTNER HARRIS, Pauline (CH)
Claims:
CLAIMS

1. A method for preparing at least one mono- or polyunsaturated aliphatic aldehyde, which method comprises

(1) contacting at least one polyunsaturated fatty acid (PUFA) substrate with a polypeptide

which comprises the enzymatic activity of a lipoxygenase comprising an amino acid sequence that comprises a consensus sequence pattern selected from SEQ ID NO:54; or comprises at least one partial consensus sequence pattern of SEQ ID NO:54 selected from

a) AKxxxxxADxxxxxxxxHxxxxHxxxxPxA (SEQ ID NO:240),

b) VxGxxxxxxxxxxLxxxxxxxxxxxxxxHxxxNxxQxxYxxxxxN (SEQ ID NO:241), and

c) LxxxxxxIxxxNxxxxxxYxxxxPxxxxxSI (SEQ ID NO:242);

d) or any combination from a), b) and c)

wherein each amino acid residue x independently of each other may be selected from any natural amino acid residue

thereby converting said at least one PUFA compound to a reaction product comprising at least one mono- or polyunsaturated aliphatic aldehyde; and

(2) optionally isolating at least one mono- or polyunsaturated aliphatic aldehyde as obtained in step a).
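By way of illustration only, the partial consensus patterns of claim 1 can be read as fixed-length motifs in which each "x" is a wildcard position. The following minimal sketch translates two of the patterns into regular expressions and scans a sequence for them. The helper names `to_regex` and `matching_patterns`, the restriction of "any natural amino acid residue" to the 20 standard one-letter codes, and the toy sequence are assumptions made for this example and are not part of the application.

```python
import re

# Partial consensus patterns copied from claim 1; lowercase "x" is a wildcard.
PATTERNS = {
    "SEQ ID NO:240": "AKxxxxxADxxxxxxxxHxxxxHxxxxPxA",
    "SEQ ID NO:242": "LxxxxxxIxxxNxxxxxxYxxxxPxxxxxSI",
}

# Assumption: "natural amino acid residue" is taken as the 20 standard residues.
NATURAL = "ACDEFGHIKLMNPQRSTVWY"


def to_regex(pattern: str) -> re.Pattern:
    """Translate a consensus pattern into a compiled regular expression."""
    parts = [f"[{NATURAL}]" if ch == "x" else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))


def matching_patterns(sequence: str) -> list[str]:
    """Return the names of all patterns found anywhere in the sequence."""
    return [name for name, pat in PATTERNS.items() if to_regex(pat).search(sequence)]


# Hypothetical toy sequence: the first pattern with every wildcard set to glycine.
toy = "MG" + PATTERNS["SEQ ID NO:240"].replace("x", "G") + "TT"
print(matching_patterns(toy))
```

A match of this kind only shows that a motif is present; it says nothing about enzymatic activity, which the claim requires separately.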

2. The method of claim 1, wherein the polypeptide comprises the enzymatic activity of a lipoxygenase comprising an amino acid sequence that comprises a consensus sequence pattern selected from SEQ ID NO:53; or comprises at least one partial consensus sequence pattern of SEQ ID NO:53 selected from

a) LxxxxxYxxxxx X1 xxxxxx X2 GxxxxxxxKxLPxPxxxFxWxxx X3 xxxPxxI (SEQ ID NO:243),

b) WxxAKxCxQxADxxHxExxxHxxxxHxxMxPxA (SEQ ID NO:244),

c) GxVxGxxxxxxxxxxLxxxxxxxxxxCxPxHxxxNxxQxxYxxxxxNMPxAxY (SEQ ID NO:245),

d) QxxxxxxLxxxxxDxxGxYxxx X4 F (SEQ ID NO:246),

e) QxxLxxxxxxIxxxNxxRxxxYxxxxxxxxxNSI (SEQ ID NO:247),

f) or any combination from a) to e)

wherein

each amino acid residue x independently of each other may be selected from any natural amino acid residue,

X1 represents 0 to 7 identical or different natural amino acid residues,

X2 represents 0 or 1 natural amino acid residue,

X3 represents 0 to 7 identical or different natural amino acid residues, and

X4 represents 0 to 8 identical or different natural amino acid residues.

3. The method of claim 1, wherein the polypeptide comprises the enzymatic activity of a lipoxygenase comprising an amino acid sequence that comprises a consensus sequence pattern selected from SEQ ID NO:52; or comprises at least one partial consensus sequence pattern of SEQ ID NO:52 selected from

a) LxxxxxYxxxxx X1 xxxxxx X2 GGxxxxxxKxLPxPxAxFxWxxx X3 xxxPxxI (SEQ ID NO:248),

b) WxxAKxCxQxADxNHxExxxHxxxTHxVMxPxAxxT (SEQ ID NO:249),

c) GxVxGxxxxxxxxxxLxxxxxxxxxxCxPxHxxxNxxQxxYxxxxxNMPxAxY (SEQ ID NO:250),

d) QxxxxxxLxxxxYDxLGxYxxx X4 F (SEQ ID NO:251),

e) FQxxLxxxxxxIxxxNxxRxxxYxxxxPxxxxNSI (SEQ ID NO:252),

f) or any combination from a) to e)

wherein

each amino acid residue x independently of each other may be selected from any natural amino acid residue,

X1 represents 0 to 7 identical or different natural amino acid residues,

X2 represents 0 or 1 natural amino acid residue,

X3 represents 0 to 6 identical or different natural amino acid residues, and

X4 represents 0 to 8 identical or different natural amino acid residues.
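For illustration, the variable-length placeholders X1 to X4 can be handled as bounded gaps when a pattern is turned into a regular expression. The sketch below does this for pattern d) of claim 3, using the gap lengths recited in that claim. The function name `to_regex`, the tokenisation, and the use of the 20 standard one-letter residue codes are assumptions of this example only.

```python
import re

# Assumption: "natural amino acid residue" is taken as the 20 standard residues.
ANY = "[ACDEFGHIKLMNPQRSTVWY]"

# Gap lengths (minimum, maximum) as recited in claim 3.
GAPS = {"X1": (0, 7), "X2": (0, 1), "X3": (0, 6), "X4": (0, 8)}


def to_regex(pattern: str) -> re.Pattern:
    """Translate a pattern with 'x' wildcards and X1..X4 gaps into a regex."""
    out = []
    for token in re.findall(r"X[1-4]|[A-Zx]", pattern.replace(" ", "")):
        if token in GAPS:
            lo, hi = GAPS[token]
            out.append(f"{ANY}{{{lo},{hi}}}")  # bounded run of arbitrary residues
        elif token == "x":
            out.append(ANY)                    # exactly one arbitrary residue
        else:
            out.append(token)                  # conserved residue
    return re.compile("".join(out))


pattern_d = to_regex("QxxxxxxLxxxxYDxLGxYxxx X4 F")  # SEQ ID NO:251


def toy(gap: int) -> str:
    """Hypothetical sequence: wildcards set to alanine, X4 of the given length."""
    return "Q" + "A" * 6 + "L" + "A" * 4 + "YD" + "A" + "LG" + "A" + "Y" + "A" * 3 + "A" * gap + "F"


print(bool(pattern_d.fullmatch(toy(0))), bool(pattern_d.fullmatch(toy(9))))
```

Keeping the gap bounds in one table makes it easy to switch to the slightly different X3 range recited in claim 2.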

4. The method of any one of the preceding claims, wherein the polypeptide comprises an amino acid sequence selected from

a) SEQ ID NO: 3, 6, 9, 12 or 15;

b) SEQ ID NO: 18;

c) SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, or 50;

d) amino acid sequences having at least 40% sequence identity to at least one of the sequences of a), b) or c) and retaining said enzymatic activity of a lipoxygenase; and

e) single and multiple mutants of any one of the polypeptides c) retaining said enzymatic activity of a lipoxygenase, in particular selected from mutants comprising an amino acid sequence selected from SEQ ID NO: 254, 256, 258, 260, 262, 264, 266, and 284.
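As a rough illustration of the "at least 40% sequence identity" criterion in item d), the sketch below computes percent identity between two sequences that have already been aligned. The way identity is to be determined for the purposes of the claims is not set out in this section, so the formula used here (identical residue pairs divided by the number of alignment columns that are not gaps in both sequences) is an assumption, as are the function names and the toy alignments.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 40.0) -> bool:
    """True if the two aligned sequences reach the identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy alignment: 3 identical positions out of 5 columns.
print(percent_identity("ACDE-", "ACDFG"))
```

Other conventions, such as dividing by the length of the shorter sequence, give different percentages for the same alignment, so the convention should be fixed before a threshold is applied.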

5. The method of any one of claims 1 to 4, wherein the polypeptide comprises the enzymatic activity of a bifunctional lipoxygenase and in particular of a combination of lipoxygenase and hydroperoxide lyase activity.

6. The method of any one of claims 1 to 5, wherein the polypeptide comprises the ability of converting at least one PUFA, in particular selected from omega-3 and omega-6 PUFA, to at least one mono- or polyunsaturated aliphatic aldehyde, in particular to at least one polyunsaturated aliphatic C10-aldehyde, more particularly selected from decadienals and decatrienals.

7. The method of claim 6, wherein said decadienal is selected from 2E,4E-decadienal and 2E,4Z-decadienal and mixtures thereof; and wherein said decatrienal is selected from 2E,4E,7Z-decatrienal and 2E,4Z,7Z-decatrienal and mixtures thereof.

8. The method of any one of claims 1 to 7, wherein said PUFA is selected from C16-C22-, in particular from C16-C22-PUFAs, more particularly selected from omega-3 C16-C22-PUFAs and omega-6 C16-C22-PUFAs, like for example selected from omega-3 C18-C20-PUFAs and omega-6 C18-C20-PUFAs, and, for example, PUFAs like (4Z,7Z,10Z,13Z)-hexadeca-4,7,10,13-tetraenoic acid, SDA, EPA, DHA, GLA and ARA.

9. The method of any one of the preceding claims, wherein step a) is performed in vivo in cell culture in the presence of oxygen, or in vitro in a liquid reaction medium in the presence of oxygen.

10. The method of any one of the preceding claims wherein step a) is carried out by cultivating a non-human host organism or cell expressing at least one of said polypeptides having the enzymatic activity of a lipoxygenase in the presence of a PUFA substrate under conditions conducive to the peroxidation and subsequent cleavage of at least one PUFA.

11. The method of any one of the preceding claims, wherein said PUFA substrate is an isolated PUFA compound or a natural or synthetic composition comprising at least one PUFA convertible by said lipoxygenase; and wherein in particular said natural PUFA composition is selected from

a) borage oil,

b) arachidonic oil,

c) fish oil corresponding oil hydrolysates,

d) mixtures of LA and ALA; and

e) mixtures containing at least two of a) to e).

12. The method of any one of the preceding claims, which further comprises a chemical or enzymatic isomerization of an obtained mono- or polyunsaturated aliphatic aldehyde; or a chemical or enzymatic conversion of an obtained mono- or polyunsaturated aliphatic aldehyde to the corresponding alcohol or hydrocarbyl ester.

13. A polypeptide which comprises the enzymatic activity of a lipoxygenase, wherein said polypeptide comprises an amino acid sequence selected from

a) SEQ ID NO: 3, 6, 9, 12 or 15;

b) SEQ ID NO: 18;

c) SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, or 50;

d) amino acid sequences having at least 40% sequence identity to at least one of the sequences of a), b) or c) and retaining said enzymatic activity of a lipoxygenase; and

e) single and multiple mutants of any one of the polypeptides c) retaining said enzymatic activity of a lipoxygenase, in particular selected from mutants comprising an amino acid sequence selected from SEQ ID NO: 254, 256, 258, 260, 262, 264, 266, and 284.

14. A nucleic acid encoding the polypeptide of claim 13 or the complement thereof.

15. The nucleic acid of claim 14, comprising a coding nucleotide selected from

a) SEQ ID NO: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14;

b) SEQ ID NO: 16 and 17;

c) SEQ ID NO: 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73 and 74;

d) a nucleotide sequence having at least 40% sequence identity to at least one of the sequences of a), b), or c) and encoding a polypeptide having the enzymatic activity of a lipoxygenase;

e) nucleotide sequences encoding single and multiple mutants of any one of the sequences c) encoding a polypeptide retaining said enzymatic activity of a lipoxygenase, in particular selected from SEQ ID NO: 253, 255, 257, 259, 261, 263, 265, and 283; and

f) the complement of any one of the sequences of a), b), c), d) or e).

16. An expression vector comprising the coding nucleic acid of any one of claims 14 and 15.

17. A recombinant non-human host organism or cell harboring at least one nucleic acid according to any one of claims 14 and 15 or harboring at least one expression vector of claim 16.

18. A method for producing at least one polypeptide according to claim 13 comprising:

a) culturing a non-human host organism or cell harboring at least one nucleic acid according to any one of claims 14 and 15 and expressing or overexpressing at least one polypeptide according to claim 13;

b) optionally isolating said polypeptide from the non-human host organism or cell cultured in step a).

19. A method for preparing a mutant polypeptide capable of converting at least one polyunsaturated fatty acid (PUFA), in particular omega-3 or omega-6 PUFA, to at least one mono- or polyunsaturated aliphatic aldehyde, comprising the steps of:

a) selecting a nucleic acid according to any one of claims 14 and 15;

b) modifying the selected nucleic acid to obtain at least one mutant nucleic acid;

c) providing host cells or unicellular organisms with the mutant nucleic acid sequence to express a polypeptide encoded by the mutant nucleic acid sequence;

d) screening for at least one mutant polypeptide with activity in converting at least one polyunsaturated fatty acid (PUFA), in particular omega-3 or omega-6 PUFA, to at least one mono- or polyunsaturated aliphatic aldehyde;

e) optionally, if the mutated polypeptide has no desired activity, repeating the process steps a) to d) until a polypeptide with a desired activity is obtained; and,

f) optionally, if a mutant polypeptide having a desired activity was identified in step d) or e), isolating the corresponding mutant nucleic acid.

20. A combination of at least two unsaturated C10-aldehyde isomers, selected from 2E,4Z-decadienal, 2E,4E-decadienal, 2E,4Z,7Z-decatrienal and 2E,4E,7Z-decatrienal, wherein a particular ratio between 2E,4E-decadienal and 2E,4Z-decadienal is from 3:1 to 1:9 and a particular ratio between 2E,4Z,7Z-decatrienal and 2E,4E,7Z-decatrienal is from 3:1 to 1:9.

21. The use of a mono- or polyunsaturated aliphatic aldehyde or of a mixture of at least two of such aldehydes, and/or of corresponding conversion products and mixtures thereof as obtained by a method of any one of claims 1 to 12 or of an isomer combination of claim 20 as flavour ingredient for the manufacture of food or feed compositions.

22. A food or feed composition supplemented by at least one flavour ingredient as defined in claim 21.
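By way of a worked example, the ratio window "from 3:1 to 1:9" recited for the isomer combination above can be checked with simple cross-multiplication. The sketch below is illustrative only: the function name, the treatment of both end points as included, and the assumption that both amounts are expressed in the same unit are choices made for this example, since this section does not state the basis on which the ratio is determined.

```python
def ratio_within_window(first: float, second: float) -> bool:
    """True if first:second lies between 3:1 and 1:9 (end points included).

    Both amounts must use the same unit; which unit applies is an assumption
    left to the user of this sketch.
    """
    if first < 0 or second <= 0:
        return False
    # Cross-multiplied form of 1/9 <= first/second <= 3, avoiding division.
    return 9 * first >= second and first <= 3 * second


# Hypothetical amounts of 2E,4E-decadienal and 2E,4Z-decadienal.
print(ratio_within_window(1.0, 4.0))   # 1:4 lies inside the window
print(ratio_within_window(5.0, 1.0))   # 5:1 lies outside the window
```

The same check applies to the decatrienal pair, with the 2E,4Z,7Z isomer as the first argument and the 2E,4E,7Z isomer as the second, following the order given in the claim.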

Description:
LIPOXYGENASE-CATALYZED PRODUCTION OF UNSATURATED C10-ALDEHYDES FROM POLYUNSATURATED FATTY ACIDS (PUFA)

Technical field

The present invention provides novel methods for the lipoxygenase (LOX)-catalyzed production of aliphatic unsaturated C10-aldehyde compounds from polyunsaturated fatty acid (PUFA) sources. The present invention also relates to the isolation and characterization of novel, preferably bifunctional LOXs from different algae sources and the identification of structurally and/or functionally related LOXs from different bacterial sources. The present invention also relates to the provision of enzyme mutants derived from said newly identified enzymes. A further aspect of the present invention relates to corresponding coding sequences of said enzymes, recombinant vectors, and recombinant host cells suitable for the production of such LOXs and for performing the novel production methods of aliphatic unsaturated C10-aldehyde compounds. Another aspect of the invention relates to the use of particular aldehydes or aldehyde mixtures, as obtained according to the present invention, as flavor ingredient or ingredient for food or feed compositions.

Background

The unsaturated C10-aldehydes decadienal and decatrienal are very important ingredients for chicken and citrus flavours. In spite of high production costs and low production volumes, flavorists cannot replace them with other ingredients due to their unique olfactory properties. More than 200 commercial formulas contain C10-aldehydes.

C6 and C9 aldehydes are typically biosynthesised by plant defensive systems through a two-step enzymatic reaction starting from polyunsaturated fatty acids (PUFAs) (see Scheme 1 below). First, LOXs convert fatty acids to fatty acid hydroperoxides (HPOs). Subsequently, hydroperoxide lyases (HPLs) break down HPOs into metabolites including aldehydes and alcohols. The production of C6 and C9 ingredients by enzymes from plant extracts or enzymes from overexpressed microbial systems is well known. The industrial routes to manufacture C6 and C9 aldehyde flavour ingredients are relatively mature and the product quality is stable. Consequently, the prices remain lower than for C10 analogues.

In comparison to the C6 and C9 analogues, the industrial process to manufacture C10 aldehyde ingredients is more challenging (see Scheme 1 below, right half). It starts with the 9-LOX-catalysed peroxidation of linoleic acid and alpha-linolenic acid. The 9-LOX is obtained from a plant source (potato). Because no HPL is available that would cleave the 9-HPO intermediates into C10 fragments, a typical process currently relies instead on thermal degradation of 9-HPO. Overall, the approach has two drawbacks. The first is batch-to-batch product variation: the enzyme content of potato extracts differs between suppliers, so different yields are achieved for each production batch. The second is the low yield of the thermal cracking step, which leads to high production costs.

[Scheme 1: reaction scheme not reproduced. Recoverable labels: substrates C18:3ccc (alpha-linolenic acid) and C18:2cc (linoleic acid); enzymes 13-lipoxygenase (soybean) and 9-lipoxygenase.]

Scheme 1. Current industrial production routes for C6-C10 aldehyde ingredients.

Alsufyani, T. et al. describe in Chemistry and Physics of Lipids 183 (2014) 100-109 several seaweeds, including Ulva, which could produce decadienals and decatrienals through the conventional LOX/HPL pathway. This prior art document does not identify any gene sequence, coding sequence, or protein sequence involved in said bioconversion, or any key amino acid residues that determine high LOX activity.

Lee, J. et al. provide in Environmental Pollution 227 (2017) 252-262 a review of the algal and bacterial odor problems published over the last five decades. Two Microcystis species (Cyanobacteria) were reported to produce decatrienal. While said prior art focuses on odorant pollution in water, it provides no particular teaching on genes, coding sequences, or protein sequences responsible for said decatrienal formation.

Zhu, Z-J. et al. further investigate in Journal of Agriculture and Food Chemistry (2018) 66(5): 1233-1241 the multifunctional LOX PhLOX from the seaweed Pyropia haitanensis (also described by Chen, Hai-min et al. in Algal Research 12 (2015) 316-327) in the one-step bioconversion of fatty acids to primarily Cg-Cg aldehydes based on LOX activity and HPL activity. Said multifunctional LOX is said to show LOX, HPL and allene oxide synthase (AOS) activity. The production of a 2E,4Z-decadienal side product was observed merely by feeding with hydrolyzed fish oil but not with the numerous other tested substrates, like ALA, ARA, EPA and DHA. Decatrienals were not observed. Gamma-linolenic acid was not used as a substrate in said prior art. The productivity of said decadienal side product is quite low and not of industrial value.

Zhu et al. describe in PLoS One (2015) 10(2):e0117351 another multifunctional LOX, PhLOX2, from the seaweed Pyropia haitanensis. EPA, ARA, GLA and DHA were investigated as substrates; no production of any unsaturated C10 aldehyde was reported therein.

Chinese Patent Application CN 104293805 describes a multifunctional LOX protein sequence from the seaweed Pyropia haitanensis (PhLOX), which was also expressed in E. coli. Said LOX species did not produce decadienals and decatrienals when fed with fatty acid substrates. It only produces short-chain aldehydes.

Chinese Patent Application CN 104293837 A describes another multifunctional LOX from the seaweed Pyropia haitanensis (PhLOX), which was expressed in E. coli. No evidence for a production of C10-aldehydes, in particular decadienals and decatrienals, is provided therein.

WO2008056291 and EP-A-1 921 134 describe a cyanobacterial LOX, WP_012407347.1, and suggest its use in the production of fatty acid hydroperoxides; however, they do not provide evidence for the production of unsaturated C10-aldehydes, like decadienal.

Despite different reports on the biocatalytic synthesis of unsaturated C10-aldehydes, the enzymatic systems described in the prior art still suffer from the problem of low productivity and, consequently, do not provide a suitable basis for the industrial-scale production of C10-aldehydes.

The problem to be solved by the present invention is, therefore, the provision of an improved biocatalytic method for the production of unsaturated C10-aldehyde compounds, in particular decadienals and/or decatrienals. Another problem to be solved by the present invention is the provision of novel biocatalysts applicable in the fully biosynthetic production of unsaturated C10-aldehydes, in particular decadienals and/or decatrienals.

Summary

The above-mentioned problems could, surprisingly, be solved by providing unique and superior LOXs from new sources. In particular, the present inventors succeeded in isolating novel bifunctional LOXs from the seaweed source Cladophora oligoclara, which produce high amounts of decadienals and/or decatrienals from different PUFA substrates. The present inventors also succeeded in isolating a novel bifunctional LOX from the seaweed Ulva fasciata, which also produces high amounts of decadienals and/or decatrienals from different PUFA substrates.

On the basis of the sequence information derived from said new LOXs, the present inventors also surprisingly succeeded in the identification of LOXs with the desired catalytic LOX activity from bacterial sources, mainly from cyanobacteria.

On the basis of sequence comparisons between said newly identified enzymes, the present inventors were able to perform a systematic investigation on structure and functionality of suitable bifunctional LOXs showing superior productivity and/or specificity for unsaturated C10-aldehyde compounds, in particular decadienals and/or decatrienals, more particularly decadienals. Improved productivity was observed for several bacterial LOXs. On the basis of such investigations the inventors were able to further improve LOX productivity in the industrial production of such C10-aldehydes.

The newly identified protein sequences may be functionally expressed in bacterial hosts like Escherichia coli. Surprisingly, cultures with high cell density could be obtained with improved enzymatic capability for the industrial-scale production of said C10-aldehydes. When fed with specific fatty acids as substrates, such recombinant E. coli hosts are highly productive in different decadienals and/or decatrienals.

The new approach allows the provision of more cost-effective methods for the fully biocatalytic production of decadienals and/or decatrienals.

If required, said aldehydes may be converted to suitable derivatives, in particular to corresponding alcohols, by chemical or, in particular, biochemical conversion, for example by applying conventional alcohol dehydrogenase (ADH) enzymes.

Description of the drawings

Figure 1. Structural formulae of the unsaturated C10 aldehyde stereoisomers 2E,4Z,7Z-decatrienal, 2E,4E,7Z-decatrienal, 2E,4Z-decadienal and 2E,4E-decadienal.

Figure 2. SPME/GC/MS chromatogram of fresh samples of U. fasciata.

Figure 3. SPME/GC/MS chromatogram of fresh samples of C. oligoclara.

Figure 4. MS spectrum of 2E,4Z-decadienal.

Figure 5. MS spectrum of 2E,4E-decadienal.

Figure 6. MS spectrum of 2E,4Z,7Z-decatrienal.

Figure 7. MS spectrum of 2E,4E,7Z-decatrienal.

Figure 8. Feeding results of CoLOXs of the present invention with gamma-linolenic acid in comparison with negative controls (BL21 = non-transformed E. coli cells; pETDuet = BL21 transformed with empty vector).

Figure 9. Feeding result of CoLOXs of the present invention with alpha-linolenic acid and linoleic acid mixture in comparison with negative controls (BL21 = non-transformed E. coli cells; Empty vector = pETDuet-1 transformed E. coli cells).

Figure 10. Feeding result of CoLOXs of the present invention with fish oil in comparison with negative controls (BL21 = non-transformed E. coli cells; Empty vector = pETDuet-1 transformed E. coli cells).

Figure 11. Sequence alignment of UfLOX2 and bacterial LOX to mine key amino acid residues.

Figure 12. The results of mutagenesis studies of UfLOX2.

Figure 13. Influence of different cofactors on the activity of UfLOX2.

Figure 14. Alignment of different CoLOX amino acid sequences to generate consensus sequence of SEQ ID NO:51.

Figure 15. Alignment of different bacterial LOX amino acid sequences to generate consensus sequence of SEQ ID NO:52.

Figure 16. Alignment of UfLOX2 and different bacterial LOX amino acid sequences to generate consensus sequence of SEQ ID NO:53.

Figure 17. Alignment of different CoLOXs, UfLOX2 and different bacterial LOX amino acid sequences to generate consensus sequence of SEQ ID NO:54.

Figure 18. The average productivity of bacterial LOX mutants (black) compared to their natural sequences (grey), respectively.

Abbreviations used

AOS allene oxide synthase

bp base pair

kb kilo base

DNA deoxyribonucleic acid

cDNA complementary DNA

GC gas chromatograph

HPO Hydroperoxide

HPL Hydroperoxide lyase

LOX Lipoxygenase

MS mass spectrometer / mass spectrometry

PUFA Polyunsaturated fatty acid

PCR polymerase chain reaction

RNA ribonucleic acid

mRNA messenger ribonucleic acid

miRNA micro RNA

siRNA small interfering RNA

rRNA ribosomal RNA

tRNA transfer RNA

Xaa (or X), as used in the sequence listings herein or attached to this description, refers, unless otherwise specified, to any known natural amino acid residue or a chemical bond.

Particular PUFAs (PUFA substrates) as specifically referred to herein are selected from the following polyunsaturated omega-3 and omega-6 fatty acids and natural or synthetic mixtures of at least two of them:

Omega-3 fatty acids

Common name (abbreviation)      Lipid name    Chemical name
-                               16:4 (n-3)    all-cis-hexadeca-4,7,10,13-tetraenoic acid
Hexadecatrienoic acid (HTA)     16:3 (n-3)    all-cis-7,10,13-hexadecatrienoic acid
Alpha-linolenic acid (ALA)      18:3 (n-3)    all-cis-9,12,15-octadecatrienoic acid
Stearidonic acid (SDA)          18:4 (n-3)    all-cis-6,9,12,15-octadecatetraenoic acid
Eicosapentaenoic acid (EPA)     20:5 (n-3)    all-cis-5,8,11,14,17-eicosapentaenoic acid
Docosahexaenoic acid (DHA)      22:6 (n-3)    all-cis-4,7,10,13,16,19-docosahexaenoic acid

Omega-6 fatty acids

Common name (abbreviation)      Lipid name    Chemical name
Linoleic acid (LA)              18:2 (n-6)    all-cis-9,12-octadecadienoic acid
Gamma-linolenic acid (GLA)      18:3 (n-6)    all-cis-6,9,12-octadecatrienoic acid
Arachidonic acid (ARA)          20:4 (n-6)    all-cis-5,8,11,14-eicosatetraenoic acid

Non-limiting examples of particular PUFA mixtures as specifically referred to herein are selected from: fish oil, linseed oil, arachidonic acid oil, evening primrose oil, echium oil, micro algae oil and borage oil.
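For orientation, the lipid names in the two tables above follow from the chemical names by simple arithmetic: the number before the colon is the chain length, the number after it is the count of double bonds, and the "n-x" class is the chain length minus the position of the last double bond counted from the carboxyl carbon. The sketch below reproduces the lipid names from the tabulated double-bond positions. The dictionary layout and the function name `lipid_name` are assumptions made for this illustration.

```python
# Chain length and double-bond positions (counted from the carboxyl carbon),
# transcribed from the chemical names in the omega-3 and omega-6 tables.
PUFAS = {
    "HTA": (16, [7, 10, 13]),
    "ALA": (18, [9, 12, 15]),
    "SDA": (18, [6, 9, 12, 15]),
    "EPA": (20, [5, 8, 11, 14, 17]),
    "DHA": (22, [4, 7, 10, 13, 16, 19]),
    "LA": (18, [9, 12]),
    "GLA": (18, [6, 9, 12]),
    "ARA": (20, [5, 8, 11, 14]),
}


def lipid_name(carbons: int, double_bonds: list[int]) -> str:
    """Build the 'C:D (n-x)' shorthand from carboxyl-end numbering."""
    omega = carbons - max(double_bonds)  # distance of last double bond from methyl end
    return f"{carbons}:{len(double_bonds)} (n-{omega})"


for abbreviation, (carbons, bonds) in PUFAS.items():
    print(abbreviation, lipid_name(carbons, bonds))
```

For ALA, for example, the last double bond starts at carbon 15 of an 18-carbon chain, which gives 18 - 15 = 3 and hence the omega-3 class.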

Definitions

“Lipoxygenases” (LOXs) (also designated linoleate:oxygen oxidoreductases, EC 1.13.11.12) constitute a large gene family of non-heme iron-containing fatty acid dioxygenases, which are ubiquitous in plants and animals. LOXs catalyze the regio- and stereospecific dioxygenation of PUFAs containing at least one (1Z,4Z)-pentadiene system. Thus, substrates for LOXs are for example linoleic acid (LA), alpha-linolenic acid (ALA), or arachidonic acid (ARA). The term “LOX” as used herein specifically refers to such PUFA-degrading enzymes which have the ability to initiate a dioxygenation step in a suitable chain position of said PUFA molecule which ultimately results in the formation of at least one unsaturated C10-aldehyde fragment, in particular at least one decadienal and/or decatrienal compound, as the result of such oxidative degradation reaction. Said C10 compound(s) may be produced as side product(s) together with other oxidation product(s) of different chain length, for example of shorter chain lengths, as for example C6- or C9-unsaturated aldehydes; particularly, however, said C10 compound(s) may be produced as predominant product(s), i.e. in a molar excess over other oxidation products of different, for example shorter, chain lengths, as for example C6- or C9-unsaturated aldehydes; or more particularly said C10 compound(s) may be produced as the single product species.
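To illustrate the substrate requirement stated above, a (1Z,4Z)-pentadiene system corresponds to two cis double bonds separated by a single methylene group, i.e. double bonds starting at carbons p and p+3 in carboxyl-end numbering, with the bis-allylic methylene at carbon p+2. The following sketch lists such systems from a set of double-bond positions. It assumes that all listed double bonds are cis, as in the all-cis fatty acids named herein; the function name and the return format are hypothetical.

```python
def pentadiene_systems(double_bonds):
    """Return (first_bond, second_bond, bis_allylic_carbon) for each
    methylene-interrupted pair of double bonds.

    Assumption: every listed double bond is cis (Z), so each pair found
    corresponds to a (1Z,4Z)-pentadiene system.
    """
    bonds = sorted(double_bonds)
    return [(a, b, a + 2) for a, b in zip(bonds, bonds[1:]) if b - a == 3]


# Linoleic acid: double bonds at 9 and 12, bis-allylic carbon 11.
print(pentadiene_systems([9, 12]))
# Arachidonic acid: three overlapping pentadiene systems.
print(pentadiene_systems([5, 8, 11, 14]))
# A conjugated diene (hypothetical positions 9 and 11) contains none.
print(pentadiene_systems([9, 11]))
```

The sketch only counts candidate systems; which of them a given LOX actually oxygenates depends on the regio-specificity of the enzyme.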

The “LOX/HPL pathway” refers to the classical two-step enzymatic reaction for the oxidative degradation of polyunsaturated fatty acid molecules. First, LOXs convert said fatty acids to fatty acid hydroperoxides (HPOs). Subsequently, HPLs break down HPOs into metabolites including aldehydes and alcohols.

A “bifunctional” LOX designates herein a single enzyme molecule which shows both LOX and HPL activity required for the oxidative degradation of polyunsaturated fatty acid molecules (irrespective of a particular enzymatic mechanism). In a particular embodiment such bifunctional LOX may show essentially no AOS activity, and more particularly may lack such AOS activity. As shown in the experimental section, such bifunctional LOXs not only form fatty acid hydroperoxide intermediates, they also show the ability to degrade such fatty acid hydroperoxide compounds if applied as synthetic artificial substrate. Thus said bifunctional LOX catalyzes the formation of at least one unsaturated C10-aldehyde fragment, in particular at least one decadienal and/or decatrienal compound, as the result of such oxidative degradation reaction. Said C10 compound(s) may be produced as side product(s) together with other oxidation product(s) of different chain length, for example of shorter chain lengths, as for example C6- or C9-unsaturated aldehydes; particularly, however, said C10 compound(s) may be produced as predominant product(s), i.e. in a molar excess over other oxidation products of different, for example shorter, chain lengths, as for example C6- or C9-unsaturated aldehydes; or more particularly said C10 compound(s) may be produced as the single product species.

Without being bound to any mechanistic considerations, the HPL activity of a “bifunctional LOX” of the present invention may be further described as the ability to exclusively or preferentially cleave the hydroperoxide intermediate of the PUFA substrate at the C-C bond on the carboxyl-terminal side relative to its HOO- group. This also distinguishes the present enzymes from plant-derived LOX/HPL enzyme systems, as for example depicted in the above Scheme 1. Starting out from LA or ALA (i.e. C18-PUFAs), a bifunctional LOX of the invention may be considered to encompass both a 9-LOX activity and a 9-HPL activity. As opposed to the prior art 9-HPL of rice plants, the 9-HPL activity of the bifunctional LOX of the present invention, however, results in a cleavage of the hydroperoxide intermediate on the opposite (carboxyl-terminal) side of the HOO- group of the intermediate. For cleavage resulting in a C10-aldehyde, an extra double bond in beta-position relative to the HOO-group appears to be favorable or necessary, so that a cleavage of the carbon chain between the C-atom carrying the HOO-group and the carbon atom in alpha-position thereto will occur. As a result, a C10-aldehyde rather than a C9-aldehyde, as in the case of the plant enzyme, is produced. This is illustrated below in Scheme 2 with GLA as an example.

[Scheme 2 image: cleavage products labeled C9 aldehyde and C10 aldehyde (decadienal)]

Scheme 2. Comparison of cleavage sites of the LOX enzyme of the invention (resulting in C10 aldehyde formation) and prior art enzymes (resulting in C9 aldehyde formation).

As is evident from the above Scheme 2, a “bifunctional LOX” of the present invention, in order to produce an unsaturated C10-aldehyde, utilizes particular PUFA substrates. Essentially, a preferred PUFA substrate should comprise cis-double bonds between the omega-9 and omega-10 carbon atoms (i.e. between positions (C-9) and (C-10) in a C18 fatty acid and between positions (C-11) and (C-12) in a C20 fatty acid) as well as between the omega-12 and omega-13 carbon atoms (i.e. between positions (C-6) and (C-7) in a C18 fatty acid and between positions (C-8) and (C-9) in a C20 fatty acid). For example, in case of C18 fatty acids those comprising two cis double bonds in an all-cis-6,9 configuration (cf. GLA and SDA) are preferred substrates, and in case of C20 fatty acids those comprising two cis double bonds in an all-cis-8,11 configuration (cf. EPA or ARA) are preferred substrates. These preferred PUFA substrates may also be considered as “reference substrates”. In order to qualify as a “bifunctional LOX of the present invention” it is sufficient if the LOX is able to convert at least one of such “reference substrates” to an unsaturated C10-aldehyde, in particular at least one selected from (2E,4Z)-2,4-decadienal, (2E,4E)-2,4-decadienal, (2E,4Z,7Z)-2,4,7-decatrienal and (2E,4E,7Z)-2,4,7-decatrienal.

An “unsaturated C10-aldehyde” encompasses any mono-, di- or tri-unsaturated linear aliphatic aldehyde having ten carbon atoms in its hydrocarbyl chain. It encompasses such compound in any stereoisomerically pure form or in the form of mixtures of at least two different stereoisomers. Particular, non-limiting examples of such aldehydes are decadienals and decatrienals.
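The positional rule for preferred substrates can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, double-bond positions are counted from the carboxyl carbon, and the entries for ALA, SDA and EPA are standard textbook values rather than values taken from this description.

```python
# Cis double-bond positions counted from the carboxyl carbon (delta numbering).
# LA, GLA and ARA follow the nomenclature given in this description; ALA, SDA
# and EPA are textbook values added for illustration only.
PUFAS = {
    "LA": (18, (9, 12)),
    "ALA": (18, (9, 12, 15)),
    "GLA": (18, (6, 9, 12)),
    "SDA": (18, (6, 9, 12, 15)),
    "ARA": (20, (5, 8, 11, 14)),
    "EPA": (20, (5, 8, 11, 14, 17)),
}


def required_positions(chain_length):
    """Delta positions of the omega-12/13 and omega-9/10 double bonds."""
    # Carbon omega-k is carbon (chain_length - k + 1) from the carboxyl end,
    # so the bond between omega-(k+1) and omega-k starts at chain_length - k.
    return (chain_length - 12, chain_length - 9)


def has_reference_pattern(chain_length, cis_positions):
    """True if both double bonds required of a reference substrate are present."""
    return set(required_positions(chain_length)) <= set(cis_positions)


for name, (length, bonds) in PUFAS.items():
    print(name, has_reference_pattern(length, bonds))
```

Under this rule GLA, SDA, ARA and EPA carry both double bonds, whereas LA and ALA lack the one between the omega-12 and omega-13 carbon atoms.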

A “decadienal” encompasses such compound in any stereoisomerically pure form or in the form of mixtures of at least two different stereoisomers. Typical examples are 2E,4Z-decadienal and 2E,4E-decadienal and mixtures thereof.

A “decatrienal” encompasses such compound in any stereoisomerically pure form or in the form of mixtures of at least two different stereoisomers. Typical examples are 2E,4Z,7Z-decatrienal, 2E,4E,7Z-decatrienal and mixtures thereof.

The term “PUFA” as used herein has to be understood broadly. In particular it encompasses one single “pure” or “essentially pure” type of PUFA molecule (like HTA, ALA, SDA, EPA, LA, GLA, or ARA) or any mixture containing at least two different types of PUFAs. A PUFA substrate also encompasses natural products containing at least one PUFA type in admixture with other natural or synthetic constituents, as for example

a) borage oil (containing elevated proportions of GLA)

b) evening primrose oil (containing elevated proportions of GLA)

c) arachidonic oil (containing elevated proportions of ARA)

d) echium seed oil (containing elevated proportions of SDA)

e) fish oil (containing elevated proportions of EPA and DHA)

f) linseed oil (containing elevated proportions of ALA)

g) micro algae oil (containing elevated proportions of DHA)

“Bifunctional LOX Activity” is determined under “standard conditions” as described in the experimental section. In general, the LOX product GLA-HPO and the HPL products hexanal and decadienal are quantified by GC-MS and LC-UV via their peak areas. The bifunctional LOX activity for producing decadienal can be deduced from the peak area ratio of decadienal to GLA-HPO in the LC-UV data, as shown in Table 9.
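The peak area ratio mentioned above is a simple quotient. The following Python sketch illustrates it; the function name and the peak areas are hypothetical and are not taken from Table 9.

```python
def decadienal_to_hpo_ratio(decadienal_area, gla_hpo_area):
    """Peak area ratio of decadienal to GLA-HPO from one LC-UV chromatogram."""
    if gla_hpo_area <= 0:
        raise ValueError("GLA-HPO peak area must be positive")
    return decadienal_area / gla_hpo_area


# Hypothetical peak areas in arbitrary units, for illustration only.
ratio = decadienal_to_hpo_ratio(decadienal_area=1500.0, gla_hpo_area=6000.0)
print(ratio)  # 0.25
```

A higher ratio would indicate that more of the hydroperoxide intermediate has been cleaved to decadienal, assuming both peaks are integrated from the same run.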

The terms “biological function”, “function”, “biological activity” or “activity” of a LOX refer to the ability of a LOX as described herein to catalyze the formation of at least one unsaturated C10 aldehyde from at least one type of PUFA molecule.

As used herein, the term “host cell” or “transformed cell” refers to a cell (or organism) altered to harbor at least one nucleic acid molecule, for instance, a recombinant gene encoding a desired protein or nucleic acid sequence which upon transcription yields at least one functional polypeptide of the present invention, i.p. a LOX or bifunctional LOX as defined herein above. The host cell is particularly a bacterial cell, a fungal cell or a plant cell or plants. The host cell may contain a recombinant gene or several genes, as for example organized as an operon, which has been integrated into the nuclear or organelle genomes of the host cell. Alternatively, the host may contain the recombinant gene extra-chromosomally.

The term “organism” refers to any non-human multicellular or unicellular organism such as a plant, or a microorganism. Particularly, a microorganism is a bacterium, a yeast, an alga or a fungus.

The term “plant” is used interchangeably to include plant cells including plant protoplasts, plant tissues, plant cell tissue cultures giving rise to regenerated plants, or parts of plants, or plant organs such as roots, stems, leaves, flowers, pollen, ovules, embryos, fruits and the like. Any plant can be used to carry out the methods of an embodiment herein.

A particular organism or cell is meant to be “capable of producing” an unsaturated C10 aldehyde when it produces such aldehyde naturally or when it does not produce such aldehyde naturally but is transformed to produce such aldehyde with a nucleic acid as described herein. Organisms or cells transformed to produce a higher amount of such aldehyde than the naturally occurring organism or cell are also encompassed by the “organisms or cells capable of producing unsaturated C10 aldehyde”.

For the descriptions herein and the appended claims, the use of “or” means “and/or” unless stated otherwise. Similarly, “comprise”, “comprises”, “comprising”, “include”, “includes”, and “including” are interchangeable and not intended to be limiting.

It is to be further understood that where descriptions of various embodiments use the term "comprising," those skilled in the art would understand that in some specific instances, an embodiment can be alternatively described using language "consisting essentially of" or "consisting of".

The terms "purified", "substantially purified", and "isolated" as used herein refer to the state of being free of other, dissimilar compounds with which a compound of the invention is normally associated in its natural state, so that the "purified", "substantially purified", and "isolated" subject comprises at least 0.5%, 1%, 5%, 10%, or 20%, or at least 50% or 75% of the mass, by weight, of a given sample. In one embodiment, these terms refer to the compound of the invention comprising at least 95, 96, 97, 98, 99 or 100%, of the mass, by weight, of a given sample. As used herein, the terms "purified," "substantially purified," and "isolated" when referring to a nucleic acid or protein, or nucleic acids or proteins, also refer to a state of purification or concentration different than that which occurs naturally, for example in a prokaryotic or eukaryotic environment, like, for example, in a bacterial or fungal cell, or in the mammalian organism, especially the human body. Any degree of purification or concentration greater than that which occurs naturally, including (1) the purification from other associated structures or compounds or (2) the association with structures or compounds to which it is not normally associated in said prokaryotic or eukaryotic environment, is within the meaning of "isolated". The nucleic acid or protein or classes of nucleic acids or proteins, described herein, may be isolated, or otherwise associated with structures or compounds to which they are not normally associated in nature, according to a variety of methods and processes known to those of skill in the art.

The term “about” indicates a potential variation of ± 25% of the stated value, in particular ± 15%, ± 10%, more particularly ± 5%, ± 2% or ± 1%.

The term "substantially" describes a range of values of from about 80 to 100%, such as, for example, 85-99.9%, in particular 90 to 99.9%, more particularly 95 to 99.9%, or 98 to 99.9% and especially 99 to 99.9%.

“Predominantly” refers to a proportion in the range of above 50%, as for example in the range of 51 to 100%, particularly in the range of 75 to 99.9%, more particularly 85 to 98.5%, like 95 to 99%.

A “main product” in the context of the present invention designates a single compound or a group of at least 2 compounds, like 2, 3, 4, 5 or more, particularly 2 or 3 compounds, which single compound or group of compounds is “predominantly” prepared by a reaction as described herein, and is contained in said reaction in a predominant proportion based on the total amount of the constituents of the product formed by said reaction. Said proportion may be a molar proportion, a weight proportion or, preferably based on chromatographic analytics, an area proportion calculated from the corresponding chromatogram of the reaction products.

A “side product” in the context of the present invention designates a single compound or a group of at least 2 compounds, like 2, 3, 4, 5 or more, particularly 2 or 3 compounds, which single compound or group of compounds is not “predominantly” prepared by a reaction as described herein.
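As an illustration of the “main product” and “side product” definitions, the following Python sketch computes area proportions from a chromatogram and applies the “above 50%” threshold of the term “predominantly”. The function names and the peak areas are hypothetical.

```python
def area_proportions(peak_areas):
    """Return each product's share of the total peak area, in percent."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


def is_main_product(peak_areas, group):
    """True if the compound(s) in `group` together exceed 50% of the total area."""
    proportions = area_proportions(peak_areas)
    return sum(proportions[name] for name in group) > 50.0


# Hypothetical peak areas in arbitrary units, for illustration only.
areas = {"2E,4Z-decadienal": 600.0, "2E,4E-decadienal": 250.0, "hexanal": 150.0}

print(is_main_product(areas, ["2E,4Z-decadienal"]))
print(is_main_product(areas, ["2E,4Z-decadienal", "2E,4E-decadienal"]))
print(is_main_product(areas, ["hexanal"]))
```

In this made-up example the decadienal isomers, alone or as a group, qualify as main product, while hexanal is a side product.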

Because of the reversibility of enzymatic reactions, the present invention relates, unless otherwise stated, to the enzymatic or biocatalytic reactions described herein in both directions of reaction.

"Functional mutants" of herein described polypeptides include the "functional equivalents" of such polypeptides as defined below.

The term "stereoisomers" includes in particular conformational isomers.

Included in general are, according to the invention, all “stereoisomeric forms” of the compounds described herein, such as constitutional isomers and, in particular, stereoisomers and mixtures thereof, e.g. optical isomers, or geometric isomers, such as E- and Z-isomers, and combinations thereof. If several asymmetric centers are present in one molecule, the invention encompasses all combinations of different conformations of these asymmetry centers, e.g. enantiomeric pairs.

“Stereoselectivity” describes the ability to produce a particular stereoisomer of a compound in a stereoisomerically pure form or to specifically convert a particular stereoisomer in an enzyme catalyzed method as described herein out of a plurality of stereoisomers. More specifically, this means that a product of the invention is enriched with respect to a specific stereoisomer, or an educt may be depleted with respect to a particular stereoisomer. This may be quantified via the purity %ee-parameter calculated according to the formula:

%ee = [X_A - X_B] / [X_A + X_B] * 100,

wherein X_A and X_B represent the mole fractions of the stereoisomers A and B.

The terms “selectively converting” or “increasing the selectivity” in general mean that a particular stereoisomeric form, as for example the E-form, of an unsaturated hydrocarbon, is converted in a higher proportion or amount (compared on a molar basis) than the corresponding other stereoisomeric form, as for example the Z-form, either during the entire course of said reaction (i.e. between initiation and termination of the reaction), at a certain point of time of said reaction, or during an “interval” of said reaction. In particular, said selectivity may be observed during an “interval” corresponding to 1 to 99%, 2 to 95%, 3 to 90%, 5 to 85%, 10 to 80%, 15 to 75%, 20 to 70%, 25 to 65%, 30 to 60%, or 40 to 50% conversion of the initial amount of the substrate. Said higher proportion or amount may, for example, be expressed in terms of:

a higher maximum yield of an isomer observed during the entire course of the reaction or said interval thereof;

a higher relative amount of an isomer at a defined % degree of conversion value of the substrate; and/or

an identical relative amount of an isomer at a higher % degree of conversion value;

each of which preferably being observed relative to a reference method, said reference method being performed under otherwise identical conditions with known chemical or biochemical means.
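The %ee parameter defined above can be computed directly from the mole fractions of the two stereoisomers. The following Python sketch is a minimal illustration; the function name and the example values are hypothetical.

```python
def percent_ee(x_a, x_b):
    """%ee = (X_A - X_B) / (X_A + X_B) * 100 for mole fractions X_A and X_B."""
    total = x_a + x_b
    if total <= 0:
        raise ValueError("the sum of the mole fractions must be positive")
    return (x_a - x_b) / total * 100.0


# Hypothetical mixture containing 75% of stereoisomer A and 25% of B.
print(percent_ee(0.75, 0.25))  # 50.0
```

A value of 100 corresponds to stereoisomerically pure A, and a value of 0 to an equimolar mixture.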

Generally also comprised in accordance with the invention are all “isomeric forms” of the compounds described herein, such as constitutional isomers and in particular stereoisomers and mixtures of these, such as, for example, optical isomers or geometric isomers, such as E- and Z-isomers, and combinations of these. If several centers of asymmetry are present in a molecule, then the invention comprises all combinations of different conformations of these centers of asymmetry, such as, for example, pairs of enantiomers, or any mixtures of stereoisomeric forms.

"Yield" and/or the "conversion rate" of a reaction according to the invention is determined over a defined period of, for example, 4, 6, 8, 10, 12, 16, 20, 24, 36 or 48 hours, in which the reaction takes place. In particular, the reaction is carried out under precisely defined conditions, for example at “standard conditions” as herein defined. The different yield parameters ("Yield" or Yp / s; "Specific Productivity Yield"; or Space-Time-Yield (STY)) are well known in the art and are determined as described in the literature.

"Yield" and "Yp / s" (each expressed in mass of product produced/mass of material consumed) are herein used as synonyms.

The specific productivity-yield describes the amount of a product that is produced per h and L fermentation broth per g of biomass. The amount of wet cell weight, stated as WCW, describes the quantity of biologically active microorganism in a biochemical reaction. The value is given as g product per g WCW per h (i.e. g gWCW⁻¹ h⁻¹). Alternatively, the quantity of biomass can also be expressed as the amount of dry cell weight, stated as DCW. Furthermore, the biomass concentration can be more easily determined by measuring the optical density at 600 nm (OD600) and by using an experimentally determined correlation factor for estimating the corresponding wet cell or dry cell weight, respectively.
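The calculation of the specific productivity in g product per g WCW per h, with biomass estimated from OD600, can be sketched as follows. The function names, the correlation factor and all numerical values are hypothetical; in practice the correlation factor has to be determined experimentally for the organism and photometer used.

```python
def biomass_from_od600(od600, volume_l, g_wcw_per_l_per_od):
    """Estimate biomass (g WCW) from OD600 using an empirical correlation factor."""
    return od600 * g_wcw_per_l_per_od * volume_l


def specific_productivity(product_g, biomass_g_wcw, hours):
    """Specific productivity in g product per g WCW per h."""
    if biomass_g_wcw <= 0 or hours <= 0:
        raise ValueError("biomass and reaction time must be positive")
    return product_g / (biomass_g_wcw * hours)


# Hypothetical values: 1 L of broth at OD600 = 10, an assumed factor of
# 1.5 g WCW per L per OD unit, and 3 g of product formed within 4 h.
biomass = biomass_from_od600(od600=10.0, volume_l=1.0, g_wcw_per_l_per_od=1.5)
print(biomass)
print(specific_productivity(product_g=3.0, biomass_g_wcw=biomass, hours=4.0))
```

The same functions apply to dry cell weight if a DCW correlation factor is supplied instead.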

The term "fermentative production" or "fermentation" refers to the ability of a microorganism (assisted by enzyme activity contained in or generated by said microorganism) to produce a chemical compound in cell culture utilizing at least one carbon source added to the incubation.

The term "fermentation broth" is understood to mean a liquid, particularly aqueous or aqueous /organic solution which is based on a fermentative process and has not been worked up or has been worked up, for example, as described herein.

An “enzymatically catalyzed” or “biocatalytic” method means that said method is performed under the catalytic action of an enzyme, including enzyme mutants, as herein defined. Thus the method can either be performed in the presence of said enzyme in isolated (purified, enriched) or crude form or in the presence of a cellular system, in particular, natural or recombinant microbial cells containing said enzyme in active form, and having the ability to catalyze the conversion reaction as disclosed herein.

If the present disclosure refers to features, parameters and ranges thereof of different degree of preference (including general, not explicitly preferred features, parameters and ranges thereof) then, unless otherwise stated, any combination of two or more of such features, parameters and ranges thereof, irrespective of their respective degree of preference, is encompassed by the disclosure of the present description.

Detailed Description

a. Particular embodiments of the invention

The present invention relates to the following particular embodiments:

1. A polypeptide which comprises the enzymatic activity of a lipoxygenase, i.p. of a bifunctional LOX, with an amino acid sequence that comprises a consensus sequence pattern selected from SEQ ID NO:54; or comprises at least one partial consensus sequence pattern of SEQ ID NO:54 selected from

a) AKxxxxxADxxxxxxxxHxxxxHxxxxPxA (SEQ ID NO:240),

b) VxGxxxxxxxxxxLxxxxxxxxxxxxxxHxxxNxxQxxYxxxxxN (SEQ ID NO:241),

and

c) LxxxxxxIxxxNxxxxxxYxxxxPxxxxxSI (SEQ ID NO:242);

d) or any combination from a), b) and c), and in particular a combination of a), b) and c),

wherein each amino acid residue x independently of each other may be selected from any natural amino acid residue.
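A partial consensus pattern of this kind can be checked against a candidate sequence by translating it into a regular expression. The following Python sketch is one possible way to do this; it assumes that “any natural amino acid residue” means the 20 standard residues, and the function name and the test sequence are hypothetical.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption: the 20 standard residues


def consensus_to_regex(pattern):
    """Translate a pattern such as 'AKxxxxxAD' into a compiled regular expression."""
    parts = []
    for symbol in pattern:
        if symbol == "x":
            parts.append(f"[{AMINO_ACIDS}]")  # wildcard position
        elif symbol in AMINO_ACIDS:
            parts.append(symbol)  # conserved residue
        else:
            raise ValueError(f"unexpected symbol {symbol!r} in pattern")
    return re.compile("".join(parts))


PATTERN_A = "AKxxxxxADxxxxxxxxHxxxxHxxxxPxA"  # partial pattern a), SEQ ID NO:240

# Toy sequence built to satisfy pattern a); it is not a real LOX sequence.
toy_sequence = "MS" + PATTERN_A.replace("x", "G") + "KL"

print(consensus_to_regex(PATTERN_A).search(toy_sequence) is not None)
```

Using `search` rather than `fullmatch` reflects that the amino acid sequence only has to comprise the pattern somewhere along its length.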

2. The polypeptide of embodiment 1 which comprises the enzymatic activity of a lipoxygenase, i.p. of a bifunctional LOX, with an amino acid sequence that comprises a consensus sequence pattern selected from SEQ ID NO:53; or comprises at least one partial consensus sequence pattern of SEQ ID NO:53 selected from

a) LxxxxxYxxxxxX1xxxxxxX2GxxxxxxxKxLPxPxxxFxWxxxX3xxxPxxI (SEQ ID NO:243),

b) WxxAKxCxQxADxxHxExxxHxxxxHxxMxPxA (SEQ ID NO:244),

c) GxVxGxxxxxxxxxxLxxxxxxxxxxCxPxHxxxNxxQxxYxxxxxNMPxAxY (SEQ ID NO:245),

d) QxxxxxxLxxxxxDxxGxYxxxX4F (SEQ ID NO:246),

e) QxxLxxxxxxIxxxNxxRxxxYxxxxxxxxxNSI (SEQ ID NO:247),

f) or any combination from a) to e), and in particular a combination of b), c) and e), or a) to e),

wherein

each amino acid residue x independently of each other may be selected from any natural amino acid residue,

X1 represents 0 to 7 identical or different natural amino acid residues,

X2 represents 0 or 1 natural amino acid residue,

X3 represents 0 to 7 identical or different natural amino acid residues, and

X4 represents 0 to 8 identical or different natural amino acid residues.
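Patterns containing variable-length stretches can be handled by mapping each stretch to a bounded repetition. The sketch below writes the four variable stretches as X1 to X4 and uses the length ranges stated for embodiment 2; the function name and the short example pattern are hypothetical, and “natural amino acid residue” is again assumed to mean the 20 standard residues.

```python
import re

ANY = "[ACDEFGHIKLMNPQRSTVWY]"  # assumption: the 20 standard residues
GAPS = {"X1": (0, 7), "X2": (0, 1), "X3": (0, 7), "X4": (0, 8)}  # embodiment 2
VALID = re.compile(r"(?:X[1-4]|x|[ACDEFGHIKLMNPQRSTVWY])+")


def gapped_consensus_to_regex(pattern):
    """Translate a pattern with x wildcards and X1..X4 stretches into a regex."""
    if not VALID.fullmatch(pattern):
        raise ValueError("unexpected symbol in pattern")

    def translate(match):
        token = match.group(0)
        if token in GAPS:
            low, high = GAPS[token]
            return f"{ANY}{{{low},{high}}}"  # variable-length stretch
        return ANY if token == "x" else token  # wildcard or conserved residue

    return re.compile(re.sub(r"X[1-4]|x|[A-Z]", translate, pattern))


# Short illustrative pattern, not one of the claimed consensus sequences.
regex = gapped_consensus_to_regex("GxYX4F")
print(regex.fullmatch("GAYF") is not None)  # X4 stretch of length 0
print(regex.fullmatch("GAY" + "A" * 8 + "F") is not None)  # length 8
print(regex.fullmatch("GAY" + "A" * 9 + "F") is not None)  # length 9, too long
```

For embodiment 3 the only change under this sketch would be the range of X3, which is stated there as 0 to 6.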

3. The polypeptide of embodiment 1 which comprises the enzymatic activity of a lipoxygenase, i.p. of a bifunctional LOX, with an amino acid sequence that comprises a consensus sequence pattern selected from SEQ ID NO:52; or comprises at least one partial consensus sequence pattern of SEQ ID NO:52 selected from

a) LxxxxxYxxxxxX1xxxxxxX2GGxxxxxxKxLPxPxAxFxWxxxX3xxxPxxI (SEQ ID NO:248),

b) WxxAKxCxQxADxNHxExxxHxxxTHxVMxPxAxxT (SEQ ID NO:249),

c) GxVxGxxxxxxxxxxLxxxxxxxxxxCxPxHxxxNxxQxxYxxxxxNMPxAxY (SEQ ID NO:250),

d) QxxxxxxLxxxxYDxLGxYxxxX4F (SEQ ID NO:251),

e) FQxxLxxxxxxIxxxNxxRxxxYxxxxPxxxxNSI (SEQ ID NO:252),

f) or any combination from a) to e), and in particular a combination of b), c) and e); or a) to e),

wherein

each amino acid residue x independently of each other may be selected from any natural amino acid residue,

X1 represents 0 to 7 identical or different natural amino acid residues,

X2 represents 0 or 1 natural amino acid residue,

X3 represents 0 to 6 identical or different natural amino acid residues, and

X4 represents 0 to 8 identical or different natural amino acid residues.

The present invention also relates to several groups of polypeptides which comprise the enzymatic activity of a lipoxygenase, i.p. of a bifunctional LOX, and which may not show at least one of the above sequence patterns of embodiments 1, 2 and 3 in an identical manner or which may show a sequence pattern that is similar to at least one of the above patterns but does not completely match therewith.

4. Thus another embodiment of the invention refers to a polypeptide which comprises the enzymatic activity of a lipoxygenase, i.p. of a bifunctional LOX, optionally fulfilling any one of the preceding embodiments, and comprising an amino acid sequence selected from

a) SEQ ID NO: 3, 6, 9, 12 or 15;

b) SEQ ID NO: 18;

c) SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, or 50; and

d) amino acid sequences having at least 40% sequence identity to at least one of the sequences of a), b) or c) and retaining said enzymatic activity of a lipoxygenase.

Thus, the polypeptides of the present embodiment may or may not meet the limitations of any one of the embodiments 1, 2 and 3.

A first particular group of polypeptides comprises an amino acid sequence selected from SEQ ID NO: 3, 6, 9, 12 or 15 (CoLOXs) and amino acid sequences having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to at least one of these sequences and retaining said bifunctional LOX activity, and which may not meet the limitations of any one of the embodiments 1, 2 and 3;

or alternatively selected from:

SEQ ID NO: 3, 6, 9, 12 or 15 (CoLOXs) and amino acid sequences having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to at least one of these sequences and retaining said bifunctional LOX activity, and which meet the limitations of any one of the embodiments 1, 2 and 3.

A second particular group of polypeptides comprises an amino acid sequence selected from

SEQ ID NO: 18 (UfLOX2) and amino acid sequences having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity thereto and retaining said bifunctional LOX activity, and which may not meet the limitations of any one of the embodiments 1, 2 and 3;

or alternatively selected from:

SEQ ID NO: 18 (UfLOX2) and amino acid sequences having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity thereto and retaining said bifunctional LOX activity, and which meet the limitations of any one of the embodiments 1, 2 and 3.

A third particular group of polypeptides comprises an amino acid sequence selected from SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, or 50 (bacterial LOXs) and amino acid sequences having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to at least one of these sequences and retaining said bifunctional LOX activity, and which may not meet the limitations of any one of the embodiments 1, 2 and 3;

or alternatively selected from:

SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, or 50 (bacterial LOXs) and amino acid sequences having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to at least one of these sequences and retaining said bifunctional LOX activity, and which meet the limitations of any one of the embodiments 1, 2 and 3.

A particular subgroup of said third group of polypeptides relates to SEQ ID NO: 20 and 26 and amino acid sequences having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to at least one of these sequences and retaining said bifunctional LOX activity.
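To make the sequence identity thresholds concrete, the following Python sketch computes a percent identity for two sequences that have already been aligned. It is only an illustration under assumptions: identity is taken here as identical residues divided by the number of alignment columns, which is one of several conventions in use, and the function name and toy sequences are hypothetical. The definition of sequence identity given elsewhere in this description governs.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy alignment, not taken from the sequence listing.
identity = percent_identity("MKTAYIAK", "MKSAY-AK")
print(identity)          # 75.0
print(identity >= 40.0)  # meets the lowest stated threshold
```

Producing the alignment itself requires a dedicated alignment program and is outside the scope of this sketch.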

5. A polypeptide which comprises the enzymatic activity of a lipoxygenase with an amino acid sequence that is selected from SEQ ID NO: 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238 or 239; and amino acid sequences having at least 40% sequence identity to at least one of said sequences and retaining said enzymatic activity of a lipoxygenase.

A fourth particular group of polypeptides comprises an amino acid sequence selected from SEQ ID NO: 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238 and 239 and amino acid sequences having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to at least one of said sequences and retaining said bifunctional LOX activity.

6. A polypeptide as defined in any one of the preceding embodiments having, preferably bifunctional, LOX activity and mutants thereof.

Particular examples of suitable mutants of UfLOX 2 (SEQ ID NO: 18) are:

Mutants of SEQ ID NO: 18 wherein one or more, as for example 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10, like 1, 2, 3, 4, 5, 6, 7, 8 or 9 amino acid mutations are performed (generating a mutation profile) in a sequence position different from potential key positions such as C7, D134, R136, 061, A219, S256, C278, S305, C409 and G526 of SEQ ID NO: 18, and which mutation(s) provide a bifunctional LOX mutant with a feature profile, such as unsaturated C10-aldehyde productivity, unsaturated C10-aldehyde product profile, substrate profile, side product profile or combinations thereof, which is substantially identical if compared to the non-mutated parent enzyme; as well as further mutants derived from such a mutant, having a degree of sequence identity of at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% to SEQ ID NO: 18, retaining said mutation profile and preferably still showing a feature profile substantially identical to the non-mutated enzyme. In particular, such single or multiple mutants may be obtained by performing so-called conservative mutations, as for example conservative amino acid substitutions as defined herein below.

Mutants of SEQ ID NO: 18 wherein one or more, as for example 1 to 10, like 1, 2, 3, 4, 5, 6, 7, 8 or 9 amino acid mutations are performed (generating another mutation profile) in a potential key sequence position selected from C7, D134, R136, 061, A219, S256, C278, S305, C409 and G526 of SEQ ID NO: 18, and which mutation(s) provide a bifunctional LOX mutant with a, if compared to the non-mutated parent enzyme, different profile of features, like for example improved unsaturated C10-aldehyde productivity, different unsaturated C10-aldehyde product profile, different PUFA substrate profile, production of less side products or combinations thereof; as well as further mutants derived therefrom, having a degree of sequence identity of at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% to SEQ ID NO: 18, and retaining said mutation profile in said key positions and preferably still showing said modified functional profile. In particular, such single or multiple mutants in key positions may be obtained by performing so-called non-conservative mutations.
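The distinction between mutations inside and outside the potential key positions can be sketched as a simple lookup. The code below is illustrative only: it assumes the common substitution notation "original residue, position, new residue" (e.g. S256A), uses only the position numbers because the residue letter at position 61 is not legible in the text above, and the example substitutions are hypothetical.

```python
import re

# Potential key positions of SEQ ID NO: 18 as listed above (numbers only; the
# residue letter at position 61 is unclear in the source text).
KEY_POSITIONS = {7, 61, 134, 136, 219, 256, 278, 305, 409, 526}

SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def classify_mutations(mutations):
    """Split substitutions into those at key positions and all others."""
    at_key, elsewhere = [], []
    for mutation in mutations:
        match = SUBSTITUTION.fullmatch(mutation)
        if match is None:
            raise ValueError(f"cannot parse mutation {mutation!r}")
        position = int(match.group(2))
        (at_key if position in KEY_POSITIONS else elsewhere).append(mutation)
    return at_key, elsewhere


# Hypothetical substitutions, for illustration only.
print(classify_mutations(["S256A", "L100V"]))
```

Here S256A falls on a listed key position, whereas the made-up L100V does not.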

Based on the sequence alignments provided herein (see Figures 11, 14, 15, 16 and 17) the results of mutational experiments performed with one particular LOX (like UfLOX2) may be transferred in analogy to the corresponding amino acid residue position of another LOX enzyme as described herein in order to evaluate the respective mutation in said other enzyme and in order to obtain further bifunctional LOX enzymes suitable for preparing at least one unsaturated C10-aldehyde from at least one PUFA substrate.

Particular examples of suitable mutants of bacterial LOX are:

Single and multiple mutants of any one of the polypeptides of SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, or 50, which mutants retain said enzymatic activity of a lipoxygenase, i.p. bifunctional LOX, which mutants are in particular selected from mutants comprising an amino acid sequence selected from SEQ ID NO: 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288 and 290; or encoded by a nucleotide sequence encoding a polypeptide retaining said enzymatic activity of a lipoxygenase, in particular selected from SEQ ID NO: 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287 and 289. Such bifunctional LOX mutants may show, if compared to the non-mutated parent enzyme, a different profile of features, like for example improved unsaturated C10-aldehyde productivity, different unsaturated C10-aldehyde product profile, different PUFA substrate profile, production of less side products, or combinations thereof. Provided are also mutants derived from SEQ ID NO: 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288 and 290, and having a degree of sequence identity of at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% to the respective native bacterial LOX amino acid sequence, while retaining said mutation profile in said key positions and preferably still showing said modified functional profile. In particular, such single or multiple mutants in key positions may be obtained by performing so-called conservative mutations.

A person of ordinary skill will be able to generate, based on the disclosed particular mutants, such further functional mutants. For example, conservative amino acid substitutions in one or more of the mutation positions listed in the subsequent Table may be performed in this respect.

Non-limiting examples of possible conservative amino acid residue substitutions are provided in the subsequent section of the description.

7. The polypeptide of any one of the embodiments 1 to 6 having the enzymatic activity of a bifunctional LOX and in particular of a combination of LOX and HPL activity.

8. The polypeptide of any one of the embodiments 1 to 7, comprising the ability of converting at least one polyunsaturated fatty acid (PUFA), in particular selected from omega-3 and omega-6 PUFA, to at least one mono- or polyunsaturated aliphatic aldehyde.

9. The polypeptide of embodiment 8, comprising the ability to convert at least one PUFA to at least one polyunsaturated aliphatic C10-aldehyde.

10. The polypeptide of embodiment 9, comprising the ability to convert at least one PUFA to at least one polyunsaturated aliphatic C10-aldehyde, selected from decadienals and decatrienals, each either in essentially pure stereoisomeric form or in the form of a mixture of at least two stereoisomers, preferably selected from 2E,4Z-decadienal, 2E,4E-decadienal, 2E,4Z,7Z-decatrienal, 2E,4E,7Z-decatrienal and mixtures thereof.

11. The polypeptide of any one of the embodiments 7 to 10, wherein said PUFA is selected from C16-C22-, in particular from C16-C20-PUFAs, more particularly selected from omega-3 C16-C20-PUFAs and omega-6 C16-C20-PUFAs.

12. The polypeptide of embodiment 11, wherein said PUFA is selected from

a) the C16-PUFA hexadecatrienoic acid (HTA),

b) the C18-PUFAs linoleic acid (LA), alpha-linolenic acid (ALA), gamma-linolenic acid (GLA) and stearidonic acid (SDA);

c) the C20-PUFAs arachidonic acid (ARA) and eicosapentaenoic acid (EPA);

d) the C22-PUFA docosahexaenoic acid (DHA).

13. A nucleic acid encoding the polypeptide of any one of embodiments 1 to 12 or the complement thereof.

14. The nucleic acid of embodiment 13, comprising a coding nucleotide sequence selected from

a) SEQ ID NO: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14 (CoLOX sequences);

b) SEQ ID NO: 16 and 17 (UfLOX2 sequences);

c) codon-optimized coding sequences according to SEQ ID NO: 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, and natural coding sequences according to SEQ ID NO: 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73 and 74;

d) nucleotide sequences encoding single and multiple mutants of any one of the sequences of c), encoding a polypeptide retaining said enzymatic activity of a lipoxygenase, in particular selected from SEQ ID NO: 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287 and 289;

e) SEQ ID NO: 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235 and 237;

f) a sequence having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to at least one of the sequences of a), b), c) or d); or

g) the complement of any one of the sequences of a), b), c), d), e) and f).

15. An expression vector comprising the coding nucleic acid of any one of embodiments 13 and 14.

16. The expression vector of embodiment 15, in the form of a viral vector, a bacteriophage or a plasmid.

17. The expression vector of embodiment 15 or 16, wherein the coding nucleic acid is linked to at least one regulatory sequence and, optionally, including at least one selection marker.

18. A recombinant non-human host organism or cell harboring at least one nucleic acid according to any one of embodiments 13 and 14 or harboring at least one expression vector of one of the embodiments 15 to 17.

19. The non-human host organism of embodiment 18, wherein said non-human host organism is a eukaryote or a prokaryote, in particular a plant, a bacterium or a fungus, more particularly a bacterium or yeast.

20. The non-human host organism of embodiment 19, wherein said bacterium is of the genus Escherichia or Bacillus, in particular E. coli, and said yeast is of the genus Saccharomyces, Yarrowia or Pichia, in particular S. cerevisiae, Y. lipolytica or P. pastoris.

21. The non-human host cell of embodiment 20, which is a plant cell, algae or seaweed.

22. A method for producing at least one polypeptide according to any one of embodiments 1 to 12 comprising:

a) culturing a non-human host organism or cell harboring at least one nucleic acid according to any one of embodiments 13 and 14 and expressing or over-expressing at least one polypeptide according to any one of embodiments 1 to 12;

b) optionally isolating said polypeptide from the non-human host organism or cell cultured in step a).

23. The method of embodiment 22, further comprising, prior to step a), providing a non-human host organism or cell with at least one nucleic acid according to any one of embodiments 13 or 14 so that it expresses or over-expresses the polypeptide according to any one of embodiments 1 to 12.

24. A method for preparing a mutant polypeptide capable of converting at least one polyunsaturated fatty acid (PUFA), in particular omega-3 or omega-6 PUFA, to at least one mono- or polyunsaturated aliphatic aldehyde, comprising the steps of:

a) selecting a nucleic acid according to any one of embodiments 13 and 14;

b) modifying the selected nucleic acid to obtain at least one mutant nucleic acid;

c) providing host cells or unicellular organisms with the mutant nucleic acid sequence to express a polypeptide encoded by the mutant nucleic acid sequence;

d) screening for at least one mutant polypeptide with activity in converting at least one polyunsaturated fatty acid (PUFA), in particular omega-3 or omega-6 PUFA, to at least one mono- or polyunsaturated aliphatic aldehyde;

e) optionally, if the mutated polypeptide has no desired activity, repeating the process steps a) to d) until a polypeptide with a desired activity is obtained; and

f) optionally, if a mutant polypeptide having a desired activity was identified in step d) or e), isolating the corresponding mutant nucleic acid.

25. A method for preparing at least one mono- or polyunsaturated aliphatic aldehyde, which method comprises

a) contacting at least one PUFA substrate with a polypeptide as defined in any one of the embodiments 1 to 12, or encoded by a nucleic acid as defined in any one of the embodiments 13 and 14, thereby converting said at least one PUFA compound to a reaction product comprising at least one mono- or polyunsaturated aliphatic aldehyde; and

b) optionally isolating at least one mono- or polyunsaturated aliphatic aldehyde as obtained in step a).

26. The method of embodiment 25, wherein step a) is performed in vivo in cell culture in the presence of oxygen, or in vitro in a liquid reaction medium in the presence of oxygen. If performed in vivo, said method comprises, prior to step a), introducing into a non-human host organism or cell, and optionally stably integrating into the respective genome, one or more nucleic acid molecules encoding one or more polypeptides having the enzyme activities required for performing the respective biocatalytic conversion step or steps.

27. The method of any one of embodiments 25 and 26, wherein step a) is carried out by cultivating a non-human host organism or cell expressing at least one of said polypeptides having the enzymatic activity of a preferably bifunctional LOX in the presence of a PUFA substrate under conditions conducive to the peroxidation and subsequent cleavage of at least one PUFA.

28. The method of embodiment 25, wherein said at least one mono- or polyunsaturated aliphatic aldehyde is selected from decadienals and decatrienals.

29. The method of embodiment 28, wherein said decadienal is selected from 2E,4E-decadienal and 2E,4Z-decadienal and mixtures thereof; and wherein said decatrienal is selected from 2E,4E,7Z-decatrienal and 2E,4Z,7Z-decatrienal and mixtures thereof.

30. The method of one of the embodiments 25 to 29, wherein said PUFA substrate is an isolated, essentially pure PUFA compound or a natural or synthetic composition comprising at least one PUFA convertible by said preferably bifunctional LOX.

31. The method of embodiment 30, wherein said natural PUFA composition is selected from a) borage oil (containing elevated proportions of GLA),

b) arachidonic oil (containing elevated proportions of ARA),

c) fish oil (containing elevated proportions of EPA),

d) linseed oil

e) echium oil

f) corresponding oil hydrolysates of a) to e);

g) mixtures of LA and ALA; and

h) mixtures containing at least two of a) to g).

32. The method of embodiment 30 or 31, wherein a preferably bifunctional LOX comprising an amino acid sequence of SEQ ID NO: 3, 6, 9, 12 or 15 (CoLOX) or a sequence having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity thereto is applied and the substrate is selected from

h) borage oil (containing elevated proportions of GLA) in order to produce as main product 2E,4Z-decadienal and/or 2E,4E-decadienal

i) evening primrose oil (containing elevated proportions of GLA) in order to produce as main product 2E,4Z-decadienal and/or 2E,4E-decadienal

j) arachidonic oil (containing elevated proportions of ARA) in order to produce as main product 2E,4Z-decadienal and/or 2E,4E-decadienal

k) echium seed oil (containing elevated proportions of SDA) in order to produce as main product 2E,4Z,7Z-decatrienal and/or 2E,4E,7Z-decatrienal

l) fish oil (containing elevated proportions of EPA) in order to produce as main product 2E,4Z,7Z-decatrienal and/or 2E,4E,7Z-decatrienal

m) linseed oil (containing elevated proportions of ALA) in order to produce as main product 2E,4Z,7Z-decatrienal and/or 2E,4E,7Z-decatrienal

n) micro algae oil (containing elevated proportions of DHA) in order to produce as main product 2E,4Z-decadienal, 2E,4E-decadienal, 2E,4Z,7Z-decatrienal and/or 2E,4E,7Z-decatrienal

o) LA in order to produce as main product 2E,4Z-decadienal and/or 2E,4E-decadienal

p) GLA in order to produce as main product 2E,4Z-decadienal and/or 2E,4E-decadienal

q) ARA in order to produce as main product 2E,4Z-decadienal and/or 2E,4E-decadienal

r) EPA in order to produce as main product 2E,4Z,7Z-decatrienal and/or 2E,4E,7Z-decatrienal
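The substrate and main-product pairs listed in embodiment 32 for the isolated fatty acids can be read as a simple lookup. The following Python sketch is purely illustrative and not part of the disclosed methods; the names `MAIN_PRODUCTS` and `expected_main_products` are hypothetical, and only the four isolated PUFAs of items o) to r) are covered.

```python
# Illustrative lookup of substrate -> main products, taken from items o) to r)
# of embodiment 32. All identifiers are hypothetical helper names.
DECADIENALS = ("2E,4Z-decadienal", "2E,4E-decadienal")
DECATRIENALS = ("2E,4Z,7Z-decatrienal", "2E,4E,7Z-decatrienal")

MAIN_PRODUCTS = {
    "LA": DECADIENALS,    # linoleic acid
    "GLA": DECADIENALS,   # gamma-linolenic acid
    "ARA": DECADIENALS,   # arachidonic acid
    "EPA": DECATRIENALS,  # eicosapentaenoic acid
}


def expected_main_products(substrate: str) -> tuple:
    """Return the main C10-aldehyde products listed for an isolated PUFA."""
    key = substrate.strip().upper()
    if key not in MAIN_PRODUCTS:
        raise KeyError(f"no main product listed for substrate {substrate!r}")
    return MAIN_PRODUCTS[key]
```

In this reading, the omega-6 substrates (LA, GLA, ARA) map to decadienals and the omega-3 substrate EPA maps to decatrienals.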

33. The method of embodiment 30 or 31, wherein a preferably bifunctional LOX comprising an amino acid sequence of SEQ ID NO: 18 (UfLOX2) or a sequence having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity thereto is applied and the substrate is selected from

a) borage oil (containing elevated proportions of GLA) in order to produce as main product 2E,4Z-decadienal and/or 2E,4E-decadienal

b) evening primrose oil (containing elevated proportions of GLA) in order to produce as main product 2E,4Z-decadienal and 2E,4E-decadienal

c) arachidonic oil (containing elevated proportions of ARA) in order to produce as main product 2E,4Z-decadienal and 2E,4E-decadienal

d) echium seed oil (containing elevated proportions of SDA) in order to produce as main product 2E,4Z,7Z-decatrienal and 2E,4E,7Z-decatrienal

e) fish oil (containing elevated proportions of EPA) in order to produce as main product 2E,4Z,7Z-decatrienal and 2E,4E,7Z-decatrienal

f) linseed oil (containing elevated proportions of ALA) in order to produce as main product 2E,4Z,7Z-decatrienal and 2E,4E,7Z-decatrienal

g) micro algae oil (containing elevated proportions of DHA oil in step a) to e)) in order to produce as main product 2E,4Z-decadienal, 2E,4E-decadienal, 2E,4Z,7Z-decatrienal and 2E,4E,7Z-decatrienal,

h) LA in order to produce as main product 2E,4Z-decadienal and/or 2E,4E-decadienal

i) GLA in order to produce as main product 2E,4Z-decadienal and/or 2E,4E-decadienal

j) ARA in order to produce as main product 2E,4Z-decadienal and/or 2E,4E-decadienal

k) EPA in order to produce as main product 2E,4Z,7Z-decatrienal and/or 2E,4E,7Z-decatrienal

34. The method of any one of the embodiments 25 to 31 or 33 wherein a crude or partially purified homogenate of Ulva fasciata containing said preferably bifunctional LOX activity is applied.

35. The method of embodiment 30 or 31, wherein a preferably bifunctional LOX comprising an amino acid sequence of SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, or 50 (bacterial LOXs) or a sequence having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity thereto is applied and the substrate is selected from:

a) GLA in order to produce as main product 2E,4Z-decadienal and/or 2E,4E-decadienal and

b) ARA in order to produce as main product 2E,4Z-decadienal and/or 2E,4E-decadienal.

36. The method of any one of embodiments 25 to 35, further comprising a chemical or enzymatic isomerization of an obtained mono- or polyunsaturated aliphatic aldehyde; or a chemical or enzymatic conversion of an obtained mono- or polyunsaturated aliphatic aldehyde to the corresponding alcohol or hydrocarbyl ester.

37. The method of any one of the embodiments 25 to 36, wherein the conversion of said PUFA substrate is performed in a liquid reaction medium supplemented with at least one cofactor, selected from metal salts soluble in said liquid reaction medium, in particular di- or polyvalent metal salts. Particular salts are halide salts like chloride, bromide or fluoride salts. Examples of metal ions are di- or polyvalent metal cations or alkaline earth metal cations, more particularly di- or polyvalent cations derived from Mg, Mn and Fe, like Mg2+, Mn2+ and Fe2+ or Fe3+. Optionally said method of any one of the preceding embodiments further comprises the processing of the obtained aldehyde to a corresponding derivative using chemical or biocatalytic synthesis or a combination of both. For example, such derivative may be selected from a hydrocarbon, an alcohol, diol, triol, acetal, ketal, acid, ether, amide, ketone, lactone, epoxide, acetate, glycoside and/or an ester.

38. A combination of at least two unsaturated C10-aldehyde isomers, selected from 2E,4Z-decadienal, 2E,4E-decadienal, 2E,4Z,7Z-decatrienal and 2E,4E,7Z-decatrienal, wherein a particular ratio between 2E,4E-decadienal and 2E,4Z-decadienal is from 3:1 to 1:9 and a particular ratio between 2E,4Z,7Z-decatrienal and 2E,4E,7Z-decatrienal is from 3:1 to 1:9.
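To make the ratio window of embodiment 38 concrete, the following Python sketch checks whether two measured isomer amounts fall between 3:1 and 1:9. It is an illustration only; the function name `ratio_within` and its parameters are hypothetical, and the amounts may be in any common unit (for example peak areas or mass fractions), which is an assumption not specified in the text.

```python
def ratio_within(first: float, second: float, high=(3, 1), low=(1, 9)) -> bool:
    """Check that first:second lies between high (3:1) and low (1:9), inclusive.

    Cross-multiplication avoids a division and keeps integer inputs exact.
    """
    if first <= 0 or second <= 0:
        raise ValueError("both amounts must be positive")
    not_above_high = first * high[1] <= high[0] * second  # first/second <= 3/1
    not_below_low = first * low[1] >= low[0] * second     # first/second >= 1/9
    return not_above_high and not_below_low
```

For the decadienals, `first` would be the amount of 2E,4E-decadienal and `second` that of 2E,4Z-decadienal; for the decatrienals, the order stated in the embodiment is 2E,4Z,7Z-decatrienal to 2E,4E,7Z-decatrienal.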

39. The use of a mono- or polyunsaturated aliphatic aldehyde or of a mixture of at least two of such aldehydes, and/or of corresponding conversion products and mixtures thereof as obtained by a method of any one of the embodiments 25 to 37 or of an isomer combination of embodiment 38 as flavour ingredient for the manufacture of food or feed compositions.

40. A food or feed composition supplemented by at least one flavour ingredient as defined in embodiment 39.

41. The use of a polypeptide which comprises the enzymatic activity of a lipoxygenase as defined in any one of the claims 1 to 12 or encoded by a nucleotide sequence as defined in any one of the claims 13 and 14 for preparing at least one mono- or polyunsaturated aliphatic aldehyde, in particular by a method as defined in any one of the claims 25 to 37.

b. Polypeptides applicable according to the invention

In this context the following definitions apply:

The generic terms “polypeptide” or “peptide”, which may be used interchangeably, refer to a natural or synthetic linear chain or sequence of consecutive, peptidically linked amino acid residues, comprising about 10 to up to more than 1,000 residues. Short chain polypeptides with up to 30 residues are also designated as “oligopeptides”.

The term “protein” refers to a macromolecular structure consisting of one or more polypeptides. The amino acid sequence of its polypeptide(s) represents the “primary structure” of the protein. The amino acid sequence also predetermines the “secondary structure” of the protein by the formation of special structural elements, such as alpha-helical and beta-sheet structures formed within a polypeptide chain. The arrangement of a plurality of such secondary structural elements defines the “tertiary structure” or spatial arrangement of the protein. If a protein comprises more than one polypeptide chain, said chains are spatially arranged forming the “quaternary structure” of the protein. A correct spatial arrangement or “folding” of the protein is a prerequisite of protein function. Denaturation or unfolding destroys protein function. If such destruction is reversible, protein function may be restored by refolding.

A typical protein function referred to herein is an “enzyme function”, i.e. the protein acts as biocatalyst on a substrate, for example a chemical compound, and catalyzes the conversion of said substrate to a product. An enzyme may show a high or low degree of substrate and/or product specificity.

A “polypeptide” referred to herein as having a particular “activity” thus implicitly refers to a correctly folded protein showing the indicated activity, as for example a specific enzyme activity.

Thus, unless otherwise indicated, the term “polypeptide” also encompasses the terms “protein” and “enzyme”.

Similarly, the term “polypeptide fragment” encompasses the terms “protein fragment” and “enzyme fragment”.

The term “isolated polypeptide” refers to an amino acid sequence that is removed from its natural environment by any method or combination of methods known in the art and includes recombinant, biochemical and synthetic methods.

“Target peptide” refers to an amino acid sequence which targets a protein, or polypeptide to intracellular organelles, i.e., mitochondria, or plastids, or to the extracellular space (secretion signal peptide). A nucleic acid sequence encoding a target peptide may be fused to the nucleic acid sequence encoding the amino terminal end, e.g., N-terminal end, of the protein or polypeptide, or may be used to replace a native targeting polypeptide.

The present invention also relates to "functional equivalents" (also designated as “analogs” or “functional mutations”) of the polypeptides specifically described herein.

For example, "functional equivalents" refer to polypeptides which, in a test used for determining enzymatic LOX activity, display an activity that is at least 1 to 10 %, or at least 20 %, or at least 50 %, or at least 75 %, or at least 90 % higher or lower than that of the polypeptides specifically described herein.

"Functional equivalents", according to the invention, also cover particular mutants, which, in at least one sequence position of an amino acid sequence stated herein, have an amino acid that is different from the one concretely stated, but nevertheless possess one of the aforementioned biological activities, as for example enzyme activity. "Functional equivalents" thus comprise mutants obtainable by one or more, like 1 to 20, in particular 1 to 15 or 5 to 10 amino acid additions, substitutions, in particular conservative substitutions, deletions and/or inversions, where the stated changes can occur in any sequence position, provided they lead to a mutant with the profile of properties according to the invention. Functional equivalence is in particular also provided if the activity patterns coincide qualitatively between the mutant and the unchanged polypeptide, i.e. if, for example, interaction with the same agonist or antagonist or substrate, however at a different rate (i.e. expressed by an EC50 or IC50 value or any other parameter suitable in the present technical field), is observed. Examples of suitable (conservative) amino acid substitutions are shown in the following table:

Original residue    Examples of substitution
Ala                 Ser
Arg                 Lys
Asn                 Gln; His
Asp                 Glu
Cys                 Ser
Gln                 Asn
Glu                 Asp
Gly                 Pro
His                 Asn; Gln
Ile                 Leu; Val
Leu                 Ile; Val
Lys                 Arg; Gln; Glu
Met                 Leu; Ile
Phe                 Met; Leu; Tyr
Ser                 Thr
Thr                 Ser
Trp                 Tyr
Tyr                 Trp; Phe
Val                 Ile; Leu
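The substitution table above can be encoded directly as a lookup and used to enumerate single-site conservative variants. The following Python sketch is illustrative only: the dictionary reproduces the table as given, while the function name `conservative_variants` and the use of a list of three-letter residue codes as sequence representation are assumptions made for the example.

```python
# Conservative substitutions exactly as listed in the table above
# (residues absent from the table, such as Pro, have no entry).
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": ["Ser"],
    "Arg": ["Lys"],
    "Asn": ["Gln", "His"],
    "Asp": ["Glu"],
    "Cys": ["Ser"],
    "Gln": ["Asn"],
    "Glu": ["Asp"],
    "Gly": ["Pro"],
    "His": ["Asn", "Gln"],
    "Ile": ["Leu", "Val"],
    "Leu": ["Ile", "Val"],
    "Lys": ["Arg", "Gln", "Glu"],
    "Met": ["Leu", "Ile"],
    "Phe": ["Met", "Leu", "Tyr"],
    "Ser": ["Thr"],
    "Thr": ["Ser"],
    "Trp": ["Tyr"],
    "Tyr": ["Trp", "Phe"],
    "Val": ["Ile", "Leu"],
}


def conservative_variants(residues, position):
    """Return all single-site conservative variants at a 0-based position.

    `residues` is a list of three-letter codes; the input list is not modified.
    """
    original = residues[position]
    variants = []
    for replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, []):
        variant = list(residues)       # copy so each variant is independent
        variant[position] = replacement
        variants.append(variant)
    return variants
```

For instance, a Lys at a key position would yield three candidate variants (Arg, Gln, Glu), each of which would still have to be tested for the desired activity.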

"Functional equivalents" in the above sense are also "precursors" of the polypeptides described herein, as well as "functional derivatives" and "salts" of the polypeptides.

"Precursors" are in that case natural or synthetic precursors of the polypeptides with or without the desired biological activity.

The expression "salts" means salts of carboxyl groups as well as salts of acid addition of amino groups of the protein molecules according to the invention. Salts of carboxyl groups can be produced in a known way and comprise inorganic salts, for example sodium, calcium, ammonium, iron and zinc salts, and salts with organic bases, for example amines, such as triethanolamine, arginine, lysine, piperidine and the like. Salts of acid addition, for example salts with inorganic acids, such as hydrochloric acid or sulfuric acid and salts with organic acids, such as acetic acid and oxalic acid, are also covered by the invention.

"Functional derivatives" of polypeptides according to the invention can also be produced on functional amino acid side groups or at their N-terminal or C-terminal end using known techniques. Such derivatives comprise for example aliphatic esters of carboxylic acid groups, amides of carboxylic acid groups, obtainable by reaction with ammonia or with a primary or secondary amine; N-acyl derivatives of free amino groups, produced by reaction with acyl groups; or O-acyl derivatives of free hydroxyl groups, produced by reaction with acyl groups.

"Functional equivalents" naturally also comprise polypeptides that can be obtained from other organisms, as well as naturally occurring variants. For example, areas of homologous sequence regions can be established by sequence comparison, and equivalent polypeptides can be determined on the basis of the concrete parameters of the invention.

"Functional equivalents" also comprise “fragments”, like individual domains or sequence motifs, of the polypeptides according to the invention, or N- and/or C-terminally truncated forms, which may or may not display the desired biological function. Preferably such “fragments” retain the desired biological function at least qualitatively.

"Functional equivalents" are, moreover, fusion proteins, which have one of the polypeptide sequences stated herein or functional equivalents derived therefrom and at least one further, functionally different, heterologous sequence in functional N-terminal or C-terminal association (i.e. without substantial mutual functional impairment of the fusion protein parts). Non-limiting examples of these heterologous sequences are e.g. signal peptides, histidine anchors or enzymes.

“Functional equivalents” which are also comprised in accordance with the invention are homologs to the specifically disclosed polypeptides. These have at least 60%, preferably at least 75%, in particular at least 80 or 85%, such as, for example, 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99%, homology (or identity) to one of the specifically disclosed amino acid sequences, calculated by the algorithm of Pearson and Lipman, Proc. Natl. Acad. Sci. (USA) 85(8), 1988, 2444-2448. A homology or identity, expressed as a percentage, of a homologous polypeptide according to the invention means in particular an identity, expressed as a percentage, of the amino acid residues based on the total length of one of the amino acid sequences described specifically herein.
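The notion of percentage identity based on the total length of a disclosed sequence can be illustrated with a small Python sketch. It assumes that the two sequences have already been aligned (gaps written as "-") by some external tool; it does not implement the Pearson and Lipman algorithm, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_query: str, aligned_reference: str) -> float:
    """Percent of reference residues matched identically by the query.

    Both inputs must be rows of the same pairwise alignment, with "-" as gap.
    The denominator is the ungapped length of the reference sequence.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    reference_length = sum(1 for r in aligned_reference if r != "-")
    if reference_length == 0:
        raise ValueError("reference sequence is empty")
    matches = sum(
        1
        for q, r in zip(aligned_query, aligned_reference)
        if r != "-" and q == r
    )
    return 100.0 * matches / reference_length
```

With a six-residue reference and one gapped position in the query, five of six reference residues match, giving about 83.3% identity.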

The identity data, expressed as a percentage, may also be determined with the aid of BLAST alignments, algorithm blastp (protein-protein BLAST), or by applying the Clustal settings specified herein below.

In the case of a possible protein glycosylation, "functional equivalents" according to the invention comprise polypeptides as described herein in deglycosylated or glycosylated form as well as modified forms that can be obtained by altering the glycosylation pattern.

Functional equivalents or homologues of the polypeptides according to the invention can be produced by mutagenesis, e.g. by point mutation, lengthening or shortening of the protein or as described in more detail below.

Functional equivalents or homologs of the polypeptides according to the invention can be identified by screening combinatorial databases of mutants, for example shortening mutants. For example, a variegated database of protein variants can be produced by combinatorial mutagenesis at the nucleic acid level, e.g. by enzymatic ligation of a mixture of synthetic oligonucleotides. There are a great many methods that can be used for the production of databases of potential homologues from a degenerated oligonucleotide sequence. Chemical synthesis of a degenerated gene sequence can be carried out in an automatic DNA synthesizer, and the synthetic gene can then be ligated in a suitable expression vector. The use of a degenerated genome makes it possible to supply all sequences in a mixture, which code for the desired set of potential protein sequences. Methods of synthesis of degenerated oligonucleotides are known to a person skilled in the art.
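How a single degenerate oligonucleotide stands for a mixture of concrete sequences can be sketched in Python as follows. The mapping uses the standard IUPAC nucleotide ambiguity codes; the function name `expand_degenerate` is hypothetical, and the sketch only enumerates sequences in silico and says nothing about how the oligonucleotides are synthesized.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo: str) -> list:
    """List every concrete sequence encoded by a degenerate oligonucleotide."""
    pools = []
    for base in oligo.upper():
        if base not in IUPAC:
            raise ValueError(f"unknown IUPAC code: {base!r}")
        pools.append(IUPAC[base])
    # Cartesian product over the allowed bases at each position.
    return ["".join(combo) for combo in product(*pools)]
```

The number of sequences grows multiplicatively with each degenerate position, which is why a short degenerate stretch can already cover a large set of variants.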

In the prior art, several techniques are known for the screening of gene products of combinatorial databases, which were produced by point mutations or shortening, and for the screening of cDNA libraries for gene products with a selected property. These techniques can be adapted for the rapid screening of the gene banks that were produced by combinatorial mutagenesis of homologues according to the invention. The techniques most frequently used for the screening of large gene banks, which are based on a high-throughput analysis, comprise cloning of the gene bank in expression vectors that can be replicated, transformation of the suitable cells with the resultant vector database and expression of the combinatorial genes in conditions in which detection of the desired activity facilitates isolation of the vector that codes for the gene whose product was detected. Recursive Ensemble Mutagenesis (REM), a technique that increases the frequency of functional mutants in the databases, can be used in combination with the screening tests, in order to identify homologues.

An embodiment provided herein provides orthologs and paralogs of polypeptides disclosed herein as well as methods for identifying and isolating such orthologs and paralogs. A definition of the terms “ortholog” and “paralog” is given below and applies to amino acid and nucleic acid sequences.

c. Coding nucleic acid sequences applicable according to the invention

In this context the following definitions apply:

The terms “nucleic acid sequence,” “nucleic acid,” “nucleic acid molecule” and “polynucleotide” are used interchangeably meaning a sequence of nucleotides. A nucleic acid sequence may be a single-stranded or double-stranded deoxyribonucleotide, or ribonucleotide of any length, and include coding and non-coding sequences of a gene, exons, introns, sense and anti-sense complementary sequences, genomic DNA, cDNA, miRNA, siRNA, mRNA, rRNA, tRNA, recombinant nucleic acid sequences, isolated and purified naturally occurring DNA and/or RNA sequences, synthetic DNA and RNA sequences, fragments, primers and nucleic acid probes. The skilled artisan is aware that the nucleic acid sequences of RNA are identical to the DNA sequences with the difference of thymine (T) being replaced by uracil (U). The term “nucleotide sequence” should also be understood as comprising a polynucleotide molecule or an oligonucleotide molecule in the form of a separate fragment or as a component of a larger nucleic acid.
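The relationship between DNA and RNA sequences, and between sense and anti-sense strands, can be shown with two short Python helpers. This is a minimal sketch for unambiguous DNA sequences written 5' to 3'; the function names `dna_to_rna` and `reverse_complement` are hypothetical.

```python
def _validated(dna: str) -> str:
    """Upper-case the sequence and reject anything but A, C, G, T."""
    seq = dna.upper()
    invalid = set(seq) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return seq


def dna_to_rna(dna: str) -> str:
    """Write a DNA sequence as RNA by replacing thymine (T) with uracil (U)."""
    return _validated(dna).replace("T", "U")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    complement = str.maketrans("ACGT", "TGCA")
    return _validated(dna).translate(complement)[::-1]
```

For example, `dna_to_rna("ATGTTT")` gives `"AUGUUU"`, and `reverse_complement("ATGC")` gives `"GCAT"`.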

An “isolated nucleic acid” or “isolated nucleic acid sequence” relates to a nucleic acid or nucleic acid sequence that is in an environment different from that in which the nucleic acid or nucleic acid sequence naturally occurs and can include those that are substantially free from contaminating endogenous material.

The term “naturally-occurring” as used herein as applied to a nucleic acid refers to a nucleic acid that is found in a cell of an organism in nature and which has not been intentionally modified by a human in the laboratory.

A “fragment” of a polynucleotide or nucleic acid sequence refers to contiguous nucleotides that are particularly at least 15 bp, at least 30 bp, at least 40 bp, at least 50 bp and/or at least 60 bp in length of the polynucleotide of an embodiment herein. Particularly the fragment of a polynucleotide comprises at least 25, more particularly at least 50, more particularly at least 75, more particularly at least 100, more particularly at least 150, more particularly at least 200, more particularly at least 300, more particularly at least 400, more particularly at least 500, more particularly at least 600, more particularly at least 700, more particularly at least 800, more particularly at least 900, more particularly at least 1000 contiguous nucleotides of the polynucleotide of an embodiment herein. Without being limited, the fragment of the polynucleotides herein may be used as a PCR primer, and/or as a probe, or for anti-sense gene silencing or RNAi.

As used herein, the term “hybridization” or “hybridizes under certain conditions” is intended to describe conditions for hybridization and washes under which nucleotide sequences that are significantly identical or homologous to each other remain bound to each other. The conditions may be such that sequences, which are at least about 70%, such as at least about 80%, and such as at least about 85%, 90%, or 95% identical, remain bound to each other. Definitions of low stringency, moderate, and high stringency hybridization conditions are provided herein below. Appropriate hybridization conditions can also be selected by those skilled in the art with minimal experimentation as exemplified in Ausubel et al. (1995, Current Protocols in Molecular Biology, John Wiley & Sons, sections 2, 4, and 6). Additionally, stringency conditions are described in Sambrook et al. (1989, Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Press, chapters 7, 9, and 11).

“Recombinant nucleic acid sequences” are nucleic acid sequences that result from the use of laboratory methods (for example, molecular cloning) to bring together genetic material from more than one source, creating or modifying a nucleic acid sequence that does not occur naturally and would not be otherwise found in biological organisms.

“Recombinant DNA technology” refers to molecular biology procedures to prepare a recombinant nucleic acid sequence as described, for instance, in Laboratory Manuals edited by Weigel and Glazebrook, 2002, Cold Spring Harbor Lab Press; and Sambrook et al, 1989, Cold Spring Harbor, NY, Cold Spring Harbor Laboratory Press.

The term “gene” means a DNA sequence comprising a region, which is transcribed into an RNA molecule, e.g., an mRNA in a cell, operably linked to suitable regulatory regions, e.g., a promoter. A gene may thus comprise several operably linked sequences, such as a promoter, a 5’ leader sequence comprising, e.g., sequences involved in translation initiation, a coding region of cDNA or genomic DNA, introns, exons, and/or a 3’ non-translated sequence comprising, e.g., transcription termination sites.

“Polycistronic” refers to nucleic acid molecules, in particular mRNAs, that can encode more than one polypeptide separately within the same nucleic acid molecule.

A “chimeric gene” refers to any gene which is not normally found in nature in a species, in particular, a gene in which one or more parts of the nucleic acid sequence are present that are not associated with each other in nature. For example the promoter is not associated in nature with part or all of the transcribed region or with another regulatory region. The term “chimeric gene” is understood to include expression constructs in which a promoter or transcription regulatory sequence is operably linked to one or more coding sequences or to an antisense, i.e., reverse complement of the sense strand, or inverted repeat sequence (sense and antisense, whereby the RNA transcript forms double stranded RNA upon transcription). The term "chimeric gene" also includes genes obtained through the combination of portions of one or more coding sequences to produce a new gene.

A “3’ UTR” or “3’ non-translated sequence” (also referred to as “3’ untranslated region,” or “3’ end”) refers to the nucleic acid sequence found downstream of the coding sequence of a gene, which comprises, for example, a transcription termination site and (in most, but not all eukaryotic mRNAs) a polyadenylation signal such as AAUAAA or variants thereof. After termination of transcription, the mRNA transcript may be cleaved downstream of the polyadenylation signal and a poly(A) tail may be added, which is involved in the transport of the mRNA to the site of translation, e.g., cytoplasm.

The term “primer” refers to a short nucleic acid sequence that is hybridized to a template nucleic acid sequence and is used for polymerization of a nucleic acid sequence complementary to the template.

The term “selectable marker” refers to any gene which upon expression may be used to select a cell or cells that include the selectable marker. Examples of selectable markers are described below. The skilled artisan will know that different antibiotic, fungicide, auxotrophic or herbicide selectable markers are applicable to different target species.

The invention also relates to nucleic acid sequences that code for polypeptides as defined herein.

In particular, the invention also relates to nucleic acid sequences (single-stranded and double-stranded DNA and RNA sequences, e.g. cDNA, genomic DNA and mRNA), coding for one of the above polypeptides and their functional equivalents, which can be obtained for example using artificial nucleotide analogs.

The invention relates both to isolated nucleic acid molecules, which code for polypeptides according to the invention or biologically active segments thereof, and to nucleic acid fragments, which can be used for example as hybridization probes or primers for identifying or amplifying coding nucleic acids according to the invention.

The present invention also relates to nucleic acids with a certain degree of “identity” to the sequences specifically disclosed herein. "Identity" between two nucleic acids means identity of the nucleotides, in each case over the entire length of the nucleic acid.

The “identity” between two nucleotide sequences (the same applies to peptide or amino acid sequences) is a function of the number of nucleotide residues (or amino acid residues) that are identical in the two sequences when an alignment of these two sequences has been generated. Identical residues are defined as residues that are the same in the two sequences in a given position of the alignment. The percentage of sequence identity, as used herein, is calculated from the optimal alignment by taking the number of residues identical between the two sequences, dividing it by the total number of residues in the shortest sequence and multiplying by 100. The optimal alignment is the alignment in which the percentage of identity is the highest possible. Gaps may be introduced into one or both sequences in one or more positions of the alignment to obtain the optimal alignment. These gaps are then taken into account as non-identical residues for the calculation of the percentage of sequence identity. Alignment for the purpose of determining the percentage of amino acid or nucleic acid sequence identity can be achieved in various ways using computer programs, for instance publicly available computer programs on the world wide web.

Particularly, the BLAST program (Tatusova and Madden, FEMS Microbiol Lett., 1999, 174:247-250) set to the default parameters, available from the National Center for Biotechnology Information (NCBI) website at ncbi.nlm.nih.gov/BLAST/bl2seq/wblast2.cgi, can be used to obtain an optimal alignment of protein or nucleic acid sequences and to calculate the percentage of sequence identity.

In another example, the identity may be calculated by means of the Vector NTI Suite 7.1 program of the company Informax (USA) employing the Clustal Method (Higgins DG, Sharp PM (1989)) with the following settings:

Multiple alignment parameters:

Gap opening penalty 10

Gap extension penalty 10

Gap separation penalty range 8

Gap separation penalty off

% identity for alignment delay 40

Residue specific gaps off

Hydrophilic residue gap off

Transition weighting 0

Pairwise alignment parameter:

FAST algorithm on

K-tuple size 1

Gap penalty 3

Window size 5

Number of best diagonals 5

Alternatively, the identity may be determined according to Chenna, et al. (2003), the web page: http://www.ebi.ac.uk/Tools/clustalw/index.html# and the following settings:

DNA Gap Open Penalty 15.0

DNA Gap Extension Penalty 6.66

DNA Matrix Identity

Protein Gap Open Penalty 10.0

Protein Gap Extension Penalty 0.2

Protein matrix Gonnet

Protein/DNA ENDGAP -1

Protein/DNA GAPDIST 4
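By way of illustration only, the percentage of sequence identity as defined above (identical residues divided by the number of residues in the shortest sequence, multiplied by 100, with gap positions counted as non-identical) may be sketched as follows. The sketch assumes that an alignment has already been generated by one of the programs mentioned above; the function name `percent_identity` and the gap character `-` are assumptions made for this example and are not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage identity of two pre-aligned sequences.

    Both inputs must be rows of the same alignment (equal length,
    gaps written with the `gap` character).
    """
    a, b = aligned_a.upper(), aligned_b.upper()
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")

    # A position counts as identical only if both rows carry the same
    # residue; a gap in either row is treated as non-identical.
    identical = sum(1 for x, y in zip(a, b) if x == y and x != gap)

    # Number of residues (gaps excluded) in each original sequence.
    len_a = sum(1 for x in a if x != gap)
    len_b = sum(1 for y in b if y != gap)
    shortest = min(len_a, len_b)
    if shortest == 0:
        raise ValueError("sequences must contain at least one residue")

    return 100.0 * identical / shortest
```

The function does not search for the optimal alignment itself; it only evaluates an alignment supplied to it, so the value obtained depends on the alignment program and settings used.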

All the nucleic acid sequences mentioned herein (single-stranded and double-stranded DNA and RNA sequences, for example cDNA and mRNA) can be produced in a known way by chemical synthesis from the nucleotide building blocks, e.g. by fragment condensation of individual overlapping, complementary nucleic acid building blocks of the double helix. Chemical synthesis of oligonucleotides can, for example, be performed in a known way, by the phosphoramidite method (Voet, Voet, 2nd edition, Wiley Press, New York, pages 896-897). The annealing of synthetic oligonucleotides and filling of gaps by means of the Klenow fragment of DNA polymerase and ligation reactions as well as general cloning techniques are described in Sambrook et al. (1989), see below.

The nucleic acid molecules according to the invention can in addition contain non-translated sequences from the 3' and/or 5' end of the coding genetic region.

The invention further relates to the nucleic acid molecules that are complementary to the concretely described nucleotide sequences or a segment thereof.

The nucleotide sequences according to the invention make possible the production of probes and primers that can be used for the identification and/or cloning of homologous sequences in other cellular types and organisms. Such probes or primers generally comprise a nucleotide sequence region which hybridizes under "stringent" conditions (as defined herein elsewhere) to at least about 12, preferably at least about 25, for example about 40, 50 or 75 successive nucleotides of a sense strand of a nucleic acid sequence according to the invention or of a corresponding antisense strand.
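As a simplified illustration of the length criterion stated above, the following sketch checks whether a candidate probe or primer contains a stretch of at least a given number of successive nucleotides that pairs with the sense strand or with the corresponding antisense strand. It assumes perfect Watson-Crick complementarity as a stand-in for hybridization under stringent conditions, which is a simplification; the function names and the default threshold of 12 are chosen for this example only.

```python
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous stretch occurring in both strings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def probe_pairs_with_target(probe: str, sense: str, min_run: int = 12) -> bool:
    """True if the probe pairs perfectly with >= min_run successive
    nucleotides of the sense strand or of its antisense strand."""
    probe, sense = probe.upper(), sense.upper()
    # Pairing with the sense strand: the probe's reverse complement
    # must appear in the sense strand.
    run_on_sense = longest_shared_run(reverse_complement(probe), sense)
    # Pairing with the antisense strand: the probe itself must appear
    # in the sense strand.
    run_on_antisense = longest_shared_run(probe, sense)
    return max(run_on_sense, run_on_antisense) >= min_run
```

In practice, hybridization also tolerates a limited number of mismatches depending on the stringency, which this sketch does not model.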

“Homologous” sequences include orthologous or paralogous sequences. Methods of identifying orthologs or paralogs including phylogenetic methods, sequence similarity and hybridization methods are known in the art and are described herein.

“Paralogs” result from gene duplication that gives rise to two or more genes with similar sequences and similar functions. Paralogs typically cluster together and are formed by duplications of genes within related plant species. Paralogs are found in groups of similar genes using pair-wise BLAST analysis or during phylogenetic analysis of gene families using programs such as CLUSTAL. In paralogs, consensus sequences can be identified that are characteristic of sequences within related genes having similar functions. “Orthologs”, or orthologous sequences, are sequences similar to each other because they are found in species that descended from a common ancestor. For instance, plant species that have common ancestors are known to contain many enzymes that have similar sequences and functions. The skilled artisan can identify orthologous sequences and predict the functions of the orthologs, for example, by constructing a phylogenetic tree for a gene family of one species using CLUSTAL or BLAST programs. A method for identifying or confirming similar functions among homologous sequences is to compare the transcript profiles in host cells or organisms, such as plants or microorganisms, overexpressing or lacking (in knockouts/knockdowns) related polypeptides. The skilled person will understand that genes having similar transcript profiles, with greater than 50% regulated transcripts in common, or with greater than 70% regulated transcripts in common, or greater than 90% regulated transcripts in common will have similar functions. Homologs, paralogs, orthologs and any other variants of the sequences herein are expected to function in a similar manner by causing the host cells or organisms, such as plants or microorganisms, to produce LOX proteins.

An "isolated" nucleic acid molecule is separated from other nucleic acid molecules that are present in the natural source of the nucleic acid and can moreover be substantially free from other cellular material or culture medium, if it is being produced by recombinant techniques, or can be free from chemical precursors or other chemicals, if it is being synthesized chemically.

A nucleic acid molecule according to the invention can be isolated by means of standard techniques of molecular biology and the sequence information supplied according to the invention. For example, cDNA can be isolated from a suitable cDNA library, using one of the concretely disclosed complete sequences or a segment thereof as hybridization probe and standard hybridization techniques (as described for example in Sambrook, (1989)).

In addition, a nucleic acid molecule comprising one of the disclosed sequences or a segment thereof, can be isolated by the polymerase chain reaction, using the oligonucleotide primers that were constructed on the basis of this sequence. The nucleic acid amplified in this way can be cloned in a suitable vector and can be characterized by DNA sequencing. The oligonucleotides according to the invention can also be produced by standard methods of synthesis, e.g. using an automatic DNA synthesizer.

Nucleic acid sequences according to the invention or derivatives thereof, homologues or parts of these sequences, can for example be isolated by usual hybridization techniques or the PCR technique from other bacteria, e.g. via genomic or cDNA libraries. These DNA sequences hybridize under standard conditions with the sequences according to the invention.

"Hybridize" means the ability of a polynucleotide or oligonucleotide to bind to an almost complementary sequence in standard conditions, whereas nonspecific binding does not occur between non-complementary partners in these conditions. For this, the sequences can be 90-100 % complementary. The property of complementary sequences of being able to bind specifically to one another is utilized for example in Northern Blotting or Southern Blotting or in primer binding in PCR or RT-PCR.

Short oligonucleotides of the conserved regions are used advantageously for hybridization. However, it is also possible to use longer fragments of the nucleic acids according to the invention or the complete sequences for the hybridization. These “standard conditions” vary depending on the nucleic acid used (oligonucleotide, longer fragment or complete sequence) or depending on which type of nucleic acid - DNA or RNA - is used for hybridization. For example, the melting temperatures for DNA:DNA hybrids are approx. 10 °C lower than those of DNA:RNA hybrids of the same length.

For example, depending on the particular nucleic acid, standard conditions mean temperatures between 42 and 58 °C in an aqueous buffer solution with a concentration between 0.1 and 5 x SSC (1 x SSC = 0.15 M NaCl, 15 mM sodium citrate, pH 7.2) or additionally in the presence of 50 % formamide, for example 42 °C in 5 x SSC, 50 % formamide. Advantageously, the hybridization conditions for DNA:DNA hybrids are 0.1 x SSC and temperatures between about 20 °C and 45 °C, preferably between about 30 °C and 45 °C. For DNA:RNA hybrids the hybridization conditions are advantageously 0.1 x SSC and temperatures between about 30 °C and 55 °C, preferably between about 45 °C and 55 °C. These stated temperatures for hybridization are examples of calculated melting temperature values for a nucleic acid with a length of approx. 100 nucleotides and a G + C content of 50 % in the absence of formamide. The experimental conditions for DNA hybridization are described in relevant genetics textbooks, for example Sambrook et al., 1989, and can be calculated using formulae that are known by a person skilled in the art, for example depending on the length of the nucleic acids, the type of hybrids or the G + C content. A person skilled in the art can obtain further information on hybridization from the following textbooks: Ausubel et al. (eds), (1985), Brown (ed) (1991).

"Hybridization" can in particular be carried out under stringent conditions. Such hybridization conditions are for example described in Sambrook (1989), or in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6.

As used herein, the term hybridization or hybridizes under certain conditions is intended to describe conditions for hybridization and washes under which nucleotide sequences that are significantly identical or homologous to each other remain bound to each other. The conditions may be such that sequences, which are at least about 70%, such as at least about 80%, and such as at least about 85%, 90%, or 95% identical, remain bound to each other. Definitions of low stringency, moderate, and high stringency hybridization conditions are provided herein.

Appropriate hybridization conditions can be selected by those skilled in the art with minimal experimentation as exemplified in Ausubel et al. (1995, Current Protocols in Molecular Biology, John Wiley & Sons, sections 2, 4, and 6). Additionally, stringency conditions are described in Sambrook et al. (1989, Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Press, chapters 7, 9, and 11).

As used herein, defined conditions of low stringency are as follows. Filters containing DNA are pretreated for 6 h at 40°C in a solution containing 35% formamide, 5x SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 µg/ml denatured salmon sperm DNA. Hybridizations are carried out in the same solution with the following modifications: 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 µg/ml salmon sperm DNA, 10% (wt/vol) dextran sulfate, and 5-20 x 10^6 cpm 32P-labeled probe is used. Filters are incubated in hybridization mixture for 18-20 h at 40°C, and then washed for 1.5 h at 55°C in a solution containing 2x SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS. The wash solution is replaced with fresh solution and incubated an additional 1.5 h at 60°C. Filters are blotted dry and exposed for autoradiography.

As used herein, defined conditions of moderate stringency are as follows. Filters containing DNA are pretreated for 7 h at 50°C in a solution containing 35% formamide, 5x SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 µg/ml denatured salmon sperm DNA. Hybridizations are carried out in the same solution with the following modifications: 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 µg/ml salmon sperm DNA, 10% (wt/vol) dextran sulfate, and 5-20 x 10^6 cpm 32P-labeled probe is used. Filters are incubated in hybridization mixture for 30 h at 50°C, and then washed for 1.5 h at 55°C in a solution containing 2x SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS. The wash solution is replaced with fresh solution and incubated an additional 1.5 h at 60°C. Filters are blotted dry and exposed for autoradiography.

As used herein, defined conditions of high stringency are as follows. Prehybridization of filters containing DNA is carried out for 8 h to overnight at 65°C in buffer composed of 6x SSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA, and 500 µg/ml denatured salmon sperm DNA. Filters are hybridized for 48 h at 65°C in the prehybridization mixture containing 100 µg/ml denatured salmon sperm DNA and 5-20 x 10^6 cpm of 32P-labeled probe. Washing of filters is done at 37°C for 1 h in a solution containing 2x SSC, 0.01% PVP, 0.01% Ficoll, and 0.01% BSA. This is followed by a wash in 0.1x SSC at 50°C for 45 minutes.

Other conditions of low, moderate, and high stringency well known in the art (e.g., as employed for cross-species hybridizations) may be used if the above conditions are inappropriate.

A detection kit for nucleic acid sequences encoding a polypeptide of the invention may include primers and/or probes specific for nucleic acid sequences encoding the polypeptide, and an associated protocol to use the primers and/or probes to detect nucleic acid sequences encoding the polypeptide in a sample. Such detection kits may be used to determine whether a plant, organism, microorganism or cell has been modified, i.e., transformed with a sequence encoding the polypeptide.

To test a function of variant DNA sequences according to an embodiment herein, the sequence of interest is operably linked to a selectable or screenable marker gene and expression of said reporter gene is tested in transient expression assays, for example, with microorganisms or with protoplasts or in stably transformed plants.

The invention also relates to derivatives of the concretely disclosed or derivable nucleic acid sequences. Thus, further nucleic acid sequences according to the invention can be derived from the sequences specifically disclosed herein and can differ from them by one or more, like 1 to 20, in particular 1 to 15 or 5 to 10 additions, substitutions, insertions or deletions of one or several (like for example 1 to 10) nucleotides, and furthermore code for polypeptides with the desired profile of properties.

The invention also encompasses nucleic acid sequences that comprise so-called silent mutations or have been altered, in comparison with a concretely stated sequence, according to the codon usage of a special original or host organism.

According to a particular embodiment of the invention, variant nucleic acids may be prepared in order to adapt their nucleotide sequence to a specific expression system. For example, bacterial expression systems are known to more efficiently express polypeptides if amino acids are encoded by particular codons. Due to the degeneracy of the genetic code, more than one codon may encode the same amino acid, so that multiple nucleic acid sequences can code for the same protein or polypeptide, all these DNA sequences being encompassed by an embodiment herein. Where appropriate, the nucleic acid sequences encoding the polypeptides described herein may be optimized for increased expression in the host cell. For example, nucleic acids of an embodiment herein may be synthesized using codons particular to a host for improved expression.
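The adaptation of codon usage described above may be illustrated by the following minimal sketch, in which each codon of a coding sequence is replaced by a synonymous codon preferred by the host, leaving the encoded polypeptide unchanged. The standard genetic code is assumed; the table `HYPOTHETICAL_PREFERRED` is invented for this example and does not represent measured codon usage of any actual host organism.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}

# Illustrative assumption only: one preferred codon for a few amino acids.
HYPOTHETICAL_PREFERRED = {"L": "CTG", "A": "GCG", "K": "AAA"}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: dict) -> str:
    """Replace codons by host-preferred synonymous codons."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        # Guard against a preference table that would change the protein.
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError(f"{new_codon} does not encode {amino_acid}")
        out.append(new_codon)
    return "".join(out)
```

Because every replacement is synonymous, `translate(recode(cds, table))` equals `translate(cds)` for any valid preference table, which reflects the degeneracy of the genetic code referred to above.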

The invention also encompasses naturally occurring variants, e.g. splicing variants or allelic variants, of the sequences described herein.

Allelic variants may have at least 60 % homology at the level of the derived amino acid, preferably at least 80 % homology, quite especially preferably at least 90 % homology over the entire sequence range (regarding homology at the amino acid level, reference should be made to the details given above for the polypeptides). Advantageously, the homologies can be higher over partial regions of the sequences.

The invention also relates to sequences that can be obtained by conservative nucleotide substitutions (i.e. as a result thereof the amino acid in question is replaced by an amino acid of the same charge, size, polarity and/or solubility).

The invention also relates to the molecules derived from the concretely disclosed nucleic acids by sequence polymorphisms. Such genetic polymorphisms may exist in cells from different populations or within a population due to natural allelic variation. These natural variations usually produce a variance of 1 to 5 % in the nucleotide sequence of a gene. Said polymorphisms may lead to changes in the amino acid sequence of the polypeptides disclosed herein. Allelic variants may also include functional equivalents.

Furthermore, derivatives are also to be understood to be homologs of the nucleic acid sequences according to the invention, for example animal, plant, fungal or bacterial homologs, shortened sequences, single-stranded DNA or RNA of the coding and noncoding DNA sequence. For example, homologs have, at the DNA level, a homology of at least 40 %, preferably of at least 60 %, especially preferably of at least 70 %, quite especially preferably of at least 80 % over the entire DNA region given in a sequence specifically disclosed herein.

Moreover, derivatives are to be understood to be, for example, fusions with promoters. The promoters that are added to the stated nucleotide sequences can be modified by at least one nucleotide exchange, at least one insertion, inversion and/or deletion, though without impairing the functionality or efficacy of the promoters. Moreover, the efficacy of the promoters can be increased by altering their sequence, or the promoters can be exchanged completely for more effective promoters, even from organisms of a different genus.

d. Generation of functional polypeptide mutants

Moreover, a person skilled in the art is familiar with methods for generating functional mutants, that is to say nucleotide sequences which code for a polypeptide with at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to any one of the amino acid related SEQ ID NOs as disclosed herein and/or encoded by a nucleic acid molecule comprising a nucleotide sequence having at least 70% sequence identity to any one of the nucleotide related SEQ ID NOs as disclosed herein.

Depending on the technique used, a person skilled in the art can introduce entirely random or else more directed mutations into genes or else noncoding nucleic acid regions (which are for example important for regulating expression) and subsequently generate genetic libraries. The methods of molecular biology required for this purpose are known to the skilled worker and are for example described in Sambrook and Russell, Molecular Cloning. 3rd Edition, Cold Spring Harbor Laboratory Press 2001. Methods for modifying genes and thus for modifying the polypeptide encoded by them have been known to the skilled worker for a long time, such as, for example:

- direct synthesis of the whole coding sequence with different methods (Sriram Kosuri and George M Church, 2014, Nature Methods, 11: 499-507),

- site-specific mutagenesis, where individual or several nucleotides of a gene are replaced in a directed fashion (Trower MK (Ed.) 1996; In vitro mutagenesis protocols. Humana Press, New Jersey),

- saturation mutagenesis, in which a codon for any amino acid can be exchanged or added at any point of a gene (Kegler-Ebo DM, Docktor CM, DiMaio D (1994) Nucleic Acids Res 22:1593; Barettino D, Feigenbutz M, Valcarel R, Stunnenberg HG (1994) Nucleic Acids Res 22:541; Barik S (1995) Mol Biotechnol 3:1),

- error-prone polymerase chain reaction, where nucleotide sequences are mutated by error-prone DNA polymerases (Eckert KA, Kunkel TA (1990) Nucleic Acids Res 18:3739),

- the SeSaM method (sequence saturation method), in which preferred exchanges are prevented by the polymerase (Schenk et al., Biospektrum, Vol. 3, 2006, 277-279),

- the passaging of genes in mutator strains, in which, for example owing to defective DNA repair mechanisms, there is an increased mutation rate of nucleotide sequences (Greener A, Callahan M, Jerpseth B (1996) An efficient random mutagenesis technique using an E.coli mutator strain. In: Trower MK (Ed.) In vitro mutagenesis protocols. Humana Press, New Jersey), or

- DNA shuffling, in which a pool of closely related genes is formed and digested and the fragments are used as templates for a polymerase chain reaction in which, by repeated strand separation and reassociation, full-length mosaic genes are ultimately generated (Stemmer WPC (1994) Nature 370:389; Stemmer WPC (1994) Proc Natl Acad Sci USA 91:10747).
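Of the methods listed above, saturation mutagenesis lends itself to a short in silico illustration: at a chosen codon position of a coding sequence, every alternative sense codon is substituted in turn, yielding a small library of variant sequences. The following is a sketch under the assumption of the standard genetic code; the function name `saturate_codon` and the exclusion of stop codons are choices made for this example, not requirements of the cited methods.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}
SENSE_CODONS = [codon for codon, aa in CODON_TABLE.items() if aa != "*"]


def saturate_codon(cds: str, codon_index: int) -> dict:
    """Return {replacement codon: variant sequence} for one codon position.

    codon_index is zero-based; the original codon is not included.
    """
    cds = cds.upper()
    start = codon_index * 3
    if len(cds) % 3 != 0 or not 0 <= start < len(cds):
        raise ValueError("invalid coding sequence or codon index")
    original = cds[start:start + 3]
    return {
        codon: cds[:start] + codon + cds[start + 3:]
        for codon in SENSE_CODONS
        if codon != original
    }
```

For a position holding a sense codon, the library contains the 60 remaining sense codons, which together with the original codon cover all 20 amino acids at that position.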

Using so-called directed evolution (described, inter alia, in Reetz MT and Jaeger K-E (1999), Topics Curr Chem 200:31; Zhao H, Moore JC, Volkov AA, Arnold FH (1999), Methods for optimizing industrial polypeptides by directed evolution, In: Demain AL, Davies JE (Ed.) Manual of industrial microbiology and biotechnology. American Society for Microbiology), a skilled worker can produce functional mutants in a directed manner and on a large scale. To this end, in a first step, gene libraries of the respective polypeptides are produced, for example using the methods given above. The gene libraries are expressed in a suitable way, for example by bacteria or by phage display systems.

The relevant genes of host organisms which express functional mutants with properties that largely correspond to the desired properties can be submitted to another mutation cycle. The steps of the mutation and selection or screening can be repeated iteratively until the present functional mutants have the desired properties to a sufficient extent. Using this iterative procedure, a limited number of mutations, for example 1, 2, 3, 4 or 5 mutations, can be performed in stages and assessed and selected for their influence on the activity in question. The selected mutant can then be submitted to a further mutation step in the same way. In this way, the number of individual mutants to be investigated can be reduced significantly.
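The iterative procedure described above (mutation, screening or selection, and resubmission of the selected mutant to a further cycle) may be pictured with the following toy simulation. It is a conceptual sketch only: real screening measures enzyme activity experimentally, whereas here the scoring function `toy_score`, its target string, the library size and the random seed are hypothetical stand-ins chosen for this example. Each variant carries 1 to 5 point mutations per round, in line with the limited number of mutations per stage mentioned above.

```python
import random


def mutate(seq: str, n_mut: int, rng: random.Random, alphabet: str = "ACGT") -> str:
    """Introduce n_mut point mutations at distinct random positions."""
    bases = list(seq)
    for pos in rng.sample(range(len(bases)), min(n_mut, len(bases))):
        bases[pos] = rng.choice([b for b in alphabet if b != bases[pos]])
    return "".join(bases)


def evolve(parent, score, target_score, rounds=50, library_size=100,
           max_mut=5, seed=0):
    """Repeat mutation and screening until the score reaches target_score.

    Returns the best sequence found and the best score after each round.
    """
    rng = random.Random(seed)
    best, best_score = parent, score(parent)
    history = [best_score]
    for _ in range(rounds):
        if best_score >= target_score:
            break
        # Mutation step: build a library of variants of the current best.
        library = [mutate(best, rng.randint(1, max_mut), rng)
                   for _ in range(library_size)]
        # Screening step: keep the top variant only if it is an improvement.
        candidate = max(library, key=score)
        if score(candidate) > best_score:
            best, best_score = candidate, score(candidate)
        history.append(best_score)
    return best, history


def toy_score(seq: str, target: str = "GATTACAGATTACA") -> int:
    """Hypothetical screen: number of positions matching an arbitrary target."""
    return sum(1 for a, b in zip(seq, target) if a == b)
```

Because a variant is only carried forward when it scores higher than its parent, the recorded scores never decrease from one round to the next, mirroring the stepwise selection of improved mutants.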

The results according to the invention also provide important information relating to structure and sequence of the relevant polypeptides, which is required for generating, in a targeted fashion, further polypeptides with desired modified properties. In particular, it is possible to define so-called “hot spots”, i.e. sequence segments that are potentially suitable for modifying a property by introducing targeted mutations.

Information can also be deduced regarding amino acid sequence positions, in the region of which mutations can be effected that should probably have little effect on the activity, and can be designated as potential “silent mutations”.

e. Constructs for expressing polypeptides of the invention

In this context the following definitions apply:

“Expression of a gene” encompasses “heterologous expression” and “overexpression” and involves transcription of the gene and translation of the mRNA into a protein. Overexpression refers to the production of the gene product as measured by levels of mRNA, polypeptide and/or enzyme activity in transgenic cells or organisms that exceeds levels of production in non-transformed cells or organisms of a similar genetic background.

“Expression vector” as used herein means a nucleic acid molecule engineered using molecular biology methods and recombinant DNA technology for delivery of foreign or exogenous DNA into a host cell. The expression vector typically includes sequences required for proper transcription of the nucleotide sequence. The coding region usually codes for a protein of interest but may also code for an RNA, e.g., an antisense RNA, siRNA and the like.

An “expression vector” as used herein includes any linear or circular recombinant vector including but not limited to viral vectors, bacteriophages and plasmids. The skilled person is capable of selecting a suitable vector according to the expression system. In one embodiment, the expression vector includes the nucleic acid of an embodiment herein operably linked to at least one “regulatory sequence”, which controls transcription, translation, initiation and termination, such as a transcriptional promoter, operator or enhancer, or an mRNA ribosomal binding site and, optionally, including at least one selection marker. Nucleotide sequences are “operably linked” when the regulatory sequence functionally relates to the nucleic acid of an embodiment herein.

An “expression system” as used herein encompasses any combination of nucleic acid molecules required for the expression of one, or the co-expression of two or more polypeptides either in vivo in a given expression host, or in vitro. The respective coding sequences may either be located on a single nucleic acid molecule or vector, as for example a vector containing multiple cloning sites, or on a polycistronic nucleic acid, or may be distributed over two or more physically distinct vectors. As a particular example there may be mentioned an operon comprising a promoter sequence, one or more operator sequences and one or more structural genes each encoding an enzyme as described herein.

As used herein, the terms "amplifying" and "amplification" refer to the use of any suitable amplification methodology for generating or detecting recombinant or naturally expressed nucleic acid, as described in detail below. For example, the invention provides methods and reagents (e.g., specific degenerate oligonucleotide primer pairs, oligo dT primer) for amplifying (e.g., by polymerase chain reaction, PCR) naturally expressed (e.g., genomic DNA or mRNA) or recombinant (e.g., cDNA) nucleic acids of the invention in vivo, ex vivo or in vitro.

“Regulatory sequence” refers to a nucleic acid sequence that determines expression level of the nucleic acid sequences of an embodiment herein and is capable of regulating the rate of transcription of the nucleic acid sequence operably linked to the regulatory sequence. Regulatory sequences comprise promoters, enhancers, transcription factors, promoter elements and the like.

A “promoter”, a “nucleic acid with promoter activity” or a “promoter sequence” is understood as meaning, in accordance with the invention, a nucleic acid which, when functionally linked to a nucleic acid to be transcribed, regulates the transcription of said nucleic acid. “Promoter” in particular refers to a nucleic acid sequence that controls the expression of a coding sequence by providing a binding site for RNA polymerase and other factors required for proper transcription including without limitation transcription factor binding sites, repressor and activator protein binding sites. The meaning of the term promoter also includes the term “promoter regulatory sequence”. Promoter regulatory sequences may include upstream and downstream elements that may influence transcription, RNA processing or stability of the associated coding nucleic acid sequence. Promoters include naturally-derived and synthetic sequences. The coding nucleic acid sequence is usually located downstream of the promoter with respect to the direction of the transcription starting at the transcription initiation site.

In this context, a “functional” or “operative” linkage is understood as meaning for example the sequential arrangement of one of the nucleic acids with a regulatory sequence. For example, the sequence with promoter activity, a nucleic acid sequence to be transcribed and optionally further regulatory elements, for example nucleic acid sequences which ensure the transcription of nucleic acids, and for example a terminator, are linked in such a way that each of the regulatory elements can perform its function upon transcription of the nucleic acid sequence. This does not necessarily require a direct linkage in the chemical sense. Genetic control sequences, for example enhancer sequences, can even exert their function on the target sequence from more remote positions or even from other DNA molecules. Preferred arrangements are those in which the nucleic acid sequence to be transcribed is positioned behind (i.e. at the 3’-end of) the promoter sequence so that the two sequences are joined together covalently. The distance between the promoter sequence and the nucleic acid sequence to be expressed recombinantly can be smaller than 200 base pairs, or smaller than 100 base pairs or smaller than 50 base pairs.

In addition to promoters and terminators, the following may be mentioned as examples of other regulatory elements: targeting sequences, enhancers, polyadenylation signals, selectable markers, amplification signals, replication origins and the like. Suitable regulatory sequences are described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990).

The term “constitutive promoter” refers to an unregulated promoter that allows for continual transcription of the nucleic acid sequence it is operably linked to. As used herein, the term “operably linked” refers to a linkage of polynucleotide elements in a functional relationship. A nucleic acid is “operably linked” when it is placed into a functional relationship with another nucleic acid sequence. For instance, a promoter, or rather a transcription regulatory sequence, is operably linked to a coding sequence if it affects the transcription of the coding sequence. Operably linked means that the DNA sequences being linked are typically contiguous. The nucleotide sequence associated with the promoter sequence may be of homologous or heterologous origin with respect to the plant to be transformed. The sequence also may be entirely or partially synthetic. Regardless of the origin, the nucleic acid sequence associated with the promoter sequence will be expressed or silenced in accordance with the properties of the promoter to which it is linked after binding to the polypeptide of an embodiment herein. The associated nucleic acid may code for a protein that is desired to be expressed or suppressed throughout the organism at all times or, alternatively, at a specific time or in specific tissues, cells, or cell compartments. Such nucleotide sequences particularly encode proteins conferring desirable phenotypic traits to the host cells or organism altered or transformed therewith. More particularly, the associated nucleotide sequence leads to the production of the product or products of interest as herein defined in the cell or organism. Particularly, the nucleotide sequence encodes a polypeptide having an enzyme activity as herein defined.

The nucleotide sequence as described herein above may be part of an “expression cassette”. The terms “expression cassette” and “expression construct” are used synonymously. The (preferably recombinant) expression construct contains a nucleotide sequence which encodes a polypeptide according to the invention and which is under genetic control of regulatory nucleic acid sequences.

In a process applied according to the invention, the expression cassette may be part of an “expression vector”, in particular of a recombinant expression vector.

An “expression unit” is understood as meaning, in accordance with the invention, a nucleic acid with expression activity which comprises a promoter as defined herein and, after functional linkage with a nucleic acid to be expressed or a gene, regulates the expression, i.e. the transcription and the translation of said nucleic acid or said gene. It is therefore in this connection also referred to as a “regulatory nucleic acid sequence”. In addition to the promoter, other regulatory elements, for example enhancers, can also be present. An “expression cassette” or “expression construct” is understood as meaning, in accordance with the invention, an expression unit which is functionally linked to the nucleic acid to be expressed or the gene to be expressed. In contrast to an expression unit, an expression cassette therefore comprises not only nucleic acid sequences which regulate transcription and translation, but also the nucleic acid sequences that are to be expressed as protein as a result of transcription and translation.

The terms “expression” or “overexpression” describe, in the context of the invention, the production or increase in intracellular activity of one or more polypeptides in a microorganism, which are encoded by the corresponding DNA. To this end, it is possible for example to introduce a gene into an organism, replace an existing gene with another gene, increase the copy number of the gene(s), use a strong promoter or use a gene which encodes for a corresponding polypeptide with a high activity; optionally, these measures can be combined.

Preferably such constructs according to the invention comprise a promoter 5’- upstream of the respective coding sequence and a terminator sequence 3’-downstream and optionally other usual regulatory elements, in each case in operative linkage with the coding sequence.

Nucleic acid constructs according to the invention comprise in particular a sequence coding for a polypeptide for example derived from the amino acid related SEQ ID NOs as described therein or the reverse complement thereof, or derivatives and homologs thereof and which have been linked operatively or functionally with one or more regulatory signals, advantageously for controlling, for example increasing, gene expression.
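The reference to a coding sequence "or the reverse complement thereof" can be illustrated with a minimal Python sketch. The function name and the example sequence are hypothetical and are not taken from the sequences of the invention; only the four unambiguous DNA bases are handled.

```python
# Illustrative sketch only: function name and example sequence are hypothetical.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence made of A, C, G and T."""
    invalid = set(seq) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unsupported symbols: {sorted(invalid)}")
    # Complement every base, then read the strand in the opposite direction.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGGCC"))  # GGCCAT
```

Applying the function twice returns the original sequence, which is a convenient sanity check when handling both strands of a construct.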

In addition to these regulatory sequences, the natural regulation of these sequences may still be present before the actual structural genes and optionally may have been genetically modified so that the natural regulation has been switched off and expression of the genes has been enhanced. The nucleic acid construct may, however, also be of simpler construction, i.e. no additional regulatory signals have been inserted before the coding sequence and the natural promoter, with its regulation, has not been removed. Instead, the natural regulatory sequence is mutated such that regulation no longer takes place and the gene expression is increased.

A preferred nucleic acid construct advantageously also comprises one or more of the already mentioned “enhancer” sequences in functional linkage with the promoter, which sequences make possible an enhanced expression of the nucleic acid sequence. Additional advantageous sequences may also be inserted at the 3’-end of the DNA sequences, such as further regulatory elements or terminators. One or more copies of the nucleic acids according to the invention may be present in a construct. In the construct, other markers, such as genes which complement auxotrophisms or antibiotic resistances, may also optionally be present so as to select for the construct.

Examples of suitable regulatory sequences are present in promoters such as cos, tac, trp, tet, trp-tet, lpp, lac, lpp-lac, lacIq, T7, T5, T3, gal, trc, ara, rhaP (rhaPBAD), SP6, lambda-PR or in the lambda-PL promoter, and these are advantageously employed in Gram-negative bacteria. Further advantageous regulatory sequences are present for example in the Gram-positive promoters amy and SPO2, in the yeast or fungal promoters ADC1, MFalpha, AC, P-60, CYC1, GAPDH, TEF, rp28, ADH. Artificial promoters may also be used for regulation.

For expression in a host organism, the nucleic acid construct is inserted advantageously into a vector such as, for example, a plasmid or a phage, which makes possible optimal expression of the genes in the host. Vectors are also understood as meaning, in addition to plasmids and phages, all the other vectors which are known to the skilled worker, that is to say for example viruses such as SV40, CMV, baculovirus and adenovirus, transposons, IS elements, phasmids, cosmids and linear or circular DNA or artificial chromosomes. These vectors are capable of replicating autonomously in the host organism or else chromosomally. These vectors are a further development of the invention. Binary or co-integration vectors are also applicable.

Suitable plasmids are, for example, in E. coli pLG338, pACYC184, pBR322, pUC18, pUC19, pKC30, pRep4, pHS1, pKK223-3, pDHE19.2, pHS2, pPLc236, pMBL24, pLG200, pUR290, pIN-III113-B1, λgt11 or pBdCI, in Streptomyces pIJ101, pIJ364, pIJ702 or pIJ361, in Bacillus pUB110, pC194 or pBD214, in Corynebacterium pSA77 or pAJ667, in fungi pALS1, pIL2 or pBB116, in yeasts 2alphaM, pAG-1, YEp6, YEp13 or pEMBLYe23 or in plants pLGV23, pGHlac+, pBIN19, pAK2004 or pDH51. The abovementioned plasmids are a small selection of the plasmids which are possible. Further plasmids are well known to the skilled worker and can be found for example in the book Cloning Vectors (Eds. Pouwels P. H. et al. Elsevier, Amsterdam-New York-Oxford, 1985, ISBN 0 444 904018). In a further development of the vector, the vector which comprises the nucleic acid construct according to the invention or the nucleic acid according to the invention can advantageously also be introduced into the microorganisms in the form of a linear DNA and integrated into the host organism’s genome via heterologous or homologous recombination. This linear DNA can consist of a linearized vector such as a plasmid or only of the nucleic acid construct or the nucleic acid according to the invention.

For optimal expression of heterologous genes in organisms, it is advantageous to modify the nucleic acid sequences to match the specific “codon usage” used in the organism. The “codon usage” can be determined readily by computer evaluations of other, known genes of the organism in question.
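A computer evaluation of codon usage of the kind mentioned above can be sketched in Python as follows. This is a minimal illustration, assuming the standard genetic code and a list of in-frame coding sequences supplied by the user; the function name and the toy input are hypothetical, and real analyses would typically use many genes of the host organism.

```python
from collections import Counter, defaultdict

# Standard genetic code, written in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_usage(coding_sequences):
    """Return, for each observed codon, its share among synonymous codons."""
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        # Read complete in-frame triplets only; skip codons with ambiguous bases.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1
    per_amino_acid = defaultdict(int)
    for codon, n in counts.items():
        per_amino_acid[CODON_TABLE[codon]] += n
    return {
        codon: n / per_amino_acid[CODON_TABLE[codon]]
        for codon, n in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical toy gene: Met-Lys-Lys-Lys-stop
    for codon, share in sorted(codon_usage(["ATGAAAAAAAAGTAA"]).items()):
        print(codon, CODON_TABLE[codon], round(share, 2))
```

The resulting table indicates which synonymous codon the organism prefers for each amino acid, which is the information needed when adapting a heterologous coding sequence.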

An expression cassette according to the invention is generated by fusing a suitable promoter to a suitable coding nucleotide sequence and a terminator or polyadenylation signal. Customary recombination and cloning techniques are used for this purpose, as are described, for example, in T. Maniatis, E.F. Fritsch and J. Sambrook, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989) and in T.J. Silhavy, M.L. Berman and L.W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1984) and in Ausubel, F.M. et al., Current Protocols in Molecular Biology, Greene Publishing Assoc. and Wiley Interscience (1987).

For expression in a suitable host organism, the recombinant nucleic acid construct or gene construct is advantageously inserted into a host-specific vector which makes possible optimal expression of the genes in the host. Vectors are well known to the skilled worker and can be found for example in “Cloning Vectors” (Pouwels P. H. et al., Ed., Elsevier, Amsterdam-New York-Oxford, 1985).

An alternative embodiment herein provides a method to “alter gene expression” in a host cell. For instance, the polynucleotide of an embodiment herein may be enhanced or overexpressed or induced in certain contexts (e.g. upon exposure to certain temperatures or culture conditions) in a host cell or host organism.

Alteration of expression of a polynucleotide provided herein may also result in ectopic expression which is a different expression pattern in an altered and in a control or wild-type organism. Alteration of expression occurs from interactions of polypeptide of an embodiment herein with exogenous or endogenous modulators, or as a result of chemical modification of the polypeptide. The term also refers to an altered expression pattern of the polynucleotide of an embodiment herein which is altered below the detection level or completely suppressed activity.

In one embodiment, provided herein is also an isolated, recombinant or synthetic polynucleotide encoding a polypeptide or variant polypeptide provided herein.

In one embodiment, several polypeptide encoding nucleic acid sequences are co-expressed in a single host, particularly under control of different promoters. In another embodiment, several polypeptide encoding nucleic acid sequences can be present on a single transformation vector or be co-transformed at the same time using separate vectors and selecting transformants comprising both chimeric genes. Similarly, one or more polypeptide encoding genes may be expressed in a single plant, cell, microorganism or organism together with other chimeric genes.

f. Hosts to be applied for the present invention

Depending on the context, the term “host” can mean the wild-type host or a genetically altered, recombinant host or both.

In principle, all prokaryotic or eukaryotic organisms may be considered as host or recombinant host organisms for the nucleic acids or the nucleic acid constructs according to the invention.

Using the vectors according to the invention, recombinant hosts can be produced, which are for example transformed with at least one vector according to the invention and can be used for producing the polypeptides according to the invention. Advantageously, the recombinant constructs according to the invention, described above, are introduced into a suitable host system and expressed. Preferably common cloning and transfection methods, known by a person skilled in the art, are used, for example co-precipitation, protoplast fusion, electroporation, retroviral transfection and the like, for expressing the stated nucleic acids in the respective expression system. Suitable systems are described for example in Current Protocols in Molecular Biology, F. Ausubel et al., Ed., Wiley Interscience, New York 1997, or Sambrook et al. Molecular Cloning: A Laboratory Manual. 2nd edition, Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989.

Advantageously, microorganisms such as bacteria, fungi or yeasts are used as host organisms. Advantageously, gram-positive or gram-negative bacteria are used, preferably bacteria of the families Enterobacteriaceae, Pseudomonadaceae, Rhizobiaceae, Streptomycetaceae, Streptococcaceae or Nocardiaceae, especially preferably bacteria of the genera Escherichia, Pseudomonas, Streptomyces, Lactococcus, Nocardia, Burkholderia, Salmonella, Agrobacterium, Clostridium or Rhodococcus. The genus and species Escherichia coli is quite especially preferred. Furthermore, other advantageous bacteria are to be found in the group of alpha-Proteobacteria, beta-Proteobacteria or gamma-Proteobacteria. Advantageously, yeasts of genera like Saccharomyces or Pichia are also suitable hosts.

Alternatively, entire plants or plant cells may serve as natural or recombinant host. As non-limiting examples, the following plants or cells derived therefrom may be mentioned: the genus Nicotiana, in particular Nicotiana benthamiana and Nicotiana tabacum (tobacco); as well as Arabidopsis, in particular Arabidopsis thaliana.

Depending on the host organism, the organisms used in the method according to the invention are grown or cultured in a manner known by a person skilled in the art. Culture can be batchwise, semi-batchwise or continuous. Nutrients can be present at the beginning of fermentation or can be supplied later, semicontinuously or continuously. This is also described in more detail below.

g. Recombinant production of polypeptides according to the invention

The invention further relates to methods for recombinant production of polypeptides according to the invention or functional, biologically active fragments thereof, wherein a polypeptide-producing microorganism is cultured, optionally the expression of the polypeptides is induced by applying at least one inducer inducing gene expression, and the expressed polypeptides are isolated from the culture. The polypeptides can also be produced in this way on an industrial scale, if desired.

The microorganisms produced according to the invention can be cultured continuously or discontinuously in the batch method or in the fed-batch method or repeated fed-batch method. A summary of known cultivation methods can be found in the textbook by Chmiel (Bioprozesstechnik 1. Einführung in die Bioverfahrenstechnik [Bioprocess technology 1. Introduction to bioprocess technology] (Gustav Fischer Verlag, Stuttgart, 1991)) or in the textbook by Storhas (Bioreaktoren und periphere Einrichtungen [Bioreactors and peripheral equipment] (Vieweg Verlag, Braunschweig/Wiesbaden, 1994)). The culture medium to be used must suitably meet the requirements of the respective strains. Descriptions of culture media for various microorganisms are given in the manual "Manual of Methods for General Bacteriology" of the American Society for Bacteriology (Washington D. C., USA, 1981).

These media usable according to the invention usually comprise one or more carbon sources, nitrogen sources, inorganic salts, vitamins and/or trace elements.

Preferred carbon sources are sugars, such as mono-, di- or polysaccharides. Very good carbon sources are for example glucose, fructose, mannose, galactose, ribose, sorbose, ribulose, lactose, maltose, sucrose, raffinose, starch or cellulose. Sugars can also be added to the media via complex compounds, such as molasses, or other by-products of sugar refining. It can also be advantageous to add mixtures of different carbon sources. Other possible carbon sources are oils and fats, for example soybean oil, sunflower oil, peanut oil and coconut oil, fatty acids, for example palmitic acid, stearic acid or linoleic acid, alcohols, for example glycerol, methanol or ethanol and organic acids, for example acetic acid or lactic acid.

Nitrogen sources are usually organic or inorganic nitrogen compounds or materials that contain these compounds. Examples of nitrogen sources comprise ammonia gas or ammonium salts, such as ammonium sulfate, ammonium chloride, ammonium phosphate, ammonium carbonate or ammonium nitrate, nitrates, urea, amino acids or complex nitrogen sources, such as corn-steep liquor, soya flour, soya protein, yeast extract, meat extract and others. The nitrogen sources can be used alone or as a mixture.

Inorganic salt compounds that can be present in the media comprise the chloride, phosphorus or sulfate salts of calcium, magnesium, sodium, cobalt, molybdenum, potassium, manganese, zinc, copper and iron.

Inorganic sulfur-containing compounds, for example sulfates, sulfites, dithionites, tetrathionates, thiosulfates, sulfides, as well as organic sulfur compounds, such as mercaptans and thiols, can be used as the sulfur source.

Phosphoric acid, potassium dihydrogen phosphate or dipotassium hydrogen phosphate or the corresponding sodium-containing salts can be used as the phosphorus source.

Chelating agents can be added to the medium, in order to keep the metal ions in solution. Especially suitable chelating agents comprise dihydroxyphenols, such as catechol or protocatechuate, or organic acids, such as citric acid. The fermentation media used according to the invention usually also contain other growth factors, such as vitamins or growth promoters, which include for example biotin, riboflavin, thiamine, folic acid, nicotinic acid, pantothenate and pyridoxine. Growth factors and salts often originate from the components of complex media, such as yeast extract, molasses, corn-steep liquor and the like. Moreover, suitable precursors can be added to the culture medium. The exact composition of the compounds in the medium is strongly dependent on the respective experiment and is decided for each specific case individually. Information on media optimization can be found in the textbook "Applied Microbiol. Physiology, A Practical Approach" (Ed. P.M. Rhodes, P.F. Stanbury, IRL Press (1997) p. 53-73, ISBN 0 19 963577 3). Growth media can also be obtained from commercial suppliers, such as Standard 1 (Merck) or BHI (brain heart infusion, DIFCO) and the like.

All components of the medium are sterilized, either by heat (20 min at 1.5 bar and 121 °C) or by sterile filtration. The components can either be sterilized together, or separately if necessary. All components of the medium can be present at the start of culture or can be added either continuously or batchwise.

The culture temperature is normally between 15°C and 45°C, preferably 25°C to 40°C and can be varied or kept constant during the experiment. The pH of the medium should be in the range from 5 to 8.5, preferably around 7.0. The pH for growing can be controlled during growing by adding basic compounds such as sodium hydroxide, potassium hydroxide, ammonia or ammonia water or acid compounds such as phosphoric acid or sulfuric acid. Antifoaming agents, for example fatty acid polyglycol esters, can be used for controlling foaming. To maintain the stability of plasmids, suitable selective substances, for example antibiotics, can be added to the medium. To maintain aerobic conditions, oxygen or oxygen-containing gas mixtures, for example ambient air, are fed into the culture. The temperature of the culture is normally in the range from 20°C to 45°C. The culture is continued until a maximum of the desired product has formed. This target is normally reached within 10 hours to 160 hours.

The fermentation broth is then processed further. Depending on requirements, the biomass can be removed from the fermentation broth completely or partially by separation techniques, for example centrifugation, filtration, decanting or a combination of these methods or can be left in it completely. If the polypeptides are not secreted in the culture medium, the cells can also be lysed and the product can be obtained from the lysate by known methods for isolation of proteins. The cells can optionally be disrupted with high-frequency ultrasound, high pressure, for example in a French press, by osmolysis, by the action of detergents, lytic enzymes or organic solvents, by means of homogenizers or by a combination of several of the aforementioned methods.

The polypeptides can be purified by known chromatographic techniques, such as molecular sieve chromatography (gel filtration), such as Q-sepharose chromatography, ion exchange chromatography and hydrophobic chromatography, and with other usual techniques such as ultrafiltration, crystallization, salting-out, dialysis and native gel electrophoresis. Suitable methods are described for example in Cooper, T. G., Biochemische Arbeitsmethoden [Biochemical processes], Verlag Walter de Gruyter, Berlin, New York or in Scopes, R., Protein Purification, Springer Verlag, New York, Heidelberg, Berlin.

For isolating the recombinant protein, it can be advantageous to use vector systems or oligonucleotides, which lengthen the cDNA by defined nucleotide sequences and therefore code for altered polypeptides or fusion proteins, which for example serve for easier purification. Suitable modifications of this type are for example so-called "tags" functioning as anchors, for example the modification known as hexa-histidine anchor or epitopes that can be recognized as antigens of antibodies (described for example in Harlow, E. and Lane, D., 1988, Antibodies: A Laboratory Manual. Cold Spring Harbor (N.Y.) Press). These anchors can serve for attaching the proteins to a solid carrier, for example a polymer matrix, which can for example be used as packing in a chromatography column, or can be used on a microtiter plate or on some other carrier.

At the same time these anchors can also be used for recognition of the proteins. For recognition of the proteins, it is moreover also possible to use usual markers, such as fluorescent dyes, enzyme markers, which form a detectable reaction product after reaction with a substrate, or radioactive markers, alone or in combination with the anchors for derivatization of the proteins.

h. Polypeptide immobilization

The enzymes or polypeptides according to the invention can be used free or immobilized in the method described herein. An immobilized enzyme is an enzyme that is fixed to an inert carrier. Suitable carrier materials and the enzymes immobilized thereon are known from EP-A-1 149 849, EP-A-1 069 183 and DE-OS 100 19 373 and from the references cited therein. Reference is made in this respect to the disclosure of these documents in their entirety. Suitable carrier materials include for example clays, clay minerals, such as kaolinite, diatomaceous earth, perlite, silica, aluminum oxide, sodium carbonate, calcium carbonate, cellulose powder, anion exchanger materials, synthetic polymers, such as polystyrene, acrylic resins, phenol formaldehyde resins, polyurethanes and polyolefins, such as polyethylene and polypropylene. For making the supported enzymes, the carrier materials are usually employed in a finely-divided, particulate form, porous forms being preferred. The particle size of the carrier material is usually not more than 5 mm, in particular not more than 2 mm (particle-size distribution curve). Similarly, when using dehydrogenase as whole-cell catalyst, a free or immobilized form can be selected. Carrier materials are e.g. Ca-alginate and carrageenan. Enzymes as well as cells can also be crosslinked directly with glutaraldehyde (cross-linking to CLEAs). Corresponding and other immobilization techniques are described for example in J. Lalonde and A. Margolin "Immobilization of Enzymes" in K. Drauz and H. Waldmann, Enzyme Catalysis in Organic Synthesis 2002, Vol. III, 991-1032, Wiley-VCH, Weinheim. Further information on biotransformations and bioreactors for carrying out methods according to the invention is also given for example in Rehm et al. (Ed.) Biotechnology, 2nd Edn, Vol 3, Chapter 17, VCH, Weinheim.

i. Reaction conditions for biocatalytic production methods of the invention

The reaction of the present invention may be performed under in vivo or in vitro conditions.

The at least one polypeptide/enzyme which is present during a method of the invention or an individual step of a multistep-method as defined herein above, can be present in living cells naturally or recombinantly producing the enzyme or enzymes, in harvested cells i.e. under in vivo conditions, or, in dead cells, in permeabilized cells, in crude cell extracts, in purified extracts, or in essentially pure or completely pure form, i.e. under in vitro conditions. The at least one enzyme may be present in solution or as an enzyme immobilized on a carrier. One or several enzymes may simultaneously be present in soluble and/or immobilised form. The methods according to the invention can be performed in common reactors, which are known to those skilled in the art, and in different ranges of scale, e.g. from a laboratory scale (few millilitres to dozens of litres of reaction volume) to an industrial scale (several litres to thousands of cubic meters of reaction volume). If the polypeptide is used in a form encapsulated by non-living, optionally permeabilized cells, in the form of a more or less purified cell extract or in purified form, a chemical reactor can be used. The chemical reactor usually allows controlling the amount of the at least one enzyme, the amount of the at least one substrate, the pH, the temperature and the circulation of the reaction medium. When the at least one polypeptide/enzyme is present in living cells, the process will be a fermentation. In this case the biocatalytic production will take place in a bioreactor (fermenter), where parameters necessary for suitable living conditions for the living cells (e.g. culture medium with nutrients, temperature, aeration, presence or absence of oxygen or other gases, antibiotics, and the like) can be controlled. Those skilled in the art are familiar with chemical reactors or bioreactors, e.g. with procedures for up-scaling chemical or biotechnological methods from laboratory scale to industrial scale, or for optimizing process parameters, which are also extensively described in the literature (for biotechnological methods see e.g. Crueger and Crueger, Biotechnologie - Lehrbuch der angewandten Mikrobiologie, 2. Ed., R. Oldenbourg Verlag, München, Wien, 1984).

Cells containing the at least one enzyme can be permeabilized by physical or mechanical means, such as ultrasound or radiofrequency pulses, French presses, or chemical means, such as hypotonic media, lytic enzymes and detergents present in the medium, or a combination of such methods. Examples of detergents are digitonin, n-dodecylmaltoside, octylglycoside, Triton® X-100, Tween® 20, deoxycholate, CHAPS (3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate), Nonidet® P40 (ethylphenolpoly(ethyleneglycolether)), and the like.

Instead of living cells, biomass of non-living cells containing the required biocatalyst(s) may be applied in the biotransformation reactions of the invention as well.

If the at least one enzyme is immobilised, it is attached to an inert carrier as described above.

The conversion reaction can be carried out batchwise, semi-batchwise or continuously. Reactants (and optionally nutrients) can be supplied at the start of reaction or can be supplied subsequently, either semi-continuously or continuously. The reaction of the invention, depending on the particular reaction type, may be performed in an aqueous, aqueous-organic or non-aqueous reaction medium.

An aqueous or aqueous-organic medium may contain a suitable buffer in order to adjust the pH to a value in the range of 5 to 11, like 6 to 10.

In an aqueous-organic medium an organic solvent miscible, partly miscible or immiscible with water may be applied. Non-limiting examples of suitable organic solvents are listed below. Further examples are mono- or polyhydric, aromatic or aliphatic alcohols, in particular polyhydric aliphatic alcohols like glycerol.

The non-aqueous medium is substantially free of water, i.e. it will contain less than about 1 wt.-% or 0.5 wt.-% of water.

Biocatalytic methods may also be performed in an organic non-aqueous medium. As suitable organic solvents there may be mentioned aliphatic hydrocarbons having for example 5 to 8 carbon atoms, like pentane, cyclopentane, hexane, cyclohexane, heptane, octane or cyclooctane; aromatic hydrocarbons, like benzene, toluene, xylenes, chlorobenzene or dichlorobenzene; aliphatic acyclic ethers, like diethylether, methyl-tert.-butylether, ethyl-tert.-butylether, dipropylether, diisopropylether, dibutylether; or mixtures thereof.

The concentration of the reactants/substrates may be adapted to the optimum reaction conditions, which may depend on the specific enzyme applied. For example, the initial substrate concentration may be in the range of 0.1 to 0.5 M, as for example 10 to 100 mM.
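When a molar substrate concentration is translated into a weighed amount, the conversion can be sketched as below. This is a minimal Python illustration; the function names are hypothetical, and the example substrate (arachidonic acid, molar mass about 304.47 g/mol) is an assumption chosen only because it is a common PUFA, not a value taken from the present description.

```python
def millimolar_to_grams_per_litre(conc_mM: float, molar_mass_g_per_mol: float) -> float:
    """Convert a concentration in mmol/L into g/L for a given molar mass."""
    return conc_mM / 1000.0 * molar_mass_g_per_mol


def substrate_mass_g(conc_mM: float, molar_mass_g_per_mol: float, volume_L: float) -> float:
    """Mass of substrate needed to reach conc_mM in a reaction volume of volume_L."""
    return millimolar_to_grams_per_litre(conc_mM, molar_mass_g_per_mol) * volume_L


if __name__ == "__main__":
    ARACHIDONIC_ACID = 304.47  # g/mol, assumed example substrate
    for conc in (10, 100, 500):  # mM values spanning the ranges mentioned above
        g_per_l = millimolar_to_grams_per_litre(conc, ARACHIDONIC_ACID)
        print(f"{conc} mM corresponds to {g_per_l:.2f} g/L")
```

For a different substrate only the molar mass needs to be exchanged; the reaction volume enters linearly.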

The reaction temperature may be adapted to the optimum reaction conditions, which may depend on the specific enzyme applied. For example, the reaction may be performed at a temperature in a range of from 0 to 70 °C, as for example 20 to 50 or 25 to 40 °C. Examples for reaction temperatures are about 30°C, about 35°C, about 37°C, about 40°C, about 45°C, about 50°C, about 55°C and about 60°C.

The process may proceed until equilibrium between the substrate and the product(s) is achieved, but may be stopped earlier. Usual process times are in the range from 1 minute to 25 hours, in particular 10 min to 6 hours, as for example in the range from 1 hour to 4 hours, in particular 1.5 hours to 3.5 hours. These parameters are non-limiting examples of suitable process conditions.
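The nested example ranges given above for pH, temperature and process time can be collected in a small lookup, for instance when screening planned experiments. The following Python sketch is illustrative only: the names are hypothetical, only one broad and one narrower range per parameter are taken from the text, and, since the stated values are non-limiting examples, a result of "outside" does not imply that a condition is unsuitable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Broad example ranges from the description (pH 5 to 11, 0 to 70 °C, 1 min to 25 h).
BROAD = {
    "pH": Range(5, 11),
    "temperature_C": Range(0, 70),
    "time_min": Range(1, 25 * 60),
}
# Narrower example ranges from the description (pH 6 to 10, 25 to 40 °C, 1 h to 4 h).
NARROW = {
    "pH": Range(6, 10),
    "temperature_C": Range(25, 40),
    "time_min": Range(60, 4 * 60),
}


def classify(conditions: dict) -> dict:
    """Label each parameter as 'narrow', 'broad' or 'outside' the example ranges."""
    labels = {}
    for name, value in conditions.items():
        if NARROW[name].contains(value):
            labels[name] = "narrow"
        elif BROAD[name].contains(value):
            labels[name] = "broad"
        else:
            labels[name] = "outside"
    return labels


if __name__ == "__main__":
    print(classify({"pH": 7.5, "temperature_C": 37, "time_min": 120}))
    print(classify({"pH": 10.5, "temperature_C": 60, "time_min": 10}))
```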

If the host is a transgenic plant, optimal growth conditions can be provided, such as optimal light, water and nutrient conditions, for example.

k. Product isolation and derivatization

The methodology of the present invention can further include a step of recovering an end or intermediate product, optionally in stereoisomerically or enantiomerically substantially pure form. The term “recovering” includes extracting, harvesting, isolating or purifying the compound from culture or reaction media. Recovering the compound can be performed according to any conventional isolation or purification methodology known in the art including, but not limited to, treatment with a conventional resin (e.g., anion or cation exchange resin, non-ionic adsorption resin, etc.), treatment with a conventional adsorbent (e.g., activated charcoal, silicic acid, silica gel, cellulose, alumina, etc.), alteration of pH, solvent extraction (e.g., with a conventional solvent such as an alcohol, ethyl acetate, hexane and the like), distillation, dialysis, filtration, concentration, crystallization, recrystallization, pH adjustment, lyophilization and the like.

Identity and purity of the isolated product may be determined by known techniques, like high-performance liquid chromatography (HPLC), gas chromatography (GC), spectroscopy (like IR, UV, NMR), colouring methods, TLC, NIRS, enzymatic or microbial assays (see for example: Patek et al. (1994) Appl. Environ. Microbiol. 60:133-140; Malakhova et al. (1996) Biotekhnologiya 11 27-32; Schmidt et al. (1998) Bioprocess Engineer. 19:67-70; Ullmann's Encyclopedia of Industrial Chemistry (1996) Vol. A27, VCH: Weinheim, pp. 89-90, pp. 521-540, pp. 540-547, pp. 559-566, 575-581 and pp. 581-587; Michal, G (1999) Biochemical Pathways: An Atlas of Biochemistry and Molecular Biology, John Wiley and Sons; Fallon, A. et al. (1987) Applications of HPLC in Biochemistry in: Laboratory Techniques in Biochemistry and Molecular Biology, Vol. 17).

The unsaturated C10 aldehyde compounds produced in any of the methods described herein can be converted to derivatives such as, but not limited to, hydrocarbons, esters, amides, glycosides, ethers, epoxides, ketones, alcohols, diols, acetals or ketals. The unsaturated C10 aldehyde derivatives can be obtained by a chemical method such as, but not limited to, oxidation, reduction, alkylation, acylation and/or rearrangement. Alternatively, the unsaturated C10 aldehyde derivatives can be obtained using a biochemical method by contacting the unsaturated C10 aldehyde with an enzyme such as, but not limited to, an oxidoreductase, a monooxygenase, a dioxygenase or a transferase. The biochemical conversion can be performed in vitro using isolated enzymes or enzymes from lysed cells, or in vivo using whole cells.

l. Fermentative production of unsaturated C10-aldehydes

The invention also relates to methods for the fermentative production of unsaturated C10 aldehydes.

A fermentation as used according to the present invention can, for example, be performed in stirred fermenters, bubble columns and loop reactors. A comprehensive overview of the possible method types, including stirrer types and geometric designs, can be found in "Chmiel: Bioprozesstechnik: Einführung in die Bioverfahrenstechnik, Band 1". In the process of the invention, the typical variants available are those known to those skilled in the art or explained, for example, in "Chmiel, Hammes and Bailey: Biochemical Engineering", such as batch, fed-batch, repeated fed-batch or else continuous fermentation with and without recycling of the biomass. Depending on the production strain, sparging with air, oxygen, carbon dioxide, hydrogen, nitrogen or appropriate gas mixtures may be effected in order to achieve a good yield (YP/S).

The culture medium that is to be used must satisfy the requirements of the particular strains in an appropriate manner. Descriptions of culture media for various microorganisms are given in the handbook "Manual of Methods for General Bacteriology" of the American Society for Bacteriology (Washington D. C., USA, 1981).

These media that can be used according to the invention may comprise one or more sources of carbon, sources of nitrogen, inorganic salts, vitamins and/or trace elements.

Preferred sources of carbon are sugars, such as mono-, di- or polysaccharides. Very good sources of carbon are for example glucose, fructose, mannose, galactose, ribose, sorbose, ribulose, lactose, maltose, sucrose, raffinose, starch or cellulose. Sugars can also be added to the media via complex compounds, such as molasses, or other by-products from sugar refining. It may also be advantageous to add mixtures of various sources of carbon. Other possible sources of carbon are oils and fats such as soybean oil, sunflower oil, peanut oil and coconut oil, fatty acids such as palmitic acid, stearic acid or linoleic acid, alcohols such as glycerol, methanol or ethanol and organic acids such as acetic acid or lactic acid.

Sources of nitrogen are usually organic or inorganic nitrogen compounds or materials containing these compounds. Examples of sources of nitrogen include ammonia gas or ammonium salts, such as ammonium sulfate, ammonium chloride, ammonium phosphate, ammonium carbonate or ammonium nitrate, nitrates, urea, amino acids or complex sources of nitrogen, such as corn-steep liquor, soybean flour, soybean protein, yeast extract, meat extract and others. The sources of nitrogen can be used separately or as a mixture.

Inorganic salt compounds that may be present in the media comprise the chloride, phosphate or sulfate salts of calcium, magnesium, sodium, cobalt, molybdenum, potassium, manganese, zinc, copper and iron.

Inorganic sulfur-containing compounds, for example sulfates, sulfites, dithionites, tetrathionates, thiosulfates and sulfides, but also organic sulfur compounds, such as mercaptans and thiols, can be used as sources of sulfur.

Phosphoric acid, potassium dihydrogenphosphate or dipotassium hydrogenphosphate or the corresponding sodium-containing salts can be used as sources of phosphorus.

Chelating agents can be added to the medium, in order to keep the metal ions in solution. Especially suitable chelating agents comprise dihydroxyphenols, such as catechol or protocatechuate, or organic acids, such as citric acid.

The fermentation media used according to the invention may also contain other growth factors, such as vitamins or growth promoters, which include for example biotin, riboflavin, thiamine, folic acid, nicotinic acid, pantothenate and pyridoxine. Growth factors and salts often come from complex components of the media, such as yeast extract, molasses, corn-steep liquor and the like. In addition, suitable precursors can be added to the culture medium. The precise composition of the compounds in the medium is strongly dependent on the particular experiment and must be decided individually for each specific case. Information on media optimization can be found in the textbook "Applied Microbiol. Physiology, A Practical Approach" (1997). Growing media can also be obtained from commercial suppliers, such as Standard 1 (Merck) or BHI (brain heart infusion, DIFCO), etc.

All components of the medium are sterilized, either by heating (20 min at 1.5 bar and 121 °C) or by sterile filtration. The components can be sterilized either together, or if necessary separately. All the components of the medium can be present at the start of growing, or optionally can be added continuously or by batch feed.

The temperature of the culture is normally between 15 °C and 45 °C, preferably 25 °C to 40 °C and can be kept constant or can be varied during the experiment. The pH value of the medium should be in the range from 5 to 8.5, preferably around 7.0. The pH value for growing can be controlled during growing by adding basic compounds such as sodium hydroxide, potassium hydroxide, ammonia or ammonia water or acid compounds such as phosphoric acid or sulfuric acid. Antifoaming agents, e.g. fatty acid polyglycol esters, can be used for controlling foaming. To maintain the stability of plasmids, suitable substances with selective action, e.g. antibiotics, can be added to the medium. Oxygen or oxygen-containing gas mixtures, e.g. the ambient air, are fed into the culture in order to maintain aerobic conditions. The temperature of the culture is normally from 20 °C to 45 °C. Culture is continued until a maximum of the desired product has formed. This is normally achieved within 1 hour to 160 hours.

The methodology of the present invention can further include a step of recovering said one or more unsaturated C10 aldehydes.

The term “recovering” includes extracting, harvesting, isolating or purifying the compound from culture media. Recovering the compound can be performed according to any conventional isolation or purification methodology known in the art including, but not limited to, treatment with a conventional resin (e.g., anion or cation exchange resin, non-ionic adsorption resin, etc.), treatment with a conventional adsorbent (e.g., activated charcoal, silicic acid, silica gel, cellulose, alumina, etc.), alteration of pH, solvent extraction (e.g., with a conventional solvent such as an alcohol, ethyl acetate, hexane and the like), distillation, dialysis, filtration, concentration, crystallization, recrystallization, pH adjustment, lyophilization and the like.

Before the intended isolation the biomass of the broth can be removed. Processes for removing the biomass are known to those skilled in the art, for example filtration, sedimentation and flotation. Consequently, the biomass can be removed, for example, with centrifuges, separators, decanters, filters or in flotation apparatus. For maximum recovery of the product of value, washing of the biomass is often advisable, for example in the form of a diafiltration. The selection of the method is dependent upon the biomass content in the fermenter broth and the properties of the biomass, and also the interaction of the biomass with the product of value.

In one embodiment, the fermentation broth can be sterilized or pasteurized. In a further embodiment, the fermentation broth is concentrated. Depending on the requirement, this concentration can be done batch wise or continuously. The pressure and temperature range should be selected such that firstly no product damage occurs, and secondly minimal use of apparatus and energy is necessary. The skillful selection of pressure and temperature levels for a multistage evaporation in particular enables saving of energy.

The following examples are illustrative only and are not meant to limit the scope of invention as set forth in the Summary, Description or in the Claims.

The numerous possible variations that will become immediately evident to a person skilled in the art after having considered the disclosure provided herein also fall within the scope of the invention.

Experimental Part

Materials:

Unless otherwise stated, all chemical and biochemical materials and microorganisms or cells employed herein are commercially available products.

Unless otherwise specified, recombinant proteins are cloned and expressed by standard methods, such as, for example, as described by Sambrook, J., Fritsch, E.F. and Maniatis, T., Molecular cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989.

Methods:

Functional expression of lipoxygenase

The coding sequences of lipoxygenase (LOX) were optimized according to the codon frequency of E. coli, synthesized and then subcloned into the pETDuet-1 plasmid (Novagen, Merck KGaA, Germany) for subsequent expression in E. coli. BL21 E. coli cells (Tiangen, China) were transformed with the pETDuet-LOX plasmids. The transformed cells were selected on LB-agar plates containing ampicillin (50 µg/mL final). Single colonies were used to inoculate 25 mL liquid LB medium containing ampicillin (50 µg/mL final). Cultures were incubated at 37°C with shaking at 200 rpm. After 4 hours of incubation, the cultures were cooled down to 20°C for 1.5 hours and IPTG (0.016 mM final) was added to induce protein expression. To express the proteins, the cultures were incubated for another 16 hours at 20°C with shaking at 200 rpm. The cultures were spun down and resuspended in 3 mL of reaction buffer (25 mM Tris-HCl, pH 7.5), followed by sonication to obtain the respective protein solution. The protein solution was transferred into a 20 mL SPME vial, and 30 µL fatty acid substrate and 10 µL internal standard (80 ppm alpha-ionone in ethanol) were added to the vial. After 10 min of incubation, the SPME-GC-MS method described below was used for analysis of decadienals and decatrienals.

Solid Phase Micro Extraction Gas Chromatography Mass Spectrometry (SPME-GC-MS)

The reaction mixture was concentrated on a solid phase microextraction (SPME) fiber assembly, polydimethylsiloxane/carboxen/divinylbenzene (57329-U, SUPELCO). The extraction was performed in headspace mode at 40 °C for 20 min. After extraction, the SPME fiber was introduced into the GC-MS inlet and maintained at 250°C for 5 min, and the products were analyzed on an Agilent 6890 series GC system equipped with a DB-1ms column, 30 m x 0.25 mm x 0.25 µm film thickness (P/N 122-0132, J&W Scientific Inc., Folsom, CA), and coupled with a 5975 series mass spectrometer (Agilent, US). The carrier gas was helium at a constant flow of 0.7 mL/min. Injection was in splitless mode with the injector temperature set at 250°C. The oven temperature was programmed from 50°C (5 min hold) to 250°C at 15°C/min (5 min hold). Identification of products was based on mass spectra and retention indices as well as respective product standards.
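The oven program stated above (50°C with a 5 min hold, a 15°C/min ramp to 250°C, then a 5 min hold) fixes the total run time and the nominal oven temperature at any point in the run. The following is a minimal sketch of that arithmetic only; the function names are hypothetical and it ignores any instrument equilibration or cool-down time.

```python
def oven_program_minutes(start_c, end_c, ramp_c_per_min, initial_hold_min, final_hold_min):
    """Total run time of a single-ramp GC oven program, in minutes."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


def oven_temperature_at(t_min, start_c=50.0, end_c=250.0,
                        ramp_c_per_min=15.0, initial_hold_min=5.0):
    """Nominal oven temperature (deg C) at time t_min for the same program."""
    if t_min <= initial_hold_min:
        return start_c
    # During the ramp the temperature rises linearly until the end value is reached.
    return min(end_c, start_c + ramp_c_per_min * (t_min - initial_hold_min))


run_time = oven_program_minutes(50.0, 250.0, 15.0, 5.0, 5.0)
print(f"total run time: {run_time:.1f} min")
print(f"oven temperature at 13.0 min: {oven_temperature_at(13.0):.1f} C")
```

Under these assumptions the run lasts about 23.3 min, so the decadienal retention times reported in the Examples (about 13 min) fall within the temperature ramp.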

Liquid Chromatography coupled to UV detection and Mass spectrometry (LC-UV/MS)

200 µL of reaction mixture was diluted with 800 µL acetonitrile and then put on ice for 30 min. Filtration through a 0.2 µm regenerated cellulose membrane (5190-5108, Agilent) was applied to remove the precipitated protein from the mixture. 1 µL of sample was injected into the LC for the quantification of decadienal as well as side products.
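Because the sample is diluted with acetonitrile before injection, a concentration read from an external calibration has to be multiplied by the dilution factor to refer back to the reaction mixture. The following sketch assumes additive volumes and no loss on filtration; the function names are hypothetical.

```python
def dilution_factor(sample_ul, diluent_ul):
    """Factor by which the sample is diluted, assuming additive volumes."""
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    return (sample_ul + diluent_ul) / sample_ul


def undiluted_concentration(measured, sample_ul=200.0, diluent_ul=800.0):
    """Back-calculate the concentration in the original reaction mixture."""
    return measured * dilution_factor(sample_ul, diluent_ul)


# 200 uL of reaction mixture plus 800 uL of acetonitrile is a five-fold dilution.
factor = dilution_factor(200.0, 800.0)
print(factor)
```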

Part A:

UfLOX Isolation and Characterization

Example 1: Seaweed sourcing and analysis for aroma aldehydes

Plant materials of Ulva fasciata (sample ID: PA-2017-0012) were collected from Nanao, Guangdong Province, China. One gram of smashed sample was put into a 20 mL vial for further SPME-GC-MS analysis. To determine whether U. fasciata contained decadienals or decatrienals, fresh samples were analyzed by SPME-GC-MS as described in the Methods section.

One gram of smashed U. fasciata sample was put into a 20 mL vial with 3 mL Tris-HCl buffer (pH 7.5). 30 µL fatty acid substrate (30 µL LA, ALA, GLA, EPA, ARA, borage oil hydrolysate, arachidonic oil hydrolysate, linseed oil hydrolysate or fish oil hydrolysate in 1 mL ethanol, respectively) and 10 µL internal standard (80 ppm alpha-ionone in ethanol) were added into the vial for incubation. After 10 min of incubation at RT, the SPME-GC-MS method described in the Methods section was used for analysis of decadienals and decatrienals.

GC-MS analysis revealed that there were limited amounts of 2E,4Z-decadienal (retention time 13.0 min) and 2E,4E-decadienal (retention time 13.25 min) (Figures 2, 4 and 5) in U. fasciata; however, after feeding with gamma-linolenic acid, the content thereof increased significantly (Table 1).

Table 1. SPME-GC-MS analysis for U. fasciata before and after feeding with gamma-linolenic acid (GLA)

Example 2: Transcriptome analysis and identification of UfLOX protein

Total RNA of U. fasciata was extracted using the RNeasy Plant Mini Kit (Qiagen, Germany). The total RNA sample was processed using the NEBNext® Ultra™ RNA Library Prep Kit for Illumina (NEB, USA) and the TruSeq PE Cluster Kit (Illumina, USA) and then sequenced on an Illumina HiSeq 2500 System. A total of 38 million paired-end reads of 2x150 bp were generated. The reads were processed using the Trinity software (http://trinityrnaseq.sf.net/) and 91564 transcripts with an N50 of 2262 were obtained. The obtained transcripts were translated into protein sequences and then functionally annotated by searching the NCBI non-redundant protein sequence database using the tblastx algorithm. One candidate LOX protein sequence was mined by Pfam search and relative expression level.
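The N50 value quoted for the assembly is the transcript length at which the transcripts of that length or longer together account for at least half of the total assembled bases. The following is a minimal sketch of that definition; it is not the Trinity implementation, and the example lengths are made up for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or transcript lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical transcript lengths in bp (not data from the assembly above).
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))
```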

The total RNA sample of U. fasciata was first reverse transcribed into cDNA using the SMARTer™ RACE cDNA Amplification Kit (Clontech, Takara, Japan). The products were then used as the template for gene cloning. The coding sequence of UfLOX2 (SEQ ID NO: 18) was amplified from the cDNA by using forward primer (5’-TCGTCCAACAGGTTCTCTT-3’) (SEQ ID NO:57) and reverse primer (5’-TTCTTTCCACTCACCGCCA-3’) (SEQ ID NO:58).
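Basic properties of the two primers given above, such as length and GC content, can be checked with a few lines of code. The sketch below also applies the Wallace rule (2°C per A/T and 4°C per G/C), which is only a rough rule-of-thumb melting temperature estimate and is not stated in the source; the function names are hypothetical.

```python
def clean(seq):
    """Upper-case a primer sequence and drop any whitespace."""
    return "".join(seq.split()).upper()


def gc_fraction(seq):
    """Fraction of G and C bases in the primer."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature estimate: 2 C per A/T plus 4 C per G/C."""
    seq = clean(seq)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


forward = "TCGTCCAACAGGTTCTCTT"   # SEQ ID NO:57 as given in the text
reverse = "TTCTTTCCACTCACCGCCA"   # SEQ ID NO:58 as given in the text

for name, primer in (("forward", forward), ("reverse", reverse)):
    print(name, len(clean(primer)), f"{gc_fraction(primer):.3f}", wallace_tm(primer))
```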

Example 3: Functional characterization of UfLOX2

The coding sequence of UfLOX2 was optimized according to the codon frequency of E. coli, synthesized and then subcloned into the pETDuet-1 plasmid for subsequent expression in E. coli. The following codon optimized sequence was applied: UfLOX2 (SEQ ID NO: 17), and plasmid pETDuet-UfLOX2 was obtained.

Functional expression of the gene was performed as described above in the Methods section to yield protein solution. The enzymatic activity of the UfLOX2 was evaluated as described below:

a) UfLOX2 (SEQ ID NO: 18) was tested by feeding with fatty acid substrates including gamma-linolenic acid (GLA), alpha-linolenic acid (ALA), linoleic acid (LA) and arachidonic acid (ARA) as follows:

The protein solution (3 mL) from E. coli containing UfLOX2 was put into a 20 mL SPME vial, and 30 µL fatty acid substrate (30 µL LA, ALA, GLA, EPA, ARA, borage oil, arachidonic oil, linseed oil or fish oil in 1 mL ethanol, respectively) and 10 µL internal standard (80 ppm alpha-ionone in ethanol) were added into the vial for incubation. After 10 min at RT, the SPME-GC-MS method described in the Methods section was used for analysis of decadienals and decatrienals.

UfLOX2 was able to produce decadienals (retention times 12.60 and 12.80 min) when fed with specific substrates (Table 2).

Table 2. SPME-GC-MS analysis for UfLOX2 before and after feeding with GLA and arachidonic acid (ARA)

b) To prove the lyase activity of UfLOX2, feeding experiments with fatty acid hydroperoxide were performed.

To test the HPL activity, UfLOX2 was produced in E. coli and cell lysates containing UfLOX2 were prepared. One aliquot of UfLOX2 was fed with GLA as a positive control for decadienal formation. A second and a third aliquot of UfLOX2 were denatured (boiled at 100°C for 20 min) and fed with GLA or GLA hydroperoxide (GLA-HPO), respectively, as negative controls: the second to exclude UfLOX2 functionality in making decadienal, and the third to show the conversion of GLA-HPO to decadienal in a non-UfLOX2 manner. A fourth aliquot of UfLOX2 was fed with GLA-HPO to prove its HPL activity in comparison with the third aliquot (i.e. non-UfLOX2 conversion of GLA-HPO to decadienal). In addition, the buffer used for preparing the UfLOX2 aliquots was also set up as a negative control to show the non-UfLOX2 conversion of GLA-HPO to decadienal.

To prepare the GLA hydroperoxide (GLA-HPO) intermediate, 50 mL of UfLOX2 protein solution was incubated with 0.5 mL GLA (60 mg/mL) and stored at room temperature for 10 min. The reaction mixture was then loaded on an HLB column (Waters, US; Part No. 186000118). The column was eluted with 10 mL of methanol to obtain GLA-HPO. After incubation for 1 hour, the reaction mixture was checked with LC-MS.

The results are summarized in Table 3 below.

Table 3. Decadienal peak areas by feeding heat-treated or non-treated UfLOX2 with gamma linolenic hydroperoxide intermediate

Part B:

CoLOX Isolation and Characterization

Example 4: Seaweed sourcing and analysis for aroma aldehydes

Plant materials of Cladophora oligoclada (sample ID: AVLH2012-011) were collected from Qingdao, Shandong Province, China. One gram of smashed sample was put into a 20 mL vial for further SPME-GC-MS analysis.

Identification of peaks was based on comparison of their mass spectra and retention indices with those in internal libraries. GC-MS analysis revealed four main components in C. oligoclada, as shown in Table 4 and Figures 3-7:

Table 4. Identified flavor aldehydes from C. oligoclada

Example 5: Transcriptome analysis and identification of CoLOX proteins

Total RNA was extracted from a fresh sample of C. oligoclada with the MiniBest plant RNA extraction kit by following protocol I provided with the kit (Cat. #9769 v201309Da, Takara, Japan). The total RNA sample was processed using the TruSeq PE Cluster Kit (Illumina, USA) and then sequenced on an Illumina MiSeq System. A total of 14 million paired-end reads of 2x251 bp were generated. The reads were processed using the Trinity software (http://trinityrnaseq.sf.net/) and 225917 transcripts with an N50 of 676 were obtained. The obtained transcripts were translated into protein sequences and then functionally annotated by searching the NCBI non-redundant protein sequence database using the tblastx algorithm. One candidate LOX protein sequence was mined by Pfam search and relative expression level.

The total RNA sample of C. oligoclada (sample ID: PA-2017-0028) was first reverse transcribed into cDNA using the SMARTer™ RACE cDNA Amplification Kit (Clontech, Takara, Japan). The products were then used as the template for gene cloning. By using forward primer (5’-CTCTCTCTCTTTCTCTCTGTTCT-3’) (SEQ ID NO:55) and reverse primer (5’-CTCGTTCCCTTACCGTCT-3’) (SEQ ID NO:56), several coding sequences of LOX were amplified from the cDNA, designated CoLOX-3 (SEQ ID NO:3) and its variants CoLOX-0317 (SEQ ID NO:6), CoLOX-19 (SEQ ID NO:9), CoLOX-22 (SEQ ID NO:12) and CoLOX-d4 (SEQ ID NO:15).

Example 6: Functional characterization of CoLOX proteins

The nucleic acid sequences of CoLOX-3 and its variants CoLOX-0317, CoLOX-19, CoLOX-22 and CoLOX-d4 were codon optimized according to the codon frequency of E. coli, synthesized and then subcloned into pETDuet-1 (Novagen, Merck KGaA, Germany) between the NdeI and KpnI sites, respectively, for subsequent expression in E. coli. The following codon optimized sequences were applied: CoLOX-3 (SEQ ID NO:2), CoLOX-0317 (SEQ ID NO:5), CoLOX-19 (SEQ ID NO:8), CoLOX-22 (SEQ ID NO:11) and CoLOX-d4 (SEQ ID NO:14), and the following plasmids were prepared: pETDuet-CoLOX-3, pETDuet-CoLOX-0317, pETDuet-CoLOX-19, pETDuet-CoLOX-22 and pETDuet-CoLOX-d4. Functional expression of the genes was performed as described above in the Methods section. The cultures were spun down and resuspended in 3 mL of buffer (25 mM Tris-HCl pH 7.5, 0.2 mM CaCl2), followed by a sonication step to obtain the respective protein solution.

The crude protein solutions (3 mL) of CoLOX-3, CoLOX-0317, CoLOX-19, CoLOX-22 and CoLOX-d4 were each put into a 20 mL SPME vial, and 30 µL fatty acid substrate (30 µL LA, ALA, GLA, EPA, ARA, borage oil, arachidonic oil, linseed oil or fish oil in 1 mL ethanol, respectively) and 10 µL internal standard (80 ppm alpha-ionone in ethanol) were added into each of the vials for incubation. After 10 min at RT, the SPME-GC-MS method described in the Methods section was used for analysis of decadienals and decatrienals. A mixture of buffer plus fatty acid plus internal standard was used as control.

All five proteins were able to produce decadienals and/or decatrienals when fed with specific substrates (see Tables 5 and 6 below and Figures 8, 9 and 10).

Table 5. Decadienals/internal standard peak ratio after feeding with GLA (normalized by protein concentration)

Table 6. Decadienals/internal standard peak ratio after feeding with fish oil hydrolysate (normalized by protein concentration)
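The quantity reported in Tables 5 and 6 is the decadienal peak area relative to the internal standard (alpha-ionone) peak area, normalized by protein concentration. The source does not state the exact normalization, so the sketch below assumes a simple division by the protein concentration in mg/mL; the function name and the example numbers are hypothetical.

```python
def normalized_peak_ratio(analyte_area, internal_standard_area, protein_mg_per_ml):
    """Analyte/internal-standard peak area ratio per mg/mL of protein.

    Assumption: normalization is a plain division by protein concentration.
    """
    if internal_standard_area <= 0 or protein_mg_per_ml <= 0:
        raise ValueError("internal standard area and protein concentration must be positive")
    ratio = analyte_area / internal_standard_area
    return ratio / protein_mg_per_ml


# Hypothetical peak areas and protein concentration, for illustration only.
value = normalized_peak_ratio(6.0e6, 2.0e6, 1.5)
print(value)
```

Dividing by the internal standard area corrects for run-to-run variation in SPME extraction and injection, while the protein normalization allows lysates with different expression levels to be compared.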

Part C:

Mining and Characterization of C10-Aldehyde-producing LOXs from Public Database

Example 7: Mining and selection of LOXs by sequence analysis

Because of its activity in producing decadienals and decatrienals, UfLOX2 was used as a query to search GenBank for further LOXs using BLASTP 2.8.0+ (https://blast.ncbi.nlm.nih.gov/Blast.cgi). A total of 188 LOXs were found by this approach, of which 181 are from cyanobacteria, 5 are from proteobacteria and 2 are from planctomycetes, with a sequence identity of less than 42% to UfLOX2. 16 LOXs were selected as examples on the basis of a relatively high sequence identity to UfLOX2 and of being representative of their own homologs, as listed in Table 7. Two known LOXs from red algae were listed and used for comparison. The remaining 83 LOXs with a relatively high identity to UfLOX2 are listed in the attached sequence listing as SEQ ID NO: 75 to 239 (amino acid and nucleic acid sequences). The start codons, where necessary, were set as ATG.

Table 7. List of bifunctional LOXs

Note: a. CoLOX-3 of present invention; b. UfLOX2 of present invention; c. AFQ59981.1 (PhLOX) was described for example by Jechan Lee et al., Environmental Pollution 227 (2017) 252-262; d. AGN54275.1 (PhLOX2) was described in Zhujun Zhu et al., PLoS One. (2015) 10(2):e0117351.

The amino acid sequence identity and the number of different residues are summarized in Table 8. The upper right block shows the number of unmatched amino acids, the lower left block shows the sequence identity. The sequence identities between the bacterial LOXs and UfLOX2 range from 32 to 42%. The sequence identities between the bacterial LOXs and CoLOX-3 range from 13 to 16%. The sequence identities between the bacterial LOXs and the red algae LOXs are less than 15%.

Table 8. The sequence identity of the LOXs.
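Both values in Table 8, the percent identity and the number of unmatched residues, can be derived from a pairwise alignment. The source does not state which alignment tool or identity denominator was used, so the sketch below assumes one common convention: identity is the number of identical columns divided by the number of aligned columns, with columns that are gaps in both sequences ignored. The function name and the toy sequences are hypothetical.

```python
def identity_and_mismatches(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, unmatched column count) for two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    unmatched = len(columns) - matches
    return 100.0 * matches / len(columns), unmatched


# Toy alignment, not taken from the sequence listing.
identity, unmatched = identity_and_mismatches("MKT-LV", "MRTALV")
print(f"{identity:.1f}% identity, {unmatched} unmatched positions")
```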

Example 8: Expression and Functional Characterization of the mined bacterial LOXs

The coding sequences of the bifunctional LOXs were optimized according to the codon frequency of E. coli, synthesized and then subcloned into the pETDuet-1 plasmid for subsequent expression in E. coli.

Functional expression of the mined LOXs was performed as described above in the Methods section. The different LOX proteins expressed by E. coli were released by sonication in 25 mM Tris-HCl buffer (pH 7.5) to deliver the respective LOX protein solution. Each LOX protein solution was transferred into a 20 mL SPME vial, and 30 µL of GLA and 10 µL of internal standard were added into the vial. After 10 min of incubation, SPME-GC-MS was used for analysis of decadienals, decatrienals and hexanal, and LC-UV was used for analysis of decadienals, decatrienals and GLA-HPO (the intermediate between gamma-linolenic acid and decadienals). SPME-GC-MS was performed as described in the Methods section above. GC-MS analysis revealed 2E,4Z-decadienal (retention time 13.0 min), 2E,4E-decadienal (retention time 13.25 min) and hexanal in the reactions for each LOX, but at different levels. LC-UV revealed 2E,4Z-decadienal (retention time 6.61 min at 280 nm), 2E,4E-decadienal (retention time 6.62 min at 280 nm) and GLA-HPO (retention time 6.90 min at 235 nm).

The selectivity, bifunctionality and productivity of the LOXs for the decadienal end product from the GLA substrate were calculated and are shown in Table 9 below (UfLOX2 and CoLOX-3 were included for comparison). The selectivity can be deduced by calculating the peak area ratio of decadienal (C10) to hexanal (C6). The productivity can be deduced from the peak area of decadienal. The bifunctionality can be deduced by calculating the peak area ratio of decadienal (C10) to GLA-HPO (intermediate). In this comparison, UfLOX2 remains the best bifunctional LOX, followed by the cyanobacterial bifunctional LOXs WP_002738122.1 (from Microcystis aeruginosa) and WP_015204462.1 (from Crinalium epipsammum). There are also some cyanobacterial LOXs with an activity similar to that of CoLOX-3, e.g. WP_039200563.1 and WP_073641301.1.

Table 9. The analytical data related to selectivity, bifunctionality and productivity of LOXs.
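The three figures of merit defined above reduce to simple peak area ratios. The sketch below takes the selectivity from the GC-MS areas (decadienal to hexanal) and the bifunctionality from the LC-UV areas (decadienal to GLA-HPO), as described in the text; using the LC-UV decadienal area as the productivity is an assumption, since the source does not say which detector that value comes from. All names and numbers are hypothetical.

```python
import math


def area_ratio(numerator, denominator):
    """Peak area ratio; infinite when the denominator peak is not detected."""
    if denominator == 0:
        return math.inf
    return numerator / denominator


def lox_metrics(gc_decadienal, gc_hexanal, lc_decadienal, lc_gla_hpo):
    """Selectivity, bifunctionality and productivity from peak areas."""
    return {
        # C10 aldehyde versus the C6 side product (hexanal).
        "selectivity": area_ratio(gc_decadienal, gc_hexanal),
        # End product versus the remaining hydroperoxide intermediate.
        "bifunctionality": area_ratio(lc_decadienal, lc_gla_hpo),
        # Assumption: productivity is read from the LC-UV decadienal peak.
        "productivity": lc_decadienal,
    }


# Hypothetical peak areas, for illustration only.
example = lox_metrics(gc_decadienal=8.0e6, gc_hexanal=2.0e6,
                      lc_decadienal=4.5e5, lc_gla_hpo=1.5e5)
print(example)
```

A high bifunctionality value indicates that little hydroperoxide intermediate is left over, i.e. that the lyase step keeps pace with the oxygenation step.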

Part D:

Further Characterization of LOXs of the Invention

Example 9: Characterization of the key amino acids in high performance LOXs

Experiment 1:

High performance LOXs, UfLOX2 and WP_002738122.1 and WP_015204462.1 were compared with the other less active LOXs in an alignment view (see Fig. 11). For mining potential key amino acid residues for high activity LOX, a number of potential positions were selected and marked by stars (indicating potential key positions) and dots (indicating other potential positions).

The importance of some of the identified conserved residues was investigated by mutagenesis studies. The results are summarized in Table 10.

Table 10. Modified amino acids of UfLOX2 for functional study.

Double mutation in positions 134 and 136

2) Numbering relates to SEQ ID NO: 18

In a first series of mutagenesis studies, some UfLOX2 mutants showed reduced activity (see Figure 12).

Based on these data, the following may be concluded:

1) D142/M143, N150, C174, K209, C268 and A331 are not key to the activity;

2) Cl, D134/R136, 061, A219, S256, C278, S305, C409 and G526 are key to the activity, as the corresponding mutants showed reduced activity at different levels.

Experiment 2:

The residues identified in Experiment 1 were introduced into several bacterial LOXs with several other residues that are conserved in bacterial LOXs to improve productivity. The designed sequences are as shown in Table 11.

Table 11. Modified amino acids of LOX mutants.

The coding sequences of the mutants of bacterial LOXs were optimized according to the codon frequency of E. coli, synthesized and then subcloned into the pETDuet-1 plasmid for subsequent expression in E. coli.

Functional expression of the mutants of bacterial LOXs was performed as described above in the Methods section. The different LOX proteins expressed by E. coli were released by sonication in 25 mM Tris-HCl buffer (pH 7.5) to deliver the respective LOX protein solution. Each LOX protein solution was transferred into a 20 mL SPME vial, and 30 µL of GLA and 10 µL of internal standard were added into the vial. After 10 min of incubation, LC-UV was used for analysis of decadienals. The productivity of the LOX mutants for the decadienal end product was calculated and is shown in Figure 18 (their natural counterparts were included for comparison). WP_002738122.1mut, WP_002738122.1mut2, WP_015204462.1mut, WP_015204462.1mut2, WP_015204462.1mut3, WP_015178512.1mut, WP_006635899.1mut and WP_099099431.1mut showed increased productivity compared to their natural counterparts.

Example 10: Characterization of the cofactors for LOXs

Previous studies indicated that five essential conserved amino acid residues in the active site are involved in the binding of cofactors, as described by Toralf Senger, et al., J. Biol. Chem. 2005, 280:7588-7596 (residues cited therein as His-585, His-590, His-774, Asn-778 and Ile-899). Both iron and manganese were reported to be cofactors, as described by Alexandra Andreou, et al., J. Biol. Chem. 2010. The algal LOXs and the bacterial LOXs also have these five conserved residues, as shown in the alignment in Figure 11, indicating that addition of iron and manganese might improve the activity of LOXs. We therefore tested the importance of iron and manganese for the activity of UfLOX2. The observed results clearly show the importance of adding manganese (and, to a lesser extent, magnesium) to the reaction for enhancing the enzyme activity. Manganese is therefore important for enabling or improving the LOX activity. The results are summarized in Figure 13. We also tested iron in the assay; however, the effect was not as significant as with manganese (data not shown).

Example 11: Downstream products profiling

In the case of making decadienal by using UfLOX2 and gamma-linolenic acid, the molar yield for total decadienal (including 2E,4Z-decadienal and 2E,4E-decadienal) is approx. 30-40%, based on quantification by LC-UV/MS with external calibration as described above in the Methods section. However, the overall percentage of decadienal, based on total volatiles, is above 90%.
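A molar yield of this kind relates the moles of decadienal found to the moles of GLA supplied. The sketch below assumes that one mole of GLA (C18H30O2, about 278.43 g/mol) can give at most one mole of decadienal (C10H16O, about 152.23 g/mol); the input masses are hypothetical and are not measured values from this Example.

```python
GLA_MW = 278.43         # g/mol, gamma-linolenic acid, C18H30O2
DECADIENAL_MW = 152.23  # g/mol, 2,4-decadienal, C10H16O


def molar_yield_percent(decadienal_mg, gla_mg):
    """Molar yield of decadienal on GLA, assuming 1:1 stoichiometry."""
    if gla_mg <= 0:
        raise ValueError("substrate mass must be positive")
    product_mmol = decadienal_mg / DECADIENAL_MW
    substrate_mmol = gla_mg / GLA_MW
    return 100.0 * product_mmol / substrate_mmol


# Hypothetical amounts: 30 mg GLA supplied, 5.74 mg total decadienal quantified.
yield_percent = molar_yield_percent(decadienal_mg=5.74, gla_mg=30.0)
print(f"{yield_percent:.1f} % molar yield")
```

Because decadienal has roughly half the molar mass of GLA, a mass yield of about 19% corresponds to a molar yield of about 35%.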

To obtain information on other downstream side products, UfLOX2 was produced in E. coli. Cell lysates (20 mL) containing UfLOX2 were fed with GLA at room temperature. 200 µL sample aliquots were taken and mixed with 800 µL acetonitrile for further LC-UV/MS analysis as described above in the Methods section. Nine side products (see Table 12) were proposed based on the observed mass spectra as well as comparison with the literature.

Table 12. Side products

All the publications mentioned in this application are incorporated by reference to disclose and describe the methods and/or materials in connection with which the publications are cited.

Listing of Sequences

Table 13. Sequences described and used herein

NA = Nucleic Acid Sequence

AA = Amino Acid Sequence

Remarks on the above listing:

• SEQ ID NO: 59-74 refer to the corresponding natural coding sequences for SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50

• SEQ ID NO: 75-238 are a pairwise representation of the corresponding putative coding sequences (the start codon was changed to “ATG” for the sequences which do not have “ATG”; the sequences are not codon optimized and are therefore considered as “natural” except for the start codon) and the amino acid sequences of the LOXs mined from NCBI

• SEQ ID NO: 239 - the amino acid sequence for 5MEE_A mined from NCBI

• SEQ ID NO: 253-290 refer to mutants of bacterial LOX

Encompassed within the general disclosure of the present description is any coding nucleic acid described herein without a 5’-terminal start codon triplet or with an artificial or natural start codon triplet.

1. CoLOX

Coding sequence for CoLOX-3 - SEQ ID NO: 1

ATGACGTCGTCTCCGACCGTCAGATCGATGGTAATGCTGGCCGTGCTGGCCGTCTCTGCC CTGGAGAGCGCGC

CCTGCGCCTCGGCCTTTGCCACGCTCCCCCGCGCCCTCGTACGACCGCAAGCCGCCC TCAAGTACCGAGCCGA

GGACAAAAATGACGTCGATGTCGCCCCGGCTGGTAGCACTGCCTCCGACGTGAGCAA GCCCGAAGGCAAGG

CCACCGCCGTCGCCAAGGGTACTGTCAACGCGCCCATCGAGGAGGCATGGAAGGTCT TCCGGTCTTTTTCCAA

CATGAACCAATGGATGCCCGTGTACGGCGAGTGGGAGGCCACGGGAGACTCAGTCGG AGACACCCGCACGT

TCAACTTCAAGGATCAGCCGACCTTCTTCACTACCGAGAGGCTTGTCGGCCTGGACG ACTCCCAGTACAAGAT

GAAGTACACCCTCGTCAACTGCAAGGGCTCGCCCGTGCCCATCGAATCTATCGACAC CATCGTCACCTTCACTG

CAAACG AT GAT GT G ACCG AGGTT G ACTGGCGCTCCTGGACG AAGT CGCCCATGGT CGACTT G ATCAAGGG AC

GTCAGGCGGCCGGGTATGCCGGCGGCATCGCAGCGCTCGACCGGTACCTGAACCCGT CCCTTGGCACCGTCG

AT GT C ACCAT C AAGT CGGCCG ACAACCT CG ATGGT G ATTTCCT GTCCAGCTCCT ACGCCACT CT CAT GGT CACG

G ACGCCG ACCCCG AGCAAGTGCATGCCAAGG AGTGGGGG ACG AGTCCT G AGTTT G ATGCCAAGCCCGT CCA

GTTCAGCCTGCTCAAGCCCGACTCCAAGCTCTACATGAGCGTCATGCTGACCAAGTG CGGCGTCGACGCCCCC

GTCGG AT ACGCCGT CTTT G ACAT CCAG AAG AGCCT CAAGTCCGGCG AG ACT GT G ACCGAG ACCTTT CAGCTCG

AGGGCAGCAACGATGCAACGTTGACGGTCGAGATGGAGCTCAACCTCCGGCAGGGCA GCATCCTCCCTCAAT

CCAAGGCCCAG AAG AAT CTGGCG ACCCTT GT CGCCCT CCAGCAGTCT GTCG AG AGGGTCCG AG ACCGCATCG

TTACTATCGGCAAGCTGGCCGGCGAGCCCGAGAAGTCGGTATGGGAGTACGAGCGAA AGTCCGGCCTTCCCA

AGTCCGTCAAGGGCCTTCCTCGGTCGGAAGTGCTGCCGCCGCACAAGATCGCCCTCA TGGTCGACGCCATCGC

CG AGT ACG CTT ACACCCAGTT CC AGCTCGT CC AGCGCCT GCTCCCCGT C AG AAACTCGT ACG ACCGGT ACGCC

GCTTACTTTGCCCCAGAAGGCGAGGAGTACGTTCCCATCCCGCAGATCCTCAAGGAC ATGACGTGGTCCACCG

ATGACGAGTTCATCCGCCAGATCTTCGCCGGCCTCAACCCTTTGCAAGTCGAGGTCG TCAAGAACAAGGCCGG

TCTGCCCTCCAAGTTGCAGGAGCTCAAGGCCAGGGACGGATCTGATGTCGATAAGCT CATCTCGGAAGGCCG

GTTGTATGTTCTGGACTACTCGGTCCTCAAGGATCTCGACCTCAACCGCAACGGTGT CACCCTGTACGCGCCGA

CGATGCTCATCTACCGCACTGGTGGTGACAAGCTCGACGTCCTCGGCATCATGCTCG AGCCCCGCCGTGACGA

TGCGCCCGTTTACACGCCCGACTCTGAAACTCCCAACAAGTTCCTTCTCGCCAAGTG CCACGTTGCCTGCGCTG

ACAACCAAGT GCACCAGTT CACGT ACCATCTCGGTT ACGCCCAT CTTGCCACGG AGCCACTTGCG ATCGCAAG

CCACAACGTCCTGGAGAAGAACAGCCATCCGCTCGGCATGTTCCTCAAGCCACACTT CCGCGACAACATTGGC

ATCAACTACCTCGCCCGGCAGACTCTTGTTGCCGACGAAGACGCCATCACAGACCAC ACCTTTGCCACGGGCA

CCGCGCAGGGCGTCAGTATGGTCGTCGACGCCTTCAAGTCGTACAACTTCCTCGAGT CTGGCTTGCCCGATGA

GCTGCGCCGTCGTGGATTCGAACGGTCGGACGACCTCAAGGTGTATCGCTACCGCGA CGATGGCTGGTTGGT

TTGGGACACGCTCTGGAAGTACGCCGAGGATATGGTCAACGAGCTGTACGGAACGGA CAACGATGTCACTGC

TGACAAGGTCGTCCAGGAGTGGGCGAGGGAAGCATCTGGCTCGGACACTGCCGACGT CCAGGGCTTTCCGG

AGTCCATCACGACCAAGTACATCCTCACAAAGGTCCTGACGACGATCATCTGGCAAG CGTCCGCCTTGCACTC

GGCTCTCAACTACATCCAATACCCGTACACTGCGACCCCCATCAACCGTGCCGCCTC CATCTTTGGACCGGTCC

CTGACGGCGAAGCGGATATCACCGAGCAGGACATCCTGGATGTCATCCCTGGTGGTC TGGACGATGAGAACA

ACCGTGGTCTGACCCTCTCCATCTTCCAAGGCCTGCTCTCGTGGCTCCTGCGCACTC CTGAGAACCCGACGCTG

G ACG AAGT CGGCAGCCCAATCCCG AACAGG AACAACCCCAT CG AGTGGGTCG AGTTCCGCT CG AAGT ACCCC

CAGGTCTACTACAACTTGGACCAGAACCTTGCCGTGGTGGAGAAGATCATCGAGGAG CGCAACAAGGGCCTT

GCTTCTCCGTACGAGGTGCTCCTTCCCAGCCACATTGCTGCCAGCATCAACATCTGA

Codon-optimized coding sequence of CoLOX-3 by Genscript genetic codon frequency of E. coli - SEQ ID NO: 2

ATGACCAGCAGCCCGACCGTGCGTAGCATGGTTATGCTGGCGGTTCTGGCGGTTAGC GCGCTGGAAAGCGCG

CCGTGCGCGAGCGCGTTTGCGACCCTGCCGCGTGCGCTGGTTCGTCCGCAGGCGGCG CTGAAGTATCGTGCG

GAAGACAAAAACGATGTTGATGTTGCGCCGGCGGGTAGCACCGCGAGCGATGTTAGC AAGCCGGAGGGTAA

AGCGACCGCGGTTGCGAAGGGCACCGTGAACGCGCCGATCGAGGAAGCGTGGAAAGT GTTCCGTAGCTTTA GCAACATGAACCAATGGATGCCGGTTTACGGCGAGTGGGAAGCGACCGGTGACAGCGTGG GCGATACCCGT

ACCTTCAACTTTAAGGACCAGCCGACCTTCTTTACCACCGAGCGTCTGGTGGGTCTG GACGATAGCCAATATA

AGATGAAATACACCCTGGTTAACTGCAAAGGCAGCCCGGTGCCGATCGAAAGCATTG ATACCATCGTTACCTT

CACCGCGAACGACGATGTGACCGAGGTTGACTGGCGTAGCTGGACCAAGAGCCCGAT GGTGGATCTGATTAA

AGGTCGTCAAGCGGCGGGTTATGCGGGTGGCATTGCGGCGCTGGACCGTTACCTGAA CCCGAGCCTGGGCAC

CGTGGACGTTACCATCAAGAGCGCGGATAACCTGGACGGCGATTTCCTGAGCAGCAG CTACGCGACCCTGAT

GGTTACCGATGCGGATCCGGAGCAGGTGCATGCGAAGGAATGGGGCACCAGCCCGGA GTTCGACGCGAAAC

CGGTTCAATTTAGCCTGCTGAAGCCGGATAGCAAACTGTATATGAGCGTGATGCTGA CCAAATGCGGTGTGG

ATGCGCCGGTTGGTTATGCGGTGTTCGATATTCAGAAGAGCCTGAAAAGCGGCGAGA CCGTTACCGAAACCT

TTCAACTGGAAGGCAGCAACGACGCGACCCTGACCGTGGAGATGGAACTGAACCTGC GTCAGGGTAGCATCC

TGCCGCAGAGCAAGGCGCAGAAGAACCTGGCGACCCTGGTTGCGCTGCAGCAAAGCG TGGAGCGTGTTCGT

GACCGTATTGTTACCATCGGTAAACTGGCGGGTGAACCGGAGAAGAGCGTGTGGGAG TATGAGCGTAAGAG

CGGTCTGCCGAAGAGCGTTAAAGGTCTGCCGCGTAGCGAAGTGCTGCCGCCGCACAA AATTGCGCTGATGGT

TGACGCGATCGCGGAGTACGCGTATACCCAGTTCCAACTGGTTCAGCGTCTGCTGCC GGTGCGTAACAGCTAC

GACCGTTATGCGGCGTACTTTGCGCCGGAAGGCGAGGAATACGTGCCGATTCCGCAA ATCCTGAAGGATATG

ACCTGGAGCACCGACGATGAGTTCATTCGTCAGATCTTTGCGGGTCTGAACCCGCTG CAAGTTGAAGTGGTTA

AGAACAAAGCGGGTCTGCCGAGCAAGCTGCAGGAGCTGAAAGCGCGTGACGGTAGCG ACGTGGATAAGCTG

ATCAGCGAGGGCCGTCTGTATGTTCTGGATTACAGCGTGCTGAAAGACCTGGATCTG AACCGTAACGGTGTTA

CCCTGTATGCGCCGACCATGCTGATTTACCGTACCGGTGGCGACAAACTGGATGTTC TGGGCATCATGCTGGA

ACCGCGTCGT G ACG AT GCGCCGGT GTACACCCCGG ACAGCG AG ACCCCG AACAAGTTCCT GCT GGCG AAAT G

CCACGTTGCGTGCGCGGATAACCAGGTGCACCAATTTACCTATCACCTGGGTTATGC GCACCTGGCGACCGAA

CCGCTGGCGATTGCGAGCCACAACGTGCTGGAGAAGAACAGCCACCCGCTGGGCATG TTCCTGAAACCGCAC

TTTCGTGACAACATCGGCATTAACTATCTGGCGCGTCAGACCCTGGTTGCGGACGAA GATGCGATCACCGATC

ATACCTTTGCGACCGGCACCGCGCAAGGCGTGAGCATGGTGGTTGACGCGTTCAAGA GCTACAACTTTCTGG

AAAGCGGTCTGCCGGATGAGCTGCGTCGTCGTGGTTTCGAGCGTAGCGACGATCTGA AGGTTTACCGTTATC

GTGACGATGGTTGGCTGGTGTGGGACACCCTGTGGAAATATGCGGAGGATATGGTTA ACGAACTGTACGGCA

CCGACAACGATGTGACCGCGGACAAAGTGGTTCAGGAGTGGGCGCGTGAAGCGAGCG GTAGCGACACCGCG

GATGTTCAAGGCTTCCCGGAAAGCATTACCACCAAGTATATCCTGACCAAAGTGCTG ACCACCATCATTTGGC

AAGCGAGCGCGCTGCACAGCGCGCTGAACTACATTCAATACCCGTATACCGCGACCC CGATTAACCGTGCGGC

GAGCATCTTTGGTCCGGTTCCGGACGGCGAGGCGGATATTACCGAACAGGACATTCT GGATGTGATCCCGGG

TGGCCTGGACGATGAGAACAACCGTGGTCTGACCCTGAGCATCTTCCAAGGTCTGCT GAGCTGGCTGCTGCGT

ACCCCGG AAAACCCG ACCCT GG ACG AGGTT GGT AGCCCG ATT CCG AACCGT AACAACCCG AT CGAGTGGGTT

GAATTTCGTAGCAAGTATCCGCAGGTGTACTATAACCTGGATCAAAACCTGGCGGTG GTTGAAAAGATCATTG

AGGAACGTAACAAAGGTCTGGCGAGCCCGTACGAGGTGCTGCTGCCGAGCCACATTG CGGCGAGCATCAAC

ATTTAA

Amino acid Sequence for CoLOX-3 - SEQ ID NO: 3

MTSSPTVRSMVMLAVLAVSALESAPCASAFATLPRALVRPQAALKYRAEDKN DVDVAPAGSTASDVSKPEGKATA

VAKGTVNAPI EEAWKVFRSFSNM NQWMPVYGEWEATGDSVGDTRTFNFKDQPTFFTTERLVGLDDSQYKMKY

TLVNCKGSPVPIESIDTIVTFTAN DDVTEVDWRSWTKSPMVDLIKGRQAAGYAGG IAALDRYLNPSLGTVDVTI KSA

DN LDGDFLSSSYATLMVTDADPEQVHAKEWGTSPEFDAKPVQFSLLKPDSKLYMSVMLTKCG VDAPVGYAVFDI

QKSLKSGETVTETFQLEGSN DATLTVEMELNLRQGSI LPQSKAQKNLATLVALQQSVERVRDRIVTIGKLAGEPEKSV

WEYERKSGLPKSVKGLPRSEVLPPH KIALMVDAIAEYAYTQFQLVQRLLPVRNSYDRYAAYFAPEGEEYVPIPQILKD

MTWSTDDEFIRQI FAGLN PLQVEVVKNKAGLPSKLQELKARDGSDVDKLISEGRLYVLDYSVLKDLDLNRNGVTLYA

PTMLIYRTGGDKLDVLGI MLEPRRDDAPVYTPDSETPNKFLLAKCHVACADNQVHQFTYH LGYAHLATEPLAIASH NVLEKNSHPLGM FLKPH FRDNIGINYLARQTLVADEDAITDHTFATGTAQGVSMVVDAFKSYNFLESGLPDELRRR

GFERSDDLKVYRYRDDGWLVWDTLWKYAEDMVNELYGTDN DVTADKVVQEWAREASGSDTADVQGFPESITT

KYILTKVLTTI IWQASALHSALNYIQYPYTATPI NRAASI FGPVPDGEADITEQDILDVIPGGLDDEN NRGLTLSI FQGLL

SWLLRTPENPTLDEVGSPIPNRN NPI EWVEFRSKYPQVYYNLDQNLAVVEKII EERN KGLASPYEVLLPSH IAASI NI

Coding sequence for CoLOX-0317 - SEQ ID NO: 4

ATGACGTCGTCTCCGACCGTCAGATCGATGGTAATGCTGGCCGTGCTGGCCGTCTAT GCCCTGGAGAGCACGC

CCTGCGCCTCGGCCTTTGCCACGCTCCCCCGCGCCCTCGTACGACCGCAAGCCGCCC TCAAGTACCGAGCCGA

GGACAAAAACGACGTCGATGTCGCCCCGGCTGGTAGCACTGCCTCCGACGTGAGCAA GCCCGAAGGCAAGG

CCACCGCCGTCGCCAAGGGTACGGTCAACGCGCCCATCGAGGAGGCATGGAAGGTCT TCCGGTCTTTTTCCAA

CAT G AACCAATGG AT GCCCGT GTACGGCG AGTGGG AGGCCACGGG AG ACT CCGT CGG AG ACACCCGCACGT

TCAACTTCAAGGATCAGCCGACCTTCTTCACTACCGAGAGGCTTGTCGGCCTGGACG ACTCCCAGTACAAGAT

GAAGTACACCCTCGTCAACTGCAAGGGCTCGCCCGTGCCCATCGAATCTATCGACAC CATCGTCACCTTCACTG

CAAACG AT GAT GT G ACCG AGGTT G ACTGGCGCTCCTGGACG AAGT CGCCCATGGT CGACTT G ATCAAGGG AC

GCCAGGCGGCCGGGTATGCCGGCGGCATCGCAGCGCTCGACCGGTACCTGAACCCGT CCCTTGGCACCGTCG

ATGTCACCATCAAGTCGGCCGACAACCTCGATGGCAATTTCCTGTCCAGCTCCTACG CCACTCTCATGGTCACG

G ACGCCG ACCCCG AGCAAGTGCATGCCAAGG AGTGGGGG ACG AGTCCT G AGTTT G ATGCCAAGCCCGT CCA

GTTCAGCCTGCTCAAGCCCGACTCCAAGCTCTACATGAGCGTCATGCTGACCAAGTA CGGCGTCGACACGCCC

GTCGG AT ACGCCGT CTTT G ACAT CCAG AAG AGCCT CAAGTCCGGCG AG ACT GT G ACCGAG ACCTTT CAGCTCG

AGGGCAGCAACGATGCAACGTTGACGGTCGAGATGGAGCTCAACCTCCGACAGGGCA GCGTCCTCCCTCAAT

CCAAGGCCCAG AAG AAT CTGGCG ACCCTT GT CGCCCT CCAGCAGTCT GTCG AG AGGGTCCG AG ACCGCATCG

TTACTATCGGCAAGCTGGCCGGCGAGCCCGAGAAGTCGGTATGGGAGTACGAGCGAA AGTCCGGCCTTCCCA

AGTCCGTCAAGGGCCTTCCTCGGTCGGAAGTGCTGCCGCCGCACAAGATCGCCCTCA TGGTCGACGCCATCGC

CG AGT ACG CTT ACACT CAGTT CC AGCTCGT CC AGCGCCT GCTCCCCGT C AG AAACTCGT ACG ACCGGT ACGCC

GCTTACTTTGCCCCAGAAGGCGAGGAATACGTTCCCATCCCGCAGATCCTCAAGGAC ATGACGTGGTCCACCG

ATGACGAGTTCATCCGCCAGATCTTTGCCGGCCTCAACCCGTTGCAAGTCGAGGTCG TCAAGAACAAGGCCGG

TCTGCCCTCCAAGTTGCAGGAGCTCAAGGCCAAGGACGGATCTGATGTCGATAAGCT CATCTCGGAAGGCCG

GTTGTATGTTCTGGACTACTCGGTCCTCAAGGATCTCGACCTCGACCGCAACGGTGT CACCCTGTACGCGCCG

ACGATGCTCATCTACCGCACTGGTGGTGACAAGCTCGACGTCCTCGGCATCATGCTT GAGCCCCGCCGTGACG

ACGCGCCCGTTTACACGCCCGACTCTGAAACTCCCAACAAGTTCCTTCTCGCCAAGT GCCACGTTGCCTGCGCT

GACAACCAAGTGCACCAGTTCACGTACCATCTCGGTTACGCCCATCTTGCCACAGAG CCACTTGCGATTGCAA

GCCACAACGTCCTGGAGAAGAACAGCCATCCGCTCGGCATGTTCCTCAAGCCACACT TCCGCGACAACATCGG

CATCAACTACCTCGCCCGACAGACTCTTGTTGCCGACGAAGACGCCATCACAGACCA CACCTTTGCCACGGGC

ACCGCGCAGGGCGTCAGTATGGTCGTCGACGCCTTCAAGTCGTACAACTTCCTCGAG TCTGGCTTGCCCGATG

AGCTGCGCCGTCGTGGATTCGAACGGTCGGACGACCTCAAGGTGTATCGCTACCGCG ACGATGGCTGGTTGA

TCTGGGACACGCTCTGGAAGTACGCCGAGGATATGGTCAACGAGCTGTACGGAACGG ACAACGATGTCGCTG

CTGACAAGGTCGTCCAGGAGTGGGCGAAGGAAGCATCTGGCTCGGACACTGCCGACG TCCAGGGCTTTCCGG

AGTCCATCACGACCAAGTACATCCTCACAAAGGTCCTGACGACGATCATCTGGCAAG CGTCCGCCTTGCACTC

GGCTCTCAACTACATCCAATACCCGTACACTGCGACCCCCATCAACCGTGCCGCCTC CATCTTTGGACCGGTCC

CTGACGGCGAAGCGGATATCACCGAGCAGGACATCCTGGATGTCATCCCTGGTGGTC TGGACGATGAGAACA

ACCGTGGTCTGACCCTCTCCATCTTCCAAGGCCTGCTCTCGTGGCTCCTGCGCACTC CTGAGAACCCGACGCTG

G ACG AAGT CGGCAGCCCAATCCCG AACAGG AACAACCCCAT CG AGTGGGTCG AGTTCCGCT CG AAGT ACCCC

CAGGTCTACTACAACTTGGACCAGAACCTTGCCGTGGTGGAGAAGATCATCGAGGAG CGCAACAAGGGCCTT

GCTTCTCCGTACGAGGTGCTCCTTCCCAGCCACATTGCTGCCAGCATCAACATCTGA

Codon-optimized coding sequence of CoLOX-0317 by Genscript genetic codon frequency of E. coli - SEQ ID NO: 5

ATGACCAGCAGCCCGACCGTGCGTAGCATGGTTATGCTGGCGGTGCTGGCGGTTTAT GCGCTGGAAAGCACC

CCGTGCGCGAGCGCGTTTGCGACCCTGCCGCGTGCGCTGGTTCGTCCGCAGGCGGCG CTGAAGTATCGTGCG

GAAGATAAAAACGATGTGGATGTGGCGCCGGCGGGTAGCACCGCGAGCGACGTTAGC AAGCCGGAGGGTA

AAGCGACCGCGGTTGCGAAGGGCACCGTGAACGCGCCGATCGAGGAAGCGTGGAAAG TGTTCCGTAGCTTT

AGCAACATGAACCAATGGATGCCGGTTTACGGCGAGTGGGAAGCGACCGGTGATAGC GTGGGCGACACCCG

TACCTTCAACTTTAAGGATCAGCCGACCTTCTTTACCACCGAGCGTCTGGTGGGTCT GGACGATAGCCAATATA

AGATGAAATACACCCTGGTTAACTGCAAAGGCAGCCCGGTGCCGATCGAAAGCATTG ACACCATCGTTACCTT

CACCGCGAACGACGATGTGACCGAGGTTGATTGGCGTAGCTGGACCAAGAGCCCGAT GGTGGACCTGATTAA

AGGTCGTCAAGCGGCGGGTTATGCGGGTGGCATTGCGGCGCTGGATCGTTATCTGAA CCCGAGCCTGGGCAC

CGTGGACGTTACCATTAAGAGCGCGGATAACCTGGACGGCAACTTCCTGAGCAGCAG CTACGCGACCCTGAT

GGTTACCGATGCGGATCCGGAGCAGGTGCATGCGAAGGAATGGGGCACCAGCCCGGA GTTCGATGCGAAAC

CGGTTCAATTTAGCCTGCTGAAGCCGGACAGCAAACTGTATATGAGCGTGATGCTGA CCAAATACGGTGTGG

ATACCCCGGTTGGCTATGCGGTGTTCGACATCCAGAAGAGCCTGAAAAGCGGCGAGA CCGTTACCGAAACCT

TTCAACTGGAAGGCAGCAACGACGCGACCCTGACCGTTGAGATGGAACTGAACCTGC GTCAGGGTAGCGTGC

TGCCGCAGAGCAAGGCGCAGAAGAACCTGGCGACCCTGGTGGCGCTGCAGCAAAGCG TGGAGCGTGTTCGT

GACCGTATTGTTACCATCGGTAAACTGGCGGGTGAACCGGAGAAGAGCGTGTGGGAG TACGAGCGTAAGAG

CGGTCTGCCGAAGAGCGTTAAAGGTCTGCCGCGTAGCGAAGTGCTGCCGCCGCACAA AATTGCGCTGATGGT

TGACGCGATCGCGGAGTACGCGTATACCCAGTTCCAACTGGTTCAGCGTCTGCTGCC GGTGCGTAACAGCTAC

GATCGTTATGCGGCGTACTTTGCGCCGGAAGGCGAGGAATACGTGCCGATTCCGCAA ATCCTGAAGGACATG

ACCTGGAGCACCGACGATGAGTTCATTCGTCAGATCTTTGCGGGTCTGAACCCGCTG CAAGTTGAAGTGGTTA

AGAACAAAGCGGGTCTGCCGAGCAAGCTGCAGGAGCTGAAGGCGAAAGATGGTAGCG ACGTGGATAAACTG

ATCAGCGAGGGCCGTCTGTATGTTCTGGACTACAGCGTGCTGAAGGACCTGGATCTG GACCGTAACGGTGTT

ACCCTGTATGCGCCGACCATGCTGATTTACCGTACCGGTGGCGATAAACTGGACGTT CTGGGCATCATGCTGG

AACCGCGTCGTGACGATGCGCCGGTGTACACCCCGGATAGCGAGACCCCGAACAAGT TCCTGCTGGCGAAAT

GCCACGTTGCGTGCGCGGACAACCAGGTGCACCAATTTACCTATCACCTGGGTTATG CGCACCTGGCGACCGA

ACCGCTGGCGATTGCGAGCCACAACGTGCTGGAGAAGAACAGCCACCCGCTGGGCAT GTTCCTGAAACCGCA

CTTTCGTGATAACATCGGCATTAACTACCTGGCGCGTCAGACCCTGGTTGCGGATGA AGACGCGATCACCGAT

CATACCTTTGCGACCGGCACCGCGCAAGGCGTGAGCATGGTGGTTGATGCGTTCAAG AGCTATAACTTTCTGG

AAAGCGGTCTGCCGGATGAGCTGCGTCGTCGTGGTTTCGAGCGTAGCGACGATCTGA AGGTTTACCGTTATC

GTGACGATGGTTGGCTGATTTGGGATACCCTGTGGAAATACGCGGAGGACATGGTTA ACGAACTGTATGGCA

CCGATAACGACGTGGCGGCGGACAAGGTGGTTCAGGAGTGGGCGAAAGAAGCGAGCG GTAGCGATACCGC

GG ACGTT CAAGGCTTCCCGG AAAGCATTACCACCAAGTACAT CCT G ACCAAAGTGCT G ACCACCAT CATTT GG

CAAGCG AGCGCGCTGCACAGCGCGCT G AACT AT AT CCAAT ACCCGTATACCGCG ACCCCG ATTAACCGTGCGG

CGAGCATCTTTGGTCCGGTTCCGGATGGCGAGGCGGACATTACCGAACAGGATATTC TGGACGTGATCCCGG

GTGGCCTGGACGATGAGAACAACCGTGGTCTGACCCTGAGCATCTTCCAAGGTCTGC TGAGCTGGCTGCTGC

GTACCCCGG AAAACCCG ACCCT GG AT G AGGTT GGT AGCCCG ATT CCG AACCGT AACAACCCG AT CG AGTGGG

TTGAATTTCGTAGCAAATACCCGCAGGTGTACTATAACCTGGACCAAAACCTGGCGG TGGTTGAAAAGATCAT

TGAGGAACGTAACAAAGGCCTGGCGAGCCCGTATGAGGTGCTGCTGCCGAGCCACAT TGCGGCGAGCATCA

ACATTTAA

Amino acid Sequence for CoLOX-0317 - SEQ ID NO: 6

MTSSPTVRSMVMLAVLAVYALESTPCASAFATLPRALVRPQAALKYRAEDKN DVDVAPAGSTASDVSKPEGKATA

VAKGTVNAPI EEAWKVFRSFSNM NQWMPVYGEWEATGDSVGDTRTFNFKDQPTFFTTERLVGLDDSQYKMKY

TLVNCKGSPVPIESIDTIVTFTAN DDVTEVDWRSWTKSPMVDLIKGRQAAGYAGG IAALDRYLNPSLGTVDVTI KSA

DN LDGN FLSSSYATLMVTDADPEQVHAKEWGTSPEFDAKPVQFSLLKPDSKLYMSVMLTKYGVDTP VGYAVFDIQ

KSLKSGETVTETFQLEGSN DATLTVEMELNLRQGSVLPQSKAQKN LATLVALQQSVERVRDRIVTIGKLAGEPEKSV

WEYERKSGLPKSVKGLPRSEVLPPH KIALMVDAIAEYAYTQFQLVQRLLPVRNSYDRYAAYFAPEGEEYVPIPQILKD

MTWSTDDEFIRQI FAGLN PLQVEVVKNKAGLPSKLQELKAKDGSDVDKLISEGRLYVLDYSVLKDLDLDRNGVTLYA

PTMLIYRTGGDKLDVLGI MLEPRRDDAPVYTPDSETPNKFLLAKCHVACADNQVHQFTYH LGYAHLATEPLAIASH

NVLEKNSHPLGM FLKPH FRDNIGINYLARQTLVADEDAITDHTFATGTAQGVSMVVDAFKSYNFLESGLPDELRRR

GFERSDDLKVYRYRDDGWLIWDTLWKYAEDMVN ELYGTDNDVAADKVVQEWAKEASGSDTADVQGFPESITTK

YI LTKVLTTIIWQASALHSALNYIQYPYTATPIN RAASIFGPVPDGEADITEQDI LDVI PGGLDDEN NRGLTLSI FQGLLS

WLLRTPENPTLDEVGSPIPN RN NPI EWVEFRSKYPQVYYNLDQNLAVVEKI IEERN KGLASPYEVLLPSHIAASIN I
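The CoLOX amino acid sequences listed here appear to differ from one another at only a few positions. A minimal sketch for listing such point differences between two equal-length sequences is given below; the function name is hypothetical, and the example uses only the first 25 residues of SEQ ID NO: 3 and SEQ ID NO: 6 as they appear in this listing.

```python
def point_differences(reference, variant):
    """List substitutions between two equal-length protein sequences.

    Whitespace is ignored. Each difference is reported in the usual
    notation: reference residue, 1-based position, variant residue.
    """
    ref = "".join(reference.split())
    var = "".join(variant.split())
    if len(ref) != len(var):
        raise ValueError("sequences must have equal length; align them first")
    return [
        f"{r}{position}{v}"
        for position, (r, v) in enumerate(zip(ref, var), start=1)
        if r != v
    ]


# N-terminal 25 residues of CoLOX-3 (SEQ ID NO: 3) and CoLOX-0317 (SEQ ID NO: 6).
colox3_nterm = "MTSSPTVRSMVMLAVLAVSALESAP"
colox0317_nterm = "MTSSPTVRSMVMLAVLAVYALESTP"
differences = point_differences(colox3_nterm, colox0317_nterm)
```

A position-by-position comparison like this is only meaningful when the sequences have the same length; sequences with insertions or deletions would need a proper alignment first.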

Coding sequence for CoLOX-19 - SEQ ID NO: 7

ATGACGTCGTCTCCGACCGTCAGATCGATGGTAATGCTGGCCGTGCTGGCCGTCTCT GCCCTGGAGAGCGCGC

CCTGCGCCTCGGCCTTTGCCACGCTCCCCCGCGCCCTCGTACGACCGCAAGCCGCCC TCAAGTACCGAGCCGA

GGACAAAAACGACGTCGATGTCGCCCCGGCTGGTAGCACTGCCTCCGACGTGAGCAA GCCCGAAGGAAAGG

CCACTGCTGTCGCCAAGGGTACTGTCAACGCGCCCATCGAGGAGGCATGGAAGGTCT TCCGGTCTTTTTCCAA

CATGGACCAATGGATGCCCGTGTACGGCGAGTGGGAGGCCACGGGAGACTCAGTCGG AGACACCCGCACGT

TCAACTTCAAGGATCAGCCGACCTTCTTCACTACCGAGAGGCTTGTCGGCCTGGACG ACTCCCAGTACAAGAT

GAAGTACACCCTCGTCGACTGCAAGGGCTCGCCCGTGCCCATCGAATCTATTGACAC CATCGTCACCTTCACTG

CAAACG AT GAT GT G ACCG AGGTT G ACTGGCGCTCCTGGACG AAGT CGCCCATGGT CGACTT G ATCAAGGG AC

GTCAGGCGGCCGGGTATGCTGGCGGCATCGCAGCGCTCGACCGGTACCTGAACCCGT CCCTTGGCACCGTCG

ATGTCACCATCAAGTCGGCCGACAACCTCGATGGCGATTTCCTGTCCAGCTCCTACG CCACTCTCATGGTCACG

G ACGCCG ACCCCG AGCAAGTGCATGCCAAGG AGTGGGGG ACG AGTCCT G AGTTCG ATGCCAAGCCCGT CCA

GTTCAGCCTGCTCAAGCCCGACTCCAAGCTCTACATGAACGTCATGCTGACCAAGTA CGGCGTCGACACGCCC

GTCGG AT ACGCCGT CTTT G ACAT CCAG AAG AGCCT CAAGTCCGGCG AG ACT GT G ACCGAG ACCTTT CAGCTCG

AGGGCAGCAACGATGCAACGTTGACGGTCGAGATGGAGCTCAACCTCCGGCAGGGCA GCGTCCTCCCTCAAT

CCAAGGCCCAG AAG AAT CTGGCG ACCCTT GT CGCCCT CCAGCAGTCT GTCG AG AGGGTCCG AG ACCGCATCG

TTACTATCGGCAAGCTGGCCGGCGAGCCCGAGAAGTCGGTATGGGAGTACGAGCGAA AGTCCGGCCTTCCCA

AGTCCGTCAAGGGTCTTCCTCGATCGGAAGTGCTGCCGCCGCACAAGATCGCTCTCA TGGTCGACGCCATCGC

CG AGT ACG CTT ACACT CAGTT CC AGCTCGT CC AGCGCCT GCTCCCCGT C AG AAACTCGT ACG ACCGGT ACGCC

GCTTACTTTGCCCCAGAAGGCGAGGAGTACGTTCCCATCCCGCAGATCCTCAAGGAC ATGACGTGGTCCACCG

ACGACGAGTTCATCCGCCAGATCTTTGCCGGCCTCAACCCGTTGCAAGTCGAGGTCG TCAAGAACAAGGCCG

GTCTGCCCTCCAAGTTGCAGGAGCTCAAGGCCAAGGACGGATCTGATGTCGATAAGC TCATCTCGGAAGGCC

GGTTGTATGTTCTGGACTACTCGGTCCTCAAGGATCTCGACCTCAACCGCAACGGTG TCACCCTGTACGCGCC

GACGATGCTCATCTACCGCACTGGTGGTGACAAGCTCGACGTCCTCGGCATCATGCT CGAGCCCCGCCGTGAC

GATGCGCCCGTTTACACGCCCGACTCTGAAACTCCCAACAAGTTCCTTCTCGCCAAG TGCCACGTTGCCTGCGC

TGACAACCAAGTGCACCAGTTCACGTACCATCTCGGTTACGCCCATCTTGCCACGGA GCCACTTGCGATCGCA

AGCCACAACGTCCTGGAGAAGAACAGCCATCCGCTCGGCATGTTCCTCAAGCCACAC TTGCGCGACAACATTG

GCATCAACTACCTCGCCCGGCAGACTCTTGTTGCCGACGAAGACGCCATCACAGACC ACACCTTTGCCACGGG

CACCGCGCAGGGCGTCAGTATGGTCGTCGACGCCTTCAAGTCGTACAACTTCCTCGA GTCTGGCTTGCCCGAT

GAGCTGCGCCGTCGTGGATTCGAACGGTCGGACGACCTCAAGGTGTATCGCTACCGC GACGATGGCTGGTTG

GTCTGGGACACGCTCTGGAAGTACGCCGAGGATATGGTCAACGAGCTGTACGGAACG GACAACGATGTCGCT

GCTGACAAGGTCGTCCAGGAGTGGGCGAGGGAAGCATCTGGCTCGGACACTGCCGAC GTCCAGGGCTTTCC GGAGTCCATCACGACCAAGTACATCCTCACAAAGGTCCTGACGACGATCATCTGGCAAGC GTCCGCCTTGCAC

TCGGCTCTCAACTACATCCAATACCCGTACACTGCGACCCCCATCAACCGTGCCGCC TCCATCTTTGGACCGGT

CCCTGACGGCGAAGCGGATATCACCGAGCAGGACATCCTGGATGTCATCCCTGGTGG TCTGGACGATGAGAA

CAACCGTGGTCTGACCCTCTCCATCTTCCAAGGCCTGCTCTCGTGGCTCCTGCGCAC TCCTGAGAACCCGACGC

TGGACGAAGTCGGCAGCCCAATCCCGAACAGGAACAACCCCATCGAGTGGGTCGAGT TCCGCTCGAAGTACC

CCCAGGTCTACTACAACTTGGACCAGAACCTTGCCGTGGTGGAGAAGATCATCGAGG AGCGCAACAAGGGCC

TTGCTTCTCCGTACGAGGTGCTCCTTCCCAGCCACATTGCTGCCAGCATCAACATCT GA

Codon-optimized coding sequence of CoLOX-19 by Genscript genetic codon frequency of E. coli - SEQ ID NO: 8

ATGACCAGCAGCCCGACCGTGCGTAGCATGGTTATGCTGGCGGTTCTGGCGGTTAGC GCGCTGGAAAGCGCG

CCGTGCGCGAGCGCGTTTGCGACCCTGCCGCGTGCGCTGGTTCGTCCGCAGGCGGCG CTGAAGTACCGTGCG

GAAGACAAAAACGATGTTGATGTTGCGCCGGCGGGTAGCACCGCGAGCGATGTTAGC AAGCCGGAGGGTAA

AGCGACCGCGGTTGCGAAGGGCACCGTGAACGCGCCGATCGAGGAAGCGTGGAAAGT GTTCCGTAGCTTTA

GCAACATGGACCAATGGATGCCGGTTTATGGCGAGTGGGAAGCGACCGGTGACAGCG TGGGCGATACCCGT

ACCTTCAACTTTAAGGATCAGCCGACCTTCTTTACCACCGAGCGTCTGGTGGGTCTG GACGATAGCCAATATAA

GATGAAATACACCCTGGTTGACTGCAAAGGCAGCCCGGTGCCGATCGAAAGCATTGA TACCATCGTTACCTTC

ACCGCGAACGACGATGTGACCGAGGTTGACTGGCGTAGCTGGACCAAGAGCCCGATG GTGGATCTGATTAAA

GGTCGTCAAGCGGCGGGTTATGCGGGTGGCATTGCGGCGCTGGACCGTTATCTGAAC CCGAGCCTGGGCACC

GTGGACGTTACCATTAAGAGCGCGGATAACCTGGACGGCGATTTTCTGAGCAGCAGC TACGCGACCCTGATG

GTTACCGATGCGGATCCGGAGCAGGTGCATGCGAAGGAATGGGGCACCAGCCCGGAG TTCGACGCGAAACC

GGTTCAATTTAGCCTGCTGAAGCCGGATAGCAAACTGTATATGAACGTGATGCTGAC CAAATACGGTGTGGAC

ACCCCGGTTGGCTATGCGGTGTTCGATATCCAGAAGAGCCTGAAAAGCGGCGAGACC GTTACCGAAACCTTTC

AACTGGAAGGCAGCAACGACGCGACCCTGACCGTTGAGATGGAACTGAACCTGCGTC AGGGTAGCGTGCTG

CCGCAGAGCAAGGCGCAGAAGAACCTGGCGACCCTGGTGGCGCTGCAGCAAAGCGTG GAGCGTGTTCGTGA

CCGTATTGTTACCATCGGTAAACTGGCGGGTGAACCGGAGAAGAGCGTGTGGGAGTA CGAGCGTAAGAGCG

GTCTGCCGAAGAGCGTTAAAGGTCTGCCGCGTAGCGAAGTGCTGCCGCCGCACAAAA TTGCGCTGATGGTTG

ACGCGATCGCGGAGTACGCGTATACCCAGTTCCAACTGGTTCAGCGTCTGCTGCCGG TGCGTAACAGCTACGA

CCGTTATGCGGCGTACTTTGCGCCGGAAGGCGAGGAATACGTGCCGATTCCGCAAAT CCTGAAGGATATGAC

CTGGAGCACCGACGATGAGTTCATTCGTCAGATCTTTGCGGGTCTGAACCCGCTGCA AGTTGAAGTGGTTAAG

AACAAAGCGGGTCTGCCGAGCAAGCTGCAGGAGCTGAAGGCGAAAGACGGTAGCGAC GTGGATAAACTGAT

CAGCGAGGGCCGTCTGTATGTTCTGGATTACAGCGTGCTGAAGGACCTGGATCTGAA CCGTAACGGTGTTACC

CTGTATGCGCCGACCATGCTGATTTACCGTACCGGTGGCGACAAACTGGATGTTCTG GGCATCATGCTGGAAC

CGCGTCGT G ACG AT GCGCCGGT GTACACCCCGG ACAGCG AG ACCCCG AACAAGTT CCT GCT GGCG AAAT GCC

ACGTTGCGTGCGCGGATAACCAGGTGCACCAATTTACCTATCACCTGGGTTATGCGC ACCTGGCGACCGAACC

GCTGGCGATTGCGAGCCACAACGTGCTGGAGAAGAACAGCCACCCGCTGGGCATGTT CCTGAAACCGCACCT

GCGTGACAACATCGGCATTAACTACCTGGCGCGTCAGACCCTGGTTGCGGACGAAGA TGCGATCACCGATCA

CACCTTTGCGACCGGCACCGCGCAAGGCGTGAGCATGGTGGTTGACGCGTTCAAGAG CTATAACTTTCTGGA

AAGCGGTCT GCCGG AT G AGCT GCGT CGTCGTGGTTT CG AGCGTAGCG ACG AT CT G AAGGTTT ACCGTT ATCGT

GACGATGGTTGGCTGGTGTGGGACACCCTGTGGAAATACGCGGAGGATATGGTTAAC GAACTGTATGGCACC

GACAACGATGTGGCGGCGGACAAAGTGGTTCAGGAGTGGGCGCGTGAAGCGAGCGGT AGCGACACCGCGG

ATGTTCAAGGCTTCCCGGAAAGCATTACCACCAAGTACATCCTGACCAAAGTGCTGA CCACCATCATTTGGCA

AGCGAGCGCGCTGCACAGCGCGCTGAACTATATTCAATACCCGTATACCGCGACCCC GATTAACCGTGCGGCG

AGCATCTTTGGTCCGGTTCCGGACGGCGAGGCGGATATTACCGAACAGGACATTCTG GATGTGATCCCGGGT

GGCCTGGACGATGAGAACAACCGTGGTCTGACCCTGAGCATCTTCCAAGGTCTGCTG AGCTGGCTGCTGCGT ACCCCGG AAAACCCG ACCCT GG ACG AGGTT GGT AGCCCG ATT CCG AACCGT AACAACCCG AT CG AGTGGGTT GAATTTCGTAGCAAATACCCGCAGGTGTACTATAACCTGGATCAAAACCTGGCGGTGGTT GAAAAGATCATTG AGGAACGTAACAAAGGCCTGGCGAGCCCGTATGAGGTGCTGCTGCCGAGCCACATTGCGG CGAGCATCAAC ATTTAA

Amino acid Sequence for CoLOX-19 - SEQ ID NO: 9

MTSSPTVRSMVMLAVLAVSALESAPCASAFATLPRALVRPQAALKYRAEDKN DVDVAPAGSTASDVSKPEGKATA

VAKGTVNAPI EEAWKVFRSFSNM DQWMPVYGEWEATGDSVGDTRTFN FKDQPTFFTTERLVGLDDSQYKM KYT

LVDCKGSPVPI ESI DTIVTFTANDDVTEVDWRSWTKSPMVDLI KGRQAAGYAGGIAALDRYLN PSLGTVDVTIKSAD

N LDGDFLSSSYATLMVTDADPEQVHAKEWGTSPEFDAKPVQFSLLKPDSKLYMNVM LTKYGVDTPVGYAVFDIQ

KSLKSGETVTETFQLEGSN DATLTVEMELNLRQGSVLPQSKAQKN LATLVALQQSVERVRDRIVTIGKLAGEPEKSV

WEYERKSGLPKSVKGLPRSEVLPPH KIALMVDAIAEYAYTQFQLVQRLLPVRNSYDRYAAYFAPEGEEYVPIPQILKD

MTWSTDDEFIRQI FAGLN PLQVEVVKNKAGLPSKLQELKAKDGSDVDKLISEGRLYVLDYSVLKDLDLNRNGVTLYA

PTMLIYRTGGDKLDVLGI MLEPRRDDAPVYTPDSETPNKFLLAKCHVACADNQVHQFTYH LGYAHLATEPLAIASH

NVLEKNSHPLGM FLKPH LRDNIGINYLARQTLVADEDAITDHTFATGTAQGVSMVVDAFKSYNFLESGLPDELRRR

GFERSDDLKVYRYRDDGWLVWDTLWKYAEDMVNELYGTDN DVAADKVVQEWAREASGSDTADVQGFPESITT

KYILTKVLTTI IWQASALHSALNYIQYPYTATPI NRAASI FGPVPDGEADITEQDILDVIPGGLDDEN NRGLTLSI FQGLL

SWLLRTPENPTLDEVGSPIPNRN NPI EWVEFRSKYPQVYYNLDQNLAVVEKII EERN KGLASPYEVLLPSH IAASI NI

Coding sequence for CoLOX-22 - SEQ ID NO: 10

ATGACGTCGTCTCCGACCGTCAGATCGATGGTAATGCTGGCCGTGCTGGCCGTCTCT GCCCTGGAGAGCGCGC

CCTGCGCCTCGGCCTTTGCCACGCTCCCCCGCGCCCTCGTACGACCGCAAGCCGCCC TCAAGTACCGAGCCGA

GGACAAAAACGACGTCGATGTCGCCCCGGCTGGTAGCACTGCCTCCGACGTGAGCAA GCCCGAAGGCAAGG

CCACCGCCGTCGCCAAGGGTACTGTCAACGCGCCCATCGAGGAGGCATGGAAGGTCT TCCGGTCTTTTTCCAA

CATGAACCAATGGATGCCCGTGTACGGCGAGTGGGAGGCCACGGGAGACTCAGTCGG AGACACCCGCACGT

TCAACTTCAAGGATCAGCCGACCTTCTTCACTACCGAGAGGCTTGTCGGCCTGGACG ACTCCCAGTACAAGAT

GAAGTACACCCTCGTCGACTGCAAGGGCTCGCCCGTGCCCATCGAATCTATCGACAC CATCGTCACCTTCACTG

CAAACG AT GAT GT GACCG AGGTT G ACTGGCGCTCCTGGACG AAGT CGCCCATGGT CGACTT G ATCAAGGG AC

GTCAGGCGGCCGGGTATGCCGGCGGCATCGCAGCGCTCGACCGGTACCTGAACCCGT CCCTTGGCACCGTCG

AT GT C ACCAT C AAGT CGGCCG ACAACCT CG ATGGT G ATTTCCT GTCCAGCTCCT ACGCCACT CT CAT GGT CACG

G ACGCCG ACCCCG AGCAAGTGCATGCCAAGG AGTGGGGG ACG AGTCCT G AGTTT G ATGCCAAGCCCGT CCA

GTTCAGCCTGCTCAAGCCCGACTCCAAGCTCTACATGAGCGTCATGCTGACCAAGTG CGGCGTCGACGCCCCC

GTCGG AT ACGCCGT CTTT G ACAT CCAG AAG AGCCT CAAGTCCGGCG AG ACT GT G ACCGAG ACCTTT CAGCTCG

AGGGCAGCAACGATGCAACGTTGACGGTCGAGATGGAGCTCAACCTCCGGCAGGGCA GCATCCTCCCTCAAT

CCAAGGCCCAG AAG AAT CTGGCG ACCCTT GT CGCCCT CCAGCAGTCT GTCG AG AGGGTCCG AG ACCGCATCG

TTACTATCGGCAAGCTGGCCGGCGAGCCCGAGAAGTCGGTATGGGAGTACGAGCGAA AGTCCGGCCTTCCCA

AGTCCGTCAAGGGCCTTCCTCGGTCGGAAGTGCTGCCGCCGCACAAGATCGCCCTCA TGGTCGACGCCATCGC

CG AGT ACG CTT ACACCCAGTT CC AGCTCGT CC AGCGCCT GCTCCCCGT C AG AAACTCGT ACG ACCGGT ACGCC

GCTTACTTTGCCCCAGAAGGCGAGGAGTACGTTCCCATCCCGCAGATCCTCAAGGAC ATGACGTGGTCCACCG

ACGACGAGTTCATCCGCCAGATCTTTGCCGGCCTCAACCCGTTGCAAGTCGAGGTCG TCAAGAACAAGGCCG

GTCTGCCCTCCAAGTTGCAGGAGCTCAAGGCCAAGGACGGATCTGATGTCGATAAGC TCATCTCGGAAGGCC

GGTTGTATGTTCTGGACTACTCGGTCCTCAAGGATCTCGACCTCAACCGCAACGGTG TCACCCTGTACGCGCC

GACGATGCTCATCTACCGCACTGGTGGTGACAAGCTCGACGTCCTCGGCATCATGCT CGAGCCCCGCCGTGAC

GATGCGCCCGTTTACACGCCCGACTCTGAAACTCCCAACAAGTTCCTTCTCGCCAAG TGCCACGTTGCCTGCGC TGACAACCAAGTGCACCAGTTCACGTACCATCTCGGTTACGCCCATCTTGCCACGGAGCC ACTTGCGATCGCA

AGCCACAACGTCCTGGAGAAGAACAGCCATCCGCTCGGCATGTTCCTCAAGCCACAC TTGCGCGACAACATTG

GCATCAACTACCTCGCCCGGCAGACTCTTGTTGCCGACGAAGACGCCATCACAGACC ACACCTTTGCCACGGG

CACCGCGCAGGGCGTCAGTATGGTCGTCGACGCCTTCAAGTCGTACAACTTCCTCGA GTCTGGCTTGCCCGAT

GAGCTGCGCCGTCGTGGATTCGAACGGTCGGACGACCTCAAGGTGTATCGCTACCGC GACGATGGCTGGTTG

GTCTGGGACACGCTCTGGAAGTACGCCGAGGATATGGTCAACGAGCTGTACGGAACG GACAACGATGTCGCT

GCTGACAAGGTCGTCCAGGAGTGGGCGAGGGAAGCATCTGGCTCGGACACTGCCGAC GTCCAGGGCTTTCC

GGAGTCCATCACGACCAAGTACATCCTCACAAAGGTCCTGACGACGATCATCTGGCA AGCGTCCGCCTTGCAC

TCGGCTCTCAACTACATCCAATACCCGTACACTGCGACCCCCATCAACCGTGCCGCC TCCATCTTTGGACCGGT

CCCTGACGGCGAAGCGGATATCACCGAGCAGGACATCCTGGATGTCATCCCTGGTGG TCTGGGTGATGAGAA

CAACCGTGGTCTGACCCTCTCCATCTTCCAAGGCCTGCTCTCGTGGCTCCTGCGCAC TCCTGAGAACCCGACGC

TGG ACG AAGTCGGCAGT CCAATCCCG AACAGG AACAACCCCAT CG AGTGGGT CG AGTT CCGCT CG AAGTATC

CCCAGGTCTACTACAACTTGGACCAGAACCTTGCCGTGGTGGAGAAGATCATCGAGG AGCGCAACAAGGGCC

TTGCTTCTCCGTACGAGGTGCTCCTTCCCAGCCACATCGCTGCCAGCATCAACATCT GA

Codon-optimized coding sequence of CoLOX-22 by Genscript genetic codon frequency of E. coli - SEQ ID NO: 11

ATGACCAGCAGCCCGACCGTGCGTAGCATGGTTATGCTGGCGGTTCTGGCGGTTAGC GCGCTGGAAAGCGCG

CCGTGCGCGAGCGCGTTTGCGACCCTGCCGCGTGCGCTGGTTCGTCCGCAGGCGGCG CTGAAGTATCGTGCG

GAAGACAAAAACGATGTTGATGTTGCGCCGGCGGGTAGCACCGCGAGCGATGTTAGC AAGCCGGAGGGTAA

AGCGACCGCGGTTGCGAAGGGCACCGTGAACGCGCCGATCGAGGAAGCGTGGAAAGT GTTCCGTAGCTTTA

GCAACATGAACCAATGGATGCCGGTTTACGGCGAGTGGGAAGCGACCGGTGACAGCG TGGGCGATACCCGT

ACCTTCAACTTTAAGGACCAGCCGACCTTCTTTACCACCGAGCGTCTGGTGGGTCTG GACGATAGCCAATATA

AGATGAAATACACCCTGGTTGACTGCAAAGGCAGCCCGGTGCCGATCGAAAGCATTG ATACCATCGTTACCTT

CACCGCGAACGACGATGTGACCGAGGTTGACTGGCGTAGCTGGACCAAGAGCCCGAT GGTGGATCTGATTAA

AGGTCGTCAAGCGGCGGGTTATGCGGGTGGCATTGCGGCGCTGGACCGTTACCTGAA CCCGAGCCTGGGCAC

CGTGGACGTTACCATCAAGAGCGCGGATAACCTGGACGGCGATTTTCTGAGCAGCAG CTACGCGACCCTGAT

GGTTACCGATGCGGATCCGGAGCAGGTGCATGCGAAGGAATGGGGCACCAGCCCGGA GTTCGACGCGAAAC

CGGTTCAATTTAGCCTGCTGAAGCCGGATAGCAAACTGTATATGAGCGTGATGCTGA CCAAATGCGGTGTGG

ATGCGCCGGTTGGTTATGCGGTGTTCGATATTCAGAAGAGCCTGAAAAGCGGCGAGA CCGTTACCGAAACCT

TTCAACTGGAAGGCAGCAACGACGCGACCCTGACCGTGGAGATGGAACTGAACCTGC GTCAGGGTAGCATCC

TGCCGCAGAGCAAGGCGCAGAAGAACCTGGCGACCCTGGTTGCGCTGCAGCAAAGCG TGGAGCGTGTTCGT

GATCGTATTGTTACCATCGGTAAACTGGCGGGTGAACCGGAGAAGAGCGTGTGGGAG TATGAGCGTAAGAG

CGGTCTGCCGAAGAGCGTTAAAGGTCTGCCGCGTAGCGAAGTGCTGCCGCCGCACAA AATTGCGCTGATGGT

TGACGCGATCGCGGAGTACGCGTATACCCAGTTCCAACTGGTTCAGCGTCTGCTGCC GGTGCGTAACAGCTAC

GACCGTTATGCGGCGTACTTTGCGCCGGAAGGCGAGGAATACGTGCCGATTCCGCAA ATCCTGAAGGATATG

ACCTGGAGCACCGACGATGAGTTCATTCGTCAGATCTTTGCGGGTCTGAACCCGCTG CAAGTTGAAGTGGTTA

AGAACAAAGCGGGTCTGCCGAGCAAGCTGCAGGAGCTGAAGGCGAAAGACGGTAGCG ACGTGGATAAACT

GATCAGCGAGGGCCGTCTGTATGTTCTGGATTACAGCGTGCTGAAGGACCTGGATCT GAACCGTAACGGTGT

TACCCT GTAT GCGCCG ACCAT GCT G ATTTACCGT ACCGGTGGCG ACAAACT GG AT GTTCTGGGCAT CAT GCTG

G AACCGCGTCGT G ACG AT GCGCCGGT GTACACCCCGG ACAGCG AG ACCCCG AACAAGTT CCT GCTGGCG AAA

TGCCACGTTGCGTGCGCGGATAACCAGGTGCACCAATTTACCTATCACCTGGGTTAT GCGCACCTGGCGACCG

AACCGCTGGCGATTGCGAGCCACAACGTGCTGGAGAAGAACAGCCACCCGCTGGGCA TGTTCCTGAAACCGC

ACCTGCGTGACAACATCGGCATTAACTATCTGGCGCGTCAGACCCTGGTTGCGGACG AAGATGCGATCACCG

ATCACACCTTTGCGACCGGCACCGCGCAAGGCGTGAGCATGGTGGTTGACGCGTTCA AGAGCTACAACTTTCT GG AAAGCGGT CTGCCGG AT G AGCTGCGT CGT CGTGGTTT CG AGCGT AGCG ACG ATCT G AAGGTTT ACCGTT A

TCGTGACGATGGTTGGCTGGTGTGGGACACCCTGTGGAAATATGCGGAGGATATGGT TAACGAACTGTACGG

CACCGACAACGATGTGGCGGCGGACAAAGTGGTTCAGGAGTGGGCGCGTGAAGCGAG CGGTAGCGACACC

GCGGATGTTCAAGGCTTCCCGGAAAGCATTACCACCAAGTATATCCTGACCAAAGTG CTGACCACCATCATTT

GGCAAGCGAGCGCGCTGCACAGCGCGCTGAACTACATTCAATACCCGTATACCGCGA CCCCGATTAACCGTGC

GGCGAGCATCTTTGGTCCGGTTCCGGACGGCGAGGCGGATATTACCGAACAGGACAT TCTGGATGTGATCCC

GGGTGGCCTGGGTGACGAGAACAACCGTGGCCTGACCCTGAGCATCTTCCAAGGTCT GCTGAGCTGGCTGCT

GCGT ACCCCGG AAAACCCG ACCCTGG AT G AGGTTGGCAGCCCG ATTCCG AACCGTAACAACCCG AT CG AGT G

GGTTGAATTTCGTAGCAAATATCCGCAGGTGTACTATAACCTGGACCAAAACCTGGC GGTGGTTGAAAAGATC

ATTGAGGAACGTAACAAAGGTCTGGCGAGCCCGTACGAGGTGCTGCTGCCGAGCCAC ATTGCGGCGAGCATC

AACATTTAA

Amino acid Sequence for CoLOX-22 - SEQ ID NO: 12

MTSSPTVRSMVMLAVLAVSALESAPCASAFATLPRALVRPQAALKYRAEDKN DVDVAPAGSTASDVSKPEGKATA

VAKGTVNAPI EEAWKVFRSFSNM NQWMPVYGEWEATGDSVGDTRTFNFKDQPTFFTTERLVGLDDSQYKMKY

TLVDCKGSPVPIESIDTIVTFTAN DDVTEVDWRSWTKSPMVDLIKGRQAAGYAGGIAALDRYLNPSLGTVDVTI KSA

DN LDGDFLSSSYATLMVTDADPEQVHAKEWGTSPEFDAKPVQFSLLKPDSKLYMSVMLTKCG VDAPVGYAVFDI

QKSLKSGETVTETFQLEGSN DATLTVEMELNLRQGSI LPQSKAQKNLATLVALQQSVERVRDRIVTIGKLAGEPEKSV

WEYERKSGLPKSVKGLPRSEVLPPH KIALMVDAIAEYAYTQFQLVQRLLPVRNSYDRYAAYFAPEGEEYVPIPQILKD

MTWSTDDEFIRQI FAGLN PLQVEVVKNKAGLPSKLQELKAKDGSDVDKLISEGRLYVLDYSVLKDLDLNRNGVTLYA

PTMLIYRTGGDKLDVLGI MLEPRRDDAPVYTPDSETPNKFLLAKCHVACADNQVHQFTYH LGYAHLATEPLAIASH

NVLEKNSHPLGM FLKPH LRDNIGINYLARQTLVADEDAITDHTFATGTAQGVSMVVDAFKSYNFLESGLPDELRRR

GFERSDDLKVYRYRDDGWLVWDTLWKYAEDMVNELYGTDN DVAADKVVQEWAREASGSDTADVQGFPESITT

KYILTKVLTTI IWQASALHSALNYIQYPYTATPI NRAASI FGPVPDGEADITEQDILDVIPGGLGDENN RGLTLSIFQGL

LSWLLRTPENPTLDEVGSPI PN RNN PIEWVEFRSKYPQVYYN LDQNLAVVEKII EERN KGLASPYEVLLPSHIAASIN I

Coding sequence for CoLOX-d4 - SEQ ID NO: 13

ATGACGTCGTCTCCGACCGTCAGATCGATGGTAATGCTGGCCGTGCTGGCCGTCTCT GCCCTGGAGAGCGCGC

CCTGCGCCTCGGCCTTTGCCACGCTCCCCCGCGCCCTCGTACGACCGCAAGCCGCCC TCAAGTACCGAGCCGA

GGACAAAAACGACGTCGATGTCGCCCCGGCTGGTAGCACTGCCTCCGACGTGAGCAA GCCCGAAGGAAAGG

CCACTGCTGTCGCCAAGGGTACTGTCAACGCGCCCATCGAGGAGGCATGGAAGGTCT TCCGGTCTTTTTCCAA

CATGGACCAATGGATGCCCGTGTACGGCGAGTGGGAGGCCACGGGAGACTCAGTCGG AGACACCCGCACGT

TCAACTTCAAGGATCAGCCGACCTTCTTCACTACCGAGAGGCTTGTCGGCCTGGACG ACTCCCAGTACAAGAT

GAAGTACACCCTCGTCGACTGCAAGGGCTCGCCCGTGCCCATCGAATCTATTGACAC CATCGTCACCTTCACTG

CAAACG AT GAT GT G ACCG AGGTT G ACTGGCGCTCCTGGACG AAGT CGCCCATGGT CGACTT G ATCAAGGG AC

GTCAGGCGGCCGGGTATGCTGGCGGCATCGCAGCGCTCGACCGGTACCTGAACCCGT CCCTTGGCACCGTCG

ATGTCACCATCAAGTCGGCCGACAACCTCGATGGCGATTTCCTGTCCAGCTCCTACG CCACTCTCATGGTCACG

G ACGCCG ACCCCG AGCAAGTGCATGCCAAGG AGTGGGGG ACG AGTCCT G AGTTCG ATGCCAAGCCCGT CCA

GTTCAGCCTGCTCAAGCCCGACTCCAAGCTCTACATGAACGTCATGCTGACCAAGTA CGGCGTCGACACGCCC

GTCGG AT ACGCCGT CTTT G ACAT CCAG AAG AGCCT CAAGTCCGGCG AG ACT GT G ACCGAG ACCTTT CAGCTCG

AGGGCAGCAACGATGCAACGTTGACGGTCGAGATGGAGCTCAACCTCCGGCAGGGCA GCGTCCTCCCTCAAT

CCAAGGCCCAG AAG AAT CTGGCG ACCCTT GT CGCCCT CCAGCAGTCT GTCG AG AGGGTCCG AG ACCGCATCG

TTACTATCGGCAAGCTGGCCGGCGAGCCCGAGAAGTCGGTATGGGAGTACGAGCGAA AGTCCGGCCTTCCCA

AGTCCGTCAAGGGTCTTCCTCGATCGGAAGTGCTGCCGCCGCACAAGATCGCTCTCA TGGTCGACGCCATCGC CG AGT ACG CTT ACACT CAGTT CC AGCTCGT CC AGCGCCT GCTCCCCGT C AG AAACTCGT ACG ACCGGT ACGCC

GCTTACTTTGCCCCAGAAGGCGAGGAGTACGTTCCCATCCCGCAGATCCTCAAGGAC ATGACGTGGTCCACCG

ACGACGAGTTCATCCGCCAGATCTTTGCCGGCCTCAACCCGTTGCAAGTCGAGGTCG TCAAGAACAAGGCCG

GTCTGCCCTCCAAGTTGCAGGAGCTCAAGGCCAAGGACGGATCTGATGTCGATAAGC TCATCTCGGAAGGCC

GGTTGTATGTTCTGGACTACTCGGTCCTCAAGGATCTCGACCTCAACCGCAACGGTG TCACCCTGTACGCGCC

GACGATGCTCATCTACCGCACTGGTGGTGACAAGCTCGACGTCCTCGGCATCATGCT CGAGCCCCGCCGTGAC

GATGCGCCCGTTTACACGCCCGACTCTGAAACTCCCAACAAGTTCCTTCTTGCCAAG TGCCACGTTGCCTGCGC

TGACAACCAAGTGCACCAGTTCACGTACCATCTCGGTTACGCCCATCTTGCCACGGA GCCACTTGCGATCGCA

AGCCACAACGTCCTGGAGAAGAACAGCCATCCGCTCGGCATGTTCCTCAAGCCACAC TTCCGCGACAACATCG

GCATCAACTACCTCGCCCGGCAGACTCTTGTTGCCGACGAAGACGCCATCACAGACC ACACTTTTGCCACGGG

CACCGCGCAGGGCGTCAGTATGGTCGTCGACGCCTTCAAGTCGTACAACTTCCTCGA GTCTGGCTTGCCCGAT

GAGCTGCGCCGTCGTGGATTCGAACGGTCGGACGACCTCAAGGTGTATCGCTACCGC GACGATGGCTGGTTG

GTTTGGGACACGCTCTGGAAGTACGCCGAGGATATGGTCAACGAGCTGTACGGAACG GACAACGATGTCACT

GCTGACAAGGTCGTCCAGGAGTGGGCGAGGGAAGCATCTGGCTCGGACACTGCCGAC GTCCAGGGCTTTCC

GGAGTCCATCACGACCAAGTACATCCTCACAAAGGTCCTGACGACGATCATCTGGCA AGCGTCCGCCTTGCAC

TCGGCTCTCAACTACATCCAATACCCGTACACTGCGACCCCCATCAACCGTGCCGCC TCCATCTTTGGACCGGT

CCCTGACGGCGAAGCGGATATCACCGAGCAGGACATCCTGGATGTCATCCCTGGTGG TCTGGACGATGAGAA

CAACCGTGGTCTGACCCTCTCCATCTTCCAAGGCCTGCTCTCGTGGCTCCTGCGCAC TCCTGAGAACCCGACGC

TGGACGAAGTCGGCAGCCCAATCCCGAACAGGAACAACCCCATCGAGTGGGTCGAGT TCCGCTCGAAGTACC

CCCAGGTCTACTACAACTTGGACCAGAACCTTGCCGTGGTGGAGAAGATCATCGAGG AGCGCAACAAGGGCC

TTGCTTCTCCGTACGAGGTGCTCCTTCCCAGCCACATTGCTGCCAGCATCAACATCT GA

Codon-optimized coding sequence of CoLOX-d4 by Genscript genetic codon frequency of E. coli - SEQ ID NO: 14

ATGACCAGCAGCCCGACCGTGCGTAGCATGGTTATGCTGGCGGTTCTGGCGGTTAGC GCGCTGGAAAGCGCG

CCGTGCGCGAGCGCGTTTGCGACCCTGCCGCGTGCGCTGGTTCGTCCGCAGGCGGCG CTGAAGTACCGTGCG

GAAGACAAAAACGATGTTGATGTTGCGCCGGCGGGTAGCACCGCGAGCGATGTTAGC AAGCCGGAGGGTAA

AGCGACCGCGGTTGCGAAGGGCACCGTGAACGCGCCGATCGAGGAAGCGTGGAAAGT GTTCCGTAGCTTTA

GCAACATGGACCAATGGATGCCGGTTTATGGCGAGTGGGAAGCGACCGGTGACAGCG TGGGCGATACCCGT

ACCTTCAACTTTAAGGATCAGCCGACCTTCTTTACCACCGAGCGTCTGGTGGGTCTG GACGATAGCCAATATAA

GATGAAATACACCCTGGTTGACTGCAAAGGCAGCCCGGTGCCGATCGAAAGCATTGA TACCATCGTTACCTTC

ACCGCGAACGACGATGTGACCGAGGTTGACTGGCGTAGCTGGACCAAGAGCCCGATG GTGGATCTGATTAAA

GGTCGTCAAGCGGCGGGTTATGCGGGTGGCATTGCGGCGCTGGACCGTTATCTGAAC CCGAGCCTGGGCACC

GTGGACGTTACCATTAAGAGCGCGGATAACCTGGACGGCGATTTCCTGAGCAGCAGC TACGCGACCCTGATG

GTTACCGATGCGGATCCGGAGCAGGTGCATGCGAAGGAATGGGGCACCAGCCCGGAG TTCGACGCGAAACC

GGTTCAATTTAGCCTGCTGAAGCCGGATAGCAAACTGTATATGAACGTGATGCTGAC CAAATACGGTGTGGAC

ACCCCGGTTGGCTATGCGGTGTTCGATATCCAGAAGAGCCTGAAAAGCGGCGAGACC GTTACCGAAACCTTTC

AACTGGAAGGCAGCAACGACGCGACCCTGACCGTTGAGATGGAACTGAACCTGCGTC AGGGTAGCGTGCTG

CCGCAGAGCAAGGCGCAGAAGAACCTGGCGACCCTGGTGGCGCTGCAGCAAAGCGTG GAGCGTGTTCGTGA

CCGTATTGTTACCATCGGTAAACTGGCGGGTGAACCGGAGAAGAGCGTGTGGGAGTA CGAGCGTAAGAGCG

GTCTGCCGAAGAGCGTTAAAGGTCTGCCGCGTAGCGAAGTGCTGCCGCCGCACAAAA TTGCGCTGATGGTTG

ACGCGATCGCGGAGTACGCGTATACCCAGTTCCAACTGGTTCAGCGTCTGCTGCCGG TGCGTAACAGCTACGA

CCGTTATGCGGCGTACTTTGCGCCGGAAGGCGAGGAATACGTGCCGATTCCGCAAAT CCTGAAGGATATGAC

CTGGAGCACCGACGATGAGTTCATTCGTCAGATCTTTGCGGGTCTGAACCCGCTGCA AGTTGAAGTGGTTAAG

AACAAAGCGGGTCTGCCGAGCAAGCTGCAGGAGCTGAAGGCGAAAGACGGTAGCGACGTGGATAAACTGAT

CAGCGAGGGCCGTCTGTATGTTCTGGATTACAGCGTGCTGAAGGACCTGGATCTGAACCGTAACGGTGTTACC

CTGTATGCGCCGACCATGCTGATTTACCGTACCGGTGGCGACAAACTGGATGTTCTG GGCATCATGCTGGAAC

CGCGTCGTGACGATGCGCCGGTGTACACCCCGGACAGCGAGACCCCGAACAAGTTCCTGCTGGCGAAATGCC

ACGTTGCGTGCGCGGATAACCAGGTGCACCAATTTACCTATCACCTGGGTTATGCGC ACCTGGCGACCGAACC

GCTGGCGATTGCGAGCCACAACGTGCTGGAGAAGAACAGCCACCCGCTGGGCATGTT CCTGAAACCGCACTT

TCGTGACAACATCGGCATTAACTACCTGGCGCGTCAGACCCTGGTTGCGGACGAAGA TGCGATCACCGATCAT

ACCTTTGCGACCGGCACCGCGCAAGGCGTGAGCATGGTGGTTGACGCGTTCAAGAGC TATAACTTTCTGGAA

AGCGGTCTGCCGGATGAGCTGCGTCGTCGTGGTTTCGAGCGTAGCGACGATCTGAAG GTTTACCGTTATCGT

GACGATGGTTGGCTGGTGTGGGACACCCTGTGGAAATACGCGGAGGATATGGTTAAC GAACTGTATGGCACC

GACAACGATGTGACCGCGGACAAAGTGGTTCAGGAGTGGGCGCGTGAAGCGAGCGGT AGCGACACCGCGG

ATGTTCAAGGCTTCCCGGAAAGCATTACCACCAAGTACATCCTGACCAAAGTGCTGA CCACCATCATTTGGCA

AGCGAGCGCGCTGCACAGCGCGCTGAACTATATTCAATACCCGTATACCGCGACCCC GATTAACCGTGCGGCG

AGCATCTTTGGTCCGGTTCCGGACGGCGAGGCGGATATTACCGAACAGGACATTCTG GATGTGATCCCGGGT

GGCCTGGACGATGAGAACAACCGTGGTCTGACCCTGAGCATCTTCCAAGGTCTGCTG AGCTGGCTGCTGCGT

ACCCCGGAAAACCCGACCCTGGACGAGGTTGGTAGCCCGATTCCGAACCGTAACAACCCGATCGAGTGGGTT

GAATTTCGTAGCAAATACCCGCAGGTGTACTATAACCTGGATCAAAACCTGGCGGTG GTTGAAAAGATCATTG

AGGAACGTAACAAAGGCCTGGCGAGCCCGTATGAGGTGCTGCTGCCGAGCCACATTG CGGCGAGCATCAAC

ATTTAA

Amino acid Sequence for CoLOX-d4 - SEQ ID NO: 15

MTSSPTVRSMVMLAVLAVSALESAPCASAFATLPRALVRPQAALKYRAEDKN DVDVAPAGSTASDVSKPEGKATA

VAKGTVNAPI EEAWKVFRSFSNM DQWMPVYGEWEATGDSVGDTRTFN FKDQPTFFTTERLVGLDDSQYKM KYT

LVDCKGSPVPI ESI DTIVTFTANDDVTEVDWRSWTKSPMVDLI KGRQAAGYAGGIAALDRYLN PSLGTVDVTIKSAD

N LDGDFLSSSYATLMVTDADPEQVHAKEWGTSPEFDAKPVQFSLLKPDSKLYMNVM LTKYGVDTPVGYAVFDIQ

KSLKSGETVTETFQLEGSN DATLTVEMELNLRQGSVLPQSKAQKN LATLVALQQSVERVRDRIVTIGKLAGEPEKSV

WEYERKSGLPKSVKGLPRSEVLPPH KIALMVDAIAEYAYTQFQLVQRLLPVRNSYDRYAAYFAPEGEEYVPIPQILKD

MTWSTDDEFIRQI FAGLN PLQVEVVKNKAGLPSKLQELKAKDGSDVDKLISEGRLYVLDYSVLKDLDLNRNGVTLYA

PTMLIYRTGGDKLDVLGI MLEPRRDDAPVYTPDSETPNKFLLAKCHVACADNQVHQFTYH LGYAHLATEPLAIASH

NVLEKNSHPLGM FLKPH FRDNIGINYLARQTLVADEDAITDHTFATGTAQGVSMVVDAFKSYNFLESGLPDELRRR

GFERSDDLKVYRYRDDGWLVWDTLWKYAEDMVNELYGTDN DVTADKVVQEWAREASGSDTADVQGFPESITT

KYILTKVLTTIIWQASALHSALNYIQYPYTATPINRAASIFGPVPDGEADITEQDILDVIPGGLDDENNRGLTLSIFQGLL

SWLLRTPENPTLDEVGSPIPNRN NPI EWVEFRSKYPQVYYNLDQNLAVVEKII EERN KGLASPYEVLLPSH IAASI NI

2. UfLOX

Coding sequence for UfLOX2 - SEQ ID NO: 16

ATGCCTTCCATCAAACCATGCCTACCGGGTGACTCTGCCAACAGCGCAGCCCGGACA GCCTCAATCAAGGAGA

AGCGGGCGCAGATTGGATACGACTACAAGATGCTCCCTAAGCTCGCCCTGGCCTCAG CACCCCCAGCAAAGTT

CGTGGAGCTCTCTGATGCCTACATGGCTGAGCGCATTGGTGAAACTGCAAAGTTTTT TAAGAACAAGGAGATG

ACGAAGGCCCGGAGGATGTTTGACGTTGTCAACAGGATGGAGGACTTCAACGACTAT TTCATTCTCCCTCCTG

TGATCGCGCCGGAGCATGCTAAGGGCAAGTGGATGGAGGATGACTTTTTTGCGGAGC AGCGCCTGTCCGGG

GCAAACCCTCTGGTCCTGGCTAAGCTCGACCGTGACGACGCCCGCGCAGAAATCCTCGAGGATATGAACCTTG

ACTTCAGCGTCAACAGCGAGCTCAGCAGAGGCAACATCTACGTCTGCGACTACACTGGGACGGACCCGACGT

ACCGCGGCCCTTGCATGGTCACGGGAGGCGAAAACAACTCTGGAAAGAAGAAGTGGC TGCCAAAACCCCTAT

CATGGTTCCGCTGGATTGAGGACGACAAAAACAAGGTGGGCGGCAAGCTCGTGCCTG TCGCCATTCAGCTCG

ATGCCAGTGAGGACCCAGTCAACTACGTCCGCAAGGACTCGCGGGTGTACACCCCCA ACGAGGAGCACGAGT

ACGACTGGCTGTTTGCAAAGATCTGTGTCCAGGTGGCAGACTCTCTGCACCACGAGA TGGGCTCCCATCTCGC

TCGCTGCCACTTCACGATGGAACCGATCGCCGTGTGTGTTCACCGGACGATGGCAGA AGAGCACCCCATCGCT

CTGCTCCTGAACCTGCACATGCGGTTCCACATTGCCAACGACTCGGTCGCGGCTTAC ACACTCATTGGTCCTTC

TGGCAACGTTGATGACTTGATGCCTGGAACCCTGCGCGAGTCCATGGCGCTACTGAC GGAGTCATACGACAA

GTGGGACCTCATCGGCACCAACTTTGAGAACGACCTCTTCAACCGCGAGGTGAACGA TGATGAACGCCTGCCC

CACTACCCCTACCGTGACGATGGCAAGCTCATCTGGAAGATCATCGAGGACTGGGTG GAGAAATACGTAAAT

GCCTTCTACGACAACGATGATGAGGTTGAGGGCGATCCTGAGCTGCAGGCGTTCGCC AAGGAGTGCAAGGAC

AAGAAGGAAGGTGGCCGGGTGAAGGGTATGCCGGAGACGATCCGCAGCCGTAGCATG CTTGTTGAAATCCT

CACCAGCATCATCTTTGTGTGTGGCCCTGGCCACGGAGCTATCAACTTCTCGCAATACGACTATATGTCGTTCG

TGCCCAACATGCCACTCGCGATTTATGAGGATATCCAGCTGCTCGCAGACCAAAAGG AGCCGGTTACGGAGG

CGCAGCTCATGTCGATCCTGCCAGACGGTGAAACCGCAGCCCGCCAGCTTGAGATTGTATACAACCTGACCGC

CTACAAGTTCGATAAGTTCGGGGATTATGACAGGACCTTCAAGGAGTGGTACGGCGA GACCTTTGAAGCCCA

TTTCAAGGACTACCCGCTCGTGATCCAGGGCTATCGGCAGCTCCAGGTTGCGCTGAG GCAGTCGGAGGTGGA

GATTAAGAAGCGCAACGCCAAACGCCCGAACAACTATCCGTACATGCAGCAGAGCGA GATGTTGAACAGCAT

CAGCATTTAA

Codon-optimized coding sequence of UfLOX2 by Genscript genetic codon frequency of E. coli - SEQ ID NO: 17

ATGCCGAGCATCAAACCGTGCCTGCCGGGTGACAGCGCGAACAGCGCGGCGCGTACC GCGAGCATCAAAGA

AAAGCGTGCGCAGATTGGTTACGATTATAAAATGCTGCCGAAGCTGGCGCTGGCGAG CGCTCCGCCGGCGAA

GTTCGTGGAGCTGAGCGACGCGTATATGGCGGAGCGTATTGGTGAAACCGCGAAATT CTTTAAAAACAAGGA

GATGACCAAGGCGCGTCGTATGTTTGATGTGGTTAACCGTATGGAAGACTTCAACGA TTACTTTATTCTGCCGC

CGGTGATTGCGCCGGAGCACGCGAAGGGCAAGTGGATGGAGGACGATTTCTTTGCGG AACAGCGTCTGAGC

GGTGCGAACCCGCTGGTTCTGGCGAAACTGGACCGTGACGATGCGCGTGCGGAGATCCTGGAAGACATGAA

CCTGGATTTCAGCGTGAACAGCGAACTGAGCCGTGGCAACATTTACGTTTGCGACTA TACCGGCACCGATCCG

ACCTACCGTGGTCCGTGCATGGTTACCGGTGGCGAAAACAACAGCGGTAAGAAAAAG TGGCTGCCGAAACCG

CTGAGCTGGTTTCGTTGGATCGAGGACGATAAAAACAAAGTGGGTGGCAAGCTGGTG CCGGTTGCGATTCAG

CTGGACGCGAGCGAAGATCCGGTGAACTACGTTCGTAAAGACAGCCGTGTTTATACCCCGAACGAGGAACAC

GAGTACGACTGGCTGTTCGCGAAGATCTGCGTGCAAGTTGCGGATAGCCTGCATCAT GAGATGGGTAGCCAC

CTGGCGCGTTGCCACTTTACCATGGAACCGATCGCGGTGTGCGTTCACCGTACCATG GCGGAGGAACACCCG

ATTGCGCTGCTGCTGAACCTGCACATGCGTTTCCACATCGCGAACGATAGCGTGGCG GCGTATACCCTGATTG

GCCCGAGCGGTAACGTTGACGATCTGATGCCGGGCACCCTGCGTGAGAGCATGGCGC TGCTGACCGAAAGCT

ACGACAAGTGGGATCTGATCGGCACCAACTTCGAAAACGACCTGTTTAACCGTGAGG TGAACGACGATGAAC

GTCTGCCGCACTACCCGTATCGTGACGATGGTAAACTGATTTGGAAGATCATTGAGG ATTGGGTGGAAAAAT

ACGTTAACGCGTTCTATGACAACGACGATGAGGTGGAAGGCGATCCGGAGCTGCAGG CGTTTGCGAAAGAG

TGCAAGGACAAAAAGGAAGGTGGCCGTGTTAAGGGTATGCCGGAGACCATCCGTAGC CGTAGCATGCTGGT

TGAGATTCTGACCAGCATCATTTTCGTTTGCGGTCCGGGCCACGGTGCGATCAACTT CAGCCAATACGATTATA

TGAGCTTTGTGCCGAACATGCCGCTGGCGATCTACGAGGACATTCAGCTGCTGGCGG ATCAAAAAGAGCCGG

TTACCGAAGCGCAGCTGATGAGCATTCTGCCGGATGGTGAAACCGCGGCGCGTCAAC TGGAAATTGTGTACA

ACCTGACCGCGTATAAATTCGATAAGTTTGGCGACTATGATCGTACCTTTAAAGAAT GGTACGGCGAGACCTT

CGAAGCGCACTTTAAGGACTACCCGCTGGTTATCCAGGGTTATCGTCAGCTGCAAGTGGCGCTGCGTCAAAGC

GAGGTTGAAATTAAAAAGCGTAACGCGAAGCGTCCGAACAACTACCCGTATATGCAGCAAAGCGAGATGCTG

AACAGCATCAGCATTTAA

Amino acid Sequence for UfLOX2 - SEQ ID NO: 18

MPSI KPCLPGDSANSAARTASI KEKRAQIGYDYKMLPKLALASAPPAKFVELSDAYMAERIGETAKFFKNKEMTKAR

RMFDVVNRMEDFNDYFILPPVIAPEHAKGKWM EDDFFAEQRLSGAN PLVLAKLDRDDARAEI LEDMN LDFSVNS

ELSRGNIYVCDYTGTDPTYRGPCMVTGGEN NSGKKKWLPKPLSWFRWIEDDKN KVGGKLVPVAIQLDASEDPVN

YVRKDSRVYTPNEEH EYDWLFAKICVQVADSLH HEMGSHLARCH FTM EPIAVCVH RTMAEEH PIALLLNLHMRFH

IANDSVAAYTLIGPSGNVDDLMPGTLRESMALLTESYDKWDLIGTNFENDLFNREVNDDERLPHYPYRDDGKLIWK

II EDWVEKYVNAFYDNDDEVEGDPELQAFAKECKDKKEGGRVKGMPETIRSRSMLVEI LTSI IFVCGPGHGAIN FSQ

YDYMSFVPNM PLAIYEDIQLLADQKEPVTEAQLMSILPDGETAARQLEIVYN LTAYKFDKFGDYDRTFKEWYGETFE

AHFKDYPLVIQGYRQLQVALRQSEVEI KKRNAKRPNNYPYMQQSEM LNSISI
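The correspondence between a coding sequence and its amino acid sequence in this listing (for example SEQ ID NO: 17 and SEQ ID NO: 18) can be checked by translation with the standard genetic code. The following Python snippet is only a minimal sketch for that purpose and is not part of the disclosed methods; the helper names `clean` and `translate` are hypothetical, and it assumes that whitespace inside a listed sequence is a layout artifact and that translation starts at the first base.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def clean(seq):
    """Remove all whitespace from a listed sequence and upper-case it."""
    return "".join(seq.split()).upper()


def translate(dna):
    """Translate a coding sequence from its first base up to the first stop codon."""
    dna = clean(dna)
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# First 27 bases of SEQ ID NO: 17 versus the start of SEQ ID NO: 18.
print(translate("ATGCCGAGC ATCAAACCG TGCCTGCCG"))  # MPSIKPCLP
print(clean("MPSI KPCLP"))                         # MPSIKPCLP
```

Comparing `translate(...)` of a full coding sequence with `clean(...)` of the corresponding amino acid sequence would flag positions where the two entries disagree.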

3. Bacterial LOX

Codon-optimized coding sequence for WP_002738122.1 - SEQ ID NO: 19

ATGGTGAACACCCCGCCGCCGACCCCGTGCCTGCCGCAGAACGAGCCGGATGCGAACCGTCGTGCGGATAGC

CTGAACCTGCAGCGTCAAGCGTACCGTTATGACTACCAGTATCTGCCGCCGCTGGTG CTGATGGAGAGCGTTC

CGGCGGCGGAAAACTTCAGCTTTCAATATATTACCGAACGTCTGGCGGCGACCGCGG AACTGCCGGCGAACA

TGCTGGCGGTGAAGGTTAAAAGCTTCCTGGACCCGCTGGATGAGCTGCAGGACTACG AAGATTTCTTTGCGAT

CATTCCGCTGCCGAAGATCGCGAAAGTGTATCAGACCAACGATGCGTTTGCGGAACA ACGTCTGAGCGGTGC

GAACCCGCTGGTTCTGCACCTGCTGAAGCCGGGTGATGCGCGTGCGCAGGTTCTGAA CCAAATTCCGAGCAG

CAAAACCGATTTCGAGCCGCTGTTTCAGGTTAACCAAGAACTGGCGGCGGGCAACAT CTACATTGCGGACTAT

ACCGGCACCGATATCAACTACCTGGGTCCGAGCCTGATTCAGGGTGGCACCCACGCG AAGGGTCGTAAATAT

CTGCCGAAGCCGCGTGCGTTCTTTTGGTGGCGTAAGAGCGGCATCCGTGACCGTGGT AAACTGGTGCCGATC

GCGATTCAGTTCGGCGAGAACGCGGAAAAGCTGTACACCCCGTTCGAGAAAAACCCG CTGGCGTGGCTGTTT

GCGAAGATTTGCGTGCAAGTTGCGGATAGCAACCACCACGAAATGAACAGCCACCTG TGCCGTACCCACTTCG

TTATGGAGCCGATTGCGATTGGCACCGCGCGTCAGCTGGCGGAAAACCACCCGCTGA GCCTGCTGCTGAAAC

CGCACCTGCGTTTTATGCTGACCAACAACCACCTGGGTCAAGAGCGTCTGATCAACC CGGGTGGCCCGGTGGA

TGAGCTGCTGGCGGGCACCCTGGGTGAAAGCATGGCGCTGGTTAAGGACGCGTACGC GAACTGGAACCTGC

GTGATTTCGCGTTTCCGAAAGAGATTAGCAACCGTGGCATGGACGATACCGAACGTC TGCCGCACTACCCGTA

TCGTGACGATGGTATGCTGGTGTGGCAGAGCATCAACCAATTCGTTAGCGACTACCTGCACTACTTTTATCCGA

ACCCGCAGGACATTACCAACGATCAGGAGCTGCAAGCGTGGGCGGGTGAACTGAGCA ACAGCGCGGCGGAT

CAAGGTGGCAACGTGAAGGGTATGCCGGCGAACTTCACCGACGTTGAGGATCTGATCGAAGTGGTTACCACC

ATCATTTTTATTTGCGGCCCGCTGCACAGCGCGGTTAACTACGGCCAGTACGACTAT ATGACCTTTGCGGCGAA

CATGCCGCTGGCGGCGTATTGCGACCTGCCGGAGGCGATCAAGGATACCACCGGTAG CATCATTGGCGACGC

GCGTGGTAGCATCACCGAAAAAGATATTCTGCAGCTGCTGCCGCCGTACAAGAAAGC GGCGGATCAGCTGCA

AAGCCTGTTCACCCTGAGCGACTACCGTTATGATCAACTGGGCTACTATGACAAGGC GTTTCGTGAGCTGTAT

GGTCGTAAATTCGAGGAAGTGTTTGCGGAAGGCGATCAGGCGACCATCACCGGTTTC CTGCGTCAATTTCAGC

AAAACCTGAACATGAACGAGCAGGAAATCGACGCGAACAACCAAAAGCGTATTGTTCCGTACACCTATCTGA

AACCGAGCCTGATTCTGAACAGCATCAGCATTTAA

Amino acid Sequence for WP_002738122.1 - SEQ ID NO: 20

MVNTPPPTPCLPQN EPDANRRADSLNLQRQAYRYDYQYLPPLVLMESVPAAENFSFQYITERLAATAELPANM LA

VKVKSFLDPLDELQDYEDFFAI IPLPKIAKVYQTNDAFAEQRLSGANPLVLHLLKPGDARAQVLNQI PSSKTDFEPLFQ

VNQELAAG NIYIADYTGTDI NYLGPSLIQGGTHAKGRKYLPKPRAFFWWRKSGI RDRGKLVPIAIQFGENAEKLYTPF

EKNPLAWLFAKICVQVADSNHH EM NSH LCRTH FVM EPIAIGTARQLAEN HPLSLLLKPHLRFM LTNN HLGQERLIN

PGGPVDELLAGTLGESMALVKDAYANWNLRDFAFPKEISN RGMDDTERLPHYPYRDDGMLVWQSINQFVSDYL

HYFYPNPQDITN DQELQAWAGELSNSAADQGGNVKGM PAN FTDVEDLI EVVTTII FICGPLHSAVNYGQYDYMTF

AAN MPLAAYCDLPEAI KDTTGSIIGDARGSITEKDILQLLPPYKKAADQLQSLFTLSDYRYDQLGYYDKAFRELYG RKF

EEVFAEGDQATITGFLRQFQQN LN MN EQEI DANNQKRIVPYTYLKPSLILNSISI

Codon-optimized coding sequence for WP_006635899.1 - SEQ ID NO: 21

ATGGTGGATAACATGAAGCCGCTGCTGCCGCAAGACGATCCGAACCCGGAACAGCGT CACGACAGCCTGAAC

CGTCAGCAACAGGCGTACCAATTCGATTATGAAAGCCTGAGCCCGCTGGCGCTGCTG AAGGATGTGCCGGCG

GTTGAGAACTTTAGCAGCAAATACCTGGCGGAGCGTATCCTGGCGACCAGCGAACTG CCGGCGAACATGCTG

GCGGCGGACAGCCGTACCTTCCTGGACCCGCTGGATGAGCTGCAGGACTACGAAGAT TTCTTTACCTGGCTGC

CGCTGCCGGGTGTGGCGAAAATCTATCAAACCGATCGTAGCTTTGCGGAACAGCGTC TGAGCGGTGCGAACC

CGATGGTTCTGCGTCTGCTGCACCAAGAGGACAGCCGTGCGGAAACCCTGGCGCAAC TGTGCTGCCTGCAGC

CGCTGTTCGACCTGCGTAAGGAGCTGCAGGATAAAAACATCTACATTGCGGACTATA CCGGCACCGATGAAC

ACTATCGTGGTCCGGCGAAGGTTGCGGGTGGCACCTACGAGAAGGGTCGTAAATATC TGCCGAAACCGCGTG

CGTTCTTTGCGTGGCGTTGGACCGGTATCCGTGATCGTGGCGAGATGACCCCGATCGCGATTCAACTGGACCC

GAAGCCGGGTAGCCACCTGTACACCCCGTTTGACCCGCCGATTGATTGGCTGTATGC GAAACTGTGCGTGCAG

GTTGCGGACGCGAACCACCACGAAATGAGCAGCCACCTGGGCCGTACCCACCTGGTGATGGAGCCGATCGCG

ATTGTTACCGCGCGTCAGCTGGCGAAGAACCACCCGCTGAGCCTGCTGCTGAAACCG CACTTCCGTTTTATGCT

GACCAACAACGATCTGGCGCGTAGCCATCTGATTGCGCCGGGTGGCCCGGTGGATGA ACTGCTGGGTGGCAC

CCTGGCGGAGACCATGGAACTGACCCGTGAGGCGTGCAGCACCTGGAGCCTGGATGA GTTTGCGCTGCCGGC

GGAACTGAAGAACCGTGGTATGGACGATCCGAACCAGCTGCCGCACTACCCGTATCG TGACGATGGCCTGCT

GCTGTGGGATGCGATCGAAACCTTTGTTAGCGGTTACCTGAAGTTCTTTTATCCGAC CAACGAGGGCATTGTG

CAAGACGTTGAACTGCAGACCTGGGCGAAAGAGCTGGCGAGCGACGATGGTGGCAAG GTGAAGGGTATGCC

GCACCACATCGACACCGTTGAGCAGCTGATCGCGATTGTGACCACCGTTATTTTCAC CTGCGGCCCGCAACAC

AGCGCGGTGAACTTCCCGCAGTACGATTATATGAGCTTTGCGGCGAACATGCCGCTG GCGGCGTACCGTGAC

ATCCCGGGTATTACCGCGAGCGGCCACCTGGAAGTGATCACCGAAAACGATATTCTG CGTCTGCTGCCGCCGT

ATAAGCGTGCGGCGGACCAACTGCAGATCCTGTTCATTCTGAGCGCGTACCGTTATG ACCGTCTGGGTTACTA

TGATAAAAGCTTTCGTGAACTGTACCGTATGAGCTTCGATGAGGTGTTTGCGGGCAC CCCGATCCAACTGCTG

GCGCGTCAGTTCCAACAGAACCTGAACATGGCGGAACAAAAGATCGACGCGAACAAC CAGAAACGTGTGATT

CCGTATTTTGCGCTGAAACCGAGCCTGGTTCTGAACAGCATTAGCATGTAA

Amino acid Sequence for WP_006635899.1 - SEQ ID NO: 22

MVDNMKPLLPQDDPN PEQRHDSLNRQQQAYQFDYESLSPLALLKDVPAVEN FSSKYLAERILATSELPANMLAAD

SRTFLDPLDELQDYEDFFTWLPLPGVAKIYQTDRSFAEQRLSGAN PMVLRLLHQEDSRAETLAQLCCLQPLFDLRKE

LQDKN IYIADYTGTDEHYRGPAKVAGGTYEKGRKYLPKPRAFFAWRWTGI RDRGEMTPIAIQLDPKPGSH LYTPFD

PPIDWLYAKLCVQVADANH HEMSSH LGRTH LVMEPIAIVTARQLAKN HPLSLLLKPHFRFMLTN NDLARSHLIAPG

GPVDELLGGTLAETM ELTREACSTWSLDEFALPAELKNRGMDDPNQLPHYPYRDDGLLLWDAI ETFVSGYLKFFYP

TNEGIVQDVELQTWAKELASDDGGKVKGMPHHIDTVEQLIAIVTTVIFTCGPQHSAVNFPQYDYMSFAANMPLA

AYRDIPGITASGHLEVITENDILRLLPPYKRAADQLQILFILSAYRYDRLGYYDKSFRELYRMSFDEVFAGTPIQLLARQF

QQN LN MAEQKIDANNQKRVI PYFALKPSLVLNSISM

Codon-optimized coding sequence for WP_015178512.1 - SEQ ID NO: 23

ATGGTGGACAACATGAAGCCGAGCCTGCCGCAAGACGATCCGAACCAAGAACAGCGT AAAGACAGCCTGAA

CCGTCAGCAACAGGCGTACCAGTTCGATTATGAGAGCCTGAGCCCGCTGGCGCTGCT GAAGAACGTGCCGGC

GGTTGAAAACTTTAGCAGCAAATACATCGGCGAGCGTATTCTGGCGACCAGCGAACT GCCGGCGAACATGCT

GGCGGCGGACAGCCGTACCTTCCTGGACCCGCTGGATGAGCTGCAAGACTACGAAGA TTTCTTTACCCTGCTG

CCGCTGCCGGCGGTGGCGAAGATTTATCAAACCGATCGTAGCTTTGCGGAACAGCGT CTGAGCGGTGCGAAC

CCGATGGTTCTGCGTCTGCTGGATGCGGGTGATCCGCGTGCGCAAACCCTGGCGCAG ATCAGCAGCTTCCACC

CGCTGTTTGACCTGGGCCAGGAGCTGCAACAGAAAAACATTTACGTTGCGGACTATA CCGGCACCGATGAGC

ACTACCGTGCGCCGAGCAAGATCGGTGGCGGTAGCTATGAAAAGGGCCGTAAATTCC TGCCGAAACCGCGTG

CGTTCTTTGCGTGGCGTTGGACCGGCATCCGTGACCGTGGTGAGATGACCCCGATCG CGATTCAACTGGACCC

GACCCCGGATAGCCATGTGTACACCCCGTTTGACCCGCCGGTTGATTGGCTGTTTGCGAAGCTGTGCGTGCAG

GTTGCGGATGCGAACCACCACGAGATGAGCAGCCACCTGGGTCGTACCCACCTGGTG ATGGAACCGATCGCG

ATTGTTACCGCGCGTCAACTGGCGCAGAACCACCCGCTGAGCCTGCTGCTGAAACCG CACTTCCGTTTTATGCT

GACCAACAACGAGCTGGCGCGTAGCTATCTGATTGCGCCGGGCGGTCCGGTGGATGA ACTGCTGGGTGGCAC

CCTGCCGGAGACCATGGAAATTGCGCGTGAGGCGTGCAGCACCTGGAGCCTGGATGA GTTTGCGCTGCCGGC

GGAACTGAAGAACCGTGGCATGGACGATACCAACCAGCTGCCGCACTACCCGTATCG TGACGATGGCCTGCT

GCTGTGGGACGCGATTGAGACCTTTGTTAGCGGTTACCTGAAATTCTTTTATCCGAC CGAAATCGCGATTGTG

CAAGACGTTGAGCTGCAAACCTGGGCGCAGGAACTGGCGAGCGATCGTGGCGGTAAA GTGAAAGGCATGCC

GCCGCGTATCAACACCGTGGAACAGCTGATCAAGATTGTTACCACCATCATTTTCACCTGCGGTCCGCAACACA

GCGCGGTTAACTTCCCGCAGTACGAGTATATGAGCTTTGCGGCGAACATGCCGCTGG CGGCGTACCGTGATAT

CCCGAAGATTACCGCGAGCGGTAACCTGGAAGTGATCACCGAAAAAGACATTCTGCG TCTGCTGCCGCCGTAT

AAGCGTGCGGCGGATCAGCTGAAAATCCTGTTCACCCTGAGCGCGTACCGTTATGAC CGTCTGGGCTACTATG

ATAAGAGCTTTCGTGAGCTGTACCGTATGAGCTTCGACGAAGTTTTTGCGGGCACCC CGATTCAACTGCTGGC

GCGTCAGTTTCAACAGAACCTGAACATGGCGGAACAAAAGATCGATGCGAACAACCAGAAACGTGTGATCCC

GTATATTGCGCTGAAACCGAGCCTGGTTATCAACAGCATTAGCATGTAA

Amino acid Sequence for WP_015178512.1 - SEQ ID NO: 24

MVDNMKPSLPQDDPNQEQRKDSLNRQQQAYQFDYESLSPLALLKNVPAVEN FSSKYIGERILATSELPANMLAAD

SRTFLDPLDELQDYEDFFTLLPLPAVAKIYQTDRSFAEQRLSGANPMVLRLLDAGDP RAQTLAQISSFH PLFDLGQEL

QQKN IYVADYTGTDEHYRAPSKIGGGSYEKGRKFLPKPRAFFAWRWTGIRDRGEMTPIAIQLDP TPDSHVYTPFDP

PVDWLFAKLCVQVADAN HH EMSSHLGRTHLVM EPIAIVTARQLAQNHPLSLLLKPH FRFMLTN NELARSYLIAPG

GPVDELLGGTLPETMEIAREACSTWSLDEFALPAELKN RGMDDTNQLPHYPYRDDGLLLWDAI ETFVSGYLKFFYP

TEIAIVQDVELQTWAQELASDRGGKVKGMPPRINTVEQLIKIVTTI IFTCGPQHSAVN FPQYEYMSFAAN MPLAAY

RDI PKITASGN LEVITEKDI LRLLPPYKRAADQLKILFTLSAYRYDRLGYYDKSFRELYRMSFDEVFAGTPIQLLARQFQ

QN LN MAEQKIDAN NQKRVIPYIALKPSLVI NSISM

Codon-optimized coding sequence for WP_015204462.1 - SEQ ID NO: 25

ATGCCGCAACCGTACCTGCCGCAGAACGAGCCGAACCCGGAAAAACGTAACAACGAC CTGAGCGATCAGCAA

CAGGCGTACGAGTATGATTACAAGTATCTGCCGCCGCTGGTGCTGCTGAAGAAAATT CCGGCGTTCGAAAACT

TTAGCGCGCAGTACATCGCGGAACGTGTGGTTGCGACCAGCGAGCTGGTTCCGAACA TGCTGGCGGCGAAAG

CGCGTAGCTTTCTGGACCCGCTGGACGATATCAAGGACTACGAGGACCTGTTCACCCTGCTGCCGCTGCCGGA

AGTGGCGAAAGTTTATCAAACCAACAACAGCTTTGCGGAGCAGCGTCTGAGCGGTGCGAACCCGTTCGTGAT

TCGTCTGCTGGACGAGGACGATCCGCGTAGCCAAGTTCTGGAACAGATCCCGAGCTT CAAAGACGATTTTGA

GCCGCTGTTCGATGTGCGTAAGGAACTGGCGGCGGGTAACATCTACATTACCGACTA TACCGGCACCGATGA

GTACTATCGTGGCCCGAGCATGGTTCAGGGTGGCACCTACGAAAAGGGCCGTAAATA TCTGCCGAAACCGCT

GGCGTTCTTTTGGTGGCAACGTACCGGTATTAGCGACCGTGGCAAGCTGGTGCCGAT CGCGATTCAGCTGGA

TGCGAGCAAGAACAGCAAAGTGTACACCCCGACCAACAGCAAAGTTTATACCCCGTT TGAGCAAAACCCGCTG

GACTGGCTGTTCGCGAAGCTGTGCGTGCAGATCGCGGATGGTAACCACCACGAAATG AGCAGCCACCTGTGC

CGTACCCACTTCGTTATGGAGCCGATTGCGATTGGCACCGCGCACCAGCTGGCGGAA AACCACCCGCTGAGCC

TGCTGCTGCGTCCGCACTTCCTGTTTATGCTGACCAACAACCACCTGGGCCAACAGC GTCTGATCAACCCGGGT

GGCCCGGTGGATGAGCTGCTGGCGGGCACCCTGCCGGAGAGCATGGAACTGGTTAAG GATGCGTACGAGGG

CTGGAACATTAAAGAATTCGCGTTTCCGACCGAGATCAAGAACCGTGGTATGGACAA CACCGAACGTCTGCC

GCACTACCCGTATCGTGACGATGGCATGCTGGTTTGGAAAGCGATTCACACCTTTGT GAGCGATTACGTTAAC

CACTTCTATCCGACCCCGGAAGACATCACCGGTGATACCGAGCTGCAAGCGTGGGCG AAGGAACTGAGCGAC

CAAAGCGCGCAGACCAACGGTGGCAAGGTGAAAGGCATGCCGACCAGCTTTACCACC GTGCAGGAGCTGAT

CGAAATTGTTACCACCATCATTTTCATTTGCGGTCCGCAACACAGCGCGGTTAACTA CGCGCAGGATGGCTATA

TGACCTTTGCGGCGAACATGCCGCTGGCGGCGTACCGTGACATCCCGAAGCAGAGCC ACAAACCGCAGGATC

AACCGACCGCGACCCCGAGCGTGGCGGTTCAAACCACCGCGGAGCAGACCACCGCGG AACAAACCAAGGCG

GTGGAAATTACCGCGGACAAAGCGACCCTGGATCAGAACACCGTTCTGCAAAAACGT GCGGTGCAGACCACC

ACCGTTGAGATCCCGGAAGACCAAATTACCGAGGAACAGATCCTGAAGCTGCTGCCG CCGTACAAACGTACC

GCGGACCAACTGCAGAGCCTGTTTGTGCTGAGCGCGTACCAATATGATCGTCTGGGT TACTATGAGAAGGCG

TTCCAACAGCTGTACAACGACAAGTTCGAAGATGTTTTCAAGGACGATAACAACCAAGCGATCATTGCGATTG

TGCGTCAGTTCCAACAGAACCTGAACATGGTTGAGCAGGAAATCGACGCGAACAACAAGAAACGTGTGGTTC

CGTACCTGTATCTGAAGCCGAGCCTGATCCTGAACAGCATCAGCATTTAA

Amino acid Sequence for WP_015204462.1 - SEQ ID NO: 26

MPQPYLPQN EPNPEKRN NDLSDQQQAYEYDYKYLPPLVLLKKIPAFEN FSAQYIAERVVATSELVPNM LAAKARSF

LDPLDDIKDYEDLFTLLPLPEVAKVYQTNNSFAEQRLSGAN PFVI RLLDEDDPRSQVLEQIPSFKDDFEPLFDVRKELA

AGNIYITDYTGTDEYYRGPSMVQGGTYEKGRKYLPKPLAFFWWQRTGISDRGKLVPI AIQLDASKNSKVYTPTNSKV

YTPFEQNPLDWLFAKLCVQIADGNH HEMSSH LCRTHFVMEPIAIGTAHQLAENH PLSLLLRPH FLFMLTN NH LGQ

QRLI N PGGPVDELLAGTLPESM ELVKDAYEGWNIKEFAFPTEIKN RGMDNTERLPHYPYRDDGMLVWKAIHTFVS

DYVNH FYPTPEDITGDTELQAWAKELSDQSAQTNGGKVKGMPTSFTTVQELIEIVTTII FICGPQHSAVNYAQDGY

MTFAAN MPLAAYRDIPKQSH KPQDQPTATPSVAVQTTAEQTTAEQTKAVEITADKATLDQNTVLQKRAVQTTTV

EI PEDQITEEQILKLLPPYKRTADQLQSLFVLSAYQYDRLGYYEKAFQQLYN DKFEDVFKDDN NQAIIAIVRQFQQNL

N MVEQEIDANN KKRVVPYLYLKPSLI LNSISI

Codon-optimized coding sequence for WP_028091425.1 - SEQ ID NO: 27

ATGCAGCCGTTCCTGCCGCAAAACGACCCGAACCCGAGCCAGCGTCAAAGCAGCCTG GAGAAGGGTCGTAAG

GAATACCAGTTCATGTACGATTTTCTGCCGCCGATGGCGATGATCAAGAGCGTGCCG CCGGCGGAGAACTTTA

GCACCAAATACATTGCGGAACGTACCCTGGAGGCGGCGGAACTGCCGCTGAACATGA TGGCGGTTAAGACCC

ACGCGATGTGGGACCCGCTGGATGAGCTGCAGGACTACGAAGATTTCTTTCCGGTGC TGCAAAAGCCGAACG

TTATGAAAACCTATGAGACCGACGATAGCTTCGCGGAACAGCGTCTGTGCGGCGTGA ACCCGATGGTTCTGC

GTCAGATCAAGCAAATGCCGGCGAACTTCGCGTTTACCATTGAGGAACTGCAAGACA AATTCGGTAGCAGCA

TCAACCTGATTGAGCGTCTGGCGACCGGCAACCTGTACGTGGCGGATTATCGTAGCC TGGCGTTTATCCAGGG

TGGCACCTACGCGAAGGGTAAGAAATATCTGCCGGCGCCGCTGGCGTTCTTTTGCTGGCGTACCAGCGGTTTC

CAGGACCGTGGCCAACTGGTGCCGGTTGCGATCCAGATTAACCCGAAAGCGGGTAAAGCGAGCCCGCTGCTG

ACCCCGTTTGATGATCCGCTGACCTGGTTTTACGCGAAAAGCTGCGTGCAAATCGCG GATGCGAACCACCACG

AGATGAGCAGCCACCTGTGCCGTACCCACCTGGTTATGGAGCCGTTTGCGGTGGTTA CCCCGCGTCAGCTGGC

GGAAAACCACCCGCTGCGTATTCTGCTGAAGCCGCACTTCCGTTTTATGCTGGCGAA CAACGACCTGGCGCGT

AAACGTCTGGTTAGCCGTGGTGGCTTCGTTGATGAGCTGCTGGCGGGCACCCTGCAG GAAAGCCTGCAAATC

GTGGTTGACGCGTACAAAAGCTGGAGCCTGGATCAGTTTGCGCTGCCGCGTGAACTG AAGAACCGTGGTGTG

AACGACGTTAAAAACCTGCCGCACTACCCGTATCGTGACGATGGCATCCTGCTGTGG AACGCGATTAACAAGT

TCGTTTTTAACTATCTGCAGCTGTACTATCAAAGCAGCGCGGACCTGAAGGCGGATG CGGAACTGCAGGCGT

GGGCGCGTGAACTGGTGGCGCAAGATGGTGGCCGTGTTAAGGGTATGAGCGACCGTA TCGATACCCTGGAG

CAGCTGGTTGAGATCGTTACCACCATCATTTACATTTGCGGCCCGCAGCACAGCGCG GTGAACTTCAGCCAAT

ACGAATATATGGGCTTTATTCCGAACATGCCGCTGGCGGCGTATCAGCCGATCCAGC AAAAGGGTGACATTAA

AGATCGTCAAGCGCTGATCGACTTCCTGCCGCCGGCGAAACCGACCAGCACCCAGCT GAGCACCGTTTACATT

CTGAGCGACTACCGTTATGATCGTCTGGGCTACTATGAGGAAGAGGAATTCACCGAC CCGAACGCGGATCAG

GTGGTTAACAAGTTTCAGCAAGAGCTGAACATGGTGCAGCGTAAGATCGAACTGAAC AACAAACGTCGTCTG

GTTAACTACAAATATCTGCAACCGCGTCTGATTCTGAACAGCATCAGCATTTAA

Amino acid Sequence for WP_028091425.1 - SEQ ID NO: 28

MQPFLPQN DPNPSQRQSSLEKGRKEYQFMYDFLPPMAMI KSVPPAEN FSTKYIAERTLEAAELPLN MMAVKTHA

MWDPLDELQDYEDFFPVLQKPNVMKTYETDDSFAEQRLCGVN PMVLRQI KQMPAN FAFTIEELQDKFGSSIN LIE

RLATGN LYVADYRSLAFIQGGTYAKGKKYLPAPLAFFCWRTSGFQDRGQLVPVAIQIN PKAGKASPLLTPFDDPLT

WFYAKSCVQIADANH HEMSSHLCRTHLVM EPFAVVTPRQLAEN HPLRI LLKPH FRFM LANNDLARKRLVSRGGFV

DELLAGTLQESLQIVVDAYKSWSLDQFALPRELKN RGVNDVKNLPHYPYRDDGILLWNAIN KFVFNYLQLYYQSSA

DLKADAELQAWARELVAQDGGRVKGMSDRIDTLEQLVEIVTTI IYICGPQHSAVN FSQYEYMGFI PNMPLAAYQPI

QQKGDI KDRQALI DFLPPAKPTSTQLSTVYILSDYRYDRLGYYEEEEFTDPNADQVVN KFQQELNMVQRKI ELNN KR

RLVNYKYLQPRLI LNSISI

Codon-optimized coding sequence for OBQ01436.1 - SEQ ID NO: 29

ATGCAGCCGTTCCTGCCGCAAAACGACCCGAACCCGGCGCAGCGTCAAAGCTGCCTG GAGAAGGGTCGTAAG

GAATACCAGTTCATGTACGATTTTCTGCCGCCGATGGCGATGCTGAAGAGCGTTCCG CCGGCGGAGAACTTTA

GCACCAAATACATCGCGGAACGTACCCTGGAGGCGGCGGAACTGCCGCTGAACATGA TGGCGGTGAAGACC

CACGCGATGTGGGACCCGCTGGATGAGCTGCAGGACTACGAAGATTTCTTTCCGATT CTGCAAAAGCCGAAC

GTTATGAAAACCTATGAGACCGACGATAGCTTCGCGGAACAGCGTCTGTGCGGCGTG AACCCGATGGTTCTG

CGTCAGATCAAGCAAATGCCGGCGAACTTCGCGTTTACCATTGAGGAACTGCAAGCG AAATTCGGTAACAGC

ATCAACCTGATTGAGCGTCTGGCGACCGGCAACCTGTACGTGGCGGACTATCGTAGC CTGGCGTTTATCCAGG

GTGGCACCTACGCGAAGGGTAAGAAATATCTGCCGGCGCCGCTGGCGTTCTTTTGCT GGCGTAGCAGCGGTT

TCCAGGATCGTGGCCAACTGGTGCCGGTTGCGATCCAGATTAACCCGAAAGCGGGTA AAGCGAGCCCGCTGC

TGACCCCGTTTGATGATCCGCTGACCTGGTTTTACGCGAAAAGCTGCGTGCAAATCGCGGACGCGAACCACCA

CGAGATGAGCAGCCACCTGTGCCGTACCCACCTGGTTATGGAGCCGTTTGCGGTGGT TACCCCGCGTCAGCTG

GCGGAAAACCACCCGCTGCGTATTCTGCTGCGTCCGCACTTCCGTTTTATGCTGGCG AACAACGACCTGGCGC

GTAAGCGTCTGGTTAGCCGTGGTGGCTTCGTTGATGAGCTGCTGGCGGGCACCCTGC AGGAAAGCCTGCAAA

TCGTGGTTGACGCGTACAAAAGCTGGAGCCTGGATCAGTTTGCGCTGCCGCGTGAAC TGAAGAACCGTGGTG

TGGACGATGTTAAAAACCTGCCGCACTACCCGTATCGTGACGATGGCATCCTGCTGT GGAACGCGATTAACAA

GTTCGTTTTTAACTATCTGCAGCTGTACTATAAGAGCCCGGCGGACCTGAAGGCGGA TGGTGAACTGCAGGC

GTGGGCGCGTGAACTGGTGGCGCAAGACGGTGGCCGTGTTAAAGGCATGAGCGACCGTATCGATACCCTGG

AGCAACTGGTGGAAATCGTTACCACCATCATTTACATTTGCGGCCCGCAGCACAGCGCGGTGAACTTCAGCCA

ATACGAGTATATGGGCTTTATTCCGAACATGCCGCTGGCGGCGTATCAGGAGATCCA GCAAAACGGTGACATT

GAAGATCGTCAAGCGCTGATCGATTTCCTGCCGCCGGCGAAGCCGACCAACACCCAG CTGAGCACCGTTTACA

TTCTGAGCGACTACCGTTATGATCGTCTGGGCTACTATGAGGAAGAGGAATTCACCG ACCCGAACGCGGATCA

GGTGGTTAACAAATTTCAGCAAGAGCTGAGCGTGGTTCAGCGTAAGATCGAACTGAA CAACAAAGGTCGTCT

GGTGAACTACGAATATCTGCAACCGGGCCTGATTCTGAACAGCATCAGCATTTAA

Amino acid Sequence for OBQ01436.1 - SEQ ID NO: 30

MQPFLPQN DPNPAQRQSCLEKGRKEYQFMYDFLPPMAMLKSVPPAENFSTKYIAERTLEAAELPLNMM AVKTHA

MWDPLDELQDYEDFFPILQKPNVM KTYETDDSFAEQRLCGVN PMVLRQI KQMPAN FAFTIEELQAKFGNSI NLIE

RLATGN LYVADYRSLAFIQGGTYAKGKKYLPAPLAFFCWRSSGFQDRGQLVPVAIQI NPKAGKASPLLTPFDDPLT

WFYAKSCVQIADANH HEMSSHLCRTHLVM EPFAVVTPRQLAEN HPLRI LLRPHFRFM LANNDLARKRLVSRGGFV

DELLAGTLQESLQIVVDAYKSWSLDQFALPRELKN RGVDDVKNLPHYPYRDDGILLWNAIN KFVFNYLQLYYKSPA

DLKADGELQAWARELVAQDGGRVKGMSDRIDTLEQLVEIVTTIIYICGPQHSAVNFS QYEYMGFIPNM PLAAYQEI

QQNGDI EDRQALI DFLPPAKPTNTQLSTVYILSDYRYDRLGYYEEEEFTDPNADQVVN KFQQELSVVQRKIELN NKG

RLVNYEYLQPGLILNSISI

Codon-optimized coding sequence for OBQ25779.1 - SEQ ID NO: 31

ATGATCAACATTATGCAGCCGTTCCTGCCGCAAAACGACCCGAACCCGGGTCAGCGT CAAAGCAGCCTGGAG

AAGGGCCGTAAGGAATACCAGTTCATGTACGATTTTCTGCCGCCGATGGCGATGCTG AAGAGCGTGCCGCCG

GCGGAGAACTTTAGCACCAAATACATCGCGGAACGTACCCTGGAGGCGGCGGAACTG CCGCTGAACATGATG

GCGGTTAAGACCCACGCGATGTGGGACCCGCTGGATGAGCTGCAGGACTACGAAGAT TTCTTTCCGGTGCTG

CAAAAGCCGAACGTTATGAAAACCTATGAGACCGACGATAGCTTCGCGGAACAGCGT CTGTGCGGTGTGAAC

CCGATGGTTCTGCGTCAGATCAAGCAAATGCCGGCGAACTTCGCGTTTACCATTGAG GAACTGCAAGCGAAAT

TCGGTAACAGCATCAACCTGATTGAGCGTCTGGCGACCGGCAACCTGTACGTTGCGG ACTATCGTAGCCTGGC

GTTTATCCAGGGTGGCACCTACGCGAAGGGTAAGAAATATCTGCCGGCGCCGCTGGC GTTCTTTTGCTGGCGT

AGCAGCGGTTTCCAGGATCGTGGCCAACTGGTGCCGGTTGCGATCCAGATTAACCCG AAAGCGGGTCAAGCG

AGCCCGCTGCTGACCCCGTTTGACAAGCCGCTGACCTGGTTTTACGCGAAAAGCTGC GTGCAGATCGCGGATG

CGAACCACCACGAGATGAGCAGCCACCTGTGCCGTACCCACCTGGTTATGGAGCCGTTTGCGGTGGTTACCCC

GCGTCAACTGGCGGAAAACCACCCGCTGCGTATTCTGCTGAAGCCGCACTTCCGTTT TATGCTGGCGAACAAC

GACCTGGCGCGTAAACGTCTGGTTAGCCGTGGTGGCTTCGTTGATGAGCTGCTGGCG GGCACCCTGCAGGAA

AGCCTGCAAATCGTGGTTGACGCGTACAAAAGCTGGAGCCTGGATCAGTTTGCGCTG CCGCGTGAACTGAAG

AACCGTGGTGTGGACGATGTTAAAAACCTGCCGCACTACCCGTATCGTGACGATGGC ATCCTGCTGTGGAACG

CGATTAACAAGTTCGTGTTTAACTATCTGCAGCTGTACTATAAGAGCCCGGCGGACC TGAAGGCGGATGGTGA

ACTGCAGGCGTGGGCGCGTGAACTGGTGGCGCAAGACGGTGGCCGTGTTAAAGGCAT GAGCGACCGTATCG

ATACCCTGGAGCAACTGGTGGAAATCGTTACCACCATCATTTACATTTGCGGCCCGC AGCACAGCGCGGTGAA

CTTCAGCCAATACGAGTATATGGGCTTTATTCCGAACATGCCGCTGGCGGCGTATCA GGCGATCCAGCAAAAG

GGCGACATTAAAGATCGTCAAGCGCTGATCGACTTCCTGCCGCCGGCGAAGCCGACC AACACCCAGCTGAGC

ACCGTTTACATTCTGAGCGACTACCGTTATGATCGTCTGGGTTACTATGAGGAAGAGGAATTCACCGACCCGA

ACGCGGATCAGGTGGTTAACAAATTTCAGCAAGAGCTGAACGTGGTTCAGCGTAAGA TCGAACTGAACAACA

AAGGCCGTCTGGTGAACTACGAATATCTGCAGCCGCGTCTGATTCTGAACAGCATCA GCATTTAA

Amino acid Sequence for OBQ25779.1 - SEQ ID NO: 32

MINIMQPFLPQNDPNPGQRQSSLEKGRKEYQFMYDFLPPMAMLKSVPPAENFSTKYIAERTLEAAELPLNMMAV

KTHAMWDPLDELQDYEDFFPVLQKPNVM KTYETDDSFAEQRLCGVNPMVLRQIKQM PANFAFTI EELQAKFGNS

INLIERLATGNLYVADYRSLAFIQGGTYAKGKKYLPAPLAFFCWRSSGFQDRGQLVPVAIQINPKAGQASPLLTPFDK

PLTWFYAKSCVQIADAN HH EMSSH LCRTHLVMEPFAVVTPRQLAENHPLRILLKPH FRFM LAN NDLARKRLVSRG

GFVDELLAGTLQESLQIVVDAYKSWSLDQFALPRELKN RGVDDVKNLPHYPYRDDGILLWNAI NKFVFNYLQLYYKS

PADLKADGELQAWARELVAQDGGRVKGMSDRIDTLEQLVEIVTTIIYICGPQHSAVN FSQYEYMGFI PNMPLAAY

QAIQQKGDI KDRQALI DFLPPAKPTNTQLSTVYI LSDYRYDRLGYYEEEEFTDPNADQVVNKFQQELNVVQRKIELN

N KGRLVNYEYLQPRLI LNSISI

Codon-optimized coding sequence for WP_039200563.1 - SEQ ID NO: 33

ATGAAGCCGTTCCTGCCGCAGAACGATCCGAACCCGACCCAGCGTCAAAGCAGCCTG GAGAAGGGCCGTAAA

GAGTACGAATTCCGTTATGACTTTCTGCCGCCGATGGCGATGCTGAAGAACGTGCCG CCGAGCGAGAACTTTA

GCACCAAATACATTGCGGAACGTACCATCGAGACCGCGGAACTGCCGAGCAACATGA TGGCGGTTAAAGCGC

ACGCGATGTGGGACCCGCTGGATGAGCTGCAGGACTACGAAGATTTCTTTCCGGTGC TGCAAAAGCCGAACG

TTATGAAAAACTATGAGACCGACGATAGCTTCGCGGAACAGCGTCTGTGCGGTGTGA ACCCGGTGGTTCTGT

GCCAGATTAAGCAAATGCCGGCGAACTTCGCGTTTACCATCGAGGAACTGCAAGCGA AATTTGGTAACAGCA

TTGATCTGCGTGAGCGTCTGGCGACCGGCAACCTGTACGTGGCGGACTATCGTCCGC TGGCGTTCATCCGTGG

TGGCACCTTTGCGAAGGGTAAGAAATACCTGCCGGCGCCGCTGGCGTTCTTTTGCTG GCGTAGCAGCGGTTTC

CAGGATCGTGGCCAACTGGTTCCGATCGCGATTCAGATCAACCCGAAGGAAGGCAAA GCGAGCCCGCTGCTG

ACCCCGTTCGACGATAGCAGCACCTGGTTTTACGCGAAGAGCTGCGTGCAAATCGCG GACGCGAACCACCAC

GAGATGAGCAGCCACCTGTGCCGTACCCACTTCGTTATGGAACCGTTTGCGGTGGTT ACCCCGCGTCAGCTGG

CGCAAAACCACCCGCTGCGTATTCTGCTGAAACCGCACTTCCGTTTTATGCTGGCGA ACAACGATCTGGGTCGT

CAGCGTCTGGTGAACCGTGGTGGCCCGGTTGATGAGCTGCTGGCGGGCACCCTGCAG GAAAGCCTGCAAATT

GTGGTTGACGCGTACACCGATTGGCGTCTGGACCAATTCGCGCTGCCGACCGAGCTG AAGAACCGTGGTGTG

GACGATGTTAAAAACCTGCCGCACTACCCGTATCGTGACGATGGCATTCTGCTGTGG AACGCGATCAACAAGT

TCGTGTTCAACTACCTGGAACTGTACTACAAGAGCCCGGCGGATCTGACCGCGGATG TTGAACTGCAGGCGT

GGGCGCGTGAACTGGTGGCGCAAGATGGTGGCCGTGTTAAGGGTATGAGCGACCGTA TTGATACCCTGAAA

CAGCTGGTTGAGATCGTTACCACCATCATTTACACCTGCGGTCCGCTGCACAGCGCGGTGAACTTCCCGCAGT

ACGAATATATGGGCTTTATCCCGAACATGCCGCTGGCGGCGTATCAACCGATTAAGA AAGAGGGTGTTTGCAC

CCGTAAGGAACTGATCGACTTCCTGCCGGCGGCGAAACCGACCAGCAGCCAGCTGAC CACCCTGTTTACCCTG

AGCGCGTACCGTTATGATCGTCTGGGCTACTATGAGGAAGAGGAATTCGAGGACCCGAACGCGGACGATGTG

GTTAACAAATTTCAGCAAGAGCTGAACGTGGTTCAGCGTAAGATCGAACTGAGCAAC AAAGGTCGTCTGGTT

AACTACGAATATCTGCAACCGCGTCTGATTCTGAACAGCATTAGCATCTAA

Amino acid Sequence for WP_039200563.1 - SEQ ID NO: 34

MKPFLPQN DPNPTQRQSSLEKGRKEYEFRYDFLPPMAM LKNVPPSEN FSTKYIAERTIETAELPSN MMAVKAHAM

WDPLDELQDYEDFFPVLQKPNVM KNYETDDSFAEQRLCGVNPVVLCQIKQMPAN FAFTIEELQAKFGNSI DLRER

LATGN LYVADYRPLAFIRGGTFAKGKKYLPAPLAFFCWRSSGFQDRGQLVPIAIQIN PKEGKASPLLTPFDDSSTWFY

AKSCVQIADAN HH EMSSHLCRTH FVMEPFAVVTPRQLAQNH PLRI LLKPHFRFMLAN N DLGRQRLVN RGGPVDE

LLAGTLQESLQIVVDAYTDWRLDQFALPTELKNRGVDDVKNLPHYPYRDDGILLWNA I NKFVFNYLELYYKSPADLT

ADVELQAWARELVAQDGGRVKGMSDRI DTLKQLVEIVTTI IYTCGPLHSAVN FPQYEYMGFI PNMPLAAYQPIKKE

GVCTRKELI DFLPAAKPTSSQLTTLFTLSAYRYDRLGYYEEEEFEDPNADDVVNKFQQELNVVQRKI ELSN KGRLVNY

EYLQPRLILNSISI

Codon-optimized coding sequence for WP_012407347.1 - SEQ ID NO: 35

ATGAAGCCGTACCTGCCGCAGAACGACCCGGATCCGACCAAACGTCAGATCCTGCTGGAGCGTAACCAAGGC

GAGTACGAATTCGACTATGATTTTCTGGTGCCGATGGCGATGCTGAAGAACGTTCCG AGCATTGAGAACTTCA

GCACCAAATATATCGCGGAACGTACCCTGGAGACCGCGGAACTGCCGATTAACATGC TGGCGGTGAAGACCC

GTAGCCTGTGGGACCCGCTGGATGAGCTGCAGGACTACGAAGATTATTTTCCGGTTC TGCCGAAGCCGAACAT

CATTAAAACCTACCAGAGCGACGATAGCTTCTGCGAGCAACGTCTGTGCGGTGCGAA CCCGTTTGTGCTGCGT

CGTATTGAACAGATGCCGGACGGCTTCGCGTTTACCATCCTGGAGCTGCAAGAAAAG TTCGGTGATAGCATTA

ACCTGGTTGAGAAACTGGCGAACGGCAACCTGTACGTGGCGGACTATCGTGCGCTGG CGTTCGTTAAAGGTG

GCAGCTACGAACGTGGTAAGAAATTTCTGCCGACCCCGATCGCGTTCTTTTGCTGGC GTAGCAGCGGTTTCAG

CGACCGTGGCCAGCTGGTGCCGATCGTTATTCAAATCAACCCGGCGGATGGCAAGCA GAGCCAACTGATCAC

CCCGTTCGACGATCCGCTGACCTGGTTTCACGCGAAACTGTGCGTGCAGATTGCGGA CGCGAACCACCACGAA

ATGAGCAGCCACCTGTGCCGTACCCACTTCGTGATGGAGCCGTTTGCGATTGTTACCGCGCGTCAACTGGCGG

AAAACCACCCGCTGAGCCTGCTGCTGAAGCCGCACTTCCGTTTTATGCTGGCGAACA ACGACCTGGCGCGTAA

ACGTCTGATCAGCCGTGGTGGCCCGGTGGATGAGCTGCTGGCGGGCACCCTGCAGGA AAGCCTGCAAATTGT

GGTTAACGCGTACACCGAGTGGAGCCTGGACCAGTTCAGCCTGCCGACCGAACTGAA GAACCGTGGTATGGA

CGATCCGGATAACCTGCCGCACTACCCGTATCGTGACGATGGCCTGCTGCTGTGGAA CGCGATTAAGAAATTT

GTTAGCGAGTATCTGCAGATCTACTATAAGACCCCGCAAGACCTGGCGGAGGATCTG GAACTGCAGAGCTGG

GTGCAAGAACTGGTTAGCCAGAGCGGTGGCCGTGTGAAAGGTATTAGCGACCGTATC AACACCCTGGACCAA

CTGGTGGATATTGCGACCGCGGTTATTTTTACCTGCGGTCCGCAGCATGCGGCGGTT AACTACAGCCAATACG

AGTATATGACCTTTATGCCGAACATGCCGCTGGCGGCGTATAAACAGATGACCAGCG AAGGCACCATCCCGG

ATCGTAAGAGCCTGCTGAGCTTCCTGCCGCCGAGCAAACAGACCGCGGACCAACTGA GCATTCTGTTTATCCT

GAGCGCGTACCGTTATGATCGTCTGGGCTACTATGACGATAAGTTCCTGGACCCGGA GGCGCAAGATGTTCTG

GCGAAATTTCAGCAAGAACTGAACGAGGCGGAACGTGAGATTGAACTGAACAACAAG AGCCGTCTGATCAAC

TACAACTATCTGAAACCGCGTCTGGTGACCAACAGCATCAGCGTTTAA

Amino acid Sequence for WP_012407347.1 - SEQ ID NO: 36

MKPYLPQNDPDPTKRQILLERNQGEYEFDYDFLVPMAM LKNVPSI EN FSTKYIAERTLETAELPI NMLAVKTRSLWD

PLDELQDYEDYFPVLPKPNI IKTYQSDDSFCEQRLCGANPFVLRRI EQMPDGFAFTILELQEKFGDSIN LVEKLANGN

LYVADYRALAFVKGGSYERGKKFLPTPIAFFCWRSSGFSDRGQLVPIVIQIN PADGKQSQLITPFDDPLTWFHAKLCV

QIADAN HH EMSSH LCRTHFVMEPFAIVTARQLAEN HPLSLLLKPH FRFM LAN NDLARKRLISRGGPVDELLAGTLQ

ESLQIVVNAYTEWSLDQFSLPTELKNRGMDDPDN LPHYPYRDDGLLLWNAIKKFVSEYLQIYYKTPQDLAEDLELQS

WVQELVSQSGGRVKGISDRINTLDQLVDIATAVIFTCGPQHAAVNYSQYEYMTFM PNMPLAAYKQMTSEGTIPD

RKSLLSFLPPSKQTADQLSI LFILSAYRYDRLGYYDDKFLDPEAQDVLAKFQQELN EAEREI ELNNKSRLINYNYLKPRL

VTNSISV

Codon-optimized coding sequence for WP_027843955.1 - SEQ ID NO: 37

ATGAAGCCGTACCTGCCGCAGAACGACCCGAACCCGGAGAAGCGTAAAGATTGGCTG AACAAAAACCGTGA

GGAATACCAATTCAACTTTAACTATCTGAGCCCGCTGCCGCTGATCGACGATGTTCC GAACAACGAGGCGTTT

AGCCCGAAGTACCTGGCGGAACGTCTGCCGCTGACCTTCGGTAAACTGAGCGCGAAC ACCCTGGGCATTCGT

CTGCGTAGCTTTTGGGACCCGTTCGATGAGTTTCAGGACTATGAAGATTTCTTTCCG GTGCTGCCGACCCCGG

AACTGCTGAAGACCTACCAGAACGACGAGTATTTCGCGGAACAACGTCTGAGCGGTGTGAACCCGATGGTTA

TCCGTAGCATTAAAGAGCTGCCGCCGCACTTCGCGTTTAGCATCCGTGACCTGCAGG CGGAATTCGGCACCAG

CCTGAACCTGGAGCAAGAACTGAACAACGGCAACCTGTACATTGCGGATTATACCAG CCTGAGCTTTGTTCGT

GGTGGCAGCTACCTGCGTGGTCGTAAGAGCCTGCCGGCGCCGATTGCGCTGTTCTGC TGGCGTAACAGCGGT

TATTGCGATCGTGGCGAGCTGACCCCGATCGCGATTCAACTGGTGCCGGAACTGGGCACCGGTAGCCGTATTC

TGACCCCGTTTGACAGCCACCTGAACTGGCTGTACGCGAAAATCTGCATGCAAATTGCGGATGCGAACCACCA

CGAGATGAGCAGCCACCTGTGCCACACCCACCTGGTTATGGAGCCGTTTGCGGTGGTTACCGCGCGTCAGCTG

GCGGAAAACCACCCGCTGGGTCTGCTGCTGCGTCCGCACTTCCGTTTTATGCTGCAC AACAACGAGCTGGCGC

GTAAGAACCTGATCAACCAGGGTGGCTACGTTGACAACCTGCTGGGTGGCACCCTGC GTGAAAGCCTGCAAA

TTGTGCGTGACGCGTATTTCAAGAACGCGGAGGAATTTTGGAGCCTGGATGAGTTCG CGCTGCCGAAAGAAA

TCGCGAACCGTGGTCTGGACGATACCGATCGTCTGCCGCACTACCCGTATCGTGACG ATGGCATGCTGCTGTG

GAACGCGATTGAAAAGTTTGTTAGCAACTACCTGAGCATCTACTATCCGAACCCGGG TGACATTAAAGATGAT

CGTGAGCTGCAAGCGTGGGCGGCGGAACTGGTGGCGGCGGATGGTGGCCGTGTGAAG GGCGTTCCGAGCC

AATTTGAGAACCTGCAGCAACTGATCGACGTGGTTACCGGTATCATTTTTACCTGCG GTCCGCAGCACAGCGC

GGTGAACTACCCGCAATACGAATATATGGCGTTTGTTCCGAACATGCCGCTGGCGGG TTATCAGGCGGTGGA

CAGCAACCCGAACATGGATCTGAAAAGCCTGATGGCGTTCCTGCCGCCGCCGAACCA AACCGCGGACCAGCT

GCAAATCATTTACGGTCTGAGCGCGTACCGTTATGATCGTCTGGGCTACTATGACCG TGAGTTTAGCGATCCG

CACGCGGAGGAAGTGGTTCGTCTGTTCCAGCAAGACCTGAACCAGGTGGAGCGTAAGATCGAACTGCGTAAC

AAAAACCGTCTGGTGGAATATAACTTCCTGAAACCGAGCCTGGTTCTGAACAGCATC AGCATTTAA

Amino acid Sequence for WP_027843955.1 - SEQ ID NO: 38

MKPYLPQNDPNPEKRKDWLNKN REEYQFN FNYLSPLPLIDDVPN N EAFSPKYLAERLPLTFGKLSANTLGI RLRSFW

DPFDEFQDYEDFFPVLPTPELLKTYQNDEYFAEQRLSGVN PMVIRSI KELPPH FAFSIRDLQAEFGTSLN LEQELN NG

N LYIADYTSLSFVRGGSYLRGRKSLPAPIALFCWRNSGYCDRGELTPIAIQLVPELGTGSR ILTPFDSHLNWLYAKICM

QIADAN HH EMSSH LCHTH LVMEPFAVVTARQLAENH PLGLLLRPH FRFMLH N NELARKN LINQGGYVDNLLGGT

LRESLQIVRDAYFKNAEEFWSLDEFALPKEIANRGLDDTDRLPHYPYRDDGM LLWNAIEKFVSNYLSIYYPNPGDI K

DDRELQAWAAELVAADGGRVKGVPSQFENLQQLIDVVTGIIFTCGPQHSAVNYPQYE YMAFVPNMPLAGYQAV

DSN PN MDLKSLMAFLPPPNQTADQLQIIYGLSAYRYDRLGYYDREFSDPHAEEVVRLFQQDLNQV ERKIELRNKN R

LVEYN FLKPSLVLNSISI

Codon-optimized coding sequence for WP_073641301.1 - SEQ ID NO: 39

ATGAAACCGTACCTGCCGCAGAACGACCCGGATCCGATTAAGCGTAAATACAGCCTG GAGCACAAGAAAGAG

GAATATGAATTCGACCACGATTTTCTGAGCCCGATGGCGATGCTGAAAGACGTGCCG GCGGTTGAGAACTTC

AGCACCCGTTATATTGCGGAACGTACCGTGGAGACCGCGGAACTGCCGATCAACATG CTGGCGGTTAAGACC

CGTGCGCTGTGGGACCCGCTGGATGAGCTGCAGGACTACGAAGATTATTTCCCGGTG CTGCCGAAGCCGAAC

GTTATCAAAACCTACCAGACCGACGATAGCTTTTGCGAGCAACGTCTGTGCGGTGCG AACCCGATGGCGCTGC

AGCAAATCAAAGAGATGCCGCTGGGCTTCGAATTTACCATTGAGGAACTGCAGGAGA AATTCGGTGAAAGCA

TCAACCTGGTGGAGAAGCTGGCGGACGGCAACCTGTACGTGACCGATTATCGTCCGC TGAGCTTTGTTAAGG

GTGGCACCTACGAACGTGGTAAGAAATATCTGCCGACCCCGCTGGCGTTCTTTTGCT GGCGTAGCAGCGGTTT

TAGCGACCGTGGTCAGCTGGTGCCGATCGCGATTCAACTGAACCCGGCGGTTGGCCG TCAGAGCCAACTGAT

TACCCCGTTCGACGATCCGCTGACCTGGTTTCACGCGAAACTGTGCGTTCAGATCGC GGACGCGAACCACCAC

GAGATGAGCAGCCACCTGTGCCGTACCCACTTCGTGATGGAACCGTTTGCGATTGTT ACCGCGCGTCAACTGG

CGGATAACCACCCGCTGAACCTGCTGCTGAAACCGCACTTCCGTTTTATGCTGGCGA ACAACGACCTGGGTCG

TAAGCGTCTGGTGAACCGTGGTGGCCCGGTTGATGAGCTGCTGGCGGGCACCCTGCA GGAAAGCCTGCAAAT

TGTGGTTAACGCGTACAAAGAGTGGAGCCTGGATGAATTCGCGCTGCCGACCGAAAT CAAGAACCGTGGTAT

GGACGATAAGCTGAAACTGCCGCACTACCCGTATCGTGACGATGGCATGCTGCTGTG GAACGCGATTAAGAA

ATTTGTGAGCGAGTACCTGAAGCTGTACTATAAAACCCCGCAGGACCTGACCGCGGA TCTGGAACTGCAGGC

GTGGGCGCAAGAGCTGGTTAGCGAAAGCGGTGGCCGTGTGAAAGGTGTTCCGAGCCG TATCGAGAAGCTGG

AACAACTGGTGGACATCGCGACCGCGGTTATTTTTACCTGCGGTCCGCAGCATGCGG CGGTGAACTACAGCCA ATACGAGTATATGACCTTTATGCCGAACATGCCGCTGGCGGCGTATAAGCAGATGACCGC GGAAGGCACCAT CGCGGATCGTAAAAGCCTGCTGAGCTTCCTGCCGCCGAGCAAGCAGACCGCGGACCAACT GAGCATCCTGTTT ATTCTGAGCGCGTACCGTTATGATCGTCTGGGTTACTATGACGATAAATTCGCGGACCCG GAGGCGCAAGATA TTCTGGTGACCTTTCAGCAAGACCTGAACGAGGTTGAGCGTAAGATCGAACTGAACAACA AGAGCCGTCTGA TT AAATACAACTAT CT G AAGCCGCGT CTGGT G ACCAACAGCATCAGCGTTT AA

Amino acid Sequence for WP_073641301.1 - SEQ ID NO: 40

MKPYLPQNDPDPIKRKYSLEHKKEEYEFDHDFLSPMAMLKDVPAVEN FSTRYIAERTVETAELPINM LAVKTRALW

DPLDELQDYEDYFPVLPKPNVIKTYQTDDSFCEQRLCGANPMALQQIKEMPLGFEFT I EELQEKFGESI NLVEKLAD

GN LYVTDYRPLSFVKGGTYERGKKYLPTPLAFFCWRSSGFSDRGQLVPIAIQLNPAVGRQSQ LITPFDDPLTWFHAK

LCVQIADANH HEMSSHLCRTHFVM EPFAIVTARQLADNH PLNLLLKPHFRFMLAN N DLGRKRLVNRGGPVDELLA

GTLQESLQIVVNAYKEWSLDEFALPTEI KNRGMDDKLKLPHYPYRDDGM LLWNAIKKFVSEYLKLYYKTPQDLTADL

ELQAWAQELVSESGGRVKGVPSRIEKLEQLVDIATAVIFTCGPQHAAVNYSQYEYMT FMPNM PLAAYKQMTAEG

TIADRKSLLSFLPPSKQTADQLSI LFI LSAYRYDRLGYYDDKFADPEAQDILVTFQQDLN EVERKI ELNN KSRLIKYNYLK

PRLVTNSISV

Codon-optimized coding sequence for WP_096647440.1 - SEQ ID NO: 41

ATGAAACCGTACCTGCCGCAGAACGACCCGGAGCCGACCCAGCGTAAGAACTTCCTG GAACGTAAACAGGGC

GAGTATGAATTCGATCACAAGTTTCTGAAACCGATGGCGATGCTGAAGAACGTGCCG AGCATTGAGAACTTTA

GCACCAAATACATCGCGGAACGTACCGTGGAGACCGCGGAACTGCCGCTGAACATGC TGGCGGTTAAAACCC

GTAGCCTGTGGGACCCGCTGGATGAGCTGCAGGACTACGAAGATTATTTCCCGGTGC TGCCGAAGCCGAACG

TTATCAAAACCTACCAGACCGACAACAGCTTTTGCGAGCAACGTCTGTGCGGTGCGA ACCCGCTGGTTCTGCG

TCAGATTCAGCAAATGCCGGATGGCTTCGCGTTTACCATCAGCGAGCTGCAAGAAAA GTTCGGTGACAGCATT

GATCTGGAGGAACGTCTGAAAACCGGCAACCTGTACGTGGCGGACTATCGTGCGCTG GCGTTTGTTAAGGGT

GGCACCTACGAGCGTGGTAAGAAATATCTGCCGACCCCGATCGCGTTCTTTTGCTGG CGTAGCAGCGGTTTCA

GCGATCGTGGCCAGCTGGTGCCGATCGCGATTCAAATCAACCCGACCGACGGCAAGC AGAGCCAACTGATCA

CCCCGTTCGATGAACCGCTGGTGTGGTTTCACGCGAAACTGTGCGTTCAGATTGCGGACGCGAACCACCACGA

GATGAGCAGCCACCTGTGCCGTACCCACTTCGTGATGGAACCGTTTGCGATTGTTAC CGCGCGTCAGCTGGCG

GATAACCACCCGCTGAACCTGCTGCTGAAGCCGCACTTCCGTTTTATGCTGGCGAAC AACGAGCTGGGTCGTC

AACGTCTGGTGAACCGTGGTGGCCCGGTTGATGAGCTGCTGGCGGGCACCCTGCAGGAAAGCCTGCAAATCG

TGGTTAACGCGTACAAAGAGTGGAGCCTGGATCAGTTCAGCCTGCCGACCGAACTGA AGAACCGTGGTATGG

ACAACAGCGATAAACTGCCGCACTACCCGTATCGTGACGATGGCCTGCTGCTGTGGA ACGCGATTAAGAAATT

CGTGAGCGAATATCTGAAGCTGTACTATAAAACCCCGCAAGACCTGACCGCGGATTT TGAGCTGCAGAGCTG

GGCGCAAGAACTGGTTAGCCAGAGCGGTGGCCGTGTGAAAGGTGTTAGCGACCGTAT CACCACCCTGGACCA

ACTGATTGATATCGCGACCGCGGTGATTTTTACCTGCGGTCCGCAGCATGCGGCGGT TAACTACAGCCAATAC

GAGTATATGACCTTTATCCCGAACATGCCGCTGGCGGCGTATAAGCAGATTACCAGC GAGGGTAACATCCCG

GACCGTAAGAGCCTGCTGAGCTTCCTGCCGCCGAGCAAACAGACCGCGGATCAACTG AGCATTCTGTTTATCC

TGAGCGCGTACCGTTATGACCGTCTGGGCTACTATGACGATAAATTCCTGGATCCGG AGGCGCAGGAAATCCT

GGTTACCTTTCAGCAAGAGCTGAACGAGGCGGAACGTCAAATTGAACTGAACAACAA GAGCCGTCTGATCAA

CTACGACTATCTGAAACCGCGTCTGGTGACCAACAGCATTAGCGTTTAA

Amino acid Sequence for WP_096647440.1 - SEQ ID NO: 42

MKPYLPQNDPEPTQRKNFLERKQGEYEFDHKFLKPMAMLKNVPSIENFSTKYIAERT VETAELPLN MLAVKTRSLW

DPLDELQDYEDYFPVLPKPNVIKTYQTDNSFCEQRLCGANPLVLRQIQQMPDGFAFT ISELQEKFGDSI DLEERLKTG N LYVADYRALAFVKGGTYERGKKYLPTPIAFFCWRSSGFSDRGQLVPIAIQIN PTDGKQSQLITPFDEPLVWFHAKLC

VQIADAN HH EMSSH LCRTHFVMEPFAIVTARQLADN HPLNLLLKPH FRFM LAN NELGRQRLVNRGGPVDELLAG

TLQESLQIVVNAYKEWSLDQFSLPTELKNRGMDNSDKLPHYPYRDDGLLLWNAI KKFVSEYLKLYYKTPQDLTADFE

LQSWAQELVSQSGGRVKGVSDRITTLDQLIDIATAVIFTCGPQHAAVNYSQYEYMTFIPNMPLAAYKQITSEGNIPD

RKSLLSFLPPSKQTADQLSILFILSAYRYDRLGYYDDKFLDPEAQEILVTFQQELNEAERQIELNNKSRLINYDYLKPRLV

TNSISV

Codon-optimized coding sequence for WP_099099431.1 - SEQ ID NO: 43

ATGAAACCGTACCTGCCGCAGAAAGACCCGGATGTTAAAGTGCGTATCAACTGGCTG GACAAAAACCGTGAG

GAATATAAGTTCAACTACGACTATCTGGCGCCGCTGCCGGTTATCGATAAAGTGCCG CACAAGGAGATTTTTA

GCGCGGAATACACCACCAAACGTCTGGCGAGCATGGCGAGCCTGGCGCCGAACATGC TGGCGGCGAAGGCG

CGTAACTTCCTGGACCCGCTGGATGAGCTGGAGGAATACGAGGAACTGCTGAGCCTG CTGCCGAAGCCGGAC

GTTATCAAGAACTATAAAACCGATAGCTGCTTTGCGGAACAACGTCTGAGCGGTGCG AACCCGCTGGCGATCC

AAAAAATTGACGTTCTGCCGGATAACTTCGCGGTGACCGATGCGCACTTTCAGAAGG TGGCGGGCACCGAGT

TCACCCTGGAAAAGGCGCTGAAAGAGGGCAAGCTGTACTTTCTGGACTATCCGCTGC TGAGCGATATCAAAG

GTGGCGTTTACAACAACGTGAAGAAATATCTGCCGAAGCCGCAGGCGCTGTTCTACT GGCAAAGCAACGACA

GCCCGAACGGTGGCAGCCTGGTTCCGGTGGCGATCCAGATTAACCACGATAGCGGTG GCAAAAGCGTTATCT

ATACCCCGGACGATCCGCACCTGGACTGGTTTCTGGCGAAGACCTGCGTGCAGATTG CGGATGGTAACCACC

AAGAGCTGGGCAGCCACTTCGCGTACACCCACGCGGTTATGGCGCCGTTTGCGATCG TGACCGCGCGTCAACT

GGCGGAAAACCACCCGATTGCGCTGCTGCTGAAACCGCACTTCCGTTTTATGCTGTT CGACAACGATCTGGGT

CGTACCCAGTTTCTGCAACCGGGTGGCCCGGTTGACGAGTTCATGGCGGGTAGCCTG GCGGAAAGCCTGGGC

TTTGTTGCGAAGGTGTACGAGGAATGGAGCGTGGAGAAATTCACCTTTCCGCGTCTG ATCAAGAGCCGTCGT

ACCGACGATCCGGAAATTCTGCCGCACTTCCCGTTTCGTGACGATGGTATGCTGATC TGGAACGCGGTTGAGA

AATTCGTGTACGAATATCTGCAGCTGTACTATAAGACCAGCCAAGACCTGATTGACGATTATGAGCTGCAGAA

CTGGGCGCGTGAACTGGTTGCGCAAGATGGTGGCCGTGTGAAAGGCATGCCGGCGAAGATCGAGACCCTGG

AACAGCTGATTGAGATCATTAGCGTGGTTGTTTTTACCTGCGCGCCGCTGCACAGCG CGCTGAACTTCAGCCA

ATACGAATATATGGCGTTTGTTCCGAACATGCCGTACGCGGCGTATCACCCGATCCCGGAGACCAAAGGTGTG

GACCTGGAAACCATCATGAAAATTCTGCCGCCGTTCAAGCAGGCGGCGGACCAAGTT ATGTGGACCGAGATT

CTGACCAGCTACCACTATGATAAGCTGGGCTTCTACGACGAGGAATTTGCGGATCCG CTGGCGCAGGAAATC

GTTGTGCAATTCCAGCAAAACCTGCACGAGATTGAACGTCAGATCGATATTCGTAACCAAACCCGTCCGATCC

CGTACAACTATTTTAAACCGAGCCAGATCATTAACAGCATTAACACCTAA

Amino acid Sequence for WP_099099431.1 - SEQ ID NO: 44

MKPYLPQKDPDVKVRI NWLDKN REEYKFNYDYLAPLPVIDKVPH KEI FSAEYTTKRLASMASLAPN MLAAKARN FL

DPLDELEEYEELLSLLPKPDVIKNYKTDSCFAEQRLSGANPLAIQKI DVLPDN FAVTDAHFQKVAGTEFTLEKALKEGK

LYFLDYPLLSDI KGGVYNNVKKYLPKPQALFYWQSNDSPNGGSLVPVAIQI N HDSGGKSVIYTPDDPHLDWFLAKTC

VQIADGN HQELGSHFAYTHAVMAPFAIVTARQLAEN HPIALLLKPH FRFM LFDN DLGRTQFLQPGGPVDEFMAG

SLAESLGFVAKVYEEWSVEKFTFPRLI KSRRTDDPEI LPHFPFRDDGMLIWNAVEKFVYEYLQLYYKTSQDLIDDYEL

QNWARELVAQDGGRVKGMPAKI ETLEQLIEIISVVVFTCAPLHSALN FSQYEYMAFVPN MPYAAYHPIPETKGVDL

ETIM KILPPFKQAADQVMWTEILTSYHYDKLGFYDEEFADPLAQEIVVQFQQN LH EI ERQIDI RNQTRPIPYNYFKPS

QI INSINT

Codon-optimized coding sequence for WP_052672367.1 - SEQ ID NO: 45

ATGAAACCGTACCTGCCGCAACATGAGCCGGATGCGATTGCGCGTCAGAACCGTCTGATTAAAAACCGTGCG

GACTATGTGCTGGATTACAACTATCTGCCGCCGATCCCGCTGCAGACCCCGGTTCCG CAGCAAGAGCGTTTCA

GCGCGGAATACACCGCGCGTCGTCTGGCGAGCTTTGCGAACCTGGTGCCGAACATGC TGATGGCGCGTGCGC

GTAACGCGTTTGACCCGCTGGATACCCTGGAGGAATATGCGGACCTGCTGCCGGTGC TGCCGAAGCCGAACG

TTATTAAAAACTATCAAGCGGATTGGTGCTTCGCGGAGCAGCGTCTGAGCGGTATCA ACCCGCCGGCGATCCG

TCGTATTGACGCGCTGCCGGAAAACCTGCCGATTAGCAACAGCAGCTTTCAACACAG CGTTGGCGCGGAGCA

CAACCTGGAACAGGCGCTGAAGGAAGGTAAACTGTACTGCCTGGACTATCCGCTGCT GAGCGGCATCGGTGG

CGGTAACTACCAAAACCTGCCGAAGTATCTGCCGAAACCGCAGGCGCTGTTTTACTG GCGTAGCGATAACAGC

AAGATTGGCGGTAGCCTGGTGCCGGTTGCGATCAAGATTCTGAACGAGCTGGGCGGT AAAAACCTGGTGTAC

ACCCCGAACGACGCGCCGCTGGATTGGTTCCTGGCGAAGACCTGCGTTCAGATGGCG GACGCGAACCACCAA

GAACTGGGCACCCACTTTGCGAAAACCCATGCGGTTATGGCGCCGATTGCGGCGATT ACCGCGCGTGAGCTG

GGTGAAAACCACCCGCTGACCCTGCTGCTGAAACCGCACTTCCGTTTTATGCTGTTC GATAACGAGCTGGGTC

GTACCCAGTTTCTGCAACCGACCGGTCCGACCGAGGAACTGCTGGCGGGCACCCTGG AGGAAAGCGTTCAGC

TGGTTGTGCAAGCGTACGAGGAATGGAGCATCGACACCACCTTCCCGCTGGAGCTGC AGCAACGTCAAATGC

ACGATCCGGAAATTCTGCCGCACTATCCGTTCCGTGACGATGGCATCCTGGTGTGGA ACGCGATTCACCAGTT

TGTTACCGAATACCTGCAAATTTACTATCACACCCCGCAGGACATCAGCGCGGATTA TGAGGTGCAGAACTGG

GCGCGTGAACTGGTGGACAGCGGTCGTGTTAAGGGTATGCCGGAGAGCATCGACACC CTGGCGCAACTGATT

GATATCATTGCGGTGGTTATCTTCACCTGCGCGCCGCTGCACAGCTGCCTGAACCTG GCGCAGTACGAATATA

TGACCTTTGTTCCGAACATGCCGTACGCGGCGTATCACCCGATCCCGACCACCAAGG GTGTGGATATGGCGAC

CATCGTTAAAATTATGCCGCCATTCCAGCGTGCGATCGACCAAATTCTGTGGACCGA TATCCTGAGCGCGTTTC

AATACGACAAGCTGGGCTTCTATGAGGAAGACTTTGCGGATCCGAAAGCGCAGGAAG TGCTGCAGCGTTTCC

AAGATAACCTGCAGCAAGTTGAGGAAAAGATCGAAATGCACAACCAGATCCGTCCGA TTCCGTACAACTATCT

GAAACCGAGCCGTATCATGAACAGCATTAACACCTAA

Amino acid Sequence for WP_052672367.1 - SEQ ID NO: 46

MKPYLPQHEPDAIARQNRLIKN RADYVLDYNYLPPI PLQTPVPQQERFSAEYTARRLASFANLVPNMLMARARNA

FDPLDTLEEYADLLPVLPKPNVIKNYQADWCFAEQRLSGI NPPAI RRI DALPEN LPISNSSFQHSVGAEHN LEQALKE

GKLYCLDYPLLSGIGGGNYQNLPKYLPKPQALFYWRSDNSKIGGSLVPVAIKI LN ELGGKN LVYTPNDAPLDWFLAK

TCVQMADAN HQELGTH FAKTHAVMAPIAAITARELGENH PLTLLLKPHFRFMLFDN ELGRTQFLQPTGPTEELLA

GTLEESVQLVVQAYEEWSIDTTFPLELQQRQM HDPEILPHYPFRDDGILVWNAI HQFVTEYLQIYYHTPQDISADYE

VQNWARELVDSGRVKGMPESI DTLAQLIDI IAVVIFTCAPLHSCLNLAQYEYMTFVPNMPYAAYHPIPTTKGVDMA

TIVKIMPPFQRAIDQI LWTDILSAFQYDKLGFYEEDFADPKAQEVLQRFQDN LQQVEEKI EMH NQI RPIPYNYLKPSR

IM NSI NT

Codon-optimized coding sequence for WP_073631249.1 - SEQ ID NO: 47

ATGAAACCGTACCTGCCGCAGCATGACCCGAACCCGGAAGCGCGTCGTAACTGGCTG GAACAAAACCGTGAG

GACTACAAGTTTGATCACAACTATCTGGCGCCGATCCCGATTCTGGACAAGGTTCCG CACAAAGAGCTGTTCA

GCCCGCAGTATACCGCGAAACGTCTGGCGAGCATGGCGGATCTGGTGCCGAACATGC TGGCGGCGAAGGCG

CGTAACTTCTTTGACCCGCTGGATGAACTGGAGGAATACGAGGCGCTGCTGAGCATT CTGCCGAAACCGAGC

GTTATCAAGAACTATAAAACCGACAGCTGCTTTGCGGAACAGCGTCTGAGCGGTGCG AACCCGATGGCGATG

CACCGTATTGACGAGCTGCCGGAAAAGTTCCCGGTTACCAACGATCACTTTCAAAAA GCGGTGGGTGCGGAA

CACAACCTGGAGGCGGCGCTGAAAGAGGGTAAACTGTACCTGCTGGACTATCCGCTG CTGTTTGATATTAAG

GGTGGCACCTACCAGAACATCAAGAAATATCTGCCGAAACCGCAGGCGCTGTTCTAC TGGCAAAGCAACGGT

AACAAGAACAGCGGCAGCCTGGTTCCGATCGCGATTCAAATCCACAACGACACCGGT GGCGATAGCCTGATT TATACCCCGGACGATCCGCACCTGGACTGGTTCCTGGCGAAGACCTGCGTGCAGATCGCG GATGCGAACCAC

CAAGAACTGGGTAGCCACTTCGCGCGTACCCACGCGGTTATGGCGCCGTTTGCGATT GTGACCGCGCGTCAAC

TGGGTGAAAACCACCCGCTGGCGCTGCTGCTGAAACCGCACTTCCGTTTTATGCTGT ACGACAACGATCTGGG

TCGTACCCACTTCCTGCAGGCGGGTGGCCCGGTTGACGAATTTATGGCGGGCACCCT GCAAGAGAGCCTGGG

CTTTGTGGCGAAGGCGTACGAGGAATGGAGCCTGGATAACGCGGTTTTCCCGACCGA AGTGAAGAACCGTAA

AATGGACGATCCGGACATTCTGCCGCACTATCCGTTTCGTGACGATGGTATGCTGCT GTGGGATGCGGTTAAG

AAATTCGTGACCGAATACCTGCAGCTGTACTATAAAACCCCGCAAGACCTGAGCGAG GATTATGAACTGCAAA

ACTGGGCGCGTGAGCTGGCGGCGCAAGACGGTGGCTGCGTTAAGGGCATGCCGGAGA AAATTGAAACCATC

GAGCAGCTGATCCACGTGGTTACCGTGGTTGTGTTTACCTGCGCGCCGCTGCACAGC GCGCTGAACTTCAGCC

AATACGAATATATGGCGTTTGTTCCGAACATGCCGTACGCGGCGTACTATCCGGTTC CGGAGACCAAAGGTGT

GGATATGCAGACCATTATGAAGATGCTGCCGCCGTTCAAACAGGCGGCGGACCAAGT GATGTGGAGCGATAT

CCTGACCAGCTTCCACTACGACAAGCTGGGCCACTATGATGAGGAATTTGCGAACCC GATGGCGCAGGCGAT

CCTGCTGCAATTCCAGCAAAACCTGCACGAGGTGGAACGTCAGATTGAAATCAAGAA CCAAAGCCGTCCGATT

CCGTACAACTATCTGAAACCGAGCGAGATCATTAACAGCATCAACACCTAA

Amino acid Sequence for WP_073631249.1 - SEQ ID NO: 48

MKPYLPQHDPNPEARRNWLEQNREDYKFDH NYLAPI PILDKVPH KELFSPQYTAKRLASMADLVPNMLAAKARN

FFDPLDELEEYEALLSILPKPSVIKNYKTDSCFAEQRLSGANPMAMH RIDELPEKFPVTNDHFQKAVGAEH NLEAALK

EGKLYLLDYPLLFDIKGGTYQNI KKYLPKPQALFYWQSNGNKNSGSLVPIAIQI HN DTGGDSLIYTPDDPHLDWFLAK

TCVQIADANHQELGSH FARTHAVMAPFAIVTARQLGENHPLALLLKPHFRFMLYDNDLGRTHFLQAGGPVDEFM

AGTLQESLGFVAKAYEEWSLDNAVFPTEVKN RKM DDPDILPHYPFRDDGMLLWDAVKKFVTEYLQLYYKTPQDLS

EDYELQNWARELAAQDGGCVKGMPEKIETI EQLI HVVTVVVFTCAPLHSALN FSQYEYMAFVPN MPYAAYYPVPE

TKGVDMQTIMKMLPPFKQAADQVMWSDILTSFHYDKLGHYDEEFANPMAQAILLQFQ QNLHEVERQI EIKNQS

RPI PYNYLKPSEII NSI NT

Codon-optimized coding sequence for WP_013220336.1 - SEQ ID NO: 49

ATGAACACCAGCCTGCCGCAGAACGACAGCGATCCGCAAGGTCGTAAGGACCGTCTGGAACGTCGTCGTGCG

CTGTACGTGTTCAACTACGACTATGTTCCGCCGATCCCGATGATTGATAAGGTTCCGCACGAGGAATACTTTAG

CCCGAAATATACCGCGGAGCGTCTGGCGAGCATGGCGAAACTGGCGCCGAACATGCT GGCGGCGAAGACCA

AACGTCTGTTCGATCCGCTGGACGAGCTGAACGAATACGACGAGATGTTCATCTTTC TGGATAAGCCGGGTAT

TGTTCGTGGCTATCGTACCGATGAAAGCTTCGGCGAGCAGCGTCTGAGCGGCGTGAA CCCGATGAGCATCCG

TCGTCTGGATAAACTGCCGGAAGACTTTCCGATTATGGATGAATACCTGGAGCAGAG CCTGGGTAGCCCGCA

CACCCTGGCGCAGGCGCTGCAAGAAGGCCGTCTGTATTTCCTGGAGTTTCCGCAACT GGCGCACGTTAAAGA

GGGTGGTCTGTACCGTGGTCGTAAGAAATATCTGCCGAAACCGCGTGCGCTGTTCTG CTGGGACGGTAACCA

CCTGCAGCCGGTGGCGATCCAGATTAGCGGCCAACCGGGTGGCCGTCTGTTCATTCC GCGTGACAGCGATCT

GGACTGGTTTGTGGCGAAGCTGTGCGTTCAGATCGCGGATGCGAACCACCAAGAACT GGGCACCCACTTCGC

GCGTACCCACGTGGTTATGGCGCCGTTTGCGGTGGTTACCCATCGTCAGCTGGCGGA GAACCACCCGCTGCAC

ATTCTGCTGCGTCCGCACTTCCGTTTTATGCTGTACGATAACGACCTGGGTCGTACC CGTTTTATCCAGCCGGA

CGGCCCGGTTGAACACATGATGGCGGGCACCCTGGAGGAAAGCATCGGCATTAGCGC GGCGTTCTACAAGG

AATGGCGTCTGGATGAGGCGGCGTTTCCGATCGAGATTGCGCGTCGTAAAATGGACG ATCCGGAAGTGCTGC

CGCACTACCCGTTCCGTGACGATGGTATGCTGCTGTGGGACGGCATTCAGAAGTTTG TTAAAGAGTATCTGGC

GCTGTACTATCAAAGCCCGGAAGATCTGGTGCAGGACCAAGAGCTGCGTAACTGGGC GCGTGAACTGACCGC

GAACGATGGTGGCCGTGTGGCGGGTATGCCGGGTCGTATCGAAACCGTTGATCAGCT GACCAGCATCCTGAG

CACCGTGATTTATACCTGCGCGCCGCTGCACAGCGCGCTGAACTTCGCGCAATACGA GTATATCGGTTATGTTC CGAACATGCCGTACGCGGCGTATCACCCGATTCCGGAGGAAGGTGGCGTGGACATGGAGA CCCTGATGAAG ATTCTGCCGCCGTACGAACAGGCGGCGCTGCAACTGAAATGGACCGAGATCCTGACCAGC TACCACTATGATC GTCTGGGCCACTATGACGAAAAGTTCGAGGATCCGCAGGCGCAAGCGGTGGTTGAACAGT TTCAGCAAGAGC TGGCGGCGGTGGAGCAAGAAATTGACCAGCGTAACCAAGATCGTCCGCTGGCGTACACCT ATCTGAAACCGA G CG A A AT C ATT A AC AG CAT C AAC ACCT A A

Amino acid Sequence for WP_013220336.1 - SEQ ID NO: 50

MNTSLPQN DSDPQGRKDRLERRRALYVFNYDYVPPIPMIDKVPH EEYFSPKYTAERLASMAKLAPNM LAAKTKRLF

DPLDELN EYDEMFI FLDKPGIVRGYRTDESFGEQRLSGVNPMSIRRLDKLPEDFPIMDEYLEQSLGSPHTLAQA LQE

GRLYFLEFPQLAHVKEGGLYRGRKKYLPKPRALFCWDGN HLQPVAIQISGQPGGRLFIPRDSDLDWFVAKLCVQIA

DAN HQELGTHFARTHVVMAPFAVVTH RQLAENH PLHI LLRPHFRFMLYDN DLGRTRFIQPDGPVEHMMAGTLEE

SIGISAAFYKEWRLDEAAFPI EIARRKMDDPEVLPHYPFRDDGMLLWDGIQKFVKEYLALYYQSPEDLVQDQELRN

WARELTAN DGGRVAGMPGRIETVDQLTSILSTVIYTCAPLHSALN FAQYEYIGYVPNM PYAAYH PIPEEGGVDMET

LM KILPPYEQAALQLKWTEILTSYHYDRLGHYDEKFEDPQAQAVVEQFQQELAAVEQEI DQRNQDRPLAYTYLKPS

EI INSINT

4. Consensus Sequences

Consensus sequence of CoLOX - SEQ ID NO: 51

MxSxPTVRSMVMLAVLAVxALESxPCASAFATLPRALVRPQAALKYRAEDKNDVDVAPAG STASDVSKP

EGKATAVAKGTVNAPIEEAWKVFRSFSN MxQWMPVYGEWEATGDSVGDTRTFNFKDQPTFFTTERLV

GLDDSQYKMKYTLVxCKGSPVPIESI DTIVTFTAN DDVTEVDWRSWTKSPMVDLIKGRQAAGYAGGIAA

LDRYLN PSLGTVDVTI KSADN LDGxFLSSSYATLMVTDADPEQVHAKEWGTSPEFDAKPVQFSLLKPDSK

LYMxVMLTKxGVDxPVGYAVFDIQKSLKSGETVTETFQLEGSNDATLTVEMELNLRQ GSxLPQSKAQKN L

ATLVALQQSVERVRDRIVTIGKLAGEPEKSVWEYERKSGLPKSVKGLPRSEVLPPHK IALMVDAIAEYAYT

QFQLVQRLLPVRNSYDRYAAYFAPEGEEYVPIPQILKDMTWSTDDEFI RQI FAGLNPLQVEVVKNKAGLP

SKLQELKAxDGSDVDKLISEGRLYVLDYSVLKDLDLxRNGVTLYAPTMLIYRTGGDK LDVLGIMLEPRRDD

APVYTPDSETPN KFLLAKCHVACADNQVHQFTYHLGYAHLATEPLAIASHNVLEKNSHPLGMFLKPHxR

DN IGINYLARQTLVADEDAITDHTFATGTAQGVSMVVDAFKSYNFLESGLPDELRRRGFERS DDLKVYRY

RDDGWLxWDTLWKYAEDMVNELYGTDN DVxADKWQEWAxEASGSDTADVQGFPESITTKYI LTKVL

TTIIWQASALHSALNYIQYPYTATPIN RAASIFGPVPDGEADITEQDI LDVIPGGLxDEN N RGLTLSI FQGLL

SWLLRTPEN PTLDEVGSPIPN RNNPI EWVEFRSKYPQVYYNLDQN LAWEKII EERNKGLASPYEVLLPSHI

AASI NI

Consensus sequence for the protein sequences of bacterial LOX - SEQ ID NO: 52 xxxxxxxxxxLPQxxxxxxxRxxxLxxxxxxYxxxxxxxxPxxxxxxxPxxExFSxxYxx xRxxxxxxxLxxNxxxxxxxxxx DPxDxxxxYxxxxxxxxxPxxxxxYxxxxxFxEQRLxGxN PxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxLxxxxxY xxxxxxxxxxxxxxxxxxxGGxxxxxxKxLPxPxAxFxWxxxxxxxxxxxxPxxlxxxxx xxxxxxxxxxxxxxxPxxxxxxx WxxAKxCxQxADxN HxExxxHxxxTHxVMxPxAxxTxxxLxxN H PXXXLLXPHXXFM LXXNXLXXXXXXXXXGXX xxxxxGxLxExxxxxxxxxxxxxxxxWxxxxxxxPxxxxxRxxxxxxxLPHxPxRDDGxL xWxxxxxFVxxYxxxxYxxx xxxxxDxExxxWxxE LxxxxxxxxxGxVxGxxxxxxxxxxLxxxxxxxxxxCxPxHxxxNxxQxxYxxxxxN M PxAxYx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxPxxxxxxxQ xxxxxxLxxxxYDxLGxYxxxxxxxxxxxFxxxxxxxxxxxxxxxxxxFQxxLxxxxxxl xxxNxxRxxxYxxxxPxxxxNSlx x xxx = amino acids that are locate in a key long helix close to the reaction center

xxx = amino acids that are located in a key shorter helix close to the reaction center

xxx = amino acids that are located in a key long helix close to the reaction center

Five essential conserved amino acid residues of the active site which are assumed to be involved in the binding of cofactors are shown in enlarged bold letters.

Consensus sequence for bacterial LOX and UfLOX2 protein sequences - SEQ ID NO: 53 xxxxxxxxxxLPxxxxxxxxRxxxxxxxxxxxxxxxxxxxxxxxxxxxPxxxxxxSxxYx xxRxxxxxxxxxxNxxxxxxxxxx DxxxxxxxxxxxxxxxxxxxPxxxxxxxxxxxxFxEQRLxGxN PxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxLxxxx xYxxxxxxxxxxxxxjg xxxGxxxxxxxKxLPxPxxxFxWxxxxxxxxxxxxxPxxlxxxxxxxxxxxxxxxxxxxx Pxxxxx XXWXXAKXCXQXADXXHXEXXXHXXXXHXXMXPXAXXXXXXXXXXH PXXXLLXXHXXFXXXXXXXXXXXXXXXXGXXX

xxxxGxLxExxxxxxxxxxxxxxxxWxxxxxxxxxxxxxRxxxxxxxLPHxPxRDDG xLxWxxxxxxVxxYxxxxYxxxxx xxxDxExxxxxxExxxxxxxxxxGxVxGxxxxxxxxxxLxxxxxxxxxxCxPxHxxxNxx QxxYxxxxxN M PXAXYXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXPXXXXXXXQXXX XXX Lxxxj<xP <xG_xY_xxxxxxxx_xxxf xxxxxxxxxxxxxxxxxxxQxx Lxxxxxx I xxx N xx Rxxx Yxxxxxxxxx N S I xx xxx = amino acids that locate in a key long helix close to the reaction center

xxx = amino acids that are located in a key shorter helix close to the reaction center

xxx = amino acids that are located in a key long helix close to the reaction center

Five essential conserved amino acid residues of the active site which are assumed to be involved in the binding of cofactors are shown in enlarged bold letters.

Consensus sequence for bacterial LOX, CoLOXs and UfLOX2 protein sequences - SEQ I D NO: 54 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxExxxxxxxPxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxLxxxxxx xxxYxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx P xxxxxxxxxx A Kxxxxx A D XXXXXXXX H XXXX H XXX xPxAxxxxxxxxxxxHPxxxxLxxHxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxx xxxRxxxxxxxLxxxxxRDDGxLxWxxxxxxxxxxxxxxYxxxxxxxxDxxxxxxxxExx xxxxxxxxxxVxGxxxxxxxxx xLxxxxxxxxxxxxxxHxxxNxxQxxYxxxxxNxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxPxxxxxxxxxxxxxxxxxxLxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxx Lxxxxxx I xxx N xxxxxxYxxxxxxxxxxS I XX xxx = amino acids that locate in a key long helix close to the reaction center

xxx = amino acids that are located in a key long helix close to the reaction center

Five essential conserved amino acid residues of the active site which are assumed to be involved in the binding of cofactors are shown in enlarged bold letters.

5. Others

CoLOX forward primer (5'-CTCTCTCTCTTTCTCTCTGTTCT-3') (SEQ ID NO: 55)

CoLOX reverse primer (5'-CTCGTTCCCTTACCGTCT-3') (SEQ ID NO: 56)

UfLOX2 forward primer (5'-TCGTCCAACAGGTTCTCTT-3') (SEQ ID NO: 57)

UfLOX2 reverse primer (5'-TTCTTTCCACTCACCGCCA-3') (SEQ ID NO: 58)

6. Corresponding natural coding sequences for SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50

Coding sequence for WP_002738122.1 - SEQ ID NO: 59

ATGGTTAATACCCCTCCTCCCACTCCTTGTCTGCCCCAAAATGAACCAGATGCGAAT CGGCGGGCTGATTCCCT

CAAT CTT C AACGCC AAGCCT AT AG AT ACG ACT AT C AGT AT CTCCC ACCCTT AGTCCT C ATGG AAT CCGTGCCT G

CAGCGGAAAACTTTTCCTTTCAGTACATTACTGAACGGTTGGCGGCAACTGCGGAAC TACCGGCCAATATGCT

GGCTGTCAAAGTCAAATCTTTTTTAGATCCCCTCGATGAGCTACAAGATTATGAGGA CTTCTTTGCCATTATCCC

CTTACCCAAAATCGCCAAAGTCTATCAAACCAATGATGCCTTTGCCGAACAACGTCT ATCGGGAGCTAATCCCC

TAGTATTACATTTACTGAAGCCGGGGGATGCTCGCGCCCAAGTTCTCAATCAAATCC CTAGTTCTAAGACAGAT

TTCG AG CC ATT GTTT C AG GT CAAT C AAG AATT AG C AG CG G G AA AC ATTT AT ATT G CCG ATT ATACGGGTACGG

ACATTAATTATCTCGGTCCCTCTTTGATTCAAGGGGGAACCCATGCCAAAGGGCGAA AATATTTACCGAAACC

CAGGGCCTTCTTTTGGTGGCGGAAAAGTGGCATCAGAGATCGGGGCAAATTAGTTCC GATCGCTATCCAATTT

GGGGAAAATGCGGAAAAGCTTTATACTCCTTTTGAGAAAAACCCCCTTGCTTGGCTA TTTGCTAAAATTTGTGT

TCAGGTGGCCGATAGCAATCACCACGAGATGAATTCCCATCTCTGTCGAACTCATTT TGTCATGGAACCGATCG

CGATCGGCACGGCCCGGCAACTGGCAGAAAATCATCCCCTCAGTCTTCTGCTTAAGC CACACCTAAGATTTAT GTTAACGAACAACCATCTGGGACAAGAGAGACTGATCAACCCTGGTGGACCGGTGGATGA ATTATTGGCCGG

CACCTTGGGCGAGTCGATGGCACTGGTTAAGGATGCCTACGCAAACTGGAATCTTCG AGACTTTGCCTTTCCC

AAAGAAATAAGTAACCGGGGTATGGACGATACGGAACGACTACCCCACTACCCTTAC CGGGATGATGGGATG

CT GGTTT GGCAGT CT ATT AAT CAGTTT GTTT CT GATT AT CTCCATTATTTTT ACCCAAACCCCCAAG ACAT CACT A

ACGATCAAGAATTGCAAGCATGGGCCGGAGAATTATCTAATTCTGCGGCAGATCAAG GGGGCAATGTGAAG

GGAATGCCGGCCAATTTTACGGATGTAGAGGACTTAATTGAAGTCGTTACCACAATT ATTTTTATCTGCGGGCC

ACTGCATTCAGCTGTTAACTATGGTCAGTATGATTACATGACTTTTGCCGCTAATAT GCCCTTGGCCGCTTACTG

TGATCTTCCAGAAGCGATTAAGGATACTACAGGATCAATAATTGGAGATGCCAGAGG ATCAATTACCGAAAA

AG AC ATT CTT C AG CT ATT G CCT CCTT AT A AA AAG G CTG CCG ATC AGTT AC AA AGT CTGTT C ACTTT AT CCG ACT A

TCGATACGATCAATTGGGCTATTACGATAAAGCTTTTCGAGAACTCTATGGCCGGAA GTTTGAGGAGGTTTTT

GCCGAGGGTGATCAGGCAACAATTACGGGCTTCCTTCGACAATTTCAGCAAAATCTC AATATGAACGAACAAG

AG ATT G ATGCCAAT AAT CAAAAACGG AT CGT ACCCT AT ACCT AT CT AAAACCTT CT CT AAT ACT C AAT AGCAT C

AGCATTTAA

Coding sequence for WP_006635899.1 - SEQ ID NO: 60

ATGGTAGACAATATGAAACCTCTTCTTCCTCAAGACGACCCGAACCCAGAACAGCGCCACGATTCCTTGAATC

GTCAGCAACAAGCTTATCAGTTTGACTATGAGAGTTTATCACCTTTGGCATTATTGA AAGATGTGCCCGCAGTC

GAGAACTTTTCGAGTAAGTATCTTGCAGAACGCATATTAGCAACATCGGAACTTCCA GCAAATATGCTGGCAG

CCGATTCTAGAACTTTTCTCGATCCTCTCGACGAACTCCAAGACTATGAAGACTTTT TTACTTGGCTGCCGCTAC

CTGGAGTGGCCAAAATTTACCAAACCGATCGCTCTTTTGCAGAACAGCGCCTGTCTG GAGCAAATCCCATGGT

GCTTCGCCTGTTACATCAGGAGGACTCTCGGGCAGAAACACTGGCACAACTTTGCTG TTTGCAGCCATTATTCG

ATCTTCGCAAAGAGTTACAGGACAAAAACATTTACATTGCCGATTATACAGGTACTG ACGAACACTATCGCGG

GCCTGCGAAAGTTGCAGGAGGAACCTATGAAAAAGGCAGAAAATACTTGCCGAAACC ACGGGCTTTTTTCGC

TTGGCGGTGGACAGGAATCCGCGATCGCGGTGAAATGACACCTATTGCCATTCAACT AGATCCTAAGCCCGGT

AGCCATCTGTATACCCCATTCGATCCTCCTATCGATTGGCTGTATGCGAAACTCTGC GTACAAGTGGCAGATGC

TAATCACCATGAAATGAGTTCCCATTTAGGTCGAACTCATCTGGTGATGGAACCAAT CGCGATCGTCACCGCCC

GACAGTTGGCTAAAAATCACCCGCTTAGCCTGCTGCTGAAACCGCACTTTCGCTTTA TGTTGACCAACAACGAT

CTGGCGCGTTCTCACTTGATCGCTCCCGGCGGGCCCGTCGATGAATTGCTAGGCGGC ACCTTGGCTGAGACAA

TGGAACTGACTAGAGAGGCGTGCAGTACATGGAGTCTCGATGAATTTGCCTTGCCCG CTGAACTGAAAAATC

GGGGAATGGATGACCCCAATCAACTGCCTCACTATCCTTACCGAGATGATGGATTGT TGCTTTGGGATGCGAT

TGAAACCTTTGTATCGGGCTATCTGAAATTCTTTTACCCGACGAATGAGGGGATCGT ACAAGATGTGGAACTG

CAAACCTGGGCTAAAGAATTAGCGTCTGATGACGGCGGTAAAGTCAAAGGAATGCCA CACCACATCGACACA

GTT G AACAATT AATTGCAATT GT CACAACT GT AATTTTT ACCT GTGGTCCACAACATT CAGC AGT CAATTTT CCC

CAGTATGACTATATGAGTTTTGCGGCCAATATGCCCTTGGCAGCCTACCGGGACATT CCTGGAATTACCGCCTC

GGGTCATCTAGAAGTGATTACGGAAAATGACATTTTACGGTTGCTTCCTCCGTACAA ACGAGCTGCTGACCAA

CTGCAAATTCTGTTTATTTTGTCAGCTTATCGATATGACCGTTTGGGTTATTACGAT AAATCTTTCCGAGAACTC

TACCGGATGAGCTTCGATGAAGTTTTTGCGGGAACGCCGATCCAACTTTTAGCCAGA CAGTTCCAGCAAAATT

TGAATATGGCAGAACAAAAGATTGATGCCAACAATCAAAAACGAGTCATCCCTTATT TTGCTCTCAAGCCTTCG

TTGGTACTAAATAGCATCAGTATGTAG

Coding sequence for WP_015178512.1 - SEQ ID NO: 61

ATGGTAGACAATATGAAACCTTCTCTTCCTCAAGACGACCCGAACCAAGAACAGCGCAAAGATTCCTTGAATC

GCCAGCAACAAGCTTATCAGTTTGACTATGAGAGTTTATCACCTTTGGCATTATTGA AAAATGTGCCCGCAGTC

GAGAACTTTTCGAGCAAGTATATTGGAGAGCGGATATTAGCAACATCGGAACTTCCA GCAAATATGCTGGCA GCCGATTCGAGAACTTTTCTCGATCCTCTCGACGAACTCCAAGACTATGAAGATTTCTTT ACTCTGCTGCCGCTA

CCTGCTGTTGCCAAAATTTACCAAACCGATCGCTCTTTTGCAGAACAGCGCCTGTCT GGAGCAAATCCGATGGT

GCTTCGTTTGTTAGATGCCGGCGATCCTCGGGCGCAAACACTGGCACAAATTTCCAG CTTTCACCCATTATTCG

ATCTGGGCCAAGAGTTGCAGCAAAAAAACATTTACGTTGCCGATTACACGGGTACTG ACGAACACTATCGCGC

GCCTTCAAAAATAGGAGGCGGAAGCTATGAAAAAGGCAGAAAATTCTTGCCGAAACCGCGGGCTTTTTTCGC

TTGGCGGTGGACGGGAATTCGCGATCGCGGTGAAATGACACCAATTGCCATTCAACT AGATCCCACGCCAGA

TAGCCATGTCTACACCCCATTCGATCCTCCTGTGGATTGGCTGTTTGCGAAACTCTG CGTGCAAGTAGCAGATG

CCAATCACCACGAAATGAGCTCGCATTTAGGTCGAACTCATCTGGTGATGGAACCAA TTGCGATCGTCACCGC

CCGACAGTTGGCCCAAAATCACCCGCTGAGCCTGTTGCTGAAACCGCACTTTCGCTT TATGTTGACCAACAACG

AGCTGGCGCGTTCTTATTTGATCGCTCCCGGCGGGCCCGTCGATGAATTGCTAGGCG GTACTTTGCCAGAGAC

AATGGAAATAGCTAGAGAGGCTTGCAGTACCTGGAGTCTCGATGAATTTGCGTTGCC CGCCGAACTGAAAAA

TCGGGGAATGGATGACACAAATCAACTGCCTCACTACCCTTACCGAGATGATGGATT GCTGCTTTGGGATGCG

ATTGAAACCTTTGTATCCGGCTATCTGAAATTCTTTTACCCGACGGAGATCGCGATC GTACAAGATGTGGAACT

GCAAACCTGGGCCCAAGAATTAGCGTCCGATCGTGGCGGTAAAGTCAAAGGAATGCC TCCGCGCATCAACAC

AGTT G AACAATT AATT AAAATT GT CACAACT AT AATTTTCACCT GCGGCCCGCAGCATT CAGCAGT CAATTTT CC

CCAGTATGAATACATGAGTTTTGCCGCCAATATGCCCTTGGCAGCCTACCGAGATAT TCCCAAAATTACTGCTT

CGGGCAATCTCGAAGTGATTACTGAAAAGGACATTTTACGGTTGCTTCCTCCGTACA AGCGAGCGGCTGACCA

ACTGAAAATTCTGTTTACTTTGTCAGCTTATCGATATGACCGTTTGGGTTATTACGA TAAATCTTTCCGAGAACT

CTACCGGATGAGTTTCGACGAAGTTTTTGCGGGAACCCCGATCCAACTTTTAGCCAG ACAGTTCCAGCAAAAT

TTGAATATGGCAGAACAAAAGATTGATGCCAACAATCAAAAACGAGTAATTCCTTAC ATTGCTCTCAAGCCTTC

GTTGGTAATCAATAGCATCAGTATGTAG

Coding sequence for WP_015204462.1 - SEQ ID NO: 62

ATGCCACAACCTTATCTTCCCCAAAACGAACCCAATCCAGAGAAGCGCAATAATGAC TTGAGCGATCAGCAAC

AGGCTT AT G AGT ACG ACT AT AAGT AT CT ACC ACCTTT GGT ATT ACT G AAAAAAAT ACCCGCATT CG AG AATTT C

TCGGCTCAATATATTGCGGAACGGGTAGTAGCAACCTCTGAACTGGTTCCAAATATG CTGGCAGCAAAAGCTA

GATCTTTTCTAGATCCTCTAGATGATATAAAGGACTATGAAGATTTATTTACACTGT TGCCGTTGCCTGAAGTC

GCAAAAGTTTAT CAAACAAAT AATT CCTTCGCT G AACAACGCCT CT CAGG AGCAAAT CCATTCGT G ATTCGCCT

GCTGGATGAAGATGACCCTCGATCGCAAGTCTTAGAGCAGATTCCTAGTTTTAAAGA CGACTTTGAACCATTG

TTCGATGTCCGCAAAGAATTAGCGGCTGGGAACATCTATATTACTGACTATACAGGC ACTGATGAATATTATC

GTGGTCCTTCTATGGTTCAGGGTGGTACTTATGAAAAAGGTCGGAAATATTTACCAA AACCGCTAGCTTTCTTT

TGGTGGCAGCGCACTGGGATCAGCGATCGCGGTAAGCTGGTGCCAATCGCTATCCAA CTAGATGCCAGCAAG

AATAGCAAGGTATATACTCCGACAAATAGCAAGGTATATACTCCCTTTGAGCAGAAT CCACTCGATTGGCTATT

TGCAAAACTTTGCGTTCAAATAGCAGATGGAAATCACCATGAGATGAGTTCCCACTT ATGTCGGACACATTTTG

TAATGGAACCGATCGCAATTGGAACTGCTCACCAATTGGCTGAAAATCATCCTCTCA GCCTTCTACTCAGACCA

CACTTCCTATTCATGTTGACCAATAATCATCTTGGACAGCAAAGGTTAATAAATCCA GGTGGTCCTGTTGATGA

GTTGCTGGCTGGTACTTTACCAGAGTCAATGGAGCTAGTTAAGGATGCTTATGAAGG ATGGAATATAAAGGA

ATTTGCCTTTCCAACCGAGATTAAGAATCGGGGAATGGATAATACGGAAAGACTACC TCACTATCCTTACCGA

GATGATGGGATGCTTGTTTGGAAAGCTATTCACACTTTTGTATCTGACTATGTTAAT CATTTTTACCCAACTCCT

GAAGACATCACTGGAGACACTGAATTGCAAGCATGGGCTAAAGAATTGTCCGATCAA TCCGCTCAAACTAATG

GTGGCAAAGTCAAGGG AATGCCAACAAGTTTTACT ACT GTT CAAG AACT GATT G AAATCGTTACT ACAAT CAT

CTTTATCTGTGGTCCCCAGCATTCAGCAGTAAACTACGCTCAGGATGGATATATGAC TTTTGCCGCTAATATGC

CCTTAGCAGCTTACCGTGATATTCCTAAGCAAAGTCACAAGCCTCAAGACCAACCTA CAGCAACCCCATCTGTA

G C AGTG C A AACT AC AG C AG AG C A AACT AC AG C AG AG C AA ACT A AAG C AGT AG A A ATT AC AG C AG AC AA AG CT

ACATT AG ACCAAAAT ACAGT ATT GCAAAAG AG AGC AGT AC AAACT ACC AC AGT AG AAATTCCAG AAG ACCAA ATTACAGAAGAACAAATTCTTAAGTTGCTGCCTCCCTACAAGAGAACTGCCGATCAACTG CAAAGTCTCTTTGT TTTGTCAGCCTATCAGTACGACCGATTGGGCTACTATGAAAAAGCCTTTCAACAACTTTA TAACGACAAATTTG AGGATGTTTTTAAAGATGACAATAATCAAGCAATTATTGCCATCGTCAGGCAGTTCCAGC AAAATCTGAATAT GGT AG AAC AAG AAATT G ATGCCAAT AAT AAAAAGCG AGT AGTTCCTT AT CTTT ACCT AAAACCTT CT CT AAT AC TC AAC AGT ATT AG C ATTT AG

Coding sequence for WP_028091425.1 - SEQ ID NO: 63

ATGCAGCCATTTCTACCTCAAAATGACCCGAACCCCTCACAACGCCAATCTTCTCTA GAGAAAGGCCGCAAAG AGTATCAGTTCATGTATGATTTTTTGCCGCCTATGGCAATGATCAAAAGCGTACCTCCCG CAGAGAATTTTTCT ACTAAGTATATTGCTGAACGGACATTAGAGGCAGCAGAACTTCCTCTAAATATGATGGCT GTTAAAACTCATG CT AT GTGGG AT CCTTT AG AT G AATTGCAAG ATT AT G AGG ACTTTTT CCCAGTTTTGCAAAAACCT AAT GT GAT G AAAACCTATGAAACCGATGATTCCTTCGCCGAACAACGGCTTTGTGGAGTAAATCCGATG GTTTTACGTCAAA TT AAG C A A AT G CC AG CT A ACTTT G CCTTT ACC ATT G A AG AATT AC AG GAT A AGTTT G G C AGTT CT ATT A ATTT A ATTGAAAGATTGGCCACAGGAAATCTATATGTCGCTGATTATAGATCCTTGGCGTTCATT CAAGGTGGCACTTA TGCCAAAGGAAAAAAGTACCTACCAGCACCTCTAGCTTTTTTCTGTTGGCGCACTTCAGG CTTTCAAGATCGAG GCCAATT AGTCCCT GTAGCCATT CAAATCAATCCCAAAGCAGGT AAAGCCAGCCCCTTGCT AACT CCTTTT G AC GACCCTTTAACCTGGTTTTATGCTAAGTCCTGTGTGCAAATTGCTGATGCTAATCATCAT GAAATGAGTAGCCA TTTATGCCGGACTCACCTGGTAATGGAACCCTTTGCTGTTGTCACTCCTCGTCAACTGGC TGAAAATCATCCTCT GAG AAT ATT ACT CAAACCCCATTT CCGGTTT AT GTT GGCT AAT AAT G ATTT AGCTCGC AAGCGT CT GGTT AGT A GGGGCGGTTTTGTGGATGAATTATTAGCAGGAACTCTGCAAGAATCATTGCAAATTGTGG TAGATGCCTATAA AAGTTGG AGT CT AG ACCAGTTT GCT CT ACCCAG AG AACT C AAAAAT CGCGGT GT G AATG AT GT CAAAAACTT A CCACATT ATCCTT ATCGGGATGATGGAATTTTGTT ATGGAATGCGATT AAT AAGTTTGTATTT AACT ATTTGCAG CTTT ATT ATCAGAGTTCAGCAGACTTGAAAGCAGACGCAGAACTGCAAGCTTGGGCGCGGGAATTAG TGGCT CAGGATGGTGGTAGAGTTAAGGGTATGAGCGATCGCATTGATACCCTAGAACAATTAGTG GAGATTGTTACT ACTATCATATATATTTGTGGTCCGCAGCATTCGGCGGTTAATTTCTCCCAATATGAATAC ATGGGTTTTATTCCT AAT AT GCCCCT AGCT GCTT ATCAACCAATT CAACAAAAGGGT GAT ATT AAAG ACCGTCAAGCCCT CATAG ATTT T CT ACC ACCTGCCAAGCCC ACAAGT ACCC AATT AT C AACT GT GT ACAT ACTTT CAG ACT AT CGTT AT G AC AG ACT GGGATATTATGAAGAGGAAGAATTTACAGATCCAAATGCTGACCAAGTTGTGAATAAATT TCAGCAAGAATTG AAT AT GGT ACAG AG AAAAATT G AATT G AAT AAT AAG AG ACGTTT AGT AAATT AC AAAT AT CTCCAACCAAG AC TT ATT CT C A AC AGT ATT AGT ATTT A A

Coding sequence for OBQ01436.1 - SEQ ID NO: 64

ATGCAGCCATTTCTACCTCAAAATGACCCGAACCCCGCACAACGCCAATCTTGTCTA GAGAAAGGACGCAAAG

AGTATCAATTCATGTATGATTTTTTGCCTCCTATGGCGATGCTCAAAAGCGTACCTCCCGCAGAGAATTTCTCTA

CTAAGTATATTGCTGAACGGACATTAGAGGCAGCAGAACTTCCTCTGAATATGATGG CTGTTAAAACTCATGC

TATGTGGGATCCTTTAGATGAATTGCAAGATTATGAGGACTTTTTCCCAATTTTGCA AAAACCTAATGTGATGA

AAACCTATGAAACCGATGATTCTTTCGCCGAACAACGGCTTTGTGGGGTGAATCCGA TGGTTTTACGTCAAATT

AAG C AA AT G CC AG CT A ACTTT G CCTTT ACC ATCG AAG A ATT ACAG G CT A AGTTT G G CAATT CT ATT A ATTT AAT

CGAAAGATTGGCAACAGGAAATCTATATGTCGCTGATTATAGATCCTTGGCGTTCAT TCAAGGTGGCACTTAT

G CC AA AG G AA AA AAGT ACCT ACC AG C ACCTCTG G CCTTTTT CT GTT GGCGCAGTTCGGG CTTT C A AG AT CG AG

GACAATTAGTCCCTGTAGCCATTCAAATCAATCCCAAAGCAGGTAAAGCCAGCCCCT TGCTGACTCCTTTTGAT

GACCCTTTAACCTGGTTTTATGCTAAGTCCTGCGTGCAAATTGCTGATGCTAATCAT CATGAAATGAGTAGCCA

TTTATGTCGGACTCACTTAGTAATGGAACCCTTTGCTGTTGTCACCCCGCGTCAACT GGCTGAAAATCATCCTCT

GAGAATATTACTCAGACCCCATTTCCGGTTTATGTTGGCTAATAATGATTTAGCTCG CAAACGTCTGGTTAGTA GGGGCGGTTTTGTGGATGAATTATTAGCAGGAACTCTGCAAGAATCATTGCAAATTGTGG TAGATGCCTATAA

AAGTTGGAGTCTAGACCAGTTTGCTCTACCCAGGGAACTCAAAAATCGCGGTGTAGA TGATGTGAAAAACTTG

CCACATTATCCTTATCGGGATGATGGAATTTTGTTATGGAATGCGATTAATAAGTTT GTATTTAACTATTTGCAG

CTTTATTACAAGAGTCCAGCAGACTTGAAAGCAGACGGAGAACTACAAGCTTGGGCG CGGGAATTGGTGGCT

CAGGATGGTGGTAGAGTTAAGGGTATGAGCGATCGCATTGATACCTTAGAACAATTA GTTGAGATTGTTACTA

CTATCATATATATTTGTGGTCCTCAGCATTCGGCGGTTAATTTCTCCCAATATGAAT ACATGGGTTTTATTCCTA

ATATGCCCCTAGCTGCTTATCAAGAAATTCAACAAAACGGTGATATTGAAGACCGTC AAGCCCTGATAGATTTT

CT ACCACC AGC AAAGCCC ACAAAT ACCCAATT AT CAACT GT GT ACAT ACTTT C AG ACT AT CGTT AT G ACAG ACT

GGGATATTATGAAGAGGAAGAATTTACAGATCCAAATGCTGACCAAGTTGTGAATAA ATTTCAGCAAGAATTG

AGTGTGGTACAGAGAAAAATTGAATTGAATAATAAGGGACGTTTAGTAAATTACGAA TATCTCCAACCCGGAC

TT ATT CT C A AC AGT ATT AGT ATTT A A

Coding sequence for OBQ25779.1 - SEQ ID NO: 65

AT CAT AAAT AT CAT GCAGCCATTT CT ACCT C AAAAT G ACCCG AACCCCGG ACAACGCCAAT CTT CT CT AG AG AA

AGGACGCAAAGAGTATCAATTCATGTATGATTTTTTGCCTCCTATGGCGATGCTCAA AAGCGTACCTCCCGCAG

AGAATTTCTCTACTAAGTATATTGCTGAACGGACATTAGAGGCAGCAGAACTTCCTC TAAATATGATGGCTGTT

AAAACTCAT GCTAT GTGGGATCCTTTAG AT G AATTGCAAG ATT AT G AGG ACTTTTT CCCAGTTTTGCAAAAACC

TAATGTGATGAAAACCTATGAAACCGATGATTCCTTCGCCGAACAACGGCTTTGTGG GGTGAATCCGATGGTT

TT ACGT C AA ATT AAG C A A AT G CC AG CT A ACTTT G CCTTT ACC AT CG A AG A ATT ACAG G CT AAGTTT G G C A ATT C

TATTAATTTAATCGAAAGATTGGCCACAGGAAATCTATATGTCGCTGATTATAGATC CTTGGCGTTCATTCAAG

GTGGCACTTATGCCAAAGGAAAAAAGTACCTACCAGCACCTCTAGCCTTTTTCTGTT GGCGCAGTTCAGGCTTT

C A AG AT CG AG G CC AATT AGTCCCTGTAG CC ATT C A AAT C A AT CCC A AG G C AG GTC AAG CC AG CCCCTT G CT A A

CT CCTTTT GAT AAACCTTT AACCT GGTTTT ATGCT AAGT CCT GT GT GC AAATTGCT G ATGCT AAT CAT CAT G AAA

TGAGCAGCCATTTATGTCGGACTCACCTGGTAATGGAACCCTTTGCTGTTGTCACCC CGCGTCAACTGGCTGAA

AATCATCCTCTGAGAATATTACTCAAACCCCATTTCCGGTTTATGTTGGCTAATAAT GATTTAGCCCGCAAGCGT

CTGGTTAGTAGGGGCGGTTTTGTTGATGAATTATTAGCAGGAACTCTGCAAGAATCA TTGCAAATTGTGGTAG

ATGCCTATAAAAGTTGGAGTCTGGACCAGTTTGCTCTACCTAGGGAACTCAAAAATC GCGGTGTAGATGATGT

GAAAAACTTGCCACATTATCCTTATCGGGATGATGGAATTTTGTTATGGAATGCGAT TAATAAGTTTGTATTTA

ACTATTTGCAGCTTTATTACAAGAGTCCAGCAGACTTGAAAGCAGACGGAGAACTGC AAGCTTGGGCGCGGG

AATTAGTGGCTCAAGATGGTGGTAGAGTTAAGGGTATGAGCGATCGCATTGATACCT TAGAACAATTAGTTG

AG ATT GTTACT ACT AT CAT AT AT ATTT GTG GTCCTC AG C ATTCG G CG GTT AATTT CTCCC A AT AT G AAT ACAT G G

GTTTT ATT CCT AAT ATG CCCCT AG CTG CTT ATC AAG C A ATT C AAC A AA AG G GTG AT ATT AAAG ACCGT C A AG CC

CT GAT AG ATTTT CT ACCACCT GCC AAGCCCACAAAT ACCCAATT AT CAACT GT GT ACAT ACTTT CAG ACT AT CGT

TATGACAGACTGGGATATTATGAAGAGGAAGAATTTACAGATCCAAATGCTGACCAA GTTGTGAATAAATTTC

AGCAAGAATTGAATGTGGTACAGAGAAAAATTGAATTGAATAATAAGGGACGTTTAG TAAATTACGAATATCT

CC AACCCAG ACTT ATT CT C AACAGT ATT AGT ATTT AA

Coding sequence for WP_039200563.1 - SEQ ID NO: 66

ATGAAGCCATTTTTACCTCAAAATGACCCAAATCCCACACAACGCCAATCTTCCCTAGAGAAAGGTCGCAAAG

AGTATGAATTTAGGTATGATTTTTTGCCTCCTATGGCGATGCTCAAAAACGTACCTCCCTCTGAGAATTTTTCTA

CCAAGTATATTGCTGAACGGACAATAGAGACAGCAGAACTTCCTAGCAATATGATGGCTGTCAAAGCCCATGC

TATGTGGGACCCCTTAGATGAATTGCAAGACTATGAAGACTTTTTTCCAGTTTTGCAAAAACCTAATGTGATGA

AAAATTATGAAACAGATGATTCCTTCGCCGAACAACGGCTTTGTGGCGTGAATCCTGTGGTTTTATGTCAGATT

AAGCAAATGCCAGCCAACTTTGCCTTTACCATCGAAGAATTGCAAGCTAAGTTTGGCAATTCTATTGATTTAAG

AGAAAGACTGGCAACCGGAAATCTCTATGTAGCTGATTATAGACCTTTGGCGTTCATTCGAGGTGGCACTTTC

GCCAAAGGGAAAAAGTATTTACCAGCACCACTAGCCTTTTTCTGTTGGCGGAGTTCA GGCTTTCAAGATCGTG

GTCAATTAGTACCTATAGCGATTCAAATCAATCCCAAGGAAGGAAAAGCCAGCCCAT TGCTGACCCCTTTTGAT

GACTCTTCTACCTGGTTTTATGCCAAGTCCTGTGTGCAAATTGCTGATGCTAATCAT CATGAAATGAGTAGCCA

TTTATGCCGGACTCACTTTGTGATGGAACCCTTTGCGGTTGTTACTCCTCGTCAATT AGCCCAGAACCATCCGCT

GAGAATATTACTAAAACCCCATTTCCGGTTCATGTTGGCCAACAATGATTTAGGTCG CCAGCGGTTGGTGAAT

AGAGGCGGTCCTGTTGATGAATTATTAGCGGGAACTCTGCAAGAATCACTGCAAATT GTTGTAGATGCTTATA

CAGATTGGAGATTGGATCAGTTTGCGCTGCCAACAGAACTCAAAAATCGCGGTGTGG ATGATGTGAAAAATT

TGCCCCACTATCCCTATCGGGACGATGGGATCTTGTTGTGGAACGCGATTAACAAGT TTGTGTTTAACTATTTG

GAGCTTTACTACAAGAGTCCCGCAGACTTGACAGCAGATGTCGAACTACAAGCTTGG GCGCGGGAATTAGTG

GCTCAGGATGGTGGTAGAGTCAAGGGGATGAGCGATCGCATTGATACTTTGAAACAA TTAGTAGAGATTGTT

ACTACT AT C ATTT AC ACTT GTGGACCTCTG C ATT CTG CTGTT A ATTT CCCCC AAT AT G A AT AC AT G G GTTT C ATT C

CCAATATGCCTCTGGCTGCTTATCAACCAATTAAAAAAGAAGGCGTTTGTACCCGCA AGGAACTGATAGATTTT

TT ACCAGCTGCCAAACC AACAAGT AGCCAATT AAC AACTTT ATT CACACT CT C AGCCT AT CGTT AT G ACAG ACT

AGGATATTATGAAGAGGAAGAATTTGAAGACCCCAATGCTGACGATGTTGTGAATAA ATTCCAGCAAGAATT

G AAT GTGGTGCAAAG AAAAATT G AGTT G AGCAAC AAGGG ACGTTT AGT AAATT ACG AAT AT CT AC AACCCAG

ACTT AT CCT C A AC AG CAT C AG C ATTT AA

Coding sequence for WP_012407347.1 - SEQ ID NO: 67

AT G AAACC AT ACCT CCCT CAG AAT GAT CCT G ACCCT ACAAAACGT CAAAT ATTGCT AG AG AG AAAT CAAGGGG

AGT AT G AATTT GATT ACG ACTTTTTGGT ACCT ATGGCAATGCT AAAAAAT GT ACCTT CT AT AG AAAACTTTT C AA

CTAAGTATATTGCTGAACGGACATTAGAGACAGCAGAACTGCCTATAAATATGTTAG CCGTTAAAACCCGTTC

TTT ATGGG ACCCTTT AG AT G AATTGC AAG ACT AT G AAG ACT ATTTT CC AGTTTTGCCT AAACCT AAT ATT AT CAA

AACAT ACCAAAGT GAT G ACT CTTTTT GT G AGCAACGGCTTT GT GGGGCAAAT CCTTTT GTTTTACGTCG AATT G

AGCAGATGCCAGATGGCTTCGCCTTTACCATTTTAGAATTGCAAGAAAAATTTGGTG ACTCTATTAACTTAGTA

GAAAAACTTGCGAATGGAAATTTATATGTAGCTGATTACAGAGCGCTTGCGTTTGTT AAAGGAGGTAGTTATG

AAAGAGGTAAGAAGTTTTTACCAACCCCTATAGCTTTCTTTTGTTGGCGCAGTTCTG GTTTTAGCGATCGCGGT

CAACT AGT ACCG ATT GTT AT CC AAAT CAACCCCGC AG ATGGC AAAC AG AGCC AG CT AATT AC ACCTTTCG AT G A

CCCTTT AACCT GGTTT CAT GCCAAGCTTT GT GTT CAAATTGCT G ATGCT AACCAT CAT G AAAT G AGTAGCCATCT

GT GT CG AACT CACTTT GTT ATGG AACCCTTT GCT ATT GT CAC AGCCCGT CAACT AGCCG AG AACCAT CCCCTT A

GCTTACTGCTAAAACCCCACTTCCGTTTCATGTTGGCTAATAATGACTTGGCTCGTA AGCGCCTAATTAGTAGA

GGTGGGCCTGTTGACGAATTGCTAGCCGGAACTCTGCAAGAGTCATTGCAAATTGTC GTTAACGCATATACAG

AATGGAGCTTAGATCAGTTTTCCTTACCTACTGAACTAAAAAATCGGGGTATGGATG ATCCAGACAACTTACCT

CACT AT CCCT ATCG AG ACG AT GGCTTATTATT GT GG AAT GCCATT AAAAAGTTT GT GT CT G AAT ACTT GCAG AT

ATACTACAAAACTCCCCAAGATTTAGCAGAAGACTTGGAATTACAAAGTTGGGTGCA GGAATTAGTTTCCCAA

TCAGGCGGACGAGTCAAGGGTATTAGCGACCGCATCAACACATTAGACCAATTAGTT GATATTGCTACTGCGG

TTATCTTCACCTGTGGGCCGCAACACGCTGCTGTTAACTACTCACAATATGAATATA TGACTTTCATGCCAAATA

TGCCTCTTGCTGCTTATAAACAAATGACATCAGAAGGCACTATTCCTGACCGTAAAA GTCTATTATCATTTCTGC

CACCGTCAAAGCAAACTGCTGACCAATTATCGATTTTATTTATCCTGTCGGCCTACC GTTATGACAGATTAGGG

TACTACGATGATAAATTTTTAGACCCAGAGGCTCAGGATGTTTTAGCTAAATTCCAG CAGGAGTTGAATGAAG

CAGAGCGGGAAATTGAGTTGAATAACAAGAGTCGTTTAATAAATTACAACTATCTCA AACCAAGGCTTGTGAC

TA ATAGTATT AG CGTGTAA

Coding sequence for WP_027843955.1 - SEQ ID NO: 68

ATGAAACCCTATCTTCCTCAAAATGACCCTAACCCTGAGAAGCGGAAAGATTGGCTTAATAAAAATCGTGAAG

AGTACCAATTTAACTTCAATTATCTTTCTCCCCTCCCATTAATTGATGATGTTCCTA ATAATGAGGCTTTTTCCCC

TAAATACCTTGCAGAACGCTTACCTTTAACTTTCGGTAAATTATCTGCTAATACCTT GGGAATTAGACTTCGCTC

TTTTT GGG ATCCTTTT GAT G AATT CC AAG ATT AT G AGG ACTTTTT CCCT GTTTT ACCAACACCGG AATT ACT CAA

GACCTACCAAAATGACGAATACTTTGCCGAACAAAGGCTAAGTGGAGTAAATCCTAT GGTAATACGCAGTATT

AAGGAACTACCCCCTCACTTTGCATTTTCCATCCGAGATTTACAGGCTGAATTTGGT ACATCCCTAAATTTAGA

GCAAGAACTGAACAACGGAAATCTATATATCGCAGACTATACCAGTCTTTCATTTGT TCGGGGAGGAAGCTAT

CTT AG G G GT CG A AAGT CTTT ACCTG C ACCC AT AG CCTT ATTTT G CTG G CGT AATT CTG GTT ATT GCGATCGCGG

AGAATTAACCCCAATCGCTATTCAACTAGTACCGGAACTTGGTACGGGAAGTAGAAT TTTAACTCCTTTTGATT

CT CACCTT AACTGGTT AT ATGCCAAAATTT GT AT GCAG ATTGCAG ATGCAAATCAT CAT G AAAT G AGTAGCCAT

TTATGTCATACTCACCTAGTGATGGAACCTTTCGCAGTTGTAACAGCTCGACAGCTA GCTGAAAATCATCCGTT

GGGTTTGTTGCTGCGTCCCCACTTCCGGTTCATGCTCCACAACAATGAATTAGCCCG TAAAAATTTAATTAATC

AAGGTGGGTACGTTGATAATCTCCTTGGGGGAACCTTAAGAGAATCCCTACAAATTG TCCGGGATGCTTACTT

TAAAAAT GCT G AAG AATTTTGG AGCTTAG ACG AATTTGCTTTACCT AAAG AAAT CGCAAATCGTGGCTT AG AT

GAT ACT GAT CGCTT ACCCC ACT ACCCCT AC AG AG AT G ATGG AAT GTT ACT GT GG AATGCG AT CG AG AAATTT G

TATCGAATTATTTGAGTATATATTATCCAAATCCAGGGGACATTAAAGATGATCGCG AACTGCAAGCTTGGGC

TGCAGAATTAGTTGCTGCTGATGGTGGACGAGTAAAAGGGGTACCCTCACAATTTGA AAATCTGCAACAATTA

ATCGACGTTGTAACTGGCATTATTTTTACATGCGGACCTCAGCACTCTGCTGTAAAT TATCCCCAATATGAATAT

ATGGCATTTGTTCCGAATATGCCCCTCGCAGGTTACCAAGCTGTGGATTCTAATCCC AACATGGATCTGAAAAG

TTT AAT GGCGTTT CT CCCCCC ACCC AAT C AAACTGC AG AT C AACT AC AAATT ATTT ACGG ATT AT CAGCTT AT CG

TT AT G ACCGCTTGGGTT ACT ACG ACCG AG AATTT AGCG AT CCT CAT GCT G AAG AAGTT GT CAG ACT ATTT CAAC

AAG ATTT AAAT C AGGTGG AACGT AAAATT G AGTT ACGT AAC AAAAAT CGCTTGGTT G AAT AT AACTTCCT CAA

G CCTT CTTT AGTT CTT AAT AGTATC AGT AT AT A A

Coding sequence for WP_073641301.1 - SEQ ID NO: 69

AT G AAACC AT ACCTTCCT CAAAAT G ACCCT G ACCCG AT AAAACGC AAAT ATTCCTT AG AGC AT AAG AAAG AAG

AATACGAATTCGATCACGACTTTTTATCACCGATGGCAATGCTCAAAGATGTACCTG CTGTCGAAAATTTTTCT

ACCAGGTATATTGCTGAACGTACAGTAGAGACAGCAGAGCTTCCTATCAATATGTTG GCTGTTAAAACCCGTG

CTTTATGGGACCCTTTAGATGAATTGCAAGACTATGAAGACTATTTTCCAGTCTTGC CTAAACCTAATGTCATCA

AAACATACCAAACAGATGATTCTTTTTGCGAACAACGCCTGTGTGGGGCGAATCCTA TGGCTTTACAGCAAAT

TAAAGAGATGCCGTTGGGGTTTGAATTTACCATCGAAGAACTGCAAGAAAAGTTTGG CGAATCTATCAATTTG

GTAGAAAAACTTGCTGATGGAAATTTATATGTGACTGATTACAGACCGCTTTCATTT GTAAAAGGTGGTACTTA

CGAGAGAGGTAAAAAGTATTTACCAACACCCCTAGCTTTTTTCTGTTGGCGGAGTTC TGGGTTTAGCGATCGC

GGTCAACTCGTACCT ATT G CC ATCC AACT C A AT CCCGCAGTCGG CAG AC A AAG CCA ATT AAT C AC ACCTTTT G A

CGATCCTTTAACTTGGTTTCATGCCAAACTTTGTGTTCAAATTGCTGATGCTAACCA TCATGAGATGAGTAGCC

AT CTTTGCCG AACT C ACTTT GT CATGG AACCTTT CGCCATT GT CACAGCCCGT C AATT AGCT GAT AAT CAT CCT C

T C AATTT GTT ATT AA AACCCC ACTTCCGTTT C ATGTTG G CT AAT AAT G ACTT G G GTCG C AAG CG CTT AGTT AAT A

GGGGCGGACCTGTTGATGAATTGCTAGCTGGAACTCTGCAAGAATCATTGCAAATTG TCGTCAACGCCTATAA

AGAATGGAGTCTAGATGAATTTGCCTTACCCACTGAAATCAAAAATCGGGGTATGGA TGATAAACTAAAATTG

CCTCACTATCCCTATCGAGACGATGGGATGCTATTGTGGAATGCTATTAAAAAGTTT GTGTCTGAATACTTGAA

GTTATACTATAAAACTCCCCAAGATTTGACAGCAGACTTAGAATTGCAAGCTTGGGC GCAGGAATTAGTTTCT

GAATCAGGCGGACGAGTTAAAGGCGTTCCCTCTCGCATTGAAAAATTAGAACAATTA GTTGATATTGCGACTG

CGGT AATTTTCACCT GT GG ACCACAACACGCTGCT GTT AACT ATT CACAAT AT G AAT AT AT G ACCTT CAT GCCG

AATATGCCCCTTGCTGCTTATAAACAAATGACAGCAGAAGGCACTATTGCTGACCGC AAAAGCCTATTATCATT

TCTGCCACCGTCAAAGCAAACTGCCGATCAATTGTCGATTTTATTCATCCTGTCAGCTTACCGTTATGATAGGTT

AGGTTACTATGACGATAAGTTCGCAGACCCAGAAGCTCAGGATATTCTAGTTACATTTCAGCAGGATTTGAAC

GAGGTAGAGCGTAAAATTGAGTTGAACAACAAGAGTCGTTTAATAAAGTATAACTACCTCAAACCAAGGCTTG

TTACCAATAGCATTAGCGTCTAA

Coding sequence for WP_096647440.1 - SEQ ID NO: 70

AT G AAACC AT AT CTTCCACAG AAT GAT CCT G AACCT ACACAACGC AAG AATTT CCTGG AGCGCAAACAAG G AG

AGTAT G A ATTT GAT C AC A A ATTTTT A AAG CCTATG G C A AT G CT A AA A AAT GTACCCTCT ATT G A AA ATTTTT CT A

CTAAATATATTGCTGAACGTACGGTAGAGACGGCAGAACTTCCTCTAAATATGTTAG CCGTTAAAACTCGTTCT

TT GTGGG AT CCTTT AG AT G AATTGCAAG ACTAT G AAG ACT ATTTT CCAGTTTT ACCTAAACCT AAT GT CAT CAA

AACAT ACCAAACT GAT AACTCTTTCT GT G AACAACGGCTTT GT GGTGCAAATCCTTTAGTTTT ACGCCAAATT CA

GCAGATGCCAGATGGCTTTGCCTTTACCATTTCAGAACTGCAAGAAAAGTTCGGTGA CTCTATCGACTTAGAA

GAAAGACTTAAAACTGGAAATTTATATGTAGCTGATTACAGAGCGCTTGCATTTGTT AAAGGAGGTACTTATG

AAAGAGGTAAGAAGTATTTACCCACTCCCATAGCGTTCTTTTGTTGGCGTAGTTCTG GTTTTAGCGATCGCGGT

CAACTAGTACCGATTGCTATCCAAATCAATCCCACAGATGGTAAACAGAGTCAGTTA ATCACACCTTTTGATGA

G CCTTT G GTCTG GTTT CAT G CC AA ACTTT GTGTT C A AAT CG CTG ATG CT AACC AT CAT G AAAT G AGTAGT CAT C

T GT GT CG AACT C ACTTT GT AAT GG AACCCTT CGCCATT GT C AC AGCCCGT C AACT AGC AG AT AACC AT CCCCT C

AACTT ATT GCTT AAACCCC ACTTCCGTTT CAT GTT AGCT AAT AAT G AATT AGGTCGT CAGCGCCT AGTT AAT AG A

GGTGGGCCTGTTGACGAATTGCTAGCGGGAACTTTGCAAGAGTCATTGCAAATTGTC GTCAACGCATATAAAG

AATGGAGCTTAGATCAGTTTTCTTTACCCACCGAACTCAAAAATCGGGGTATGGATA ATTCAGACAAACTACCT

CACT AT CCTTATCG AG ACG ATGGCTT ACT ATT GTGG AATGCCATTAAAAAATTT GT GT CT G AAT ACTT G AAACT

AT ACT AT A A AACT CCT C AAG ATTT A AC AG C AG ACTTT G AATT AC AAT CTT G G G CG C AG G A ATT AGTTTCCC AAT

CAGGCGGGCGGGTCAAGGGCGTTAGCGACCGCATTACAACATTAGACCAATTAATTG ATATTGCTACGGCGG

TT ATTTT C ACCTGTG G G CC AC A AC ACG CT G CTGTT AATT ACT C AC A AT AT G A AT AT AT G ACTTT C ATTCCC AAT A

TGCCCCTCGCTGCTTATAAACAAATAACATCAGAAGGAAATATCCCTGATCGTAAAA GCCTACTATCATTTCTT

CCACCATCAAAGCAAACTGCTGATCAATTATCGATTTTATTCATCTTGTCCGCCTAC CGTTATGACAGATTAGG

GTACTATGACGATAAATTTTTAGATCCGGAGGCACAGGAGATTTTAGTTACATTTCA GCAGGAGTTGAACGAA

GCAGAACGGCAAATTGAGTTGAACAATAAAAGCCGTTTAATAAATTACGACTATCTG AAACCAAGGCTTGTTA

CT AAT AG CAT C AG CGT AT AA

Coding sequence for WP_099099431.1 - SEQ ID NO: 71

AT G AAACC AT ATTT ACCACAAAAAG AT CCT GAT GTT AAGGT CCG AAT C AATT GGCT AG AT AAAAAT CG AG AAG

AGT ACAAATTT AATT ACG ATT AT CT AGCTCCT CT ACC AGT AATT GAT AAAGTTCCT CAT AAGG AAAT ATT CTCGG

C AG A AT AT ACT ACT A AACGTTT G G C A AGT ATG G C A AGT CTT G C ACC AAAT ATG CTAG CTG CC AA AG CC AG AA A

CTT CTT AG ACCC ATT AG AT G AATT GG AAG AAT AT G AAG AACTTTT GT CACT ACT ACCAAAACCCG AT GT CAT AA

AAAATTACAAAACAGACTCCTGTTTTGCGGAACAACGACTCTCTGGAGCGAACCCAT TAGCTATCCAAAAAATT

GATGTATTACCTGATAATTTTGCTGTCACAGATGCACATTTTCAGAAGGTTGCAGGT ACAGAATTTACTTTGGA

AAA AG CACT C AAG G AAG G C AAG CTGT ATTT CTT AG ATT ATCCTTT GTT ATCTG AT ATT A AAG GTGGTGTCTACA

AT AAT GTT A AAA AGT ACCTT CCC A AG CC AC A AG CT CT ATTTT ATT G G C A AAGT AAT G ATAGTCCT AAT GGTG GT

TCTCTAGTGCCTGTTGCCATCCAGATTAATCATGACTCTGGTGGAAAAAGCGTGATT TATACACCAGATGACCC

CCATTTAGATTGGTTTTTGGCAAAAACCTGCGTTCAAATTGCTGATGGCAACCATCA AGAATTGGGTAGTCATT

TCGCCTATACCCATGCAGTTATGGCTCCGTTTGCAATTGTAACTGCGCGGCAGCTAG CAGAAAATCATCCCATC

GCCTTACTGTTAAAACCCCACTTCCGTTTTATGCTATTTGATAACGATTTGGGGCGC ACTCAGTTTTTACAACCT

GGAGGCCCGGTTGATGAGTTTATGGCAGGTTCATTGGCGGAGTCTCTTGGATTTGTA GCGAAGGTTTATGAA

GAATGGAGTGTGGAAAAATTTACCTTCCCTCGGTTAATAAAAAGTCGCCGAACGGATGACCCAGAAATTTTAC

CGCACTTTCCTTTCCGGGACGATGGTATGTTAATTTGGAATGCCGTCGAAAAGTTTGTGTATGAATATTTGCAA

CTCTATTACAAAACCTCACAGGATCTAATTGATGACTATGAGTTGCAAAATTGGGCT AGAGAATTAGTGGCTC

AAGATGGTGGTAGAGTCAAGGGAATGCCAGCCAAGATTGAGACTCTAGAACAACTGA TTGAAATCATCAGTG

TGGT AGT ATT CACTT GCGCTCCT CT ACACTCT GCTTT G AATTTTT CT CAGT ACG AAT AT ATGGCTTTT GT ACCCA

ATATGCCCTATGCAGCTTATCACCCAATTCCAGAAACTAAAGGTGTGGATTTGGAAA CTATTATGAAGATACTT

CCTCCCTTTAAACAAGCTGCCGACCAGGTGATGTGGACTGAGATTTTAACATCATAC CACTATGATAAATTGGG

TTTTTATGATGAGGAGTTTGCCGATCCATTAGCGCAGGAAATTGTGGTGCAATTCCA ACAGAATTTGCATGAA

ATAGAACGGCAAATAGACATTAGAAATCAAACTCGTCCCATACCTTATAACTACTTC AAGCCTTCGCAAATTAT

TA AC AG C ATT AAT ACTT G A

Coding sequence for WP_052672367.1 - SEQ ID NO: 72

ATAAAACCATATTTACCTCAACACGAGCCTGATGCGATCGCGCGGCAAAATCGCTTA ATCAAAAACCGCGCTG

ATT ATGTTCT CG ACT AT A ACT AT CTG CC ACCT ATTCCTTT G C A AACT CCTGTTCCT C A AC AAG AACGTTTTT CTG C

TGAATACACTGCAAGGCGTTTAGCTAGTTTTGCTAATCTCGTCCCCAATATGTTGAT GGCGAGGGCGAGAAAT

GCTTTCGATCCTTTAGATACGTTAGAGGAATACGCGGACTTATTACCAGTCTTACCA AAACCTAATGTCATCAA

AAATTATCAAGCAGATTGGTGTTTTGCCGAACAAAGATTATCTGGTATTAACCCGCC AGCTATCCGCCGCATAG

ATGCTTTGCCAGAAAATTTGCCCATCTCTAACTCTTCGTTTCAACACTCTGTAGGTG CAGAACATAATCTGGAA

CAAGCACTCAAAGAAGGTAAGTTGTATTGTTTAGACTACCCGTTGTTATCTGGTATT GGAGGCGGTAATTACC

AGAATTTACCTAAATATCTGCCCAAACCGCAAGCGCTCTTTTATTGGCGTAGTGATA ATAGCAAAATCGGCGG

CTCTTTAGTTCCGGTAGCGATTAAAATTCTCAATGAATTGGGAGGGAAAAATTTAGT CTATACGCCCAATGATG

CACCTCTCGACTGGTTTCTTGCCAAAACCTGCGTGCAAATGGCAGATGCAAACCATC AGGAATTAGGCACTCA

TTTTGCTAAAACTCATGCTGTTATGGCTCCTATTGCGGCAATTACAGCTAGGGAATT AGGCGAAAACCATCCTT

TAACTTTGCTGCTAAAACCTCATTTCCGGTTCATGCTGTTTGATAATGAGTTAGGAC GCACGCAGTTTTTGCAA

CCTACTGGTCCTACTGAAGAACTGCTAGCTGGAACGCTGGAAGAATCTGTGCAATTG GTCGTGCAAGCTTATG

AGGAATGGAGTATAGATACTACTTTTCCTTTAGAATTGCAGCAACGGCAAATGCATG ACCCAGAGATTTTACC

TCATTACCCGTTCCGAGATGATGGCATATTAGTCTGGAATGCTATACATCAGTTTGT TACTGAATATTTGCAGA

TTTACTACCACACTCCGCAAGATATCAGTGCAGACTACGAGGTGCAAAATTGGGCTA GGGAATTGGTAGATA

GCGGTCGAGTTAAAGGAATGCCAGAGAGCATTGATACTCTAGCACAACTAATTGACA TTATCGCTGTAGTCAT

CTTTACCTGCGCTCCTCTGCATTCTTGCTTGAATTTAGCCCAGTACGAATACATGAC TTTCGTGCCAAATATGCC

TTATGCAGCCTACCACCCTATTCCCACTACTAAGGGCGTAGATATGGCAACTATTGT CAAAATTATGCCGCCTT

TTCAAAGAGCGATCGATCAAATATTGTGGACGGATATTTTGAGCGCTTTCCAATATG ACAAGTTGGGTTTTTAT

GAGGAAGATTTTGCCGATCCCAAGGCTCAGGAAGTGCTACAGCGCTTTCAAGATAAC TTGCAGCAGGTAGAA

GAAAAGATAGAAATGCACAATCAGATTCGCCCAATACCTTACAACTACCTCAAGCCT TCTCGGATTATGAACA

G C ATT AAT ACTT AA

Coding sequence for WP_073631249.1 - SEQ ID NO: 73

ATGAAACCCTACTTACCCCAACATGACCCAAATCCTGAAGCTCGGAGAAATTGGCTGGAACAAAACCGAGAA

GACTACAAATTTGACCACAATTATTTGGCTCCCATACCAATACTTGATAAGGTGCCTCATAAAGAACTCTTCTC

GCCGCAATATACCGCTAAGCGCTTAGCAAGTATGGCGGATCTCGTACCCAATATGCTTGCTGCCAAAGCCAGA

AATTTCTTCGATCCACTGGATGAATTGGAAGAATATGAAGCCCTGTTGTCGATATTACCAAAGCCCTCTGTCAT

AAAAAATTACAAAACAGATTCGTGTTTCGCCGAGCAAAGACTCTCTGGGGCAAACCCGATGGCAATGCACAG

GATTGACGAGCTACCAGAAAAATTCCCTGTGACAAACGACCACTTTCAAAAAGCTGTAGGTGCAGAACACAAT

TTGGAGGCGGCACTCAAAGAAGGCAAACTCTATTTATTAGATTATCCTTTGCTATTTGACATTAAAGGCGGTAC

CTACCAGAACATTAAAAAGTACCTTCCCAAGCCGCAGGCTCTATTTTACTGGCAAAGCAATGGCAATAAAAAT

AGTGGTTCTCTGGTGCCTATCGCCATTCAGATCCATAATGATACTGGTGGAGATAGCCTGATTTACACACCAGA

TGACCCCCATTTAGATTGGTTTTTGGCAAAAACCTGCGTACAAATTGCTGATGCCAA CCATCAGGAATTGGGTA

GCCATTTT GCACGTACT CATGCAGTCAT GGCTCCATTT GCAATT GTCACT GCT CG ACAGTT GGG AG AAAACCAT

CCCCTCGCCTTACTTCTGAAACCCCACTTCCGATTCATGCTCTATGATAACGATTTG GGACGTACTCACTTTTTA

CAAGCAGGAGGTCCGGTTGATGAGTTTATGGCAGGTACGTTGCAGGAGTCTCTTGGT TTCGTTGCCAAAGCCT

ACGAAGAATGGAGTTTAGACAATGCTGTCTTCCCGACGGAAGTGAAGAATCGCAAAA TGGATGATCCAGACA

TTTTGCCGCACTATCCTTTCCGGGACGACGGGATGTTACTCTGGGATGCGGTCAAAA AGTTTGTGACTGAATA

CTTGCAACTCTATTACAAAACTCCCCAAGACTTGAGCGAGGATTATGAATTGCAAAA TTGGGCGAGAGAATTG

GCTGCCCAAGATGGTGGTTGTGTCAAGGGGATGCCAGAGAAAATTGAGACCATAGAG CAACTCATTCATGTT

GTGACTGTAGTCGTCTTCACCTGCGCTCCTCTCCACTCGGCTTTGAATTTTTCCCAG TACGAATACATGGCTTTC

GTACCCAATATGCCTTATGCAGCCTATTACCCCGTTCCAGAAACAAAGGGTGTGGAT ATGCAGACTATCATGA

AGATGCTTCCACCTTTTAAGCAAGCTGCTGATCAGGTGATGTGGTCGGATATTTTGA CATCCTTCCATTACGAC

AAATT GGGTCACTAT GAT GAAG AATTTGCCAACCCAAT GGCT CAGGCAATT CTTTT GCAGTTCCAACAAAATTT

GCAT G AAGTGG AACG ACAAAT AG AAAT C AAAAAT CAAT CTCGTCCAAT ACCAT AT AACTACCT C AAGCCTT CT

G AAATT ATT A AT AG CAT C AAT ACTT G A

Coding sequence for WP_013220336.1 - SEQ ID NO: 74

ATGAATACCTCGCTACCGCAAAATGATTCCGATCCCCAGGGCCGAAAGGATCGGCTT GAAAGACGGCGAGCG

CT GT AT GT ATTT AATT ACG ATT AT GTGCCGCCCAT ACCG AT GATT GAT AAGGTCCCT CAT G AAG AGT ATTT C AG

TCCAAAATACACTGCAGAACGTTTGGCGTCCATGGCGAAGCTAGCGCCTAATATGCT TGCCGCTAAAACCAAG

CGGCTCTTCGACCCGCTTGATGAACTGAATGAATATGATGAGATGTTCATCTTCCTG GACAAACCGGGTATTGT

CCGCGGCTATCGAACAGATGAATCCTTTGGGGAACAACGCCTATCCGGCGTTAATCC CATGTCAATACGCCGC

CTTGATAAACTCCCCGAAGACTTTCCGATCATGGATGAGTATCTGGAACAAAGTTTG GGTTCTCCACATACTCT

CGCGCAGGCACTCCAAGAAGGACGGCTTTATTTTCTGGAGTTCCCTCAATTGGCTCA TGTGAAAGAAGGCGGA

CTTTACCGGGGACGGAAAAAATACCTGCCCAAGCCCCGGGCTTTATTTTGCTGGGAC GGGAATCATTTGCAGC

CGGTGGCCATCCAAATTAGCGGACAACCAGGGGGGCGGCTCTTTATTCCCCGGGATT CTGATTTAGATTGGTT

TGTAGCCAAGTTGTGCGTCCAGATTGCCGATGCCAATCATCAGGAACTTGGCACCCA CTTTGCCCGTACTCATG

TGGTGATGGCGCCTTTTGCCGTGGTGACCCACCGTCAATTGGCGGAAAATCATCCTC TGCATATTCTGTTGCG

GCCTCATTTCCGGTTCATGCTCTACGACAATGATTTGGGGCGTACCCGATTTATCCA GCCAGATGGTCCGGTG

GAGCACATGATGGCGGGCACTCTAGAAGAGTCCATTGGGATTTCCGCTGCCTTTTAT AAGGAATGGCGGCTA

GATGAAGCCGCCTTTCCCATTGAAATTGCCCGCCGCAAGATGGATGACCCGGAGGTA TTGCCCCATTATCCCTT

CCGGGACGATGGGATGCTGCTATGGGACGGTATTCAGAAATTTGTGAAGGAATACTT GGCCCTTTATTATCAA

AGTCCTGAAGATTTGGTCCAGGACCAGGAACTGCGGAACTGGGCTAGGGAGCTTACC GCCAATGACGGGGG

CCGGGTAGCGGGTATGCCGGGGCGTATTGAAACCGTCGATCAGCTTACCAGCATCCT TAGCACGGTCATTTAT

ACTTGTGCACCCTTGCACTCGGCACTGAATTTTGCCCAGTACGAGTATATCGGCTAT GTCCCGAATATGCCCTA

TGCGGCCTATCACCCCATTCCCGAAGAGGGAGGCGTGGATATGGAAACGCTGATGAA AATTCTGCCTCCCTAC

GAGCAGGCTGCGCTGCAGCTGAAATGGACCGAGATCCTCACTTCCTACCATTATGAT CGCTTGGGACATTATG

ATGAAAAATTCGAAGATCCCCAGGCGCAAGCCGTAGTGGAACAATTCCAACAGGAGC TAGCGGCAGTAGAAC

AGGAGATTGATCAGCGTAACCAAGACCGTCCGCTAGCCTACACGTATCTGAAGCCTT CGGAAATTATCAATAG

CATTAATACCTGA

7. Coding sequences (start codon changed to ATG) and the amino acid sequences mined from NCBI

Coding sequence for WP_108935963.1 - SEQ ID NO: 75

ATGGTTAATACCCCTCCTCCCACTCCTTGTCTGCCCCAAAATGAACCAGATGCGAAT CGCCGGGCTGATTCCCT

CAATCTTCAACGGCAAGCCTATAGATACGACTATCAGTATCTCCCACCTTTAGTCCT CATGGAATCCGTGCCTG

C AG CG G AA AACTTTTCCCTT C AGT AC ATT ACT GAACGGTTGGCGG C A ACT G CG G A ACTACC AG CC AAT AT G CT

GGCTGTCAAAGTCAAATCTTTTTTAGATCCCCTCGATGAGCTACAAGATTATGAGGA CTTCTTTGCTATTATCCC

CTTACCCAAAATCGCCAAAGTCTATCAAACCAATGATGCCTTTGCCGAACAACGTCT ATCGGGAGCTAATCCCC

TAGTATTACGTTTACTGAAGCCGGGGGATGCTGGCGCCCAAGTTCTCAATCAAATCC CCAGTTCTAAGACAGA

CTTCGAGCCATTGTTTCAGGTAAATCAAGAATTAGCGGCAGGAAACATTTACATTGC CGATTATACGGGTACG

GATGCTAATTATCTCGGTCCCTCTTTTGTTCAAGGGGGAACCCATGCCAAAGGGCGA AAATATTTACCGAAAC

CCAGGGCCTTCTTTTGGTGGCGGAAAAGTGGCATCAGAGATCGGGGCAAATTAGTTC CGATCGCTATCCAATT

TGGGG AAAATGCGG AAAAGCTTT ATACT CCTTTT G AG AAAAACCCCCTTGCTTGGCT ATTTGCT AAAATTT GT G

TTCAGGTGGCCGATAGCAATCACCACGAGATGAATTCCCATCTCTGTCGAACTCATT TTGTCATGGAACCGATC

GCGATCGGCACAGCCCGGCAACTGGCAGAAAATCATCCCCTCAGCCTTCTGCTTAAG CCACACCTAAGATTTA

TGTTAACGAACAACCATCTGGGACAAGAGAGACTGATCAACCCTGGTGGACCGGTGG ATGAATTATTGGCCG

GCACCTTGGGCGAGTCGATGGCACTGGTTAAGGATGCCTACGCAAACTGGAATCTTC GAGACTTTGCCTTTCC

CAAAGAAATAAGTAACCGGGGTATGGATGATACGGAACGACTACCCCACTACCCTTA CCGGGATGATGGGAT

GCT GGTTT GGCAGT CT ATT AAT CAGTTT GTTT CT GATT ATCTCCATT ATTTTT ACCCAAACCCCCAAG ACATCACT

AACGATCAAGAATTACAAGCATGGGCCAGAGAATTATCTAATTCTGCGGCAGATCAA GGGGGCAATGTGAAG

GGAATGCCAGCCAATTTT ACGGATGTAGAGGACTT AATTGAAGTCGTTACCACAATT ATTTTT ATCTGCGGGCC

ACTG C ATT CG G CCGTC AACTATG GTC AGT AT GATT AC AT G ACTTTT G CCG CT AAT AT G CCCTT G G CCG CTT ACT

GTGATCTTCCAGAAGCGATTAAGGATACTACAGGATCAATAATTGGAGATGCCAGAG GATCAATTACCGAAA

AAGACATTCTTCAGCTATTGCCTCCTTATAAAAAGGCTGCCGATCAGTTACAAAGTC TGTTCACTTTATCCGACT

ATCGATACGATCGATTGGGCTATTACGATAAAGCTTTTCGAGAACTCTATGGACGGA AGTTTGAGGAGGTTTT

TGCCGAGGGTGATCAGGCAACAATTACGGGCTTCCTTCGACAATTTCAGCAAAATCT CAATATGAACGAACAA

GAG ATT G ATGCCAAT AAT CAAAAACGG AT CGT ACCCT AT ACCT AT CT AAAACCTT CT CT AAT ACT C AAT AGCAT

CAGCATTTAA

Amino acid Sequence for WP_108935963.1 - SEQ ID NO: 76

MVNTPPPTPCLPQNEPDANRRADSLNLQRQAYRYDYQYLPPLVLMESVPAAENFSLQYITERLAATAELPANMLA

VKVKSFLDPLDELQDYEDFFAIIPLPKIAKVYQTNDAFAEQRLSGANPLVLRLLKPGDAGAQVLNQIPSSKTDFEPLFQ

VNQELAAGNIYIADYTGTDANYLGPSFVQGGTHAKGRKYLPKPRAFFWWRKSGIRDRGKLVPIAIQFGENAEKLYT

PFEKNPLAWLFAKICVQVADSNHHEMNSHLCRTHFVMEPIAIGTARQLAENHPLSLLLKPHLRFMLTNNHLGQER

LINPGGPVDELLAGTLGESMALVKDAYANWNLRDFAFPKEISNRGMDDTERLPHYPYRDDGMLVWQSINQFVSD

YLHYFYPNPQDITNDQELQAWARELSNSAADQGGNVKGMPANFTDVEDLIEVVTTIIFICGPLHSAVNYGQYDYM

TFAANMPLAAYCDLPEAIKDTTGSIIGDARGSITEKDILQLLPPYKKAADQLQSLFTLSDYRYDRLGYYDKAFRELYGR

KFEEVFAEGDQATITGFLRQFQQNLNMNEQEIDANNQKRIVPYTYLKPSLILNSISI
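
The correspondence between a listed coding sequence and its amino acid sequence can be checked by translating the coding sequence with the standard genetic code. The following is a minimal Python sketch and is not part of the original disclosure: the names translate, BASES, AMINO and CODON_TABLE are illustrative assumptions. Whitespace introduced by line wrapping is ignored, and any character other than A, C, G or T raises a KeyError.

```python
# Illustrative sketch (not from the source document): translate a coding
# sequence with the standard genetic code so it can be compared with the
# amino acid sequence listed for the same accession.
BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    seq = "".join(cds.split()).upper()  # drop spaces and line breaks
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 45 nucleotides of SEQ ID NO: 75 and its final 15 nucleotides.
start = translate("ATGGTTAATACCCCTCCTCCCACTCCTTGTCTGCCCCAAAATGAA")
end = translate("AGC ATC AGC ATT TAA")
print(start, end)
```

Applied to the opening and closing codons of SEQ ID NO: 75, the sketch yields residues that agree with the beginning (MVNTPPPTPCLPQNE) and the end (SISI) of SEQ ID NO: 76.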

Coding sequence for WP_110985169.1 - SEQ ID NO: 77

ATGCCCAGCCTGCCTCAGAACGATCCCGACCTACAAGCGCGTCAAGCTCTACTCAAG CAGCAGCAGGAGCGCT

ATCAATTTAACTTCGAGTATCTGGCACCGCTGGCCATGCTGGATGAAGTTCCCAAGG ATGAGAATTTCTCCGG

CGCTTATCTTGCCGAACGTCTAACGCGCGCCGCTGATCTCCCGGTCAATATGTTGGC GGCGAAGGCTCATTCTC

TCTTAGATCCCCTAGATCGCCTGGAGGATTATGACGACTTGTTTACCTTGCTGCCTAAACCGGCTATTGCCAAT

ACATTCCAAACGGATGAAGTCTTTGCTGAACAGCGGTTGTCAGGAGCGAATCCAATGGCAATTCGCAGACTTG

ATCCCAGCAATCCGCCGTCGGCATATCTCAATATTAAGCAACAGCTAGCAACCAAGG GTAAAACGCTCGTCGA

GCGTAATCTTTACTACGTTGACTACAGCGAACTCAGCTTTATCCAGGGGGGAACCTA CGCCAAGGGCAAAAAG

TACCTACCCACTCCCTTTGCTCTTTTTAGTTGGCAGTCAATGGGGTATCGCGATCAC AAGACCAGCGATCATGG

CGAACTACTGCCCATTGCCATTCAGATTCAGCAAAACAACAGTGGTCGAGTCTATAC GCCCCGAGATGCCCAT

CTTGACTGGTTATTTGCCAAACTCTGTGTCCAGATTGCTGACGGTAATCATCACGAG ATGAGCAGCCATCTGTG

TCGCACTCATTTTGTTATGGAACCCATTGCCGTAGTCACTGCACGCCAACTGGCCGA AGATCACCCACTCTATA

TTTTACTGCAGCCTCACTTCCGATTTATGTTGGCCAACAACGAGCTGGGCCGGAAGC AGCTCATACAACACGG

TGGCCCGGTAGATAAGCTTTTGGCCGGGACGCTGGCCGAATCTTTGCAGGTTGTCAA AAATTCCTTTGAATCC

TGGAGCCTTGATCAGTTTTCCTTCCCCACCGAGGTTCGCAATCGCGGTATGGATAGC CCAGATCTGCCCCATTT

CCCTTACCGAGATGACGGCCAGCTCGTCTGGGATGCGATTTATAAATTTGTGACCGA CTACCTGCGGCTCTTTT

ATGCTGACTCTGACGCTCTTAAAAACGATGAAGAGCTACAGAGCTGGCTTAAAGAAC TGCGCGATCCGCAGG

GCGGACGCATCAAAGGCGTGCCCGAGCATATTCAAGCGCTAGAGCCGCTCGTTGAAA TGGTGACCACCATTA

TTTTTACCTGTGGCCCGCAGCACTGTGCCGTCAACTATACCCAATATGAATATATGG CTCTGGCCTCCAACATTC

CCCTAGCGGCCTATCAAGATCTAACAGGTCTTGAAAACGGCTCCGAGACTAAACCTG CCATCACTGACGAAGC

CCACCTGATGCAGTATCTGCCGCCCTACCAGCAGGCTGCAGGACAGCTTCAAATCAT GAATATTTTGACGGAC

TATCGCTATGACAAGTTGGGCTACTATGACCGCACCTTCAAGGATGCTTTTGCTGGA AGCAGTTTTGACACCGC

T GTT G ATGCT GTT GTCG AGCAGTT CAAGC AG AAT CT ACG AGT CGT AG AG ACT G AAATT GAT CT CG AT AACCGC

AAACGCGTGATTGAGTATCCCTACCTAAAGCCCTCTTTAATCTTGAATAGCATCAGT ATCTAG

Amino acid Sequence for WP_110985169.1 - SEQ ID NO: 78

MPSLPQNDPDLQARQALLKQQQERYQFNFEYLAPLAMLDEVPKDENFSGAYLAERLTRAADLPVNMLAAKAHSL

LDPLDRLEDYDDLFTLLPKPAIANTFQTDEVFAEQRLSGANPMAIRRLDPSNPPSAYLNIKQQLATKGKTLVERNLYY

VDYSELSFIQGGTYAKGKKYLPTPFALFSWQSMGYRDHKTSDHGELLPIAIQIQQNNSGRVYTPRDAHLDWLFAKL

CVQIADGNHHEMSSHLCRTHFVMEPIAVVTARQLAEDHPLYILLQPHFRFMLANNELGRKQLIQHGGPVDKLLAG

TLAESLQVVKNSFESWSLDQFSFPTEVRNRGMDSPDLPHFPYRDDGQLVWDAIYKFVTDYLRLFYADSDALKNDEE

LQSWLKELRDPQGGRIKGVPEHIQALEPLVEMVTTIIFTCGPQHCAVNYTQYEYMALASNIPLAAYQDLTGLENGS

ETKPAITDEAHLMQYLPPYQQAAGQLQIMNILTDYRYDKLGYYDRTFKDAFAGSSFDTAVDAVVEQFKQNLRVVE

TEIDLDNRKRVIEYPYLKPSLILNSISI

Coding sequence for WP_053540410.1 - SEQ ID NO: 79

ATGCAGCCATTTCTACCTCAAAATGACCCGAACCCGGCACAACGCCAATCTTCTCTAGAGAAAGGACGCAAAG

AGTATCAATTCATGTATGATTTTTTGCCGCCTATGGCGATGCTCAAAAGCGTACCTCCCGCAGAGAATTTTTCT

ACTAAGTATATTGCTGAACGGACATTAGAGGCAGCAGAACTTCCTCTAAATATGATGGCTGTTAAAACTCATG

CTATGTGGGATACTTTAGATGAATTGCAAGATTATGAGGACTTTTTCCCAGTTTTGCAAAAACCTAATGTGATG

AAAACCTATGAAACCGATGATTCCTTCGCCGAACAACGGCTTTGTGGGGTGAATCCAATGGTTTTACGTCAAA

TTAAGCAAATGCCAGCTAACTTTGCCTTTACCATCGAAGAATTACAGGATAAGTTTGGCAATTCTATTAATTTA

ATCGAAAGATTGGCCACAGGAAATCTATATGTCGCTGATTATAGATCCTTGGCGTTCATTCAAGGTGGCACTT

ATGCCAAAGGAAAAAAGTACCTACCAGCACCTCTAGCCTTTTTCTGTTGGCGCAGTTCGGGCTTTCAAGATCG

AGGCCAATTAGTACCTGTAGCCATTCAAATCAATCCCAAGGCAGGTAAAGTCAGCCCCTTGCTAACTCCTTTTGATG

ACCCTTTAACCTGGTTTTATGCTAAGTCCTGTGTGCAAATTGCTGATGGTAATCATCATGAAATGAGTAGC

CATTTATGCCGGACTCACCTGGTAATGGAACCCTTTGCTGTTGTCACCCCGCGTCAACTGGCTGAAAATCATCC

TCTGAGAATATTACTCAAACCCCATTTCCGGTTTATGTTGGCTAATAATGATTTAGCTCGCAAGCGTCTGGTTA

GTAGGGGCGGTTTTGTGGATGAATTATTAGCAGGAACTCTGCAAGAATCATTGCAAATTGTGGTAGATGCCTA

TAAAAGTTGGAGTCTAGACCAGTTTGCTCTACCTAGGGAACTCAAAAATCGCGGTGTAGATGATGTGAAAAAC

TTGCCACATTATCCTTATCGGGATGATGGAATTTTGTTATGGAATGCGATTAATAAG TTTGTATTTAACTATTTG

CAGCTTTATTACAAGAGTCCAGCAGACTTGAAAGCAGACGGAGAACTGCAAGCTTGG GCGCGGGAATTGGTG

GCTCAGGATGGTGGTAGAGTTAAGGGTATGAGCGACCGCATTGATACCCTAGAACAA TTAGTTGAGATTGTT

ACTACTATCATATATATTTGTGGTCCGCAGCATTCGGCGGTTAATTTCTCCCAATAT GAATACATGGGTTTTATT

CCTAATATGCCCCTAGCTGCTTATCAAGAAATTCAACAAAAGGGTGATATTGAAGAC CGTCAAGCCCTGATAG

ATTTT CT ACCACC AGC AAAGCCCACAAAT ACCCAATT AT CAACT GT GT AC AT ACTTT C AG ACT ATCGTT AT G ACA

GACTGGGATATTATGAAGAGGAAGAATTTACAGATCCAAATGCTGACCAAGTTGTGA ATAAATTTCAGCAAG

AATTGAATGTGGTACAGAGAAAAATTGAATTGAATAATAAGGGACGTTTAGTAAATT ACGAATATCTCCAACC

CAG ACTT ATT CT CAAC AGT ATT AGT ATTT AA

Amino acid Sequence for WP_053540410.1 - SEQ ID NO: 80

MQPFLPQNDPNPAQRQSSLEKGRKEYQFMYDFLPPMAMLKSVPPAENFSTKYIAERTLEAAELPLNMMAVKTHA

MWDTLDELQDYEDFFPVLQKPNVMKTYETDDSFAEQRLCGVNPMVLRQIKQMPANFAFTIEELQDKFGNSINLIE

RLATGNLYVADYRSLAFIQGGTYAKGKKYLPAPLAFFCWRSSGFQDRGQLVPVAIQINPKAGKVSPLLTPFDDPLTW

FYAKSCVQIADGNHHEMSSHLCRTHLVMEPFAVVTPRQLAENHPLRILLKPHFRFMLANNDLARKRLVSRGGFVD

ELLAGTLQESLQIVVDAYKSWSLDQFALPRELKNRGVDDVKNLPHYPYRDDGILLWNAINKFVFNYLQLYYKSPADL

KADGELQAWARELVAQDGGRVKGMSDRIDTLEQLVEIVTTIIYICGPQHSAVNFSQYEYMGFIPNMPLAAYQEIQ

QKGDIEDRQALIDFLPPAKPTNTQLSTVYILSDYRYDRLGYYEEEEFTDPNADQVVNKFQQELNVVQRKIELNNKGR

LVNYEYLQPRLILNSISI

Coding sequence for WP_035367771.1 - SEQ ID NO: 81

ATGATCAATATTATGCAGCCATTTCTACCTCAAAATGACCCGAACCCCGCACAACGC CAATCTTGTCTAGAGAA

AGGCCGCAAAGAGTATCAATTCATGTATGATTTTTTGCCGCCTATGGCGATGCTCAAAAGCGTACCTCCCGCA

GAGAATTTTTCTACTAAGTATATTGCTGAACGGACATTAGAGGCAGCAGAACTTCCT CTAAATATGATGGCTGT

TAAAACT CATGCTAT GTGGG AT CCTTT AG AT G AATT GCAAG ATTAT G AGG ACTTTTTCCCAGTTTTGCAAAAAC

CTAATGTGATGAAAACCTATGAAACCGATGATTCCTTCGCCGAACAACGGCTTTGTG GAGTAAATCCGATGGT

TTT ACGT C AA ATT AAG C AA AT G CC AG CT AACTTT G CCTTT ACC ATT G A AG A ATT AC AG GAT A AGTTT GGCAGTT

CTATTAATTTAATTGAAAGATTGGCAACCGGAAATCTATATGTCGCTGATTATAGAT CCTTGGCGTTCATTCAA

G GTG G C ACTT AT G CC AA AG G A AA AA AGT ACCT ACC AG C ACCT CTAG CTTTTTT CTGTTG G CG C ACTT CAG G CTT

TCAAGATCGAGGCCAATTAGTACCTGTAGCCATTCAAATCGCCCCCAAAGCAGGTAA AGTCAGCCCCTTGCTA

ACTCCTTTTGATGACCCTTTAACCTGGTTTTATGCTAAGTCCTGTGTGCAAATTGCT GATGCTAATCATCATGAA

ATGAGCAGCCATTTATGCCGGACTCACCTGGTAATGGAACCCTTTGCTGTTGTCACC CCCCGTCAACTGGCTGA

AAATCATCCTCTGAGAATATTACTCAAACCCCATTTCCGGTTTATGTTGGCTAATAA TGATTTAGCTCGCAAGC

GTCTGGTTAGTAGGGGCGGTTTTGTGGATGAATTATTAGCAGGAACTCTGCAAGAAT CATTGCAAATTGTGGT

AGATGCCTATAAAAGTTGGAGTCTAGACCAGTTTGCTCTACCCAGAGAACTCAAAAA TCGCGGTGTGAATGAT

GTCAAAAACTTACCACATTATCCTTATCGGGATGATGGAATTTTGTTATGGAATGCG ATTAATAAGTTTGTATT

TAACTATTTGCAGCTTTATTATCAGAGTTCAGCAGACTTGAAAGCAGACGCAGAACT GCAAGCTTGGGCGCGG

GAATTAGTGGCTCAGGATGGTGGTAGAGTTAAGGGTATGAGCGATCGCATTGATACC CTAGAACAATTAGTG

GAGATTGTTACTACTATCATATATATTTGTGGTCCGCAGCATTCGGCGGTTAATTTC TCCCAATATGAATACATG

GGTTTTATTCCTAATATGCCCCTAGCTGCTTATCAACCAATTCAACAAAAGGGTGAT ATTAAAGACCGTAAAGC

CCT CAT AG ATTTT CT ACCACCAGCC AAGCCCACAAAT ACCCAATT AT CAACT GT GT ACAT ACTTT CAG ACT AT CG

TTATGACAGACTGGGATATTATGAAGAGGAAGAATTTACAGATCCAAATGCTGACCAAGTTGTGAATAAATTT

CAGCAAGAATTGAATATGGTACAGAGAAAAATTGAATTGAATAATAAGGGACGTTTAGTAAATTACGAATATC

TCCAACCAAGACTTATTCTCAACAGTATTAGTATTTAA

Amino acid Sequence for WP_035367771.1 - SEQ ID NO: 82

MINIMQPFLPQNDPNPAQRQSCLEKGRKEYQFMYDFLPPMAMLKSVPPAENFSTKYIAERTLEAAELPLNMMAV

KTHAMWDPLDELQDYEDFFPVLQKPNVMKTYETDDSFAEQRLCGVNPMVLRQIKQMPANFAFTIEELQDKFGSS

INLIERLATGNLYVADYRSLAFIQGGTYAKGKKYLPAPLAFFCWRTSGFQDRGQLVPVAIQIAPKAGKVSPLLTPFDD

PLTWFYAKSCVQIADANHHEMSSHLCRTHLVMEPFAVVTPRQLAENHPLRILLKPHFRFMLANNDLARKRLVSRG

GFVDELLAGTLQESLQIVVDAYKSWSLDQFALPRELKNRGVNDVKNLPHYPYRDDGILLWNAINKFVFNYLQLYYQ

SSADLKADAELQAWARELVAQDGGRVKGMSDRIDTLEQLVEIVTTIIYICGPQHSAVNFSQYEYMGFIPNMPLAAY

QPIQQKGDIKDRKALIDFLPPAKPTNTQLSTVYILSDYRYDRLGYYEEEEFTDPNADQVVNKFQQELNMVQRKIELN

NKGRLVNYEYLQPRLILNSISI

Coding sequence for OBQ35765.1 - SEQ ID NO: 83

ATGAAGCCATTCCTACCTCAAAATGACCCGAACCCCGGACAACGCCAATCTTCTCTA GAGAAAGGCCGCAAAG

AGTATCAATTCATGTATGATTTTTTGCCTCCTATGGCGATGCTCAAAAGCGTACCTCCGGCAGAGAATTTTTCTA

CTAAGTATATTGCTGAACGGACATTAGAGGTAGCAGAACTTCCTCTGAATATGATGG CTGTTAAAACTCATGC

TATGTGGGATCCTTTAGATGAATTGCAAGATTATGAGGACTTTTTCCCAGTTTTGCA GAAACCTAATGTGATGA

AAACCTATGAAACTGATGATTCCTTTGCCGAACAACGGCTTTGTGGGGTAAATCCGA TGGTTTTACGTCAAATT

AAGCAAATGCCAGCTAACTTTGCCTTTACCATCGAAGAATTACAGGATAAGTTTGGC AATTCTATTAATTTAAT

CGAAAGACTGGCAACGGGAAATCTATATGTCGCTGATTATAGATCCTTGGCGTTCGT TCAAGGTGGCACTTAT

GCCAAAGGGAAAAAGTACCTACCAGCACCTCTAGCTTTTTTCTGTTGGCGCAGTTCA GGCTTTCAAGATCGAG

GCCAATTAGTCCCTGTAGCCATTCAAATCAATCCCAAGGCAGGTAAGGTCAGCCCCT TGCTAACTCCTTTTGAT

GAT CCTTT AACCTG GTTTT ATG CT AAGTCCTGTGT AC AAATT G CTG ATG CT AAT CAT CAT G A AAT GAGTAGCCA

TTTATGCCGGACTCACCTGGTAATGGAACCCTTTGCTGTTGTCACCCCGCGTCAACT GGCTGAAAATCATCCTC

TGAGAATATTACTCAAACCCCATTTCCGGTTTATGTTGGCTAATAATGATTTAGCTC GCAAGCGTCTGGTTAGT

CGGGGCGGTTTTGTGGATGAATTATTAGCAGGAACTCTGCAAGAATCATTGCAAATT GTGGTAGATGCCTATA

AAAGTT GG AGT CT AG ACCAGTTTGCT CT ACCCAG AG AACT CAAAAAT CGCGGT GT AG AT GAT GT G AAAAACTT

G CC AC ATT AT CCTT ATCG G G ATG AT G G A ATTTT GTT AT G G A AT G CG ATT AAT AAGTTT GT ATTT AACT ATTT G C A

GCTTTATTATCGAAGTTCAGCAGACTTGAAAGCAGACGGAGAACTGCAAGCTTGGGC GCGGGAATTGGTGGC

TCAGGATGGTGGTAGAGTTAAGGGTATGAGCGATCGCATTGATACCTTAGAACAATT AGTGGAGATTGTTACT

ACTATCATATATATTTGTGGTCCTCAGCATTCGGCGGTTAATTTCTCCCAATATGAA TACATGGGTTTTATTCCT

AATATGCCCCTAGCTGCTTATCAAGCAATTCAACAAAAGGGTGATATTAAAGACCGT CAAGCCCTGATAGATT

TT CT ACCACC AGCAAAGCCCAC AAAT ACCC AATT AT C AACT GT GT ACAT ACTTT CAG ACT AT CGTT AT G AC AG A

CTGGGATATTATGAAGAGGAAGAATTTACAGATCCAAATGCTGACCAAGTTGTGAAT AAATTTCAGCAAGAAT

TGAATGTGGTACAGAGAAAAATTGAATTGAATAATAAGGGACGTTTAGTAAATTATG AATATCTCCAACCAAG

ACTT ATT CT CAACAGT ATT AGT ATTT A A

Amino acid Sequence for OBQ35765.1 - SEQ ID NO: 84

MKPFLPQNDPNPGQRQSSLEKGRKEYQFMYDFLPPMAMLKSVPPAENFSTKYIAERTLEVAELPLNMMAVKTHA

MWDPLDELQDYEDFFPVLQKPNVMKTYETDDSFAEQRLCGVNPMVLRQIKQMPANFAFTIEELQDKFGNSINLIE

RLATGNLYVADYRSLAFVQGGTYAKGKKYLPAPLAFFCWRSSGFQDRGQLVPVAIQINPKAGKVSPLLTPFDDPLT

WFYAKSCVQIADANHHEMSSHLCRTHLVMEPFAVVTPRQLAENHPLRILLKPHFRFMLANNDLARKRLVSRGGFV

DELLAGTLQESLQIVVDAYKSWSLDQFALPRELKNRGVDDVKNLPHYPYRDDGILLWNAINKFVFNYLQLYYRSSAD

LKADGELQAWARELVAQDGGRVKGMSDRIDTLEQLVEIVTTIIYICGPQHSAVNFSQYEYMGFIPNMPLAAYQAIQ

QKGDIKDRQALIDFLPPAKPTNTQLSTVYILSDYRYDRLGYYEEEEFTDPNADQVVNKFQQELNVVQRKIELNNKGR

LVNYEYLQPRLILNSISI

Coding sequence for OBQ09764.1 - SEQ ID NO: 85

ATGCAGCCATTTCTACCTCAAAATGACCCGAACCCCGCACAACGCCAATCTTGTCTA GAGAAAGGCCGCAAAG

AGTATCAATTCATGTATGATTTTTTGCCTCCTATGGCGATGCTCAAAAGCGTACCTCCCGCAGAGAATTTCTCTA

CTAAGTATATTGCTGAACGGACATTAGAGGCAGCAGAACTTCCTCTGAATATGATGG CTGTTAAAACTCATGC

TATGTGGGATCCTTTAGATGAATTGCAAGATTATGAGGACTTTTTCCCAATTTTGCA AAAACCTAATGTGATGA

AAACCTATGAAACCGATGATTCTTTCGCGGAACAACGGCTTTGTGGGGTAAATCCGA TGGTTTTACGTCAAAT

TAAGCAAATGCCAGCTAACTTTGCCTTTACCATCGAAGAATTACAGGCTAAGTTTGG CAATTCTATTAATTTAA

TCGAAAGATTGGCAACAGGAAATCTATATGTCGCTGATTATAGATCCTTGGCGTTCG TTCAAGGTGGCACTTA

TGCCAAAGGAAAAAAGTACCTACCAGCACCTCTAGCCTTTTTCTGTTGGCGCAGTTC AGGCTTTCAAGATCGA

G G CC AATT AGTCCCTGTAG CC ATT C A AAT C AAT CCC A AG G C AG GT A AAG CC AG CCCCTT G CT A ACTCCTTTT G A

TGACCCTTTAACCTGGTTTTATGCTAAGTCCTGTGTGCAAATTGCTGATGGTAATCA TCATGAAATGAGCAGCC

ATTTATGCCGGACTCACTTTGTCATGGAACCCTTTGCGGTTGTTACCCCTCGTCAAC TGGCTGAAAATCATCCTC

TGAGAATATTACTCAAACCCCATTTCCGGTTCATGTTGGCTAACAATGATTTAGGTC GTCAGCGGCTGGTGAAT

AGGGGCGGTATTGTTGATGAATTATTAGCAGGAACTCTGCAAGAATCATTGCAAATT GTGGTAGATGCCTATA

AAAGTTGGAGTCTGGACCAGTTTGCTCTACCCAGAGAACTCAAAAATCGCGGTGTAG ATGATGTGAAAAACTT

G CC AC ATT AT CCTT ATCG G G ATG AT G G A ATTTT GTT AT G G A AT G CG ATT AAT AAGTTT GT ATTT AACT ATTT G C A

ACTTTATTACAAGAGTCCAGCAGACTTGAAAGCAGACGGAGAACTGCAAGCTTGGGC GCGGGAATTGGTGGC

TCAGGATGGTGGTAGAGTTAAGGGTATGAGCGATCGCATTGATACCTTAGAACAATT AGTTGAGATTATTACT

ACTATCATATATATTTGTGGTCCTCAGCATTCGGCGGTTAATTTCTCCCAATATGAA TACATGGGTTTTATTCCT

AATATGCCCCTAGCTGCTTATCAAGAAATTCAACAAAAGGGTGATATTAAAGACCGT CAAGCCCTCATAGATTT

T CT ACC ACCAGCAAAGCCC ACAAAT ACCCAATT AT CAACT GT GT AC AT ACTTT C AG ACT ATCGTT AT G ACAG AC

TGGGATATTATGAAGAGGAAGAATTTGCAGATCCAAATGCTGACCAAGTTGTGAATA AATTTCAGCAAGAATT

G AGT GTGGT ACAG AG AAAAATT G AATT G AAT AAT AGGGG ACGTTT AGT AAATT ACG AAT AT CT CC AACCCGG

ACTTATTCTCAACAGTATTAGTATTTAA

Amino acid Sequence for OBQ09764.1 - SEQ ID NO: 86

MQPFLPQNDPNPAQRQSCLEKGRKEYQFMYDFLPPMAMLKSVPPAENFSTKYIAERTLEAAELPLNMMAVKTHA

MWDPLDELQDYEDFFPILQKPNVMKTYETDDSFAEQRLCGVNPMVLRQIKQMPANFAFTIEELQAKFGNSINLIE

RLATGNLYVADYRSLAFVQGGTYAKGKKYLPAPLAFFCWRSSGFQDRGQLVPVAIQINPKAGKASPLLTPFDDPLT

WFYAKSCVQIADGNHHEMSSHLCRTHFVMEPFAVVTPRQLAENHPLRILLKPHFRFMLANNDLGRQRLVNRGGI

VDELLAGTLQESLQIVVDAYKSWSLDQFALPRELKNRGVDDVKNLPHYPYRDDGILLWNAINKFVFNYLQLYYKSPA

DLKADGELQAWARELVAQDGGRVKGMSDRIDTLEQLVEIITTIIYICGPQHSAVNFSQYEYMGFIPNMPLAAYQEI

QQKGDIKDRQALIDFLPPAKPTNTQLSTVYILSDYRYDRLGYYEEEEFADPNADQVVNKFQQELSVVQRKIELNNRG

RLVNYEYLQPGLILNSISI

Coding sequence for OBQ23315.1 - SEQ ID NO: 87

ATGCAGCCATTTCTACCTCAAAATGACCCGAACCCGGCACAACGCCAATCTTCTCTAGAGAAAGGACGCAAAG

AGTATCAATTCATGTATGATTTTTTGCCGCCTATGGCGATGCTCAAAAGCGTACCTCCCGCAGAGAATTTTTCT

ACTAAGTATATTGCTGAACGGACATTAGAGGCAGCAGAACTTCCTCTAAATATGATGGCTGTTAAAACTCATG

CTATGTGGGATACTTTAGATGAATTGCAAGATTATGAGGACTTTTTCCCAGTTTTGCAAAAACCTAATGTGATG

AAAACCTATGAAACCGATGATTCCTTCGCCGAACAACGGCTTTGTGGGGTGAATCCAATGGTTTTACGTCAAA

TT AAGCAAAT GCCAGCTAACTTT GCCTTT ACCAT CG AAG AATT ACAGG ATAAGTTTGGCAATT CT ATT AATTT A

ATCGAAAGATTGGCCACAGGAAATCTATATGTCGCTGATTATAGATCCTTGGCGTTC ATTCAAGGTGGCACTT

ATGCCAAAGGAAAAAAGTACCTACCAGCACCTCTAGCCTTTTTCTGTTGGCGCAGTT CGGGCTTTCAAGATCG

AG G CCA ATT AGTACCTGT AG CC ATT C AA AT C A ATCCC AAG G C AG GT AA AGT C AG CCCCTT G CT AACTCCTTTT G

AT G ACCCTTT AACCTGGTTTT AT GCT AAGT CCT GT GT GCAAATTGCT GAT GCT AATCATCAT G AAAT G AACAGC

CATTTATGCCGGACTCACCTGGTAATGGAACCCTTTGCTGTTGTCACCCCGCGTCAA CTGGCTGAAAATCATCC

TCTGAGAATATTACTCAGACCCCATTTCCGGTTTATGTTGGCTAATAATGATTTAGC TCGCAAGCGTCTGGTTA

GTAGGGGCGGTTTTGTGGATGAATTATTAGCAGGAACTCTGCAAGAATCATTGCAAA TTGTGGTAGATGCCTA

TAAAAGTTGGAGTCTAGACCAGTTTGCTCTACCTAGGGAACTCAAAAATCGCGGTGT AGATGATGTGAAAAAC

TTGCCACATTATCCTTATCGGGATGATGGAATTTTGTTATGGAATGCGATTAATAAG TTTGTATTTAACTATTTG

CAGCTTTATTATAAGAGTTCAGCAGACTTGAAAGCAGACGGAGAACTGCAAGCTTGG GCGCGGGAACTAGTG

GCTCAGGATGGTGGTAGGGTTAAGGGTATGAGCGATCGCATTGATACCCTAGAACAA TTAGTTGAGATTGTT

ACTACTATCATATATATTTGTGGTCCGCAGCATTCGGCGGTTAATTTCTCCCAATAT GAATACATGGGTTTTATT

CCTAATATGCCCCTAGCTGCTTATCAAGCAATTCAACAAAAGGGTGATATTAAAGAC CGTCAAGCCCTCATAGA

TTTT CT ACC ACCTGCCAAGCCCACAAAT ACCCAATT AT CAACT GT GT AC AT ACTTT C AG ACT ATCGTT AT G ACAG

ACTGGGATATTATGAAGAGGAAGAATTTACAGATCGAAATGCTGACCAAGTTGTGAA TAAATTTCAGCAAGA

ATT G AAT GTGGT ACAG AG AAAAATT G AATT G AAT AAT AAGGG ACGTTT AGT AAATT ACG AAT AT CT CCAACCC

AGACTTATTCTCAACAGTATTAGTATTTAA

Amino acid Sequence for OBQ23315.1 - SEQ ID NO: 88

MQPFLPQNDPNPAQRQSSLEKGRKEYQFMYDFLPPMAMLKSVPPAENFSTKYIAERTLEAAELPLNMMAVKTHA

MWDTLDELQDYEDFFPVLQKPNVMKTYETDDSFAEQRLCGVNPMVLRQIKQMPANFAFTIEELQDKFGNSINLIE

RLATGNLYVADYRSLAFIQGGTYAKGKKYLPAPLAFFCWRSSGFQDRGQLVPVAIQINPKAGKVSPLLTPFDDPLTW

FYAKSCVQIADANHHEMNSHLCRTHLVMEPFAVVTPRQLAENHPLRILLRPHFRFMLANNDLARKRLVSRGGFVD

ELLAGTLQESLQIVVDAYKSWSLDQFALPRELKNRGVDDVKNLPHYPYRDDGILLWNAINKFVFNYLQLYYKSSADL

KADGELQAWARELVAQDGGRVKGMSDRIDTLEQLVEIVTTIIYICGPQHSAVNFSQYEYMGFIPNMPLAAYQAIQ

QKGDIKDRQALIDFLPPAKPTNTQLSTVYILSDYRYDRLGYYEEEEFTDRNADQVVNKFQQELNVVQRKIELNNKGR

LVNYEYLQPRLILNSISI

Coding sequence for OBQ30848.1 - SEQ ID NO: 89

ATGCAGCCATTTCTACCTCAAAATGACCCAAACCCGGCACAACGCCAATCTTGTCTAGAGAAAGGCCGCAAAG

AGTATAAATTCATGTATGATTTTTTGCCGCCTATGGCAATGATCAAAAGCGTACCTCCCGCAGAGAATTTTTCT

ACTAAGTATATTGCTGAACGGACATTAGAGGCGGCAGAACTTCCTCTAAATATGATGGCTGTTAAAACTCATG

CTATGTGGGATCCTTTAGATGAATTGCAAGATTATGAGGACTTTTTCCCAGTTTTGCAAAAACCTAATGTGATG

AAAACCTATGAAACCGATGATTCCTTCGCCGAACAACGGCTCTGTGGGGTAAATCCGATGGTTTTACGTCAAA

TTAAGCAAATGCCAGCTAACTTTGCCTTTACCATCGAAGAATTACAGGATAAGTTTGGCAATTCTATTAATTTA

ATCGAAAGATTGGCCACAGGAAATCTATATGTCGCTGATTATAGATCCTTGGCGTTCGTTCAAGGTGGCACTT

ATGCCAAAGGGAAAAAGTACCTACCAGCACCTCTAGCTTTTTTCTGTTGGCGCAGTTCAGGCTTTCAAGATCGA

GGCCAATTAGTCCCTGTAGCCATTCAAATCAATCCCAAGGCAGGTAAGGTCAGCCCCTTGCTGACTCCTTTTGA

TGACCCTTTAACCTGGTTTTATGCTAAGTCCTGTGTACAAATTGCTGATGCTAATCATCATGAAATGAGTAGCC

ATTTATGCCGGACTCACCTGGTAATGGAACCCTTTGCTGTTGTCACCCCGCGTCAACTGGCTGAAAATCATCCT

CTGAGAATATTACTCAAACCCCATTTCCGGTTTATGTTGGCTAATAATGATTTAGCTCGCAAGCGTCTGGTTAG

TCGGGGCGGTTTTGTGGATGAATTATTAGCAGGAACTCTGCAAGAATCATTGCAAATTGTGGTAGATGCCTAT

AAAAGTTGGAGTCTGGACCAGTTTGCTCTACCCAGGGAACTCAAAAATCGCGGTGTAGATGATGTGAAAAAC

TTGCCACATTATCCTTATCGGGATGATGGAATTTTGTTATGGAATGCGATTAATAAG TTTGTATTTAACTATTTG

CAGCTTTATTACAAGAGTCCAGCAGACTTGAAAGCAGACGGAGAACTGCAAGCTTGG GCGCGGGAATTGGTG

GCTCAGGATGGTGGTAGAGTTAAGGGTATGAGCGATCGCATTGATACCTTAGAACAA TTAGTTGAGATTGTTA

CTACTATCATATATATTTGTGGTCCGCAGCATTCGGCGGTTAATTTCTCCCAATATG AATACATGGGTTTTATTC

CTAATATGCCCCTAGCTGCTTATCAAGAAATTCAACAAAAGGGTGATATTAAAGACC GTCAAGCCCTCATAGAT

TTT CT ACCACC AGC AAAGCCCACAAAT ACCCAATT AT CAACT GT GT ACAT ACTTT C AG ACT ATCGTT AT G ACAG

ACTGGGATATTATGAAGAGGAAGAATTTACAGATCCAAATGCTGACCAAGTTGTGAA TAAATTTCAGCAAGAA

TTGAATGTGGTACAGAGAAAAATTGAATTGAATAATAAGGGACGTTTAGTAAATTAT GAATATCTCCAACCAA

GACTTATTCTCAACAGTATTAGTATTTAA

Amino acid Sequence for OBQ30848.1 - SEQ ID NO: 90

MQPFLPQNDPNPAQRQSCLEKGRKEYKFMYDFLPPMAMIKSVPPAENFSTKYIAERTLEAAELPLNMMAVKTHA

MWDPLDELQDYEDFFPVLQKPNVMKTYETDDSFAEQRLCGVNPMVLRQIKQMPANFAFTIEELQDKFGNSINLIE

RLATGNLYVADYRSLAFVQGGTYAKGKKYLPAPLAFFCWRSSGFQDRGQLVPVAIQINPKAGKVSPLLTPFDDPLT

WFYAKSCVQIADANHHEMSSHLCRTHLVMEPFAVVTPRQLAENHPLRILLKPHFRFMLANNDLARKRLVSRGGFV

DELLAGTLQESLQIVVDAYKSWSLDQFALPRELKNRGVDDVKNLPHYPYRDDGILLWNAINKFVFNYLQLYYKSPA

DLKADGELQAWARELVAQDGGRVKGMSDRIDTLEQLVEIVTTIIYICGPQHSAVNFSQYEYMGFIPNMPLAAYQEI

QQKGDIKDRQALIDFLPPAKPTNTQLSTVYILSDYRYDRLGYYEEEEFTDPNADQVVNKFQQELNVVQRKIELNNKG

RLVNYEYLQPRLILNSISI

Coding sequence for OBQ23778.1 - SEQ ID NO: 91

ATGCAGCCATTTCTACCTCAAAATGACCCAAACCCCGCACAACGCCAATCTTCTCTA GAGAAAGGCCGCAAAG

AGTATCAATTCATGTATGATTTTTTGCCGCCTATGGCGATGCTCAAAAGCGTACCTCCCGCAGAGAATTTTTCT

ACTAAGTATATTGCTGAACGGACATTAGAGGCAGCAGAACTTCCTCTAAATATGATG GCTGTTAAAACTCATG

CT AT GTGGG AT CCTTT AG AT G AATTGCAAG ATT AT G AAGACTTTTTCCCAGTTTTGCAAAAACCTAAT GT GAT G

AAAACCTATGAAACCGATGATTCCTTCGCCGAACAACGGCTTTGTGGGGTGAATCCG ATGGTTTTACGTCAAA

TT AAGCAAAT GCCAGCTAACTTT GCCTTT ACCATT G AAG AATT ACAGG ATAAGTTTGGCAATT CT ATT AATTT A

ATCGAAAGATTGGCCACAGGAAATCTATATGTCGCTGATTATAGATCCTTGGCGTTC ATTCAAGGTGGCACTT

AT G CC AA AG G AA AA AAGT ACCT ACC AG C ACCTCTG G CCTTTTT CTGTTGGCGCAGTTCGGG CTTT C AAG ATCG

AGGCCAATTAGTCCCTGTAGCCATTCAAATCAATCCCAAAGCAGGTAAAGCCAGCCC CTTGCTGACTCCTTTTG

AT G ACCCTTT AACCTGGTTTT AT GCT AAGT CCT GT GT GCAAATTGCT GAT GGTAATCAT CAT G AAAT G AGTAGC

CATTTATGTCGGACTCACTTAGTAATGGAACCCTTTGCTGTTGTCACCCCGCGTCAA CTGGCTGAAAATCATCC

TCTGAGAATATTACTCAGACCCCATTTCCGGTTTATGTTGGCTAATAATGATTTAGC CCGCAAGCGTCTGGTTA

GTAGGGGCGGTTTTGTTGATGAATTATTAGCAGGAACTCTGCAAGAATCATTGCAAA TTGTGGTAGATGCCTA

TAAAAGTTGGAGTCTGGACCAGTTTGCTCTACCCAGGGAACTCAAAAATCGCGGTGT AGATGATGTGAAAAA

CTTGCCACATTATCCTTATCGGGATGATGGAATTTTGTTATGGAATGCGATTAATAA GTTTGTATTTAACTATTT

GCAACTTTATTACAAGAGTCCAGCAGACTTGAAAGCAGACGGAGAACTACAAGCTTG GGCGCGGGAATTGGT

GGCTCAGGATGGTGGTAGAGTTAAGGGTATGAGCGATCGCATTGATACCTTAGAACA ATTAGTTGAGATTGT

TACT ACT AT CAT AT AT ATTT GTGGTCCGCAG C ATT CG G CG GTT AATTT CT CCC AAT AT G A AT AC ATG G GTTTT AT

TCCTAATATGCCCCTAGCTGCTTATCAAGCAATTCAAGAAAAGGGTGATATTAAAGA CCGTCAAGCCCTCATA

G ATTTT CT ACCACTT GCC AAACCCACAAAT ACCCAATT AT CAACT GT GT ACAT ACTTT C AG ACT ATCGTT AT G AC

AGACTGGGATATTATGAAGAGGAAGAATTTACAGATCCAAATGCTGACCAAGTTGTGAATAAATTTCAGCAA

GAATTGAGTGTGGTACAGAGAAAAATTGAATTGAATAATAAGGGACGTTTAGTAAATTACGAATATCTCCAAC

CCAGACTTATTCTCAACAGTATTAGTATTTAA

Amino acid Sequence for OBQ23778.1 - SEQ ID NO: 92

MQPFLPQNDPNPAQRQSSLEKGRKEYQFMYDFLPPMAMLKSVPPAENFSTKYIAERTLEAAELPLNMMAVKTHA

MWDPLDELQDYEDFFPVLQKPNVMKTYETDDSFAEQRLCGVNPMVLRQIKQMPANFAFTIEELQDKFGNSINLIE

RLATGNLYVADYRSLAFIQGGTYAKGKKYLPAPLAFFCWRSSGFQDRGQLVPVAIQINPKAGKASPLLTPFDDPLT

WFYAKSCVQIADGNHHEMSSHLCRTHLVMEPFAVVTPRQLAENHPLRILLRPHFRFMLANNDLARKRLVSRGGFV

DELLAGTLQESLQIVVDAYKSWSLDQFALPRELKNRGVDDVKNLPHYPYRDDGILLWNAINKFVFNYLQLYYKSPA

DLKADGELQAWARELVAQDGGRVKGMSDRIDTLEQLVEIVTTIIYICGPQHSAVNFSQYEYMGFIPNMPLAAYQAI

QEKGDIKDRQALIDFLPLAKPTNTQLSTVYILSDYRYDRLGYYEEEEFTDPNADQVVNKFQQELSVVQRKIELNNKGR

LVNYEYLQPRLILNSISI

Coding sequence for WP_015083575.1 - SEQ ID NO: 93

ATGCAGCCATTTCTACCTCAAAATGACCCGAACCCCGCACAACGCCAATCTTGTCTA GAGAAAGGCCGCAAAG

AGTATCAATTCATGTATGATTTTTTGCCGCCTATGGCGATGCTCAAAAGCGTACCTCCCGCAGAGAATTTTTCT

ACTAAGTATATTGCTGAACGGACATTAGAGGCAGCAGAACTTCCTCTAAATATGATG GCTGTTAAAACTCATG

CT AT GTGGG AT CCTTT AG AT G AATTGCAAG ATT AT G AGG ACTTTTT CCCAGTTTTGCAAAAACCT AAT GT GAT G

AAAACCTATGAAACCGATGATTCCTTCGCCGAACAACGGCTTTGTGGGGTGAATCCG ATGGTTTTACGTCAAA

TT AAGCAAAT GCCAGCTAACTTT GCCTTT ACCATT G AAG AATT ACAGG ATAAGTTTGGCAATT CT ATT AATTT A

ATCGAAAGATTGGCCACAGGAAATCTATATGTCGCTGATTATAGATCCTTGGCGTTC GTTCAAGGTGGCACTT

AT G CC AA AG G AA AA AAGT ACCT ACC AG C ACCTCTG G CCTTTTT CTGTTGGCGCAGTTCGGG CTTT C AAG ATCG

AGG CCA ATT AGTCCCTGT AG CC ATT C AA AT C A ATCCC AA AG C AG GT A AAG CC AG CCCCTT G CT A ACT CCTTTT G

AT G ACCCTTT AACCTG GTTTT ATG CT AAGTCCTGTGTG C AA ATT G CTG ATG CT AAT CAT CAT G A AAT G AG C AG C

CATTTATGCCGGACTCACCTGGTAATGGAACCCTTTGCTGTTGTCACCCCGCGTCAA CTGGCTGAAAATCATCC

TCTGAGAATATTACTCAGACCCCATTTCCGGTTTATGTTGGCTAATAATGATTTAGC TCGCAAGCGTCTGGTTA

GTAGGGGCGGTTTTGTGGATGAATTATTAGCAGGAACTCTGCAAGAATCATTGCAAA TTGTGGTAGATGCCTA

TAAAAGTTGGAGTCTAGACCAGTTTGCTCTACCTAGGGAACTCAAAAATCGCGGTGT AGATGATGTGAAAAAC

TTGCCACATTATCCTTATCGGGATGATGGAATTTTGTTATGGAATGCGATTAATAAG TTTGTATTTAACTATTTG

CAGCTTTATTATAAGAGTCCAGCAGACTTGAAAGCAGACGGAGAACTGCAAGCTTGG GCGCGGGAATTAGTG

GCTCAGGATGGTGGTAGGGTTAAGGGTATGAGCGATCGCATTGATACCCTAGAACAA TTAGTTGAGATTGTT

ACTACT AT CAT AT AT ATTT GTGGT CCGCAGCATTCGGCAGTT AATTT CT CCCAATAT G AAT ACAT GGGTTTT ATT

CCTAATATGCCCCTAGCTGCTTATCAAGAAATTCAACAAAAGGGTGATATTGAAGAC CGTCAAGCCCTCATAG

ATTTT CT ACCACCTGCC AAACCCACAAAT ACCCAATT AT CAACT GT GT ACAT ACTTT C AG ACT ATCGTT AT G ACA

GACTGGGATATTATGAAGAGGAAGAATTTGCAGATCCAAATGCTGACAAAGTTGTGA ATAAATTCCAGCAAG

AATTGAGTGTGGTACAGAGAAAAATTGAATTGAATAATAAGGGACGTTTAGTAAATT ATGAATATCTCCAACC

AAGACTCATTCTCAACAGTATTAGTATTTAA

Amino acid Sequence for WP_015083575.1 - SEQ ID NO: 94

MQPFLPQNDPNPAQRQSCLEKGRKEYQFMYDFLPPMAMLKSVPPAENFSTKYIAERTLEAAELPLNMMAVKTHA

MWDPLDELQDYEDFFPVLQKPNVMKTYETDDSFAEQRLCGVNPMVLRQIKQMPANFAFTIEELQDKFGNSINLIE

RLATGNLYVADYRSLAFVQGGTYAKGKKYLPAPLAFFCWRSSGFQDRGQLVPVAIQINPKAGKASPLLTPFDDPLT

WFYAKSCVQIADANHHEMSSHLCRTHLVMEPFAVVTPRQLAENHPLRILLRPHFRFMLANNDLARKRLVSRGGFV

DELLAGTLQESLQIVVDAYKSWSLDQFALPRELKNRGVDDVKNLPHYPYRDDGILLWNAINKFVFNYLQLYYKSPA

DLKADGELQAWARELVAQDGGRVKGMSDRIDTLEQLVEIVTTIIYICGPQHSAVNFSQYEYMGFIPNMPLAAYQEI

QQKGDIEDRQALIDFLPPAKPTNTQLSTVYILSDYRYDRLGYYEEEEFADPNADKVVNKFQQELSVVQRKIELNNKG

RLVNYEYLQPRLILNSISI

Coding sequence for WP_027404620.1 - SEQ ID NO: 95

AT G AAGCCATTTTT ACCT CAAAAT G ACCCAAATCCC AC ACAACG AC AAT CTT CCCT AG AG AAAGGT CGC AAAG

AGTATGAATTTAGGTATGATTTTTTGCCTCCTATGGCGATGCTCAAAAACGTACCTCCCTCTGAGAATTTTTCTA

CCAAGTATATTGCTGAACGGACAATAGAGACAGCAGAACTTCCTAGCAATATGATGG CTGTCAAAGCCCATGC

T AT GT GGG ACCCCTT AG AT G AATTGC AAG ACT AT G AAG ACTTTTTT CC AGTTTTGC AAAAACCT AAT GTG ATG A

AAAATTATGAAACAGATGATTCCTTCGCCGAACAACGGCTTTGTGGCGTGAATCCTG TGGTTTTACGGCAGAT

TAAGCAAATGCCCGTCAACTTTGCCTTTACCATCGAAGAATTGCAAGCTAAGTTTGG CAACTCTATTGATTTAA

GAGAAAGACTGGCAACCGGAAATCTCTATGTAGCTGATTATAGACCTTTGGCGTTCA TTCGAGGTGGCACTTT

TGCCAAAGGGAAAAAGTATTTACCAGCACCACTAGCCTTTTTCTGTTGGCGGAGTTC AGGCTTTCAAGATCGT

GGTCAATTAGTACCTATAGCGATTCAAATCAATCCTAAGGAAGGAAAAGCCAGCCCC TTGCTGACCCCTTTTG

ATG ACT CTT CTACCTG GTTTT ATGCCAAGTCCTGTGTG C AA ATT G CTG ATG CT AAT CAT CAT G A AAT G AGTAG C

CATTTATGCCGGACTCACTTTGTAATGGAACCTTTTGCTGTTGTTACCCCTCGTCAA TTAGCCCAGAACCATCCG

CTGAGAATATTACTAAAACCCCATTTCCGGTTCATGTTGGCTAACAATGATTTAGGT CGTCAGCGGTTGGTGAA

TAGAGGCGGTCCTGTTGATGAATTATTAGCGGGAACTCTGCAAGAATCACTGCAAAT TGTTCTAGACGCTTAT

ACAGATTGGAGATTGGATCAGTTTGCGCTACCAACAGAACTCAAAAATCGCGGTGTG GATGATGTGAAAAAT

TTGCCCCACTATCCTTATCGGGACGATGGGATCTTGTTGTGGAACGCGATTAACAAG TTTGTGTTTAACTATTT

GGAGCTTTACTACAAGAGTCCCGCAGACTTGACAGCAGATGTCGAACTACAAGCTTG GGCGCGGGAATTAGT

GGCTCAGGATGGTGGTAGAGTCAAGGGGATGAGCGATCGCATTGATACTTTGAAACA ATTAGTAGAGATTGT

TACT ACT ATCATTTACACTTGTGGACCCCTGCATTCTGCTGTTAATTTCCCCCAATATGAATACATG GGTTTCATT

CCCAATATGCCTCTGGCTGCTTATCAACCAATTAAAAAAGAAGGGGTTTGTACCCGC AAGGAACTGATAGATT

TTTT ACC AGCTGCCAAACCAACAAGT AGCCAATT AAC AACT GT ATT CACACT CT C AGCCT ATCGTT AT G ACAG A

CT AGG AT ATT AT G AAG AGG AAG AATTT G AAG ACCCCAATGCT G ACG AT GTT GT G AAT AAATTCCAGCAAG AAT

TGAATGTGGTGCAAAGAAAAATTGAGTTGAGCAACAAGGGACGTTTAGTAAATTACG AATACCTACAACCCA

GACTTATCCTCAACAGCATCAGTATTTAA

Amino acid Sequence for WP_027404620.1 - SEQ ID NO: 96

MKPFLPQNDPNPTQRQSSLEKGRKEYEFRYDFLPPMAMLKNVPPSENFSTKYIAERTIETAELPSNMMAVKAHAM

WDPLDELQDYEDFFPVLQKPNVMKNYETDDSFAEQRLCGVNPVVLRQIKQMPVNFAFTIEELQAKFGNSIDLRER

LATGNLYVADYRPLAFIRGGTFAKGKKYLPAPLAFFCWRSSGFQDRGQLVPIAIQINPKEGKASPLLTPFDDSSTWFY

AKSCVQIADANHHEMSSHLCRTHFVMEPFAVVTPRQLAQNHPLRILLKPHFRFMLANNDLGRQRLVNRGGPVDE

LLAGTLQESLQIVLDAYTDWRLDQFALPTELKNRGVDDVKNLPHYPYRDDGILLWNAINKFVFNYLELYYKSPADLT

ADVELQAWARELVAQDGGRVKGMSDRIDTLKQLVEIVTTIIYTCGPLHSAVNFPQYEYMGFIPNMPLAAYQPIKKE

GVCTRKELIDFLPAAKPTSSQLTTVFTLSAYRYDRLGYYEEEEFEDPNADDVVNKFQQELNVVQRKIELSNKGRLVNY

EYLQPRLILNSISI

Coding sequence for WP_114084873.1 - SEQ ID NO: 97

AT G AAACC AT ACCTTCCT CAAAAT G ATCCT G ACCCT AC AAAACGT AAAAT ATTGCT AG AG AG AAACC AAGG AG AGT AT G AATTT GATT ACG ACTTTTT AACGCCT AT GGCAATGCT AAAAAAT GT ACCTT CT AT AG AAAACTTTT C AA CT AAGT AT ATTGCT G AACGCACATT AG AG ACAGCAG AACT ACCT AT AAAT AT GTT AGCCGTT AAAACCCGTT CT TTATGGGACCCTTTAGATGAATTGCAAGACTATGAAGACTATTTTCCAGTTTTGCCTAAA CCTAATGTTATCAA AACATACCAAACTGATGACTCTTTTTGTGAACAACGGCTTTGTGGGGCAAATCCTTTTGT TTTACGTCGAATTG

AAAAG ATGCCAG AT GGCTTCGCCTTTACCATTTT AG AACT GCAAG AAAAGTTTGGT G ACT CT ATT AACTTAGTT

GACAAACTTACGAATGGAAATTTATATGTAGCTGATTATAGAGCGCTTGCGTTTGTT AAAGGAGGTACTTATG

AAAGAGGTAAGAAGTATTTACCAACCCCTATAGCTTTCTTTTGTTGGCGCAGTTCTG GTTTTAGCGATCGCGGT

CAACTAGTACCGATTGTTATCCAAATCAACCCCACAGATGGCAAACAGAGCCAGCTA ATTACGCCTTTTGATGA

CCCTTTAACCTGGTTTCATGCCAAACTTTGTGTTCAAATTGCTGATGCTAACCATCA TGAAATGAGTAGTCATCT

GTGCCGAACTCACTTTGTTATGGAACCCTTTGCTATTGTCACAGCCCGTCAACTAGC CGAGAACCATCCCCTTA

GCTTACTGCTAAAACCCCACTTCCGTTTCATGTTGGCTAATAATGACTTGGCTCGTA AGCGCCTAATTAGTAGA

GGTGGGCCTGTTGACGAATTGCTAGCCGGAACTCTGCAAGAGTCATTGCAAATTGTC GTCAACGCATATCAAG

AATGGAGCTTAGATCAGTTTTCCTTACCCACTGAACTAAAAAATCGGGGTATGGATG ACCCAAACAACCTACC

TCACTATCCCTATCGAGACGATGGCTTGCTATTGTGGAATGCAATTAAAAAGTTTGT GTCTGAATACTTGCAAA

TATACTACAAAACTCCCCAAGACTTAGCAGCAGACTTAGAATTACAAAGTTGGGCGC AGGAATTAGTTTCCCA

ATC AG G CG G G CG AGTT A AG G GT ATT AG C A AT CG C AT CG AC AC ATT AG ACC AATT AGTT GAT ATT G CT ACT G CG

GTT ATTTT C ACCTGTG G G CCG C AAC ACG CTG CTGTT AACT ACT C AC AAT AT G A AT AT AT G ACTTT CAT G CCC AAT

ATGCCTCTTGCTGCTTATAAACAAATGACATCAGAAGGCACTATTCCTGACCGTAAA AGTCTATTATCATTTCTG

CCACCGTCAAAGCAAACTGCTGACCAATTATCGATTTTATTTATCCTGTCAGCTTAC CGTTATGACAGATTAGG

GTACTATGATGATAAGTTTGTAGACCCAGAGGCTCAGGATGTTTTAGCTAAATTTCA GCAAGATTTGAACGAA

GCGGAGCGGGAAATTGAGTTGAATAACAAGAGTCGTTTAATAAATTACAACTATCTG AAACCACGGCTTGTTA

CTAATAGTATTAGCGTGTAA

Amino acid Sequence for WP_114084873.1 - SEQ ID NO: 98

MKPYLPQNDPDPTKRKILLERNQGEYEFDYDFLTPMAMLKNVPSIENFSTKYIAERTLETAELPINMLAVKTRSLWD

PLDELQDYEDYFPVLPKPNVIKTYQTDDSFCEQRLCGANPFVLRRIEKMPDGFAFTILELQEKFGDSINLVDKLTNGN

LYVADYRALAFVKGGTYERGKKYLPTPIAFFCWRSSGFSDRGQLVPIVIQINPTDGKQSQLITPFDDPLTWFHAKLCV

QIADANHHEMSSHLCRTHFVMEPFAIVTARQLAENHPLSLLLKPHFRFMLANNDLARKRLISRGGPVDELLAGTLQ

ESLQIVVNAYQEWSLDQFSLPTELKNRGMDDPNNLPHYPYRDDGLLLWNAIKKFVSEYLQIYYKTPQDLAADLELQ

SWAQELVSQSGGRVKGISNRIDTLDQLVDIATAVIFTCGPQHAAVNYSQYEYMTFMPNMPLAAYKQMTSEGTIP

DRKSLLSFLPPSKQTADQLSILFILSAYRYDRLGYYDDKFVDPEAQDVLAKFQQDLNEAEREIELNNKSRLINYNYLKP

RLVTNSISV

Coding sequence for WP_096538768.1 - SEQ ID NO: 99

ATGAAACCATACCTTCCTCAAAATGACCCCGACCCAACAAAACGCAAATCTTTCTTA GAGCGTAAGCAAGAAG AAT AT G AATT CG ATTATG ATTTTTT ACCGCCG AT GGCG ATGCTT AAAG AT GTACCTGCCGT CG AAAATTTTTCT ACAAAATATATTGCTGAACGTGCAGTAGAAACGGCAGAGCTTCCTATCAATATGTTGGCT GTTAAAACCCATA CTTT ATGGG ACCCTTTGG AT G AATTGCAAG ACT AT G AAGACT ATTTT CCAGT CTT GCCT AAACCTACT GTCATCA AAACATACCAAACTGATGACTCGTTTTGCGAACAACGGCTGTGTGGGTCAAATCCTATGG CTTTACGCCAAATT AAAG AG ATGCCTTTAG ACTTT G AGTTTACT ATT CAAG AATT ACAACG AAAATTTGGCG AAT CT AT CAATTTGGC AG AAAAACTT GCCAATGG AAATTTATATAT AACCG ATT ACAG AT CGCTTT CCTTT GTT AAAGG AGGCACTT ACG AAAGAGGTAGAAAGTATTTACCAACACCCTTAGCTTTTTTTTGTTGGCGTAGTTCTGGCT TTAGCGATCGCGGT C A ACTT GTACCT ATT G CC ATT C A ACT C AATCCCG C AG CCG GT A AAC AAAG CC AACT AAT C AC ACCTTTT G ACG A TCCTTTAGCTTGGTTTCATGCCAAACTATGCGTTCAAATCGCTGATGCTAACCATCATGA AATGAGTAGCCATC TTTGTCGAACTCACTTTGTTATGGAACCTTTCGCCATTGTCACAGCCCGTCAATTAGCTG ATAATCATCCTCTTA ATTTATTACTAAAACCGCACTTCCGTTTCATGTTGGCTAATAATGATTTGGGTCGCAAGC GCTTAGTTAATAGG GGCGGCCCTGTTGATGAATTGCTAGCTGGAACTCTGCAAGAATCACTACAAATTGTTGTT AATGCCTATAAAG AATGGAGCTTAGATAAGTTTGCCTTACCCACGGAAATCAAAAATCGTGGTGTAGACGATC CACAAAAATTACC TCACTATCCCTATCGAGATGAT GGGAT GCTATT GT GGAAT GCCATTAAAAAGTTT GT GTCT GAATACTT GAATT TATACTACAAAACTCCCGAAGATTTGACAGCAGACTTTGAATTACAAGCTTGGGCGCAGG AACTAGTTTCTCA ATCAGGCGGACGAGTTAAAGGCGTTCCCGATCGCATTGAAAAATTAGAACAATTAATTGA TATCGCTACTGCG GTAATTTTCACTTGCGGGCCGCAACACGCTGCTGTGAACTATCCACAATATGAATATATG ACTTTCATGCCGAA TATGCCCCTT GCTGGTT AT AAACAAAT G ACATCAG AAGGCACT ATTGCT G ACCGCAAAAGTCT ATT AT CATTT C TGCCACCACCG AAGCAAACTGCT G ACCAATT GTCAATTTT ATTCAT CCT CT CAGCTT ACCGTTAT G ACAG ATT AG GCTACTATGACGATAAGTTTGCAGACCCAGAAGCTGAGGATATTGTAGCTACATTTCAGC AAGATTTGAACGA GGTAGATCGAGAAATTGAGTTGAATAATAAGAGCCGTTTAATAAAGTATAACTATCTCAA ACCAAGGCTTGTT ACC A AT AGT ATT G G CAT CT AA

Amino acid Sequence for WP_096538768.1 - SEQ ID NO: 100

MKPYLPQNDPDPTKRKSFLERKQEEYEFDYDFLPPMAMLKDVPAVENFSTKYIAERAVETAELPINMLAVKTHTLW

DPLDELQDYEDYFPVLPKPTVIKTYQTDDSFCEQRLCGSNPMALRQIKEMPLDFEFTIQELQRKFGESINLAEKLANG

NLYITDYRSLSFVKGGTYERGRKYLPTPLAFFCWRSSGFSDRGQLVPIAIQLNPAAGKQSQLITPFDDPLAWFHAKLC

VQIADANHHEMSSHLCRTHFVMEPFAIVTARQLADNHPLNLLLKPHFRFMLANNDLGRKRLVNRGGPVDELLAG

TLQESLQIVVNAYKEWSLDKFALPTEIKNRGVDDPQKLPHYPYRDDGMLLWNAIKKFVSEYLNLYYKTPEDLTADFE

LQAWAQELVSQSGGRVKGVPDRIEKLEQLIDIATAVIFTCGPQHAAVNYPQYEYMTFMPNMPLAGYKQMTSEGT

IADRKSLLSFLPPPKQTADQLSILFILSAYRYDRLGYYDDKFADPEAEDIVATFQQDLNEVDREIELNNKSRLIKYNYLK

PRLVTNSIGI

Coding sequence for RCJ25669.1 - SEQ ID NO: 101

ATGAATCCATACCTTCCTCAAAATGATCCTGACCCAACAAAACGCAAGTTTTCTTTA GAGCGTAAGCTAGAAGA

ATACGAATTCGATTACAACTTTTTACCGCCGATGGCGATGCTTAAAGATGTACCTGC CGTGGAAAATTTTTCTA

CCAAGTATATTGCTGAACGTGCAGTAGAAACGGCAGAACTTCCTCTCAACATGTTGG CTGTTAAAACCCGTAG

TTTATGGGACCCTTTGGATGAATTGCAAGACTATGAAGATTATTTTCCAGTCTTGCC TAAACCTGATGTCATCA

AAACATACCAAACTGATGACTCGTTTTGCGAGCAACGGTTGTGTGGGGCAAATCCTA TGGCTTTACGCCAAAT

TAAAG AG AT GCCTTTAGGCTTT G AGTTTACT ATT CAAG AATT GCAAG AAAAGTTTGGGGAAT CT AT CAATTT G

GCAG AAAAACTTGCCAATGG AAATTT AT ATATAACT GATT AT AG ACCACTTTCATTT GTTAAAGG AGGCACTT A

CGAAAGAGGTAAAAAGTATTTACCAACACCGTTAGCTTTTTTCTGTTGGCGTAGTTC TGGTTTTAGCGATCGCG

GT C AACTT GT ACCT ATTGCCATT C AACT CAATCCCGCACTCGGCAAACAAAGT C AATT AAT C AC ACCTTTT G ACG

ATCCTTTGACTTGGTTTCATGCTAAACTATGCGTTCAAATCGCTGATGCTAACCATC ATGAAATGAGTAGCCAT

CTTT GT CG A ACT C ACTTT GTT AT G G AACCTTT CG CC ATT GTT AC AG CTCG G C AATT AG CT GAT AAT C ACCCT CTT

A AC AT ATT ACT A A AACCCC ACTTCCGTTT C ATGTTG G CT AAT AAT G ACTT G G GTCG CAAG CG CTT AGTT AAT AG

GGGCGGTCCTGTTGATGAATTGCTAGCTGGAACTCTGCAAGAATCATTACAAATTGT TGTCAATGCCTATAAA

GAATGGAGTTTAGATCAATTTGCCTTACCCACGGAAATCAAAAATCGTGGTGTGGAT AATCCAGACAACTTGC

CT CACT AT CCCTATCG AG AT G ATGGG ATGCT CTT GTGG AATGCCATT AAAAAGTT CGT GTCT G AAT ATTT G AAG

TTATACTACAAAACTCCCGAAGATTTGACAGCAGACTTTGAATTGCAAGCTTGGGCA CAGGAACTAGTTTCTCA

ATCAGGCGGACGAGTTAAAGGCGTTCCTTCGCGCATTGAAAAATTAGAACAATTAGT TGACATTACTACTGCG

GT A ATTTT C ACTT GTG G G CCG C AAC ACG CTG CTGTT AACTAT CC AC AAT AT G A AT AT AT G ACCTT C ATGCCG A A

TATGCCCCTTGCTGGTTATAAACAAATGACATCAGAAGGCACTATTCCTGACCGCAA AAGCCTATTATCATTTC

TGCCACCCCCTAAGCAAACTGCTGACCAATTGTCAATTTTATTCATCCTCTCAGCTT ACCGTTATGACAGATTAG

GCT ATT AT G ACG AT AAATTTGCAG ACT C AG AAGCT G AGCAAATTTT AGTT AC ATT CC ACC AAG ATTT G ACCG AG GT AG AGCG AG AAATT G AATT G AAT AACAAG AGCCGTTT AAT CAAGT AT G ACT AT CT C AAACC AAGGCTT GT AA CC AAT AG CAT C AG CAT CT A A

Amino acid Sequence for RCJ25669.1 - SEQ ID NO: 102

MNPYLPQNDPDPTKRKFSLERKLEEYEFDYNFLPPMAMLKDVPAVENFSTKYIAERAVETAELPLNMLAVKTRSLW

DPLDELQDYEDYFPVLPKPDVIKTYQTDDSFCEQRLCGANPMALRQIKEMPLGFEFTIQELQEKFGESINLAEKLAN

GNLYITDYRPLSFVKGGTYERGKKYLPTPLAFFCWRSSGFSDRGQLVPIAIQLNPALGKQSQLITPFDDPLTWFHAKL

CVQIADANHHEMSSHLCRTHFVMEPFAIVTARQLADNHPLNILLKPHFRFMLANNDLGRKRLVNRGGPVDELLAG

TLQESLQIVVNAYKEWSLDQFALPTEIKNRGVDNPDNLPHYPYRDDGMLLWNAIKKFVSEYLKLYYKTPEDLTADFE

LQAWAQELVSQSGGRVKGVPSRIEKLEQLVDITTAVIFTCGPQHAAVNYPQYEYMTFMPNMPLAGYKQMTSEGT

IPDRKSLLSFLPPPKQTADQLSILFILSAYRYDRLGYYDDKFADSEAEQILVTFHQDLTEVEREIELNNKSRLIKYDYLKP

RLVTNSISI

Coding sequence for WP_017318478.1 - SEQ ID NO: 103

ATGAAACCCAACTTACCGCAACACGAGCCAAATCCCGAAGCTCGGAGAAATTGGCTA GAACAAAACCGAGAA

GATT AT AAATTCG ACCAT AATT AT CTGGCTCCCAT ACC AAT ACTT GAT AAGGT GCCT CAT CAAG AACT CTT CTCG

CCGAAATATACTGCTAAACGCTTAGCAAGTATGGCGAATCTCGTACCTAATATGCTT GCTGCCAAAGCCAGAA

ATTT CTT CG AT CCGCTGG AT G AATT AG AAG AAT AT G AAG ACCTTTT GCCG AT ATT ACCAAAG CCCT CT GT CAT A

AAAAATTATAAAACAGACTCGTGTTTCGCCGAGCAAAGACTCTCTGGGGCAAACCCG ATGGCAATGCACAGG

ATTGACGCGCTCCCGGAAAATTTCCCTGTCACAAACGACCACTTTCAAAAAGCCGTA GGTGCAGCTCACGATC

TGGAGGCGGCACTCAAAGAAGGCAAACTCTATTTATTAGATTATCCTTTGCTATTTG ACATTAAAGGCGGTACC

T ACC AAA AC ATT A AA AAGT AT CTTCCC A AG CCG C AG G CT CT ATTTT ACT G G C AA AGC AAT G G C A AT AA AA AT A

GTGGTTCTCTGATGCCTATTGCCATTCAGCTCCATAATGATACTGACGGAGATAGCC TAATTTACACACCAGAT

GACCCCCATTTAGATTGGTTTTTGGCAAAAACTTGCGTACAAATGGCTGATGGGAAC CATCAGGAATTGGGCA

GTCATTTTGCACGAACTCATGCAGTTATGGGTCCGTTTGCAGTCGTCACGGCTCGAC AACTCGGAGAAAACCA

TCCCCTCTCCTTACTCCTGAGACCCCACTTCCGGTTCATGCTCTATGATAACGATTT GGGGCGTACTCACTTTTT

ACAACCAGGAGGTCCAGTTGATGAATTTATGGCAGGTACGTTGCAGGAGTCTCTTGG TTTCGTTGGCAAAGCC

TACGAAGAATGGAGTTTAGACAATGCTGTCTTCGCGACGGAAATAAAAAATCGCAAA ATGGATGATCCAGAA

ATTTTGCCGCACTATCCTTTCCGGGATGACGGGATGTTAGTCTGGGATGCGGTCAAA AAGTTTGTCACTGAAT

ACATCCAACTCTATTACAAAACTCCCCAAGACTTGAGTGAGGATTATGAATTGCAAA ATTGGGCGAGAGAATT

GGCTGCCCAAGATGGTGGTCGTGTTAAGGGGATGCCAGAGAAAATTGAGACCATAGA GCAACTCATTGACAT

TGTGACTGTAGTCGTCTTCACCTGCGCTCCTCTCCACTCGGCTTTGAATTTTTCCCA GTACGAATACATGGCTTT

TGTACCCAATATGCCGTATGCAGCCTACCACCCTGTTCCAGAAACAAAGGGTGTGGA TATGCAAACGATCATG

AAGATGCTTCCACCCTTTAAGCACGCTGCCGATCAGGTGATGTGGTCGGATATTTTG ACATCCTTCCATTACGA

CAAATTGGGTCACT AT GAT G AAG AATTTGCCG ACCCAATTGCTCAGG AAATTCTT GT GCAGTTT CAACAAAATT

T ACAT G AAGTGG AACG ACAAAT AG AAATT AAAAACCAAT CTCGT CC AAT ACCTT AT AACT ACCT C AAGCCTT CT

GAAATTATTAATAGCATCAATACTTGA

Amino acid Sequence for WP_017318478.1 - SEQ ID NO: 104

MKPNLPQHEPNPEARRNWLEQNREDYKFDHNYLAPIPILDKVPHQELFSPKYTAKRLASMANLVPNMLAAKARN

FFDPLDELEEYEDLLPILPKPSVIKNYKTDSCFAEQRLSGANPMAMHRIDALPENFPVTNDHFQKAVGAAHDLEAAL

KEGKLYLLDYPLLFDIKGGTYQNIKKYLPKPQALFYWQSNGNKNSGSLMPIAIQLHNDTDGDSLIYTPDDPHLDWFL

AKTCVQMADGNHQELGSHFARTHAVMGPFAVVTARQLGENHPLSLLLRPHFRFMLYDNDLGRTHFLQPGGPVD

EFMAGTLQESLGFVGKAYEEWSLDNAVFATEIKNRKMDDPEILPHYPFRDDGMLVWDAVKKFVTEYIQLYYKTPQ

DLSEDYELQNWARELAAQDGGRVKGMPEKIETIEQLIDIVTVVVFTCAPLHSALNFSQYEYMAFVPNMPYAAYHP

VPETKGVDMQTIMKMLPPFKHAADQVMWSDILTSFHYDKLGHYDEEFADPIAQEILVQFQQNLHEVERQIEIKNQ

SRPIPYNYLKPSEIINSINT

Coding sequence for KJH71567.1 - SEQ ID NO: 105

ATGATAAAACCATATTTACCTCAACACGAGCCTGATGCGATCGCGCGGCAAAATCGC TTAATCAAAAACCGCG

CT GATT AT GTT CTCG ACT AT AACT AT CTGCCACCT ATT CCTTT GCAAACTCCT GTTCCT CAAC AAG AACGTTTTT C

TGCTGAATACACTGCAAGGCGTTTAGCTAGTTTTGCTAATCTCGTCCCCAATATGTT GATGGCGAGGGCGAGA

AATGCTTTCGATCCTTTAGATACGTTAGAGGAATACGCGGACTTATTACCAGTCTTA CCAAAACCTAATGTCAT

C A AAA ATT ATC AAG C AG ATT G GT GTTTT G CCG A AC AA AG ATT ATCTG GT ATT AACCCGCCAGCTATCCGCCGCA

TAGATGCTTTGCCAGAAAATTTGCCCATCTCTAACTCTTCGTTTCAACACTCTGTAG GTGCAGAACATAATCTG

GAACAAGCACTCAAAGAAGGTAAGTTGTATTGTTTAGACTACCCGTTGTTATCTGGT ATTGGAGGCGGTAATT

ACCAGAATTTACCTAAATATCTGCCCAAACCGCAAGCGCTCTTTTATTGGCGTAGTG ATAATAGCAAAATCGGC

GGCTCTTTAGTTCCGGTAGCGATTAAAATTCTCAATGAATTGGGAGGGAAAAATTTA GTCTATACGCCCAATG

ATGCACCTCTCGACTGGTTTCTTGCCAAAACCTGCGTGCAAATGGCAGATGCAAACC ATCAGGAATTAGGCAC

TCATTTTGCTAAAACTCATGCTGTTATGGCTCCTATTGCGGCAATTACAGCTAGGGA ATTAGGCGAAAACCATC

CTTTAACTTTGCTGCTAAAACCTCATTTCCGGTTCATGCTGTTTGATAATGAGTTAG GACGCACGCAGTTTTTGC

AACCTACTGGTCCTACTGAAGAACTGCTAGCTGGAACGCTGGAAGAATCTGTGCAAT TGGTCGTGCAAGCTTA

TGAGGAATGGAGTATAGATACTACTTTTCCTTTAGAATTGCAGCAACGGCAAATGCA TGACCCAGAGATTTTA

CCTCATTACCCGTTCCGAGATGATGGCATATTAGTCTGGAATGCTATACATCAGTTT GTTACTGAATATTTGCA

GATTTACTACCACACTCCGCAAGATATCAGTGCAGACTACGAGGTGCAAAATTGGGC TAGGGAATTGGTAGA

TAGCGGTCGAGTTAAAGGAATGCCAGAGAGCATTGATACTCTAGCACAACTAATTGA CATTATCGCTGTAGTC

ATCTTTACCTGCGCTCCTCTGCATTCTTGCTTGAATTTAGCCCAGTACGAATACATG ACTTTCGTGCCAAATATG

CCTTATGCAGCCTACCACCCTATTCCCACTACTAAGGGCGTAGATATGGCAACTATT GTCAAAATTATGCCGCC

TTTTCAAAGAGCGATCGATCAAATATTGTGGACGGATATTTTGAGCGCTTTCCAATA TGACAAGTTGGGTTTTT

ATGAGGAAGATTTTGCCGATCCCAAGGCTCAGGAAGTGCTACAGCGCTTTCAAGATA ACTTGCAGCAGGTAG

AAG AAAAG AT AG AAAT GCAC AAT C AG ATT CGCCCAAT ACCTT AC AACT ACCT CAAG CCTT CT CGG ATT AT G AAC

AGCATTAATACTTAA

Amino acid Sequence for KJH71567.1 - SEQ ID NO: 106

MIKPYLPQHEPDAIARQNRLIKNRADYVLDYNYLPPIPLQTPVPQQERFSAEYTARRLASFANLVPNMLMARARNA

FDPLDTLEEYADLLPVLPKPNVIKNYQADWCFAEQRLSGINPPAIRRIDALPENLPISNSSFQHSVGAEHNLEQALKE

GKLYCLDYPLLSGIGGGNYQNLPKYLPKPQALFYWRSDNSKIGGSLVPVAIKILNELGGKNLVYTPNDAPLDWFLAK

TCVQMADANHQELGTHFAKTHAVMAPIAAITARELGENHPLTLLLKPHFRFMLFDNELGRTQFLQPTGPTEELLA

GTLEESVQLVVQAYEEWSIDTTFPLELQQRQMHDPEILPHYPFRDDGILVWNAIHQFVTEYLQIYYHTPQDISADYE

VQNWARELVDSGRVKGMPESIDTLAQLIDIIAVVIFTCAPLHSCLNLAQYEYMTFVPNMPYAAYHPIPTTKGVDMA

TIVKIMPPFQRAIDQILWTDILSAFQYDKLGFYEEDFADPKAQEVLQRFQDNLQQVEEKIEMHNQIRPIPYNYLKPSR

IMNSINT

Coding sequence for WP_017327314.1 - SEQ ID NO: 107

ATGAATACTGCTGTCAGACCTTCATTGCCACAAAAGGATCCTAACTCCAACAAGCGCAATGATTATTTAGAGC

GCAACCGAGAGGATTATCAATTCGATCGCAGCCTATTACCCCCTCTCCCCTTCATGCAGAAGGTTCCAAAACGG

GAATATTTTTCACCCGAATATACCGCGAAACGGCTCGCCAGTATGGCTAACCTGCCTGCTAATATGCTAGCTGC

TAAAGCTAAGCGCTTTCTCGATCCCCTCGATAGCCTGGAAGAATACGAGGAGCTGATTCCTCTGCTATCTAAAC

CCAATCTGCTGAAGAACTATCGCACTGACGAATTTTTTGGGGAGCAGCGACTGTCGGGAGCCAACGCCATGG

CAACGCGCCGACTGGCAAAACTTCCCAGTGATTTTGCTGTGGATAATGCTCTGTTTC AGCAGGTGTTGGAGAC

CGATGGAACTCTCGACGCAGCCTTAGCTGAAGGTAGACTTTATTTTCTGGAACATCC CTATCTCAATCGCATCA

AAGGAGGGGAATCGGAGTACGGTCGCAAATACATGCCCAAAACGCGATCGCTGTTCT ATTGGAAAAGTGACG

ACTCTCCAGTGGGGGGTGCTCTTTTGCCAGTGGCGATCGAACTCAAAAGCGAAGCCA CGAATACCCCGATTGT

CTATACTCCCAAAGATGCCCCCCTCGATTGGCTGTTTGCCAAACTCTGCGTCCAAGT CGCCGACGCCAACCATC

AAGAATTAGGCTCCCACTTTGCCTTCACCCACACCGCCATGGGGCCGTTTGCCATGG TTACTGCTCGGCAATTG

GCTGAAAACCATCCCGTGTCGCTGTTATTAGAACCTCACTTCCAGTTCATGCTGTTT GATAACGATTTGGGGCG

GGCACAGTTTCTCAACCCCGGCGGTCCAGTCGATCGCTTTTTGGCTGGAACTCTCGA AGAAACCCTTACTTTTG

TGGTCGACACCCTCGATCGTTGGAGTATTGATACCTTTGACTTCCCATCGATTATCG AGCGCCAAAACATGGAT

GACCCAGAGGTGCTGCCCCACTATCCCTTTAGAGATGACGGCATGTTGATTTGGGAT GCTGTGAAGGAATTTA

TTACCAATTACCTCAGCATCTATTACAAAACCCCTGAGGATATTAGGGAGGACTACG AACTACAAAATTGGGC

GAAAGAATTAGCAGCATTTGATAGCGGTCGAGTCAAGGGAATGCCCGAAACTATTGA GTCATTGCAGCAGCT

GAT CG AT AT CCT GT CT GT CGT G ATTTT CACCT GTGCT CCCCT GCATT CT AACTT G AACTT C ACT C AAT ACG AAT A

CATGATCTTCGTTCCCAATATGCCTTACGCCGCATATCATCCGGTACCAGAGCAGAA GGGGATCGATATGGAA

ACCATTCTGAAGTTTCTACCCCCCTACAAACAAGCGGCCGATCAAGTGTATTGGACG ATGGTCTTGACCTCTTA

CCATCACGACAAGCTAGGCTTTTACGAAGATGATTTTGCCGATCCTCTAGCCCAAGA TGCCCTCGTTCAATTCC

AGCAAAACCTAGCGGATATCGAACGCAAGATCGAGATTGAAAATCAACATCGTCCGG TCCCCTATCAGTATTT

CTTGCCATCTGAAATTATTAACAGCATTAATACTTGA

Amino acid Sequence for WP_017327314.1 - SEQ ID NO: 108

MNTAVRPSLPQKDPNSNKRNDYLERNREDYQFDRSLLPPLPFMQKVPKREYFSPEYTAKRLASMANLPANMLAA

KAKRFLDPLDSLEEYEELIPLLSKPNLLKNYRTDEFFGEQRLSGANAMATRRLAKLPSDFAVDNALFQQVLETDGTLD

AALAEGRLYFLEHPYLNRIKGGESEYGRKYMPKTRSLFYWKSDDSPVGGALLPVAIELKSEATNTPIVYTPKDAPLDW

LFAKLCVQVADANHQELGSHFAFTHTAMGPFAMVTARQLAENHPVSLLLEPHFQFMLFDNDLGRAQFLNPGGP

VDRFLAGTLEETLTFVVDTLDRWSIDTFDFPSIIERQNMDDPEVLPHYPFRDDGMLIWDAVKEFITNYLSIYYKTPEDI

REDYELQNWAKELAAFDSGRVKGMPETIESLQQLIDILSVVIFTCAPLHSNLNFTQYEYMIFVPNMPYAAYHPVPEQ

KGIDMETILKFLPPYKQAADQVYWTMVLTSYHHDKLGFYEDDFADPLAQDALVQFQQNLADIERKIEIENQHRPVP

YQYFLPSEIINSINT

Coding sequence for WP_100898502.1 - SEQ ID NO: 109

ATGAAACCTTACTTACCGCAGAACGATCCAAATGGTAATTATCGAGCAAGTTGGCTG GATAAAAATAGAGAA G AGT ACAATTTT AATT AT GATT AT CTGGCTCCTTT ACCAGT AATT GAT AAAGTGCCT C ACAAGG AAAT ATT CT C A G C AG A AT AT ACT G CT A A ACG CTT GGCAAGTATGG C A ACT CTT G C ACC A AAT ATGTTG GCTG CT A AAG CC AG A A ATTT CTT AG ACCCGCT AG AT G AGTTGG AAG AAT AT G AAG AACTTTTGGCACT ACT ACCAAAACCCG AT GT CAT AAA AAATT AT AAAACAGACTCGTGTTTTGCTGAACAACGACTTTCGGGGGCAAACCCATTAGCTATCCGA AGA ATT AAT GT ATT ACCT GAT AATTTTGCT GT AACT GATT ACCATTTT C AG AAG ATTGC AGGTGCAG AATTT ACTTT G GAAAAGGCACTCAAGGAAGGCAAGCTGTATTTCTTAGATTACCCTTTGCTATCTGATATT CAAGGTGGTGTCTA T AAT AAT GTT AA A AAGT ACCTT CCC A AG CCG C A AG CT CT ATTTT ACT G G C A A AGT AAT GAT AGTTTT AAT G GTG GTTCTCTAGTGCCTGTTGCTATCCAGATTAATCATGACTCTGGCGCAAATAGCCTGTATA CACCAGATGACCCC CATTTAGATTGGTTTTTGGCAAAAACCTGCGTCCAAATTGCTGATGGCAACCACCAAGAA TTGGGTAGTCATTT TTCCTAT ACCC ATG C AGTT ATG G CT CCGTTT G C AATT GT AACT G CG CG G C AATT AG C AG AA AAT CAT CCC AT CG CCTT ACT GTT AAAACCT CACTT CCGTTT CATGCT ATTT G ATAACG ATTTGGG ACGCACTCAGTTTTT ACAGCCT G GTGGACCGGTTGATGAGTTTATGGCAGGTTCATTAGCAGAATCTGTTGGATTTGTGGCGA AAACTTATGAAGA ATGG AGT GT AG AAAAGTTT ACCTTCCCT CGGTT AAT AAAAAGCCGT CAAACAG AT G ACCCAG AAATTTT G CCG

CACTTTCCTTTCCGGGACGATGGAATATTAATCTGGAATGCCATCGAAAAGTTTGTG GCTGAATACTTGCAACT

CTATTATAAGACTTCACAGGATCTCAGCGATGACTATGAATTGCAAAATTGGGCTAG GGAATTAGTCGCCCAA

GATGGTGGTAGAGTCAAGGGAATGCCAGCCAAGATTGAGACTTTAGAACAACTGATT GAAATCATTAGTGTA

GTAGTCTTCACTTGCGCTCCTCTCCACTCTGCTTTGAATTTTTCTCAGTACGAATAT ATGGCTTTTGTGCCCAATA

TGCCTTATGCAGCCTACCACCCAATTCCAGAAACTAAGGGTGTGGATTTGGAAACTA TTATGAAAATACTTCCT

CCCTTT AAACAAGCTGCCG AT CAGGTAAT GT GG ACT G AG ATTTT G ACAT CGTT CCATT AT G ACAAATT AGGTTT

TTATGATGAGGAGTTTGCTGATCCATTGGCGCAGGAAATTGTGGTGCAATTCCAACA TAATCTCCATCAAATA

G AACGGCAAAT AG AC AT CAG AAAT C AAACTCGT CCCAT ACCTT ACAATT ACCTT AAACCTT CGC AAATT ATT AA

TAGCATCAATACTTAA

Amino acid Sequence for WP_100898502.1 - SEQ ID NO: 110

MKPYLPQNDPNGNYRASWLDKNREEYNFNYDYLAPLPVIDKVPHKEIFSAEYTAKRLASMATLAPNMLAAKARNF

LDPLDELEEYEELLALLPKPDVIKNYKTDSCFAEQRLSGANPLAIRRINVLPDNFAVTDYHFQKIAGAEFTLEKALKEGK

LYFLDYPLLSDIQGGVYNNVKKYLPKPQALFYWQSNDSFNGGSLVPVAIQINHDSGANSLYTPDDPHLDWFLAKTC

VQIADGNHQELGSHFSYTHAVMAPFAIVTARQLAENHPIALLLKPHFRFMLFDNDLGRTQFLQPGGPVDEFMAGS

LAESVGFVAKTYEEWSVEKFTFPRLIKSRQTDDPEILPHFPFRDDGILIWNAIEKFVAEYLQLYYKTSQDLSDDYELQN

WARELVAQDGGRVKGMPAKIETLEQLIEIISVVVFTCAPLHSALNFSQYEYMAFVPNMPYAAYHPIPETKGVDLETI

MKILPPFKQAADQVMWTEILTSFHYDKLGFYDEEFADPLAQEIVVQFQHNLHQIERQIDIRNQTRPIPYNYLKPSQII

NSINT

Coding sequence for RCJ35150.1 - SEQ ID NO: 111

ATGGTGAAACCATATTTACCACAAAAAGATCCTGATGTTAATGTCCGAATCAATTGG CTAGATAAAAATCGAG

AAGAGTACAAATTTAATTACGATTATCTAGCTCCTCTACCAGTAATTGATAAAGTTCCTCATAAGGAAATATTCT

CGGCGGAATATACTGCTAAACGTTTGGCAAGTATGGCAACTCTTGCACCAAATATGCTAGCTGCCAAAGCCAG

AAATTTCTTAGACCCATTGAATGAATTGGAAGAATATGAAGAACTTTTGTCACTCCT ACCAAAACCTGATGTTA

TAAAAAATTACAAAACAGACTCTTGTTTTGCAGAACAACGCCTCTCTGGAGCAAACC CATTAGCTATCCAAAAA

ATTGATGTATTACCTGATAATTTTGCTGTCACAGATGCACATTTTCAGAAAGTAGCA GGTACAGAATTTACTTT

AGAAAAGGCACTTAAGGAAGGCAAGCTGTATTTCTTAGATTATCCTTTGTTATCTGATATTCAAGGTGGTATCT

ACGAGAATGTTAAAAAGTACCTTCCCAAGCCACAAGCTCTATTTTATTGGCAAAGTAATGATAGTTCTAATGGT

GGTTCTCTAGTACCTGTTGCCATTCAGATTAATCATGACTCTGGTGCAAAAAGCGTG ATTTATACACCAGATGA

TCCCCATTTAGATTGGTTTTTGGCAAAAACCTGCGTTCAAATTGCTGATGGCAACCA TCAAGAGTTGGGTAGTC

ATTTCGCCTATACCCATGCAGTTATGGCTCCGTTTGCAATTGTAACTGCGCGGCAAC TAGCAGAAAATCATCCC

ATCGCTTTACTGTTAAAACCCCATTTCCGTTTCATGCTATTTGATAACGATTTGGGG CGCACTCAGTTTTTACAA

CCTGGAGGCCCGGTTGATGAGTTTATGGCAGGTTCATTGGCGGAGTCTCTTGGATTT GTGGCGAAAGTTTATG

AAGAATGGAGTGTTGAAAAATTTACCTTTCCTCGGTTAATAAAAAGTCGTCGAACGG ATGACCCAGAAATTTT

ACCGCACTTTCCTTTTCGGGATGATGGCATATTAATCTGGAATGCCGTCGAAAAGTT TGTGTATGAATATTTGC

AACTCTATTACAAAACCTCACAGGATCTAATTGATGACTATGAGTTGCAAAATTGGG CTAGAGAATTAGTTGC

CCAAGATGGTGGTAAAGTCAAGGGAATGCCAGCGAAGATTGAGACTCTAGAACAACT AATCGAAATCATCAG

TGTGGTAGTATTCACTTGCGCTCCTCTACACTCTGCTTTGAATTTTTCTCAGTACGAATATATGGCTTTTGTACCC

AATATGCCCTATGCAGCCTACCACCCAATTCCAGAAACTAAAGGTGTGGACTTGGAA ACTATCATGAAGATAC

TTCCTCCCTTTAAACAAGCTGCCGATCAGGTGATGTGGACTGAGATTTTAACATCGTACCACTATGATAAATTG

GGTTTTTATGATGAGGAGTTTGCTGATCCGTTGGCGCAGGAAATTGTGGTGCAATTC CAACAGAATTTGCATG A AAT AG AACG G C AA AT AG AT ATT A AA AAT C AA ACT CGTCCC AT ACCTT AC AACT ACTT C AAG CCTT CG C AA ATT ATT A AC AG C ATT A AT ACTT G A

Amino acid Sequence for RCJ35150.1 - SEQ ID NO: 112

MVKPYLPQKDPDVNVRINWLDKNREEYKFNYDYLAPLPVIDKVPHKEIFSAEYTAKRLASMATLAPNMLAAKARN

FLDPLNELEEYEELLSLLPKPDVIKNYKTDSCFAEQRLSGANPLAIQKIDVLPDNFAVTDAHFQKVAGTEFTLEKALKE

GKLYFLDYPLLSDIQGGIYENVKKYLPKPQALFYWQSNDSSNGGSLVPVAIQINHDSGAKSVIYTPDDPHLDWFLAK

TCVQIADGNHQELGSHFAYTHAVMAPFAIVTARQLAENHPIALLLKPHFRFMLFDNDLGRTQFLQPGGPVDEFM

AGSLAESLGFVAKVYEEWSVEKFTFPRLIKSRRTDDPEILPHFPFRDDGILIWNAVEKFVYEYLQLYYKTSQDLIDDYEL

QNWARELVAQDGGKVKGMPAKIETLEQLIEIISVVVFTCAPLHSALNFSQYEYMAFVPNMPYAAYHPIPETKGVDL

ETIMKILPPFKQAADQVMWTEILTSYHYDKLGFYDEEFADPLAQEIVVQFQQNLHEIERQIDIKNQTRPIPYNYFKPS

QIINSINT

Coding sequence for WP_094352972.1 - SEQ ID NO: 113

ATGAAACCATATTTACCACAAAAAGATCCTGATGTTAATGTCCGAATCAATTGGCTAGATAGAAATCGAGAAG

AGTACAAATTTAATTACGATTATCTAGCTCCTCTACCAGTCATTGATAAAGTTCCTC ATAAGGAAATCTTCTCGG

CAGAATATACTGCTAAACGTTTGGCAAGTATGGCAAGTCTTGCACCAAATATGCTAGCTGCTAAAGCCAGAAA

CTTCTTAGACCCATTAGATGAATTGGAAGAATACGAAGAACTTTTGTCACTCCTACCAAAACCCGATGTCATAA

AAAATTACAAAACAGACTCTTGTTTTGCGGAACAACGACTCTCTGGAGCGAACCCAT TAGCTATCCAAAAAATT

GATGTATTACCTGATAATTTTGCTGTCACAGATGCACATTTTCAGAAGGTTGCAGGT ACAGAATTTACTTTGCA

AAAAGCACTCAAGGAAGGCAAGCTGTATTTCTTAGATTATCCTTTATTATCTGATAT TAAAGGTGGTGTCTACG

ATAATGTTAAAAAGTACCTTCCCAAGCCACAAGCTCTATTTTACTGGCAAAGTAATGATAGTTCTAATGGTGGT

TCTCTAGTGCCTGTTGCCATCCAGATTAATCATGACTCTGGTGGAAAAAGCGTGATT TATACACCAGATGACCC

CCATTTAGATTGGTTTTTGGCAAAAACCTGCGTTCAAATTGCTGATGGCAACCATCA AGAATTGGGTAGTCATT

TCGCCTATACCCATGCAGTTATGGCTCCGTTCGCGATTGTAACTGCGCGGCAACTAG CAGAAAATCATCCCATC

GCTTTACTGTTAAAACCCCACTTCCGTTTTATGCTATTTGATAACGATTTGGGGCGCACTCAGTTTTTACAACCT

GGAGGCCCGGTTGATCAGTTTATGGCAGGTTCATTGGCGGAGTCTCTTGGATTTGTA GCGAAGGTTTATGAA

GAATGGAGTGTTGAAAAATTTACCTTCCCTCGGTTAATAAAAAGTCGCCGAACCGAT AACCCAGAAATTTTAC

CGCACTTTCCTTTCCGGGACGATGGCATATTAATTTGGAATGCCGTCGAAAAGTTTG TGGCTGAATACTTGCAA

CTCTATTACAAAACCTCACAAGATATCAGTGACGACTATGAGTTGCAAAATTGGGCTAGAGAATTAGTAGCTC

AAGATGGTGGTAAAGTCAAGGGAATGCCAGCCAAGATTGAGACTCTAGAACAACTGA TTGAAATCATCAGTG

TGGTAGTATTCACTTGCGCTCCTCTACATTCTGCTTTGAATTTTTCTCAGTACGAATATATGGCTTTTGTACCCAA

TATGCCCTATGCAGCCTACCACCCAATTCCAGAAACTAAAGGTGTGGACTTGGAAAC TATCATGAAGATACTTC

CTCCTTTTAAACAAGCTGCCGATCAGGTGATGTGGACTGAGATTTTAACATCGTACC ACTATGACAAATTGGGT

TTTTATGATGAGGAGTTTGCCGATTCATTGGCGCAGGAAATTGTGGTGCAATTCCAA CAAAATTTGCATGAAA

TAGAACGGCAAATAGACATTAGAAATCAAACTCGTCCCATACCTTACAACTACTTCA AGCCTTCGGAAATTATT

AACAGCATTAATACTTGA

Amino acid Sequence for WP_094352972.1 - SEQ ID NO: 114

MKPYLPQKDPDVNVRINWLDRNREEYKFNYDYLAPLPVIDKVPHKEIFSAEYTAKRLASMASLAPNMLAAKARNFL

DPLDELEEYEELLSLLPKPDVIKNYKTDSCFAEQRLSGANPLAIQKIDVLPDNFAVTDAHFQKVAGTEFTLQKALKEGK

LYFLDYPLLSDIKGGVYDNVKKYLPKPQALFYWQSNDSSNGGSLVPVAIQINHDSGGKSVIYTPDDPHLDWFLAKTC

VQIADGNHQELGSHFAYTHAVMAPFAIVTARQLAENHPIALLLKPHFRFMLFDNDLGRTQFLQPGGPVDQFMAG

SLAESLGFVAKVYEEWSVEKFTFPRLIKSRRTDNPEILPHFPFRDDGILIWNAVEKFVAEYLQLYYKTSQDISDDYELQ

NWARELVAQDGGKVKGMPAKIETLEQLIEIISVVVFTCAPLHSALNFSQYEYMAFVPNMPYAAYHPIPETKGVDLE

TIMKILPPFKQAADQVMWTEILTSYHYDKLGFYDEEFADSLAQEIVVQFQQNLHEIERQIDIRNQTRPIPYNYFKPSEI

INSINT

Coding sequence for WP_104909167.1 - SEQ ID NO: 115

ATGAAACCATACTTACCACAAAAAGATCCTGATGTTAATGTCCGAATCAATTGGCTAGATAAAAATCGAGAAG

AGTACAAATTTAATTACAATTATCTAGCTCCTCTACCAATTATTGATAAAGTTCCTCATAAGGAAATATTCTCGG

CGGAATATACTGCTAAACGTTTGGCAAGTATGGCAACTCTTGCACCAAATATGCTAGCTGCTAAAGCCAGAAA

CTTCTTAGACCCATTAGATGAATTGGAAGAATATGAAGAACTTTTATCACTACTACC AAAACCCGATGTTATAA

AGAATTACAAAACAGACTCTTGTTTTGCGGAACAAAGACTCTCTGGAGCGAACCCAC TAGCTATCCAAAGAAT

TGATGTATTACCTGATAATTTTGCTGTCACAGATTCCCATTTTCAGAAGGTTGCAGG TACAAAATTGACGTTGG

AAAAGGCACTCAAGGAAGGCAAGCTGTATTTCTTAGATTACCCTCTGTTATCTGATA TTCAAGGTGGTGTCTAC

GATAATATTCAAAAGTACCTTCCCAAGCCACAAGCTCTATTTTATTGGCAAAGTAATGATAGTTCTAATGGTGG

TTCTCTAGTGCCTGTTGCCATCCAGATTAATCATGACTCTGGTGCAAAAAGCGTGAT TTATACACCAGATGACC

CCCATTTAGATTGGTTTTTGGCAAAAACCTGCGTTCAAATTGCTGATGGCAACCATC AAGAATTGGGTAGTCAT

TTTGCCTATACCCATGCAGTTATGGCTCCGTTTGCAATTGTAACTGCGCGGCAACTA GCAGAAAATCATCCCAT

CGCCTTACTGTTAAAACCTCACTTCCGTTTTATGCTATTTGATAACGATTTGGGACG CACTCAGTTTTTACAGCC

GGGAGGCCCGGTTGATGAGTTTATGGCAGGCTCATTGGCAGAGTCTCTTGGCTTTGT GGCGAAGGTTTATGA

AGAATGGAGTGTTGAAAAGTTTACCTTCCCTCGGTTAATAAAAAGTCGCCGAACGGATGACCCAGAAATTTTA

CCGCACTTTCCTTTCCGGGACGATGGCATATTAATTTGGAATGCTGTCGAAAAGTTT GTGGCTGAATACTTGCA

ACTCTATTACAAAACCTCACAAGAGTTAATTGATGACTATGAGTTGCAAAATTGGGC TAGAGAATTAGTGGCC

CAAGATGGTGGTAAAGTCAAGGGAATGCCAGACAAGATTGAGACCTTAGAACAACTG ATTGAAATCATCAGT

GTGGTAGTATTCACTTGCGCTCCTCTACACTCTGCTTTGAATTTTTCTCAGTACGAA TATATGGCTTTTGTACCC

AATATGCCCTATGCAGCCTACCACCCAATTCCAGAAATTAAAGGTGTGGACTTGGAA ACTATTATGAAGATAC

TTCCTCCCTTTAAACAAGCTGCTGACCAAGTAATGTGGACTGAGATTTTAACATCGTACCACTATGACAAATTG

GGTTTTTATGATGAGGAGTTTGCCGATCCATTGGCGCAGGAAATTGTGGTGCAATTC CAACAGAATTTACATG

AAATAGAACGGCAAATAGACATTAGAAATCAAACTCGTCCCATACCTTACAACTACTTCAAGCCTTCGCAAATT

ATTAACAGTATCAATACTTGA

Amino acid Sequence for WP_104909167.1 - SEQ ID NO: 116

MKPYLPQKDPDVNVRINWLDKNREEYKFNYNYLAPLPIIDKVPHKEIFSAEYTAKRLASMATLAPNMLAAKARNFL

DPLDELEEYEELLSLLPKPDVIKNYKTDSCFAEQRLSGANPLAIQRIDVLPDNFAVTDSHFQKVAGTKLTLEKALKEGK

LYFLDYPLLSDIQGGVYDNIQKYLPKPQALFYWQSNDSSNGGSLVPVAIQINHDSGAKSVIYTPDDPHLDWFLAKTC

VQIADGNHQELGSHFAYTHAVMAPFAIVTARQLAENHPIALLLKPHFRFMLFDNDLGRTQFLQPGGPVDEFMAG

SLAESLGFVAKVYEEWSVEKFTFPRLIKSRRTDDPEILPHFPFRDDGILIWNAVEKFVAEYLQLYYKTSQELIDDYELQ

NWARELVAQDGGKVKGMPDKIETLEQLIEIISVVVFTCAPLHSALNFSQYEYMAFVPNMPYAAYHPIPEIKGVDLET

IMKILPPFKQAADQVMWTEILTSYHYDKLGFYDEEFADPLAQEIVVQFQQNLHEIERQIDIRNQTRPIPYNYFKPSQI

INSINT

Coding sequence for WP_106217928.1 - SEQ ID NO: 117

ATGAACGTAATTCAGCCGTCATCAGCGCAAATAGAGCGAGAAACGCGCCAGTTTTTACCAGATCGCGACCAGTATAAGTTTGACTACGATTTTCTCAAACCGCTAGCTCTGCTTCAACCCGTTGTTCCAGCCTTGCCGACTCCACCAGGCTACCCTCGCGTGCCTGGGTCTTCTACCTTTTCACCTTACTATGTATTCACGCGGTCGTCACTGCCTAACACCCTCGACCCCTTTGATGGACTGCAAGCCTTTGATGATTTTTTCCCCGCGCAGGGGAAGCCAGAAGTCAGTAAGATTTATCAAAGCGATCGCTCTTTTGCCGAGCAGAGATTATCTGGTGTGAATCCGATGGTACTTCATCGGATTGTGC

AGATTCCGCCTCAATCTTCTGTGACTTATGAAGAACTCCAGCTCGCTTGCCCCCATC TGCGGCTAGATATGGCA

TTAGCCAATGGCAATATTTATGTTGCCGATTACAGTGGACTCGGCTTTGTACAAGGT GGAACTTTTAAAGACCT

GAAAAAGTATTTACCCACCCCAGTTGCATTTTTCTACTTTGATGAAACTCAACAAGA ATTAATCCCGATTGCAAT

TCAAGTACAGCCCAAACCAGGTGGAGCGATTTTCACTCCGCAAGATACACCGCTAGA TTGGCTGGTAGCCAAG

ATGTGCGTTCAAATAGCAGATGCTAACCACCACGAGATGGGTGCTCATTTGTGCTGG ACGCATTTTGTGATGG

AACCTTTTGCCATTTCTACACCTCGGCAACTAGCCATCAATCATCCAGTGCATTTAC TGCTAGCGCCTCATCTGC

GCTTCCTGTTGGCAATTAACGATCAAGGCAGACAACTGCTAGTCAATCCCTACGTCG ATGGTCAAGTGGGTGG

TCACGTCGATCGAATTATGGCAGGCACGTTAGAGGAATCCTTGGAAATTGTGAAGCA CACCTATTCTGAATGG

AGTTTAGACAAGTTTGCTTTCCCGCAAGAAATACAGAATCGCGGATTGGAGGATGCG AACAAACTGCCGCACT

TCCCTTATCGAGATGATGGTCTGTTGCTCTGGAATGCCATTCATAAGTTTGTTTCCG GTTATCTCAAATATTGCT

ATCCCACACCCGCTGATATTCAAGCAGATCGTGAATTACAAGCTTGGGCGCAGGAAC TAGCCTCGCCAGATGG

TGGACGGGTCAAAGGAATGCCTTGTTCGTTCTCGACGGTAGAGCAACTGATTGAGGT GATTGCCAACGTGATT

TTTACCTGTGGACCGCAGCACGCAGCCGTGAACTATTCACAATTCGACTACATGGCA TACATTCCGAATATGCC

CCATGCTGCCTATGTCAATATCACTGGTAAAGGCATGATTCCAGATGAGAAAGCCCT GATGAAGTTCTTACCA

CCAAGGGATCAGGCAGAAGCTCAAATCAAAATTGTCACTTACCTGTCTTTCTATCGG CACGATCGCCTCGGCTA

TTACGATCGAGCGTTTAACCTTACCTTCCGCGAAACTCCAGTCAAGATGATGGTTCA GCAGTTCCAACAGGAG

TTGAATGAGATCGAGCAGCGGATTGATACCAGGAATCGGCAAAGGTTTGTACCTTAT CCTTATCTCAAGCCTT

CCTTAGTTCCAAATAGCTTTAGTGCTTGA

Amino acid Sequence for WP_106217928.1 - SEQ ID NO: 118

MNVIQPSSAQIERETRQFLPDRDQYKFDYDFLKPLALLQPVVPALPTPPGYPRVPGSSTFSPYYVFTRSSLPNTLDPF

DGLQAFDDFFPAQGKPEVSKIYQSDRSFAEQRLSGVNPMVLHRIVQIPPQSSVTYEELQLACPHLRLDMALANGNI

YVADYSGLGFVQGGTFKDLKKYLPTPVAFFYFDETQQELIPIAIQVQPKPGGAIFTPQDTPLDWLVAKMCVQIADA

NHHEMGAHLCWTHFVMEPFAISTPRQLAINHPVHLLLAPHLRFLLAINDQGRQLLVNPYVDGQVGGHVDRIMAG

TLEESLEIVKHTYSEWSLDKFAFPQEIQNRGLEDANKLPHFPYRDDGLLLWNAIHKFVSGYLKYCYPTPADIQADREL

QAWAQELASPDGGRVKGMPCSFSTVEQLIEVIANVIFTCGPQHAAVNYSQFDYMAYIPNMPHAAYVNITGKGMI

PDEKALMKFLPPRDQAEAQIKIVTYLSFYRHDRLGYYDRAFNLTFRETPVKMMVQQFQQELNEIEQRIDTRNRQRF

VPYPYLKPSLVPNSFSA

Coding sequence for WP_019498926.1 - SEQ ID NO: 119

ATGAACGCGTATAACTTAGATCTGGATCCGACCTATATCAAATACAAAACTATTCTC ACTGAAAACCGCAACGA

ATATGAATTCGATCTTAGCGATCGCGACCTCGCACCCATACCGATGCTGAAGGGAAA CCTGCCGCGCTCGGAA

AACTTTTCCATCGATTACCTGGGTAGGGTAGCGGCTCCAATGGCTAAGCTGGCAGCAAATACCCTGGCGGTCA

AACTAAAATCTGCTTGGGATCCGCTTGACGAACTGCAAGACTATGAAGATTTCTTTC AGGTTCTGGAGAAACC

CAAAGTCATCTCTACCTACCAAAGCGATAAAGCCTTTGCCGAACAAAGACTGTCCGG CCCTAATCCCCTGGTAC

TCAAGCGAGTTGATGACTTAGCTCAATATTTTCAGAGCAGCGATATTGCCGAAATAG AAACCAAACTAGGCGA

CTCCATAGATTTGACAGATAACCTGTACGTTGCCGACTACACCGAACTGCTGCCCAT TCCCAGCGGCACCTTCG

ATCGCGGGCGTACCTATTTACCCAGACCGATCGCTTTGTTTAGCTGGCGCAGTGAGG CATCTAGCGATCGCGG

TCAGCTCGTGCCCGTAGCAATTAAACTCGACGTGCCGCTCAAAGATAAAACCATCCT TACGCCCGAGGATGAA

TCGCTGGACTGGCTCTATGCCAAAACCTGCGTGCAGATTGCCGATGGCAACTATCAC GAACTAATGAGCCACC

TCTGCCGCACGCATTTTGTGATGGAACCCTTTGCGATCGCCACCGGACAGCATTTGC CCGAAACCCATCATCTC

GGAGCGCTCTTGAGGCAGCATTTTAAATTTATGCTGGCGTTAAGTAAGTTTGCCCGC AAAACCCTGATTGCCA

GCGGTGGTTCGATCGATCGCATCTTGGCAGGAGAACTATCCGGTTCCCTAGAGATCA TCAGGCAAGCCTTTAG AACCTGGCGGTTCGATAGTTTTTCTTTCCCGCAAGCGATCGCGGCACGCGGTATGGACGA TGCCCAAAAGCTG

CCTCACTACCCCTATCGCGATGATGGCAAGCTGGTTTGGGATGCAATTTGGCAATTTGTTTCAGCTTATTTGGG

GCTTCACTACCACACTGCCGATAGTATTAGCAGCGATCGGGCGTTGCAAGACTGGGC GCAAAAACTCCATCTC

GTGTTTAGCATAGCTGGCGGTGATGGCAAAGGGATGCCTGCACAAATAGATACGCTG GAGCAATTAGTGGAA

GTTGTGACTACGATTGTCTTCACCTGCGGGCCGCAACACGCGGCGGTCAATTTCCCT CAATACGAGTACATGA

CCTTTGCACCTAATATGCCGCTATCCTCTTATCGCGAGTTTGCCGGAGCAGCGGAGT TTACTCAAAAGGATTTC

ATGCGATTCCTACCGCCATCCCAACAAGCCGCCGGACAGCTCTCGACTACTTTTCTA CTGTCTTCATTCCGCTAC

GATCGGTTGGGGCATTACGATCCATCTTTCTTCGAGGCCTTTGCCGATGGTATGCAG GACAAAGTCAAAACTG

TAGTAACGGCTTTTCAGCAGCAATTGGATGTGGTAGAGGCTGAAATCGATCGCCGCA ACCAAAACCGGACAG

TTCCCTATCCCTATCTCAAACCATCGCTTATTCCTAACAGCATTAGCATCTAA

Amino acid Sequence for WP_019498926.1 - SEQ ID NO: 120

MNAYNLDLDPTYIKYKTILTENRNEYEFDLSDRDLAPIPMLKGNLPRSENFSIDYLGRVAAPMAKLAANTLAVKLKSA

WDPLDELQDYEDFFQVLEKPKVISTYQSDKAFAEQRLSGPNPLVLKRVDDLAQYFQSSDIAEIETKLGDSIDLTDNLY

VADYTELLPIPSGTFDRGRTYLPRPIALFSWRSEASSDRGQLVPVAIKLDVPLKDKTILTPEDESLDWLYAKTCVQIAD

GNYHELMSHLCRTHFVMEPFAIATGQHLPETHHLGALLRQHFKFMLALSKFARKTLIASGGSIDRILAGELSGSLEIIR

QAFRTWRFDSFSFPQAIAARGMDDAQKLPHYPYRDDGKLVWDAIWQFVSAYLGLHYHTADSISSDRALQDWAQ

KLHLVFSIAGGDGKGMPAQIDTLEQLVEVVTTIVFTCGPQHAAVNFPQYEYMTFAPNMPLSSYREFAGAAEFTQK

DFMRFLPPSQQAAGQLSTTFLLSSFRYDRLGHYDPSFFEAFADGMQDKVKTVVTAFQQQLDVVEAEIDRRNQNRT

VPYPYLKPSLIPNSISI

Coding sequence for WP_103124384.1 - SEQ ID NO: 121

ATGAAACCATATTTACCCCAGGTAGATCCTAATCCTAACATCCGCAAAGATGAGCTAGTAAAAAATCAAGCAG

ATTATAAATTTAATCACAATTATCTAGCTCCTATTCCCGTTATAGATAAAGTCCCTCACCAAGAATTATTCTCCG

CAGAATATACGGCTAAACGCCTCGCTAGTATGGCAAATTTAGCACCAAATATGCTGG CTGCCAAAGCAAGAAA

TTTCCTTGACCCTTTAGATGAATTAGAAGAATACGAAGAACTATTAACGCTGCTACC TAAACCAGCAGTGATGA

ACAATTATAAAACAGACTCATGTTTTGCCGAGCAAAGATTATCAGGTGCGAACCCTT TAGCTATTCAAAGAATT

GAGAATTTACCAGAAAATATTGGAGTAACTAACGCACATTTTCAAAAAGCTGTCGGCACAGAAAGTAGTTTAG

AAGCGGCTCTCAAAGAAGGTAAACTTTATCTATTAGACTATCCCACACTCTTTGATA TTAAAGGTGGTACCTCT

CAAAACCTGAGAAAGTATTTACCTAAGCCGCAAGCTTTATTTTACTGGCAGAGCAAC GGTTTACCAAATGGTG

GTTCCTTGCGTCCAGTAGCAATTAAATTAAATAATGATGCTGGGACAGATGGATTGA TTTACACTCCTGATGAC

CCTTATCTAGATTGGTTTTTAGCAAAAACCTCTGTGCAGATTGCTGACGGAAACCAT CAAGAATTAGGTAGTCA

TTTTGCTTATACTCATGCTGTTATGGCTCCTTTTTGTATTGCCACAGCACGCCAATT AGCAGCCAATCATCCCAT

TGCTTTACTACTAAAACCGCACTTCCGGTTTATGTTATTTGATAACGATTTAGGACG CACTCACTTTTTACAGCC

AGGTGGGCCAGTCGATGAATTTATGGCTGGTTCTTTGCAAGAGTCTTTAACTTTTGT CGTGAAAACTTATCAAG

AGTGGAGTGTCGAGAAATTTGTCTTCCCGACATTAATGAGAAATCAAAATATGGATG ATCCAGAAATATTACC

GCATTTTCCCTTTCGAGATGATGGAATATTAATTTGGGATGCCATTCAAAAATTTGT TACAGACTATCTGCAACT

TTATTACCAAACTTCCCAAGATTTGAGCGAAGATTATGAATTACAAAATTGGGCAAG GGAATTAGTTGCTCAA

GATGGTGGTCGCGTTAAAGGAATGCCAGAAAAAATTGAAACCATAGACCAATTAATT CAAATTATCACGGTTG

TAATTTTCACTTGCGCTCCTTTCCACTCTGCTTTAAATTTTTCTCAGTACGAGTATATGGCTTTCGTACCGAATAT

GCCCTATGCAGCTTATCATCCAACGCCAGAAAAAAAGGGCGTGGATATGCAAACTAT TATGAAGATATTACCA

CCTTTCAAGCAAGCTGCTGATCAAGTAATGTGGACACATATTTTAACATCGTACCACCACGACAAATTGGGGT

ATTACGATGAAGAATTTTCTGACCCATTGGCACAGGAATTAGTGATGCAATTCCAAC AGAATTTGCATGATATA G AACG AAAAATT GAT ATT AG AAAT C AAACCCGTCCT AT ACCTT AT AATT ACCT C AAACCTT CGC AAATT ATT AAC AGTATCAATACTTGA

Amino acid Sequence for WP_103124384.1 - SEQ ID NO: 122

MKPYLPQVDPNPNIRKDELVKNQADYKFNHNYLAPIPVIDKVPHQELFSAEYTAKRLASMANLAPNMLAAKARNF

LDPLDELEEYEELLTLLPKPAVMNNYKTDSCFAEQRLSGANPLAIQRIENLPENIGVTNAHFQKAVGTESSLEAALKE

GKLYLLDYPTLFDIKGGTSQNLRKYLPKPQALFYWQSNGLPNGGSLRPVAIKLNNDAGTDGLIYTPDDPYLDWFLAK

TSVQIADGNHQELGSHFAYTHAVMAPFCIATARQLAANHPIALLLKPHFRFMLFDNDLGRTHFLQPGGPVDEFMA

GSLQESLTFVVKTYQEWSVEKFVFPTLMRNQNMDDPEILPHFPFRDDGILIWDAIQKFVTDYLQLYYQTSQDLSED

YELQNWARELVAQDGGRVKGMPEKIETIDQLIQIITVVIFTCAPFHSALNFSQYEYMAFVPNMPYAAYHPTPEKKG

VDMQTIMKILPPFKQAADQVMWTHILTSYHHDKLGYYDEEFSDPLAQELVMQFQQNLHDIERKIDIRNQTRPIPY

NYLKPSQIINSINT

Coding sequence for BBD59026.1 - SEQ ID NO: 123

ATGAAACCATATTTACCCCAGGTAGATCCTAATCCTAACATCCGCAAAGATGAGCTAGTCAAAAACCAAACAG

ATTATAAATTTAATCACAATTATCTAGCTCCTATTCCCGTTATAGATAAAGTCCCTCACCAAGAATTATTCTCCG

CAGAATATACGGCTAAACGCCTCGCTAGTATGGCAAATTTAGCACCAAATATGCTGG CTGCCAAAGCAAGAAA

TTTCCTTGACCCTTTAGATGAATTAGAAGAATACGAAGAACTATTAACGCTGCTACC TAAACCAGCAGTGATGA

ACAATTATAAAACAGACTCATGTTTTGCCGAGCAAAGATTATCAGGTGCGAACCCTT TAGCTATTCAAAGAATT

GATAGTTTACCAGAAAAGCTTGGAATAACAAACGCCCATTTTCAAAAATCTGTCGGG ACAGAAAGTAGTTTAG

AAGCGGCTCTCAAAGAAGGTAAACTTTATTTATTAGACTATCCCACACTCTTTGATA TTAAAGGTGGTATTTCTC

AAAACCTGAGAAAGTATTTACCTAAGCCGCAAGCTTTATTTTACTGGCAGAGCAACG GTTTACCAAATGGTGG

TTCCTTGCGTCCAGTAGCAATTAAATTAAATAATGATCCTGGGACAGATGGATTGATTTACACTCCTGATGATC

CTTATCTAGATTGGTTTTTAGCAAAAACCTCTGTGCAGATTGCTGACGGAAACCATCAAGAATTAGGTAGTCAT

TTTGCTTATACTCATGCTGTTATGGCTCCTTTTTGTATTGCCACAGCACGCCAATTAGCAGCCAATCATCCCATT

GCTTTACTACTAAAACCGCACTTCCGGTTTATGTTATTTGATAACGATTTAGGACGC ACTCACTTTTTACAGCCA

GGTGGGCCAGTCGATGAATTTATGGCTGGTTCTTTGCAAGAGTCTTTAACTTTTGTC GTGAAAACTTATCAAGA

GTGGAGTGTCGAGAAATTTGTCTTCCCGACATTAATGAGAAATCAAAATATGGATGA TCCAGAAATATTACCG

CATTTTCCCTTTCGAGATGATGGAATATTAATTTGGGATGCCATTCAAAAATTTGTT ACAGACTATCTGCAACTT

TATTACCAAACTTCCCAAGATTTGAGCGAAGATTATGAATTACAAAATTGGGCAAGG GAATTAGTTGCTCAAG

ATGGTGGTCGCGTTAAAGGAATGCCAGAAAAAATTGAAACCGTAGACCAATTAATTCAAATTATCACGGTTGT

AATTTTCACCTGCGCTCCTTTCCACTCTGCTTTAAATTTTTCTCAGTACGAGTATATGGCTTTCGTACCGAATATG

CCCTATGCAGCTTATCATCCAACGCCAGAAAAAAAGGGCGTGGATATGCAAACGATT ATGAAGATATTACCAC

CTTTCAAGCAAGCTGCTGATCAAGTAATGTGGACACATATTTTAACATCGTACCACC ACGACAAATTGGGGTAT

TACGATGAAGAATTTGCTGACCCATTGGCACAGGAATTAGTGGTGCAATTCCAACAG AATTTGCATGATATAG

AACGAAAAATTGATATTAGAAATCAAACTCGTCCTATACCTTATGATTACCTCAAACCTTCGCAAATTATTAACA

GTATCAATACTTGA

Amino acid Sequence for BBD59026.1 - SEQ ID NO: 124

MKPYLPQVDPNPNIRKDELVKNQTDYKFNHNYLAPIPVIDKVPHQELFSAEYTAKRLASMANLAPNMLAAKARNF

LDPLDELEEYEELLTLLPKPAVMNNYKTDSCFAEQRLSGANPLAIQRIDSLPEKLGITNAHFQKSVGTESSLEAALKEG

KLYLLDYPTLFDIKGGISQNLRKYLPKPQALFYWQSNGLPNGGSLRPVAIKLNNDPGTDGLIYTPDDPYLDWFLAKTS

VQIADGNHQELGSHFAYTHAVMAPFCIATARQLAANHPIALLLKPHFRFMLFDNDLGRTHFLQPGGPVDEFMAG

SLQESLTFVVKTYQEWSVEKFVFPTLMRNQNMDDPEILPHFPFRDDGILIWDAIQKFVTDYLQLYYQTSQDLSEDYE

LQNWARELVAQDGGRVKGMPEKIETVDQLIQIITVVIFTCAPFHSALNFSQYEYMAFVPNMPYAAYHPTPEKKGV

DMQTIMKILPPFKQAADQVMWTHILTSYHHDKLGYYDEEFADPLAQELVVQFQQNLHDIERKIDIRNQTRPIPYDY

LKPSQIINSINT

Coding sequence for WP_096579406.1 - SEQ ID NO: 125

ATGAAACCATATTTACCCCAGGTAGATCCTAATCCTAACATCCGCAAAGATGAGCTATTCAAAAACCAAACAG

ATTATAAATTTAATCACAATTATCTAGCTCCTATTCCCGTTATAGATAAAGTCCCTCACCAAGAATTATTCTCCG

CAGAATATACGGCTAAACGCCTCGCTAGTATGGCAAATTTAGCACCAAATATGCTGG CTGCCAAAGCGAGAAA

TTTCCTTGACCCTTTAGATGAATTAGAAGAATACGAAGAACTATTAACGCTGCTACC TAAACCAGCAGTGATGA

ACAATTATAAAACAGACTCATGTTTTGCCGAGCAAAGATTATCAGGTGCGAACCCTT TAGCTATTCAAAGAATT

GAGAATTTACCAGAAAATATTGGAGTAACTAACGCACATTTTCAAAAAGCTGTCGGCACAGAAAGTAGTTTAG

AAGCGGCTCTCAAAGAAGGTAAACTTTATTTATTAGACTATCCCACACTCTTTGATA TTAAAGGTGGTATTTCTC

AAAACCTGAGAAAGTATTTACCTAAGCCGCAAGCTTTATTTTACTGGCAGAGCAACG GTTTACCAAATGGTGG

TTCCTTGCGTCCAGTAGCAATTAAATTAAATAATGATGCTGGGACAGATGGATTGAT TTACACTCCTGATGACC

CTTATCTAGATTGGTTTTTAGCAAAAACCTCTGTGCAGATTGCTGACGGAAACCATCAAGAATTAGGTAGTCAT

TTTGCTTATACTCATGCTGTTATGGCTCCTTTTTGTATTGCCACAGCACGCCAATTAGCAGCCAATCATCCCATT

GCTTTACTACTAAAACCGCACTTCCGGTTTATGTTATTTGATAACGATTTAGGACGC ACTCACTTTTTACAGCCA

GGTGGGCCAGTCGATGAATTTATGGCTGGTTCTTTGCAAGAGTCTTTAACTTTTGTC GTGAAAACTTATCAAGA

GTGGAGTGTCGAGAAATTTGTCTTCCCGACATTAATGAAAAATCAAAATATGGATGA TCCAGAAATATTACCG

CATTTTCCCTTTCGAGATGATGGAATATTAATTTGGGATGCCATTCAAAAATTTGTT ACAGAATATCTGCAACTT

TATTACCAAACTTCCCAAGATTTGAGCGAAGATTATGAATTACAAAATTGGGCAAGG GAATTAGTTGCTCAAG

ATGGTGGTCGCGTTCAAGGAATGCCAGAAAAAATTGAAGCCGTAGACCAATTAATTC AAATTATCACGGTTGT

AATTTTCACCTGCGCTCCTTTCCACTCTGCTTTAAATTTTTCTCAGTACGAGTATATGGCTTTCGTACCGAATATG

CCCTATGCAGCTTATCATCCAACGCCAGAAAAAAAGGGCGTGGATATGCAAACTATTATGAAGATATTACCAC

CTTTCAAACAAGCTGCTGATCAAGTAATGTGGACACATATTTTAACATCGTACCACC ACGACAAATTGGGGTAT

TACGATGAAGAATTTGCTGACCCATTGGCACAGGAATTAGTGGTGCAATTCCAACAG AATTTGCATGATATAG

AACGAAAAATTGATATTAGAAATCAAACTCGTCCTATACCTTATAATTACCTCAAACCTTCGCAAATTATTAACA

GTATCAATACTTGA

Amino acid Sequence for WP_096579406.1 - SEQ ID NO: 126

MKPYLPQVDPNPNIRKDELFKNQTDYKFNHNYLAPIPVIDKVPHQELFSAEYTAKRLASMANLAPNMLAAKARNFL

DPLDELEEYEELLTLLPKPAVMNNYKTDSCFAEQRLSGANPLAIQRIENLPENIGVTNAHFQKAVGTESSLEAALKEG

KLYLLDYPTLFDIKGGISQNLRKYLPKPQALFYWQSNGLPNGGSLRPVAIKLNNDAGTDGLIYTPDDPYLDWFLAKTS

VQIADGNHQELGSHFAYTHAVMAPFCIATARQLAANHPIALLLKPHFRFMLFDNDLGRTHFLQPGGPVDEFMAG

SLQESLTFVVKTYQEWSVEKFVFPTLMKNQNMDDPEILPHFPFRDDGILIWDAIQKFVTEYLQLYYQTSQDLSEDYE

LQNWARELVAQDGGRVQGMPEKIEAVDQLIQIITVVIFTCAPFHSALNFSQYEYMAFVPNMPYAAYHPTPEKKGV

DMQTIMKILPPFKQAADQVMWTHILTSYHHDKLGYYDEEFADPLAQELVVQFQQNLHDIERKIDIRNQTRPIPYNY

LKPSQIINSINT

Coding sequence for WP_019504688.1 - SEQ ID NO: 127

AT G AAG AAT AAGTCAAAAACAAAT GT CGG AG AAAAAATGGCT ATTTTTTCTCCCGCATTAAGCG AAG ACG AAT T AGCACAACGC ACT CAAT ACTT AAAATTT CAACAAC AGG AAT AT G AGTTT ACT CAT G AAT ACGT AG AAGGT CT A AGTTT ATTT AAAGAAGTTCCTGTTCAAGAAGGCTTTTCAACTGCTTATCTTGCCGATAGAGAATTCCAG CTATC AGCGATATCAATCAATATGTTAGCAGTCGAACCACGTCCTTTTCTTGACCCTTTGGAAAC ATTAGGAGATTACG AAAATTTTT AT AAG ATT AT CCG AAAACCTGGT GTT GCCAACATTT AT C AAAC AG AT CGTGCTTTTGCCG AAC AA

AGATTGTCTGGGGTTAATCCCTTGGTCATTAAAAAATTTACCGAAATGCCTGCTGGTGTTGATATTTCTTTACA

AGATTTAGGTCAAGAAACTCAAGTTTTATTCAGCTCCAGCGCAACTAATTTGCAAGC AGAAATTCAACGAGGA

CATATCTTCGTTGCCGACTATACAGAAAGTTTGTCTTTTGTTGAAGGTGGAACTTAC GAAAAAGGACGTAAGT

ATTTACCAAAACCAATCGCTTTTTTCTGGTGGCGTAAAGATGGCATTAAAGATCGCG GTGAATTAGTCCCCATT

GCTATTGCGATCGAGTTAAATACTGCGGATAAAAAATGGAAAATCTTGATACCCAGG GACAAAGATTTGCACT

GGACAGCTGCCAAACTTTGCGTGCAAATTGCTGATGCCAATCATCATGAAATGAGTA CTCATTTAGGGCGTAC

GCATCTTGTAATGGAACCTTTTGCGGTCAGTACTGCCAGACAATTAGCTAAAAATCATCCTTTAGGATTGCTTT

TGCGCCAACACTTTCGCTTTATGATAGCGATTAATGATATGGCTCGCAGAGAGTTGA TTAATCCAGGTGGTTTT

GTAGAAGCAGCACTTGCAGGAACATTGCCAGAATCTCTACGAATTGTTAAAAATGCT TGTGTTAGTTGGAATA

TTAAAGATTTTGCCTTTCCCACGGAGCTCAAAAATCGTGGTATGGATGAAAAAGACG ATCGAGATAATTACAA

ATTACCCCACTATCCCTACCGCGATGATGGTTTAATGCTTTGGAATGCGATCGAGGA TTTTGTAACTGGTTATC

TTAAGATCTTTTATCCCAAACCTGAGGATATTCAAAGCGATCGAGAATTACAACAAT GGGCAGCAGAATTAGC

ATCTGCCGATGGTGGAAAAGTTGCCAAAATGCCCGAAAAAATTAGTGATATTGAGGAACTAATCGAAATTATT

ACCACTATTATTTTTATTTGTGGTCCTCAACATTCGGCGGTGAATTTTCCCCAATATGAATATATTGGTTTTATAC

CTAATATGCCTCTAGCTGCTTATCAAGAAATTACTGGAGCAGAAGATCAATTTAAAG AGGAACGAGATCTGCT

ACAACTTTTACCTCCTCTAAAACAAACAGCGACTCAATTACTGACGATGTATAACCTTTCAACTTATCATTACGA

TCGCCTGGGTTATTATGACGAAGAGTTTGAAAATACGGTTAAAGGTACAGACATTGA ACCGATAGTTGCCAAA

TTCAAACAAGATTTGAATCAAATAGAAGTAGAGATTGATAATAAGAATAAAGATCGT ACTATTCCCTATCCGTT

TCTAAAGCCTTCCTTAGTTTTAAACAGTATTTGTATCTAA

Amino acid Sequence for WP_019504688.1 - SEQ ID NO: 128

MKNKSKTNVGEKMAIFSPALSEDELAQRTQYLKFQQQEYEFTHEYVEGLSLFKEVPVQEGFSTAYLADREFQLSAISI

NMLAVEPRPFLDPLETLGDYENFYKIIRKPGVANIYQTDRAFAEQRLSGVNPLVIKKFTEMPAGVDISLQDLGQETQ

VLFSSSATNLQAEIQRGHIFVADYTESLSFVEGGTYEKGRKYLPKPIAFFWWRKDGIKDRGELVPIAIAIELNTADKK

WKILIPRDKDLHWTAAKLCVQIADANHHEMSTHLGRTHLVMEPFAVSTARQLAKNHPLGLLLRQHFRFMIAIND

MARRELINPGGFVEAALAGTLPESLRIVKNACVSWNIKDFAFPTELKNRGMDEKDDRDNYKLPHYPYRDDGLML

WNAIEDFVTGYLKIFYPKPEDIQSDRELQQWAAELASADGGKVAKMPEKISDIEELIEIITTIIFICGPQHSAVNFPQYE

YIGFIPNMPLAAYQEITGAEDQFKEERDLLQLLPPLKQTATQLLTMYNLSTYHYDRLGYYDEEFENTVKGTDIEPIVA

KFKQDLNQIEVEIDNKNKDRTIPYPFLKPSLVLNSICI

Coding sequence for OCQ98836.1 - SEQ ID NO: 129

AT G AAACC AT ACTT ACCCCAGGT AG ACCCT AACCC AAACATT CGT AAAG AT G AGCT AGT AAAAAAT CG AG AAG ATT AT AAATTT AAT CAT GATT ACCT AGCTCCT ATT CCT GTT ATT GAT AAAGTCCCCC AT AAAG AACT CTT CT CGG CAGAAT AT ACAGCTAAACGCCTCGCAAGTATGGCT AATTT AGCACCAAATATGTTAGCCGCCAAAGCCAGAAA TTTTCTT GACCCTTTAGAT G AATTAG AAG AATACG AAG AACT GTT G AC ACT GCTACCTAAACCAGCAGTAATGA ATAATTATAAAACCGATTCATGTTTTGCCGAGCAAAGATTATCAGGTGCGAACCCTTTAG CAATACGCAGAATT GATAGTTTACCAGCAAATCTCGGTATCACCAACGCCCATTTTCAAAAATCTGTCGGCACA GAAAGTAACTTAGA AG CG G CTCT C AAAG A AG GT AA ACTTT AT CT ATT AG ATT AT CCT AC ACT CTTT GAT ATT AAAG GT G G A ACTT CT C AAA AT GT GAG AA AGT ATTT ACCT AAG CCTC AAG CTTT ATTTT ACTG G C AG AG C A AT G GTGT AG C A AAT G GTG G TTCTCTCCGTCCAGTGGCGATTAAATTAAATAATGATGCTGGTACAGATGGATTGATTTA CACTCCCGATGACC CTT ATTT AGATTGGTTTTTAGCAAAAACTTCTGTGCAGATAGCTGACGGAAATCATCAAGAATTAGG TAGTCAT TTTGCATATACTCATGCTGTTATGGCTCCATTTTGTATCGCCACAGCACGCCAATTAGCA GCAAATCATCCCATC GCTTTACTACTAAGACCGCACTTCCGGTTCATGTTATTTGATAACGATTTAGGACGCACT CATTTTCTACAACCA GGTGGCCCAGTCGATGAATTTATGGCTGGTTCTTTAGAAGAATCATTAACTTTTGTCGTC AAAACTTACCAAGA

ATGGAGTGTTGATAAATTTGTCTTCCCGACATTAATGAAAAGTCAAAACATGGATGACCCAGATATATTACCG

CATTTTCCGTTCCGGGATGATGGTATATTGATTTGGAATGCCATTCATAAATTTGTC ACAGATTATTTGCAACTT

TATTACAAAACACCTCAAGACTTAAGCGAAGATTATGAATTGCAAAATTGGGCAAGA GAATTAGTTGCTCAAG

ATGGTGGACGGGTTAAAGGAATGCCAGAGAAAATTGAAACTATCGACCAATTAATTC AAGTTATTACGGTTAT

AGTTTTTACCTGCGCTCCTTTCCATTCGGCTTTAAATTTTGCCCAGTACGAATACAT GGCTTTCGTGCCGAATAT

GCCTTATGCAGCTTATCATCCAACTCCCGAAAGTAAGGGTGTGGATATGCAAACCAT CATGAAACTATTGCCA

CCATTCAAGCAAGCTGCTGACCAAGTAATGTGGACACATATTTTAACATCTTACCATTACGATAAATTGGGTTA

TTACGATGAAGAATTTGCCGACCCATTGGCACAGGAATTAGTTGTACAGTTCCAACA GAATTTACATGATATA

GAACGACAAATTGATATTAGAAATCAAACTCGTCCTATACCTTATAATTTCCTCAAACCTTCCCAAATTATTAAC

AGTATCAATACTTAA

Amino acid Sequence for OCQ98836.1 - SEQ ID NO: 130

MKPYLPQVDPNPNIRKDELVKNREDYKFNHDYLAPIPVIDKVPHKELFSAEYTAKRLASMANLAPNMLAAKARNFL

DPLDELEEYEELLTLLPKPAVMNNYKTDSCFAEQRLSGANPLAIRRIDSLPANLGITNAHFQKSVGTESNLEAALKEG

KLYLLDYPTLFDIKGGTSQNVRKYLPKPQALFYWQSNGVANGGSLRPVAIKLNNDAGTDGLIYTPDDPYLDWFLAK

TSVQIADGNHQELGSHFAYTHAVMAPFCIATARQLAANHPIALLLRPHFRFMLFDNDLGRTHFLQPGGPVDEFMA

GSLEESLTFVVKTYQEWSVDKFVFPTLMKSQNMDDPDILPHFPFRDDGILIWNAIHKFVTDYLQLYYKTPQDLSEDY

ELQNWARELVAQDGGRVKGMPEKIETIDQLIQVITVIVFTCAPFHSALNFAQYEYMAFVPNMPYAAYHPTPESKG

VDMQTIMKLLPPFKQAADQVMWTHILTSYHYDKLGYYDEEFADPLAQELVVQFQQNLHDIERQIDIRNQTRPIPY

NFLKPSQIINSINT

Coding sequence for WP_062293357.1 - SEQ ID NO: 131

AT G AAACC AT ACTT ACCCCAGGT AG ACCCT AACCC AAACATCCGT AAAG AT G AGCT AGT AAAAAAT CG AG AAG ATT AT AAATTT AAT CAT GATT ATTT AGCTCCT ATT CCT GTT ATT GAT AAAGTCCCCC AT CAAG AACT ATTTT CGGC AGAATATACAGCTAAACGCCTCGCCAGCATGGCAAATTTAGCACCAAATATGTTAGCTGC CAAAGCCAGAAAT TTT CTT G ATCCTTT AG AT G AATT AG AAG AAT ACG AAG AACT GTT G AC ACTGCT ACCT AAACC AGC AGT G ATG A ACAATTATAAGACCGATTCATGTTTTGCCGAGCAAAGATTATCAGGTGCTAACCCTTTAG CAATTCGGAGAATT GAT AGTTT ACC AG C AA AT CT AG G CAT C AC A A AT G CCC ATTTT C AA AA AT CTGTCG G G AC AG AA AGT AACTT G G AAGCGGCTCTCAAAGAAGGTAAACTTTATCTATTAGATTATCCTGCACTTTTTGATATTA AAGGTGGAACTTCT CAAAATGTGAGAAAGTATTTACCTAAGCCTCAAGCTTT ATTTT ACTGGCAGAGCAATGGTGTAGCAAATGGTG GTTCGCTCCATCCAGTGGCGATT AAATT AAAT AATGATGCTGGGACAGATGGATTGATTTACACTCCCGATGA CCCTTATCTAGATTGGTTTTTAGCAAAAACTTCTGTACAGATTGCTGACGGCAACCATCA AGAATTAGGTAGTC ATTTTGCCTATACTCATGCTGTTATGGCTCCTTTTTGTATTGCCACAGCACGCCAATTAG CCGCAAATCATCCCA TT G CTTT ACT ACT AA AACC AC ATTTCCG GTT CAT GTT ATTT GAT AACG ATTT G G G ACG C ACT C ATTT CTT AC AG C CAGGTGGCCCAGTCGATGAATTTATGGCTGGTTCTTTAGAAGAATCATTAACTTTTGTCG TCAAAACTTACCAA GAATGGAGTGTTGATAAATTTGTCTTCCCGACATTAATGAAAAGTCAAAACATGGATGAC CCAGATGTATTAC CACATTTTCCGTTCCGGGATGAT GGGAT GTTGATTT GGAAT GCCATTCATAAATTT GTCACAG ATT ATTT GCAA CTTT ATT ACAAAACTTCCCAAG ACTT AAGCG AAG ATTAT G AATT GCAAAATTGGGCAAG AG AATT AGTT GCTC AAGATGGTGGACGGGTTAAAGGAATGCCGGACAAAATTGAAACTATCGACCAATTAATTC AAATTATTACGGT TGT AGTTTTT ACCTG CG CT CCTTT CC ATT CT G CTTT AA ATTTTTCCC AGT ACG A AT AC AT GG CTTT CGT ACC AAAT ATGCCTTATGCAGCTTATCATCCCACTCCTGAAAGTAAAGGTGTGGATATGCAAACTATC ATGAAGATATTGCC ACC ATTT AAG CAAG CTG CTG ACC AAGT AAT GTG G ACG CAT ATTTT AAC AT CTT ACC ATT ACG AT AAATT AG GTT ATTATGATGAGGAATTTGCCGACCCATTAGCACAGGAATTAGTTGTGCAGTTCCAACAGA ATTTACATGATAT AG AACG AAAAATT GAT ATT AG AAAT CAAACT CGT CCT AT ACCGT AT AATTT CCT CAAACCTTCCC AAATT ATT AA CAGTATCAATACTTAA

Amino acid Sequence for WP_062293357.1 - SEQ ID NO: 132

MKPYLPQVDPNPNIRKDELVKNREDYKFNHDYLAPIPVIDKVPHQELFSAEYTAKRLASMANLAPNMLAAKARNFL

DPLDELEEYEELLTLLPKPAVMNNYKTDSCFAEQRLSGANPLAIRRIDSLPANLGITNAHFQKSVGTESNLEAALKEG

KLYLLDYPALFDIKGGTSQNVRKYLPKPQALFYWQSNGVANGGSLHPVAIKLNNDAGTDGLIYTPDDPYLDWFLAK

TSVQIADGNHQELGSHFAYTHAVMAPFCIATARQLAANHPIALLLKPHFRFMLFDNDLGRTHFLQPGGPVDEFMA

GSLEESLTFVVKTYQEWSVDKFVFPTLMKSQNMDDPDVLPHFPFRDDGMLIWNAIHKFVTDYLQLYYKTSQDLSE

DYELQNWARELVAQDGGRVKGMPDKIETIDQLIQIITVVVFTCAPFHSALNFSQYEYMAFVPNMPYAAYHPTPES

KGVDMQTIMKILPPFKQAADQVMWTHILTSYHYDKLGYYDEEFADPLAQELVVQFQQNLHDIERKIDIRNQTRPIP

YNFLKPSQIINSINT

Coding sequence for WP_104398120.1 - SEQ ID NO: 133

ATGCTGACACCATCGCTCCCAAAAAATGATTCTGATCCAGTCAAAAGACAAGATCTA TTAAGACGACAAAAAC

AAGTGTACATTTATGATTCCGTTAATGGTATCACCCTCGTCAAAGATTTACCTACCCACGAAAACTTTTCTATTT

CCTATCAAGTAATGCGGGGTAAAGGTTTCAGTGCTTTAATTGCCAATGGCGTAGCCA CGAGAATAGAAAATGT

CTTTGATCCCTTCGACAAATTAGAAGATTACGAAGAACTTTTTCCTATCCTTCCCAAACCCACAAGCATTAAAAC

TTGGCAATCTAACACAGGTTTTGCCTACCAAAGATTAGCGGGAGCAAATCCCATGGT AATTCGCGGGATTAGC

AGCTTACCAAATAATTTCCCCGTCAGCGATACTATCTTCCAAAAAGCCATGGGACCG GATAAAACCATTGCCTC

GGAAGCCGCTAAGGGTAACTTATTTCTAGCAGATTATGCACCCCTAAACAACCTAAC TTTAGGCAGTTATCAAC

GGGGGATGAAAGCTGTGACAGCACCTCTTGTCCTTTTCTGTTGGCGTGCTAGAGGTT TACGCGGTCAAGGGG

GATTAGTACCAGTTGCCATTCAATTGTATCAAGATCCGACCCAACCTAATCAGCGCATCTATACCCCCGATGAC

GGACTTAATTGGTTAATGGCGAAGATTTTCGTGCAAATTGCCGACGGAAACCACCAT GAATTAGTTAGTCACC

TCAGCCATACCCATTTAGTAGCGGAAGCTTTTGTTTTAGCCACAGCTACCGAGTTAG CACTCAATCATCCTCTG

GCAATTCTATTAAGACCTCATTTTCAATTTACCCTCGCCATTAATAGTTTAGCCGAGAGCGAGTTAATTAACCCC

GGCGGATTTGTTGATCGTCTATTAGCGGGGACGCTAGAAGCATCGATCGAGCTAATT AAGAGTTCCTATCGTC

AAAGATTAGATAATTTCGCCGATTATGCCCTACCAAAGGAATTAGCATTGCGCCAAG TCCAGGATACCTCGCT

ACTACCAGATTACCCCTACCGAGACGATGCTCTCTTACTTTGGCAAGCAACGGAAAC CTACGTCAAAGATTACC

TAAGTCTTTACTATACTTCCGACGCGGATGTAAACGGGGATACAGAATTACAAGCTT GGGCGCGAAAATTGAT

GTCACCTGAAGGTGGAGGCATCAAAAAATTAGTTTTTGACGGACAATTAGACACTTT AGCCAAATTAGTCGAA

GTTGTCACCCAGATAATTTTTGTGGCCGGACCACAACACGCGGCGGTTAATTATCCT CAATACGATTATCTCGC

CTTTTGCCCGAATATTCCCCTAGCGGGTTATCAATCCCCTCCCAAAGCAGCTGAAGA GGTGGATATAGATTATA

TTCTCCGTCTTTTGCCGCCCCAGGCCCAGGCCGCTTATCAATTGGAAATTATGCAGA CTTTAACAGCTTTTCAAT

TTAACCGTTTTGGTTATCCATCCCGAAGTGCTTTCCCAGATCAACGCACTTACCCGATTTTGGCGGTTTTCCAAG

CTAAATTAAAAGCGATCGAAAATCAGATCGATCGGCGCAATTTAACCCGATTTACGCCTTATATTTTCCTGAAA

CCCTCTCGCATCCCCAATAGTATCAATATTTAG

Amino acid Sequence for WP_104398120.1 - SEQ ID NO: 134

MLTPSLPKNDSDPVKRQDLLRRQKQVYIYDSVNGITLVKDLPTHENFSISYQVMRGKGFSALIANGVATRIENVFDP

FDKLEDYEELFPILPKPTSIKTWQSNTGFAYQRLAGANPMVIRGISSLPNNFPVSDTIFQKAMGPDKTIASEAAKGNL

FLADYAPLNNLTLGSYQRGMKAVTAPLVLFCWRARGLRGQGGLVPVAIQLYQDPTQPNQRIYTPDDGLNWLMA

KIFVQIADGNHHELVSHLSHTHLVAEAFVLATATELALNHPLAILLRPHFQFTLAINSLAESELINPGGFVDRLLAGTLE

ASIELIKSSYRQRLDNFADYALPKELALRQVQDTSLLPDYPYRDDALLLWQATETYVKDYLSLYYTSDADVNGDTELQ

AWARKLMSPEGGGIKKLVFDGQLDTLAKLVEVVTQIIFVAGPQHAAVNYPQYDYLAFCPNIPLAGYQSPPKAAEEV

DIDYILRLLPPQAQAAYQLEIMQTLTAFQFNRFGYPSRSAFPDQRTYPILAVFQAKLKAIENQIDRRNLTRFTPYIFLKP

SRIPNSINI

Coding sequence for WP_002758835.1 - SEQ ID NO: 135

ATGCTGACACCATCGCTACCCAAAAATGATCCTGATCCAGTCAAAAGACAAGATCTATTAAGACGACAAAAAC

AAGTGTACATTTATGATTCCGTTAATGGTATCACCCTCGTCAAAGATTTACCTACCCACGAAAACTTTTCTATTT

CCTATCAAGTAATGCGGGGTAAAGGTTTCAGTGCTTTAATTGCCAATGGAGTCGCCA CAAGAGTAGAAAATAT

CTTCGATCCCTTCGACACATTAGAAGATTACGAAGAACTTTTTCCTATCCTTCCCAAACCCACAAGCATTAAAAC

TTGGCAATCTAATACAGGTTTTGCCTACCAAAGATTAGCGGGAGCAAATCCCCTGGT AATTCGCGGGATTAGC

AGCTTACCAGATAATTTCCCCGTCAGCGATGCCATCTTCCAAAAAGCCATGGGACCG GATAAAACCATTGACT

CGGAAGCCGCTAAGGGTAACTTATTTCTAGCAGATTATGCCCCCCTAAACAACCTAA CTTTAGGCAGTTATCAA

AAGGGCATGAAAACTGTAACAGCACCTCTTGTCCTTTTCTGTTGGCGTGCTAGAGGTTTACGCGGTCAAGGGG

GATTAGTACCTGTTGCCATTCAATTATATCAGGATCCGACCCAACCTAATCAGCGCA TCTATACCCCCGATGAC

GGACTTAATTGGTTAATGGCGAAGATTTTCGTGCAAATTGCCGACGGAAATCACCAT GAATTAGTTAGTCACC

TCAGCCATACCCATTTAGTAGCGGAAGCTTTTGTTTTAGATACAGCTACCGAGTTAG CAATCAATCATCCTCTG

GCAATTCTATTAAGACCTCATTTTCAATTTACCCTCGCTATTAATAGTTTAGCCGAG AGCGAGTTAATTAACCCT

GGCGGATTCGTTGATCGTCTATTAGCGGGGACGCTAGAAGCATCTATCGAGATAATT AAGACTTCCTATCGTC

AAAGATTGGATAATTTCGCCGATT AT ACCCT ACCCAAGCAATTAGCCTTCCGCCAAGTCGATGATACCTCCCTA

CTACCAGATTACCCCTACCGAGACGATGCTCTCTTACTTTGGCAAGCAACGGAAACC TACGTCAAAGATTACCT

AAGTCTTTACTATACTTCCGACGCGGATGTAAACGGGGATACAGAATTACAAGCTTG GGTGCGAAAATTGATG

TCACCTGAAGGTGGAGGCATCAAAAAATTAGTTTCTGACGGAGAATTAGACACTTTA GCCAAATTAGTCGAAG

TTGTCACCCAGATAATTTTTGTGGCCGGACCACAACACGCGGCGGTTAATTATCCTC AATACGATTATCTCGCC

TTTTGCCCGAATATTCCCCTAGCGGGTTATCAATCTCCTCCCAAAGCAGCTGAGGAG GTGGATATAGATTATAT

TCTCCGTCTTTTGCCGCCCCAGTCCCAGGCCGCTTATCAATTGGAAATTATGCAGAC TTTAACAGCTTTTCAATT

TAACCGTTTTGGCTATCCATCCCGAAGTGCTTTCCCAGATCAACGCACTTACCCGAT TTTGGCGGTTTTCCAAGC

TAAATTAAAGGCGATCGAAAATGAGATCGATCGGCGCAATTTAACCCGATTTACGCC TTATATTTTCCTGAAAC

CCT CTCGCAT CCCC AAT AGT AT C AAT ATTT AG

Amino acid Sequence for WP_002758835.1 - SEQ ID NO: 136

MLTPSLPKNDPDPVKRQDLLRRQKQVYIYDSVNGITLVKDLPTHENFSISYQVMRGK GFSALIANGVATRVENI FDP

FDTLEDYEELFPILPKPTSI KTWQSNTGFAYQRLAGANPLVIRGISSLPDNFPVSDAIFQKAMGPDKTIDSEAAKGNLF

LADYAPLN NLTLGSYQKGMKTVTAPLVLFCWRARGLRGQGGLVPVAIQLYQDPTQPNQRIYTPDDGLN WLMAKI

FVQIADGN HH ELVSHLSHTHLVAEAFVLDTATELAIN HPLAILLRPHFQFTLAI NSLAESELINPGGFVDRLLAGTLEAS

IEII KTSYRQRLDNFADYTLPKQLAFRQVDDTSLLPDYPYRDDALLLWQATETYVKDYLSLYYT SDADVNGDTELQA

WVRKLMSPEGGGIKKLVSDGELDTLAKLVEVVTQI IFVAGPQHAAVNYPQYDYLAFCPN IPLAGYQSPPKAAEEVDI

DYILRLLPPQSQAAYQLEIMQTLTAFQFN RFGYPSRSAFPDQRTYPILAVFQAKLKAIENEIDRRNLTRFTPYI FLKPSR

IPNSI NI

Coding sequence for WP_072927101.1 - SEQ ID NO: 137

ATGCTGACACCATCGCTACCCCAAAATGATCCTGATCCAGCCAAAAGACAAGAGCTA TTAAGACGACAAAAAC AAGT GT ACAT CT AT G ATTCCGTT AATGGT AT C ACCCT CGT C AAAG ATTT ACCT ACCC ACG AAAACTTTT CT ATTT CCTATCAAGTAATGCGGGGTAAAGGTTTCAGTGCTTTAATTGCCAATGGCGTGGCCACGA GAATAGAAAACG TCTTTGATCCCTTCGACAAATTAGAAGATTACGAAGAACTTTTTCCCATCCTTCCCCAAC CCACAAGCATTAAAA CTTGGCAATCTAACACAAGTTTTGCCTACCAAAGATTAGCGGGAGCAAATCCCATGGTCA TCCGCGGGATTAG

CAGCTTACCGGATAATTTTCCCGTCAGCGATGCTATCTTCCAAAAAGCCATGGGACC TGATAAAACCATTGCCT

CGGAAGCCGCTAAGGGTAACTTATTTCTAGCAGATTATGCCCCCCTAAACAACCTAA CTTTAGGCAGTTATCAA

A AG G GT AT G AA AACT GT AAC AG C ACCT CTT GT CCTTTT CTGTTG G CGTG CT AG AG GTTT ACGGGGTCAAGGGG

GATTAGTACCAGTTGCCATTCAATTATATCAAGATCCGACTCAACCTAATCAGCGCA TCTATACCCCCGATGAC

GGACTTAATTGGTTAATGGCGAAAATTTTCGTCCAAATTGCCGACGGAAATCACCAT GAATTAGTTAGTCACCT

C AG CC AT ACCC ATTT AGTAG CG G A AG CTTTT GTTTT AG CC AC AG CT ACCG AGTTAG C ACTT AAT C ATCCTCTG G

CAATTCTATTAAGACCTCATTTTCAATTTACCCTCGCTATTAATAGTTTAGCCGAGA GCGAGTTAATTAACCCCG

GCGGATTCGTTGATCGTCTATTAGCGGGGACGCTAGAAGCATCGATCGAGCTAATTA AGAGTTCCTATCGTCA

AAGATTAGATAATTTCGCCGATTATGCCCTACCAAAGCAATTAGAATTGCGCCAAGT CCAGGATACCTCGCTAC

TACCAGATTACCCCTACCGAGACGATGCTCTCTTACTTTGGCAAGCAACGGAAACCT ACGTCAAAGATTACCTA

AGTCTTTACTATACTTCCGATGCGGACGTAAATGAGGATACAGAATTACAAGCTTGG GTGCGAACATTGATGT

CACCTGAAGGTGGAGGCATCAAAAAATTAGTTTCTGAAGGAGAATTAGACACTTTGG CCAAATTAATCGAAGT

TGTCACCCAGATAATTTTTGTGGCCGGACCACAACACGCGGCGGTTAATTATCCTCA ATACGATTATCTCGCCT

TTTGCCCGAATATTCCCCTAGCGGGTTATCAATCTCCTCCCAAAGCAGCTGAGCAGG TGGATATAGATTATATT

CTCCGTCTTTTGCCCCCCCAGGCCCAGGCCGCTTATCAATTGGAAATTATGCAGACT TTAACAGCTTTTCAATTT

AACCGTTTTGGCTATCCATCCCGAAGTGCTTTCCCAGATCAACGCACTTACCCGATT TTGGCGGTTTTCCAAGCT

AAATTAAAGGCGATCGAAAATCAGATCGATCGGCGCAATTTAACCCGATTTACGCCT TATATTTTCCTGAAACC

CT CT CGC ATCCCCAAT AGT AT CAAT ATTT AG

Amino acid Sequence for WP_072927101.1 - SEQ ID NO: 138

MLTPSLPQNDPDPAKRQELLRRQKQVYIYDSVNGITLVKDLPTH EN FSISYQVM RGKGFSALIANGVATRIENVFDP

FDKLEDYEELFPI LPQPTSIKTWQSNTSFAYQRLAGANPMVIRGISSLPDN FPVSDAIFQKAMGPDKTIASEAAKGN L

FLADYAPLN NLTLGSYQKGMKTVTAPLVLFCWRARGLRGQGGLVPVAIQLYQDPTQPNQRIYTPDDGLN WLMAK

IFVQIADGNH HELVSH LSHTHLVAEAFVLATATELALNH PLAILLRPH FQFTLAINSLAESELI NPGGFVDRLLAGTLEA

SIELI KSSYRQRLDN FADYALPKQLELRQVQDTSLLPDYPYRDDALLLWQATETYVKDYLSLYYTSDADVN EDTELQA

WVRTLMSPEGGGIKKLVSEGELDTLAKLI EVVTQI IFVAGPQHAAVNYPQYDYLAFCPNIPLAGYQSPPKAAEQVDI

DYILRLLPPQAQAAYQLEI MQTLTAFQFNRFGYPSRSAFPDQRTYPI LAVFQAKLKAI ENQIDRRNLTRFTPYI FLKPS

RIPNSI NI

Coding sequence for WP_110578596.1 - SEQ ID NO: 139

ATGCT G AC ACCAT CGCT ACCCAAAAAT G ATCCT G ATCCAGT CAAAAG ACAAG AT CT ATT AAG ACG ACAAAAAC

AAGT GT ACAT CT AT G ATTCCGTT AATGGT AT C ACCCT CGT C AAAG ATTT ACCT ACCC ACG AAAACTTTT CT ATTT

CCTATCAAGTAATGCGGGGTAAAGGTTTCAGTGCTTTAATTGCCAATGGCGTAGCCA CGAGAATAGAAAATGT

CTTTGATCCCTTTGATAAATTAGAAGATTACGAAGAACTTTTTCCTATCCTTCCCAA ACCCACAAGTATTAAAAC

TTGGCAATCTAACACAGGTTTTGCCTACCAAAGATTAGCGGGAGCAAATCCCATGGT AATTCGCGGTATTAGC

AGCTTACCGGATAATTTTCCCGTCAGCGATGCCATCTTCCAAAAAGCCATGGGACCG GATAAAACCATTGCCTC

G G A AG CTG CTAG G G GT AACTT ATTT CT AG C AG ATT ATG CCCCCCT AA AC AACCT A ACTTT AG G C A ATT AT C A AA

GGGGGATGAAAGCTGTGACAGCACCTCTTGTCCTTTTCTGTTGGCGTGCTAGAGGTT TACGTGGTCAAGGGG

GATTAGTACCGGTTGCCATTCAATTATATCAGGATCCTACCCAACCTAATCAGCGCA TCTATACTCCCGATGAC

GGACTTAATTGGTTAATGGCGAAGATTTTCGTGCAAATTGCCGACGGAAACCACCAT GAATTAGTTAGTCACC

TCAGCCATACCCATTTAGTAGCGGAAGCTTTTGTTTTAGCCACAGCTACCGAGTTAG CACTCAATCATCCTCTG

GCGATTCTATTAAGACCTCATTTTCAATTTACCCTCGCCATTAATAGTTTAGCCGAG AGCGAGTTAATTAACCCA

GGCGGATTTGTTGATCGTCTATTAGCGGGGACGCTAGAGGCATCGATCGAGCTAATT AAGAGTTCCTATCGTC AAAGATTAGATAATTTCGCCGATTATGCCCTACCAAAGCAATTAGAATTGCGCCAAGTCC AGGATACCTCGCT

ACTACCAGATTACCCCTACCGAGACGATGCTCTCTTACTTTGGCAAGCAACGGAAAC CTACGTCAAAGATTACC

TAAGTCTTTACTATACTTCCGACGCGGACGTAAACGAGGATACAGAATTACAAGCTT GGGTGCGAAAATTGAT

GTCACCTGAAGGTGGAGGCATCAAAAAATTAGTTTCTGACGGACAATTAGACACTTT AGCCAAATTAATCGAA

GTTGTCACCCAGATAATTTTTGTGGCCGGACCACAACACGCGGCGGTTAATTATTCT CAATACGATTATCTCGC

CTTTTGCCCGAATATTCCCCTAGCGGGTTATCAATCTCCTCCCAAAGCAGCTGAGGA GGTGGATATAGATTATA

TTCTCCGTCTTTTGCCGCCCCAGGCCCAGGCCGCTTATCAATTGGAAATTATGCAGA CTTTAACAGCTTTTCAAT

TT AACCGTTTTGGTT ATCCAT CCCGCAGTGCTTT CCCAG ATCAACGCACTT ACCCG ATTTT GGCGGTTTTCCAAG

CT AAATTAAAAGCG AT CG AAAAT GAG AT CG ATCGGCGCAATTTAACCCG ATTT ACGCCTT AT ATTTT CCT G AAA

CCCT CT CGCATCCCCAAT AGT AT CAAT ATTT AG

Amino acid Sequence for WP_110578596.1 - SEQ ID NO: 140

MLTPSLPKNDPDPVKRQDLLRRQKQVYIYDSVNGITLVKDLPTHENFSISYQVMRGK GFSALIANGVATRI ENVFDP

FDKLEDYEELFPI LPKPTSIKTWQSNTGFAYQRLAGANPMVIRGISSLPDN FPVSDAIFQKAMGPDKTIASEAARGN L

FLADYAPLN NLTLGNYQRGMKAVTAPLVLFCWRARGLRGQGGLVPVAIQLYQDPTQPNQRIYTPDDGLN WLMA

KI FVQIADGN HH ELVSHLSHTH LVAEAFVLATATELALNHPLAILLRPHFQFTLAINSLAESELINPGGFVDRLLAGTLE

ASI ELIKSSYRQRLDNFADYALPKQLELRQVQDTSLLPDYPYRDDALLLWQATETYVKDYLSL YYTSDADVN EDTELQ

AWVRKLMSPEGGGIKKLVSDGQLDTLAKLIEVVTQII FVAGPQHAAVNYSQYDYLAFCPN IPLAGYQSPPKAAEEV

DI DYILRLLPPQAQAAYQLEIMQTLTAFQFN RFGYPSRSAFPDQRTYPI LAVFQAKLKAIEN EIDRRN LTRFTPYI FLKP

SRIPNSIN I

Coding sequence for WP_045360762.1 - SEQ ID NO: 141

ATGCTGACACCATCGCTACCCCAAAATGATCCTGATCCAGCCAAAAGACAAGATCTA TTAAGACGACAAAAAC

AAGT GT ACATTT AT G ATTCCGTT AATGGT AT C ACCCT CGT C AAAG ATTT ACCT ACCC ACG AAAACTTTT CT ATTT

CCTATCAAGTAATGCGGGGTAAAGGTTTCAGTGCTTTAATTGCCAATGGAGTCGCCA CAAGGGTAGAAAATAT

CTTCG AT CCCTTT GAT AAATT AG AAG ATT ACG AAG AACTTTTT CCCCT CCTTCCCCAACCC AC AAGCATT AAAAA

TTGGCAATCTAACACAAGTTTTGCCTACCAAAGATTAGCGGGAGCAAATCCCATGGT CATCCGGGGGATTAGC

AGCTTACCGGATAATTTCCCAGTCACCGATGCTATCTTCCAAAAAGCTATGGGACCG GATAAAACCATTGCCTC

GGAAGCCGCTAAGGGTAACTTATTTCTAGCAGATTATGCCACCCTACACCACCTAAC TTTAGGCAGTTATCAAA

GGGGTATGAAAACTGTGACAGCACCTCTTGTCCTTTTCTGTTGGCGTGCTAGAGGTT TACGCGGTCAAGGGGG

ATT AGT ACC AGTT GCC ATT CAATT GT AT C AGG AT CCG ACCCT ACCT AAT CAGCGCAT CT AT ACCCCCG AT G ACG

GACTTAATTGGTTAATGGCGAAAATTTTCGTGCAAATTGCCGACGGAAATCACCATG AATTAGTTAGTCACCTC

ACCCATACCCATTTAGTAGCGGAAGCTTTTGTTTTAGCCACAGCTACCGAGTTAGCA CTTAATCATCCTCTGGC

AATTCTATTAAGACCTCATTTTCAATTTACCCTCGCTATTAATACTCTAGCCGAGAG CGAGTTAATTAGCCCTGG

CGGATTCGTTGATCGTCTATTAGCGGGGACGCTAGAAGCATCGATCGAGCTAATTAA GAGTTCCTATCGTCAA

AGATTGGATAATTTCGCCGATTATGCCCTACCAAAGCAATTAGAATTGCGCCAAGTC CAGGATACCTCGCTACT

ACCAGATTACCCCTACCGAGACGATGCTCTCTTACTTTGGCAAGCAACGGAAACCTA CGTCAAAGATTACCTAA

GTCTTTACTATACTTCCGATGCGGATGTAAACGGGGATACAGAATTACAAGCCTGGG TGCGAAAATTGATGTC

ACCTGAAGGTGGAGGCATCAAAAAATTAGTTTCTGACGGACAATTAGACACTTTAGC CAAATTAATCGAAGTT

GTCACCCAGATAATTTTTGTGGCTGGACCACAACACGCGGCGGTTAATTATCCTCAA TACGATTATCTCGCCTT

TTGCCCGAATATTCCCCTAGCGGGTTATCAATCCCCTCCCAAAGCAGCTGAGGAAGT GGATATAGATTATATTC

TCCGTCTTTTGCCGCCCCAGGCCCAGGCCGCTTATCAATTGGAAATTATGCAGACTT TAACAGCTTTTCAATTTA

ACCGTTTTGGTTATCCATCCCGAAGTGCTTTCCCAGATCAACGCACTTACCCGATTT TGGCGGTTTTCCAAGCTA AATTAAAAGCGATCGAAAATGAGATCGATCGGCGCAATTTAACCCGATTTACGCCTTATA TTTTCCTGAAACCC T CTCGCAT ACCC AAT AGT AT C AAT ATTT G A

Amino acid Sequence for WP_045360762.1 - SEQ ID NO: 142

MLTPSLPQNDPDPAKRQDLLRRQKQVYIYDSVNGITLVKDLPTH EN FSISYQVM RGKGFSALIANGVATRVEN IFDP

FDKLEDYEELFPLLPQPTSI KNWQSNTSFAYQRLAGANPMVIRGISSLPDN FPVTDAIFQKAMGPDKTIASEAAKGN

LFLADYATLH HLTLGSYQRGM KTVTAPLVLFCWRARGLRGQGGLVPVAIQLYQDPTLPNQRIYTPDDGLNWLMA

KI FVQIADGN HH ELVSHLTHTHLVAEAFVLATATELALN HPLAI LLRPH FQFTLAI NTLAESELISPGGFVDRLLAGTLE

ASI ELIKSSYRQRLDNFADYALPKQLELRQVQDTSLLPDYPYRDDALLLWQATETYVKDYLSL YYTSDADVNGDTELQ

AWVRKLMSPEGGGIKKLVSDGQLDTLAKLIEVVTQII FVAGPQHAAVNYPQYDYLAFCPNI PLAGYQSPPKAAEEV

DI DYILRLLPPQAQAAYQLEIMQTLTAFQFN RFGYPSRSAFPDQRTYPI LAVFQAKLKAIEN EIDRRN LTRFTPYI FLKP

SRIPNSIN I

Coding sequence for REJ48186.1 - SEQ ID NO: 143

ATGCTGACACCATCGCTCCCCAAAAATGATCCTGATCCAGTCAAAAGACAAGAGCTA TTAAGACGACAAAAAC

AAGT GT ACATTT AT G ATTCCGTT AATGGT AT C ACCCT CGT C AAAG ATTT ACCT ACCC ACG AAAACTTTT CT ATTT

CCTATCAAGTAATGCGGGGTAAAGGTTTCAGTGCTTTAATTGCCAATGGAGTCGCCA CAAGGGTAGAAAATAT

CTTCG AT CCCTTT G AC AAATT AG AAG ATT ACG AAG AACTTTTT CCC AT CCTTCCCCAACCCACAAGC ATT AAAAC

TTGGCAATCTAACACAAGTTTTGCCTACCAAAGATTAGCGGGAGCAAATCCCATGGT AATCCGCGGGATTAGC

AGCTTACCAAATAATTTTCCCGTCAGCGATGCTATCTTCCAAAAAGCTATGGGACCC GATAAAACCATTGCCTC

GGAAGCCGCTAAGGGTAACTTATTTCTAGCAGATTATGCCCCCCTACACCACCTAAC TTTAGGCAGTTATCAAA

G G G GTATG AA AACT GT G AC AG C ACCT CTT GTCCTTTT CTGTTGGCGTGCTAGAG GTTT ACG G G GT C A AG GG G

GATT AGT ACC AGTT GCCATT C AATT GT AT CAGG AT CCG ACCCT ACCT AAT CAGCGCAT CT AT ACCCCCG AT G AC

GGACTTAATTGGTTAATGGCGAAAATTTTCGTGCAAATTGCCGACGGAAATCACCAT GAATTAGTTAGTCACC

TCAGCCATACCCATTTAGTAGCGGAAGCTTTTGTTTTAGCCACAGCTACCGAGTTAG CACTTAATCATCCTCTG

GCAATTCT ATT AAGACCTCATTTTCAATTT ACCCTCGCT ATT AATAGTTT AGCCGAGAGCGAGTT AATT AACCCC

GGCGGATTCGTTGATCGTCT ATT AGCGGGGACCCTAGAAGCATCTATCGAGCT AATT AAGAGTTCCTATCGTC

AAAGATTGGATAATTTCGCCGATTATGCCCTACCAAAGCAATTAGAATTGCGCCAGG TCCAGGATACCTCCCT

ACTACCAGATTACCCCTACCGAGACGATGCTCTCTTACTTTGGCAAGCAACGGAAAC CTACGTCAAAGATTACC

TAAGTCTTTACTATACTTCCGACGCGGACGTAAACGAGGATACAGAATTACAAGCTT GGGCGCGAAAATTGAT

GTCATCTGAAGGTGGAGGCATCAAAAAATTAGTTTCTGACGGAGAATTAGACACTTT AGCCAAATTAGTTGAA

GTTGTCACCCAGATAATTTTTGTGGCCGGACCACAACACGCGGCGGTTAATTATCCT CAATACGATTATCTCGC

CTTTAGCCCCAATATTCCCCTAGCGGGTTATCAATCCCCTCCCAAAGCAGCTGAGGA AGTGGATATAGATTATA

TTCTCCGTCTTTTGCCGCCCCAGGCCCAGGCCGCTTATCAATTGGAAATTATGCAGA CTTTAACAGCTTTTCAAT

TT AACCGTTTTGGTT ATCCAT CCCG AAGT GCTTT CCCAG ATCAACGT GCTT ACCCG ATTTTGGCGGTTTTCCAAG

CT AAATT AAAAGCG AT CG AAAAT GAG AT CG ATCGGCGCAATTTAACCCG ATTT ACGCCTT AT ATTTT CCT G AAA

CCCT CT CGCAT ACCCAAT AGT AT C AAT ATTT AG

Amino acid Sequence for REJ48186.1 - SEQ ID NO: 144

MLTPSLPKNDPDPVKRQELLRRQKQVYIYDSVNGITLVKDLPTHENFSISYQVMRGK GFSALIANGVATRVENI FDP

FDKLEDYEELFPI LPQPTSIKTWQSNTSFAYQRLAGANPMVIRGISSLPN NFPVSDAIFQKAMGPDKTIASEAAKG NL

FLADYAPLH HLTLGSYQRGMKTVTAPLVLFCWRARGLRGQGGLVPVAIQLYQDPTLPNQRIYTPDDGLN WLMAKI

FVQIADGN HH ELVSHLSHTHLVAEAFVLATATELALN HPLAILLRPHFQFTLAI NSLAESELINPGGFVDRLLAGTLEA

SIELI KSSYRQRLDN FADYALPKQLELRQVQDTSLLPDYPYRDDALLLWQATETYVKDYLSLYYTSDADVN EDTELQA

WARKLMSSEGGGIKKLVSDGELDTLAKLVEVVTQI IFVAGPQHAAVNYPQYDYLAFSPN IPLAGYQSPPKAAEEVDI

DYILRLLPPQAQAAYQLEI MQTLTAFQFNRFGYPSRSAFPDQRAYPILAVFQAKLKAI EN EI DRRN LTRFTPYIFLKPSR

IPNSI NI

Coding sequence for REJ50596.1 - SEQ ID NO: 145

ATGCTGACACCATCGCTACCCCAAAATGATCCTGATCCAGCCAAAAGACAAGATCTA TTAAGACGACAAAAAC

AAGTGTACGTCTATGATTCCGTTAATGGTATCACCCTCGTCAAAGATTTACCCACCC ACGAAAACTTTTCTATTT

CCTATCAAGTAATGCGGGGTAAAGGTTTCAGTGCTTTAATTGCCAATGGAGTGGCCA CAAGAGTAGAAAATAT

CTTCG AT CCCTTT G AC AAATT AG AAG ATT ACG AAG AACTTTTT CCC AT CCTTCCC AAACCCACAAGT ATT AAAAC

TTG G C A AT CT A AC AC AAGTTTT G CCT ACC A AAG ATT AG C AG GAG C A A AT CCC ATG GT AATTCG CG G G ATT AG C

AGCTTACCAGATAATTTCCCAGTCACCGATGCTATCTTCCAAAAAGCCATGGGACCG GATAAAACCATTGCCTC

GGAAGCCGCTAAGGGTAACTTATTTCTAGCAGATTATGCCCCCCTACACCACCTAAC TTTAGGCAGTTATCAAA

AGGGTATGAAAACTGTAACAGCACCTCTTGTCCTTTTCTGTTGGCGTGCTAGAGGTT TACGCGGTCAAGGGGG

ATTAGTACCTGTTGCCATTCAATTATATCAGGATCCTACCCAACCTAATCAGCGCAT CTATACCCCCGATGACG

GACTTAATTGGTTAATGGCGAAGATTTTCGTGCAAATTGCCGACGGAAATCATCATG AATTAGTTAGTCACCTC

AGCCATACCCATTTAGTAGCGGAAGCTTTTGTTTTAGCCACAGCTACCGAGTTAGCA CTCAATCATCCTCTGGC

AATTCTATTAAGACCTCATTTTCAATTTACCCTCGCCATTAATAGTTTAGCCGAGAG CGAGTTAATTAACCCCGG

CGGATTCGTTGATCGTCTATTAGCGGGGACGCTAGAAGCATCTATCGAGATAATTAA GACTTCCTATCGTCAA

AGATTGGATAATTTCGCCGATTATACCCTACCCAAGCAATTAGCCTTCCGCCAAGTC GATGATACCTCCCTACT

ACCAGATTACCCCTACCGAGACGATGCTCTCTTACTTTGGCAAGCAACGGAAACCTA CGTCAAAGATTACCTAA

GTCTTTACTATACTTCCGACGCGGATGTAAACAAGGATACAGAATTACAAGCTTGGG TGCGAAAATTGATGTC

ACCTGAAGGTGGAGGCATTAAAAAATTAGTTTCTGACGGAAAATTAGACACTTTAGC CAAATTAATCGAAGTT

GTCACCCAGATAATTTTTATTGCTGGACCACAACACGCGGCGGTTAATTATTCTCAA TACGATTATCTCGCCTTT

TGCGCGAATATTCCCCTAGCCGGTTATCAATCTCCTCCCAAAGCATCTGAGGAGGTG GATATGGATTATATTCT

CCGTCTTTTGCCCCCCCAGGCCCAGGCCACTTATCAATTGGAAATTATGCACACTTT AACAGCTTTTCAATTCAA

CCGTTTTGGTTATCCATCCCGAAATGATTTCCCAGATCAACGCACTTACCCGATTTT GGCGGTTTTCCAAGCTAA

ATTAAAAGCGATCGAAAATGAGATCGATCGGCGCAATTCAACCCGAATTACGCCTTA TATTTTCCTGAAACCCT

CTCG C AT CCCC A AT AGT AT C A AT ATTT AA

Amino acid Sequence for REJ50596.1 - SEQ ID NO: 146

MLTPSLPQNDPDPAKRQDLLRRQKQVYVYDSVNGITLVKDLPTHENFSISYQVM RGKGFSALIANGVATRVENI FD

PFDKLEDYEELFPILPKPTSI KTWQSNTSFAYQRLAGANPMVIRGISSLPDN FPVTDAIFQKAMGPDKTIASEAAKGN

LFLADYAPLH HLTLGSYQKGMKTVTAPLVLFCWRARGLRGQGGLVPVAIQLYQDPTQPNQRIYTPDDGLN WLMA

KI FVQIADGN HH ELVSHLSHTH LVAEAFVLATATELALNHPLAILLRPHFQFTLAINSLAESELINPGGFVDRLLAGTLE

ASI EI IKTSYRQRLDN FADYTLPKQLAFRQVDDTSLLPDYPYRDDALLLWQATETYVKDYLSLYYTSDADVN KDTELQ

AWVRKLMSPEGGGIKKLVSDGKLDTLAKLIEVVTQII FIAGPQHAAVNYSQYDYLAFCANI PLAGYQSPPKASEEVD

MDYILRLLPPQAQATYQLEI MHTLTAFQFN RFGYPSRNDFPDQRTYPILAVFQAKLKAI ENEIDRRNSTRITPYIFLKP

SRIPNSIN I

Coding sequence for WP_041804209.1 - SEQ ID NO: 147

ATGCT G AC ACCAT CGCT ACCCAAAAAT G ATCCT G ATCCAGT CAAAAG ACAAG AT CT ATT AAG ACG ACAAAAAC AAGT GT ACATTT AT G ATTCCGTT AATGGT AT C ACCCT CGT C AAAG ATTT ACCT ACCC ACG AAAACTTTT CT ATTT CCTATCAAGTAATGCGGGGTAAAGGTTTCAGTGCTTTAATTGCCAATGGAGTCGCCACAA GGGTAGAAAATAT CTTCG AT CCCTTT G AC AAATT AG AAG ATT ACG AAG AACTTTTT CCC AT CCTTCCCCAACCCACAAGC ATT AAAAC TTGGCAATCTAACACAAGTTTTGCCTACCAAAGATTAGCGGGAGCAAATCCCATGGTAAT CCGCGGGATTAGC

AGCTTACCAAATAATTTTCCCGTCAGCGATGCTATCTTCCAAAAAGCTATGGGACCC GATAAAACCATTGCCTC

GGAAGCCGCTAAGGGTAACTTATTTCTAGCAGATTATGCCCCCCTACACCACCTAAC TTTAGGCAGTTATCAAA

G G G GTATG AA AACT GT G AC AG C ACCT CTT GTCCTTTT CTGTTGGCGTGCTAGAG GTTT ACG G G GT C A AG GG G

GATT AGT ACC AGTT GCCATT C AATT GT AT CAGG AT CCG ACCCT ACCT AAT CAGCGCAT CT AT ACCCCCG AT G AC

GGACTTAATTGGTTAATGGCGAAAATTTTCGTGCAAATTGCCGACGGAAATCACCAT GAATTAGTTAGTCACC

TCAGCCATACCCATTTAGTAGCGGAAGCTTTTGTTTTAGCCACAGCTACCGAGTTAG CACTTAATCATCCTCTG

GCAATTCT ATT AAGACCTCATTTTCAATTT ACCCTCGCT ATT AATAGTTT AGCCGAGAGCGAGTT AATT AACCCC

GGCGGATTCGTTGATCGTCT ATT AGCGGGGACCCTAGAAGCATCTATCGAGCT AATT AAGAGTTCCTATCGTC

AAAGATTGGATAATTTCGCCGATTATGCCCTACCAAAGCAATTAGAATTGCGCCAGG TCCAGGATACCTCCCT

ACTACCAGATTACCCCTACCGAGACGATGCTCTCTTACTTTGGCAAGCAACGGAAAC CTACGTCAAAGATTACC

TAAGTCTTTACTATACTTCCGACGCGGACGTAAACGAGGATACAGAATTACAAGCTT GGGCGCGAAAATTGAT

GTCATCTGAAGGTGGAGGCATCAAAAAATTAGTTTCTGACGGAGAATTAGACACTTT AGCCAAATTAGTCGAA

GTTGTCACCCAGATAATTTTTGTGGCCGGACCACAACACGCGGCGGTTAATTATCCT CAATACGATTATCTCGC

CTTTAGCCCCAATATTCCCCTAGCGGGTTATCAATCCCCTCCCAAAGCAGCTGAGGA AGTGGATATAGATTATA

TTCTCCGTCTTTTGCCGCCCCAGGCCCAGGCCGCTTATCAATTGGAAATTATGCAGA CTTTAACAGCTTTTCAAT

TT AACCGTTTTGGTT ATCCAT CCCG AAGT GCTTT CCCAG ATCAACGT GCTT ACCCG ATTTTGGCGGTTTTCCAAG

CT AAATTAAAAGCG AT CG AAAAT GAG AT CG ATCGGCGCAATTTAACCCG ATTT ACGCCTT AT ATTTT CCT G AAA

CCCT CT CGCAT ACCCAAT AGT AT C AAT ATTT AG

Amino acid Sequence for WP_041804209.1 - SEQ ID NO: 148

MLTPSLPKNDPDPVKRQDLLRRQKQVYIYDSVNGITLVKDLPTHENFSISYQVMRGK GFSALIANGVATRVENI FDP

FDKLEDYEELFPI LPQPTSIKTWQSNTSFAYQRLAGANPMVIRGISSLPN NFPVSDAIFQKAMGPDKTIASEAAKG NL

FLADYAPLH HLTLGSYQRGMKTVTAPLVLFCWRARGLRGQGGLVPVAIQLYQDPTLPNQRIYTPDDGLN WLMAKI

FVQIADGN HH ELVSHLSHTHLVAEAFVLATATELALN HPLAILLRPHFQFTLAI NSLAESELINPGGFVDRLLAGTLEA

SIELI KSSYRQRLDN FADYALPKQLELRQVQDTSLLPDYPYRDDALLLWQATETYVKDYLSLYYTSDADVN EDTELQA

WARKLMSSEGGGIKKLVSDGELDTLAKLVEVVTQI IFVAGPQHAAVNYPQYDYLAFSPN IPLAGYQSPPKAAEEVDI

DYILRLLPPQAQAAYQLEI MQTLTAFQFNRFGYPSRSAFPDQRAYPILAVFQAKLKAI EN EI DRRN LTRFTPYIFLKPSR

IPNSI NI

Coding sequence for WP_004162848.1 - SEQ ID NO: 149

ATGCTGACACCATCGCTCCCCAAAAATGATCCTGATCCAGTCAAAAGACAAGATCTA TTAAGACGACAAAAAC AAGT GT ACATTT AT G ATTCCGTT AATGGT AT C ACCCT CGT C AAAG ATTT ACCT CCCC ACG AAAACTTTT CT ATTT CCTATCAAGTAATGCGGGGTAAAGGTTTCAGTGCTTTAATTGCCAATGGAGTCGCCACAA GGGTAGAAAATAT CTTCG AT CCCTTT G AC AAATT AG AAG ATT ACG AAG AACTTTTT CCC AT CCTTCCCCAACCCACAAGC ATT AAAAC TTGGCAATCTAACACAAGTTTTGCCTACCAAAGATTAGCGGGAGCAAATCCCATGGTAAT CCGCGGGATTAGC AGCTTACCAAATAATTTTCCCGTCAGCGATGCTATCTTCCAAAAAGCTATGGGACCCGAT AAAACCATTGCCTC GGAAGCCGCTAAGGGTAACTTATTTCTAGCAGATTATGCCCCCCTACACCACCTAACTTT AGGCAGTTATCAAA G G G GTATG AA AACT GT G AC AG C ACCT CTT GTCCTTTT CTGTTGGCGTGCTAGAG GTTT ACG G G GT C A AG GG G GATT AGT ACC AGTT GCCATT C AATT GT AT CAGG AT CCG ACCCT ACCT AAT CAGCGCAT CT AT ACCCCCG AT G AC GGACTTAATTGGTTAATGGCGAAAATTTTCGTGCAAATTGCCGACGGAAATCACCATGAA TTAGTTAGTCACC TCAGCCATACCCATTTAGTAGCGGAAGCTTTTGTTTTAGCCACAGCTACCGAGTTAGCAC TTAATCATCCTCTG GCAATTCTATTAAGACCTCATTTTCAATTT ACCCTCGCT ATT AATAGTTT AGCCGAGAGCGAGTTAATTAACCCC GGCGGATTCGTTGATCGTCT ATT AGCGGGGACCCTAGAAGCATCTATCGAGCT AATT AAGAGTTCCTATCGTC AAAGATTGGATAATTTCGCCGATTATGCCCTACCAAAGCAATTAGAATTGCGCCAGGTCC AGGATACCTCCCT

ACTACCAGATTACCCCTACCGAGACGATGCTCTCTTACTTTGGCAAGCAACGGAAAC CTACGTCAAAGATTACC

TAAGTCTTTACTATACTTCCGACGCGGACGTAAACGAGGATACAGAATTACAAGCTT GGGCGCGAAAATTGAT

GTCATCTGAAGGTGGAGGCATCAAAAAATTAGTTTCTGACGGAGAATTAGACACTTT AGCCAAATTAGTCGAA

GTTGTCACCCAGATAATTTTTGTGGCCGGACCACAACACGCGGCGGTTAATTATCCT CAATACGATTATCTCGC

CTTTAGCCCCAATATTCCCCTAGCGGGTTATCAATCCCCTCCCAAAGCAGCTGAGGA AGTGGATATAGATTATA

TTCTCCGTCTTTTGCCGCCCCAGGCCCAGGCCGCTTATCAATTGGAAATTATGCAGA CTTTAACAGCTTTTCAAT

TT AACCGTTTTGGTT ATCCAT CCCG AAGT GCTTT CCCAG ATCAACGT GCTT ACCCG ATTTTGGCGGTTTTCCAAG

CT AAATTAAAAGCG AT CG AAAAT GAG AT CG ATCGGCGCAATTTAACCCG ATTT ACGCCTT AT ATTTT CCT G AAA

CCCT CT CGCAT ACCCAAT AGT AT C AAT ATTT AG

Amino acid Sequence for WP_004162848.1 - SEQ ID NO: 150

MLTPSLPKNDPDPVKRQDLLRRQKQVYIYDSVNGITLVKDLPPH EN FSISYQVM RGKGFSALIANGVATRVENI FDP

FDKLEDYEELFPI LPQPTSIKTWQSNTSFAYQRLAGANPMVIRGISSLPN NFPVSDAIFQKAMGPDKTIASEAAKG NL

FLADYAPLH HLTLGSYQRGMKTVTAPLVLFCWRARGLRGQGGLVPVAIQLYQDPTLPNQRIYTPDDGLN WLMAKI

FVQIADGN HH ELVSHLSHTHLVAEAFVLATATELALN HPLAILLRPHFQFTLAI NSLAESELINPGGFVDRLLAGTLEA

SIELI KSSYRQRLDN FADYALPKQLELRQVQDTSLLPDYPYRDDALLLWQATETYVKDYLSLYYTSDADVN EDTELQA

WARKLMSSEGGGIKKLVSDGELDTLAKLVEVVTQI IFVAGPQHAAVNYPQYDYLAFSPN IPLAGYQSPPKAAEEVDI

DYILRLLPPQAQAAYQLEI MQTLTAFQFNRFGYPSRSAFPDQRAYPILAVFQAKLKAI EN EI DRRN LTRFTPYIFLKPSR

IPNSI NI

Coding sequence for BAG04096.1 - SEQ ID NO: 151

AT GT ACATTT AT GATT CCGTT AATGGT AT C ACCCT CGT CAAAG ATTT ACCT ACCC ACG AAAACTTTT CT ATTT CCT

AT C A AGT AAT G CG G G GT A AAG GTTT C AGT G CTTT AATT G CC AAT G G AGT CG CC AC A AGG GT AG AAA AT AT CTT

CGATCCCTTTGACAAATTAGAAGATTACGAAGAACTTTTTCCCATCCTTCCCCAACC CACAAGCATTAAAACTTG

GCAATCTAACACAAGTTTTGCCTACCAAAGATTAGCGGGAGCAAATCCCATGGTAAT CCGCGGGATTAGCAGC

TTACCAAATAATTTTCCCGTCAGCGATGCTATCTTCCAAAAAGCTATGGGACCCGAT AAAACCATTGCCTCGGA

AGCCGCTAAGGGTAACTTATTTCTAGCAGATTATGCCCCCCTACACCACCTAACTTT AGGCAGTTATCAAAGGG

GTATGAAAACTGTGACAGCACCTCTTGTCCTTTTCTGTTGGCGTGCTAGAGGTTTAC GGGGTCAAGGGGGATT

AGT ACCAGTTGCCATT C AATT GT AT CAGGATCCG ACCCT ACCT AAT C AGCGCAT CT AT ACCCCCG AT G ACGG AC

TTAATTGGTTAATGGCGAAAATTTTCGTGCAAATTGCCGACGGAAATCACCATGAAT TAGTTAGTCACCTCAGC

CATACCCATTTAGTAGCGGAAGCTTTTGTTTTAGCCACAGCTACCGAGTTAGCACTT AATCATCCTCTGGCAATT

CTATTAAGACCTCATTTTCAATTTACCCTCGCTATTAATAGTTTAGCCGAGAGCGAG TTAATTAACCCCGGCGG

ATTCGTTGATCGTCTATTAGCGGGGACCCTAGAAGCATCTATCGAGCTAATTAAGAG TTCCTATCGTCAAAGAT

TGGATAATTTCGCCGATTATGCCCTACCAAAGCAATTAGAATTGCGCCAGGTCCAGG ATACCTCCCTACTACCA

GATTACCCCTACCGAGACGATGCTCTCTTACTTTGGCAAGCAACGGAAACCTACGTC AAAGATTACCTAAGTCT

TTACTATACTTCCGACGCGGACGTAAACGAGGATACAGAATTACAAGCTTGGGCGCG AAAATTGATGTCATCT

GAAGGTGGAGGCATCAAAAAATTAGTTTCTGACGGAGAATTAGACACTTTAGCCAAA TTAGTCGAAGTTGTCA

CCCAGATAATTTTTGTGGCCGGACCACAACACGCGGCGGTTAATTATCCTCAATACG ATTATCTCGCCTTTAGC

CCCAATATTCCCCTAGCGGGTTATCAATCCCCTCCCAAAGCAGCTGAGGAAGTGGAT ATAGATTATATTCTCCG

TCTTTTGCCGCCCCAGGCCCAGGCCGCTTATCAATTGGAAATTATGCAGACTTTAAC AGCTTTTCAATTTAACCG

TTTTGGTTATCCATCCCGAAGTGCTTTCCCAGATCAACGTGCTTACCCGATTTTGGC GGTTTTCCAAGCTAAATT

AAAAGCGATCGAAAATGAGATCGATCGGCGCAATTTAACCCGATTTACGCCTTATAT TTTCCTGAAACCCTCTC

G CAT ACCCAAT AGT AT C AAT ATTT AG

Amino acid Sequence for BAG04096.1 - SEQ ID NO: 152

MYIYDSVNGITLVKDLPTHENFSISYQVMRGKGFSALIANGVATRVEN IFDPFDKLEDYEELFPILPQPTSIKTWQSN

TSFAYQRLAGANPMVIRGISSLPN NFPVSDAIFQKAMGPDKTIASEAAKGNLFLADYAPLHHLTLGSYQRGM KTVT

APLVLFCWRARGLRGQGGLVPVAIQLYQDPTLPNQRIYTPDDGLNWLMAKI FVQIADGNHH ELVSHLSHTHLVAE

AFVLATATELALN HPLAILLRPH FQFTLAI NSLAESELI NPGGFVDRLLAGTLEASIELIKSSYRQRLDN FADYALPKQLE

LRQVQDTSLLPDYPYRDDALLLWQATETYVKDYLSLYYTSDADVN EDTELQAWARKLMSSEGGGI KKLVSDGELDT

LAKLVEVVTQII FVAGPQHAAVNYPQYDYLAFSPN IPLAGYQSPPKAAEEVDIDYILRLLPPQAQAAYQLEIMQTLTA

FQFNRFGYPSRSAFPDQRAYPI LAVFQAKLKAIENEIDRRNLTRFTPYI FLKPSRIPNSIN I

Coding sequence for WP_002786802.1 - SEQ ID NO: 153

ATGCTGACACCATCGCTACCCCAAAATGATCCTGATCCAGCCAAAAGACAAGAGCTA TTAAGACGACAAAAAC

AAGT GT ACAT CT AT G ATTCCGTT AATGGT AT C ACCCT CGT C AAAG ATTT ACCT ACCC ACG AAAACTTTT CT ATTT

CCTATCAAGTAATGCGGGGTAAAGGTTTCAGTGCTTTAATTGCCAATGGCGTGGCCA CGAGAATAGAAAACG

TCTTTGATCCCTTCGACAAATTAGAAGATTACGAAGAACTTTTTCCCATCCTTCCCC AACCCACAAGCATTAAAA

CTTGGCAATCTAACACAAGTTTTGCCTACCAAAGATTAGCGGGAGCAAATCCCATGG TCATCCGCGGGATTAG

CAGCTTACCGGATAATTTTCCCGTCAGCGATGCTATCTTCCAAAAAGCCATGGGACC TGATAAAACCATTGCCT

CGGAAGCCGCTAAGGGTAACTTATTTCTAGCAGATTATGCCCCCCTAAACAACCTAA CTTTAGGCAGTTATCAA

CGGGGGATGAAAACTGTAACAGCACCTCTTGTCCTTTTCTGTTGGCGTGCTAGAGGT TTACGGGGTCAAGGG

GGATTAGTACCAGTTGCCATTCAATTGTATCAGGAGCCGACCCTACCTAATCAGCGC ATCTATACCCCCGACGA

CGGACTTAATTGGTTAATGGCGAAGATTTTCGTGCAAATTGCCGATGGAAACCACCA TGAATTAGTTAGTCAC

CTCAGCCATACCCATTTAGTAGCGGAAGCTTTTGTTTTAGCCACAGCTACCGAGTTA GCACTTAATCATCCTCTG

GCAATTCTATTAAGACCTCATTTTCAATTTACCCTCGCTATTAATAGTTTAGCCGAG AGCGAGTTAATTAACCCC

GGCGGATTCGTTGATCGTCTATTAGCGGGGACGCTAGAAGCATCGATCGAGCTAATT AAGAGTTCCTATCGTC

AAAGATTAGATAATTTCGCCGATTATGCCCTACCAAAGCAATTAGAATTGCGCCAAG TCCAGGATACCTCGCT

ACTACCAGATTACCCCTACCGAGACGATGCTCTCTTACTTTGGCAAGCAACGGAAAC CTACGTCAAAGATTACC

TAAGTCTTTACTATACTTCCGATGCGGACGTAAATGAGGATACAGAATTACAAGCTT GGGTGCGAACATTGAT

GTCACCTGAAGGTGGAGGCATCAAAAAATTAGTTTCTGAAGGAGAATTAGACACTTT GGCCAAATTAATCGAA

GTTGTCACCCAGATAATTTTTGTGGCCGGACCACAACACGCGGCGGTTAATTATCCT CAATACGATTATCTCGC

CTTTTGCCCGAATATTCCCCTAGCGGGTTATCAATCTCCTCCCAAAGCAGCTGAGCA GGTGGATATAGATTATA

TTCTCCGTCTTTTGCCCCCCCAGGCCCAGGCCGCTTATCAATTGGAAATTATGCAGA CTTTAACAGCTTTTCAAT

TT AACCGTTTTGGCT AT CCAT CCCG AAGT GCTTT CCCAG ATCAACGCACTT ACCCG ATTTT GGCGGTTTTCCAAG

CTAAATTAAAGGCGATCGAAAATCAGATCGATCGGCGCAATTTAACCCGATTTACGC CTTATATTTTCCTGAAA

CCCT CT CGCATCCCCAAT AGT AT CAAT ATTT AG

Amino acid Sequence for WP_002786802.1 - SEQ ID NO: 154

MLTPSLPQNDPDPAKRQELLRRQKQVYIYDSVNGITLVKDLPTH EN FSISYQVM RGKGFSALIANGVATRIENVFDP

FDKLEDYEELFPI LPQPTSIKTWQSNTSFAYQRLAGANPMVIRGISSLPDN FPVSDAIFQKAMGPDKTIASEAAKGN L

FLADYAPLN NLTLGSYQRGMKTVTAPLVLFCWRARGLRGQGGLVPVAIQLYQEPTLPNQRIYTPDDGLN WLMAKI

FVQIADGN HH ELVSHLSHTHLVAEAFVLATATELALN HPLAILLRPHFQFTLAI NSLAESELINPGGFVDRLLAGTLEA

SIELI KSSYRQRLDN FADYALPKQLELRQVQDTSLLPDYPYRDDALLLWQATETYVKDYLSLYYTSDADVN EDTELQA

WVRTLMSPEGGGIKKLVSEGELDTLAKLI EVVTQI IFVAGPQHAAVNYPQYDYLAFCPNIPLAGYQSPPKAAEQVDI

DYILRLLPPQAQAAYQLEI MQTLTAFQFNRFGYPSRSAFPDQRTYPI LAVFQAKLKAI ENQIDRRNLTRFTPYI FLKPS

RIPNSI NI

Coding sequence for WP_002800102.1 - SEQ ID NO: 155

ATGATACCATCGCTACCCCAAAATGATGCTGATTCTATCAAACGACAAGAATTACTA CAAAGACAAAAACAAG

TCTACATCTATGATTCCGTTAGTGGTATCACCCTCGTCAAAGATTTACCTGCCCAAG AAAATTTCTCTATTTCCT

ATCAATTAATGCTGCGTAAAGGCTTGAGTGCTTTAATTGCCAATAGCGTGGCCACGA AAATAGAAAATGTCTT

T G ATCCCTTT G AC AAATT AG AAG ATT ACG AACAACTTTTTCCT CT CCTTCCC AAACCCACAAGT ATT AAAACTT G

GCAATCTAACACAAGTTTTGCCTACCAAAGATTAGCGGGATTAAATCCCATGGTCAT CCGCGGGATTAGCAGC

ATACCGGATAATTTCCCCGTCAGCGATGCTATCTTCCAAAAAGCCATGGGACCCGAT AAAACCATTGCCTCGG

A AG CCG CT AAG G GT A ACTT ATTT CT AG C AG ATT ATG CCCCCCT AA AC AACCT A ACTTT AG G C AGTT AT C AA AG G

GGTATGAAAACCGCAACAGCACCTCTTGTCCTTTTCTGTTGGCGTGCTAGGGGTTTA CGGGGTCAAGGGGGAT

TAGTACCGGTTGCCATTCAATTGTATCAGGATCCGACCGTACCTAATCAGCGCATCT ATACCCCCGATGACGGA

CTTAATTGGTTAATGGCGAAGATTTTCGTGCAAATTGCCGACGGAAATCACCATGAA TTAGTTAGTCATCTCAG

CCATACCCATTTAGTAGCGGAAGCTTTTGTTTTAGCCACAGCTACCGAGTTAGCACT CAATCATCCTCTGGCAA

TTCTATTAAAACCTCATTTTCAATTTACCCTCGCTATTAATACTTTAGCCGAGAGCG AGTTAATTAGCCCAGGCG

GATTCGTTGATCGTCTATTAGCGGGGACGCTAGAAGCATCTATCGAGATAATTAAGA CTTCCTATCGTCAAAG

ATTGGATAATTTCGCCGATTATACCCTACCCAAGCAATTAGCCTTCCGCCAAGTCGA TGATACCTCCCGACTAC

CAGATTACCCCTACCGGGACGATGCTCTCTTACTTTGGCAAGCAACGGAAACCTACG TCAAAGATTACCTAAG

TCTTTACTATACTTCCGACGCGGATGTAAACGAGGATACAGAATTACAAGCTTGGGT GCGAAAATTGATGTCA

CCT G AAGGT GG AGGC ATT AAAAAATT AGTTT CT G ACGG AAAATT AG ACACTTT AGCCAAATT AAT CG AAGTT G

TCACCCAGATAATTTTTATTGCTGGACCACAACACGCGGCGGTTAATTATTCTCAAT ACGATTATCTCGCCTTTT

GCGCGAATATTCCCCTAGCCGGTTATCAATCTCCTCCCAAAGCATCTGAGGAGGTGG ATATGGATTATATTCTC

CGT CTTTT GCCCCCCCAGGCCCAGG CC ACTT AT C AATT G G AAATT AT G C AC ACTTT AAC AG CTTTT C AATT C A AC

CGTTTTGGTTATCCATCCCGAAATGATTTCCCAGATCAACGCACTTACCCGATTTTG GCGGTTTTCCAAGCTAAA

TT AAAAGCG AT CG AAAAT G AG ATCG AT CGGCGC AATT CAACCCG AATT ACGCCTT AT ATTTT CCT G AAACCCT C

TCGCATCCCCAAT AGT AT CAAT ATTT AG

Amino acid Sequence for WP_002800102.1 - SEQ ID NO: 156

MI PSLPQNDADSIKRQELLQRQKQVYIYDSVSGITLVKDLPAQEN FSISYQLMLRKGLSALIANSVATKIENVFDPFDK

LEDYEQLFPLLPKPTSIKTWQSNTSFAYQRLAGLNPMVI RGISSIPDNFPVSDAIFQKAMGPDKTIASEAAKGN LFLA

DYAPLN N LTLGSYQRGMKTATAPLVLFCWRARGLRGQGGLVPVAIQLYQDPTVPNQRIYTPDDGLNW LMAKIFV

QIADGNH HELVSHLSHTH LVAEAFVLATATELALN HPLAILLKPH FQFTLAINTLAESELISPGGFVDRLLAGTLEASIEI

IKTSYRQRLDN FADYTLPKQLAFRQVDDTSRLPDYPYRDDALLLWQATETYVKDYLSLYYTSDADVN EDTELQAWV

RKLMSPEGGGI KKLVSDGKLDTLAKLI EVVTQI IFIAGPQHAAVNYSQYDYLAFCAN IPLAGYQSPPKASEEVDMDYI

LRLLPPQAQATYQLEIM HTLTAFQFN RFGYPSRNDFPDQRTYPILAVFQAKLKAI EN EI DRRNSTRITPYIFLKPSRIPN

SIN I

Coding sequence for WP_002793167.1 - SEQ ID NO: 157

ATGATACCATCGCTACCCCAAAATGATGCTGATTCTATCAAACGACAAGAATTACTA CAAAGACAAAAACAAG TGTACATCTATGATTATGTTAGTGGTATCACCCTCGTCAAAGATTTACCTGCCCAAGAAA ATTTCTCTATTTCCT ATCAATTAATGCTGCGTAAAGGCTTGAGTGCTTTAATTGCCAATGGCGTGGCCACGAGAA TAGAAAATGTCTT T G ATCCCTTT G AC AAATT AG AAG ATT ACG AACAACTTTTTCCT AT CCTTCCC AAACCCACAAGT ATT AAAACTT G GCAATCTAACACAGGTTTTGCCTACCAAAGATTAGCGGGAACAAATCCAATGGTCATCCG CGGGATTAGCAGC TTACCAGATAATTTCCCCGTCAGCGATGCTATCTTCCAAAAAGCGATGGGACCGGATAAA ACCATTGCCTCGG A AG CCG CT AAG G GT A ACTT ATTT CT AG C AG ATT ATG CCCCCCT AA AC AACCT A ACTTT AG G C AGTT ATC AACG G GGGATGAAAACTGTGACAGCACCTCTTGTCCTTTTCTGTTGGCGTGCTAGAGGTTTACGC GGTCAAGGGGGAT TAGTACCAGTTGCCATTCAATTATATCAAGATCCGACCCTACCTAATCAGCGCATCTATA CCCCCGACGACGGA

CTTAATTGGTTAATGGCGAAGATTTTCGTGCAAATTGCCGACGGAAATCATCATGAA TTAGTTAGTCACCTCAG

CCATACCCATTTAGTAGCGGAAGCTTTTGTTTTAGCCACAGCTACCGAGTTAGCACT CAATCATCCTCTGGCAA

TTCTATTAAAACCTCATTTTCAATTTACCCTCGCTATTAATACTTTAGCCGAGAACG AGTTAATTAGCCCAGGCG

GATTCGTTGATCGTCTATTAGCGGGGACGCTAGAAGCATCTATCGAGATAATTAAGA CTTCCTATCGTCAAAG

ATTGGATAATTTCGCCGATTATACCCTACCCAAGCAATTAGCCTTCCGCCAAGTCGA TGACACCTCCCTACTACC

AGATTACCCCTACCGAGACGATGCTCTCTTACTTTGGGAAGCAACGGAAACCTACGT CAAAGATTACCTAAGT

CTTTACTATACTTCCGACGCGGATGTAAACGAGGATACAGAATTACAAGCTTGGGTG CGAAAATTGATGTCAC

CTGAAGGTGGAGGCATTAAAAAATTAGTTTCTGACGGAAAATTAGACACTTTAGCCA AATTAATCGAAGTTGT

CACCCAG ATAATTTTT ATT GCTGG ACCACAACACGCGGCGGTTAATT ATTCT CAAT ACGATT AT CT CGCCTTTT G

CGCGAATATTCCCCTAGCCGGTTATCAATCTCCTCCTAAAGCAGCTGAGGAGGTGGA TATGGATTATATTCTCC

GTCTTTTGCCCCCCCAGGCCCAGGCCACTTATCAATTGGAAATTATGCACACTTTAA CAGCTTTTCAATTCAACC

GTTTTGGTTATCCATCCCGAAATGATTTCCCAGATCAACGCACTTACCCGATTTTGG CGGTTTTCCAAGCTAAAT

TAAAAGCGATCGAAAATGAGATCGATCGGCGCAATTCAACCCGAATTACGCCTTATA TTTTCCTGAAACCCTCT

CG C AT CCCC AAT AGT ATT A AT ATTT A A

Amino acid Sequence for WP_002793167.1 - SEQ ID NO: 158

MI PSLPQNDADSIKRQELLQRQKQVYIYDYVSGITLVKDLPAQENFSISYQLM LRKGLSALIANGVATRI ENVFDPFD

KLEDYEQLFPI LPKPTSIKTWQSNTGFAYQRLAGTNPMVI RGISSLPDNFPVSDAIFQKAMGPDKTIASEAAKGN LFL

ADYAPLN NLTLGSYQRGM KTVTAPLVLFCWRARGLRGQGGLVPVAIQLYQDPTLPNQRIYTPDDGLNWLMAKIF

VQIADGN H HELVSHLSHTH LVAEAFVLATATELALN HPLAILLKPHFQFTLAINTLAEN ELISPGGFVDRLLAGTLEASI

EI IKTSYRQRLDN FADYTLPKQLAFRQVDDTSLLPDYPYRDDALLLWEATETYVKDYLSLYYTSDADVN EDTELQAW

VRKLMSPEGGGI KKLVSDGKLDTLAKLI EVVTQII FIAGPQHAAVNYSQYDYLAFCAN IPLAGYQSPPKAAEEVDMD

YI LRLLPPQAQATYQLEIM HTLTAFQFNRFGYPSRN DFPDQRTYPILAVFQAKLKAIENEIDRRNSTRITPYI FLKPSRIP

NSI N I

Coding sequence for WP_061431977.1 - SEQ ID NO: 159

ATGATGATACCATCGCTCCCAAAAAATGATCCTGATCCAGTCAAAAGACAAGATCTA TTAAGACGACAAAAAC

AAGT GT ACATTT AT G ATTCCGTT AATGGT AT C ACCCT CGT C AAAG ATTT ACCT ACCC ACG AAAACTTTT CT ATTT

CCTATCAAGTAATGCGGGGTAAAGGTTTCAGTGCTTTAATTGCCAATGGAGTCGCCA CAAGGGTAGAAAATAT

CTTCGATCCCTTTGACAAATTAGAAGATTACGAACAACTTTTTCCTATCCTTCCCCA ACCCACAAGCATTAAAAC

TTGGCAATCTAACACAAGTTTTGCCTACCAAAGATTAGCGGGAGCAAATCCCATGGT AATCCGCGGGATTAGC

AGCTTACCAAATAATTTTCCCGTCAGCGATGCCATCTTCCAAAAAGCCATGGGACCC GATAAAACCATTGCCTC

GGAAGCCGCTAAGGGTAACTTATTTCTAGCAGATTATGCCCCCCTACACCACCTAAC TTTAGGCAGTTATCAAA

GGGGGATGAAAACTGTGACAGCACCTCTTGTCCTTTTCTGTTGGCGTGCTAGAGGTT TACGGGGTCAAGGGG

GATT AGT ACC AGTT GCCATT C AATT GT AT CAGG AT CCG ACCCT ACCT AAT CAGCGCAT CT AT ACCCCCG AT G AC

GG ACTT AATTGGTT AATGGCG AAAATTTTCGT GCAAATTGCT G ACGG AAAT CACCAT G AATT AGTTAGT CACCT

CACCCATACCCATTTAGTAGCGGAAGCTTTTGTTTTAGCCACAGCTACCGAGTTAGC ACTTAATCATCCTCTGG

CAATTCTATTAAGACCTCATTTTCAATTTACCCTCGCTATTAATAGTTTAGCCGAGA GCGAGTTAATTAACCCCG

GCGGATTCGTTGATCGTCTATTAGCGGGGACGCTAGAAGCATCTATCGAGCTAATTA AGAGTTCCTATCGTCA

AAGATTGGATAATTTCGCCGATTATACCCTACCAAAGGAATTAGAATTGCGCCAAGT CCAGGATACCTCGCTA

CTACCAGATTACCCCTACCGAGACGATGCTCTCTTACTTTGGCAAGCAACGGAAACC TACGTCAAAGATTACCT

AAGTCTTTACTATACTTCCGACGCGGACGTAAACGAGGATACAGAATTACAAGCTTG GGTGCGAACATTGATG

TCACCTGAAGGTGGAGGCATCAAAAAATTAGTTTCTGACGGAGAATTAGACACTTTA GCCAAATTAGTCGAAG TTGTCACCCAGATAATTTTTGTGGCCGGACCACAACACGCGGCGGTTAATTATCCTCAAT ACGATTATCTCGCC TTTAGCCCCAATATTCCCCTAGCGGGTTATCAATCCCCTCCCAAAGCAGCTGAGGAAGTG GATATAGATTATAT TCTCCGTCTTTTGCCGCCCCAGGCCCAGGCCGCTTATCAATTGGAAATTATGCAGACTTT AACAGCTTTTCAATT TAACCGTTTTGGTTATCCATCCCGAAGTGCTTTCCCAGATCAACGTGCTTACCCGATTTT GGCAGTTTTCCAAGC TAAATTAAAAGCGATCGAAAATGAGATCGATCGGCGCAATTTAACCCGATTTACGCCTTA TATTTTCCTGAAAC CCT CTCGCAT ACCC AAT AGT AT C AAT ATTT G A

Amino acid Sequence for WP_061431977.1 - SEQ ID NO: 160

MM IPSLPKNDPDPVKRQDLLRRQKQVYIYDSVNGITLVKDLPTHENFSISYQVMRGKGFSAL IANGVATRVENI FD

PFDKLEDYEQLFPI LPQPTSIKTWQSNTSFAYQRLAGANPMVI RGISSLPNN FPVSDAI FQKAMGPDKTIASEAAKG

N LFLADYAPLHH LTLGSYQRGM KTVTAPLVLFCWRARGLRGQGGLVPVAIQLYQDPTLPNQRIYTPDDGLNWLM

AKIFVQIADG NH HELVSHLTHTH LVAEAFVLATATELALN HPLAILLRPHFQFTLAI NSLAESELIN PGGFVDRLLAGTL

EASIELI KSSYRQRLDN FADYTLPKELELRQVQDTSLLPDYPYRDDALLLWQATETYVKDYLSLYYTSDADVNEDTE LQ

AWVRTLMSPEGGGIKKLVSDGELDTLAKLVEVVTQI IFVAGPQHAAVNYPQYDYLAFSPNIPLAGYQSPPKAAEEV

DI DYILRLLPPQAQAAYQLEIMQTLTAFQFN RFGYPSRSAFPDQRAYPI LAVFQAKLKAI ENEI DRRN LTRFTPYIFLKP

SRIPNSIN I

Coding sequence for OUS02327.1 - SEQ ID NO: 161

ATGGTCGGTCACGATGGGCCGAAATACGCACACGAATCAAATCAACCTTCATTGCCA CAAAACGATACCCCAG

CAGAGCAAGAGGCTCGCCGTACTGCATTGGGATTAACTCAAGAAAAATACCATTTGA GCAACGACAATGACCT

GGGCTTACCGCTACTGAAGGAAGTCCCAGCAGAGGAAGCCTTCAGCAATATTTACGA AGCCGGTCGCGCAAT

T G ACACTTTT CCCTT GTT AG AG AACC AT G AC AAGGT AAT GTCGCAGCT AACAAATCCCT ATGGT CCCTT CACAG

GATTGGCTGATTACGAAAGTATGTTTATTGATATCCCAAAGCCGGCTGTTACCAAAA ATTGGTTAACAGACGA

AAGTTTTGGTGAGCAGCGCCTTTCTGGTGTTAATCCCGTAATGATAGAGCGCGTGAA AAATGCAAAAGATTTG

GCCTCCAAGTTT AAT GT CAGCCAATT G AAAG AT GT CTT GG AT AGCG ACAT AAACTT GG AT G AACT CAT AAAAG

ATGAGCTATTGTACATTACGGACCTATCCCCCTATCTAAAGGATATTCCTGAAGGTA AAGTACCCTCCCCGGGC

G G CT AC ATT CC AA AAT ATTT ACC AA AACCC ATCG GTTT ATTTT ACT G G CAT A A AG AT G GTG C AAA ATT A AAG G A

CCCCTCTTTAAAATCGGGCCGATTGTTACCTCTCGCCATTCAGGTTGACCTTGAAGG TGACCAAGTAAAAATAC

TTACGCCAAAAAGCCCAGAGTTACTTTGGACAATTGCCAAAATGTGCTTCTCTATTGCCGATGTCAATGTCCAT

GAAATGTCGACTCACTTAGGGCGGGCACATTTTGCCCAAGAATCCTTTGGAGCGATT ACCCCCTGTCAACTAG

CGCCTAAACACCCACTAGCAATTTTACTAAAACCCCATCTGCGTTTTCTGGTGGCTA ATAATCAAGCCGGTATT

GAAAAACTTGTGAACACAGGTGGCCCCGTAGACATGCTGTTAGCTTCAACCCTACAG GGGTCGCTAGATATAA

GTACTACTGCGGCGAAATCTTGGTCAGTGACAGAAACATTCCCCGAATCAATACAAG CAAGAAATGTTGCTTC

AGAGGAATCGTTACCCCATTACCCTTATCGGGACGATGGTATTTTGATATGGGATGC TGTGGTTGGTTACGTT

AACGAATACGTCAATATCTATTATAAAAATGAAGAAGATGTAGTGAAGGATTATGAA TTGCAGGCATGGGCT

AAAAACTTAGCAGATACCGGCGTCCACGGTGGAAACATCAAAGATATGCCGAGCCAG ATAGAGAGTATCAAA

CAACTATCACAACTCCTTTCTGTCATCATTTTCCATAATAGTGCCGGACATAGTTCTATCAATTACCCACAATATC

CCTGTATAGGTTTTTGCCCTAATATGCCTTTAGCGGGTTATAGCAATTACCGTGAAT TCCTGGCTAAGGAGAAA

ACAACACAAGAGGAGCAGCTCACCTTTTTACTAAGCTTCGCACCACCCCAAGCATTA GCCTTAGGGCAGATCG

ATATCACAAACTCTCTGTCCATTTATCATTATGATACTTTGGGCGATTATGCAAAAG AGTTAACCGACCCTTTGG

CAAAACACGCTCTATACTGTTTCACTCAAAAATTGACAGCTATTGAACAACAGATTGAGGTCAGAAACAGTCA

ACGGGCCGAGCCTTATAAGTACATGTTGCCGTCTGAAATTTTGAATAGCGCCAGCAT TTAA

Amino acid Sequence for OUS02327.1 - SEQ ID NO: 162

MVGH DGPKYAHESNQPSLPQN DTPAEQEARRTALGLTQEKYHLSNDNDLGLPLLKEVPAEEAFSN IYEAGRAI DTF

PLLENH DKVMSQLTNPYGPFTGLADYESM FIDIPKPAVTKNWLTDESFGEQRLSGVNPVM IERVKNAKDLASKFN

VSQLKDVLDSDI NLDELI KDELLYITDLSPYLKDIPEGKVPSPGGYIPKYLPKPIGLFYWH KDGAKLKDPSLKSGRLLPLA

IQVDLEGDQVKI LTPKSPELLWTIAKMCFSIADVNVH EMSTHLGRAH FAQESFGAITPCQLAPKH PLAI LLKPHLRFL

VAN NQAGI EKLVNTGGPVDMLLASTLQGSLDISTTAAKSWSVTETFPESIQARNVASEESLPHYPYRD DGILIWDAV

VGYVN EYVNIYYKN EEDVVKDYELQAWAKN LADTGVHGGN IKDMPSQIESIKQLSQLLSVIIFH NSAGHSSI NYPQY

PCIGFCPN MPLAGYSNYREFLAKEKTTQEEQLTFLLSFAPPQALALGQIDITNSLSIYHYDTLGDYAK ELTDPLAKHAL

YCFTQKLTAI EQQIEVRNSQRAEPYKYMLPSEILNSASI

Coding sequence for WP_106300061.1 - SEQ ID NO: 163

ATGCTCCAACCGAGTTTGCCCCAAGACGATACCCTCGATCGACAGCAGCAGCGAAAT CAGGCGATCGCGCAG

CAGCGAGAAGATTATCAATATAGCCAGACAGCCGGGATCCTGCTAATTAAAGAGTTG CCCCAGTCGGAAATG

TTTTCACTCAAATACTTATTGGAGCGAGATGCTGGGTTAGTATCTTTAATTGCAAATACTTTGGCAAGCAGTAT

CGAAAATGTCTTCGATCCCTTCGATAAATTAGAAGATTATCAGGAGATGTTTCCACT GTTACCCAAACCCTCGG

TCTGGGAAACATTCCGCAATGATGCTGTTTTTGCCCGTCAGCGTATTGCTGGTGCCA ACCCGATGGTAATCGA

GCGTGTAATTGACAAGTTGCCCGATAACTTTCCAGTTACAGATGCCATATTCCAAAAAATCATGTTAACTAAAA

AAACTCTGGCAGAGGCAATTGCTGAGGGAAGAATCTTCCTCACCAATTATCAAGGGC TGGATGGACTCAAGC

CAGGAGGCTACCAATACGAACGGGATGGACAACAAGTTAAAGTAACAAAAACTATTGCCGCGCCCTTAGTAT

TGTACTGCTGGAAACCCACAGGTTATGGAGATTATCGTGGTAATTTAGCACCGATCG CCATTCAAATCAATCA

GCAACCCGATCCGATCGCCAATCCAATTTATACCCCAAGAGACGGAAGGCATTGGTT GATGGCAAAAATCTTT

GCTCAGATGGCTGATGGAAACTATCACGAAGCTATCAGTCATCTAGGCCGAACTCAT TTGGTATTAGAACCTTT

TGTGTTAGCAACCGCCAATGAATTAGCCCCAAATCATCCCCTTTCAGTTCTGCTCAAACCCCATTTTCAATTTAC

CCTAGCAATCAACGAACTAGCCCGAGAACAATTGATTAGCCCAGGCGGTTATGCAGA CGATTTGCTAGCCGG

AACTCTAGAAGCCTCGATCGGTGTAATTAAAGCAGCCATCAAAGAATACCTAGAAAA CTTCACTGAGTTTGCC

ATACCTAAAGAACTCACCCGGCGAGGAGTAGGGGAAACCGATGTGGATGGATCGGGA GAAAATTTTTTGCCA

GACTACCCCTATAGAGATGATGCTCTACTATTGTGGAACGCAATTAAAGTTTACGTC AGTGATTATCTAAACCT

CTACTACACGTCTTCAGCCAAGATTATTGGCGATCCGGAACTACAGAATTGGGCGAA AAAGCTGATTTCTCCA

GAGGGGGGTAATGTCACGGGTTTAGTTCCCAATGGTCAACTGACAACGCTAGAACAA CTTGTCGAGATCGTC

ACCCAATTAATTTTTGTCAGTGGCCCTCAACATGGTGCGGTGAACTATCCTCAGTATGACTATATGGCATTTGT

ACCCAATATCCCGCTGGCTACCTATGGAAATCCGCCCAGCCGCGATGTGGAAATTAA TGAGGAGACCATTTTA

AATATTCTGCCACCACAAAAGTTGGCAGCCAAGCAACTGGAATTGATGAGAACTCTC TCTGTTTTCCGGGCAA

ATCGTTTAGGGTATCCAGATCGAGAATTCGTCGATGTTCGCGCTCGGGGAGTGTTGC AGAAATTTCAAGCAAG

ATTGCAAGAAATCGAACAAGAAATTTCGGTACGGAATGAAACTCGACTCGAACCATA TCTATTTCTCTTGCCCT

CCAATGTGCCAAATAGTTTAAATATTTAA

Amino acid Sequence for WP_106300061.1 - SEQ ID NO: 164

MLQPSLPQDDTLDRQQQRNQAIAQQREDYQYSQTAGILLIKELPQSEMFSLKYLLER DAGLVSLIANTLASSIENVF

DPFDKLEDYQEM FPLLPKPSVWETFRNDAVFARQRIAGANPMVI ERVIDKLPDN FPVTDAIFQKI MLTKKTLAEAIA

EGRI FLTNYQGLDGLKPGGYQYERDGQQVKVTKTIAAPLVLYCWKPTGYGDYRGN LAPIAIQINQQPDPIAN PIYTP

RDGRHWLMAKIFAQMADGNYH EAISHLGRTH LVLEPFVLATANELAPNHPLSVLLKPHFQFTLAI NELAREQLISP

GGYADDLLAGTLEASIGVI KAAI KEYLENFTEFAI PKELTRRGVGETDVDGSGENFLPDYPYRDDALLLWNAI KVYVS

DYLNLYYTSSAKI IGDPELQNWAKKLISPEGGNVTGLVPNGQLTTLEQLVEIVTQLI FVSGPQHGAVNYPQYDYMAF

VPN IPLATYGN PPSRDVEIN EETILNI LPPQKLAAKQLELMRTLSVFRAN RLGYPDREFVDVRARGVLQKFQARLQEI

EQEISVRN ETRLEPYLFLLPSNVPNSLNI

Coding sequence for WP_099065794.1 - SEQ ID NO: 165

ATGACACAGCCAAGTTTGCCCCAAGATGATAGCCCTGAGCAACAGTTACAGCGAAAG CAAGAGATTGCACGT

CAACGGGAAGATTATCAATATAGCGAAACAGCGGGAATACTTTTGATTAAAGAATTG CCACAGTCAGAAATGT

TTTCATTTAAATATTTACTGGAGCGAGATAAAAGTTTAATATCATTAATCGCCAATACTTTGGCAACTAATATTG

ATAATGTTTTCGATCCCTTCGATAGTTTAGAAGACTATCAACAGATGTTTCCACTGC TGCCCAAACCTTCGACAT

TGCAAACATTCCGCAACGATGGTGTTTTTGCTCGTCAGCGCATTGCTGGTGCTAACC CGATGGTAATTGAACG

GGTAGTGGGAAAATTACCCGATAACTTCGCAGTTACAGATGCCATCTTTCAAAAAATTATGCTAACTCAAAAG

ACGTTAGCACAGGCGATCGCAGAGGGCAGAATTTTCATCACCAATTATCAGGGGCTT GATGGACTCACTCCAG

GAACCTACGAACAAGGAACAAAAACCATTGCTGCTCCCTTGGTGTTGTACTGCTGGA AACCCGTAGGTTATGG

AGATTATCGCGGAAGTTTGACTCCAATTGCCATTCAACTCAATCAGCAACCCCATCCAGAAAACAATCCAATTT

ATACACCAATGGATGGAATGCATTGGTTTATGGCAAAAATCTATGCTCAGATGGCTG ATGGCAACTATCATGA

AGCTATCAGCCATCTGGGACGAACTCATTTGGTATTAGAGCCATTTGTCTTAGCAAC TGCCAATGAACTAGCAC

CTAATCATCCTCTTTCAGTGTTGCTAAAACCCCATTTTCAATTCACCCTAGCAATCA ATGAACTGGCACGGGAAC

AATTGATCAGCCCAGGTGGCTACGCAGATACCTTGCTAGCTGGAACCCTGGAAGCCT CCATCAGCGTTATTAA

AGCAGCTATTAAAGAATATCTGGAAAACTTCAGTGACTTTGCCTTGCCCAAGGAATTAACTAGGCGAGGAGTG

GGGGAAACCGATGTGGATGGACAGGGAGAAAACTTTTTGCCGGACTACCCCTATCGG GATGATGGTTTGCTA

TTGTGGAAAGCAATTGAGGCTTACGTTAGCAATTATTTAGATCTCTATTACACATCTCCAGTCCAGATTATTAA

GGATACAGAACTACAGAATTGGGTGCAAAAGTTAATATCTCCAGAGGGGGGTGGTGT CAAAGGATTAGTGCC

CAATGGTCAATTGCAAACTGTGGAACAGTTAGTGGCCATCGCCACCCAACTAATTTT TATCAGTGGGCCTCAG

CATGGTGCGGTGAACTATCCCCAATACGACTACCTTGCCTTCGTACCCAATATGCCG TTAGCTACTTATGCACC

ACCTCCCAGCCGCGATCGAGAAATTAATGAAGCCACAATCCTGAAGATTCTCCCCCC ACAAAAGCTGGCAGCA

AAGCAATTAGAGTTGATGAGAACTCTCACTGTTTTCCAACCAAATCGCTTGGGCTAT CCAGACAAGAACTTTGT

CGATGTCCGCGCTCAGAATGTTTTGCGGCAATTCCAGGCAAAATTACAAGAAGTTGA GCAAGTGATTAATCAG

CGAAATCAGACCCGCCTTGAACCTTATACCTTTCTTTTACCCTCGAATGTACCTAAT AGCTTAAATATTTAG

Amino acid Sequence for WP_099065794.1 - SEQ ID NO: 166

MTQPSLPQDDSPEQQLQRKQEIARQREDYQYSETAGI LLIKELPQSEM FSFKYLLERDKSLISLIANTLATN IDNVFDP

FDSLEDYQQM FPLLPKPSTLQTFRNDGVFARQRIAGAN PMVI ERVVGKLPDN FAVTDAIFQKIM LTQKTLAQAIAE

GRI FITNYQGLDGLTPGTYEQGTKTIAAPLVLYCWKPVGYGDYRGSLTPIAIQLNQQPHPENN PIYTPMDGM HWF

MAKIYAQMADGNYH EAISHLGRTHLVLEPFVLATANELAPN HPLSVLLKPHFQFTLAIN ELAREQLISPGGYADTLL

AGTLEASISVIKAAI KEYLEN FSDFALPKELTRRGVGETDVDGQGENFLPDYPYRDDGLLLWKAIEAYVSNYLDLYYTS

PVQII KDTELQNWVQKLISPEGGGVKGLVPNGQLQTVEQLVAIATQLI FISGPQHGAVNYPQYDYLAFVPNMPLAT

YAPPPSRDREIN EATILKILPPQKLAAKQLELM RTLTVFQPNRLGYPDKN FVDVRAQNVLRQFQAKLQEVEQVI NQR

NQTRLEPYTFLLPSNVPNSLN I

Coding sequence for WP_012596348.1 - SEQ ID NO: 167

ATGGTACAACCAAGTTTACCCCAAGATGATACCCCCGATCAACAGGAGCAGCGAAAT CGGGCAATCGCACAG

CAACGAGAAGCGTATCAATATAGCGAGACAGCCGGGATACTGTTGATCAAAACCTTG CCTCAGTCGGAAATG

TTTTCATTGAAATACTTGATTGAGCGAGATAAGGGATTAGTGTCCCTAATTGCCAAT ACCTTAGCCAGCAATAT

CGAGAATATCTTCGATCCCTTCGATAAATTAGAAGATTTTGAGGAAATGTTTCCATT GTTACCCAAACCTCTAG

TAATGAACACCTTCCGCAATGATAGGGTGTTTGCTCGTCAGCGTATTGCTGGTCCTA ATCCGATGGTTATTGAG

CGGGTCGTTGACAAATTGCCAGATAACTTCCCTGTGACGGATGCGATGTTTCAAAAAATCATGTTCACGAAAA

AGACTCTAGCAGAGGCAATTGCACAAGGGAAACTCTTTATCACTAATTACAAAGGAT TGGCGGAGCTTTCACC

AGGACGCTATGAATATCAAAAAAATGGAACACTCGTCCAAAAAACCAAAACGATCGC GGCTCCGTTAGTATTA

TACGCCTGGAAACCTGAAGGATTCGGCGATTATCGGGGGAGTTTAGCACCGATCGCCATT CAAATCAATCAGC

AACCTGACCCAATAACCAATCCCATTTATACGCCAAGGGATGGGAAGCATTGGTTTA TAGCAAAAATCTTTGC

CCAGATGGCTGATGGCAATTGTCACGAAGCAATTAGCCACTTAGCACGAACCCATCT GATCTTAGAACCTTTTG

TGCTGGCAACGGCCAATGAACTCGCACCAAATCATCCTTTATCTGTTCTGCTTAAACCCCATTTCCAATTTACCT

TGGCCATTAATGAACTGGCACGAGAACAGTTGATCAGTGCCGGAGGTTATGCCGATG ATCTGCTCGCTGGAA

CCCTTGAAGCCTCTATCGCTGTCATTAAAGCGGCTATCAAGGAATATATGGACAATT TCACTGAGTTTGCTTTG

CCTCGTGAGCTTGCTCGCCGAGGAGTGGGGATAGGGGATGTAGATCAAAGGGGAGAA AACTTCTTGCCGGA

CTACCCCTATCGAGATGACGCGATGCTCTTGTGGAATGCGATCGAGGTTTATGTGAG GGATTATCTCAGTCTTT

ACTATCAATCTCCCGTCCAGATTCGTCAAGATACAGAACTGCAAAATTGGGTTAGGC GACTGGTGTCCCCAGA

AGGGGGTAGGGTCACGGGATTAGTGTCCAATGGGGAACTGAATACAATTGAGGCATT GGTGGCGATCGCAA

CTCAGGTCATTTTTGTCAGTGGTCCTCAGCACGCTGCGGTTAACTATCCCCAATACG ACTATATGGCGTTTATTC

CTAATATGCCCCTAGCTACCTATGCCACTCCCCCTAATAAGGAGAGCAACATTAGTG AAGCAACAATCCTCAAT

ATTCTTCCTCCACAAAAGTTGGCAGCAAGGCAACTGGAGTTGATGAGAACGCTGTGT GTTTTCTATCCCAATCG

TTTAGGATATCCCGACACAGAATTTGTGGATGTTCGGGCTCAGCAGGTGCTGCATCA ATTTCAAGAAAGATTG

CAGGAAATTGAACAAAGGATCGTCCTATGCAATGAAAAACGACTGGAACCCTATACT TACCTCTTACCTTCAAA

CGTCCCTAACAGTACCAGTATTTAA

Amino acid Sequence for WP_012596348.1 - SEQ ID NO: 168

MVQPSLPQDDTPDQQEQRNRAIAQQREAYQYSETAGILLI KTLPQSEM FSLKYLIERDKGLVSLIANTLASNI EN IFD

PFDKLEDFEEM FPLLPKPLVMNTFRN DRVFARQRIAGPNPMVI ERVVDKLPDN FPVTDAM FQKIMFTKKTLAEAIA

QGKLFITNYKGLAELSPGRYEYQKNGTLVQKTKTIAAPLVLYAWKPEGFGDYRGSLA PIAIQINQQPDPITNPIYTPR

DGKHWFIAKIFAQMADGNCH EAISH LARTHLILEPFVLATANELAPNH PLSVLLKPHFQFTLAI NELAREQLISAGGY

ADDLLAGTLEASIAVIKAAI KEYMDN FTEFALPRELARRGVGIGDVDQRGEN FLPDYPYRDDAMLLWNAIEVYVRD

YLSLYYQSPVQI RQDTELQNWVRRLVSPEGGRVTGLVSNGELNTIEALVAIATQVI FVSGPQHAAVNYPQYDYMAF

IPNMPLATYATPPN KESNISEATI LN ILPPQKLAARQLELMRTLCVFYPN RLGYPDTEFVDVRAQQVLHQFQERLQEI

EQRIVLCNEKRLEPYTYLLPSNVPNSTSI

Coding sequence for WP_036533591.1 - SEQ ID NO: 169

ATGCTCCCACCGAGTTTGCCCCAAGATGATACTCCTGATCAGCAGCTACAGCGAAAT CAGGCGATCGCGCAAC

AGCGAGAAGACTATCAATATAGCCAGACTGCGGGAATACTACTAATTAAAACGTTGC CTCAATCGGAAATGTT

TTCATTCAAATATTTGCTAGAGCGCGATAAGGGGCTGGTTTCCTTAATTGTGAATAC CCTAGCAAGCAAAATCG

AGAATATCTTCGATCCCTTCGAGAAATTAGAAGATTATCAGGAGATGTTTCCACTGT TGCCCAAACCCTCAGTT

CTAGAAACCTTCCGACATGATGCTGTCTTTGCCCGTCAACGCATTGCGGGTGCAAAC CCGATGGTCATTGAGC

GCGTAATTAGCAAATTACCGGATAACTTCCCGGTCACAGATGCCATGTTTCAAAAAATTATGTCAACCAAAAA

GACGTTGGCAGAGGCGATCGCTGAAGGGAGACTCTTCCTCACGAACTATAAGGGGCT GGATGGACTGACCCC

AGGACACTACGAAAGAGGAACAAAAACCATTGCAGCTCCCTTAGTCTTGTACTGCTG GAAACCAACAGGTTAT

GGTGATTATCGCGGGAATTTAGCACCGATCGCCATTCAAATTAATCAGAAACCTGAC CCGATAATCAATCCAA

TATATACCCCAAGGGATGGGATGCATTGGTTTATGGCAAAAATCTTTGCCCAGATGG CAGATGGCAACTATCA

CGAAGCGATCAGTCATCTAGGTCGAACGCATCTAGTTTTAGAACCATTTGTGCTGGC CACCGCCAATGAGCTA

GCCCCCAATCATCCTCTTTCCATTCTCCTCAAGCCCCATTTTCAATTCACTCTGGCA ATCAATGAACTAGCACGA

GAACAATTGATCAGCAAAGGTGGCTATGCAGATACGCTGCTCGCGGGCACACTGGAA GCCTCCATCAGCGTC

ATTAAAGCAGCCATCCAGGAATACTTCGAAAACTTTACAGAGTTTGCAGTACCGAAA GAGCTAACCCGGCGAG

GCATTGGGGAAACCGATTTAGATGCACAGGGCGAGAATTTCTTACCCGACTACCCCT ACCGAGATGATGCACT

GTTATTGTGGGATGCAATTAAAAACTACGTAAGGGATTATCTGAATCTCTACTATAC GTCCCAAGACAAAATCC

TCAAGGATACCGAACTAAAGAATTGGGTGAGTAAGCTTATTTCTCCTGAGGGGGGAAATG TCAAAGGATTGG

TTCCCAATGGTGAGCTTACCACCCTAGATCAGTTAGTTGAGATAGCAACGCAGCTAA TTTTTGTCAGTGGCCCA

CAACACGCTGCGGTGAATTATCCCCAATACGACTACATGGCCTTTGTCCCTAACATG CCCCTAGCTACCTATGC

CCCTCCGAGTAGCGATCCGACGATCGATGAAACCACGATTCTGAAAATTCTTCCTCC ACAAAAACTAGCCGCA

AAGCAATTAGAGCTAATGAAAACTCTTTCTGTTTTTCGGGCAAATCGCTTAGGCTAT CCAGACAATGAATTTGT

TGATGTTCGGGCTCAGAATGTATTAATTAAATTTCAGGGAAATTTGAAAAAAGTCGA GGATAAAATTACCGCA

CGGAATGAGACTCGACTTGAGCCGTATGTATTTCTCTTGCCCTCCAACGTACCTAATAGTACAAATATTTAG

Amino acid Sequence for WP_036533591.1 - SEQ ID NO: 170

MLPPSLPQDDTPDQQLQRNQAIAQQREDYQYSQTAGILLIKTLPQSEM FSFKYLLERDKGLVSLIVNTLASKIENI FD

PFEKLEDYQEM FPLLPKPSVLETFRH DAVFARQRIAGAN PMVIERVISKLPDN FPVTDAMFQKI MSTKKTLAEAIAE

GRLFLTNYKGLDGLTPGHYERGTKTIAAPLVLYCWKPTGYGDYRGNLAPIAIQINQK PDPII NPIYTPRDGM HWFM

AKIFAQMADGNYHEAISHLGRTH LVLEPFVLATAN ELAPNHPLSILLKPH FQFTLAIN ELAREQLISKGGYADTLLAGT

LEASISVIKAAIQEYFEN FTEFAVPKELTRRGIGETDLDAQGEN FLPDYPYRDDALLLWDAIKNYVRDYLNLYYTSQDK

ILKDTELKNWVSKLISPEGGNVKGLVPNGELTTLDQLVEIATQLI FVSGPQHAAVNYPQYDYMAFVPNMPLATYAP

PSSDPTI DETTILKILPPQKLAAKQLELMKTLSVFRAN RLGYPDN EFVDVRAQNVLIKFQGN LKKVEDKITARNETRLE

PYVFLLPSNVPNSTNI

Coding sequence for WP_015784471.1 - SEQ ID NO: 171

ATGGTACAACCAAGTTTACCCCAAGATGATACCCCCGATCAACAGGAGCAGCGAAAT CGGGCAATCGCACAG

CAACGAGAAGCGTATCAATATAGCGAGACAGCCGGGATACTGTTGATCAAAACCTTG CCTCAGTCGGAAATG

TTTTCATTGAAATACTTGATTGAGCGAGATAAGGGATTAGTGTCCCTAATTGCCAAT ACCTTAGCCAGCAATAT

CGAGAATATCTTCGATCCCTTCGATAAATTAGAAGATTTTGAGGAAATGTTTCCATT GTTACCCAAACCTCTAG

TAATGAACACCTTCCGCAATGATAGGGTGTTTGCTCGTCAGCGTATTGCTGGTCCTA ATCCGATGGTTATTGAG

CGGGTCGTTGACAAATTGCCAGATAACTTCCCTGTGATGGATGCGATGTTTCAAAAAATCATGTTCACGAAAA

AGACTCTAGCAGAGGCAATTGCACAAGGGAAACTCTTTATCACTAATTACAAAGGAT TGGCGGAGCTTTCACC

AGGACGCTATGAATATCAAAAAAATGGAACACTCGTCCAAAAAACCAAAACGATCGC GGCTCCGTTAGTATTA

TACGCCTGGAAACCTGAAGGATTCGGCGATTATCGGGGGAGTTTAGCACCGATCGCC ATTCAAATCAATCAGC

AACCTGACCCAATAACCAATCCCATTTATACGCCAAGGGATGGGAAGCATTGGTTTA TAGCAAAAATCTTTGC

CCAGATGGCTGATGGCAATTGTCACGAAGCAATTAGCCACTTAGCACGAACCCATCT GATCTTAGAACCCTTT

GTGCTGGCAATGGCCAATGAACTTGCACCAAATCATCCTTTGTCTGTTCTGCTTAAACCCCATTTCCAATTTACC

TTGGCTATTAATGAACTGGCACGAGAACAGTTGATCAGTGCCGGAGGTTATGCCGAT GCTCTGCTGGCTGGA

ACCCTTGAAGCCTCTATCGCTGTCATTAAAGCGGCCATCAAGGAATATATGGACAAT TTCACTGAGTTTGCTTT

GCCTCGGGAGCTTGCTCGGCGAGGAGTGGGGGTAGCAGATGTGGATCAAACGGGAGA AAACTTCTTGCCGG

ACTACCCCTATCGAGATGATGCGATGTTATTGTGGAATGCGATCGAGGTTTATGTGA GGGATTATTTAAGTCT

TTACTATCAATCTCCTGTCCAAATTCGTCAAGATACAGAACTACAAAATTGGGTTAG GCGACTGGTGTCTCCAG

AAGGGGGTAGCGTCACGGGATTAGTGCCCAATGGGGAACTGAATACAATTGAGCAAC TGGTGGCGATCGCA

ACTCAGGTCATTTTTGTCAGTGGTCCTCAGCACGCTGCGGTCAACTATCCCCAATACGACTATATGGCGTTTAT

TCCCAATATGCCCCTAGCTACCTATGCCACTCCCCCTCATAAAGATAGCAACATTAG TGAAGCAACCATCCTCA

ATATTCTTCCTCCACAAAAGTTGGCAGCAAGGCAACTGGAGTTGATGAGAACGCTGT GTGTTTTCTATCCCAAT

CGTTTAGGATATCCAGACACAGAATTTGTAGATGTCCGTGCGCAGAGGGTGCTGCATCAATTTCA AGAAAGAT

TGCAGGAAATTGAACAAAGGATCGTCCTATGCAATGAAAAACGACTGGAACCGTATA CTTACCTCTTACCTTC

AAATGTCCCTAACAGTACCAGTATTTAG

Amino acid Sequence for WP_015784471.1 - SEQ ID NO: 172

MVQPSLPQDDTPDQQEQRNRAIAQQREAYQYSETAGILLI KTLPQSEM FSLKYLIERDKGLVSLIANTLASNI EN IFD

PFDKLEDFEEM FPLLPKPLVMNTFRN DRVFARQRIAGPNPMVI ERVVDKLPDN FPVM DAM FQKIMFTKKTLAEAI

AQGKLFITNYKGLAELSPGRYEYQKNGTLVQKTKTIAAPLVLYAWKPEGFGDYRGSL APIAIQINQQPDPITNPIYTP

RDGKHWFIAKIFAQMADGNCH EAISH LARTHLILEPFVLAMANELAPN HPLSVLLKPHFQFTLAI NELAREQLISAG

GYADALLAGTLEASIAVI KAAIKEYM DN FTEFALPRELARRGVGVADVDQTGEN FLPDYPYRDDAMLLWNAIEVYV

RDYLSLYYQSPVQI RQDTELQNWVRRLVSPEGGSVTGLVPNGELNTI EQLVAIATQVI FVSGPQHAAVNYPQYDY

MAFIPNMPLATYATPPHKDSN ISEATI LN I LPPQKLAARQLELMRTLCVFYPNRLGYPDTEFVDVRAQRVLHQFQER

LQEIEQRIVLCNEKRLEPYTYLLPSNVPNSTSI

Coding sequence for WP_094531790.1 - SEQ ID NO: 173

ATGATCTTCTCGCTTTTGAGTGGTGTTGCCAGAATATTAAATTTTGTCGCGGCGAAG TTAGTAGACTTAGCTGA

TTGGATATCAAGGCGATCGCCTTCCAGCAAGTATCCACTGCTGCCCCAGAATGATCC TGAAATAAATCAGCGT

CAAGCATTTCTCAATAATGCCAGACAACTTTACCAATACAACTATACTTACATCGACTCGTTGCCAATGGTGGA

GACAGTTCCCACCATTGAGAGATTCTCTTTATCTTGGGGTTTACTCGTTGGCAAAGC TGTAGTCACGGTTTTGC

TGAATGAAAGAGCTAATCTATCATTGGAAAAAGATAAACTAGCTTCTCAAGCCAAGCAACGAGAATTTTCAAA

ACGTTTATTAGAGGCTGGAATGTCTCACTCAGACACAGCCATATTGGATCTATTAGA CGAATTGCCAACAGTTT

TAGAAACTCCGCCATCTGATTTAGAAGGGGTAAATATTGAAGAATATAACAATCTAT TTTGGGTTATTCCTCTT

CCTACGATCAGTCAAAACTATATCAGTAACACTGAATTCGCGAGATTGCGAGTTGCTGGGTTTAATCCCTTAGT

GATTCAACGAGTTAAAGCATTAGATGCAAGGTTCCCTTTAACAGAGGAGCAATTCCA GACAGTTTTGCCAAAT

GATTCTTTAGCCTTAGCAGGAGCCGAGGGTCGTTTGTATTTAGCCGATTATGCAGAA CTAGAGGCGATCGCTG

GTGGTACATTTCCCACAGGAGAGCAAAAATATGTCAATGCTCCTTTAGCTCTGTTTG CCATTCCACAAGGAGAA

AGAAGTCTGACTCCGATCGCAATTCAACTGGGGCAAGACCCGAATATCAATCCCATC TTTTTGCGCCGAGTTG

GTGACGAACCGAACTGGTTGATTGCTAAAACTGTTGTTCAAATTGCTGATGCTAATC ACCATCAACTGATTAGC

CATTTGGGTAGAACCCATTTATTTGTCGAACCATTTGTAATTGCCACCAATCGCCAA CTTGCCAGCAATCATCCT

CTGTATATTTTACTGAAACCCCATTTCCAAGGGACTTTAGCGATCAATGACGCAGCG CAGTCAAACCTAGTTAG

CGTTGGTGGTGGTGTTGATAGTTTGCTAGCAGGGACGATTGCAAGTTCTCGCGCTGT TTCTGTACATGGGGTT

AAGTCTTATCAATTTGAAGATGCGCTCCTTCCTAATGCACTCAAGAAACGCGGCGTT GATGATCCCAGCTTATT

GCCAGACTATCCCTATCGCGACGATGCGTTATTAATTTGGGAAGCGATCGCTACTTG GGTGAAGAGTTATCTA

TCGATTTATTATTTCAATGATGATGCTGTGGTTCGCGATACGGAACTGCAAGCATGG GCAAAGGAAATCATTG

CTAATGATGGTGGTCGGGTGACTAGCTTTGGTGAAAATGGACAGATTCGGACTTTAT CCTATTTAGCTGATGC

CCTGACTGCGGTGATCTTCACAGGTAGCGCTCAACATGCGGCAGTGAATTTCCCGCA GGGAGATCTGATTGTT

TATACGCCTGCGATTCCTTTGGCGGGTTATACACCTGCGCCAACTCAGACTACAGGT GCAGAAGAAGCAGATT

TCTTTGCGATGTTGCCGCCGATCGAACAAGCTAAGGGACAATTGAAACTAACTTATA TTCTCGGTTCGGTCTAT

TACACGACACTGGGAGATTATGGTACTGATTATTTCAGCGACGATCGCATTCAGCAG CCTTTACGCGATTTTCA

AGATCTGTTAAAGGAGATCGAATCTACGATCAAGTCTCGCAATGAACAACGAGTTGC AGATTATAACTATTTG

AGACCATCACGGATTCCCCAAAGCATTAATATCTAA

Amino acid Sequence for WP_094531790.1 - SEQ ID NO: 174

MI FSLLSGVARILNFVAAKLVDLADWISRRSPSSKYPLLPQNDPEINQRQAFLN NARQLYQYNYTYIDSLPMVETVPT

IERFSLSWGLLVGKAVVTVLLN ERANLSLEKDKLASQAKQREFSKRLLEAGMSHSDTAILDLLDELPTVLETPPSDLEG

VNI EEYN NLFWVIPLPTISQNYISNTEFARLRVAGFNPLVIQRVKALDARFPLTEEQFQTVLPNDSL ALAGAEGRLYLA

DYAELEAIAGGTFPTGEQKYVNAPLALFAIPQGERSLTPIAIQLGQDPNI NPIFLRRVGDEPNWLIAKTVVQIADAN H

HQLISH LGRTH LFVEPFVIATN RQLASNH PLYI LLKPHFQGTLAI NDAAQSN LVSVGGGVDSLLAGTIASSRAVSVHG

VKSYQFEDALLPNALKKRGVDDPSLLPDYPYRDDALLIWEAIATWVKSYLSIYYFN DDAVVRDTELQAWAKEIIAN D

GGRVTSFGENGQI RTLSYLADALTAVI FTGSAQHAAVN FPQGDLIVYTPAIPLAGYTPAPTQTTGAEEADFFAMLPP

IEQAKGQLKLTYILGSVYYTTLGDYGTDYFSDDRIQQPLRDFQDLLKEIESTIKSRN EQRVADYNYLRPSRIPQSIN I

Coding sequence for PZO42668.1 - SEQ ID NO: 175

ATGGTCTTCTCGCTTTTGAGTGGTGTTGCCAAAACATTAAATTTCGTCGCATCTAAG TTGAAAGACTTGGCTGA

TTGGATATCAAGGCGATCGCCTTCTAGCAAATATCCGCTACTGCCCCAGAACGATCC TGAAATAAAGCAGCGT

CAATCGTTTCTAGATAATGCAAGGCAACTCTATCAATATAACTACACCTACATTGAC TCGCTCCCACTGGTGGA

AACAGTTCCCACCAATGAGAGATTTTCTTTGTCTTGGGGATTGCTAGTTGGCAAGGC AGCAATCAAGGTTTTG

CTGAATGAGCGGGCGAATCCATTGTTGTTGGAAGCGGGGAAACAAACCTCTAAGGCT AAGCAACAAGACTTC

TCAAAACGTTTGCTGGAAGCTAGTGTAGCTCAGTCAGAATCTGCCCTATTGGAACTA TTGGAAGATTTGCCAA

CGGTTTTAGAAACTCCACCCAGTGAATTAGAAGGGGTGAATATTGAAGAGTATAACA ATTTGTTTTGGGTTAT

TCCTCTTCCCTCGATCAGTCAAAACTATACCAGTAATAAAGAATTCGCCAGATTGCGAGTTGCTGGGTTTAATC

CCTTAGTGATTCAACGAATTACAGCCCTAGATGCAAGATTTCCTTTAACTGAAGCGCAATTCCAGAAGGTTCTA

CCCAATGATTCTTTGGCTGTAGCAGGAGCCGAAGGTCGTTTGTATTTAGCCGATTAT GCGGAACTAGAGGCGA

TCGTTGGTGGCACATTTCCCACGGGAGAGCAGAAATATATCAATGCTCCTTTAGCGC TGTTTGCCATTCCTCAA

GGGGAAAAGAGCCTGACTCCGATCGCCATTCAACTAGGACAAGACCCCAATACCCAT CCCATCTTTTTGCACC

AAGTCGGTGACGAACCAAACTGGTTAATTGCTAAAACTGTTGTTCAAATTGCCGATG CCAATCACCATCAACT

GATTAGTCATTTGGGTAGAACTCATTTATTTGTCGAACCCTTTGTAATTGCTACTAA TCGCCAACTTGCAAGCAA

TCATCCTTTGTATATCTTGCTGAAGCCACATTTTCAAGGGACTTTGGCAATTAATGACGCAGCACAGTCCAAAC

TGGTTAGCGCTGGTGGCGGTGTTGATAGTTTGCTAGCAGGTACGATTGAGAGTGCTC GCGCTGTTTCCGTACA

TGGGGTCAAAACCTATAAATTTGAAGATGCGCTGCTACCTAAAGCCCTGAAAAAACG TGGCGTTGACGATCCC

AACTTATTGCCAGATTATCCCTATCGTGATGATGCTTTATTAGTTTGGGAAGCGATC GCTACTTGGGTGAAAAA

TTATCTATCAATCTATTACTTCAATGATGAAGATGTGATTAGAGATACGGAACTGCA AGCATGGGCAAAGGAA

ATCATCGCTAATGATGGTGGTCGGGCGACTAGCTTCGGTGAAAATGGGCAGATTCGG ACTTTATCCTATTTAG

CTGATGCTTTGACTGCGGTGATCTTTACAGGTAGCGCTCAACATGCGGCGGTAAACT TCCCACAGGGTGATTT

GATTGTTTATACGCCTGCGATTCCCTTGGCGGGTTATACGCCTGCACCAACTCAGAC TACAGGTGCAACCGAA

GCCGATTTCTTTTCACTCCTTCCGCCAATTGAGCAAGCTAAGGGACAATTGAAACTA ACCTATATTCTCGGCTC

AGTCTATTACACAACGCTGGGAGAATATGGTGATGGTTATTTCACTGACGATCGCAT TGAGAAGCCATTACGG

GATTTTCAAGATAATTTGAAAGCGATCGAGTCAGAAATCAAGTCTCGCAACGAAAAA CGAGTTGCAGATTACA

ATTATTTGAAACCATCACGGATTCCTCAAAGTATCAATATCTAA

Amino acid Sequence for PZO42668.1 - SEQ ID NO: 176

MVFSLLSGVAKTLN FVASKLKDLADWISRRSPSSKYPLLPQNDPEIKQRQSFLDNARQLYQYNYTYIDSLPLVE TVPT

N ERFSLSWGLLVGKAAIKVLLNERAN PLLLEAGKQTSKAKQQDFSKRLLEASVAQSESALLELLEDLPTVLETPPSELE

GVN IEEYN NLFWVIPLPSISQNYTSN KEFARLRVAGFN PLVIQRITALDARFPLTEAQFQKVLPN DSLAVAGAEGRLY

LADYAELEAIVGGTFPTGEQKYI NAPLALFAI PQGEKSLTPIAIQLGQDPNTH PIFLHQVGDEPNWLIAKTVVQIADA

N HHQLISH LGRTHLFVEPFVIATNRQLASN HPLYILLKPH FQGTLAIN DAAQSKLVSAGGGVDSLLAGTI ESARAVSV

HGVKTYKFEDALLPKALKKRGVDDPN LLPDYPYRDDALLVWEAIATWVKNYLSIYYFNDEDVI RDTELQAWAKEI IA

N DGGRATSFGENGQI RTLSYLADALTAVI FTGSAQHAAVNFPQGDLIVYTPAIPLAGYTPAPTQTTGATEADFFSLLP

PI EQAKGQLKLTYI LGSVYYTTLGEYGDGYFTDDRIEKPLRDFQDNLKAIESEI KSRNEKRVADYNYLKPSRIPQSI NI

Coding sequence for WP_106893977.1 - SEQ ID NO: 177

ATGAGTCTTTTTTCACGCGTTCGTCCGACCCTTCCGCAGAACGACTCCCCCGCAGCG CAGCAGCAGCGCCAAG

AGGCATTGCTGGACGAACAGAGCAAGTATGTCTGGAAAGATGATTTCGAGACGCTTC CGGGAATCCCTTTGG

CGGCAAGCGTGCCGCGCGACGATCGGCCAACCATCACCTGGCTCTTAGAAGTGGCGGACG TCGGCATCGACA

TTGTGGCCAACCAAATCCTGGCCCAAACGGGCCGCGGTGACTCACTCAAATCGCAGA CTGCGGCCGCTGCGA

TCAGACCACATTTGGATAGCATGCGTCAGACCATAGCGACGATTCGCAGCGAGCAGA AGGCGACCCCGGACA

GCCCGCTTCGAATCGTCGACCATGTGGCCGGGACGCTGCTCAGTCTGCATCGCTCCC GCCTGGACAACGAGTT

GAAAACGCTGCAGAACATGATTGCGGCAACCTACCTCGGCAAGCTGGAAAACCCGAG CCTGGAGCAGTATCG

AAAGCTGTTTGTCACGCTGCCCTTGCCGGCAATCGCCGATACCTTCATGGACGACGC GACATTTGCCCGGATG

CGCGTCGCCGGGCCGAACAGCGTGCTGATTGCCGGCCTGAGTGCCTGGCCGTTGAAG TTTGGGCTCAGCGAG

GCGCAGTATCAATCGGTGATGGGCACCAACGATAGTCTGGCCTCGGCGTTAACCGAG CAGCGGCTCTACTGG

CTCGATTACGAGGAACTGAGCACTCTGAAAACGGGCACCACTGGTGGAAAGCCCAAG TTCTTATGTGCCCCGC

TCGCGCTGTTTGCGATCCCGAAGGGCGGTGGCGCGCTGACGCCGGTTGCCATTCAGC TCGGACAATCACCGG

CAGACGGCTTGTTCCTCCGGGTCAGCGACCAGAACAGTCCTGACTGGTGGTCGTGGC AGATGGCCAAGACGT

TCGTACAGGCCGCCGAGGGCAACTATCATGAGCTGTTTGTGCATCTCGCCCGCACGC ACCTCGTCATCGAGGC

ATTTGCCGTCGCGACGCATCGGCGGCTGGCGCCCGAGCACCCGCTGAACGTGCTGTT GCTGCCGCATTTTGAA

GGCACCCTGTTCATCAACAATTCTGCGGCAGGCAGTTTGATTGCTGAAGGTGGTCCG ATCGACCATATTTTTGC

TGGACAGATCACCTCCACCCAGACCCTCGCCGGTAGCGACCGGCTGGCGTTTGATGTCACCGCACACATGCTG

CCCAACGACTTGGCCAGCCGTCGTGTTGCCGACGTCGCCGCACTCCCTGACTACCCGTATCGCGATGACGCAC

TGCTGGTCTGGCAGGCGATTCAAGACTGGGTCCGGCAATACGTCAGCGTCTACTATC TGAACGATGCCAACGT

CGCGGGCGACACCGAACTGCAAGGTTGGCGTGACGAGTTGCTCGGGCTCGGCAAAAT CAAGGGGCTGCCGG

AACTCAAGGACCGTGAGACGCTGATCAGCGTGGTGACGATGGTTATCTTTACGGCCA GTGCTCAGCACGCCG

CGGTGAACTTCCCGCAGAAGGACTTGATGAGCTTTGCACCCGCAATCAGCGGAGCCG CGTGGGCGCCGGTGC

CTAAGCCCGATCAGCCGCAATCGGAGGCGGCCTGGCTGAAACTGTTGCCGCCGATCA AGGAAGCACAAGAGC

AGTTGAACGTGCTGTGGTTACTCGGATCGGTGCACTATCGGCCGCTCGGTGACTACC GGGTGAACCATTGGCC

GTATCTGCCCTGGTTTCAAGATCCGCGCATCACGGGCAAGAATGGCCCGCTGGCACG TTTCAAACTGGCATTG

AAGGCGGTGGAGATGGAAATCGATAACCGGAACGCCGAGCGCGAGGTGCCGTATCCT TATCTGCAGCCGAG

TTTGATTCCGACCAGCATCAACATCTGA

Amino acid Sequence for WP_106893977.1 - SEQ ID NO: 178

MSLFSRVRPTLPQN DSPAAQQQRQEALLDEQSKYVWKDDFETLPGIPLAASVPRDDRPTITWLLEVADVGIDIV AN

QI LAQTGRGDSLKSQTAAAAIRPH LDSMRQTIATI RSEQKATPDSPLRIVDHVAGTLLSLHRSRLDN ELKTLQNMIAA

TYLGKLENPSLEQYRKLFVTLPLPAIADTFMDDATFARMRVAGPNSVLIAGLSAWPL KFGLSEAQYQSVMGTNDSL

ASALTEQRLYWLDYEELSTLKTGTTGGKPKFLCAPLALFAI PKGGGALTPVAIQLGQSPADGLFLRVSDQNSPDWW

SWQMAKTFVQAAEGNYHELFVHLARTHLVIEAFAVATHRRLAPEHPLNVLLLPH FEGTLFINNSAAGSLIAEGGPID

HI FAGQITSTQTLAGSDRLAFDVTAHMLPNDLASRRVADVAALPDYPYRDDALLVWQAIQDW VRQYVSVYYLN D

ANVAGDTELQGWRDELLGLGKI KGLPELKDRETLISVVTMVIFTASAQHAAVN FPQKDLMSFAPAISGAAWAPVP

KPDQPQSEAAWLKLLPPIKEAQEQLNVLWLLGSVHYRPLGDYRVN HWPYLPWFQDPRITGKNGPLARFKLALKAV

EM EIDNRNAEREVPYPYLQPSLI PTSI NI

Coding sequence for BBC22503.1 - SEQ ID NO: 179

ATGATCTTCTCAATTTTGAGCGGTGTCGCCAGAATATTAAATTTCCTCTCGGATAAG CTAGCCAATTTAGCTAAT

TTAATATCTAAGCCATCGAAGTCGAGCAACTATCCACTACTGCCCCAGAATGATCCC GAAATTTCTCAGCGTCA

GGCGTTGCTAAATAAGTCTCGGCAACTGTATCAATACAACTACACCTATATTGATTC GCTGCCGATGGTGGAG

AAAGTGCCAACCAGCGAGAGATTTTCTCTATCTTGGGGATTGTTGGTTGGGAAGGTT GTGGTCAAGGTATTGC

TCAATGATCGCGCTAATCCTGCCGCATTTATTGATAAGGAAAAATCGAAAGCCAAGC AACTGGAATTCTCGAA

GAAGTTGCTTGAGGCGAGTATGGCGAAGTCGGATACGGCTTTGGTGGAATTACTTTC CAACTTACCTGCAATT

CTTGAAGATGATCCCATTGATGTAGCAGGCTCGAATATTCAAGAATACAACGAGCTTTTT TGGATTATTCCCCT

TCCGACAATTAGTCAAAGCTTGTTTAGTAATACTGAATTTGCAAGGTTGCGGGTTGC GGGTTTTAATCCTTTGA

TGATTCAACGGGTAACTTCTCTGGATGCAAGATTCCCTGTAACTGAAGCCCAGTTTC AATCAGTTTTGGCAGAT

GATTCTCTCGCCGCCGCAGGTGCTGAAGGACGCTTGTATTTAGCGGATTATGCCGAA TTAGAAGCGCTGACTG

GGGGGACATTTCCGAAGGGTAAGCAGAAATATATTAATGCGCCTTTAGCTCTCTTTG CGGTTCCTAAAGGGAA

AAAGAGTCTGACTCCGATCGCGATTCAGTTAGGGCAAGACCCTAATACGCATCCAAT TTTTGTTAGTCAACATG

GGGATGAGCCGAATTGGTTGATTGCGAAAACCGTTGTCCAGATTGCTGATGCTAATT ACCATCAACTGATTAG

CCATTTAGGACGTACCCATTTATTCATTGAACCCTTTGCGATCGCTACAAATCGTCA GTTGGCTAACAATCACCC

TCTGTATATTTTGCTGAAGCCCCATTTCCAAGGTACTTTGGCGATTAATGATGCTGC TCAGTCGGGACTGGTGA

GTGCAGGTGGAACTGTTGATAGCTTATTAGCAGGAACTATTGATACTGCTCGCGCCC TATCGGTGCATGGAGT

CAAAACCTATAATTTTGATGAAGCAATGCTACCTGTTGCGCTCAAAAAACGTGGCGT TGACGATCCAAAGTTA

CTGCCTGAATATCCCTATCGCGATGATGCGTTATTGGTGTGGGAAGCGATCGCTACT TGGGTAAAGAACTATC

TCTCTGTTTACTATGAAAATGATAATGATGTTGCTAGGGATTCAGAACTACAAGCAT GGGTTAAGGAAATTAC

TGCTAACGATGGCGGTCGGGTAACGAGCTTTGGGCAAAATGGACAGATTCGCACCCT ATCCTATTTGGTTGAT

GCTGTGACCCTGCTCATCTTTACCAGTAGCGCCCAGCACGCGGCCGTGAACTTTCCC CAAGGTGACTTGATGG

ACTATGCCCCTGCGGTTCCTTTAGCTGGCTATACTCCTGCGCCCACTAGTACCACTG GTGCAACCATAGATAAT

TTCTGGTCGATGATTCCTGCTATTGATCAGGCAAAAAGTCAGTTAACGATGACCTAT ATTCTCGGCTCGGTCTA

TTACACGACTTTGGGAGATTATGGCAATGCGTATTTCACTGACGATCGCATTGAGCA GCCCCTGCGCGATTTCC

AAGACAATTTGAAGGCGATTGAGTCTACGATTAAGTCTCGCAATGAGCAGCGAAATG TGGATTATAGTTATCT

CAGACCATCACGCATTCCTCAAAGTATTAATATCTAA

Amino acid Sequence for BBC22503.1 - SEQ ID NO: 180

MI FSILSGVARI LN FLSDKLANLANLISKPSKSSNYPLLPQNDPEISQRQALLNKSRQLYQYNYTYI DSLPMVEKVPTSE

RFSLSWGLLVGKVVVKVLLNDRAN PAAFIDKEKSKAKQLEFSKKLLEASMAKSDTALVELLSNLPAI LEDDPIDVAGS

N IQEYN ELFWI IPLPTISQSLFSNTEFARLRVAGFNPLM IQRVTSLDARFPVTEAQFQSVLADDSLAAAGAEGRLYLA

DYAELEALTGGTFPKGKQKYINAPLALFAVPKGKKSLTPIAIQLGQDPNTHPI FVSQHGDEPNWLIAKTVVQIADAN

YHQLISHLGRTHLFIEPFAIATNRQLAN NH PLYI LLKPHFQGTLAI NDAAQSGLVSAGGTVDSLLAGTIDTARALSVH

GVKTYNFDEAMLPVALKKRGVDDPKLLPEYPYRDDALLVWEAIATWVKNYLSVYYEN DNDVARDSELQAWVKEIT

ANDGGRVTSFGQNGQIRTLSYLVDAVTLLIFTSSAQHAAVNFPQGDLM DYAPAVPLAGYTPAPTSTTGATIDNFW

SMI PAIDQAKSQLTMTYILGSVYYTTLGDYGNAYFTDDRI EQPLRDFQDN LKAI ESTIKSRN EQRNVDYSYLRPSRI P

QSIN I

Coding sequence for WP_055077131.1 - SEQ ID NO: 181

ATGATCTCTTCGATTTTGCGTGGTATTGCCCAAATATTAAATTTCCTTGCGACTAAG TTGTCCGACTTAGCAAAT

TTAATATTGCGGCGATCGCCTTCAAGTAAATATCCCCTATTACCTCAGAACGATCCC GAAATCGATCGACGACA

GGCTCTGCTCAACCAGTCTAGACAGCTCTATCAATATAACTACACCTATGTCGCCCC CTTGCCGATGGTCGAAA

AAGTGCCAACTGGCGAGCAGTTCTCATTGTCTTGGGGCTTATTGGTAGGAAAGGCAGTTATCGAAATTTTATT

AAATGATATTGCGAATCCTTTCCTCTTGAGTGAAAAGGGTAAAAATGCCTCTAAAGC TAGGCAACAAGACTTC

TCAAAACGTTTACTTGAAGCTGGCGTTGCTCAGTCGAATTCCGCAATAATAGGTCTG CTGTCAGAGATTCCCAC

CCTATTAGAGACCGAACCCACCAACGTCGAAGGTTCAAACATTAAGGAATATAACGA TCTTTTTTGGATTATTT

CTTTGCCCAAGATCAGTCAAAATTTTACAACTAATTCCGAGTTTGCAAGGCTCCGCG TCGCTGGATTTAACCCT

GTGACGATCCAACGCATCAAGACCTTAGATGCGAAATTTCCTCTCACGGAAGATCAA TTTCAAACGGTGTTAG

CGGGGGACTCTCTCGCTGAGGCTGGAGCACAAGGTCGCTTGTATCTGGCTGATTATG CAGAGCTAACGGCGA

TCGCGGGTGGTACTTTTCCTAAGGGAGCGCAAAAGTATATAAATGCACCTTTGGCAT TGTTTGCCGTTCCCAAA

GGACAGCAGAGTTTGACACCGATCGCCATTCAATTAGGGCAAGACCCCAGTGCTTATCCC ATCTTTGTCTGTCA

GGCTGATGATGAACCGAACTGGCTTCTAGCTAAAACCGTTGTCCAGATTGCTGATGC CAATTACCACGAACTG

ATTAGCCATTTAGGTAGAACCCATTTATTTATCGAACCCTTTGCGATCGCGACTAAT CGCCAACTTGCCAGCAA

TCATCCTTTGTACATTCTGCTCAAGCCTCATTTCCAAGGAACTTTAGCGATCAATGA TGCCGCTCAATCGGGACT

GATTAGTGCTGGTGGAACCGTGGATAGTCTACTAGCGGGAACGATCGCTTCCTCGCG CACCCTGTCGGCACA

GTCCGTTGAAAACTATAACTTCAATGAAGCGATGTTGCCTGTAGCCCTGAAAAAGAG GGGAGTGGACGATGT

CAATATGCTGCCCGATTATCCCTATCGCGATGATGCTTTATTGGTCTGGGGAGCGAT CGCAACTTGGGTCAAA

AACTATCTATCCATCTATTATTTCAGCGATACCGATGTCATGAGAGATGTGGAACTG CAAGCATGGGCAAAGG

AAATTACCTCGATTGATGGCGGGCGCGTCAAGAGTTTTGGTCAAAATGGTCAGATTC AGACCTTTGATTATTT

GGTCGATGCGGTGACATTGCTGATCTTTACCAGCAGCGCCCAACATGCGGCAGTAAA CTTCCCTCAAGGCGAT

TTGATGGACTACACGCCAGCAATTCCGCTAGCAGGCTATACTCCCGCACCAACGGCA ACCACTGGTGCAACGG

AAGCAGATTTCTTTGCCATGCTACCGCCCATCGACCAAGCTAAGAGTCAATTGACCA TGACCTATATTTTGGGC

TCTGTTTATTACACGACCCTAGGCGACTATGGTTCAGATTATTTCAACGACGATCGCCTTCAGCAACCCTTACG

CGATTTTCAAGATGGGTTAAAAGCGATCGAGTCTACAATTAAGTCGCGCAATGAGAC TAGGGCTGCTGATTAC

AATTACTTAAAACCATCACGGATTCCTCAAAGCATTAATATCTAA

Amino acid Sequence for WP_055077131.1 - SEQ ID NO: 182

MISSILRGIAQILN FLATKLSDLANLILRRSPSSKYPLLPQN DPEIDRRQALLNQSRQLYQYNYTYVAPLPMVEKVPTG

EQFSLSWGLLVGKAVI EI LLN DIAN PFLLSEKGKNASKARQQDFSKRLLEAGVAQSNSAI IGLLSEIPTLLETEPTNVEG

SNI KEYN DLFWIISLPKISQN FTTNSEFARLRVAGFNPVTIQRIKTLDAKFPLTEDQFQTVLAGDSLAEAGAQGRLYLA

DYAELTAIAGGTFPKGAQKYI NAPLALFAVPKGQQSLTPIAIQLGQDPSAYPI FVCQADDEPNWLLAKTVVQIADAN

YH ELISH LGRTH LFI EPFAIATN RQLASNH PLYI LLKPHFQGTLAI NDAAQSGLISAGGTVDSLLAGTIASSRTLSAQSV

ENYN FN EAMLPVALKKRGVDDVNMLPDYPYRDDALLVWGAIATWVKNYLSIYYFSDTDVMRDVELQ AWAKEITS

IDGGRVKSFGQNGQIQTFDYLVDAVTLLI FTSSAQHAAVN FPQGDLM DYTPAI PLAGYTPAPTATTGATEADFFAM

LPPI DQAKSQLTMTYILGSVYYTTLGDYGSDYFN DDRLQQPLRDFQDGLKAI ESTIKSRNETRAADYNYLKPSRI PQSI

NI

Coding sequence for WP_009629598.1 - SEQ ID NO: 183

ATGATCTCTTCGATTTTGCGTGGTATTGCCCAAATATTAAATTTCCTTGCGACTAAG TTGTCCGACTTAGCAAGT

TTAATATTGCGGCGATCGCCTTCAAGTAAATATCCCCTATTACCTCAGAACGATCCC GAAATCGATCAACGACA

GGCTCTGCTCAACCAGTCTAGACAGCTCTATCAATATAACTACACTTACGTCGCCCC CTTGCCGATGGTCGAAA

AAGTGCCAACTAGCGAGCAGTTCTCATTATCTTGGGGCTTATTGGTAGGAAAGGCAGCGATCGAAGTTTTATT

AAATGATATTGCGAATCCTTTCCTCTTGAGTGAAAAGGGTAAAAATGCCTCTAAAGC TAGGGAGCAAGACTTC

TCAAAACGTTTACTTGAAGCTGGCATTGCTCAGTCGAATTCCGCAATAATAGGGCTA CTGTCAGAGATTCCCTC

CCTATTAGAGACCGAACCAACCAATGTTGAAGGTTCAAATATTAAGGAATATAACGATCTTTTTTGGATTATTT

CTTTACCCACGATCAGTCAAAGTTTTACAACTAATTCCGAGTTTGCAAGGCTTCGCG TCGCTGGATTTAACCCT

GTGACGATCCAACGTATCAAGACCTTAGATGCGAAATTTCCTCTCACGGAAGATCAATTTCAAACAGTGTTAG

CGGGGGACTCTCTCGCTGAGGCTGGAGCGCAAGGTCGCTTGTATCTGGCTGATTATG TAGATCTAACGGCGA

TCGCGGGCGGTACGTTTCCTAAAGGAGCACAAAAGTATATAAATGCACCTTTGGCTC TGTTCGCAGTTCCCAA

AGGACAGCAGAGTTTGACCCCGATCGCCATTCAGCTAGGGCAAGACCCCAGTGCTTA TCCCATCTTTGTCTGTC

AGGCTGATGATGAACCGAACTGGCTTCTAGCTAAAACCGTTGTTCAGATTGCTGATG CCAATTACCACGAACT

GATTAGCCATTTAGGTAGAACCCATTTATTTATCGAACCCTTTGCGATCGCAACTAA TCGCCAACTTGCCAGCA

ATCATCCTTTGTATATTCTGCTCAAGCCTCACTTTCAAGGAACTTTAGCGATCAATA ATGCCGCTCAATCGGGAC

TGATTAGTGCTGGTGGAACCGTAGATAGTCTATTAGCGGGAACGATCGCGTCCTCGC GCACCCTTTCGGTACA

GTCAGTTAAGAACTATAACTTCAATGAAGCGATGTTGCCTGTAGCCCTGAAGAAGAG AGGGGTTGACGATGT

TAATATGCTGCCCGATTATCCCTATCGCGATGATGCTTTATTGGTCTGGGGAGCGAT CGCGACTTGGGTCAAA

AATTATCTATCCATCTATTATTTCAGCGATACCGATGTCCTTAGAGATTCTGAACTG CAAGCATGGGCAAAGGA

AATTACCTCGGTTGATGGTGGGCGCGTCACAAGTTTTGGTCAAGATGGTCAGATTCA GACCTTCGATTATTTA

GTCGATGCAGTGACATTGCTGATCTTTACCAGCAGCGCTCAACATGCGGCGGTAAAC TTCCCTCAGGGAGATT

TGATGGACTACACGCCAGCAATTCCGCTAGCGGGCTATACTCCCGCACCAAAGTCAA CCACTGGTGCAACGGA

AGCAGATTTCTTTGCCATGCTACCGCCCATCGACCAAGCTAAGAGTCAATTGACAAT GACCTATATTCTGGGAT

CTGTTTATTACACGACCCTAGGCGACTATGGTTCAGATTATTTCAACGACGATCGCC TTCAGCAACCCTTACGC

GATTTTCAAGATGGGTTAAAAGCGATCGAGTCTACAATTAAGTCGCGCAATGAGACT AGGGTTGCTGATTACA

ATTACTTAAAACCATCGCGGATTCCTCAAAGCATTAATATCTAA

Amino acid Sequence for WP_009629598.1 - SEQ ID NO: 184

MISSILRGIAQILN FLATKLSDLASLI LRRSPSSKYPLLPQNDPEI DQRQALLNQSRQLYQYNYTYVAPLPMVEKVPTSE

QFSLSWGLLVGKAAIEVLLNDIAN PFLLSEKGKNASKAREQDFSKRLLEAGIAQSNSAI IGLLSEI PSLLETEPTNVEGS

N IKEYN DLFWIISLPTISQSFTTNSEFARLRVAGFNPVTIQRIKTLDAKFPLTEDQFQTVLAGDSL AEAGAQGRLYLAD

YVDLTAIAGGTFPKGAQKYI NAPLALFAVPKGQQSLTPIAIQLGQDPSAYPIFVCQADDEPNWLLAKTVVQIADANY

HELISHLGRTHLFIEPFAIATNRQLASNH PLYI LLKPHFQGTLAIN NAAQSGLISAGGTVDSLLAGTIASSRTLSVQSVK

NYN FN EAM LPVALKKRGVDDVNM LPDYPYRDDALLVWGAIATWVKNYLSIYYFSDTDVLRDSELQAWAKEITSV

DGGRVTSFGQDGQIQTFDYLVDAVTLLI FTSSAQHAAVN FPQGDLMDYTPAIPLAGYTPAPKSTTGATEADFFAML

PPIDQAKSQLTMTYILGSVYYTTLGDYGSDYFN DDRLQQPLRDFQDGLKAIESTI KSRNETRVADYNYLKPSRI PQSI

NI

Coding sequence for WP_015133151.1 - SEQ ID NO: 185

ATGACCGCGACCTCCCCATCTAGTAGCCAAAACCTCAGCGACAAACAGGAAAAATAC CAATACAACTATCGGT

ATATGCCCCCATTGGCGATGGTCGACAGCCTGCCTGAAGAAGAGCAATGGTCTACCT CTTGGAAAATGACGGT

GGGTAAAGTTGGCTTCCAGCTCCTTGTCAACAAAATCATTTTGAATTATGGCGATCA AGGAGAAGCAGGGGC

AGCAGACGACGTTCGCGCTTTTTTGATTAGTACCTTTAAACAAACCCTCGCCGAACAAAAAGGCTTTTCAAAAG

TGGGGATTCTCCTGCAAGGCGCCAAATTTTTACCCAGATTAATTTGGGGCAAGATCA CCACACAAATCGTCGA

TGTCGAAGATTTGATGAAAGAGATGATCGAAAGCATGAGTCGCAAATTTTTAGAGGA CTTTGCGGCCAATGTT

ATGCAAAAGTTGACCGAAGATGCCCCCAAAGGTCGCTTTTCATCAATCAAAGAATTT GAAACGCTATTCACAG

AAATCGATCTGCCCGATATTGCCTACACCTATCAGGAAGACGAAACCTTCGCCTATA TGCGCGTTGCTGGACC

GAATGCTGTAATGCTCCAGAAAATCACCGAGCCAGATCCCCGTTTCCCAGTCACAGA AGCCCATTACCAAGCG

GTTATGGGAGAAGAAGATTCTTTAGCCGCAGCACGCTCAGAAGGTCGTTTATATTTG TGCGACTATGCCATCC

TCGATGGGGCAATAGAGGGAGATTTTCCTGTGGCTCAGAAATATCTCTATGCACCAT TAGCACTCTTCGCTGT

GCCCAAAGCTGATGCAGTCAAACGAAATTTAATGCCTGTAGCCATTCAGTTAGGTCA AGTCCCTAAACAAAAC

CCTATTCTGACTCCCAAATCTAATAAATATGCATGGCTCTGTGCGAAAACGGCAGTG CAGATTGCTGATGCCA

ATTTCCATGAAGCGGTCACCCATCTAGCTCGCACCCACTTGTTTATGGGGCCCTTTG CGATCGCCACCCATCGA

CAACTACCAGAGAGCCATCCCCTCTTTAAACTACTTAAACCTCATTTTTTTGGGATGCTGGCCATTAACGACTCA

GCCCAAGCTAAACTCATTGCGAAAGGCGGTGGCGTCAATAAAATCCTCTCTGCCACT ATCGATAACGCCCGTT

TATTCGCCATCTTGGGCGTACAAACCTATGGCTTTAACAGTGCCATGCTACGCAAAC AATTGGCAGCCAGAGG

CGTTGATGATACTGAGGGATTACCTATTTATCCGTATCGTGACGATGCTCTATTAAT TTGGGATGCCATTAATA

ATTGGGTGCAAAGTTATCTCAAAACCTACTATGCGAATGATGCAGCAGTGCGGAGAG ATCAGGCGATCCAAG

CTTGGGTAAAAGAATTAATCTCCGAAGATGGCGGTCGTGTGGTGGAATTTGGGGAAG ATGGTGGCATCCAAA

CTCTTGAGTATCTTATCGAAGCAGTGACACTCATCATTTTTACGGTGAGCGCGCAAC ATGCAGCAGTAAATTTC CCTCAAAAAAATCTTATGAGCTTCGCCCCTGGTATGCCCACAGCAGGTTACTCACCCCTT GATAATCTCGGGGA ACACACCAC AG AGCAAG ACT AT CTCG ATTT ATT ACCACCG AT GT CCCAAGCT CAGG AACAGCT C AAACT CTGTC ACTTATTAGGTTCTGCACATTTTACTGAGCTTGGTCAATATGATGCCAAGCATTTCACCG ACTTCAAGATTCAA GGGGCACTCAAACAATTCCAAGCACGCCTAAAAGAGATTGAAGGTATTATTCACAAACGC AATCGTGATCGCC CT G AAT ACG AAT ACCTTTT ACCATCGCT AATTCCCCAAAGT AT CAAT AT CT AG

Amino acid Sequence for WP_015133151.1 - SEQ ID NO: 186

MTATSPSSSQN LSDKQEKYQYNYRYMPPLAMVDSLPEEEQWSTSWKMTVGKVGFQLLVNKI ILNYGDQGEAGA

ADDVRAFLISTFKQTLAEQKGFSKVGI LLQGAKFLPRLIWGKITTQIVDVEDLMKEMI ESMSRKFLEDFAANVMQKL

TEDAPKGRFSSIKEFETLFTEIDLPDIAYTYQEDETFAYM RVAGPNAVMLQKITEPDPRFPVTEAHYQAVMGEEDSL

AAARSEGRLYLCDYAI LDGAIEGDFPVAQKYLYAPLALFAVPKADAVKRNLMPVAIQLGQVPKQNPILTPKSNKYA

WLCAKTAVQIADAN FHEAVTHLARTHLFMGPFAIATH RQLPESH PLFKLLKPH FFGMLAI NDSAQAKLIAKGGGVN

KI LSATIDNARLFAI LGVQTYGFNSAM LRKQLAARGVDDTEGLPIYPYRDDALLIWDAI N NWVQSYLKTYYANDAA

VRRDQAIQAWVKELISEDGGRVVEFGEDGGIQTLEYLI EAVTLI IFTVSAQHAAVN FPQKN LMSFAPGM PTAGYSP

LDNLGEHTTEQDYLDLLPPMSQAQEQLKLCHLLGSAH FTELGQYDAKH FTDFKIQGALKQFQARLKEI EGII HKRNR

DRPEYEYLLPSLIPQSI NI

Coding sequence for WP_063872765.1 - SEQ ID NO: 187

ATGACTACTTCATCACCAGATAATTCCCGCAGTCTCCCCATCACCCAGAATTTGGAA TTAGCGAGGCAGGAATA

T CAAT AT A ACTATACCC AT ATT CC ACCT ATT CCT ATG GT G A AT C AG CTTCCT AAT C AG G AA AACTT C ACT ACTAG

ATGGACTTTTTTATTAGCCCAGCAGTTACGGGAGATTTTCATTAATACTCTGATCACTAACCGAGGCGATCGCA

GTT CC AA AT CGGTTCGTGAT C A AGT C AAA AG GTTT ATTTT AG A AG CCTT GTT C A AG G GG G CTATACC AG CCA A

AGTAAGTGTGATTGCGAGACTTTTCCAAATTATTCCCCAGTTTCTCATTCAAGGAAT ATCTAAAGATTTTCACGA

ACTAGATGATCTGTTTTTTTCCCTTTTCAAAACCAACGGACTGTTAATATTCAGAGA TTCTCTGAATCGAATTAC

AGCCCTTTTAGATAAAGGCCATCCCACAGGTCATGTGAATAGTTTAAAGGACTACCA AAAGTTATTTACCACAA

TTGAATTACCAGCGATCGCCAAAACTTTCGATCAAGATCAAGTCTTTGCCTATATGC AAGTCGCCGGCTACAAT

CCCCTAGTCATCAAGCGGGTAAATAGTCCAGGCGCTAACTTCCCAGTTGAAGAGACA CATTACCAAGCAGTCA

TGGGGAGCGATGATTCATTAGCAGCCGCAGGACAAGAAGGAAGGCTATACCTAGCAG ACTATCAAATTTTAG

ACGGTGCTATCAACGGTACATATCTAAATTACCAAAAGTATGCCTATGCTCCCCTAG CGCTGTTTGCCATCCCC

AAAAACTCAGACCCCAATCGTCTCCTGCGCCCCATAGCTATTCAATGTGGTCAAACT CCTGGAGCCGATTATCC

CATAATTACCCCCAATTCCGGCAAATACGCCTGGCTATTTGCCAAAACCATTGTCCA CATAGCCGATGGCAACT

TTCATGAAGCCGTCAGTCACCTCGCCCGAACGCACCTATTCGTTGGTGTCTTTGTCA TCGCCACCCATCGGCAA

TTGTCCCCCAGCCATCCCCTCAGCCTCCTACTGCGTCCCCATTTTGAAGGCACTTTA GCAATTAACAATGCCGCC

CAAGAAGTTTTGATTGCTCCTGGCGGCGGAGTTGATAGGTTACTCTCATCGACCATT GATAACTCACGGATTTT

AGCAGTGCGCGGTTTGCAAAGCTATAGTTTCAATGAAGCTATGTTGCCAAACCAACT CAAACAAAGAGGTGTT

GATGATCCTGAACTACTGCCTGTTTATCCTTACCGGGACGATGCACTACTAATTTGG AACGCCATTCATCAATG

GGTTTCCGACTACCTGAGCCTTTATTACCCTACAGATAAAGATATTCAAAATGATAC TGCTTTGCAAGCATGGG

CAGCCGAAGCCAAAGCTGAGAATGGTGGACGTGTACCTGATTTTGGTGAAAATGGAG GTATTCAGACACTAG

ACTACCTAGTTGATGCTGCTACCCTGATTATTTTTACAGCCAGCGCCCAACACGCGG CGGTTAACTTCCCCCAA

AAAGATTTGATGAGTTATGCCCCAGCTTTTCCCTTAGCAGGATATGTATCCGCCTCC ATCAAGGGAGAAGTTA

GTGAACAAGACTACCTGAATTTACTCCCACCTTTGGAGCAAGCGCAACAGCAATTTA ACTTGCTCACTTTACTA

G G GTCTAT AT ATT AC AACC AG CTT G GT G A AT AT CC A AA AT C AC ACTTT G CT A ACCCC AAG GT AC A AACCTT GTT

ACAGAAGTTCCAAAGCCAACTCCAGCAAATTGAAATTACGATCAATCAGCGCAATTT GCACCGCCCAACTTAC

GAATATCTACTTCCTTCTAAAATCCCTCAGAGCATTAATATTTGA

Amino acid Sequence for WP_063872765.1 - SEQ ID NO: 188

MTTSSPDNSRSLPITQN LELARQEYQYNYTH IPPIPMVNQLPNQEN FTTRWTFLLAQQLREI FINTLITNRGDRSSKS

VRDQVKRFILEALFKGAIPAKVSVIARLFQII PQFLIQGISKDFH ELDDLFFSLFKTNGLLI FRDSLN RITALLDKGHPTGH

VNSLKDYQKLFTTI ELPAIAKTFDQDQVFAYMQVAGYNPLVIKRVNSPGAN FPVEETHYQAVMGSDDSLAAAGQE

GRLYLADYQILDGAI NGTYLNYQKYAYAPLALFAIPKNSDPNRLLRPIAIQCGQTPGADYPI ITPNSGKYAWLFAKTIV

HIADGN FH EAVSHLARTHLFVGVFVIATHRQLSPSH PLSLLLRPH FEGTLAI N NAAQEVLIAPGGGVDRLLSSTIDNS

RILAVRGLQSYSFNEAMLPNQLKQRGVDDPELLPVYPYRDDALLIWNAI HQWVSDYLSLYYPTDKDIQNDTALQA

WAAEAKAENGGRVPDFGENGGIQTLDYLVDAATLI IFTASAQHAAVNFPQKDLMSYAPAFPLAGYVSASI KGEVSE

QDYLNLLPPLEQAQQQFNLLTLLGSIYYNQLGEYPKSH FANPKVQTLLQKFQSQLQQI EITINQRNLHRPTYEYLLPS

KI PQSIN I

Coding sequence for WP_096687527.1 - SEQ ID NO: 189

ATGAGATCACCAACTCCAAAACAACGACGACAAGAGTTAATTGAGCAGTATGTATTA TCGCGCCGTACCATGA

TGGCGCTGATGGCCTTCGCTTGTACTCCTGGTTTGGAAACTTTACTAGTCGGTGACA ATAAATCCTCAAAACCT

AAGCAATT GG AT AAT CCG AAT GGTT GT ACTCCCGGTTT GG AAACTTT ACT AT CT AAT G ACAATAAACCCTCAAA

ACCTAAGCCACCAAATAATCCTAGCATCCCAAGCTTACCTCAAAATGATACAAAAGC GACTCAACAAGAACGC

CTGACGCAGTTGGGAAAGACTCGTGAAGAATATCAGTTGGGGTTGCGGTTGCCTAAT TCTGCTCGCGTGAAG

ACTTTACCCGCGACTGAATTATTTTCTGAAGGATACGAGAAGAACCGAGTAATCTTA TCGCAGAAGATAGGAG

CCAATCAACAAGCGTTTTTACAAAACCCCAAACCTTTTCAAAGCTTCGATGATTACA GCGCGCTGTTTCCCGTTT

TGCCGCTACCCGATATCGCTAAAACATTCCGTAATGATTCGGTATTCGCACGACAGA GGCTTTCTGGCTGTAAC

CCGATGGAACTAAAGAACGTTCTAGCACTTGATTATAATCTTCGTAGCAAACTCGCC ATAACAGATGAAATTTT

TCAAGCTGTGCTAAATGCGACAAGAACCAGAGAGCGCATTAATAAGACTCTCAACAG CGCTATTCGAGAAGG

CAGCTTATTTGTTACCGATTATGCAATACTTGATAGCATTCAGCCGAAAGAAAAGCA ATTTGTTTGTGCCCCCA

TTGCACTCTATTATGCCCAAAGAATTCGTGGCGATTTTCAGCTAATCCCCATTGCTA TCCAGTTAGGACAGGCG

CCGGGTTCAAGTTTACTTTGCACACCAAATGATGGAGTAGATTGGACTTTAGCCAAG TTAATAACCCAAATGG

CT G ATTT CTACGT C AAT C AGTT AT ATCG G C ACTT G G G AC AG ACT CAT CT AGT AAT G G AG CCA ATT G CTTT AG C A

ACAGCGCGCGAACTAGCTGCGAAGCATCCCGTAAACGTACTCTTAAAGCCTCACTTT GAGTTTACAATGGCAA

TTAATAGCCTTGGTGATGAAGTGCTAATTAATCCGGGCGGAGCAGTAGATATTATAT TACCGGGTACTTTAGA

A AG CTCG CT A A AACTT ACCG ATAC AG GTGTAG CT G ACTTTTT C AAC A ACTTT AG C AG CTTT G C ACTT CCT ACT A A

TTTACGTCAGCGCGGTGTTGATAATCCTTATACCTTACCAGATTTTCCTTATCGAGA CGACGGGTTGCTCGTTT

GGAATGCTTTAGAAGACTATGTAAGTAAATATATCGGTATTTACTATAAATCTAACC GAGATATCCGCGAGGA

TTTCGAGCTACAAAATTGGTTCCAAGTTTTACGGAAACCAAAGAGCGAAGGTGGTTT TGGTATAGTTTCATTAC

CAGCAAACCT G AC AAACCGCG ACC AATT GAT AG AC ATTTT G ACAAT AATT ATTTT CACTGCTGGT CCCCAAC AC

T C AG CC ATT G CTT G G ACT C AAT AT C AAT ATATG G CTTTT ATT CCT AAT ATG CCTG G AG CT ATTT AT C AG CCT ATT

CCTACAACTAAAGGGAAATTCGCTGACGAAAACAGCCTTACTAGTTTCCTACCTGGA ATCAAACCAAGCCTTAC

CCAAGTT CAGTTT AT GT CGTTAGTCGGT ACCAAGCGCG ACCCAAAAGCATTTACT G ATTTT GGT GT G AACAGTT

TT CAAG ACCCGC AAGCCATT AG AGTT CTT AG AG ATTT CC AAAAT CGTTT AG AAT CAAT AG AAAAACGG ATT G A

AGCACAAAATCAACGTCGCGAAGAATGCTACCCGGCGTTTCTTCCCTCTCGGATGTC TAATAGCGTAAGTGGT

TGA

Amino acid Sequence for WP_096687527.1 - SEQ ID NO: 190

MRSPTPKQRRQELIEQYVLSRRTMMALMAFACTPGLETLLVGDNKSSKPKQLDNPNG CTPGLETLLSNDNKPSKP

KPPNN PSIPSLPQNDTKATQQERLTQLGKTREEYQLGLRLPNSARVKTLPATELFSEGYEKNRVI LSQKIGANQQAFL

QN PKPFQSFDDYSALFPVLPLPDIAKTFRN DSVFARQRLSGCNPM ELKNVLALDYN LRSKLAITDEI FQAVLNATRTR ERI N KTLNSAI REGSLFVTDYAI LDSIQPKEKQFVCAPIALYYAQRI RGDFQLIPIAIQLGQAPGSSLLCTPN DGVDWTL

AKLITQMADFYVNQLYRHLGQTHLVM EPIALATARELAAKHPVNVLLKPH FEFTMAI NSLGDEVLINPGGAVDII LP

GTLESSLKLTDTGVADFFN NFSSFALPTNLRQRGVDN PYTLPDFPYRDDGLLVWNALEDYVSKYIGIYYKSNRDIRED

FELQNWFQVLRKPKSEGGFGIVSLPANLTN RDQLIDILTII IFTAGPQHSAIAWTQYQYMAFIPN MPGAIYQPIPTTK

GKFADENSLTSFLPGI KPSLTQVQFMSLVGTKRDPKAFTDFGVNSFQDPQAIRVLRDFQN RLESI EKRI EAQNQRRE

ECYPAFLPSRMSNSVSG

Coding sequence for WP_015138267.1 - SEQ ID NO: 191

ATGAATGTGGCATCAGCAGATAATTCGAGAAGTTCCCCCAGCAACCACAACTTGGAT ATAGCTAGGCAGCAAT

ATCAATATAACTACACCCATATTCCCCCTTTGGCGATGGTGAATCAACTGCCACCTG CGGAAGAGTTCACCACT

CGTTGGTATTGTTTATTAGCTAAAGAATTACGCCTGATTTTTATCAATACCCTGATT GTCAACCGGGGTAATCG

TGGTTTTAAGTCGGTGAAAGATGATGTCATTGCGTTTCTTTTAGAAGCTTTGATTAA GGGAGCCATCCCATTTC

GCCTGGGTGT AATT G CC AG ACTG CTG C A AATT CT CCCCC AATTT CTG CTG CGTAG CGTCTCT AA AG ATTT G CG G

G AACTGG AT GAT CT GTTTTT AT C ACT ACTT AAGG AAATTGG ACT GT CAATTTTT AC AG ATT C ACT C AACCGCAT C

ACTAAGCTGTTATTTGAGAAACAACCCAAAGGACGCGTAACCAGTCTCAAGGATTAC GAAAAATTGCTACCAG

TGTTGGGATTGCCCAAGATTGCCAGCACTTATCAAGAAGATGAAGTTTTTGCTTATA TGCAAGTGGCTGGTTAT

AATCCCTTAATGATTAAGCGGGTAACTAGCCCAGGCGATCGCTTCCCAGTCACAGAC GAGCATTACCAAGCCG

TGATGGGTAGTGATGATTCCTTAGCAGCAGCCGGGGAAGACGGTAGACTTTATCTGG CAGACTATGGGATTT

TAGATGGTGCGAT C AAT G GT AC AC ACCC AA A ACT AC A AAAGT ATGTCTACG C ACCTCT GG C ACT GTTT G CTGT

ACCCAAAGGCGCAGATGCTCACCGTTTACTCCGCCCAGTAGCCATTCAATGTGGACA AACCCCAGACGCAGAT

CACCCCATCATTACCCCTAACTCTGGTAAATACGCCTGGCTGTTTGCCAAAACTATT GTCCTCATCGCCGATGCC

AACTTTCACGAAGCCGTCAGCCACCTAGCTAGAACACACCTGTTTGTGGGTGTATTC GTGATGGCAACCCATC

GGCAACTCCCAAGCAATCATCCCCTCAGCCTGTTGTTACGCCCCCATTTCGAGGGTA CATTAGCCATCAATAAT

GCCGCCCAAGAGAACCTCATCGCTCGTGATGGAGGTGTTGATCTATTACTTTCATCA ACTATTGATAACTCTCG

T ATTTT AGCCGTGCGT GG ATT G C AA AG CT AT A ACTT C AACG C AG CC ATGTT ACCC A AG CA ACT C A AAC AG CGT

GGTGTGGATGATCCCAACCTATTACCTGTTTATCCTTACCGAGATGATGCCCTGTTA ATCTGGGATGCTATCCG

TGATTGGGTGTCAGACTACCTCAAGCTTTACTATCCTACAGATGCAGATGTGGAAAA AGACGCAGCCTTACAA

GCATGGGCAACCGAAGCCCAAGCTTACGAAGGTGGTAGAATTACTGGCTTTGGTGAA GATGGAGGTATCAAA

ACCAGAGAATATCTAATTGATGCGGTAACACTGATCATTTTCACCGCCAGTGTTCAA CACGCGGCGGTAAACT

TTCCCCAGAAAGATATCATGGGCTATGCCCCAGTTGTCCCACTAGCCGGTTATATGC CAGCCTCAACCCTCAAG

GGAGAAGTGACTGAGCAAGACTACCTCAACTTGCTGCCTCCACTAGAACAAGCACAA GGGCAATATAACTTAC

TTT ACTT ATT AGGATCTGTGT ATT AC AAC A AACT CG GT C AAT AT CC AC AACC AC ACTTT ACT GAT CC AC AAGT AA

CATCCTTATTGCAAAGCTTCCAAGATAAACTCCAGCTAATTGAAGACACCATCAATC AGCGCAATTTAAACCGC

CCAGCCTATGAATATTTGCTCCCTTCCAAGATTCCCCAGAGTATTAATATTTAA

Amino acid Sequence for WP_015138267.1 - SEQ ID NO: 192

MNVASADNSRSSPSNH NLDIARQQYQYNYTH IPPLAMVNQLPPAEEFTTRWYCLLAKELRLIFI NTLIVNRGN RGF

KSVKDDVIAFLLEALI KGAIPFRLGVIARLLQILPQFLLRSVSKDLRELDDLFLSLLKEIGLSIFTDSLN RITKLLFEKQPKGR

VTSLKDYEKLLPVLGLPKIASTYQEDEVFAYMQVAGYNPLMI KRVTSPGDRFPVTDEHYQAVMGSDDSLAAAGED

GRLYLADYGILDGAI NGTHPKLQKYVYAPLALFAVPKGADAH RLLRPVAIQCGQTPDADH PI ITPNSGKYAWLFAKT

IVLIADANFH EAVSHLARTHLFVGVFVMATH RQLPSNHPLSLLLRPH FEGTLAI N NAAQENLIARDGGVDLLLSSTID

NSRI LAVRGLQSYN FNAAM LPKQLKQRGVDDPN LLPVYPYRDDALLIWDAI RDWVSDYLKLYYPTDADVEKDAAL

QAWATEAQAYEGGRITGFGEDGGI KTREYLIDAVTLII FTASVQHAAVNFPQKDIMGYAPVVPLAGYMPASTLKGE VTEQDYLNLLPPLEQAQGQYNLLYLLGSVYYNKLGQYPQPHFTDPQVTSLLQSFQDKLQL IEDTINQRNLNRPAYEY

LLPSKI PQSI NI

Coding sequence for WP_094347473.1 - SEQ ID NO: 193

ATG ACTG CTT CAT C ACC AG AAA ATT C AAT C AG CTT AT C AAGT ACT CAT ACTTT AG AT AT AG CT AG G C AAG AGT A

TCAATATAACTACACCCATATTCCATCTATTGCGATGCTAGATCGGCTTTCTATTGC CGAAGAGTTCGCTACTAA

CTGGTATTTTTTATTAGCCCAGCAGTTACGAGTTGTGTTTATTAATACCTTGATTGTCAACAGAGGCAATCAAG

GTTCTAAATCGATTCGTGATGATGTCGAAAGGTTTATTTTAGAAGCCTTTCTCAAGG GAGCAGTACCAGTAAA

AATCACTATTCTGGCAAGAATCCTGCAAATTATCCCTCAGTTTTTGCTCAATGGCAT CTCTAAGGATGTTAGAG

AACTCGACGATCTTTTTTATTCTATTCTGAAAGAAAACGGACTTGTGATCCTCAGAG ATGCTCTAAATAGGATA

ATTAACCTTCTATACGAAGGACAGCCTACAGGACATGCAACCAGTCTTAAGGACTAC GAAAATTTGTTTCCGG

TGATTGGTGTGCCAGGAATCGCTAAAACTTACCAAGAAGATGAAGTATTTGCCTATA TGCGAGTGGCTGGCTA

CAATCCCGTCACGATCGCGCGAGTAACGACTCCAGGCGATCGCTTCCCAGTCATAGA CGAACATTACCAAGGA

GTGATGGGAACTGACGATTCATTAGCAGCAGCCGGACTTGAAGGCAGACTCTACTTA GCTGACTATAAAATTT

TAGATGGTGCGGTCAACGGTACATTCCCACACGAGCAAAAATATCTCTATGCTCCCC TAGCACTATTTGCCTTA

CCCAAAGGCTCAGACCCCACCCGTTTATTGCGTCCAATAGCCATTCAATGCGGTCAA ACCCCAGACCCAGATTA

TCCAATTGTTACCCCTAACTCCGGTAAATACTCTTGGCTTTTTGCCAAAACAGTAGT CCAAATAGCAGATGCAA

ACTACCACGAAGCTGTTACTCATCTAGCAAGAACTCACCTGTTTGTTGGTGTTTTTG CGATCGCCACCGCTCGA

CAATTGCCACTCACCCATCCCCTAAGAATTCTCCTGCACCCGCATTTTGACAGCACT TTAGCAATTAACGATGCC

GCCCAACGTATTCTCATAGCTCCAGGCGGTGGTGTCGATAGATTACTCTCATCATCA ATCGATAACTCTCGCGT

TTT AGTAGTGCTAGGGTTG C A AAG CT AT G GTTTT AAT AG CG CC AT CTT ACCT AAG C A ATTCC A AC AG CG CG GT

GTAGACGATCCCAACCTCTTGCCTGTTTATCCTTACCGGGATGATGCGCTACTAGTC TGGGATGCCATTCATCA

ATGGGTTGCAGACT ACCT AAATCTTT ACT ACACCACCGATGAAGACATTCAAAAAGACACAGCATTGCAAGCC

TGGGCAGCCGAAATCTCAGCTTACGATGGTGGTCGCATCCCCGATTTTGGCGAAGAT GGGGGCATCAAAACG

CGCAATTACCTGATTGATGCCACTACGCTGATTATTTTCACTGCCAGCGCTCAACAC GCTGCGGTTAACTTTCC

GCAAAAAGATTTTATGAGCTACGCCGCAGCGATTCCAATGGCAGGTTATTTACCAGC CTCAACTCTCAAAAGA

GAAGTTACTGAGCAAGACTACCTTAATTTGCTCCCTCCCTTAGATCAGGCGCAACGG CAATACAACCTACTCAG

CTT ATT GGGATCTGTGT ATT AC A AC AAG CTG G GT GATT AT C AG CAAG GAT ACTTT AC AG ACC AG A A AGT AA AA

CC ATT G CT AC AAG C ATT CC AA AGT AAT CTT C AG C AG GT AG AAG AT ACC AT CAAG C AACGT A ATTT GCACCGTCC

ACCCTATGAGTATCTACTTCCTTCTAAAATTCCTCAGAGCATCAATATCTAG

Amino acid Sequence for WP_094347473.1 - SEQ ID NO: 194

MTASSPENSISLSSTHTLDIARQEYQYNYTHI PSIAM LDRLSIAEEFATNWYFLLAQQLRVVFINTLIVN RGNQGSKSI

RDDVERFILEAFLKGAVPVKITI LARILQIIPQFLLNGISKDVRELDDLFYSILKENGLVI LRDALNRI IN LLYEGQPTGHAT

SLKDYEN LFPVIGVPGIAKTYQEDEVFAYM RVAGYN PVTIARVTTPGDRFPVIDEHYQGVMGTDDSLAAAGLEGRL

YLADYKILDGAVNGTFPHEQKYLYAPLALFALPKGSDPTRLLRPIAIQCGQTPDPDY PIVTPNSGKYSWLFAKTVVQI

ADANYHEAVTH LARTHLFVGVFAIATARQLPLTHPLRILLHPHFDSTLAI NDAAQRILIAPGGGVDRLLSSSI DNSRVL

VVLGLQSYGFNSAILPKQFQQRGVDDPNLLPVYPYRDDALLVWDAI HQWVADYLNLYYTTDEDIQKDTALQAWA

AEISAYDGGRIPDFGEDGGI KTRNYLIDATTLI IFTASAQHAAVNFPQKDFMSYAAAIPMAGYLPASTLKREVTEQDY

LN LLPPLDQAQRQYNLLSLLGSVYYNKLGDYQQGYFTDQKVKPLLQAFQSNLQQVEDTIKQR NLHRPPYEYLLPSKI

PQSIN I

Coding sequence for WP_012164252.1 - SEQ ID NO: 195

ATGACGCCACAATATGAATATCGATACGATGCCCTGAAAGACGTTTCCCCTGAATTGAAATATCCAATGGCCA

AGGAGGTGTTTCCAGCAGACCAATCTTTGACAAAATGGCCCTGGACTCGAGACCTCG TTTCCGTTGTACTCAG

AATTATTGCCAATCAGGCCATGCAGGATATATCCGTCCGCCGAGGATCAGCCTGTCG TCTGATTACGTTTATCC

GCTTGTATCGAATTCTAGAAAATCCCCTCTATCAGTCAGGTCTGGAGCGGGTTTTCA ATGCTATCAATAATCTC

GTACGGGGTCTCTCCAATATTTTTGGCAACAGAGCCCAGTCTCAAAATATCAAGCAT GATGTAAAGGACGAGC

AACATCCTGAAAAAGTCTCCGCCCGCATTTCCGCCATAGCCAAGGATATCCAAGAAA CGGCTGAGTCGAGAGA

GGCAAGAGAGCAAACTTCTTTAGCTGACTATCGCGATCTCTTTCAGATCATTTACTT ACCGGACATTAGCAACC

ATTTCCTAGAAGATCGTGCCTTTGCCGCTCAACGGGTTGCCGGAGCCAACCCCCTCG TGATTAACCGCATTTCT

G AACTCCCAG ACCATTT CCAAGT CACT G ACCAACAGTTTAAAGCT GT G ATGGG AG ATAGT G AGT CCCT CCAAG

CAGCTTTGAATGATGGCCGAGTCTATCTGGCAGACTATCAAATTCTAGAAGAAATTG ATGCGGGTACTGTTGA

GGTAAAGGATCGCGAAATTCCAAAGTATAGATATGCGCCGTTGGCCTTATTTGCGAT CGCATCCGGAAATTGT

CCCGGTCGCCTCCTCCAACCGATTGCCATTCAATGCCACCAAGAAGCAGGCAGCCCG ATATTTACACCACCCA

GTCTAGAAGCCGATAAAGAGGAGCGGCTCGCTTGGCGCATGGCCAAGACCGTCGTTC AAATCGCCGATGGTA

ACTACCATGAATTGATTTCTCATTTAGGGCGGACTCATCTCTGGATTGAGCCCATTG CTTTAGGCACTTACCGA

CGCCTAGGAACAGAGCATCCACTGGGTAAATTGCTCCTCCCCCACTTCGAAGGCACC TTATTTATCAACAATGC

G G C AG CCA AT AG CTT AATT GCTCCAGGTGGCACCGT AG AC A AAAT CTT ATTT G G C ACCTT AA AGT CAT CTGTTC

AGCTCAGCGTCAAAGGCGCTAAGGGTTACCCCTTTTCTTTCAATGACTCCATGCTCC CCCAAACCTTTGCATCG

CGAGGCGTGGACGACCTACAAAAGCTACCGGACTACCCGTATCGAGATGATGCATTA CTGATTTGGCACGCC

ATT C ACG ATT GGGTT G AGGCCTAT CTT CAG AT CT ACT ACAAAG AT G ATG AT GC AGT CCT CAAGG AT G ACAT CCT

CCAGGATTGGTTAGCCGAGCTACGAGCTGAAGATGGAGGCCAGATGACTGAAATCGG TGAATCAACTCCAGA

AGAACCCGAGCCTAAAATTCGCACCTTGGATTACCTCATTAATGCGACAACGCTCAT TATTTTTACCTGCAGTG

CCCAACATGCATCTGTCAACTTCCCTCAAGCATCATTGATGACGTTCGTCCCCAATA TGCCCCTAGCAGGGTTC

AATGAAGGTCCGACGGCAGAGAAAGCCAGTGAAGCAGATTATTTCTCTTTACTACCA CCCCTGAGTTTGGCCG

AACAACAGTTGGATCTAGGGTATACCTTGGGTTCGGTCTACTATACTCAGCTCGGAT ATTACAAAGCCAATGA

TGTGGATTTAGATGATATTAACGACCATACCTACTTCAAGGACCTCCAAGTTAAACA GGCCCTCCGAGACTTCC

AACAAAGATTAGAAGAAATTGAGTTGATCATTCAAGACCGGAACGAAACCCGACCCA CTTATTACGACATCTT

GCTCCCATCCAAGATTCCCCAAAGTACCAACATTTAA

Amino acid Sequence for WP_012164252.1 - SEQ ID NO: 196

MTPQYEYRYDALKDVSPELKYPMAKEVFPADQSLTKWPWTRDLVSVVLRIIANQAMQ DISVRRGSACRLITFI RLY

RILEN PLYQSGLERVFNAIN NLVRGLSNI FGNRAQSQNIKH DVKDEQHPEKVSARISAIAKDIQETAESREAREQTSL

ADYRDLFQIIYLPDISNH FLEDRAFAAQRVAGANPLVI NRISELPDHFQVTDQQFKAVMGDSESLQAALNDGRVYL

ADYQILEEIDAGTVEVKDREIPKYRYAPLALFAIASGNCPGRLLQPIAIQCHQEAGS PIFTPPSLEADKEERLAWRMAK

TVVQIADGNYHELISH LGRTHLWIEPIALGTYRRLGTEHPLGKLLLPHFEGTLFI NNAAANSLIAPGGTVDKILFGTLKS

SVQLSVKGAKGYPFSFN DSMLPQTFASRGVDDLQKLPDYPYRDDALLIWHAIH DWVEAYLQIYYKDDDAVLKDDIL

QDWLAELRAEDGGQMTEIGESTPEEPEPKIRTLDYLI NATTLI IFTCSAQHASVNFPQASLMTFVPN MPLAGFNEGP

TAEKASEADYFSLLPPLSLAEQQLDLGYTLGSVYYTQLGYYKANDVDLDDIN DHTYFKDLQVKQALRDFQQRLEEIEL

IIQDRN ETRPTYYDI LLPSKIPQSTN I

Coding sequence for WP_015121985.1 - SEQ ID NO: 197

AT G AC AG ATTT AT CAG A AAAT AAT C A AAAT AATTT GTC ACC AGTG G AT AA ATT A AA ACTT G CTAG G C A AG AAT ACC AGTATA ACTATAG CC AT ATTCC ACCT ATT G C AAT G GTG G AT C A ACTTCCT AGT AAT GAG A ATTT CT CT ACT G GCTGGCTGCGTTTGTTAGCTAAAGAATTAAAAGTTGTTTTTATCAATACCCTAATCGCAA ATCGAGGAAATCGT GGTTCCGAAAGTGTCCGCGACGATGTGAGATTATTTCTGATAGAAGTGTTAGCTAAAGGG GCATTACCGTTTA ATTT AACT GTT AGT GCT AG AATTTT AC AAATT ATT CCG AATTT ATT ACTT AC AGG AAT AT C AAAGG ATT AT AGT G AAATTGATGAGTTGTTCTTTTCCATACTTAGGGAAAGCGGACTTTCTATTTTTCAAGATT CTCTAAGTCGAGTTA AAAGT CTTTT AT AT G AAAAACGTCCT AGGG G AC AT GCG AAAAGCTT AAAT GATT AT C ACAAG CT GTT CCCCG A GATGGGAATACCCAAGATAGCCGAGAATTTCTCTACAGACGAACAATTTGCTTATATGCG GGTAGCTGGATAC AACCCGGTAATGATTGAGCAAGTGAATAAATTGGGCGATCGCTTTCCCGTTACCGAGGCT CAATATCGGGAA GTCATGGGAGATGATTCTTTAGCGGCAGCAGGTGAAGAAGGAAGACTTTATTTAGCAGAC TATGGAATTTTG A AAG GTGCTGTTAACGGT ACTTTT CCTT C AC AG C AA AAGT AT ATTT ACGCTCCCCTAG C ACT ATTT G C A ATTCCT AAAAATTCCAATAGCAATAAACCAACTTTAATGCGTCCAGTTGCGATTCAGTGCGGTCAA AATCCCCAGGATA ATCCGATTATTACGCCTAAATCAGACAAATATGCTTGGCTGTTTGCAAAAACTATCGTGC AAATCGCAGATGCT AACT ACC ACG AAG CTGT AACT C ATTT AG G ACG C ACT C ATTT ACTT GT AG GTCCTTTT GTT GTTG C A ACT C ATCGT CAGTT ACCGG AT AGT CAT CCGCTT AATATATTACT AAGT CCT CATTTT G AAGG AACTTT AGCG ATAAACG AT GC AGCCCAACGTCGTTTGATTGCTGCTGGTGGAGGTGTGGATAAATTACTGGCATCGACTAT TGATAATTCCCGT GTTTTGGCAGCAGTCGGTTTACAAAGCTATGGGTTTAATGAAGCCATGTTACCCAAGCAA TTAGAGAAACGCG GCGTTAACGATACACAAAAGCTACCTGTTTACCCATACCGCGATGATGCGCTGTTAGTTT GGAATACAATTCAT CAATGGGTTGGT G ACT ATTTAAACATTT ACTACAAAAGCG ATGCGG AT GTT AAAAAT G ACACCAAACTTCAG A ACTGGGCTATTGAAGCAGGGGCTTTTGATGGCGGAAGAGTTCCAGATTTTGGTCAACAAC ATGGGCTTATTCA AACCTTAGATTACTTAATTGATGCTATTACGCTGATTATTTTTACTGCTAGCGCTCAACA TGCTGCGGTT AATTT TCCCCAGGGAGACATGATGAACTACGCTCCAGCAGTACCCTTAGCTGGTTATCAGCCTGC TTCAATTCTTGAAG GCAAAGTT ACCG AAG AAAACT ATTT AAATTT ACTT CC ACCTTT AG AACAAGC AC AAG AACAATT AAACTT AGT C CACTT GTT AGGTT CT ATTT ACT AT C AAACTTT AGGT GATT ACCCAG AG AATT ACTT CAAAG AT 
ACCTT AGT AAAA CC AG CTTT G C AACAATT CCG AAAT AATTT A ATT G AAGTTG AAG CT ACT ATT CAT C A ACG C AAT C AAA AT CGTCC T ACTT ACG A AT ATTT G CTT CCTT C AAAA ATT CCT C AAAGT ATT AAT ATTT AG

Amino acid Sequence for WP_015121985.1 - SEQ ID NO: 198

MTDLSENNQNN LSPVDKLKLARQEYQYNYSH IPPIAMVDQLPSN EN FSTGWLRLLAKELKVVFINTLIANRGN RGS

ESVRDDVRLFLI EVLAKGALPFNLTVSARILQIIPN LLLTGISKDYSEI DELFFSILRESGLSI FQDSLSRVKSLLYEKRPRGH

AKSLN DYHKLFPEMGIPKIAEN FSTDEQFAYM RVAGYN PVM IEQVNKLGDRFPVTEAQYREVMGDDSLAAAGEE

GRLYLADYGILKGAVNGTFPSQQKYIYAPLALFAI PKNSNSN KPTLMRPVAIQCGQN PQDNPI ITPKSDKYAWLFAK

TIVQIADANYHEAVTHLGRTHLLVGPFVVATHRQLPDSH PLNI LLSPH FEGTLAI NDAAQRRLIAAGGGVDKLLASTI

DNSRVLAAVGLQSYGFNEAMLPKQLEKRGVN DTQKLPVYPYRDDALLVWNTI HQWVGDYLNIYYKSDADVKN DT

KLQNWAIEAGAFDGGRVPDFGQQHGLIQTLDYLIDAITLI IFTASAQHAAVN FPQGDMMNYAPAVPLAGYQPASI

LEGKVTEENYLNLLPPLEQAQEQLNLVHLLGSIYYQTLGDYPENYFKDTLVKPALQQ FRNN LIEVEATIHQRNQNRP

TYEYLLPSKIPQSIN I

Coding sequence for WP_038083060.1 - SEQ ID NO: 199

ATGACTGCTTCATCACAAGATAATTCGATAAATGTCCCAAATGCAGATAATCTGGAC ATAGCTAGGCAAGAAT

ACCAATATAGCTACACCCATATCCCACCTCTGGCTATGGTGGATCGGCTACCTCCAG CAGAAGATTTTGCAAGT

GCCTGGTACTTTTTGTTGGCTCAGCAAGTTAGGGGACTATTTGTTAATACTCTAATT ACTAACCGAGGAAATCG

CGGCTCCGAGTCGATCCGTGATGATGTGAGATTGTTTATCCTGGAAGTATTGCTGAA AGGAGCAATACCTTTC

CAAACCAAC ATT ATT GTT AAAGTTTT ACAAATT GT CCCT CAG ATTTT AGCT CAAGGT AT AT CT CG AG ATT ACCG A

G AACT CG ACG AT CT GTT ATTTT CT AT CCT C AAAG ACAGCGGC AT CACAATT CTT AAAG ATT CTTT AAAC AAAGTT

ATTGAGCTTTTGTACGAAGGACAACCAACTGGACGCCCTACCAGTTTGAATGATTAC GAAAAGTTATTCCCAG

TGCTGGGAGTCCCCGCGATCGCAACAACATTCCAAGACGATGAAGTGTTTGCCTATA TGCGAGTTGCAGGGTA

CAATCCCGTAATCATTGAGCGAGTCAGCAGTCCTGGCGATCGTTTTCCAGTCACAGA AGAACATTACCAGGTG GTGATGGGAACTGATGATTCCCTTGCAGCAGCCGGAGAAGAAGGAAGGCTCTACTTAACA GATTATGGAATT

TTAGAAGGAACGATCGGCGGGACATTCCCGTACTATCAAAAATACCTTTACGCTCCC TTAGCACTTTTTGCATT

ACCCAAAGGCTCTGACCCCAACCGTCTGCTGCGCCCGATAGCCATTCAATGCGGTCA AACTCCCGGTCCAGAT

TATCCGATCGTCACCCCTAACTCCGGTAAGTATGCTTGGCTGTTTGCCAAAACCGTT GTCCAGATAGCAGATGC

CAAT GTCCACG AAGCT GT C ACT CACCT AGCC AG AACACACTT ATT CGTTGGTGCTTTT GT ACTT GCAACCC AT C

GCCAACTTCTCCGCACCCATCCTTTAAGCGTACTTCTGCGTCCTCATTTCGAGGGAA CCTTAGCAATTAACGAT

GCAGCCCAACGAGCTTTGATTGCTCCTGGTGGTGGAGTTGATAGATTGCTTTCAGCA ACCATCGATAACTCTC

G G GTTTT AGCGGTGTACGGGTTG C A AAGTT AC AGTTT CAAT A AT G CC ATCCT ACC A AAG C A ATTT A AG C AG CG

AGGCGTGGAAGATCCCAATCTATTGCCCGTATATCCTTACCGAGATGATGCACTTTT GGTTTGGAATGCCATTC

ATCAATGGGTTTCGAGTTACGTAAACCTTTACTACTCCACTAATGAGGACATTCAAA AAGACGCAGCCCTTCAA

GCATGGGTTGCTGAAGCCCGATCTTACGATGGCGGTCGCGTGTTTGATTTTGGTGAA GATGGAGGTATCAAG

ACACGAGAATATCTAGCAGATGCCCTTACGCTGATTATTTTCACAGCCAGCGCTCAA CATGCTGCGGTTAACTT

TCCCCAGAAAAGTCTCATGGGTTACGCAGCTGCCGTACCACTAGCAGGTTACGCACC AGCCTCAACTCTCACTA

AGGAAGTGAGTGAAGAAGACTATCTCAAATTGCTCGCACCCCTAGATCAAGCACAAA GGCAGTATAATTTACT

GGCTTTGCTGAGTGCTGTTTACTATAACAAACTCGGTGAATACCCGCAAGGACACTT TACAAATCCACAAGTCC

AACCTTTACTACAGGAATTTCAGAGCAATCTCAAGCAGGTTGAAGCAACTATCAATC AGCGCAATTTGAAACG

CCCAATCTATAATTATTTGCTGCCTTCCAAAATTCCCCAGAGCATTAATATTTAG

Amino acid Sequence for WP_038083060.1 - SEQ ID NO: 200

MTASSQDNSINVPNADN LDIARQEYQYSYTHIPPLAMVDRLPPAEDFASAWYFLLAQQVRGLFVNTLITN RGN RG

SESI RDDVRLFILEVLLKGAIPFQTN IIVKVLQIVPQILAQGISRDYRELDDLLFSILKDSGITILKDSLN KVIELLYEGQPTG

RPTSLN DYEKLFPVLGVPAIATTFQDDEVFAYM RVAGYN PVI IERVSSPGDRFPVTEEHYQVVMGTDDSLAAAGEE

GRLYLTDYGILEGTIGGTFPYYQKYLYAPLALFALPKGSDPNRLLRPIAIQCGQTPGPDYPIVTPNSGKYAWLFAKTVV

QIADANVHEAVTH LARTHLFVGAFVLATH RQLLRTH PLSVLLRPH FEGTLAI NDAAQRALIAPGGGVDRLLSATIDN

SRVLAVYGLQSYSFNNAI LPKQFKQRGVEDPN LLPVYPYRDDALLVWNAIHQWVSSYVNLYYSTNEDIQKDAALQA

WVAEARSYDGGRVFDFGEDGGIKTREYLADALTLII FTASAQHAAVN FPQKSLMGYAAAVPLAGYAPASTLTKEVS

EEDYLKLLAPLDQAQRQYNLLALLSAVYYNKLGEYPQGH FTNPQVQPLLQEFQSNLKQVEATI NQRN LKRPIYNYLL

PSKI PQSI NI

Coding sequence for WP_006516541.1 - SEQ ID NO: 201

ATG ACTG C AAG CTAT A A AAAT C AA AAT CT G C AAG AA AAA AAG C AG CAAT ATC AGTATAACTATACCC AT ATCC

CACCTGTGGCCATGGTAGACAAACTGTCAGAAGAGGAGGGGTTTTCTCCTGGATGGC GGTTGTTAGTGGCCA

AGGTTGGGTTTGAACTCCTCGTTAACACCATTATTGCTAATCGTGGAGATCAGGGTA AATCTGGAGCAGCCGA

TGATGTCAAAATATTTCTGATAGAAACGGTTAAGGAAACATTGGTAGATTACAAAGG TTTTTCTCGCCTGAAG

ATTCTCTGGCAAGGGGCAAAATATACCCCTAGACTCTTATTTGGCAGATTATCTATC AATGTAGAAGAGATTGA

AGATCTGATTACAGATATTATCAAAAGTGTCAGCGCTGATTTCCTCCGAGATTTTGC AGCTAACGTACAGCAAA

AATT AAT ACTGGACTCTCCTAAAGGTAAAGGGGATGACCTCAAAGATTTTCAGGAGCTATTTCAAAC CATTGA

TCTACCTGCCATCGCTTATACCTATGAGGAGGATGAGGTATTTGCATCCATGCGGGT AGCTGGGCCTAATCCG

GTCATGCTACAGCGACTGACAGAACCTGAGGCACGGCTGCCGATCACAGAGGCTCAA TATCAAGCCGTCATG

GGAGCAACGGATTCTCTGACAGAGGCCTATGCAGAGGGACGTGTATACCTGACGGAT TACGCCATTCTAGAG

GGGGCAATCAATGGCTCATTTCCCGCCGATCAGAAATATCTATACGCCCCCCTAGCC CTATTTGCTGTACCGAA

AGCCGATGTGGGCGATCGTCGTCTGCGTCCGGTGGCCATTCAATGTGGGCAAAACCC TAATGATTTTCCCATC

CACACGCCCAAATCAAATCCCTATGCATGGCTCTGCGCTAAGACCATTGTGCAGGTT GCCGATGCGAACTTCC

ATGAGGCGGTTACCCATCTGGCGCGGACTCATTTGTTCATTGGGCCATTTGCGATCG CAACCCACCGCCAACTC CCCGACAATCATCCCCTCAGTCTTCTCCTGCGCCCCCACTTCCAAGGCATGCTGGCCATC AACAACGAAGCCCA

GGCCAAGCTGATTGCTGCCGGTGGTGGCGTTAACAAAATTCTCTCAGCAACCATCGA CACGTCCCGAGTATTT

GCCGTCCTGGGGGTACAAACCTATGGCTTCAATTCCGCCATGTTCCCCAAGCAGCTG CAACAGCGCGGTGTAG

ACG ACACCAACAGCCTACCCAT CT ACCCCT ACCGT GAT G ACGGT AGCTTAATTTGGG ACGCCAT CCACAATT G

GGTAGAGGACTATCTCAAGCTGTACTATGCCGATGACGCTGCAGTACAGCAAGATGC TAATTTGCAAGCCTGG

GCACAGGAACTCATTGCTTATGATGGCGGTCGCGTCATAGAGTTTGGCGAAACTGAC GAACAACTGCAAACG

CTGCTGCAAACCCTTACGTATCTCATTGATGCCATTACTCTGATTATTTTTACCGCC AGTGCTCAACACGCCGCT

GTGAATTTCCCCCAAAAGGACATCATGAGCTTCACCCCAGCGATGCCGACCGCTGGC TATGATGAGTTACCAG

ATCTGGGAGACCAGACCACAAAAGAAGATTACCTGAGTTTGTTACCGCCTTTAAACC AAGCCCAAGAGCAGCT

CAAGCTATTGCACTTGCTTGGCTCCGTGCATTTTACAGAATTAGGCCAGTACGAAAA GGGACATTTTCAAGAC

AGT CAAGT ACAAGCCCCCTTGC AACGTTTCCAG AATCG ATT AG AAG AAAT CACAG AT GT GAT CT ACC AGCGC A

ATCGCAATCGTCCCGCCTACGAATATCTATTACCCAAGAATATTCCCCAAAGCATCA ATATCTAG

Amino acid Sequence for WP_006516541.1 - SEQ ID NO: 202

MTASYKNQNLQEKKQQYQYNYTHIPPVAMVDKLSEEEGFSPGWRLLVAKVGFELLVNTIIANRGDQGKSGAADD

VKIFLIETVKETLVDYKGFSRLKILWQGAKYTPRLLFGRLSINVEEIEDLITDI IKSVSADFLRDFAANVQQKLILDSPKGK

GDDLKDFQELFQTIDLPAIAYTYEEDEVFASMRVAGPNPVMLQRLTEPEARLPITEA QYQAVMGATDSLTEAYAEG

RVYLTDYAILEGAINGSFPADQKYLYAPLALFAVPKADVGDRRLRPVAIQCGQN PNDFPI HTPKSNPYAWLCAKTIV

QVADAN FHEAVTHLARTH LFIGPFAIATH RQLPDNH PLSLLLRPH FQGM LAIN N EAQAKLIAAGGGVN KILSATIDT

SRVFAVLGVQTYGFNSAM FPKQLQQRGVDDTNSLPIYPYRDDGSLIWDAIH NWVEDYLKLYYADDAAVQQDANL

QAWAQELIAYDGGRVIEFGETDEQLQTLLQTLTYLI DAITLIIFTASAQHAAVN FPQKDI MSFTPAMPTAGYDELPDL

GDQTTKEDYLSLLPPLNQAQEQLKLLHLLGSVHFTELGQYEKGH FQDSQVQAPLQRFQN RLEEITDVIYQRNRN RP

AYEYLLPKNI PQSIN I

Coding sequence for WP_099100980.1 - SEQ ID NO: 203

ATG ACTG CTT CAT C ACC AG AAA ATT C AATT AG CT CAT CAAGT ACT CAT ACTTT AG AT AT AG CT AG G C A AG AGT A

TCAATATAACTACACCCATATTCCATCTATTGCGATGCTAGATCGGCTTTCTATTGC CGAAGAGTTCGCTACTAA

CTGGTATTTTTTATTAGCCCAGCAGTTACGAGTTGTGTTTATTAATACTTTGATTGTCAACAGAGGCAATCAAG

GTTCTAAATCGATTCGTGATGATGTCGAAAGGTTTATTTTAGAAGCCTTTCTCAAGG GAGCAGTACCAGCAAA

AAT CAGT ATTTT GGC AAG AAT CCTGC AAATT AT CCCT C AGTTTTT GCT CAAAAGT AT AT CT AAGG AT GTT AG AG

AACTCGACGATCTTTTTTATTCTATTCTGAAAGAAAACGGACTTGTAATCCTCAGAG ATGCTCTAAATAGGATA

ATT AACCTT CT AT AT G AAGG AC AACCT ACAGG ACATGCAACCAGT CT C AAGG ATT AT G AAAATTT GTTT CC AGT

GATTGGTATGCCAGCGATCGCTAAAACCTACCAAGAAGATGAAGTATTTGCCTACAT GAGAGTCGCTGGCTAC

AATCCCGTCACGATCGCGCGAGTAACGACTCCAGGCGATCGCTTCCCAGTCACAGAC GAACATTACCAAGCAG

TGATGGGAACTGACGATTCACTAGCAGCAGCCGGACTTGAAGGCAGGCTCTACTTAG CTGACTATAAAATTTT

AGATGGTGCGGTCAACGGTACATTCCCACACGAGCAAAAATATCTCTATGCTCCCCT AGCACTATTTGCCTTAC

CCAAAGGCTCAGACCCCACCCGTCTATTGCGTCCAATAGCCATTCAATGCGGTCAAA CCCCAGGCCCAGATTAT

CC AATT GTTACCCCT A ACT CCG GT AAAT ACT CTT G G CTTTTT G CC AA AAC AGT AGT CC AAAT AG C AG ATG C AA A

CTACCACGAAGCTGTTACTCATCTAGCAAGAACTCACCTCTTGGTTGGTGTTTTTGC GATCGCCACCGCTCGAC

AATTGCCACTCACCCATCCCCTAAGAATTCTCCTGCACCCGCATTTTGACAGCACTT TAGCAATTAACGATGCCG

CCCAACGTATTCTCATAGCTCCAGGCGGTGGTGTCGATAGATTACTCTCATCATCAA TCGATAACTCTCGCGTT

TT AG C AGTG CTAG G GTT G C A AAG CT AT G GTTTT A AC AG CG CC AT CTT ACCT AAG C AATT CCA AC AG CG CG GTG

TAGACGATCCCAACCTCTTGCCTGTTTATCCTTACCGGGATGATGCACTATTAGTCT GGGATGCCATTCATCAAT

GGGTTTCAGACTACCTGAACCTTTACTACACCACGGATGAAGACATTCAAAAAGACA CAGCATTGCAAGCGTG GGCAGTTGAAATCTCAGCTTACGATGGTGGTCGCATCCGCGATTTTGGCGAAGATGGGAG CATCAAAACGCG CAATTACCTAATTGATGCCACTACGCTGATTATTTTCACTGCCAGCGCTCAACACGCTGC CGTTAACTTTCCGCA A AA AG ATTTT ATG G G CTACG CCG C AG CC AT ACC ATT G G C AG GTT ATTT ACC AG CCT C AACT CT C AA AAG AG AA GTTACTGAGCAAGACTACCTTAATTTGCTCCCTCCCTTAGATCAGGCGCAACGGCAATAC AACCTACTCAGCTT ATT GGGGTCTGTGT ATT AC AAC AAG CTG G GT GATT AT C AG C AAG GAT ACTTT AC AG ACC AG A AAGT AAA ACC A TTGCT ACAAGCATTCCAG AGT AATCTT CAGCAGGT AG AAG AT ACCATCAAGCAACGTAATTT GCACCGT CCAC CCT AT G AGT AT CT GCTT CCTT CT AAAATT CCT CAG AGC AT CAAT ATCTG A

Amino acid Sequence for WP_099100980.1 - SEQ ID NO: 204

MTASSPENSISSSSTHTLDIARQEYQYNYTHI PSIAM LDRLSIAEEFATNWYFLLAQQLRVVFINTLIVN RGNQGSKSI

RDDVERFILEAFLKGAVPAKISI LARILQIIPQFLLKSISKDVRELDDLFYSILKENGLVILRDALN RI IN LLYEGQPTGHAT

SLKDYEN LFPVIGMPAIAKTYQEDEVFAYMRVAGYNPVTIARVTTPGDRFPVTDEHYQAVMGTDDSL AAAGLEGR

LYLADYKILDGAVNGTFPHEQKYLYAPLALFALPKGSDPTRLLRPIAIQCGQTPGPD YPIVTPNSGKYSWLFAKTVVQI

ADANYHEAVTH LARTHLLVGVFAIATARQLPLTHPLRILLHPHFDSTLAI NDAAQRILIAPGGGVDRLLSSSI DNSRVL

AVLGLQSYGFNSAI LPKQFQQRGVDDPN LLPVYPYRDDALLVWDAIHQWVSDYLNLYYTTDEDIQKDTALQAWA

VEISAYDGGRIRDFGEDGSI KTRNYLIDATTLI IFTASAQHAAVNFPQKDFMGYAAAIPLAGYLPASTLKREVTEQDYL

N LLPPLDQAQRQYN LLSLLGSVYYNKLGDYQQGYFTDQKVKPLLQAFQSNLQQVEDTIKQRNLHRPPYEYLLPS KIP

QSIN I

Coding sequence for WP_096578311.1 - SEQ ID NO: 205

ATGCTGCCAACTTTACCGCAGAATGATCCCAATCCTAGTGTGCGTCAAGCACAATTG GCTCGCAGCCGATATAT

CTACAAATTTACTCATAAGTACCAAGGCTGTCCCGGAAATTCACCTTTACCTAATGG GATTGCGCTGGCAGAAC

ATGTTCCTCCTGATCAGGAGTTTACTCCAGACTATCTTTTGCGGGTTACTCAGGTTA ACGCCACCTTACTGGCA

AACCACGCAGCCATCGACCTGGAGTATCTCACAGGAGGAAACGCAGGTAGCAGCTTT TCGCTGTCTGATTGGT

TAGGATTAACTCGGGCTGTAGGCAATAAACACTTACTTTTTTCCACACCGCTCAAGG TGACTTCCAGGATAGAT

AGTTCTTTTCCGATTAATTTGGATGCCTACGATGCAATGTTTGCGTTGATCCAGAAA CCTGAGATTGTTTACAA

GTT AA AG C A AG G CAG G GAT GTTT GCGATCGCG CTTTT G CCTG G C A AAG G CTG G CTG GTG CT AATCCG AT G GT

TTTGCAAGGTATTACTCATTTACCACCGACGTTTCAGCTTACTAACCAGCAATATCA AGCTGCTATTAGAGATG

AGAACGACACCCTTGAAGCTGCTGGTAAGGAAGGGAGGCTTTACGTTGCTGACTACT CGCTGCTTAGTGGGC

TTCCTCACGGTACTTGGAGTGATGGCGTTCTTGGTGTGCCTCGTAATAAGTATATCT TTGACCCAATCGCTCTA

TTTGCTTGGAAAAAAGAAACTCCACTGGAATTAGGAGGGTTATTACCCGTAGCAATT CAATGCCAACAAACTC

AAGATTCTATTTCGTGGTGTCGTTCGGTTGCACCAATCTTTACTCCTAATGATGGAA TCTTCTGGGAAATGGCT

A AAG CT ATT GT CC AATCCG CTG ATG GT AAC ATT C AG G A AAT G GTCT ACC ATTT AG G G C AC ACG C ACTTT GT A AT

GGAAGCCGTAATTGTTGCCGCAGAGCGCAATCTAGCTGCTGTTCATCCAATTCATGT ACTGCTTAAGCCCCATT

TTGAATTTACGCTATCACTAAATGACTATGCATACAAGCACCTAATTGCACCAGGTG GTGCAGTTGATTCGGTG

ATG G GTT C AAC ACTT G A AGG C AG CTT AACT CTT AT G CTT CG G G GT AT G A AAA ACT AT GCTTTT AAT C A AG CT CT

ACCTCCCCTAGATTTCAAAAATCGTGGCGTTGATAATTTAGATGGGTTACCTGAGTA TCCTTATCGCGATGATG

GTTTATTAGTTTGGACGGCAATTCGTAAGTTTGTATCCAAATATCTCCGGCTCTACT ATACCAATGATATTGATG

TCAAAACCGATACCGAACTCCAAAACTGGGTCAAAAGTATTGGCAATAGTCAAGAAG GAAATATTCAAGGAG

TGGAGGAAATCCAAACCTTAGAAAAGCTGATTGATATGGTAGCCTTAATCATTTTTA CCGCTTCAGCACAGCAT

GGGTCACTCAACTACGCACAATTCCCAATGATGGGTTATGTACCGAATGTGTCTGGA GCAATTTACGCAGAAG

CT CCCACAAAT ACAACT CCT CAG AAT C AAG AC AATT ATTT AAT GTT GTT GGCTCCCGT ACAAC AAGCCCT G ATA

C AGTT C AC A ACT CT AT AT C AATT GT CG A ACGT ACG CTACG GT A AATT AG GT C ATT ATCCCTG CTT AT ATTTT C AA GATTCGCGAGTACTTCCTTTAGTCAAGGAATTCCAGCAGAACTTAGCTGTTGTTGAGTCA GAAATTCTTGATCG CG ACC AAACT CGTTTT AT GT CAT ATCCTTTT CTGCTTCCCT CT CAAATT GGG AACAGCAT CTTT ATTT G A

Amino acid Sequence for WP_096578311.1 - SEQ ID NO: 206

MLPTLPQNDPN PSVRQAQLARSRYIYKFTHKYQGCPGNSPLPNGIALAEHVPPDQEFTPDYLLRVTQVNAT LLANH

AAI DLEYLTGGNAGSSFSLSDWLGLTRAVGN KHLLFSTPLKVTSRIDSSFPIN LDAYDAMFALIQKPEIVYKLKQGRDV

CDRAFAWQRLAGANPMVLQGITHLPPTFQLTNQQYQAAIRDEN DTLEAAGKEGRLYVADYSLLSGLPHGTWSDG

VLGVPRNKYIFDPIALFAWKKETPLELGGLLPVAIQCQQTQDSISWCRSVAPIFTPN DGIFWEMAKAIVQSADGN IQ

EMVYHLGHTHFVM EAVIVAAERN LAAVH PIHVLLKPHFEFTLSLNDYAYKH LIAPGGAVDSVMGSTLEGSLTLM LR

GM KNYAFNQALPPLDFKN RGVDNLDGLPEYPYRDDGLLVWTAI RKFVSKYLRLYYTN DIDVKTDTELQNWVKSIG

NSQEGN IQGVEEIQTLEKLIDMVALII FTASAQHGSLNYAQFPMMGYVPNVSGAIYAEAPTNTTPQNQDNYLMLL

APVQQALIQFTTLYQLSNVRYGKLGHYPCLYFQDSRVLPLVKEFQQNLAVVESEILDRDQTRFMSYPFLLPSQIGNSI

FI

Coding sequence for RCJ33284.1 - SEQ ID NO: 207

ATG ACTG CTT CAT C ACC AG AAA ATT C AATT AG CT CAT C AAGT ACT CAT ACTTT AG AC AT AG CT AG G C A AG AGT A

TCAATATAACTACACCCATATTCCATCTATTGCGATGCTAGATCGGCTTTCTATTGC CGAAGAGTTCGCTACTAA

CTGGTATTTTTTATTAGCCCAGCAGTTACGAGTTGTGTTTATTAATACCTTGATTGTCAACAGAGGCAATCAAG

GTTCTAAATCGATTCGTGATGATGTCGAAAGGTTTATTTTAGAAGCCTTTCTCAAGG GAGCAGTACCAGTAAA

A AT C AGT ATT CTG G CAAG A AT CCTG CAAATT ATCCCT C AGTTTTT G CT C AAA AG CAT AT CTC AG G AT GTT AG AG

AACTCGACGATCTTTTTTATTCTATTCTGAAAGAAAACGGACTTGTAATCCTCAGAG ATGCCCTAAATAGGATA

ATT AACCTT CT AT AT G AAGG AC AACCT ACAGG ACATGCAACCAGT CT C AAGG ACT ACG AAAATTT GTTT CCGGT

GATTGGTGTGCCAGCGATCGCTAAAACTTACCAAGAAGACGAAGTATTTGCTTACAT GCGAGTGGCTGGCTAC

AATCCCGTCACGATCGCGCGAGTAACGACTCCAGGCGATCGCTTCCCAGTCACAGAC GAACATTACCAAGGCG

TGATGGGAACTGACGATTCATTAGCAGCAGCCGGACTTGAAGGCAGACTCTACTTAG CTGACTATAAAATTTT

AGATGGTGCGGTCAACGGTACATTCCCACACGAGCAAAAATATCTCTATGCTCCCCT AGCACTGTTTGCCTTAC

CCAAAGGCTCAGACCCCACCCGTTTATTGCGTCCGATAGCCATTCAATGCGGTCAAA CACCAGACCCAGATTAT

CCAATTGTTACCCCTAACTGCAGTAAATACTCTTGGCTTTTTGCCAAAACAGTAGTC CAAATAGCAGATGCCAA

CTACCACGAAGCTGTTACTCATCTAGCAAGAACTCACCTGTTTGTTGGTGTTTTTGC GATCGCCACCGCAAGAC

AACTGCCACTCACCCATCCCCTAAGAATTCTACTGCACCCGCATTTTGACAGCACTT TAGCAATTAACGATGCT

GCTCAACGGATTCTCATAGCTCCAGGCGGTGGTGTCGATAGATTACTCTCATCATCA ATCGATAACTCTCGCGT

TTT AG C AGTG CT AG G CTT AC AA AG CT AT G GTTTT AAC AGT G CC AT CTT ACCT A AG C A ATT CCAACAGCGTGGTG

TAGACGATCCCAACCTCTTGCCTGTTTATCCTTACCGGGATGATGCACTATTAGTCT GGGATGCCATTCATCAAT

GGGTTTCAGACTACCTAAACCTTTACTACACCACCGATGAAGACATTCAAAAAGACA GAGCATTGCAAGCGTG

GGCAGCCGAAATCCCAGCTTACGATGGTGGTCGCATTCCCGATTTTGGCGAAGATGG AGGCATCAAAACGCG

CAATTATCTAATTGATGCCACTACGCTGATTATTTTCACTGCCAGCGCCCAACACGC TGCGGTTAACTTTCCGCA

AAAAGATTTTATGGGCTACGCCGCAGCGATTCCAATGGCAGGTTATTTACCAGCCTC AACTCTCAAAAGAGAA

GTTACTGAGCAAGACTACCTTAATTTGCTCCCTCCGTTAGATCAGGCGCAACGGCAA TACAACCTACTCAGCTT

ATT GGGGTCTGTGT ATT AC AAC AAG CTG G GT GATT AT C AG CAAG GAT ACTTT AC AG ACC AG A AAGT AAA ACC A

TTGCT AC AAGC ATTCCAG AGT AAT CTT CAGC AGGT AG AAG AT ACG AT CAAGC AACGT AATTTGCGCCGT CC AT

CCT AT G AGT AT CT ACTTCCTT CT AAAATTCCT C AG AGCAT CAAT ATCTG A

Amino acid Sequence for RCJ33284.1 - SEQ ID NO: 208

MTASSPENSISSSSTHTLDIARQEYQYNYTHI PSIAM LDRLSIAEEFATNWYFLLAQQLRVVFINTLIVN RGNQGSKSI

RDDVERFILEAFLKGAVPVKISILARI LQI IPQFLLKSISQDVRELDDLFYSI LKENGLVILRDALNRII NLLYEGQPTGHAT

SLKDYEN LFPVIGVPAIAKTYQEDEVFAYM RVAGYN PVTIARVTTPGDRFPVTDEHYQGVMGTDDSLAAAGLEGRL

YLADYKILDGAVNGTFPHEQKYLYAPLALFALPKGSDPTRLLRPIAIQCGQTPDPDY PIVTPNCSKYSWLFAKTVVQI

ADANYHEAVTH LARTHLFVGVFAIATARQLPLTHPLRILLHPHFDSTLAI NDAAQRILIAPGGGVDRLLSSSI DNSRVL

AVLGLQSYGFNSAI LPKQFQQRGVDDPN LLPVYPYRDDALLVWDAIHQWVSDYLNLYYTTDEDIQKDRALQAWA

AEI PAYDGGRIPDFGEDGGIKTRNYLIDATTLII FTASAQHAAVN FPQKDFMGYAAAI PMAGYLPASTLKREVTEQD

YLNLLPPLDQAQRQYNLLSLLGSVYYN KLGDYQQGYFTDQKVKPLLQAFQSN LQQVEDTI KQRN LRRPSYEYLLPSK

IPQSI N I

Coding sequence for WP_052555973.1 - SEQ ID NO: 209

ATGGCGCGAACCGCTCGGTACCGGTTCGGACCCGAATTGCCCGGCGCCCGACCCGAT GCCCAGGTGGTTCAC

CCGATGAGCGCATTTCTGCCCGCGTTCGATCCGGACCCGGAAACCCGTGCCGCCGGG CGCGCCGCGAAGCGG

GGCGAGTACACGTACAACCACGAATACGTTTCGCCGCTCGCGTTCGTCGGGGAGGTG CCCAGCCGCGACCGG

TT CCCCATCG ATTTCACCACGCT CGTT CT CGGCAAG ATCAT G ACG AACGTGGCG AACCAGGCGG ACGCGG ATT

CCGCGCTGCGCCGGCGCCTGCGCGCGATGGACGTCCCGATCGCCGACATGGTGCTCG CCGGGTCGACGGCCG

TTCGCGCCGTCGGCGCCGCGGTGGGTGCCGTGATCGGGGCGGCGGCGGATGCCCGTC GGTTGCAAACGATC

GACGACTACAACGCTCTCTTCCACGTCATCGGGCTGCCGCCGATCGCGAAGGACTTT GAATTCGACAGCACGT

TCGCGGAATTGCGGCTCGCCGGGCCGAACCCGGTGATGATTCACCGGGTCGACAAGC CGGACGATCGATTCC

CGGTCACGGACGCGCATTTTCAGGTCGCACTGCCCGGCGACACCCTCGCGGCGGCCG GGGCGGAAGGGCGA

CT GTTT CT GGT GG ACTACCAG AG ACTT G ACGGGGT CG AG ACCGGT GTAAGCCCGT GCGGGCTGCCG AAGTAC

CTCTACGCCCCGCTCGCGCTGTTCGCGGTGAACAAGGACACGCGAAAACTGGTCCCG GTCGCGATCCAGTGC

AAGCAGCGGCCGGGACCGGAGAACCCGATCTTCACGCCGGACGACGGCTACAACTGG CGGATCGCCAAGAC

GATCGTGGAAATCGCCGACGGCAACTACCACGAGGCGATCACGCACCTCGGGCGCAC GCACCTGACGGTCGA

GCCGTTCGTGGTCGCGGCGCACCGGCAGTTCGGTCCGAACCACCCGCTCAATGTGCT GCTCCAACCGCACTTC

GGTGGCACACTCGCGATCAATCACCTCGCGCGTCTCAAACTGATTTCGCCCGATGGC GTCGTGGACCGGCTCC

TCGGCGCGAAGATCTCCGCGGCGCTGGAACTCAGCGCGTGGGGGGTGCAGGGCCACG CCTTCATGGATTTGC

TGCCGCCGGCGTCGTTTCGGCGCCGCGGGGTCGATAACACGGCCACCTTGCCGAGCT ACTCCTACCGCGATGA

CGCCCTCTTGCACTGGGAGGCCGTTCGCGAGTGGGTCGCGACGTACCTGCGGTGCTT CTACCGGTCCGATGCC

GAAGTCGCGGCGGACGTGGAAGTCGCGGCGTGGCTCACGGAGGCGTCCGCGAAGACC GGCGGGCGCATCA

ACGGGATCGAACCGGCCCGCACCTTCGCGGAACTGGTCGACGTGACCGCCCTTGTGA TTTTCACCGCGAGCGC

GCAGCACGCGGCGGTGAACTTCCCGCAATACGACATCATGAGTTACGCCCCCGCGAT GCCGCTCGCGGGTTA

CGCCCCGGCGCCCACGAGCAAGACCGGCGCCACAGAAGCCGACTACATGGCGATGCT GCCACCGCGGGACC

AGGCCGCGCTCCAGATGAACACCGGCTTCATGCTCGGAACGGCGCACTACACGCGGC TGGGGCACTACGAAC

CGGGGTACTTCGGCGAACCGCGCATTAACGAACTAGCGGCGCGATTCGCGGCGAAGA TGGACGAGATCGAG

GCCACCATCACGGAAAGAAACCGGCACCGCCGGCCGTACCCGTTTATGCTGCCATCG GGTGTGCCGCAGAGC

AT CAACATTT G A

Amino acid Sequence for WP_052555973.1 - SEQ ID NO: 210

MARTARYRFGPELPGARPDAQVVH PMSAFLPAFDPDPETRAAGRAAKRGEYTYNH EYVSPLAFVGEVPSRDRFPI

DFTTLVLGKI MTNVANQADADSALRRRLRAM DVPIADMVLAGSTAVRAVGAAVGAVIGAAADARRLQTIDDYNA

LFHVIGLPPIAKDFEFDSTFAELRLAGPNPVMI HRVDKPDDRFPVTDAH FQVALPGDTLAAAGAEGRLFLVDYQRLD

GVETGVSPCGLPKYLYAPLALFAVNKDTRKLVPVAIQCKQRPGPEN PIFTPDDGYNWRIAKTIVEIADGNYHEAITHL

GRTH LTVEPFVVAAHRQFGPN HPLNVLLQPHFGGTLAI NHLARLKLISPDGVVDRLLGAKISAALELSAWGVQGHA FMDLLPPASFRRRGVDNTATLPSYSYRDDALLHWEAVREWVATYLRCFYRSDAEVAADVE VAAWLTEASAKTGG

RINGI EPARTFAELVDVTALVIFTASAQHAAVN FPQYDI MSYAPAMPLAGYAPAPTSKTGATEADYMAMLPPRDQ

AALQMNTGFMLGTAHYTRLGHYEPGYFGEPRI NELAARFAAKMDEIEATITERNRH RRPYPFMLPSGVPQSIN I

Coding sequence for WP_103667398.1 - SEQ ID NO: 211

ATGATCTTCTCGCTTTTGAGTGGTGTTGCCAGAGTATTAAATTTCGTTTCGGCTAAG TTAACAGACTTAGCCAA

TTTAATATCAAGGCGATCGCAGTCAAGCAAATACCCGCTGTTGCCTCAGAATGATCC CGCAACTACTCAGCGTC

A AG CAT CT CT A AAT C A AT CTAG G C A ACT CT AT C A AT AT AACT AC ACCT AT ATT G AGT C ATT G CCA AT G GT AG AG

AAGGTTCCCAAGAATGAGAGATTTTCTCTATCTTGGGGATTATTAGTTGGGAAGGTA GTGGTCAAAGTTTTGT

T AAAT GAT CG AGCT AAT CCTTCGGCATT C ATT G ACAAAG AG AAAT CT AAAGC AC AAC AACT AG ACTT CT CAAA

ACGTTTGCTT G AAGCT AGCAT GT CT CAGT CT G AAAAT GCATTAAT AG AACT ATT GTCCG AATT GCCAACAATT C

TTGAAGATGAGCCAATTGATTTAGAAGGGTCAAACATTCAAGAATACAACAATCTTT TTTGGATTATTCCTCTA

CCTG C AAT CAGT C AA AATTTT AAG AG C A ATT C AG A ATTT G C A AG GTTACGCGTTGCT G GCTTT AAT CCT CT AGT

GATTCAAAAGGTTAAGGCTTTGGATGCCAAATTCCCCTTGACTGAGGCGCAATTCCA GAAGGTTTTGGCTGGT

GATTCTTTAGCTGCGGCAGGAGCAGAAGGGCGTTTGTATTTGGCTGATTATGTAGAA CTAACCGCGATCGCAG

GCGGCACTTTCCCTAAATCAGAACAGAAATATATCAACGCACCTTTAGCTCTATTTG CGATTCCTAAAGGGAAA

AAGAGCCTGACTCCGATCGCCATTCAACTAGGACAAGATCCGAATACTAATCCCATC TTTGTCTGTCAAGCTGG

TGATGAGCCAAACTGGATGCTAGCAAAAACTGTTGTCCAAATTGCCGATGCTAATTA CCATGAACT AATT AGT

CATTTGGGTAGAACTCATCTATTTATCGAGCCTTTTGCGATCGCTACTAATCGCCAA CTCGCCAGTAATCATCCT

CTATATGTTTTACTAAAGCCACATTTTCAAGGGACTTTAGCGATTAATGATGCGGCT CAGTCAGGACTGATTAA

TGCAGGTGGAACCATTGATAGTCTATTAGCAGGCACGATTACTTCGTCTCGCGCACT TTCAGTTCAGGGTGTA

AAAACCTATAACTTTGATGAGGCGATATTGCCTGTAGCTTTGAAGAAGAGAGGAGTT GATGATCCAAACCTAT

TGCCAGACTATCCCTATCGCGATGATGCTTTGTTAGTTTGGGATGCTATTTCAACTT GGGTTAAAAGCTATCTA

TCGATCTATTACTTCAATGACAATGATGTGATTAGAGATTCGGAACTGCAAGCTTGG GCACAGGAAATCATTT

CTGACAATGGTGGTCGCGTAACTAGTTTCGGACAGAGTGGACAGATTCGCACTTTTG ATTATTTAGTCAATGCT

GTAACTCTACTAATCTTTACTGGTAGTGCTCAACATGCGGCGGTGAACTTCCCCCAA GGCGACTTGATGGTTTA

TGCTCCCGCATTTCCTCTAGCTGGCTATACCCCTGCACCAACTTCAACCACAGGTGC AAGCGAGGCAGATTTCT

TTGCAATGTTGCCTCCTATCGATCAGGCTAAGAGCCAATTGACGATGACTTATATTC TTGGTTCGGTCTATTAC

ACGACCTTGGGTGAGTATGGGCCTAGTTATTTCAATGACGATCGCATTAAGCAGCCC CTACTCGATTTCCAAG

AT CAGTT AAAGGCG AT CG AGT CAAC AAT C AAGT CT CGT AAT G AAAAACG AGTT ACGG ACT AT AACT ATTT GAG

ACCAT C ACGG ATT CCT C AAAGT ATT AAT AT CT AA

Amino acid Sequence for WP_103667398.1 - SEQ ID NO: 212

MI FSLLSGVARVLN FVSAKLTDLANLISRRSQSSKYPLLPQNDPATTQRQASLNQSRQLYQYNYTYI ESLPMVEKVPK

N ERFSLSWGLLVGKVVVKVLLNDRANPSAFIDKEKSKAQQLDFSKRLLEASMSQSENALI ELLSELPTILEDEPIDLEG

SNIQEYNN LFWI IPLPAISQN FKSNSEFARLRVAGFNPLVIQKVKALDAKFPLTEAQFQKVLAGDSLAAAGAEGRLYL

ADYVELTAIAGGTFPKSEQKYINAPLALFAIPKGKKSLTPIAIQLGQDPNTNPI FVCQAGDEPNWMLAKTVVQIADA

NYHELISHLGRTHLFIEPFAIATN RQLASNH PLYVLLKPH FQGTLAIN DAAQSGLI NAGGTIDSLLAGTITSSRALSVQG

VKTYNFDEAI LPVALKKRGVDDPNLLPDYPYRDDALLVWDAISTWVKSYLSIYYFNDNDVI RDSELQAWAQEIISDN

GGRVTSFGQSGQIRTFDYLVNAVTLLIFTGSAQHAAVNFPQGDLMVYAPAFPLAGYT PAPTSTTGASEADFFAM LP

PI DQAKSQLTMTYI LGSVYYTTLGEYGPSYFN DDRI KQPLLDFQDQLKAIESTIKSRN EKRVTDYNYLRPSRI PQSIN I

Coding sequence for WP_023071825.1 - SEQ ID NO: 213

AT G ACTGC AAGCT ACT CC AACCC AG ACC AACAT AAAAAACGTTT AG AAT AT C AAT ACAACT AT ACCC AT ATT CC

GCCCATAGCTATGGTGGATAAGCTATCAGAGGAAGAGCAATTTTCTTCGCGATGGCG TTTGATGGTGGCTAAA

GTTGGTTTTGAAATACTGGTTAATACGATTATTGTCAATCGAGGTGATCAAGGTAAA TCAGGAGCCGCAGACG

ATGTTAAAGCCTTTCTCATAGAGACTTTTCAGGAGACTTTAGCAGACTATTCAGTGA GGTCTCGGCTGAAAATC

CTCTGGCAGGGAGCAAAGTTTATACCCAGGATTCTATTTACGCGGTTATCCTTAAAG GCAGAAGAGCTAGAAA

ACCTGATCAAAGAGATTATTCAGAGTGTCAATGGCGATTTTCTACGAGATTTTGCCG CCAATGTGCAACAGAA

GTTAAAACTCGATGCGCCTGTAGGGCGCGGCCAGGACATTAAAGATTTTCAGGCTCT GTTTCAAACGATTGAC

TTACCAGACATCGCCTACACCTACGAAACCGATGAGGTGTTTGCATCAATGCAGGTA GCCGGGCCAAATCCAG

TCATGATCAAGCGGCTGTCAACACCGGATGCTCGTCTGCCCATCACAGAGACTCTGT ACAAAGGGGGCATGG

GAGAAACGGATTCCCTGGCCGATGCCTATGCTGAAGGACGTTTATACCTAGCTGATT ATGGCATTCTGGATGG

AGCCATCAACGGTTCATTTCCTGAGGCGCAGAAATATCTCTACGCGCCACTTGCGTT ATTTGCTGTAGCAAAAA

CGGGCGATCGCCGTTTGCGGCCAGTAGCAATTCAATGTGGGCAAAATCCCGAGGAGT TTCCTCTTTATACCCC

GCAATCAAATCCCTATGCCTGGCTCTGTGCAAAGACCATGGTGCAGATTGCTGATGC TAATTTCCATGAGGCA

GTCACCCAT CTGGCACGT ACT CATTT GTT GATT GG ACCATTTGCG ATCGCAACCCACCGCCAACTATCCG ACG A

CCATCCCCTCAGCCTCCTGCTCCGCCCCCACTTCCAGGGCATGCTAGCCATCAATAA CGAAGCCCAAGCCAAGC

TGATCGCCCCTGGCGGTGGCGTCAACAAGATTCTCTCAGCCACCATCGATACCTCGC GAGTATTTGCTGTCATC

GGCGTCCAGACCTACGGCTTTAACTCCGCCATGTTACCCAAACAACTTCAGCAGCGC GGAGTAGACGATACAG

ATAGCCTCCCCATTTACCCCTACCGTGACGACAGCATCTTAATTTGGGACGCCATTC ATGACTGGGCCGAAAAC

TATCTCAGCCTCTACTATGCCAATGATGCGGCCGTTCAGCAGGATAACGCTCTACAG GCATGGGCACAGGAAC

TAAGCGCCCACAATGGCGGTCGCGTCCAAGAATTCGGCGAAGCCGAAGGGCAGCTCC AAACCCTTGCATATC

TGATTGACGCCATCACGCTGATTATATTCACCGCTAGCGCCCAACATGCAGCAGTCA ATTTCCCCCAAAAGGAA

ATCATGAGCTACGCCCCAGCCATGCCAACCGCTGGCTATGCCGCATTAGAAAATCTC GGAGAGCACACCACTC

AAGCAAACTACCTGAGCTTATTACCCCCCATCGACCAAGCGCAGGAGCAACTTAAGT TATTGCATCTGCTAGG

CT CT GT CC ACTT CAC AC AGTT AGG ACAGT ACG AG AAAAAT C ATTTCCAGG AT GCCAAT AT C AAAAT CCCGCT AG

AACAGTTTCAAAACCGTCTCGAAGAGATTACAGATATTATCCATGAGCGTAATCGCG ATCGGTCTCCCTACGA

GT ATTT ACT ACCCAAAAAT ATTCCCCAAAGCAT CAAT AT CT AG

Amino acid Sequence for WP_023071825.1 - SEQ ID NO: 214

MTASYSNPDQHKKRLEYQYNYTH IPPIAMVDKLSEEEQFSSRWRLMVAKVGFEILVNTI IVNRGDQGKSGAADDV

KAFLIETFQETLADYSVRSRLKILWQGAKFIPRILFTRLSLKAEELENLIKEI IQSVNGDFLRDFAANVQQKLKLDAPVG

RGQDIKDFQALFQTIDLPDIAYTYETDEVFASMQVAGPNPVMI KRLSTPDARLPITETLYKGGMGETDSLADAYAE

GRLYLADYGILDGAI NGSFPEAQKYLYAPLALFAVAKTGDRRLRPVAIQCGQN PEEFPLYTPQSNPYAWLCAKTMV

QIADAN FHEAVTH LARTHLLIGPFAIATH RQLSDDH PLSLLLRPH FQGM LAIN N EAQAKLIAPGGGVNKILSATIDTS

RVFAVIGVQTYGFNSAMLPKQLQQRGVDDTDSLPIYPYRDDSI LIWDAI HDWAENYLSLYYANDAAVQQDNALQ

AWAQELSAHNGGRVQEFGEAEGQLQTLAYLI DAITLII FTASAQHAAVN FPQKEIMSYAPAMPTAGYAALENLGE

HTTQANYLSLLPPIDQAQEQLKLLHLLGSVHFTQLGQYEKNH FQDANI KIPLEQFQNRLEEITDI IH ERN RDRSPYEYL

LPKNIPQSI NI

Coding sequence for WP_096618242.1 - SEQ ID NO: 215

ATGCGATCGCCAACTCCAAAGCAACGACGACAAGAGTTAATAGATACATATATTTTA TCACGTCGTAGCATGA TGATGCTAATGGCTGTAGCTGCTACTCCGGGTATAGAAATGTTACTGTTCGGTGGGAATA AATCCTCACAAGC TAGTGCAACAGGTAATTTTGAAAATTGCAATCCGGGTTTGGAAACTTTACTATCCAATGA AAATCAACCCTCAA AACCCAAACCACC AAAT AAT CCCAAC AT CCCT ACCTT ACCT CACAAGG AT ACAAAAGCAACT CAAC AAG AACGC CTGCTTCAGTTGGGCAAGGCTCGCGAAGAATATCAGACAGGGTTACGGCTGCCTAATTCT GCGAAAGTGAAG ACTTTACCCGCTCAAGAAGCATTTTCGGAAAGATATAACAATAATCGAGTCATCTTATCG GAGAAAATAGCAG

CTAATCAACAAGCATTTCTCAGCAATCCTCAACCTTTTCAAAGCTTCGATGACTACG CGGCGTTGTTTCCCGTTT

TGCCGTTACCAGGTATTGCTAAAACCTTCCGCAACGATGATGTATTTGCACGGCAGC GTCTTTCTGGCTGCAAT

CCCATGGAACTGAAGAACGTTCTCAAACTGGGTTACAGTCTTCGCGACAAAATGGGG ATAACGGATGAGATT

TTT C AAG CTGT ACT G G G CG CG AC A AG AG G C AG A AAG CCG ATT CAT A AT AAT C AG ACT CTC AAC AG CG CT ATT C

GAGAAGGGAGTTTATTTGTCACAGACTATGCGGTACTTGATAGCGTTACACCGAAGG AAACGCAATATTTGTG

CGCCCCCATTGCCCTCTATTATGCCGCAAGGATTCGCGGCGATTTTCATTTAATTCC CATTGCTATCCAGTTGGG

ACAGGTACCAGGAGAAAGTTTACTTTGTACACCTTTAGATGGCGTAGATTGGACTTT AGCCAAATTAATTACCC

AGATGGCTGATTTCTCCATCAATCAACTGTACCGTCACTTGGGACAAACTCATCTAG TAATGGAACCAATCGCC

TTAGCAACAGTACGCGAACTAGCTGCTCGCCATCCCGTCAACGTCCTCTTAAAGCCT CATGTTGAATTTACAAT

GGCAATTAATAGCCTTGGTGATCAGGTGTTGATTAATCCGGGGGGAGCAGTAGATGT TATCTTACCAGGCACT

TTGGAAAGCTCACTCAAACTCACCGAAAGAGGGGTATCCGACTTTTGCAACAACTTC AGCAACTTTGCACTCCC

GACTAATTTACGTCAGCGCGGTGTTGATAATTCTTCGATTCTGCAAGATTTTCCCTA TCGAGACGACGGCTTGC

TCATCTGGAATGCCTTAGAAGAATATGTGAGTCAATATATCGGAATTTACTACAAAT CCAACCGAGATATCCGC

GAGGATTTCGAGCTACAAAAATGGTTCCAAGCTTTACGGAAACCCGTTAGTGAAGGT GGTTTTGGTATAGTTT

CATTACCAGCAAGCTTGACGAACCGCAACCAATTGATAGATATTTTGACAATCATTA TTTTCACCGCAGGTCCG

CAACACTCAGCGATCGCTTGGACTCAATATCAATACATGGCTTTTATTCCGAATATG CCCGGAGCGCTTTATCA

GCCTATTCCCACAACCAAAGGAAAATTTGCAAATGAAAATAGCCTCACGAGTTTCCT ACCGGGAGTCAAACCA

AGCCTTACTCAAGTCCAGTTTATGTCGTTAGTCGGTACCAAGCGCGACCCCAAGGCG TTTACAGACTTCGGTAC

AAAT AGTTTT CAAG ACCCT CG AGCC ATT AGGGTT CTT AG AG ATTT GCAG AAT CGCTT AG AGT C AGT AG AAAAA

CGGATTAAAATACTTAATAAACGTCGCCAAGAATGCTACCCTGCTTTTCTACCCTCT CGAATGTCGAATAGTGT

CAGTGGATAG

Amino acid Sequence for WP_096618242.1 - SEQ ID NO: 216

MRSPTPKQRRQELIDTYI LSRRSMMM LMAVAATPGIEM LLFGGN KSSQASATGN FENCNPGLETLLSNENQPSKP

KPPNN PNI PTLPHKDTKATQQERLLQLGKAREEYQTGLRLPNSAKVKTLPAQEAFSERYNN NRVI LSEKIAANQQAF

LSN PQPFQSFDDYAALFPVLPLPGIAKTFRN DDVFARQRLSGCNPM ELKNVLKLGYSLRDKMGITDEI FQAVLGATR

GRKPI HN NQTLNSAIREGSLFVTDYAVLDSVTPKETQYLCAPIALYYAARI RGDFHLIPIAIQLGQVPGESLLCTPLDGV

DWTLAKLITQMADFSI NQLYRH LGQTHLVMEPIALATVRELAARH PVNVLLKPHVEFTMAINSLGDQVLIN PGGA

VDVI LPGTLESSLKLTERGVSDFCN NFSN FALPTN LRQRGVDNSSI LQDFPYRDDGLLIWNALEEYVSQYIGIYYKSN R

DI REDFELQKWFQALRKPVSEGGFGIVSLPASLTNRNQLIDILTII IFTAGPQHSAIAWTQYQYMAFI PNM PGALYQP

IPTTKGKFANENSLTSFLPGVKPSLTQVQFMSLVGTKRDPKAFTDFGTNSFQDPRAI RVLRDLQN RLESVEKRIKILN

KRRQECYPAFLPSRMSNSVSG

Coding sequence for WP_107806740.1 - SEQ ID NO: 217

AT G ACT ACTT CAT C ACCAG AT AATTCCCGC AGT CTCCCCAT CACCCAG AACTT GG AGTT AGT G AGGCAGG AAT AT CAAT AT AACT AT ACCCAT ATTCCACCT ATTCCT ATGGT G AAT CAGCTT CCT AAT CAGG AAAACTT CACT ACT A GATGGACTTTTTTATTAGCCCAGCAGTTACGGGAGATTTTCATTAATACTCTGATCACTAACCGAGGCGATCG C AGTTCCAAATCGGTTCGTGATCAAGTCAAAAGGTTTATTTTAGAAGCCTTGTTCAAGGGG GCTATACCAGCCA AAGTAAGTGTGATTGCGAGACTTTTCCAAATTATTCCCCAGTTTCTCATTCAAGGAATAT CTAAAGATTTTCACG AACT AG AT G ATCTGTTTTTTTCCCTTTT CAAAACCAACGG ACT GTT AAT ATT CAG AG ATT CT CT G AAT CG AATT A CAGCCCTTTT AG AT AAAGGCC AT CCCACAGGT CAT GT G AAT AGTTT AAAGG ACT ACC AAAAGTT ATTT ACCACA ATTGAATTACCAGCGATCGCCAAAACTTTCGATCAAGATCAAGTCTTTGCCTATATGCAA GTCGCCGGCTACAA TCCCCTAGTAATCAAGCGGGTAAAAAGTCCAGGCGCTAACTTCCCAGTTGAAGATACACA TTACCAAGCAGTA ATGGGGAGTGATGATTCATTAGCAGCCGCAGGACAAGAAGGACGGCTATACCTAGCAGAC TATCAAATTTTA

GACGGTGCTATCAACGGTATATATCTAAATTACCAAAAGTATGCCTATGCTCCCCTA GCGCTGTTTGCCATCCC

CAAAAACTCAGACCCAAATCGTCTACTGCGCCCCATAGCTATTCAATGTGGTCAAAC TCCTGGAGCCGATTATC

CCATCATTACCCCCAATTCCGGCAAATACGCCTGGCTATTTGCCAAAACCATTGTCC ACATAGCAGATGGCAAC

TTTCATGAAGCTGTCAGTCACCTAGCCCGAACGCACCTATTCGTTGGTGTCTTTGTC ATCGCCACCCATCGGCA

ATTGTCCCCCAGCCATCCCCTCAGCCTCCTACTGCGTCCCCATTTTGAAGGCACTTT AGCGATTAACAATGCCGC

CCAAGAAGTTTTGATTGCTCCTGGCGGCGGAGTTGATATATTGCTTTCATCGACAAT TGATAACTCTCGGATTT

TAGCAGTGCGCGGTTTGCAAAGCTATAGTTTCAATGAAGCTATGTTGCCAAACCAAC TCAAACAACGAGGTGT

TGATGATCCTGAACTACTGCCTGTTTATCCTTACCGGGATGATGCATTACTAATTTG GAACGCCATTCATCAAT

GGGTTTCCGACTACCTGAGCCTTTACTACCCTACAGATAAAGATATTCAAAATGATA CTGCTTTGCAAGCATGG

GCAGCCGAAGCCAAAGCTGACAATGGTGGACGTGTACCTGATTTTGGTGAAAATGGA GGTATTCAGACACTA

GACTACCTAGTTGATGCTGCTACCCTGATTATTTTTACAGCCAGCGCCCAACACGCT GCGGTTAACTTCCCCCA

AAAAGATTTGATGAGTTATGCCCCTGCTTTTCCCTTAGCAGGATATGTATCCGCCTC CATCAACGGAGAAGTTA

GTGAGCAAGACTACCTGAATTTACTCCCACCTTTGGAGCAAGCGCAACAGCAATTTA ACTTGCTCACTTTACTA

G G GTCTAT AT ATT AC AACC AG CTT G GT G A AT AT CC A AA AT C AC ACTTT G CT A ACCCC AAG GT AC A AAT CTT GTT

ACAGAAGTTCCAAAGCCGTCTTCAGCAAATTGAAATTACGATCAATCAGCGCAATTT GCACCGCCCAACTTACG

AAT AT CT ACTT CCTT CT AAAATCCCT C AG AGCATT AAT ATTT G A

Amino acid Sequence for WP_107806740.1 - SEQ ID NO: 218

MTTSSPDNSRSLPITQN LELVRQEYQYNYTHIPPIPMVNQLPNQENFTTRWTFLLAQQLREIFI NTLITN RGDRSSKS

VRDQVKRFILEALFKGAIPAKVSVIARLFQII PQFLIQGISKDFH ELDDLFFSLFKTNGLLI FRDSLN RITALLDKGHPTGH

VNSLKDYQKLFTTI ELPAIAKTFDQDQVFAYMQVAGYNPLVIKRVKSPGAN FPVEDTHYQAVMGSDDSLAAAGQE

GRLYLADYQILDGAI NGIYLNYQKYAYAPLALFAI PKNSDPNRLLRPIAIQCGQTPGADYPIITPNSGKYAWLFAKTIV

HIADGN FH EAVSHLARTHLFVGVFVIATHRQLSPSH PLSLLLRPH FEGTLAI N NAAQEVLIAPGGGVDILLSSTIDNSR

ILAVRGLQSYSFNEAMLPNQLKQRGVDDPELLPVYPYRDDALLIWNAIHQWVSDYLS LYYPTDKDIQNDTALQAW

AAEAKADNGGRVPDFGENGGIQTLDYLVDAATLI IFTASAQHAAVNFPQKDLMSYAPAFPLAGYVSASI NGEVSEQ

DYLNLLPPLEQAQQQFN LLTLLGSIYYNQLGEYPKSH FAN PKVQILLQKFQSRLQQIEITI NQRNLHRPTYEYLLPSKIP

QSIN I

Coding sequence for WP_017804222.1 - SEQ ID NO: 219

AT G ACT ACTT CAT C ACCAG AT AATTCCCGC AGT CTCCCCAT CACCCAG AACTT GG AGTT AGT G AGGCAGG AAT

AT CAAT AT AACT AT ACCCAT ATTCCACCT ATTCCT ATGGT G AAT CAGCTT CCT AAT CAGG AAAACTT CACT ACT A

GATGGACTTTTTTATTAGCCCAGCAGTTACGGGAGATTTTCATTAATACTCTGATCACTAACCGAGGCGATCG C

AGTTCCAAATCGGTTCGTGATCAAGTCAAAAGGTTTATTTTAGAAGCCTTGTTCAAG GGGGCTATACCAGCCA

AAGTAAGTGTGATTGCGAGACTTTTCCAAATTATTCCCCAGTTTCTCATTCAAGGAA TATCTAAAGATTTTCACG

AACT AG AT G ATCTGTTTTTTTCCCTTTT CAAAACCAACGG ACT GTT AAT ATT CAG AG ATT CT CT G AAT CG AATT A

CAGCCCTTTT AG AT AAAGGCC AT CCCACAGGT CAT GT G AAT AGTTT AAAGG ACT ACC AAAAGTT ATTT ACCACA

ATTGAATTACCAGCGATCGCCAAAACTTTCGATCAAGATCAAGTCTTTGCCTATATG CAAGTCGCCGGCTACAA

TCCCCTAGTAATCAAGCGGGTAAAAAGTCCAGGCGCTAACTTCCCAGTTGAAGATAC ACATTACCAAGCAGTA

ATGGGGAGTGATGATTCATTAGCAGCCGCAGGACAAGAAGGACGGCTATACCTAGCA GACTATCAAATTTTA

GACGGTGCTATCAACGGTATATATCTAAATTACCAAAAGTATGCCTATGCTCCCCTA GCGCTGTTTGCCATCCC

CAAAAACTCAGACCCAAATCGTCTACTGCGCCCCATAGCTATTCAATGTGGTCAAAC TCCTGGAGCCGATTATC

CCATCATTACCCCCAATTCCGGCAAATACGCCTGGCTATTTGCCAAAACCATTGTCC ACATAGCAGATGGCAAC

TTTCATGAAGCTGTCAGTCACCTAGCCCGAACGCACCTATTCGTTGGTGTCTTTGTC ATCGCCACCCATCGGCA ATTGTCCCCCAGCCATCCCCTCAGCCTCCTACTGCGTCCCCATTTTGAAGGCACTTTAGC GATTAACAATGCCGC

CCAAGAAGTTTTGATTGCTCCTGGCGGCGGAGTTGATATATTGCTTTCATCGACAAT TGATAACTCTCGGATTT

TAGCAGTGCGCGGTTTGCAAAGCTATAGTTTCAATGAAGCTATGTTGCCAAACCAAC TCAAACAACGAGGTGT

TGATGATCCTGAACTACTGCCTGTTTATCCTTACCGGGATGATGCATTACTAATTTG GAACGCCATTCATCAAT

GGGTTTCCGACTACCTGAGCCTTTACTACCCTACAGATAAAGATATTCAAAATGATA CTGCTTTGCAAGCATGG

GCAGCCGAAGCCAAAGCTGACAATGGTGGACGTGTACCTGATTTTGGTGAAAATGGA GGTATTCAGACACTA

GACTACCTAGTTGATGCTGCTACCCTGATTATTTTTACAGCCAGCGCCCAACACGCT GCGGTTAACTTCCCCCA

AAAAGATTTGATGAGTTATGCCCCTGCTTTTCCCTTAGCAGGATATGTATCCGCCTC CATCAACGGAGAAGTTA

GTGAGCAAGACTACCTGAATTTACTCCCACCTTTGGAGCAAGCGCAACAGCAATTTA ACTTGCTCAGTTTACTA

G G GTCTAT AT ATT AC AACC AG CTT G GT G A AT AT CC A AA AT C AC ACTTT G CT A ACCCC AAG GT AC A AAT CTT GTT

ACAGAAGTTCCAAAGCCGTCTTCAGCAAATTGAAATTACGATCAATCAGCGCAATTT GCACCGCCCAACTTACG

AAT AT CT ACTT CCTT CT AAAATCCCT C AG AGCATT AAT ATTT G A

Amino acid Sequence for WP_017804222.1 - SEQ ID NO: 220

MTTSSPDNSRSLPITQN LELVRQEYQYNYTHIPPIPMVNQLPNQENFTTRWTFLLAQQLREIFI NTLITN RGDRSSKS

VRDQVKRFILEALFKGAIPAKVSVIARLFQII PQFLIQGISKDFH ELDDLFFSLFKTNGLLI FRDSLN RITALLDKGHPTGH

VNSLKDYQKLFTTI ELPAIAKTFDQDQVFAYMQVAGYNPLVIKRVKSPGAN FPVEDTHYQAVMGSDDSLAAAGQE

GRLYLADYQILDGAI NGIYLNYQKYAYAPLALFAI PKNSDPNRLLRPIAIQCGQTPGADYPIITPNSGKYAWLFAKTIV

HIADGN FH EAVSHLARTHLFVGVFVIATHRQLSPSH PLSLLLRPH FEGTLAI N NAAQEVLIAPGGGVDILLSSTIDNSR

ILAVRGLQSYSFNEAMLPNQLKQRGVDDPELLPVYPYRDDALLIWNAIHQWVSDYLS LYYPTDKDIQNDTALQAW

AAEAKADNGGRVPDFGENGGIQTLDYLVDAATLI IFTASAQHAAVNFPQKDLMSYAPAFPLAGYVSASI NGEVSEQ

DYLNLLPPLEQAQQQFN LLSLLGSIYYNQLGEYPKSH FAN PKVQILLQKFQSRLQQIEITI NQRNLHRPTYEYLLPSKIP

QSIN I

Coding sequence for WP_010472182.1 - SEQ ID NO: 221

AT G ACGCC ACAAT AT G AAT ATCG AT ACG ATGCCCT G AAAG ACGTTT CCCCT G AATT G AAAT AT CC AATGGCC A

AGGAGGTGTTTCCAGCAGACCAATCTTTGACAAAATGGCCCTGGACTCGAGATCTCG TTTCCGTTGTCCTCAG

AATTATTGCCAATCAGGCCATGCAGGATATATCCGTCCGTAGAGGATCAGCCTGTCG TCTGATTACGTTTATTC

GCTTATATCGAATTCTAGAAAATCCCCTCTATCAGTCAGGTCTGGAGAGGCTTTTCA ATGCTGTCAATAATCTT

GTACGGGGTCTCTCCAATATTTTTGGCAACAGAGCCCAGTCTCAAAATATCAAACAT GATGTAAAGGAGGAGC

AACATCCTGACAAAGTCTCCGCCCGCATTTCAGCAATGGTCAAGGATATCCAAGAAA CGGCTGAATCGAGAGA

GGCTAAAGAGCAACCGTCCTTAGCAGACTATCGCGATCTCTTTCAGATCATTTACTT ACCAGACATTAGCAATC

ATTTCCTAGAAGATCGTGCCTTTGCCGCTCAACGGGTTGCCGGAGCTAACCCCCTCG TGATTAACCGAATTTCT

G AACTCCCAG ACCATTT CCAAGT CACT G ACCAACAGTTTAAAT CGGT G ATGGG AG ATAGT G AGT CCCT CCAAG

CAGCCTTGAATGATGGCCGAGTGTATCTGGTAGACTATCAAATTCTTGAAGAAATTG ATGCGGGTACAGTCGA

GGTGAAGGATCGTGAAATTCTGAAGTATCGCTATGCACCGTTGGCCTTATTTGCGAT CGCATCCGGGAATTGT

CCCGGTCGCCTCCTCCAGCCGATTGCCATTCAATGCCATCAAGAAGCAGGCAGCCCG ATATTTACACCACCCA

GTCTAGAAGCCGATAAAGAGGAGCGGCTTGCTTGGAGAATGGCCAAGACCGTCGTTC AAATCGCCGACGGTA

ACTACCATGAATTGATTTCTCATTTAGGGCGGACTCATCTCTGGATTGAGCCCATTG CTTTAGGCACTTACCGA

CGCCTAGGAACAGAGCATCCACTGGGTAAATTGCTCCTACCCCACTTCGAAGGCACC TTATTTATCAACAATGC

AGCAGCCAATAGCTTAATTGCCCCGGGTGGCACCGTAGACAAAATCTTGTTTGGCAC CTTAAAGTCATCCGTTC

AGCTCAGCGTCAAAGGCGCTAAGGGTTACCCCTTTTCTTTCAATGATTCCATGCTCC CCCAAACCTTTGCATCCC

GAGGCGTGGACGACCTACAAAAGCTACCGGACTACCCCTATCGAGATGATGCATTAC TGATTTGGCATGCCAT

TCACGATTGGGTTGAGGCCTATCTTCAGATCTACTACAAAGATGATGATGCAGTTCT CAAGGATGAAACCCTC CAGGATTGGTTAACCGAGCTAAGAGCTGAAGATGGGGGCCAGATGACTGAAATCGGTGAA TCGACTCCAGA AGAACCCGAGCCTAAAATTCGCACCTTGGATTATCTAGTAAACGCGACAACGCTGATTAT TTTCACTTGTAGTG CTCAACATGCATCGGTCAATTTTCCCCAAGCATCGTTGATGACGTTTGTCCCCAATATGC CCCTAGCCGGGTTC AATGAAGGCCCGACAGCAGAGAAAGCCAGTGAAGCAGACTATTTCTCTTTACTACCACCC CTGAGTTTGGCCG AACAACAGTTGGATCTAGGGTATACCTTGGGTTCGGTCTACTATACTCAGCTCGGATATT ACAAAGCCAATGA T GT AG ATTT AGGT GAT ATT AACAACCAT ACCT ACTT CAACG ACCTCCAAGTT AAACAGGCT CT CCT AAGCTT CC AACAAAGATTAGAAGAGATTGAGTTGATCATTCAAGACCGGAACGAAACCCGACCCACAT ATTACGACATCTT GCT CCCGTCCAAG ATTCCCCAAAGT ACCAAC ATTT AA

Amino acid Sequence for WP_010472182.1 - SEQ ID NO: 222

MTPQYEYRYDALKDVSPELKYPMAKEVFPADQSLTKWPWTRDLVSVVLRIIANQAMQ DISVRRGSACRLITFI RLY

RILEN PLYQSGLERLFNAVNN LVRGLSNI FGNRAQSQNIKH DVKEEQHPDKVSARISAMVKDIQETAESREAKEQPS

LADYRDLFQIIYLPDISNH FLEDRAFAAQRVAGANPLVIN RISELPDHFQVTDQQFKSVMGDSESLQAALNDGRVYL

VDYQILEEIDAGTVEVKDREI LKYRYAPLALFAIASGNCPGRLLQPIAIQCHQEAGSPIFTPPSLEADKEERLAWRMAK

TVVQIADGNYHELISH LGRTHLWIEPIALGTYRRLGTEHPLGKLLLPHFEGTLFI NNAAANSLIAPGGTVDKILFGTLKS

SVQLSVKGAKGYPFSFN DSMLPQTFASRGVDDLQKLPDYPYRDDALLIWHAIH DWVEAYLQIYYKDDDAVLKDETL

QDWLTELRAEDGGQMTEIGESTPEEPEPKIRTLDYLVNATTLII FTCSAQHASVN FPQASLMTFVPNMPLAGFNEG

PTAEKASEADYFSLLPPLSLAEQQLDLGYTLGSVYYTQLGYYKANDVDLGDI N NHTYFN DLQVKQALLSFQQRLEEIE

LI IQDRNETRPTYYDILLPSKI PQSTNI

Coding sequence for WP_103139451.1 - SEQ ID NO: 223

AT G ACAAAT AGT CT AACT AGTGCCACAACT AATT CCAAT CT AG AAT CAGCT AG AG AGCAAT AT AAGT AT AACT

ACAGCTACATTCCGCCGATCGCAATGGTGGATGAACTACCAGATGGGGAAGATTTCT CCCGTCAATGGTTGCT

GTTGCTGGCTAAAGAGTTAAAAGTAATTTTTGTGAATATTTTGATTACCAATAGAGG TAATCGAGGTTCGCAA

AAG ATT CGT G ATG AT GT CAG AAATTTT ATT CT AG AAGTT ATT CT CAAAGGT GCT AT ACCAGCT AACAT C AGT GT

AATTGCTCGATTTATGCAAATTGTCCCCCAATTGTTAATTCGGGGGTTTTCTACGGA TTTTCACGAACTGGACG

AT CT GTT ATTTTCGCT AATT AAAG AAAGTGGGCTTTT AATT CT G AGT G ATTCCTT CC AACG AATT ACT AAACT CC

TCGACAAAGGAAAACCCACAGGCCATGTGAGTAGTTTGGCGGACTATCAAAAGTTGT TTCCCGTAATTCCCCC

GCCAAAGATTGCTAAAACTTTCCAAAATGATGCTGAATTTGCCTATATGCGGGTTGC TGGCTACAATCCGGTG

ATGATTCAGCGAGTTAGTGAGTTAGATGAACGCTTCCCCGTTACCGATGCACAATAT CAAGCCGTCATGGGTA

GTGATGATTCCCTTGCCCTGGCTGGTCAAGAAGGTAGACTTTATCTAGCTGACTATG GCATTTTCAACGGTGG

ACT C AAT G GTT CAT GTCCC AG CT AT C AAA AGT AT CT CT AT G C ACCTTT AG C ACT GTTT GC AGTTCCTCC AG G CTC

AAACCCCAATCGTCTATTACAGCCAGTGGCGATTCAATGCGGTCAAAACCCCAAGGA AAATCCCATCATCACG

CCAAAATCTAGTGAATATGCTTGGTTAATTGCTAAAGCCATCGTCCAGATTGCTGAT GCTAACTTTCACGAACC

AATTACCCACCTTGCCAGAACACATTTATTAGCGGGGATTTTTGCGATCGCTACCCA TCGTCAACTCCCCAATTC

TCATCCCCTCTACGTGCTTCTCACGCCCCATTTTGAAGGCACTTTAGCCATTAATGA TGCCGCCCAACGCGCCCT

AATTGCACCTTTGGGTGGGGTAGATATTTTGCTTTCATCTACTATTGATAACTCTCG TGTCTTAACTGTGCTAGG

TCTGCAAAGCTATGGCTTTAATCATGCCATGTTGCCGAAACAATTCCAGCAACGGGG TGTAGATGATGCCAAT

CTTTTACCTGTATATCCTTATCGGGATGATGGTTTATTACTGTGGGATGCAATTCAT CAATGGGTTGCCGATTA

CATTCAAATTTACTACCACACAGACCAAGAAATTCAAGCCGACGCATATATTCAAGC TTGGGCAAAAGAGGTA

CAGGCTTATGATGGTGGTCGCCTCACAGAGTTTGGTGAAGATGGCAAAATTCAGACC AGGGAATATTTAATTG

ATGCCGTCACCTTAATTATTTTTACCGCCAGCGTCCAACACGCCGCCGTCAACTTTC CCCAAAAAGATGTCATG

GGTT AT ACT CCAGCCGTACCCTT AGCAGGTT ATTTACCCGCCTCCATTCTT CAAGGGG AAGTT ACAG AAAAAG A

CT AT CT CAACTTTTT ACC ACCATT AG ACCAAGCCC AACAGCAAT AT AAT CT ACTCGCCTT ACT AGGTT CT GTTT A TTACAACAGACTAGGGGAATACCCGCCCCAACATTTTGCTGATCCTAAAGTCGAACCCTT ATTGCGATCGTTCC AAAAG AACTT AC AAG AG AT CG AAACC AT CATCCAAAAGCGT AACAGCG AT CGCCCACCCT ACG AAT AT CTCCT ACCCT C AA AA ATTCCT C AA AG CAT C AAT AT CT AA

Amino acid Sequence for WP_103139451.1 - SEQ ID NO: 224

MTNSLTSATTNSNLESAREQYKYNYSYIPPIAMVDELPDGEDFSRQWLLLLAKELKV I FVNI LITNRGN RGSQKIRDD

VRNFI LEVILKGAIPANISVIARFMQIVPQLLI RGFSTDFH ELDDLLFSLIKESGLLILSDSFQRITKLLDKGKPTGHVSSLA

DYQKLFPVI PPPKIAKTFQN DAEFAYM RVAGYN PVM IQRVSELDERFPVTDAQYQAVMGSDDSLALAGQEGRLYL

ADYGI FNGGLNGSCPSYQKYLYAPLALFAVPPGSN PNRLLQPVAIQCGQNPKENPIITPKSSEYAWLIAKAIVQIADA

N FH EPITH LARTHLLAGIFAIATHRQLPNSHPLYVLLTPH FEGTLAI NDAAQRALIAPLGGVDILLSSTIDNSRVLTVLG

LQSYGFNHAMLPKQFQQRGVDDANLLPVYPYRDDGLLLWDAI HQWVADYIQIYYHTDQEIQADAYIQAWAKEV

QAYDGGRLTEFGEDGKIQTREYLIDAVTLII FTASVQHAAVNFPQKDVMGYTPAVPLAGYLPASI LQGEVTEKDYLN

FLPPLDQAQQQYNLLALLGSVYYN RLGEYPPQH FADPKVEPLLRSFQKNLQEI ETI IQKRNSDRPPYEYLLPSKI PQSI

N I

Coding sequence for WP_075890025.1 - SEQ ID NO: 225

AT G ACCGCAACAT CAGGCTCCCAAAAT CTAGGCTT AATCG AAAAGCAAG AAAAGT AT AAGT AT AACT AT AGTC

ACATTCCTCCAGTGGCAATGGTCGATACCTTGCCGGAAAGCGAAAAATGGTCAATAC CTTGGAAGTTGATGGT

GGCGAAGGTGGGTTATCAG CTTTT G GTT AAT AAA AT AATT GT G ACTT ATG GTG ATC AAG G G AAG G CTG GTG C

AGCGAATGATGTACGGGCTTTTTTGATTGCTAGGTTAAAGGAAACTTTTGGGGAACA GAAAGGGTTGTCCAA

AGTGCGTGTCTTGCTGCAAGGTGCGAGGTTTCTGCCTCGAATTATTTGGGGTGAAAT TACGACGGATGTTGTG

GATGTTGAAGAGGTGATGCGGGATGCTATTAAAACTGTTAGTAGAGATTTTCTAGAG GATTTTGCTGCAAATG

TGATGGAGCAACTTACCGTTGACGGTAAGGATGGTCGTTGTCTATCGAGTACAGATT TTGAGAGGCTTTTTGC

CACG ATT G ATTTACCGG AGATT GCTT AT G AGT ATCAAACGG AT G AAAGTTTTGCTTATAT G AGGGT GGCGGG A

CCTAATGCGGTTATGCTCGAAAAAATCACGGAACCTGATCCTCGTTTTCCTGTGACG GAGGCTCATTATCAAGC

GGTGATGGGAGAGGGGGATTCTCTTGCTGCGGCAAGGGCGGAGGGTCGATTATTTTT GTGTGATTATGAGAT

TTTGGATGGTGCGGTTAATGGTTCTTTTCCGACGGATCAGAAATATCTTTATGCGCC GTTAGCGTTGTTTGCTG

TACCAAAGGCAGATGCTGGGAAACGTGATTTGAGGCCTGTTGCGATTCAGTTGGGTC AAAAACCGAAGGAGT

ATCCGATTCTCACGCCGAAGTCTAATCGGTATGCTTGGCTCTGTGCGAAAACGGCGG TACAGGTTGCGGATGC

GAATTTCCATGAGGCGGTTACTCATTTAGGGCGGACTCATTTGTTTATGGGGCCGTT TGTGATCGCCACCCATA

G ACAATTGCCAG AAAATCAT CCTTT GTTTAAATT ACT AACGCCCCATTTTTTAGGG AT GTT GGCG ATCAAT GAT

TCTGCGCAGGCGAAATTGATTTACAAGGGGGGTGGTGTTGATAAAATTTTGGCGACA ACTATTGATAATGCCC

GTTTGTTTGCGGTGCTGGGTGTGCAAACCTATGGTTTTAATCGTGCTATGTTGCCGG ATCAATTGGCTGCGCG

CGGTGTTGATGATACGGAGGCATTACCGGTTTATCCCTATCGTGATGATGCTTTATT GATTTGGGAGGCGATTT

ATAACTGGGTTAAGGCTTACTTGAAGACTTATTATCCGGGCGATAGTGCTGTGCAGC GTGATCAGGCGCTACA

AGCTTGGGCAAAGGAACTCATTTCCTATAAGGGTGGGCGAGTGGTGGACTTTGGTGA AGATGGTGATATCAA

AACGTTGTCGTACCTGATCGATGCAGTGACGCTCATTATTTTTACGGTGAGTGCCCA ACATGCGGCGGTAAAT

TTTCCGCAGAAGGGTTTGATGAGTTTTGCGCCGGGTATGCCGACTGCGGGCTATGCT CCCCTTGATAATCTGG

GTGATCAGACGGCAGAACAGGATTATCTTGATTTGCTGCCGCCAATTTCTCAGGCTC AGGAGCAATTAAAACT

GTGTCATTTACTTGGGTCTGTTCACTTCACGCAGTTAGGGCAGTATGACAAAAAGCA TCTTGGTGACCCGAAA

ATTCAAAAGCCGCTGCGGCAATTTCAAGGGCGACTCGAGGAAATTGAGATGATTATC CACAAGCGTAATGGC

GATCGCCCAACCTATGAATATTTACTCCCTAGTCTTATTCCCCAGAGTATCAATATC TAA

Amino acid Sequence for WP_075890025.1 - SEQ ID NO: 226

MTATSGSQNLGLIEKQEKYKYNYSH IPPVAMVDTLPESEKWSI PWKLMVAKVGYQLLVNKIIVTYGDQGKAGAAN

DVRAFLIARLKETFGEQKGLSKVRVLLQGARFLPRIIWGEITTDVVDVEEVMRDAIK TVSRDFLEDFAANVMEQLTV

DGKDG RCLSSTDFERLFATIDLPEIAYEYQTDESFAYMRVAGPNAVMLEKITEPDPRFPVTEAHY QAVMGEGDSLA

AARAEGRLFLCDYEILDGAVNGSFPTDQKYLYAPLALFAVPKADAGKRDLRPVAIQL GQKPKEYPILTPKSN RYAWL

CAKTAVQVADAN FH EAVTHLGRTHLFMGPFVIATH RQLPENH PLFKLLTPHFLGMLAI NDSAQAKLIYKGGGVDKI

LATTIDNARLFAVLGVQTYGFNRAMLPDQLAARGVDDTEALPVYPYRDDALLIWEAI YNWVKAYLKTYYPGDSAV

QRDQALQAWAKELISYKGGRVVDFGEDGDIKTLSYLIDAVTLII FTVSAQHAAVNFPQKGLMSFAPGM PTAGYAPL

DN LGDQTAEQDYLDLLPPISQAQEQLKLCHLLGSVHFTQLGQYDKKH LGDPKIQKPLRQFQGRLEEI EM II HKRNG

DRPTYEYLLPSLIPQSI NI

Coding sequence for WP_050046589.1 - SEQ ID NO: 227

ATGCGTTCGCGTAGCGGCTGCTTTGCAGCATCGCCAAACCGACAACAAAGACACCAA CAATTAATCGAGCAGT

ACGTTTT CTCGCGCCGT ACC AT G CTAG CG CT CCTT G GTTT C ATTT GTG CTCC AG G CTT G G AAC ATTTT ATAGTA A

GT G ACACT CAACCAAG AG AACCC ACGCTT CCT GCC AAT CCT CAAATCCC AACTTT ACCT C AAAAAAATT C ATT G

GCATCCCAAAAAGAACGCCAACAGCAGCTTGAGATTGCACGCTCTAAATACCAGCTA ACACCTCGACTGCCAA

ACTCTGTTAGGGTATCAACTTTACCGATCGAAGAGGCTTTTGATGGGGGCTATAGCA GTAATCGGGCAAGCAT

AACCCGGAAAATTACAGAAAATCAACAAGCATTTTTCCAAAATCCCAAACCTTTTCT CGCATTAGAAGACTACA

CAAATGTTTTTCAAGTTTTACCCGTACCGGATATTGCTAAAACCTTTCGCAAGGATG CGATATTTGCAGGGCAA

CGGCTGTGGGGTCCCAATCCCATGGAACTTACCAACGTTCTAGCACTCAATTACGAT CTTCAAGAAAAACTGG

GAATAACAAATGAGATTTTTCAAACCGTTTTGGGTGCTGCTAGAGGAACGGCATATG TTAGCGAAACTCTTGA

AAGTGCTACTAAAAATGGCGGTCTGTTTGTAACGGATTATGCAATCCTTGCGACTGA TGGCATTACCTCAAAA

ACAAAGCGATATCTCATTGCTCCTATCGCTCTTTATTACGCCGATCGCGACCGTGGT AATTGGCGTTTAATTCCC

ATTGCCATTCAACTCGGACAAGTTCCTCAAGAAAGTTTGCTTTGTACTCCCTTGGAT GGAGTGGATTGGACTCT

AGCCAAGCTCATCGCTCAAATGGCTGATTTTTCCGTTCATGAATTGGTCCGTCACTT GGGTCAAACCCATCTTG

CTCTAGAACCCATCGCACTGGCAACTGTACGCGAACTCCCTGCCCTTCATCCCGTAC ACGTCCTATTAA AACCC

CATTTTGAGTTCACAATGGCAATCAATGCTTTTGGCGATCGAGTGTTGATTAATCCA GGGGGATACGTAGATG

TCATTCTAGGAGGTACTTTAGAAAGCTCCCTCAACCTTGTAAATCTTGGTGTCTCGG AAATGTTCGATAACTTC

AGCAACTTTGCTTTGCCGAACAATTTACAAAGGCGCGGTGTTGGCGATCGCTCTTTA TTAAAAGATTTTCCCTA

TCGAGATGACGGAGTGCTGGTTTGGGATGCTCTATCCGAGTATGTCAGTCGGTATGT AGGAATTTACTACAGA

TCTTCTAAAGATATTCGAGAGGATTTCGAGTTACAAAATTGGTTAAAAGCTTTACGG ACACCTGTTAGTGATG

G AGGTTTT GGT GT C ACTT CTTT ACCATCCT ACCT AAAAG ACCGCG ACCAGTT AATT G ACCTGCT AAC AC AAATT

ATTTTT AC AG C AG GT CCG C AAC ACT C AG CC ATT G CCT G G ACT C A AT AT C AGT AT AT GT CTTTT GTCCCT AAT AT G

CCTGGAGCTATTTATCAGCCTGTTCCTATTACCAAGGGAACAATTGAAGATGAGAAG AGTTTAACAAGTTTTCT

TCCTGGTATAGAACCAACTTTTGCACAAGTTAACGTCATATCGGGAATTGGTGTCAA ACTTGATGTCAAAGCAT

TT AC AG ATTTT G GTGT C A AT AGTTTT C A AG ATCCG CG AG CT ATT G CTGTT CTT A AAG G CTT G C AAA AT CGTTT G

GAGGTTGTAGAAAAACAGATCGAACAACGAAATAAACGCCGAGAGGAATGCTACCCT GGCTTTTTACCTTCTC

GTATG G CT AAC AGTACC AGTG GTTG A

Amino acid Sequence for WP_050046589.1 - SEQ ID NO: 228

MRSRSGCFAASPNRQQRHQQLIEQYVFSRRTMLALLGFICAPGLEHFIVSDTQPREP TLPAN PQI PTLPQKNSLASQ

KERQQQLEIARSKYQLTPRLPNSVRVSTLPIEEAFDGGYSSNRASITRKITENQQAF FQNPKPFLALEDYTNVFQVLP

VPDIAKTFRKDAIFAGQRLWGPNPMELTNVLALNYDLQEKLGITNEIFQTVLGAARG TAYVSETLESATKNGGLFVT

DYAI LATDGITSKTKRYLIAPIALYYADRDRGNWRLI PIAIQLGQVPQESLLCTPLDGVDWTLAKLIAQMADFSVHEL

VRHLGQTHLALEPIALATVRELPALHPVHVLLKPHFEFTMAINAFGDRVLIN PGGYVDVILGGTLESSLNLVNLGVSE MFDNFSN FALPNNLQRRGVGDRSLLKDFPYRDDGVLVWDALSEYVSRYVGIYYRSSKDIREDFELQN WLKALRTPV

SDGGFGVTSLPSYLKDRDQLIDLLTQI IFTAGPQHSAIAWTQYQYMSFVPN MPGAIYQPVPITKGTIEDEKSLTSFLP

GI EPTFAQVNVISGIGVKLDVKAFTDFGVNSFQDPRAIAVLKGLQN RLEVVEKQIEQRN KRREECYPG FLPSRMANS

TSG

Coding sequence for WP_012163949.1 - SEQ ID NO: 229

ATGACGCATCAGTACTCCCTCACTGGCCTGCCGACCCAAATCACACCTGTAGAAATT CAACAGGACAAACATC

AACCCACTCTGGCCCCCACTCGTCCTAATCCGACCCAGCCGGAGCCTATCCCCGCAG CGCTAAAAGCAGCTCG

ACG C A A AT AT C A AT AC AACT AT AGT C AC ATT GCCCCTGTGGCCATGGTGGATCG CTT ACCC AA AG AG G AACT C

CCCTCTAGGGCTTGGTGGTCAAAGTTGATCCGTACCATGTTCAAGATTCTCTCGAAT GCCATTGTTGGCGCCCA

TAATCACCACCATGAGCATGAAGCAGAGCAGCATGCTTCTCGCCTCATTCGCAAAAC CTTGGTGGATATCTTG

AGACAACGCCCCGAGGTGCGGTGGCGTCTCATCTGGCATCTGCTGAAAACAGCGCCA ACGACTTTGCTTAACG

GTTTACGGTTGTCGTTTTCTGATGCCGAAAGCTTGCTGCACAGTTTAGCCGCCCATT TAGAGCATGATCTATTA

CGGATTCTGCACTTGAACTTAAAAGAACATCTAGCCCATGAATGTGGACAAGATCGC CCTACCTCAATAGCAG

ACTTTAATCAGCAGTTTGCAACGATTCCGTTACCGGAGTGTGCCGAATACTTTCAAG AAGATGAGTTTTTTGCT

TACTTGCGAGTAGCCGGTCCTAATCCTGTTTTGCTGCAACAAGTCCGCCATTTATCG GGAGACATCCTCTGCTC

TCATTTCCCAGTTACCAATCAGCATTATCAGACCGTAATGGGAGAAGACGATTCTCT GCAAATAGCAATCACCG

AAGGCCGTCTATACATCGCCGATTATGCTATTTTGGCTGGTGCGATCAATGGTAACT ACCCCGATCAGCAAAA

ATATATTTCGGCTCCCATCGCCCTTTTTGCCGTTCCCTCAGCTGATGCCCCCTGCCG AAATCTCCAGCCCATCGC

TATTCAATGCCGCCAATCTCCAGGGCCTGAAACACCGATTCTGACGCCGCCTACGGA TCAGAATCCAGACCAA

AAACAGGCCTGGGACATGGCGAAGACCTGCGTGCAAGTTGCAGACAGCAATTACCAT GAGGCCGTCACCCAT

TTGGGT CG AACCC AT CT GTTT ATT AGCCCGTTT GT AATT GCCACCCATCGCC AACT ACT GCCGT CT CAT CCCGT G

AGTGTCCTG CTTCG G CCT C ACTTT G A AG G C ACCTT AAGT AT C AAC A ACG GT G CT C A AAG CAT GTT AAT G G CG C

CAGAAGGTGGAGTGGATACGGTCTTGGCTGCCACTATCGACTGTGCCAGGGTCTTAG CCGTAAAGGGAGTAC

AAAGCTATTCCTTTAATCAGGCCATGCTGCCCCAACAATTGCGGCAACTGGGTTTGG ATAATGCAGAGGCGCT

TCCCATCCACCCCTATCGAGACGATGCATTGCTGATTTGGCAGGCCATCGAAACTTG GGTCACTGATTATGTGA

GCTTGTACTACCCAACAGATGACTCCGTGCAAACAGATGCGGCCCTTCAGGCTTGGG CGCAGGAGCTACAGG

CTGAAGAGGGTGGCCGAGTCCCAGATTTTGGTGAGGATGGACAATTGCGAACCCAGG CCTACTTGATTCAAG

CCCTCACGCTGATCATCTTTACCGCGAGTGCCCAACATGCCGCTGTGAATTTTCCCC AGGGCGACATCATGGTC

TATACCCCAGGGATGCCATTAGCAGGCTACCAGCCCGCTCCCAACTCGACAGCTATG TCTTCCCAGGATCGGC

TCAACCAACTGCCCTCCTTACACCAGGCCTTAAATCAGCTGGAGTTAACGTATTTGC TCGGGCAGATTTACCAT

ACGCAACTCGGTCAATACGAAAAGTCTTGGTTCTCTGATCAGCGAGTGCAAGCTCCG CTGCATCGGTTTCAAG

CCAATTTACTGGATATCGAAACTGCGATCGCAGAACGAAACCGCCATCGCCCCTACC CTTACCGCTACCTACAG

CCGTCCAAC ATT CCCC AG AGC AT CAAT AT CT AA

Amino acid Sequence for WP_012163949.1 - SEQ ID NO: 230

MTHQYSLTGLPTQITPVEIQQDKHQPTLAPTRPNPTQPEPIPAALKAARRKYQYNYS H IAPVAMVDRLPKEELPSRA

WWSKLI RTM FKILSNAIVGAH NH HH EH EAEQHASRLIRKTLVDI LRQRPEVRWRLIWH LLKTAPTTLLNGLRLSFSD

AESLLHSLAAHLEHDLLRILHLN LKEHLAH ECGQDRPTSIADFNQQFATIPLPECAEYFQEDEFFAYLRVAGPNPVLL

QQVRHLSGDILCSH FPVTNQHYQTVMGEDDSLQIAITEGRLYIADYAI LAGAINGNYPDQQKYISAPIALFAVPSAD

APCRN LQPIAIQCRQSPGPETPILTPPTDQNPDQKQAWDMAKTCVQVADSNYH EAVTH LGRTHLFISPFVIATHR

QLLPSH PVSVLLRPHFEGTLSIN NGAQSMLMAPEGGVDTVLAATI DCARVLAVKGVQSYSFNQAM LPQQLRQLGL

DNAEALPIHPYRDDALLIWQAIETWVTDYVSLYYPTDDSVQTDAALQAWAQELQAEEGGRVPDFGEDGQLRTQA

YLIQALTLIIFTASAQHAAVNFPQGDIMVYTPGMPLAGYQPAPNSTAMSSQDRLNQLPSLHQALNQLELTYLLGQI

YHTQLGQYEKSWFSDQRVQAPLH RFQANLLDI ETAIAERNRHRPYPYRYLQPSN IPQSIN I

Coding sequence for WP_050046033.1 - SEQ ID NO: 231

ATGCGTTCGCGTAGCGGCTGCTTTGCAGCATCGCCAAACCGACAACAAAGACACCAA CAATTAATCGAGCAGT

ACGTTTT CTCGCGCCGT ACC AT G CTAG CG CT CCTT G GTTT CGTTT GTG CTCC AG G CTT G G A AC ATTT CAT AGTG

GGT G ACACT CAACC AAG AG AACCC AAGCTTCCTGCC AAT CCT C AAAT CCCAACTTT ACCT C AAAAAAATT C ATT

GGCATCCCAAAAAGAACGCCAACAGCAGCTTGAGATTGCACGCTCTGAATACCAGCT AACATCTCGATTGCCA

A ACT CTGTTAGGGTGT C AACTTT ACC AAT C A AAG AG G CTTTT G ATG GG G G CTAT AG C AAT AAT CGGGCAAGCA

T AACCC AG AAAATT AC AG AAAAT C AACAAGC ATTTTTCCAAAATCCC AAACCTTTT CT CGC ATT AG AAG ACT AC

ACGAATGTTTTTCAAGTTTTACCCGTACCGGATATTGCCAAAACCTTTCGCAAGGAT GTGATATTTGCAGGGCA

ACGGCTGTGGGGTCCCAATCCCATGGAACTTACCAACGTTTTAGCACTCAATTACGA TCTTCAAGAAAAACTG

GGGATAACAAATGAGATTTTTCAAACCGTTCTAGGTGCTGCTAGAGGAACGGCATAC GTTAGCGAAACTCTTG

AAAGTGCTACCAAAAAT GGTGGT CT GTTT GTAACT GATT AT GCAAT CCTTGCG ACT G ATGGCATTACTT CAAAA

ACAAACCGATATCTCATTGCTCCTATCGCTCTTTATTACGCCGATCGCAACCGTGGT AATTGGCGTTTAATTCCC

ATTGCCATTCAACTCGGGCAAGTTCCTCAAGAAAGTTTGCTTTGTACTCCCTTGGAT GGAGTAGATTGGACTCT

AGCCAAGCT CATCGCT CAAAT GGCT G ATTTTT CCGTTCAT G AATTGGTCCGT CAT CT GGGT CAAACCCATCTT G

CTCTAGAACCCATTGCACTGGCGACTGTACGCGAACTCCCTGCCCTTCATCCAGTGA ACGTCCTATTAA AACCC

CATTTTGAGTTCACAATGGCCATCAATGCTTTTGGCGATCGGGTGTTGATTAACCCA GGGGGATACGTAGATG

TCATTCTGGGAGGTACTTTAGAAAGCTCCCTCAAGCTGACTAACCTTGGTGTCTCGG AGATGTTCGATAACTTC

AGCAACTTTGCTCTGCCGAACAATTTACAAAGGCGCGGTGTTGGCGATCGCTCTTTA TTAAAAGATTTTCCCTA

TCGAGATGACGGAGTGTTGGTTTGGGATGCTCTATCCGAGTATGTCAGTCGGTACGT AGGAATTTACTACAAA

TCTTCTAAAGATATTCGAGAGGATTTCGAGTTACAAAATTGGTTAAAAGCTTTACGG ACACCTGTTAGTGATG

G AGGTTTT GGT GT C ACTT CTTT ACCATCCT ACCT ACAAG ACCGCG ACCAGTT AATT G ACCTGCT AACACAAATT

ATTTTTACAGCAGGT CCGCAACACT CAGCCATT GCTT GG ACT CAAT AT CAGT AT AT GT CTTTT GTTCCT AAT AT G

CCTGGAGCTATTTATCAGCCTGTTCCTATTACCAAGGGAACAATTGAAGATGAGAAG AGTTTGACAAGTTTTCT

TCCT GGT AT AG AACCAACTTTT GCACAAGTT AACGT CAT ATCGGG AATTGGT GT CAAACTT GAT ATCAAAGCAT

TTACAGATTTCGGTGTCAATAGTTTTCAAGATCCGCGAGCTATTGCTGTTCTTAAAG GCTTGCAAAATCGTTTG

GATGTTGTAGAAAAACAGATCGAACAACGCAATAAACGCCGAGAGGAATGCTACCCT GGCTTTTTACCTTCTC

GTATG G CT AAC AGTACC AGTG GTTG A

Amino acid Sequence for WP_050046033.1 - SEQ ID NO: 232

MRSRSGCFAASPNRQQRHQQLIEQYVFSRRTMLALLGFVCAPGLEH FIVGDTQPREPKLPAN PQIPTLPQKNSLAS

QKERQQQLEIARSEYQLTSRLPNSVRVSTLPI KEAFDGGYSNN RASITQKITENQQAFFQN PKPFLALEDYTNVFQVL

PVPDIAKTFRKDVI FAGQRLWGPNPMELTNVLALNYDLQEKLGITNEIFQTVLGAARGTAYVSETLESATKNGG LFV

TDYAILATDGITSKTNRYLIAPIALYYADRNRGNWRLIPIAIQLGQVPQESLLCTPL DGVDWTLAKLIAQMADFSVHE

LVRH LGQTH LALEPIALATVRELPALH PVNVLLKPH FEFTMAINAFGDRVLI N PGGYVDVI LGGTLESSLKLTNLGVSE

MFDNFSN FALPNNLQRRGVGDRSLLKDFPYRDDGVLVWDALSEYVSRYVGIYYKSSKDIREDFELQN WLKALRTPV

SDGGFGVTSLPSYLQDRDQLIDLLTQI IFTAGPQHSAIAWTQYQYMSFVPN MPGAIYQPVPITKGTIEDEKSLTSFLP

GI EPTFAQVNVISGIGVKLDI KAFTDFGVNSFQDPRAIAVLKGLQNRLDVVEKQI EQRN KRREECYPGFLPSRMANS

TSG

Coding sequence for WP_096660823.1 - SEQ ID NO: 233

ATGACTGATTTATCGCAAAATAATTCGACATCAGTTGATAAATTAAAACTTGCTAGGCAAGAATACCAGTACA

GCTATATCCATATTCCACCTATTGCTATGGTAGATAAACTTCCTAGTAACGAGAATT TCTCTACTGGTTGGCTGC

GTTTATTAGCTAGAGAATTAAAAGTTGTTTTTATCAATACCCTAATTGCAAATCGAG GAAATCGCGGTTCGGAA

AATGTTCGCGACGATGTGAGATTATTTTTCCTGGAAGTATTAGCGAAAGGAGCATTA CCCTTTAATTTAGGTGT

T ACT GCT AG AGTTTT AC AAATT ATT CCT AAT CT ATT ACTT AAAGG AAC AT CAAAAG ATTTT AGCG AAATCG AT G

ATTT ATT CTTTT CT AT ACTT AAGG AAAGCGG ACT GT C AATTTTT CAAG ATT CTTT G AGT CG AGTT AAAAGT CTTT

TGTAT GAAAAACGTCCGACGGGACAT GTAAGCAGCTTGAAT GATTATCAAAAACTTTTCCCT GAAAT GGAAAT

ACCCAAGATAGCTGATAATTTCTCTACAGACGAACAATTTGCTTATATGCGGGTAGC TGGATATAACCCGGTA

ATGATTGAGCGAGTGAATAAATTGGGCGATCGCTTTCCTGTTACCGAAGCTCAATAT CAGGAAGTCATGGGA

GATGATTCTTTAACAGCAGCGGGTGAGGAAGGAAGACTTTATTTAGCTGATTATGGA ATTTTAGAAGGTGCTG

TT AACG GT ACTTTT CCTT C AC AG C AAA AGT AT AT CTATG CTCCG CT AG C ACT ATTT G C AATTCCT AA AAATT CCG

AG AAT G ACG AAT CG AGTTT AAT GCGT CCGGTT GCG ATT CAGTGCGGT CAAAACCCCC AG AAT AAT CCT ATTT G

T ACG CC A AA AT C AG AC A AAT AT G CTT G G CT GTTT G C A AAA ACT ATT GTT C AA AT CG C AG ATG CTA ACTACC ACG

AAGCTGTAACTCATTTAGGACGTACTCATTTGCTTGTAGGTCCCTTTGTTGTTGCAA CTCATCGTCAGTTACCGG

ATAGTCATCCGCTTAATATATTATTGCGTCCTCATTTTGAAGGGACTTTAGCAATAA ACAATGCAGCCCAAAGT

AGTTTGATTGCTGCTGGTGGGGGTGTGGATAAATTACTTGCATCGACTATTGATAAT TCCCGTGTTTTGGCAGC

AGTTGGTTTACAAAGCTATGGGTTCAATGAAGCAATGTTACCCAAGCAATTAGAAAA ACGCGGGGTTAACGA

TACACAAAAGCTACCTATTTACCCATACCGCGATGATGCTCTATTAATTTGGAATGC TATACATACATGGGTTG

CAGATTATCTAAGCATTTATTATAAGGACGATACCAGCATTCAAAATGATACCTATC TCCAAAATTGGGCTATT

GAAGCAGGGGCTTACGATGGTGGACGCGTTCCTGATTTTGGTCAAGAAAATGGGCTG ATTCAAACCTTGGAC

TATCTAATTGATGCTACTACACTGATTATTTTTACTGCTAGCGCTCAACATGCTGCG GTTAATTTCCCCCAGGGA

GACATGATGATCTACGCGGCCGCAGTACCTTTAGCTGGTTATCAACCTGCTTCAATT CTCGAAGGAAAAGTTAC

T C AGG AAG ACT ACTT AAATTT ACTT CC ACCT CT AG AGCAAGCACAAG AACAATT G AATTT AGT CT ATTT ATT AG

GTTCT ATTT ACT AT A AA ACTTT G G GT GATT ACT C AG AT AATT ACTT C A AAG AT G CTTT AGT C AA ACC AG CTTT AC

AAGAATTCCGAAAT AATTT ACTCGAAGCTGAAGCTACTATCCATCAACGCAATCAAAATCGTCCGACTTACGAA

T ATTT G CTG CCTT CAA AAATT CC AC AG AGTAT C A AT ATTT AG

Amino acid Sequence for WP_096660823.1 - SEQ ID NO: 234

MTDLSQN NSTSVDKLKLARQEYQYSYIH IPPIAMVDKLPSNEN FSTGWLRLLARELKVVFI NTLIANRGN RGSENVR

DDVRLFFLEVLAKGALPFNLGVTARVLQI IPNLLLKGTSKDFSEIDDLFFSI LKESGLSIFQDSLSRVKSLLYEKRPTGHVS

SLN DYQKLFPEM EI PKIADNFSTDEQFAYMRVAGYNPVMIERVNKLGDRFPVTEAQYQEVMGDDSLTAAGEEGR

LYLADYGILEGAVNGTFPSQQKYIYAPLALFAIPKNSENDESSLM RPVAIQCGQN PQN NPICTPKSDKYAWLFAKTIV

QIADANYH EAVTHLGRTHLLVGPFVVATH RQLPDSHPLN ILLRPHFEGTLAIN NAAQSSLIAAGGGVDKLLASTIDN

SRVLAAVGLQSYGFNEAMLPKQLEKRGVNDTQKLPIYPYRDDALLIWNAIHTWVADY LSIYYKDDTSIQNDTYLQN

WAIEAGAYDGGRVPDFGQENGLIQTLDYLI DATTLII FTASAQHAAVN FPQGDMMIYAAAVPLAGYQPASILEGKV

TQEDYLNLLPPLEQAQEQLN LVYLLGSIYYKTLGDYSDNYFKDALVKPALQEFRN NLLEAEATIHQRNQN RPTYEYLL

PSKI PQSI NI

Coding sequence for WP_110989156.1 - SEQ ID NO: 235

ATGACAGACTCTAATACTGCTCAAGAAGCTCAGTCTCAGCAATACGAGTATCGGTAC GACGCCTTTAAAAATA

TTTCACCTAAGTTGATATATCCAATGGCAGTGAAAGTCTTACCTGCTGATCAGTCGT TTACGAAATGGAAGTGG

ACGAAAAATGTAGTTTCCCTTGTACTTAGACTAGTTGCAAATCAGGCCATGCAAAAT GTATCACTCCGAAAGG

GATCGGCCTGCCGCCTGATTACATTTATCCGCTTATACAGAATTTTAGAAGATCCAA AGAACAGTTCCTATATT

GAAAGACTCTTTGATTTCATCATTAGCATTGCCCGAGCGTTGACAAATCGGTTCAAGCGCAGACCTAAATCTCA

AGATATTGAACAAGATGTTAAGCAAAACCAGAAGCCCGATCAGGTGCAAGCCAGGGTTGAGGCAATGGTTGA

TGATATTCAACAGCAATCTAAAACGAAGGACCCGGTAAAGCATCTTTCATTTGAGGA CTATCGCAATCTATTTC

AGATCATCTATTTACCGGATATTAGCAATCATTTTCTTGAGGATCGCTCCTTTGCAG CTCAACGGGTGGCGGGG

GCT AACCCACT GGT CATT ATGCAAGT CT CT G AACTCCCT G AGT ATTT CAAGGT AACT G AGG AACACT AT AC AAA

GGTGATGGGTAAAGATGACTCCCTTCAGGCTGCACTAGACGAGGGGCGGATCTACCT GGCTGACTACAAGAT

TCTGGACGAAATCGATCCAGGGACTGTTGAGGTAGGGGTAAACGGTAGCATCAAAGA AACGATTGAGAAATT

CGGTTATGCACCTCTAGCTTTGTTTGCGATCGCCTCGGGTGATTGTCCGGGCCGTCT ACTGACACCGGTTGCGA

TTCAATGCAGTCAAGACGCTGGCAGTCTCATTTTTACTCCACCCAGTATAGCGGCTG TTGATGAGGAGCGATG

GGCTTGGAGAATGGCAAAGACGGTCGTTCAGGTCGCTGATGGCAATTACCATGAACT AATCTCACACCTAGG

ACGCACTCATCTGTGGATTGAGCCAATAGCGCTCGGTACCTACCGTCGTTTAGCAAA ACACAAGTTAGGTAAG

CT CCTT CTG CCT C ATTTT G AG G GT ACTTT CTT CAT C A AT AAT G CTG CTG C AG GTAG CCT GATT G CT A AG G GTG G

TGTTGTGGAAAGTATTTTATCGGGTACGTTGCTATCGTCTGTAACGCTCAGTGTTAA GGCTGCGAAGGGATAC

CCGTTTGCATTT AAT G ATTCAAT GCTTCCCAAAACCTTTGCT GCT CGTGGT GT AG AT G ATCCACAAAAATTACC

GGACTACCCCTATCGTGATGATGCGTTGCTCATTTGGGATGCCATTCATAAGTGGGT TAAGTCATACCTTGAG

GTCTACTACAGCAGTGATGATGAGGTGCTAAGTGATGCCGTTTTACAGGCGTGGCTA GCAGAACTTGTCGCTG

AGGATGGGGGCCAGATGACAGAGATAGGAGAAGTCATACCAGAGGACAGAAGACCAA AAATCCGAACGTTG

GATTATTTGATCGATGCGACAACGCTGATTATCTTCACTTGTAGCGTTCAACATGCA GCAGTCAATTTCACCCA

AGCATCGTTAATGTCGTTTGCACCCAATATGCCACTGGCAGGATTTAATGCGGCTCC AACGACTCTTAAAGTCA

GTGAAGCAGACTACTTTTCGATGCTGCCATCACTTAGCCTAGCTGAGCAACAAATGA ATTTTGGATATACATTA

GGATCCGTGTACTACACTCAAATCGGACAATACAAGGCTAATGAGGTAGAGCTAGAG GAGATGAATCAGCAT

GATT ACTTTGGT GATT CACG AAT CT CT CAT CACCT AG AG ATTTTT C AG AAC AAGTT G AAAG AG ATT G AGTT G AC

CATT CAAC AACG G AACG AAACTCGTCCT ACTTTTT ACG AT ATTTTGCT GCCGT CAAAAATT CCGC AAT CT AC AA

ATATCTAG

Amino acid Sequence for WP_110989156.1 - SEQ ID NO: 236

MTDSNTAQEAQSQQYEYRYDAFKNISPKLIYPMAVKVLPADQSFTKWKWTKNVVSLV LRLVANQAMQNVSLRK

GSACRLITFI RLYRI LEDPKNSSYIERLFDFI ISIARALTNRFKRRPKSQDI EQDVKQNQKPDQVQARVEAMVDDIQQQ

SKTKDPVKHLSFEDYRNLFQI IYLPDISNH FLEDRSFAAQRVAGAN PLVIMQVSELPEYFKVTEEHYTKVMGKDDSL

QAALDEGRIYLADYKILDEIDPGTVEVGVNGSIKETIEKFGYAPLALFAIASGDCPG RLLTPVAIQCSQDAGSLIFTPPSI

AAVDEERWAWRMAKTVVQVADGNYH ELISHLGRTHLWIEPIALGTYRRLAKHKLGKLLLPHFEGTFFIN NAAAGSL

IAKGGVVESILSGTLLSSVTLSVKAAKGYPFAFNDSMLPKTFAARGVDDPQKLPDYPYRDDALLIWDAIHKWVKSYL

EVYYSSDDEVLSDAVLQAWLAELVAEDGGQMTEIGEVI PEDRRPKIRTLDYLIDATTLII FTCSVQHAAVNFTQASLM

SFAPN MPLAGFNAAPTTLKVSEADYFSMLPSLSLAEQQM NFGYTLGSVYYTQIGQYKANEVELEEM NQH DYFGDS

RISH HLEI FQNKLKEIELTIQQRN ETRPTFYDI LLPSKIPQSTNI

Coding sequence for WP_010473598.1 - SEQ ID NO: 237

ATGACGCATCAGTACTCCCTCACTGGCCTGCCGACCCAAATCACGCCTGTTGAAATT CAACAGGACAAACATCA

ACCCACTCTGACCTCCACTCGTCCTAATCCGACCCAGCCGGAGCCGATTCCCGCAGC GCTAAAAGCAGCTCGA

CGCAAATATCAATACAACTACAGTCACATTGCCCCTGTAGCCATGGTGGATCGCTTA CCCCAAGAGGAACTCC

CCTCTCGGACTTGGTGGTCAAAGTTGTTCCGTACCATGTTCAAGATTCTCTCGAATG CCATTGTTGGCGCCCAC

AATCACCACCATGAGCATGAAGCAGAGCAACATATTTCCCGTCTCATTCGCAAAACC TTGGTGAATATCTTGAC

TCAACGCCCCGAGGTGCGGTGGCGTCTCATCTGGCATCTGCTGAAAACAGCACCAAC GACGTTGATTAACGGT

TTACGGTTGTCGTTCGCTGATTCAGAAAGCTTGCTGCACAGTTTAGCCGCCCATTTA GAGCATGATCTATTACG

GATTCTGCACTTGAACTTAAAAGAACATCTAGCCCATGAATGTAGACAAGATCGTCCTACTTCAATAGCAGACT

TTAATCAGCAATTCGCGACAATTCCGTTACCGGAGTGTGCCGAATACTTTCAGGAAGATGAGTTTTTTGCTTAC

TTGCGAGTAGCCGGTCCTAATCCTGTTTTGCTGCAACAAGTCCGTCATTTATCGGGA GACACCCTCTGCTCTCA

TTTCCCGGTTACGAATCAGCATTATCAGGCCGTGATGGGAGCAGACGATTCTCTGCA AACAGCGGTCACCGAG

GGCCGACTATACATCGCCGATTATGCTATTTTGGCCGGTGCGATCAATGGTAACTAC CCCGATCAGCAAAAAT

ATATTTCGGCTCCCATCGCCCTTTTTGCTGTTCCCTCAGCTGATGCCCCCTGCCGAA ATCTCCAGCCCATCGCTA

TTCAATGCCGCCAATCTCCAGGGCCTGAAACACCGATTCTGACGCCGCCTACGGATC AGAATCCAGACCAAAA

ACAGGCCTGGGACATGGCGAAGACCTGCGTGCAAGTTGCCGATAGCAATTACCACGA GGCCGTCACCCATTT

GGGTCGAACCCATCTGTTTATTAGCCCGTTTGTAATTGCCACCCATCGCCAATTACT GCCGTCTCATCCTGTGA

GTGTCCTGCTTCGGCCTCACTTTGAAGGCACCTTAAGTATCAACAACGGCGCTCAAA GCATGTTAATGGCGCC

AGAAGGTGGAGTGGATACGGTCTTGGCTGCCACCATCGACTGTGCCAGGGTCTTAGC CGTAAAGGGATTACA

A AG CT ATTCCTTT A AT C AG G CC ATG CTG CCCC A AC AATT G C AG C A ACT G G GTTT G G AT AAT G C AG CG G C ACT G

CCCATCCATCCCTATCGAGACGATGCCTTGCTGATTTGGCAGGCCATCGAAACTTGG GTCACTGATTATGTGAG

CTTGTACTACCCAACAGATGACTCCGTGCAAAAAGATGCGGCCCTTCAGGCTTGGGC GCAGGAGCTACAGGC

TGAAGAGGGTGGCCGAGTCCCAGATTTTGGTGAGGATGGACAATTGCGAACCCAGGC CTACTTAATTCAAGC

CCTCACGCTGATCATTTTTACCGCGAGTGCCCAACATGCCGCTGTGAATTTTCCCCA GGGCGACATCATGGTCT

ATACCCCAGGGATGCCATTAGCAGGCTACCAGCCCGCTCCCAACACGACAGCGATGT CTTCCCAGGATCGGCT

CAACCAACTGCCCCCCCTACACCAGGCCTTAAATCAGCTGGAGTTAACGTATTTGCT CGGGCAGATTTACCATA

CGCAACTCGGTCAATACGAAAAGTCCTGGTTCTCTGATCAGCGTGTACTCGCGCCTC TGCATCGTTTTCAGGCC

AATTTACTGGATATCGAAACTGCGATCGCAGAACGAAACCGCCATCGCCCCTACCCT TACCGCTACCTACAGCC

GTCCAAC ATT CCCC AG AGCAT CAAT AT CT AG

Amino acid Sequence for WP_010473598.1 - SEQ ID NO: 238

MTHQYSLTGLPTQITPVEIQQDKHQPTLTSTRPN PTQPEPI PAALKAARRKYQYNYSHIAPVAMVDRLPQEELPSRT

WWSKLFRTMFKI LSNAIVGAHN HH HEHEAEQHISRLI RKTLVNI LTQRPEVRWRLIWHLLKTAPTTLI NGLRLSFADS

ESLLHSLAAHLEH DLLRILH LNLKEH LAHECRQDRPTSIADFNQQFATI PLPECAEYFQEDEFFAYLRVAGPNPVLLQ

QVRHLSGDTLCSH FPVTNQHYQAVMGADDSLQTAVTEGRLYIADYAILAGAINGNYPDQQKYISAPIALFAVP SAD

APCRN LQPIAIQCRQSPGPETPILTPPTDQNPDQKQAWDMAKTCVQVADSNYH EAVTH LGRTHLFISPFVIATHR

QLLPSH PVSVLLRPHFEGTLSIN NGAQSMLMAPEGGVDTVLAATI DCARVLAVKGLQSYSFNQAMLPQQLQQLGL

DNAAALPI HPYRDDALLIWQAIETWVTDYVSLYYPTDDSVQKDAALQAWAQELQAEEGGRVPDFGEDG QLRTQA

YLIQALTLII FTASAQHAAVNFPQGDIMVYTPGMPLAGYQPAPNTTAMSSQDRLNQLPPLHQALNQLELT YLLGQI

YHTQLGQYEKSWFSDQRVLAPLHRFQANLLDIETAIAERN RH RPYPYRYLQPSN I PQSI NI

Amino acid Sequence for 5MEE_A - SEQ ID NO: 239

MVQPSLPQDDTPDQQEQRNRAIAQQREAYQYSETAGILLI KTLPQSEM FSLKYLIERDKGLVSLIANTLASNI EN IFD

PFDKLEDFEEM FPLLPKPLVMNTFRN DRVFARQRIAGPNPMVI ERVVDKLPDN FPVTDAM FQKIMFTKKTLAEAIA

QGKLFITNYKGLAELSPGRYEYQKNGTLVQKTKTIAAPLVLYAWKPEGFGDYRGSLA PIAIQINQQPDPITNPIYTPR

DGKHWFIAKIFAQMADGNCH EAISH LARTHLILEPFVLATANELAPNH PLSVLLKPHFQFTLAI NELAREQVISAGGY

ADDLLAGTLEASIAVIKAAI KEYMDN FTEFALPRELARRGVGIGDVDQRGEN FLPDYPYRDDAMLLWNAIEVYVRD

YLSLYYQSPVQI RQDTELQNWVRRLVSPEGGRVTGLVSNGELNTIEALVAIATQVI FVSGPQHAAVNYPQYDYMAF

IPNMPLATYATPPN KESNISEATI LN ILPPQKLAARQLELMRTLCVFYPN RLGYPDTEFVDVRAQQVLHQFQERLQEI

EQRIVLCNEKRLEPYTYLLPSNVPNSTSI

8. Consensus Sequence Motifs

AKxxxxxADxxxxxxxxHxxxxHxxxxPxA (SEQ ID NO:240),

VxGxxxxxxxxxxLxxxxxxxxxxxxxxHxxxNxxQxxYxxxxxN (SEQ ID NO:241),

LxxxxxxlxxxNxxxxxxYxxxxPxxxxxSI (SEQ ID NO:242);

LxxxxxYxxxxxXiXXXXXxX 2 GxxxxxxxKxLPxPxxxFxWxxxX 3 xxxPxxl (SEQ ID NO:243)

WxxAKxCxQxADxxHxExxxHxxxxHxxMxPxA (SEQ ID NO:244);

GxVxGxxxxxxxxxxLxxxxxxxxxxCxPxHxxxNxxQxxYxxxxxN M PxAxY (SEQ ID NO:245),

QXXXXXXLXXXXXDXXGXYXXXX 4 F (SEQ ID NO:246),

QxxLxxxxxxlxxxNxxRxxxYxxxxxxxxxNSI (SEQ ID NO:247),

LxxxxxYxxxxxXiXXXXXxX 2 GGxxxxxxKxLPxPxAxFxWxxxX 3 xxxPxxl (SEQ ID NO:248),

WxxAKxCxQxADxNHxExxxHxxxTHxVMxPxAxxT (SEQ ID NO:249),

GxVxGxxxxxxxxxxLxxxxxxxxxxCxPxHxxxNxxQxxYxxxxxN M PxAxY (SEQ ID NO:250),

Qxxxxxx LxxxxY D x LG xYxxx X 4 F (SEQ ID NO:251),

FQxxLxxxxxxlxxxNxxRxxxYxxxxPxxxxNSI (SEQ ID NO:252)
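
As an illustration only, a motif of this kind can be searched for in a protein sequence by reading each lower-case x as a position that may hold any residue. The Python sketch below rests on that assumption. The helper names motif_to_regex and find_motif are hypothetical and are not part of this disclosure, and upper-case placeholders such as X1 to X4 are not handled. The test fragment is copied from SEQ ID NO: 254, and the spaces it carries are treated as line-wrapping artifacts and removed.

```python
import re


def motif_to_regex(motif):
    """Turn a motif such as 'WxxAK' into a regex; lower-case 'x' means any residue."""
    compact = "".join(motif.split())  # drop stray whitespace
    pattern = "".join("." if ch == "x" else re.escape(ch) for ch in compact)
    return re.compile(pattern)


def find_motif(motif, protein):
    """Return (1-based start, matched text) for each non-overlapping hit."""
    protein = "".join(protein.split()).upper()
    return [(m.start() + 1, m.group())
            for m in motif_to_regex(motif).finditer(protein)]


# Motif text as printed for SEQ ID NO:244.
MOTIF_244 = "WxxAKxCxQxADxxHxExxxHxxxxHxxMxPxA"
# Fragment of SEQ ID NO: 254 as printed, including its stray spaces.
FRAGMENT_254 = "EKNPLAWLFAKICVQVADSNHH EM NSH LCRTH FVM EPIAICTARQ"

hits = find_motif(MOTIF_244, FRAGMENT_254)
print(hits)
```

Because finditer reports non-overlapping matches only, a scan for overlapping occurrences would need a lookahead pattern instead.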

9. LOX mutants

Codon-optimized coding sequence of WP_002738122.1mut - SEQ ID NO: 253

ATGGT G AACACCCCGCCGCCG ACCCCGTGCCTGCCGCAG AACG AGCCGG ATGCG AACCGTCGTGCGG ATAGC

CTGAACCTGCAGCGTCAAGCGTACCGTTATGACTACCAGTATCTGCCGCCGCTGGTG CTGATGGAGAGCGTTC

CGGCGGCGGAAAACTTCAGCTTTCAATATATTACCGAACGTCTGGCGGCGACCGCGG AACTGCCGGCGAACA

TGCTGGCGGTGAAGGTTAAAAGCTTCCTGGACCCGCTGGATGAGCTGCAGGACTACG AAGATTTCTTTGCGAT

CATTCCGCTGCCGAAGATCGCGAAAGTGTATCAGACCAACGATGCGTTTGCGGAACA ACGTCTGAGCGGTGC

GAACCCGCTGGTTCTGCACCTGCTGAAGCCGGGTGATGCGCGTGCGCAGGTTCTGAA CCAAATTCCGAGCAG

CAAAACCGATTTCGAGCCGCTGTTTCAGGTTAACCAAGAACTGGCGGCGGGCAACAT CTACATTTGCGACTAT

ACCGGCACCGATATCAACTACCTGGGTCCGAGCCTGATTCAGGGTGGCACCCACGCG AAGGGTCGTAAATAT

CTGCCGAAGCCGCGTGCGTTCTTTTGGTGGCGTAAGAGCGGCATCCGTGACCGTGGT AAACTGGTGCCGATC

GCGATTCAGTTCGGCGAGAACGCGGAAAAGCTGTACACCCCGTTCGAGAAAAACCCG CTGGCGTGGCTGTTT

GCGAAGATTTGCGTGCAAGTTGCGGATAGCAACCACCACGAAATGAACAGCCACCTG TGCCGTACCCACTTCG

TTATGGAGCCGATCGCGATTTGCACCGCGCGTCAGCTGGCGGAAAACCACCCGCTGA GCCTGCTGCTGAAAC

CGCACCTGCGTTTTATGCTGACCAACAACAGCCTGGGTCAAGAGCGTCTGATCAACCCGGGTGGCCCGGTGG

ATGAGCTGCTGGCGGGCACCCTGGGTGAAAGCATGGCGCTGGTTAAGGACGCGTACGCGAACTGGAACCTG

CGTGATTTCGCGTTTCCGAAAGAGATTAGCAACCGTGGCATGGACGATACCGAACGT CTGCCGCACTACCCGT

ATCGTGACGATGGTATGCTGGTGTGGCAGAGCATCAACCAATTCGTTAGCGACTACC TGCACTACTTTTATCCG

AACCCGCAGGACATTACCAACGATCAGGAGCTGCAAGCGTGGGCGGGTGAATGCAGC AACAGCGCGGCGGA

TCAAGGTGGCAACGTGAAGGGTATGCCGGCGAACTTCACCGACGTTGAGGATCTGAT CGAAGTGGTTACCAC

CATCATTTTTATTTGCGGCCCGCTGCACAGCGCGGTTAACTACGGCCAGTACGACTA TATGACCTTTGCGGCGA

ACATGCCGCTGGCGGCGTATTGCGACCTGCCGGAGGCGATCAAGGATACCACCGGTA GCATCATTGGCGACG

CGCGTGGTAGCATCACCGAAAAAGATATTCTGCAGCTGCTGCCGCCGTACAAGAAAG CGGCGGATCAGCTGC

AAAGCCT GTTCACCCT G AGCG ACT ACCGTT AT GAT CAACTGGGCTACT AT G ACAAGGCGTTT CGT G AGCT GT A

TGGTCGTAAATTCGAGGAAGTGTTTGCGGAAGGCGATCAGGCGACCATCACCGGTTT CCTGCGTCAATTTCAG

CAAAACCT G AAC AT G AACG AGC AGG AAATCG ACGCG AACAACCAAAAGCGT ATT GTT CCGT AC ACCT ATCTG A

AACCG AGCCT GATT CT G AACAGC AT CAGCATTT AA

Amino acid sequence for WP_002738122.1mut - SEQ ID NO: 254

MVNTPPPTPCLPQN EPDANRRADSLNLQRQAYRYDYQYLPPLVLMESVPAAENFSFQYITERLAATAELPANM LA

VKVKSFLDPLDELQDYEDFFAI IPLPKIAKVYQTNDAFAEQRLSGANPLVLHLLKPGDARAQVLNQI PSSKTDFEPLFQ

VNQELAAG NIYICDYTGTDI NYLGPSLIQGGTHAKGRKYLPKPRAFFWWRKSGI RDRGKLVPIAIQFGENAEKLYTPF

EKNPLAWLFAKICVQVADSNHH EM NSH LCRTH FVM EPIAICTARQLAENH PLSLLLKPH LRFMLTN NSLGQERLIN

PGGPVDELLAGTLGESMALVKDAYANWNLRDFAFPKEISN RGMDDTERLPHYPYRDDGMLVWQSINQFVSDYL

HYFYPNPQDITN DQELQAWAGECSNSAADQGGNVKGMPANFTDVEDLI EVVTTI IFICGPLHSAVNYGQYDYMT

FAAN MPLAAYCDLPEAI KDTTGSIIGDARGSITEKDILQLLPPYKKAADQLQSLFTLSDYRYDQLGYYDKAFRELYG RK

FEEVFAEGDQATITGFLRQFQQN LN MN EQEI DANNQKRIVPYTYLKPSLILNSISI
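
As a plausibility check, a coding sequence in this listing can be translated and compared with the amino acid sequence given for it. The Python sketch below assumes the standard genetic code and treats the spaces inside the printed sequences as line-wrapping artifacts. The names CODON_TABLE, clean and translate are hypothetical helpers introduced here for illustration. The example uses only the first printed line of SEQ ID NO: 253.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean(seq):
    """Remove whitespace and upper-case the sequence."""
    return "".join(seq.split()).upper()


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # 'X' marks an unknown codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First printed line of SEQ ID NO: 253, including its stray spaces.
FIRST_LINE_253 = ("ATGGT G AACACCCCGCCGCCG ACCCCGTGCCTGCCGCAG AACG "
                  "AGCCGG ATGCG AACCGTCGTGCGG ATAGC")
# Start of SEQ ID NO: 254 as printed.
START_254 = "MVNTPPPTPCLPQN EPDANRRADS"

print(translate(FIRST_LINE_253) == clean(START_254))
```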

Codon-optimized coding sequence of WP_002738122.1mut2 - SEQ ID NO: 255

ATGGT G AACACCCCGCCGCCG ACCCCGTGCCTGCCGCAG AACG AGCCGG ATGCG AACCGTCGTGCGG ATAGC

CTGAACCTGCAGCGTCAAGCGTACCGTTATGACTACCAGTATCTGCCGCCGCTGGTG CTGATGGAGAGCGTTC

CGGCGGCGGAAAACTTCAGCTTTCAATATATTACCGAACGTCTGGCGGCGACCGCGG AACTGCCGGCGAACA

TGCTGGCGGTGAAGGTTAAAAGCTTCCTGGACCCGCTGGATGAGCTGCAGGACTACG AAGATTTCTTTGCGAT

CATTCCGCTGCCGAAGATCGCGAAAGTGTATCAGACCAACGATGCGTTTGCGGAGCA ACGTCTGAGCGGTGC

GAACCCGCTGGTTCTGCACCTGCTGAAGCCGGGTGATGCGCGTGCGCAGGTTCTGAA CCAAATTCCGAGCAG

CAAAACCGATTTCGAACCGCTGTTTCAGGTTGAGCAAGAACTGGCGGCGGGCAACAT CTACATTTGCGACTAT

ACCGGCACCGATATCAACTACCTGGGTCCGTGCATGATTCAGGGTGGCACCCACGCG AAGGGTCGTAAATAT

CTGCCGAAGCCGCGTGCGTTCTTTTGGTGGCGTAAGAGCGGCATCCGTGACCGTGGT AAACTGGTGCCGATC

GCGATTCAGTTCGGCGAGAACGCGGAAAAGCTGTACACCCCGTTCGAGAAAAACCCG CTGGCGTGGCTGTTT

GCGAAGATTTGCGTGCAAGTTGCGGATAGCAACCACCACGAAATGAACAGCCACCTG TGCCGTACCCACTTCG

TTATGGAGCCGATCGCGATTTGCACCGCGCGTCAGCTGGCGGAAAACCACCCGCTGA GCCTGCTGCTGAAAC

CGCACCTGCGTTTTATGCTGACCAACAACCACCTGGGTCAAGAACGTCTGATCAACC CGGGTGGCCCGGTGGA

TGAGCTGCTGGCGGGCACCCTGGGTGAAAGCATGGCGCTGGTTAAGGACGCGTACGC GAACTGGAACCTGC

GTGATTTCGCGTTTCCGAAAGAGATTAGCAACCGTGGCATGGACGATACCGAACGTC TGCCGCACTACCCGTA

TCGT G ACG AT GGTAT GCT GGT GTGGCAG AGCAT CAACCAATT CGTT AGCG ACTACCT GCACTACTTTT AT CCG A

ACCCGCAGGACATTACCAACGATCAGGAGCTGCAAGCGTGGGCGGGTGAATGCAGCA ACAGCGCGGCGGAT

CAAGGTGGCAACGT G AAGGGTATGCCGGCG AACTTCACCG ACGTT G AGG ATCT GAT CG AAGTGGTTACCACC

ATCATTTTTATTTGCGGCCCGCTGCACAGCGCGGTTAACTACGGCCAGTACGACTAT ATGACCTTTGCGGCGAA

CATGCCGCTGGCGGCGTATTGCGACCTGCCGGAGGCGATCAAGGATACCACCGGTAGCATCATTGGCGACGC

GCGTGGTAGCATCACCGAAAAAGATATTCTGCAGCTGCTGCCGCCGTACAAGAAAGCGGCGGATCAGCTGCA

AAGCCTGTTCACCCTGAGCGACTACCGTTATGATCAACTGGGCTACTATGACAAGGCGTTTCGTGAGCTGTAT

GGTCGTAAATTCGAGGAAGTGTTTGCGGAAGGCGATCAGGCGACCATCACCGGTTTCCTGCGTCAATTTCAGC

AAAACCTGAACATGAACGAGCAGGAAATCGACGCGAACAACCAAAAGCGTATTGTTCCGTACACCTATCTGA

AACCGAGCCTGATTCTGAACAGCATCAGCATTTAA

Amino acid sequence for WP_002738122.1mut2 - SEQ ID NO: 256

MVNTPPPTPCLPQN EPDANRRADSLNLQRQAYRYDYQYLPPLVLMESVPAAENFSFQYITERLAATAELPANM LA

VKVKSFLDPLDELQDYEDFFAI IPLPKIAKVYQTNDAFAEQRLSGANPLVLHLLKPGDARAQVLNQI PSSKTDFEPLFQ

VEQELAAGN IYICDYTGTDINYLGPCMIQGGTHAKGRKYLPKPRAFFWWRKSGIRDRGKLVPIAIQFGE NAEKLYTP

FEKN PLAWLFAKICVQVADSN HH EM NSH LCRTH FVM EPIAICTARQLAEN HPLSLLLKPH LRFMLTNN HLGQERLI

N PGGPVDELLAGTLGESMALVKDAYANWNLRDFAFPKEISN RGMDDTERLPHYPYRDDGMLVWQSI NQFVSDY

LHYFYPN PQDITNDQELQAWAGECSNSAADQGGNVKGMPAN FTDVEDLIEVVTTI IFICGPLHSAVNYGQYDYM

TFAAN M PLAAYCDLPEAI KDTTGSI IGDARGSITEKDI LQLLPPYKKAADQLQSLFTLSDYRYDQLGYYDKAFRELYGR

KFEEVFAEGDQATITGFLRQFQQN LN MN EQEI DANNQKRIVPYTYLKPSLILNSISI
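
Two mutant sequences of the same parent can be compared residue by residue to locate substitutions. The Python sketch below is a minimal illustration under two assumptions: the spaces in the printed sequences are line-wrapping artifacts, and the two inputs have equal length and differ by substitution only. The helper name substitutions is hypothetical. Positions are counted within the fragments shown, not within the full-length proteins, and other differences may exist outside these fragments.

```python
def clean(seq):
    """Remove whitespace and upper-case the sequence."""
    return "".join(seq.split()).upper()


def substitutions(ref, variant):
    """List (1-based position, ref residue, variant residue) where the two differ."""
    ref, variant = clean(ref), clean(variant)
    if len(ref) != len(variant):
        raise ValueError("sequences must have equal length for this comparison")
    return [(i, a, b)
            for i, (a, b) in enumerate(zip(ref, variant), start=1)
            if a != b]


# Corresponding fragments as printed for SEQ ID NO: 254 and SEQ ID NO: 256.
FRAG_254 = "VNQELAAG NIYICDYTGTDI NYLGPSLIQGGTHAKGRKY"
FRAG_256 = "VEQELAAGN IYICDYTGTDINYLGPCMIQGGTHAKGRKY"

diffs = substitutions(FRAG_254, FRAG_256)
print(diffs)
```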

Codon-optimized coding sequence of WP_015204462.1mut - SEQ ID NO: 257

ATGCCGCAACCGTGCCTGCCGCAGAACGAGCCGAACCCGGAAAAACGTAACAACGAC CTGAGCGATCAGCAA

CAGGCGTACGAGTATGATTACAAGTATCTGCCGCCGCTGGTGCTGCTGAAGAAAATT CCGGCGTTCGAAAACT

TTAGCGCGCAGTACATCGCGGAACGTGTGGTTGCGACCAGCGAGCTGGTTCCGAACA TGCTGGCGGCGAAAG

CGCGTAGCTTTCTGGACCCGCTGGACGATATCAAGGACTACGAGGACCTGTTCACCC TGCTGCCGCTGCCGGA

AGTGGCGAAAGTTTATCAAACCAACAACAGCTTTGCGGAGCAGCGTCTGAGCGGTGC GAACCCGTTCGTGAT

TCGTCTGCTGGACGAGGACGATGCGCGTAGCCAAGTTCTGGAACAGATCCCGAGCTT CAAAGACGATTTTGA

GCCGCTGTTCGATGTGCGTAAGGAACTGGCGGCGGGTAACATCTACATTTGCGACTA TACCGGCACCGATGA

GTACTATCGTGGCCCGAGCATGGTTCAGGGTGGCACCTACGAAAAGGGCCGTAAATA TCTGCCGAAACCGCT

GGCGTTCTTTTGGTGGCAACGTACCGGTATTAGCGACCGTGGCAAGCTGGTGCCGAT CGCGATTCAGCTGGA

TGCGAGCAAGAACAGCAAAGTGTACACCCCGACCAACAGCAAAGTTTATACCCCGTT TGAGCAAAACCCGCTG

GACTGGCTGTTCGCGAAGCTGTGCGTGCAGATCGCGGATGGTAACCACCACGAAATG AGCAGCCACCTGTGC

CGTACCCACTTCGTTATGGAGCCGATCGCGATTTGCACCGCGCACCAGCTGGCGGAA AACCACCCGCTGAGCC

TGCTGCTGCGTCCGCACTTCCTGTTTATGCTGACCAACAACAGCCTGGGCCAACAGC GTCTGATCAACCCGGG

TGGCCCGGTGGATGAGCTGCTGGCGGGCACCCTGCCGGAGAGCATGGAACTGGTTAA GGATGCGTACGAGG

GCTGGAACATTAAAGAATTCGCGTTTCCGACCGAGATCAAGAACCGTGGTATGGACA ACACCGAACGTCTGC

CGCACTACCCGTATCGTGACGATGGCATGCTGGTTTGGAAAGCGATTCACACCTTTG TGAGCGATTACGTTAA

CCACTT CT ATCCG ACCCCGG AAG ACAT CACCGGT GAT ACCG AGCTGCAAGCGTGGGCG AAGG AAT GCAGCG A

CCAAAGCGCGCAGACCAACGGTGGCAAGGTGAAAGGCATGCCGACCAGCTTTACCAC CGTGCAGGAGCTGA

TCGAAATTGTTACCACCATCATTTTCATTTGCGGTCCGCAACACAGCGCGGTTAACT ACGCGCAGGATGGCTAT

ATGACCTTTGCGGCGAACATGCCGCTGGCGGCGTACCGTGACATCCCGAAGCAGAGC CACAAACCGCAGGAT

CAACCGACCGCGACCCCGAGCGTGGCGGTTCAAACCACCGCGGAGCAGACCACCGCG GAACAAACCAAGGC

GGTGGAAATTACCGCGGACAAAGCGACCCTGGATCAGAACACCGTTCTGCAAAAACG TGCGGTGCAGACCAC

CACCGTTGAGATCCCGGAAGACCAAATTACCGAGGAACAGATCCTGAAGCTGCTGCC GCCGTACAAACGTAC

CGCGGACCAACTGCAGAGCCTGTTTGTGCTGAGCGCGTACCAATATGATCGTCTGGG TTACTATGAGAAGGC

GTT CC AACAGCT GT AC AACG AC AAGTT CG AAG AT GTTTT CAAGG ACG AT AAC AACC AAGCG AT C ATT GCG ATT

GTGCGTCAGTT CCAACAG AACCT G AACAT GGTT G AGCAGG AAAT CG ACGCGAACAACAAG AAACGT GTGGTT

CCGTACCTGTATCTGAAGCCGAGCCTGATCCTGAACAGCATCAGCATTTAA

Amino acid sequence for WP_015204462.1mut - SEQ ID NO: 258

MPQPCLPQN EPNPEKRN NDLSDQQQAYEYDYKYLPPLVLLKKIPAFEN FSAQYIAERVVATSELVPNM LAAKARSF

LDPLDDIKDYEDLFTLLPLPEVAKVYQTNNSFAEQRLSGAN PFVI RLLDEDDARSQVLEQIPSFKDDFEPLFDVRKELA

AGNIYICDYTGTDEYYRGPSMVQGGTYEKGRKYLPKPLAFFWWQRTGISDRGKLVPI AIQLDASKNSKVYTPTNSK

VYTPFEQNPLDWLFAKLCVQIADGNH HEMSSH LCRTHFVMEPIAICTAHQLAEN HPLSLLLRPHFLFMLTN NSLGQ

QRLI N PGGPVDELLAGTLPESM ELVKDAYEGWNIKEFAFPTEIKN RGMDNTERLPHYPYRDDGMLVWKAIHTFVS

DYVNH FYPTPEDITGDTELQAWAKECSDQSAQTNGGKVKGM PTSFTTVQELI EIVTTI IFICGPQHSAVNYAQDGY

MTFAAN MPLAAYRDIPKQSH KPQDQPTATPSVAVQTTAEQTTAEQTKAVEITADKATLDQNTVLQKRAVQTTTV

EI PEDQITEEQILKLLPPYKRTADQLQSLFVLSAYQYDRLGYYEKAFQQLYN DKFEDVFKDDN NQAIIAIVRQFQQNL

N MVEQEIDANN KKRVVPYLYLKPSLI LNSISI

Codon-optimized coding sequence of WP_015204462.1mut2 - SEQ ID NO: 259

ATGCCGCAACCGTGCCTGCCGCAGAACGAGCCGAACCCGGAAAAACGTAACAACGAC CTGAGCGATCAGCAA

CAGGCGTACGAGTATGATTACAAGTATCTGCCGCCGCTGGTGCTGCTGAAGAAAATT CCGGCGTTCGAAAACT

TTAGCGCGCAGTACATCGCGGAACGTGTGGTTGCGACCAGCGAGCTGGTTCCGAACA TGCTGGCGGCGAAAG

CGCGTAGCTTTCTGGACCCGCTGGACGATATCAAGGACTACGAGGACCTGTTCACCC TGCTGCCGCTGCCGGA

AGTGGCGAAAGTTTATCAAACCAACAACAGCTTTGCGGAGCAGCGTCTGAGCGGTGC GAACCCGTTCGTGAT

TCGTCTGCTGGACGAGGACGATCCGCGTAGCCAAGTTCTGGAACAGATCCCGAGCTT CAAAGACGATTTTGA

GCCGCTGTTCGATGTGCGTAAGGAACTGGCGGCGGGTAACATCTACATTTGCGACTA TACCGGCACCGATGA

GTACTATCGTGGCCCGAGCATGGTTCAGGGTGGCACCTACGAAAAGGGCCGTAAATA TCTGCCGAAACCGCT

GGCGTTCTTTTGGTGGCAACGTACCGGTATTAGCGACCGTGGCAAGCTGGTGCCGAT CGCGATTCAGCTGGA

TGCGAGCAAGAACAGCAAAGTGTACACCCCGACCAACAGCAAAGTTTATACCCCGTT TGAGCAAAACCCGCTG

GACTGGCTGTTCGCGAAGCTGTGCGTGCAGATCGCGGATAGCAACCACCACGAAATG AGCAGCCACCTGTGC

CGTACCCACTTCGTTATGGAGCCGATCGCGATTTGCACCGCGCACCAGCTGGCGGAA AACCACCCGCTGAGCC

TGCTGCTGCGTCCGCACTTCCTGTTTATGCTGACCAACAACAGCCTGGGTCAACAGC GTCTGATCAACCCGGGT

GGCCCGGTGGATGAGCTGCTGGCGGGCACCCTGCCGGAGAGCATGGAACTGGTTAAG GATGCGTACGAGGG

CTGGAACATTAAAGAATTCGCGTTTCCGACCGAGATCAAGAACCGTGGTATGGACAA CACCGAACGTCTGCC

GCACTACCCGTATCGTGACGATGGCATGCTGGTTTGGAAAGCGATTCACACCTTTGT GAGCGATTACGTTAAC

CACTTCTATCCGACCCCGGAAGACATCACCGGTGATACCGAGCTGCAAGCGTGGGCG AAGGAATGCAGCGAC

CAAAGCGCGCAGACCAACGGTGGCAAGGTGAAAGGCATGCCGACCAGCTTTACCACC GTGCAGGAGCTGAT

CGAAATTGTTACCACCATCATTTTCATTTGCGGTCCGCAACACAGCGCGGTTAACTA CGCGCAGGATGGCTATA

TGACCTTTGCGGCGAACATGCCGCTGGCGGCGTACCGTGACATCCCGAAGCAGAGCC ACAAACCGCAGGATC

AACCGACCGCGACCCCGAGCGTGGCGGTTCAAACCACCGCGGAGCAGACCACCGCGG AACAAACCAAGGCG

GTGGAAATTACCGCGGACAAAGCGACCCTGGATCAGAACACCGTTCTGCAAAAACGT GCGGTGCAGACCACC

ACCGTTGAGATCCCGGAAGACCAAATTACCGAGGAACAGATCCTGAAGCTGCTGCCG CCGTACAAACGTACC

GCGGACCAACTGCAGAGCCTGTTTGTGCTGAGCGCGTACCAATATGATCGTCTGGGT TACTATGAGAAGGCG

TT CC AACAGCT GT AT GGCG ACAAGTTT G AAG AT GTTTT CAAAG ACG AT AAC AACCAAGCG AT C ATTGCG ATT G

TGCGTCAGTT CCAACAG AACCT G AACATGGTT G AGCAGGAAAT CG ACGCG AACAACAAG AAACGT GTGGTT C

CGT ACCT GT AT CT G AAGCCG AGCCT GAT CCT G AACAGCAT CAGCATTT AA

Amino acid sequence for WP_015204462.1mut2 - SEQ ID NO: 260

MPQPCLPQN EPNPEKRN NDLSDQQQAYEYDYKYLPPLVLLKKIPAFEN FSAQYIAERVVATSELVPNM LAAKARSF

LDPLDDIKDYEDLFTLLPLPEVAKVYQTNNSFAEQRLSGAN PFVI RLLDEDDPRSQVLEQIPSFKDDFEPLFDVRKELA

AGNIYICDYTGTDEYYRGPSMVQGGTYEKGRKYLPKPLAFFWWQRTGISDRGKLVPIAIQLDASKNSKVYTPTNSK

VYTPFEQNPLDWLFAKLCVQIADSNHHEMSSHLCRTHFVMEPIAICTAHQLAENHPLSLLLRPHFLFMLTNNSLGQ

QRLI N PGGPVDELLAGTLPESM ELVKDAYEGWNIKEFAFPTEIKN RGMDNTERLPHYPYRDDGMLVWKAIHTFVS

DYVNH FYPTPEDITGDTELQAWAKECSDQSAQTNGGKVKGM PTSFTTVQELI EIVTTI IFICGPQHSAVNYAQDGY

MTFAAN MPLAAYRDIPKQSH KPQDQPTATPSVAVQTTAEQTTAEQTKAVEITADKATLDQNTVLQKRAVQTTTV

EI PEDQITEEQILKLLPPYKRTADQLQSLFVLSAYQYDRLGYYEKAFQQLYGDKFEDVFKDD NNQAIIAIVRQFQQNL

N MVEQEIDANN KKRVVPYLYLKPSLI LNSISI

Codon-optimized coding sequence of WP_015204462.1mut3 - SEQ ID NO: 261

ATGCCGCAACCGTGCCTGCCGCAGAACGAGCCGAACCCGGAAAAACGTAACAACGAC CTGAGCGATCAGCAA

CAGGCGTACGAGTATGATTACAAGTATCTGCCGCCGCTGGTGCTGCTGAAGAAAATT CCGGCGTTCGAAAACT

TTAGCGCGCAGTACATCGCGGAACGTGTGGTTGCGACCAGCGAGCTGGTTCCGAACA TGCTGGCGGCGAAAG

CGCGTAGCTTTCTGGACCCGCTGGACGATATCAAGGACTACGAGGACCTGTTCACCC TGCTGCCGCTGCCGGA

AGTGGCGAAAGTTTATCAAACCAACAACAGCTTTGCGGAGCAGCGTCTGAGCGGTGC GAACCCGTTCGTGAT

TCGTCTGCTGGACGAGGACGATGCGCGTAGCCAAGTTCTGGAACAGATCCCGAGCTT CAAAGACGATTTTGA

ACCGCTGTTCGATGTGGAGAAGGAACTGGCGGCGGGTAACATCTACATTTGCGACTA TACCGGCACCGATGA

GTACTATCGTGGCCCGAGCATGGTTCAAGGTGGCACCTACGAAAAGGGCCGTAAATA TCTGCCGAAGCCGCT

GGCGTTCTTTTGGTGGCAGCGTACCGGTATTAGCGACCGTGGCCAACTGGTGCCGAT CGCGATTCAGCTGGA

CCCGAGCAAGAACAGCAAAGTGTACACCCCGACCAACAGCAAAGTTTATACCCCGTT TGAGCAAAACCCGCTG

GACTGGCTGTTCGCGAAGCTGTGCGTGCAGATCGCGGATGCGAACCACCACGAAATG AGCAGCCACCTGTGC

CGTACCCACTTCGTTATGGAGCCGATCGCGATTTGCACCGCGCACCAGCTGGCGGAA AACCACCCGCTGAGCC

TGCTGCTGCGTCCGCACTTCCTGTTTATGCTGACCAACAACAGCCTGGGCCAACAGC GTCTGATCAACCCGGG

TGGCCCGGTGGATGAGCTGCTGGCGGGCACCCTGCCGGAGAGCATGGAACTGGTTAA GGATGCGTACGAGG

GCTGGAACATTAAAGAGTTCGCGTTTCCGACCGAGATCAAGAACCGTGGTATGGACA ACACCGAACGTCTGC

CGCACTACCCGTATCGTGACGATGGCATGCTGGTTTGGAAAGCGATTCACACCTTTG TGAGCGATTACGTTAA

CCACTT CT ATCCG ACCCCGG AAG ACAT CACCGGT GAT ACCG AGCTGCAAGCGTGGGCG AAGG AAT GCAGCG A

CCAAAGCGCGCAGACCAACGGTGGCAAGGTGAAAGGCATGCCGACCAGCTTTACCAC CGTGCAGGAGCTGA

TCGAAATTGTTACCACCATCATTTTCATTTGCGGTCCGCAACACAGCGCGGTTAACT ACGCGCAGGATGGCTAT

ATGACCTTTGCGGCGAACATGCCGCTGGCGGCGTACCGTGACATCCCGAAGCAGAGC CACAAACCGCAGGAT

CAACCGACCGCGACCCCGAGCGTGGCGGTTCAAACCACCGCGGAGCAGACCACCGCG GAACAAACCAAGGC

GGTGGAGATTACCGCGGACAAAGCGACCCTGGATCAGAACACCGTTCTGCAAAAACG TGCGGTGCAGACCAC

CACCGTTGAGATCCCGGAAGACCAAATTACCGAGGAACAGATCCTGAAGCTGCTGCC GCCGTACAAACGTAC

CGCGGACCAACTGCAGAGCCTGTTTGTGCTGAGCGCGTACCAGTATGATCGTCTGGG TTACTATGAGAAGGC

GTT CC AACAGCT GT AC AACG AC AAGTT CG AAG AT GTTTT CAAGG ACG AT AAC AACC AAGCG AT C ATT GCG ATT

GTGCGTCAGTT CCAACAG AACCT G AACAT GGTT G AGCAGG AAAT CG ACGCGAACAACAAG AAACGT GTGGTT

CCGT ACCT GT AT CT G AAACCG AGCCT GAT CCT G AACAGCAT CAGCATTT AA

Amino acid sequence for WP_015204462.1mut3 - SEQ ID NO: 262

MPQPCLPQN EPNPEKRN NDLSDQQQAYEYDYKYLPPLVLLKKIPAFEN FSAQYIAERVVATSELVPNM LAAKARSF

LDPLDDIKDYEDLFTLLPLPEVAKVYQTNNSFAEQRLSGAN PFVI RLLDEDDARSQVLEQIPSFKDDFEPLFDVEKELA

AGNIYICDYTGTDEYYRGPSMVQGGTYEKGRKYLPKPLAFFWWQRTGISDRGQLVPI AIQLDPSKNSKVYTPTNSK

VYTPFEQNPLDWLFAKLCVQIADAN HH EMSSHLCRTH FVMEPIAICTAHQLAEN HPLSLLLRPHFLFMLTN NSLGQ

QRLI N PGGPVDELLAGTLPESM ELVKDAYEGWNIKEFAFPTEIKN RGMDNTERLPHYPYRDDGMLVWKAIHTFVS

DYVNH FYPTPEDITGDTELQAWAKECSDQSAQTNGGKVKGM PTSFTTVQELI EIVTTI IFICGPQHSAVNYAQDGY

MTFAANMPLAAYRDIPKQSHKPQDQPTATPSVAVQTTAEQTTAEQTKAVEITADKATLDQNTVLQKRAVQTTTV

EIPEDQITEEQILKLLPPYKRTADQLQSLFVLSAYQYDRLGYYEKAFQQLYNDKFEDVFKDDNNQAIIAIVRQFQQNL

NMVEQEIDANNKKRVVPYLYLKPSLILNSISI

Codon-optimized coding sequence of WP_006635899.1mut - SEQ ID NO: 263

ATGGTGGATAACATGAAGCCGTGCCTGCCGCAAGACGATCCGAACCCGGAACAGCGT CACGACAGCCTGAAC

CGTCAGCAACAGGCGTACCAATTCGATTATGAAAGCCTGAGCCCGCTGGCGCTGCTG AAGGATGTGCCGGCG

GTTGAGAACTTTAGCAGCAAATACCTGGCGGAGCGTATCCTGGCGACCAGCGAACTG CCGGCGAACATGCTG

GCGGCGGACAGCCGTACCTTCCTGGACCCGCTGGATGAGCTGCAGGACTACGAAGAT TTCTTTACCTGGCTGC

CGCTGCCGGGTGTGGCGAAAATCTATCAAACCGATCGTAGCTTTGCGGAACAGCGTC TGAGCGGTGCGAACC

CGATGGTTCTGCGTCTGCTGCACCAAGAGGACGCGCGTGCGGAAACCCTGGCGCAAC TGTGCTGCCTGCAGC

CGCTGTTCGACCTGCGTAAGGAGCTGCAGGATAAAAACATCTACATTTGCGACTATA CCGGCACCGATGAACA

CTATCGTGGTCCGGCGAAGGTTGCGGGTGGCACCTACGAGAAGGGTCGTAAATATCT GCCGAAACCGCGTGC

GTTCTTTGCGTGGCGTTGGACCGGTATCCGTGATCGTGGCGAGATGACCCCGATCGC GATTCAACTGGACCCG

AAGCCGGGTAGCCACCTGTACACCCCGTTTGACCCGCCGATTGATTGGCTGTATGCG AAACTGTGCGTGCAGG

TT GCGG ACGCG AACCACCACG AAAT G AGCAGCCACCTGGGCCGTACCCACCT GGT G ATGG AGCCG ATCGCG A

TTTGCACCGCGCGTCAGCTGGCGAAGAACCACCCGCTGAGCCTGCTGCTGAAACCGC ACTTCCGTTTTATGCT

GACCAACAACAGCCTGGCGCGTAGCCACCTGATTGCGCCGGGTGGCCCGGTTGATGA ACTGCTGGGTGGCAC

CCTGGCGGAGACCATGGAACTGACCCGTGAGGCGTGCAGCACCTGGAGCCTGGATGA GTTTGCGCTGCCGGC

GGAACTGAAGAACCGTGGTATGGACGATCCGAACCAGCTGCCGCACTACCCGTATCG TGACGATGGCCTGCT

GCTGTGGGATGCGATCGAAACCTTTGTGAGCGGTTACCTGAAGTTCTTTTATCCGAC CAACGAGGGCATTGTG

CAAGACGTTGAACTGCAGACCTGGGCGAAAGAGTGCGCGAGCGACGATGGTGGCAAG GTGAAGGGTATGCC

GCACCACATCGACACCGTTGAGCAGCTGATCGCGATTGTGACCACCGTTATTTTCAC CTGCGGCCCGCAACAC

AGCGCGGTGAACTTCCCGCAGTACGATTATATGAGCTTTGCGGCGAACATGCCGCTG GCGGCGTACCGTGAC

ATCCCGGGTATTACCGCGAGCGGCCACCTGGAAGTGATCACCGAAAACGATATTCTG CGTCTGCTGCCGCCGT

ATAAGCGTGCGGCGGACCAACTGCAGATCCTGTTCATTCTGAGCGCGTACCGTTATG ACCGTCTGGGTTACTA

TGATAAAAGCTTTCGTGAACTGTACCGTATGAGCTTCGATGAGGTTTTTGCGGGCAC CCCGATCCAACTGCTG

GCGCGTCAGTTCCAACAGAACCTGAACATGGCGGAACAAAAGATCGACGCGAACAAC CAGAAACGTGTGATT

CCGTATTTTGCGCTGAAACCGAGCCTGGTTCTGAACAGCATTAGCATGTAA

Amino acid sequence for WP_006635899.1mut - SEQ ID NO: 264

MVDNMKPCLPQDDPN PEQRHDSLNRQQQAYQFDYESLSPLALLKDVPAVEN FSSKYLAERILATSELPANMLAAD

SRTFLDPLDELQDYEDFFTWLPLPGVAKIYQTDRSFAEQRLSGAN PMVLRLLHQEDARAETLAQLCCLQPLFDLRKE

LQDKN IYICDYTGTDEHYRGPAKVAGGTYEKGRKYLPKPRAFFAWRWTGIRDRGEMTPIAIQLDP KPGSHLYTPFD

PPIDWLYAKLCVQVADANH HEMSSH LGRTH LVMEPIAICTARQLAKNHPLSLLLKPH FRFM LTNNSLARSH LIAPG

GPVDELLGGTLAETM ELTREACSTWSLDEFALPAELKNRGMDDPNQLPHYPYRDDGLLLWDAI ETFVSGYLKFFYP

TN EGIVQDVELQTWAKECASDDGGKVKGMPHH IDTVEQLIAIVTTVIFTCGPQHSAVNFPQYDYMSFAAN MPLA

AYRDIPGITASGHLEVITENDILRLLPPYKRAADQLQILFILSAYRYDRLGYYDKSF RELYRMSFDEVFAGTPIQLLARQF

QQN LN MAEQKIDANNQKRVI PYFALKPSLVLNSISM

Codon-optimized coding sequence of WP_015178512.1mut - SEQ ID NO: 265

ATGGTGGACAACATGAAGCCGTGCCTGCCGCAAGACGATCCGAACCAAGAGCAGCGT AAAGACAGCCTGAA

CCGTCAGCAACAGGCGTACCAGTTCGATTATGAGAGCCTGAGCCCGCTGGCGCTGCT GAAGAACGTGCCGGC

GGTTGAAAACTTTAGCAGCAAATACATCGGCGAGCGTATTCTGGCGACCAGCGAACT GCCGGCGAACATGCT

GGCGGCGGACAGCCGTACCTTCCTGGACCCGCTGGATGAGCTGCAAGACTACGAAGATTTCTTTACCCTGCTG

CCGCTGCCGGCGGTGGCGAAGATTTATCAAACCGATCGTAGCTTTGCGGAGCAGCGTCTGAGCGGTGCGAAC

CCGATGGTTCTGCGTCTGCTGGATGCGGGTGATGCGCGTGCGCAAACCCTGGCGCAG ATCAGCAGCTTCCAC

CCGCTGTTTGACCTGGGCCAGGAACTGCAACAGAAAAACATTTACGTTTGCGACTAT ACCGGCACCGATGAGC

ACTACCGTGCGCCGAGCAAGATCGGTGGCGGTAGCTATGAAAAGGGCCGTAAATTCC TGCCGAAACCGCGTG

CGTTCTTTGCGTGGCGTTGGACCGGCATCCGTGACCGTGGTGAAATGACCCCGATCG CGATTCAACTGGACCC

G ACCCCGG ATAGCCAT GT GTACACCCCGTTT G ACCCGCCGGTT GATT GGCT GTTTGCGAAGCT GT GCGTGCAG

GTTGCGGATGCGAACCACCACGAGATGAGCAGCCACCTGGGTCGTACCCACCTGGTG ATGGAACCGATCGCG

ATTTGCACCGCGCGTCAACTGGCGCAGAACCACCCGCTGAGCCTGCTGCTGAAACCG CACTTCCGTTTTATGCT

G ACCAACAACAGCCT GGCGCGTAGCTACCT G ATTGCGCCGGGCGGT CCGGTT GAT G AGCT GCT GGGT GGCAC

CCTGCCGGAGACCATGGAAATCGCGCGTGAAGCGTGCAGCACCTGGAGCCTGGATGA GTTTGCGCTGCCGGC

GGAACTGAAGAACCGTGGCATGGACGATACCAACCAGCTGCCGCACTACCCGTATCG TGACGATGGCCTGCT

GCTGTGGGACGCGATTGAGACCTTTGTGAGCGGTTACCTGAAATTCTTTTATCCGAC CGAAATCGCGATTGTG

CAAGACGTTGAGCTGCAAACCTGGGCGCAGGAATGCGCGAGCGATCGTGGCGGTAAA GTGAAAGGCATGCC

GCCGCGTATCAACACCGTGGAGCAGCTGATCAAGATTGTTACCACCATCATTTTCAC CTGCGGTCCGCAACAC

AGCGCGGTTAACTTCCCGCAGTACGAATATATGAGCTTTGCGGCGAACATGCCGCTG GCGGCGTACCGTGAT

ATCCCGAAGATTACCGCGAGCGGTAACCTGGAAGTGATCACCGAAAAAGACATTCTG CGTCTGCTGCCGCCGT

ATAAGCGTGCGGCGGATCAGCTGAAAATCCTGTTCACCCTGAGCGCGTACCGTTATG ACCGTCTGGGCTACTA

TGATAAGAGCTTTCGTGAGCTGTACCGTATGAGCTTCGACGAAGTTTTTGCGGGCAC CCCGATTCAACTGCTG

GCGCGTCAGTTTCAACAGAACCTGAACATGGCGGAGCAAAAGATCGATGCGAACAAC CAGAAACGTGTGATC

CCGTATATTGCGCTGAAACCGAGCCTGGTTATCAACAGCATTAGCATGTAA

Amino acid sequence for WP_015178512.1mut - SEQ ID NO: 266

MVDNMKPCLPQDDPNQEQRKDSLNRQQQAYQFDYESLSPLALLKNVPAVENFSSKYIGERILATSELPANMLAAD

SRTFLDPLDELQDYEDFFTLLPLPAVAKIYQTDRSFAEQRLSGANPMVLRLLDAGDARAQTLAQISSFHPLFDLGQEL

QQKNIYVCDYTGTDEHYRAPSKIGGGSYEKGRKFLPKPRAFFAWRWTGIRDRGEMTPIAIQLDPTPDSHVYTPFDP

PVDWLFAKLCVQVADANHHEMSSHLGRTHLVMEPIAICTARQLAQNHPLSLLLKPHFRFMLTNNSLARSYLIAPG

GPVDELLGGTLPETMEIAREACSTWSLDEFALPAELKNRGMDDTNQLPHYPYRDDGLLLWDAIETFVSGYLKFFYP

TEIAIVQDVELQTWAQECASDRGGKVKGMPPRINTVEQLIKIVTTIIFTCGPQHSAVNFPQYEYMSFAANMPLAAY

RDIPKITASGNLEVITEKDILRLLPPYKRAADQLKILFTLSAYRYDRLGYYDKSFRELYRMSFDEVFAGTPIQLLARQFQ

QNLNMAEQKIDANNQKRVIPYIALKPSLVINSISM

Codon-optimized coding sequence of WP_028091425.1mut - SEQ ID NO: 267

ATGCAGCCGTGCCTGCCGCAAAACGACCCGAACCCGAGCCAGCGTCAAAGCAGCCTG GAGAAGGGTCGTAA

GGAATACCAGTTCATGTACGATTTTCTGCCGCCGATGGCGATGATCAAGAGCGTGCC GCCGGCGGAGAACTT

CAGCACCAAATACATTGCGGAACGTACCCTGGAGGCGGCGGAACTGCCGCTGAACAT GATGGCGGTTAAGAC

CCACGCGATGTGGGACCCGCTGGATGAGCTGCAGGACTACGAAGATTTCTTTCCGGT GCTGCAAAAGCCGAA

CGTTATGAAAACCTATGAGACCGACGATAGCTTTGCGGAACAGCGTCTGTGCGGCGT GAACCCGATGGTTCT

GCGTCAGATCAAGCAAATGGACGCGCGTTTCGCGTTTACCATTGAGGAACTGCAAGA TAAATTCGGTAGCAG

CATCAACCTGATTGAGCGTCTGGCGACCGGCAACCTGTACGTGTGCGACTATCGTAG CCTGGCGTTTATCCAG

GGTGGCACCTACGCGAAGGGTAAGAAATATCTGCCGGCGCCGCTGGCGTTCTTTTGC TGGCGTACCAGCGGT

TTCCAGGATCGTGGCCAACTGGTGCCGGTTGCGATCCAGATTAACCCGAAAGCGGGT AAAGCGAGCCCGCTG

CTGACCCCGTTTGATGATCCGCTGACCTGGTTTTACGCGAAAAGCTGCGTTCAAATC GCGGACGCGAACCACC

ACGAGATGAGCAGCCACCTGTGCCGTACCCACCTGGTGATGGAGCCGTTTGCGGTTTGCACCCCGCGTCAGCT

GGCGGAAAACCACCCGCTGCGTATTCTGCTGAAGCCGCACTTCCGTTTTATGCTGGCGAACAACAGCCTGGCG

CGTAAACGTCTGGTTAGCCGTGGTGGCTTCGTTGACGAGCTGCTGGCGGGCACCCTGCAGGAAAGCCTGCAA

ATCGTGGTTGACGCGTACAAAAGCTGGAGCCTGGATCAGTTTGCGCTGCCGCGTGAA CTGAAGAACCGTGGT

GTGAACGATGTTAAAAACCTGCCGCACTACCCGTATCGTGACGATGGCATCCTGCTG TGGAACGCGATTAACA

AGTTCGTGTTTAACTATCTGCAGCTGTACTATCAAAGCAGCGCGGACCTGAAGGCGG ATGCGGAACTGCAGG

CGTGGGCGCGTGAATGCGTGGCGCAAGACGGTGGCCGTGTTAAGGGTATGAGCGACC GTATCGATACCCTG

GAGCAGCTGGTTGAGATCGTTACCACCATCATTTACATTTGCGGCCCGCAGCACAGC GCGGTGAACTTCAGCC

AATACGAATATATGGGCTTTATTCCGAACATGCCGCTGGCGGCGTATCAGCCGATCC AGCAAAAGGGTGACAT

TAAAGATCGTCAAGCGCTGATCGATTTCCTGCCGCCGGCGAAACCGACCAGCACCCA GCTGAGCACCGTTTAC

ATTCTGAGCGACTACCGTTATGATCGTCTGGGCTACTATGAGGAAGAGGAATTCACC GACCCGAACGCGGATC

AGGTGGTTAACAAGTTTCAGCAAGAGCTGAACATGGTGCAGCGTAAGATCGAACTGAACAACAAACGTCGTC

TGGTTAACTACAAATATCTGCAACCGCGTCTGATTCTGAACAGCATCAGCATTTAA

Amino acid sequence for WP_028091425.1mut - SEQ ID NO: 268

MQPCLPQNDPNPSQRQSSLEKGRKEYQFMYDFLPPMAMIKSVPPAENFSTKYIAERTLEAAELPLNMMAVKTHA

MWDPLDELQDYEDFFPVLQKPNVMKTYETDDSFAEQRLCGVNPMVLRQIKQMDARFAFTIEELQDKFGSSINLIE

RLATGNLYVCDYRSLAFIQGGTYAKGKKYLPAPLAFFCWRTSGFQDRGQLVPVAIQINPKAGKASPLLTPFDDPLTW

FYAKSCVQIADANHHEMSSHLCRTHLVMEPFAVCTPRQLAENHPLRILLKPHFRFMLANNSLARKRLVSRGGFVDE

LLAGTLQESLQIVVDAYKSWSLDQFALPRELKNRGVNDVKNLPHYPYRDDGILLWNAINKFVFNYLQLYYQSSADLK

ADAELQAWARECVAQDGGRVKGMSDRIDTLEQLVEIVTTIIYICGPQHSAVNFSQYEYMGFIPNMPLAAYQPIQQ

KGDIKDRQALIDFLPPAKPTSTQLSTVYILSDYRYDRLGYYEEEEFTDPNADQVVNKFQQELNMVQRKIELNNKRRL

VNYKYLQPRLILNSISI

Codon-optimized coding sequence of OBQ01436.1mut - SEQ ID NO: 269

ATGCAGCCGTGCCTGCCGCAAAACGACCCGAACCCGGCGCAGCGTCAAAGCTGCCTG GAGAAGGGTCGTAA

GGAATACCAGTTCATGTACGATTTTCTGCCGCCGATGGCGATGCTGAAGAGCGTTCC GCCGGCGGAGAACTTC

AGCACCAAATACATCGCGGAACGTACCCTGGAGGCGGCGGAACTGCCGCTGAACATG ATGGCGGTGAAGAC

CCACGCGATGTGGGACCCGCTGGATGAGCTGCAGGACTACGAAGATTTCTTTCCGAT TCTGCAAAAGCCGAAC

GTTATGAAAACCTATGAGACCGACGATAGCTTTGCGGAACAGCGTCTGTGCGGCGTG AACCCGATGGTTCTG

CGTCAGATCAAGCAAATGGACGCGCGTTTCGCGTTTACCATTGAGGAACTGCAAGCG AAATTCGGTAACAGC

ATCAACCTGATTGAGCGTCTGGCGACCGGCAACCTGTACGTTTGCGATTATCGTAGC CTGGCGTTTATCCAGG

GTGGCACCTACGCGAAGGGTAAGAAATATCTGCCGGCGCCGCTGGCGTTCTTTTGCT GGCGTAGCAGCGGTT

TCCAGGACCGTGGCCAACTGGTGCCGGTTGCGATCCAGATTAACCCGAAAGCGGGTA AAGCGAGCCCGCTGC

TGACCCCGTTTGATGATCCGCTGACCTGGTTTTACGCGAAAAGCTGCGTGCAAATCGCGGATGCGAACCACCA

CGAGATGAGCAGCCACCTGTGCCGTACCCACCTGGTGATGGAGCCGTTTGCGGTTTG CACCCCGCGTCAGCTG

GCGGAAAACCACCCGCTGCGTATTCTGCTGCGTCCGCACTTCCGTTTTATGCTGGCG AACAACAGCCTGGCGC

GTAAGCGTCTGGTTAGCCGTGGTGGCTTCGTTGACGAGCTGCTGGCGGGCACCCTGC AGGAAAGCCTGCAAA

TCGTGGTTGACGCGTACAAAAGCTGGAGCCTGGATCAGTTTGCGCTGCCGCGTGAAC TGAAGAACCGTGGTG

TGGACGATGTTAAAAACCTGCCGCACTACCCGTATCGTGACGATGGCATCCTGCTGT GGAACGCGATTAACAA

GTTCGTGTTTAACTATCTGCAGCTGTACTATAAGAGCCCGGCGGACCTGAAGGCGGA TGGTGAACTGCAGGC

GTGGGCGCGTGAATGCGTGGCGCAAGACGGTGGCCGTGTTAAAGGCATGAGCGACCG TATCGATACCCTGG

AGCAACTGGTGGAAATCGTTACCACCATCATTTACATTTGCGGCCCGCAGCACAGCG CGGTGAACTTCAGCCA

ATACGAGTATATGGGCTTTATTCCGAACATGCCGCTGGCGGCGTATCAGGAGATCCA GCAAAACGGTGACATT

GAAGATCGTCAAGCGCTGATCGATTTCCTGCCGCCGGCGAAGCCGACCAACACCCAG CTGAGCACCGTTTACA

TTCTGAGCGACTACCGTTATGATCGTCTGGGCTACTATGAGGAAGAGGAATTCACCGACCCGAACGCGGATCA

GGTGGTTAACAAATTTCAGCAAGAGCTGAGCGTGGTTCAGCGTAAGATCGAACTGAACAACAAAGGTCGTCT

GGTGAACTACGAATATCTGCAACCGGGCCTGATTCTGAACAGCATCAGCATTTAA

Amino acid sequence for OBQ01436.1mut - SEQ ID NO: 270

MQPCLPQNDPNPAQRQSCLEKGRKEYQFMYDFLPPMAMLKSVPPAENFSTKYIAERTLEAAELPLNMMAVKTH

AMWDPLDELQDYEDFFPILQKPNVMKTYETDDSFAEQRLCGVNPMVLRQIKQMDARFAFTIEELQAKFGNSINLI

ERLATGNLYVCDYRSLAFIQGGTYAKGKKYLPAPLAFFCWRSSGFQDRGQLVPVAIQINPKAGKASPLLTPFDDPLT

WFYAKSCVQIADANHHEMSSHLCRTHLVMEPFAVCTPRQLAENHPLRILLRPHFRFMLANNSLARKRLVSRGGFV

DELLAGTLQESLQIVVDAYKSWSLDQFALPRELKNRGVDDVKNLPHYPYRDDGILLWNAINKFVFNYLQLYYKSPA

DLKADGELQAWARECVAQDGGRVKGMSDRIDTLEQLVEIVTTIIYICGPQHSAVNFSQYEYMGFIPNMPLAAYQEI

QQNGDIEDRQALIDFLPPAKPTNTQLSTVYILSDYRYDRLGYYEEEEFTDPNADQVVNKFQQELSVVQRKIELNNKG

RLVNYEYLQPGLILNSISI

Codon-optimized coding sequence of OBQ25779.1mut - SEQ ID NO: 271

ATGATCAACATTATGCAGCCGTGCCTGCCGCAAAACGACCCGAACCCGGGTCAGCGT CAAAGCAGCCTGGAG

AAGGGCCGTAAGGAATACCAGTTCATGTACGATTTTCTGCCGCCGATGGCGATGCTG AAGAGCGTGCCGCCG

GCGGAGAACTTCAGCACCAAATACATCGCGGAACGTACCCTGGAGGCGGCGGAACTG CCGCTGAACATGATG

GCGGTTAAGACCCACGCGATGTGGGACCCGCTGGATGAGCTGCAGGACTACGAAGAT TTCTTTCCGGTGCTG

CAAAAGCCGAACGTTATGAAAACCTATGAGACCGACGATAGCTTTGCGGAACAGCGT CTGTGCGGTGTGAAC

CCGATGGTTCTGCGTCAGATCAAGCAAATGGACGCGCGTTTCGCGTTTACCATTGAG GAACTGCAAGCGAAAT

TCGGTAACAGCATCAACCTGATTGAGCGTCTGGCGACCGGCAACCTGTACGTTTGCG ATTATCGTAGCCTGGC

GTTTATCCAGGGTGGCACCTACGCGAAGGGTAAGAAATATCTGCCGGCGCCGCTGGC GTTCTTTTGCTGGCGT

AGCAGCGGTTTCCAGGACCGTGGCCAACTGGTGCCGGTTGCGATCCAGATTAACCCG AAAGCGGGTCAAGCG

AGCCCGCTGCTGACCCCGTTTGACAAGCCGCTGACCTGGTTTTACGCGAAAAGCTGC GTGCAGATCGCGGATG

CGAACCACCACGAGATGAGCAGCCACCTGTGCCGTACCCACCTGGTGATGGAGCCGTTTGCGGTTTGCACCCC

GCGTCAACTGGCGGAAAACCACCCGCTGCGTATTCTGCTGAAGCCGCACTTCCGTTT TATGCTGGCGAACAAC

AGCCTGGCGCGTAAACGTCTGGTTAGCCGTGGTGGCTTCGTTGACGAGCTGCTGGCG GGCACCCTGCAGGAA

AGCCTGCAAATCGTGGTTGACGCGTACAAAAGCTGGAGCCTGGATCAGTTTGCGCTG CCGCGTGAACTGAAG

AACCGTGGTGTGGACGATGTTAAAAACCTGCCGCACTACCCGTATCGTGACGATGGC ATCCTGCTGTGGAACG

CGATTAACAAGTTCGTTTTTAACTATCTGCAGCTGTACTATAAGAGCCCGGCGGACC TGAAGGCGGATGGTGA

ACTGCAGGCGTGGGCGCGTGAATGCGTGGCGCAAGACGGTGGCCGTGTTAAAGGCAT GAGCGACCGTATCG

ATACCCTGGAGCAACTGGTGGAAATCGTTACCACCATCATTTACATTTGCGGCCCGC AGCACAGCGCGGTGAA

CTTCAGCCAATACGAGTATATGGGCTTTATTCCGAACATGCCGCTGGCGGCGTATCA GGCGATCCAGCAAAAG

GGCGACATTAAAGATCGTCAAGCGCTGATCGATTTCCTGCCGCCGGCGAAGCCGACC AACACCCAGCTGAGC

ACCGTTTACATTCTGAGCGACTACCGTTATGATCGTCTGGGTTACTATGAGGAAGAGGAATTCACCGACCCGA

ACGCGGATCAGGTGGTTAACAAATTTCAGCAAGAGCTGAACGTGGTTCAGCGTAAGA TCGAACTGAACAACA

AAGGCCGTCTGGTGAACTACGAATATCTGCAGCCGCGTCTGATTCTGAACAGCATCA GCATTTAA

Amino acid sequence for OBQ25779.1mut - SEQ ID NO: 272

MINIMQPCLPQNDPNPGQRQSSLEKGRKEYQFMYDFLPPMAMLKSVPPAENFSTKYIAERTLEAAELPLNMMAV

KTHAMWDPLDELQDYEDFFPVLQKPNVMKTYETDDSFAEQRLCGVNPMVLRQIKQMDARFAFTIEELQAKFGNS

INLIERLATGNLYVCDYRSLAFIQGGTYAKGKKYLPAPLAFFCWRSSGFQDRGQLVPVAIQINPKAGQASPLLTPFDK

PLTWFYAKSCVQIADANHHEMSSHLCRTHLVMEPFAVCTPRQLAENHPLRILLKPHFRFMLANNSLARKRLVSRG

GFVDELLAGTLQESLQIVVDAYKSWSLDQFALPRELKNRGVDDVKNLPHYPYRDDGILLWNAINKFVFNYLQLYYKS

PADLKADGELQAWARECVAQDGGRVKGMSDRIDTLEQLVEIVTTIIYICGPQHSAVNFSQYEYMGFIPNMPLAAY

QAIQQKGDIKDRQALIDFLPPAKPTNTQLSTVYILSDYRYDRLGYYEEEEFTDPNADQVVNKFQQELNVVQRKIELN

NKGRLVNYEYLQPRLILNSISI

Codon-optimized coding sequence of WP_039200563.1mut - SEQ ID NO: 273

ATGAAGCCGTGCCTGCCGCAGAACGATCCGAACCCGACCCAGCGTCAAAGCAGCCTG GAGAAGGGCCGTAA

AGAGTACGAATTCCGTTATGACTTTCTGCCGCCGATGGCGATGCTGAAGAACGTGCC GCCGAGCGAGAACTTC

AGCACCAAATACATTGCGGAACGTACCATCGAGACCGCGGAACTGCCGAGCAACATG ATGGCGGTTAAAGCG

CACGCGATGTGGGACCCGCTGGATGAGCTGCAGGACTACGAAGATTTCTTTCCGGTGCTGCAAAAGCCGAAC

GTTATGAAAAACTATGAGACCGACGATAGCTTCGCGGAACAGCGTCTGTGCGGTGTG AACCCGGTGGTTCTG

TGCCAGATTAAGCAAATGGATGCGCGTTTCGCGTTTACCATCGAGGAACTGCAAGCG AAATTTGGTAACAGCA

TTGATCTGCGTGAGCGTCTGGCGACCGGCAACCTGTACGTTTGCGACTATCGTCCGC TGGCGTTCATCCGTGG

TGGCACCTTTGCGAAGGGTAAGAAATACCTGCCGGCGCCGCTGGCGTTCTTTTGCTG GCGTAGCAGCGGTTTC

CAGGATCGTGGCCAACTGGTGCCGATCGCGATTCAGATCAACCCGAAGGAAGGCAAA GCGAGCCCGCTGCTG

ACCCCGTTCGACGATAGCAGCACCTGGTTTTACGCGAAGAGCTGCGTTCAAATCGCG GACGCGAACCACCAC

GAGATGAGCAGCCACCTGTGCCGTACCCACTTCGTGATGGAACCGTTTGCGGTTTGCACCCCGCGTCAGCTGG

CGCAAAACCACCCGCTGCGTATTCTGCTGAAACCGCACTTCCGTTTTATGCTGGCGA ACAACAGCCTGGGTCG

TCAGCGTCTGGTGAACCGTGGTGGCCCGGTTGATGAGCTGCTGGCGGGCACCCTGCA GGAAAGCCTGCAAAT

TGTGGTTGACGCGTACACCGATTGGCGTCTGGACCAATTCGCGCTGCCGACCGAGCT GAAGAACCGTGGTGT

GGACGATGTTAAAAACCTGCCGCACTACCCGTATCGTGACGATGGCATTCTGCTGTG GAACGCGATCAACAAG

TTCGTGTTCAACTACCTGGAACTGTACTACAAGAGCCCGGCGGATCTGACCGCGGAT GTTGAACTGCAGGCGT

GGGCGCGTGAATGCGTGGCGCAAGATGGTGGCCGTGTTAAGGGTATGAGCGACCGTA TTGATACCCTGAAA

CAGCTGGTTGAGATCGTTACCACCATCATTTACACCTGCGGTCCGCTGCACAGCGCGGTGAACTTCCCGCAGT

ACGAATATATGGGCTTTATCCCGAACATGCCGCTGGCGGCGTATCAACCGATTAAGA AAGAGGGTGTTTGCAC

CCGTAAGGAACTGATCGACTTCCTGCCGGCGGCGAAACCGACCAGCAGCCAGCTGAC CACCCTGTTTACCCTG

AGCGCGTACCGTTATGATCGTCTGGGCTACTATGAGGAAGAGGAATTCGAGGACCCGAACGCGGACGATGTG

GTTAACAAATTTCAGCAAGAGCTGAACGTGGTTCAGCGTAAGATCGAACTGAGCAAC AAAGGTCGTCTGGTG

AACTACGAATATCTGCAACCGCGTCTGATTCTGAACAGCATTAGCATCTAA

Amino acid sequence for WP_039200563.1mut - SEQ ID NO: 274

MKPCLPQNDPNPTQRQSSLEKGRKEYEFRYDFLPPMAMLKNVPPSENFSTKYIAERTIETAELPSNMMAVKAHA

MWDPLDELQDYEDFFPVLQKPNVMKNYETDDSFAEQRLCGVNPVVLCQIKQMDARFAFTIEELQAKFGNSIDLRE

RLATGNLYVCDYRPLAFIRGGTFAKGKKYLPAPLAFFCWRSSGFQDRGQLVPIAIQINPKEGKASPLLTPFDDSSTWF

YAKSCVQIADANHHEMSSHLCRTHFVMEPFAVCTPRQLAQNHPLRILLKPHFRFMLANNSLGRQRLVNRGGPVD

ELLAGTLQESLQIVVDAYTDWRLDQFALPTELKNRGVDDVKNLPHYPYRDDGILLWNAINKFVFNYLELYYKSPADL

TADVELQAWARECVAQDGGRVKGMSDRIDTLKQLVEIVTTIIYTCGPLHSAVNFPQYEYMGFIPNMPLAAYQPIKK

EGVCTRKELIDFLPAAKPTSSQLTTLFTLSAYRYDRLGYYEEEEFEDPNADDVVNKFQQELNVVQRKIELSNKGRLVN

YEYLQPRLILNSISI

Codon-optimized coding sequence of WP_012407347.1mut - SEQ ID NO: 275

ATGAAACCGTGCCTGCCGCAGAACGACCCGGATCCGACCAAACGTCAGATCCTGCTG GAGCGTAACCAAGGC

GAGTACGAATTCGACTATGATTTTCTGGTGCCGATGGCGATGCTGAAGAACGTTCCG AGCATTGAGAACTTCA

GCACCAAATACATCGCGGAACGTACCCTGGAGACCGCGGAACTGCCGATTAACATGC TGGCGGTGAAGACCC

GTAGCCTGTGGGACCCGCTGGATGAGCTGCAGGACTACGAAGATTATTTTCCGGTTCTGCCGAAGCCGAACAT

CATTAAAACCTACCAGAGCGACGATAGCTTCTGCGAGCAACGTCTGTGCGGCGCGAACCCGTTTGTGCTGCGT

CGTATTGAACAGATGGACGCGCGTTTCGCGTTTACCATCCTGGAGCTGCAAGAAAAG TTCGGTGATAGCATTA

ACCTGGTTGAGAAACTGGCGAACGGCAACCTGTACGTGTGCGACTATCGTGCGCTGG CGTTCGTTAAAGGTG

GCAGCTACGAACGTGGTAAGAAATTTCTGCCGACCCCGATCGCGTTCTTTTGCTGGC GTAGCAGCGGTTTCAG

CGACCGTGGCCAGCTGGTGCCGATCGTTATTCAAATCAACCCGGCGGATGGCAAGCA GAGCCAACTGATCAC

CCCGTTCGACGATCCGCTGACCTGGTTTCACGCGAAACTGTGCGTGCAGATTGCGGA CGCGAACCACCACGAA

ATGAGCAGCCACCTGTGCCGTACCCACTTCGTTATGGAGCCGTTTGCGATTTGCACC GCGCGTCAACTGGCGG

AAAACCACCCGCTGAGCCTGCTGCTGAAGCCGCACTTCCGTTTTATGCTGGCGAACA ACAGCCTGGCGCGTAA

ACGTCTGATCAGCCGTGGTGGCCCGGTGGATGAGCTGCTGGCGGGCACCCTGCAGGA AAGCCTGCAAATTGT

GGTTAACGCGTACACCGAGTGGAGCCTGGACCAGTTCAGCCTGCCGACCGAACTGAA GAACCGTGGTATGGA

CGATCCGGATAACCTGCCGCACTACCCGTATCGTGACGATGGCCTGCTGCTGTGGAA CGCGATTAAGAAATTT

GTTAGCGAGTATCTGCAGATCTACTATAAGACCCCGCAAGACCTGGCGGAGGATCTG GAACTGCAGAGCTGG

GTGCAAGAATGCGTTAGCCAGAGCGGTGGCCGTGTGAAAGGTATTAGCGACCGTATC AACACCCTGGACCAA

CTGGTGGATATTGCGACCGCGGTTATTTTTACCTGCGGTCCGCAGCATGCGGCGGTT AACTACAGCCAATACG

AGTATATGACCTTTATGCCGAACATGCCGCTGGCGGCGTATAAACAGATGACCAGCG AAGGCACCATCCCGG

ATCGTAAGAGCCTGCTGAGCTTCCTGCCGCCGAGCAAACAGACCGCGGACCAACTGA GCATTCTGTTTATCCT

GAGCGCGTACCGTTATGATCGTCTGGGCTACTATGACGATAAGTTCCTGGACCCGGAGGCGCAAGATGTGCT

GGCGAAATTTCAGCAAGAACTGAACGAGGCGGAACGTGAGATTGAACTGAACAACAA GAGCCGTCTGATCA

ACTACAACTATCTGAAACCGCGTCTGGTGACCAACAGCATCAGCGTTTAA

Amino acid sequence for WP_012407347.1mut - SEQ ID NO: 276

MKPCLPQNDPDPTKRQILLERNQGEYEFDYDFLVPMAMLKNVPSIENFSTKYIAERTLETAELPINMLAVKTRSLW

DPLDELQDYEDYFPVLPKPNIIKTYQSDDSFCEQRLCGANPFVLRRIEQMDARFAFTILELQEKFGDSINLVEKLANG

NLYVCDYRALAFVKGGSYERGKKFLPTPIAFFCWRSSGFSDRGQLVPIVIQINPADGKQSQLITPFDDPLTWFHAKLC

VQIADANHHEMSSHLCRTHFVMEPFAICTARQLAENHPLSLLLKPHFRFMLANNSLARKRLISRGGPVDELLAGTL

QESLQIVVNAYTEWSLDQFSLPTELKNRGMDDPDNLPHYPYRDDGLLLWNAIKKFVSEYLQIYYKTPQDLAEDLEL

QSWVQECVSQSGGRVKGISDRINTLDQLVDIATAVIFTCGPQHAAVNYSQYEYMTFMPNMPLAAYKQMTSEGTI

PDRKSLLSFLPPSKQTADQLSILFILSAYRYDRLGYYDDKFLDPEAQDVLAKFQQELNEAEREIELNNKSRLINYNYLKP

RLVTNSISV

Codon-optimized coding sequence of WP_027843955.1mut - SEQ ID NO: 277

ATGAAACCGTGCCTGCCGCAGAACGACCCGAACCCGGAGAAGCGTAAAGATTGGCTGAACAAAAACCGTGA

GGAATACCAATTCAACTTTAACTATCTGAGCCCGCTGCCGCTGATCGACGATGTTCCGAACAACGAGGCGTTT

AGCCCGAAGTACCTGGCGGAACGTCTGCCGCTGACCTTCGGTAAACTGAGCGCGAACACCCTGGGCATTCGT

CTGCGTAGCTTTTGGGACCCGTTCGATGAGTTTCAGGACTATGAAGATTTCTTTCCGGTGCTGCCGACCCCGG

AACTGCTGAAGACCTACCAGAACGACGAGTATTTCGCGGAACAACGTCTGAGCGGTGTGAACCCGATGGTTA

TCCGTAGCATTAAAGAGCTGGACGCGCGTTTCGCGTTTAGCATCCGTGATCTGCAGGCGGAATTCGGCACCAG

CCTGAACCTGGAGCAAGAACTGAACAACGGCAACCTGTACATTTGCGACTATACCAGCCTGAGCTTTGTTCGT

GGTGGCAGCTACCTGCGTGGTCGTAAGAGCCTGCCGGCGCCGATTGCGCTGTTCTGCTGGCGTAACAGCGGT

TATTGCGATCGTGGCGAGCTGACCCCGATCGCGATTCAACTGGTGCCGGAACTGGGCACCGGTAGCCGTATTC

TGACCCCGTTTGACAGCCACCTGAACTGGCTGTACGCGAAAATCTGCATGCAAATTGCGGATGCGAACCACCA

CGAGATGAGCAGCCACCTGTGCCACACCCACCTGGTGATGGAGCCGTTTGCGGTTTGCACCGCGCGTCAGCTG

GCGGAAAACCACCCGCTGGGTCTGCTGCTGCGTCCGCACTTCCGTTTTATGCTGCACAACAACAGCCTGGCG

CGTAAGAACCTGATCAACCAGGGTGGCTACGTTGACAACCTGCTGGGTGGCACCCTGCGTGAGAGCCTGCAA

ATTGTGCGTGACGCGTATTTCAAGAACGCGGAGGAATTTTGGAGCCTGGATGAGTTCGCGCTGCCGAAAGAA

ATCGCGAACCGTGGTCTGGACGATACCGATCGTCTGCCGCACTACCCGTATCGTGAC GATGGCATGCTGCTGT

GGAACGCGATTGAAAAGTTTGTTAGCAACTACCTGAGCATCTACTATCCGAACCCGG GTGACATTAAAGATGA

TCGTGAGCTGCAAGCGTGGGCGGCGGAATGCGTGGCGGCGGATGGTGGCCGTGTGAA GGGCGTTCCGAGC

CAATTTGAGAACCTGCAGCAACTGATCGACGTGGTTACCGGTATCATTTTTACCTGC GGTCCGCAGCACAGCG

CGGTGAACTACCCGCAATACGAATATATGGCGTTTGTTCCGAACATGCCGCTGGCGG GTTATCAGGCGGTGG

ACAGCAACCCGAACATGGATCTGAAAAGCCTGATGGCGTTCCTGCCGCCGCCGAACC AAACCGCGGACCAGC

TGCAAATCATTTACGGTCTGAGCGCGTACCGTTATGATCGTCTGGGCTACTATGACC GTGAGTTTAGCGATCC

GCACGCGGAGGAAGTGGTTCGTCTGTTCCAGCAAGATCTGAACCAGGTTGAGCGTAA GATCGAACTGCGTAA

CAAAAACCGTCTGGTGGAATATAACTTCCTGAAACCGAGCCTGGTTCTGAACAGCATCAGCATTTAA

Amino acid sequence for WP_027843955.1mut - SEQ ID NO: 278

MKPCLPQNDPNPEKRKDWLNKNREEYQFNFNYLSPLPLIDDVPNNEAFSPKYLAERLPLTFGKLSANTLGIRLRSFW

DPFDEFQDYEDFFPVLPTPELLKTYQNDEYFAEQRLSGVNPMVIRSIKELDARFAFSIRDLQAEFGTSLNLEQELNNG

NLYICDYTSLSFVRGGSYLRGRKSLPAPIALFCWRNSGYCDRGELTPIAIQLVPELGTGSRILTPFDSHLNWLYAKICM

QIADANHHEMSSHLCHTHLVMEPFAVCTARQLAENHPLGLLLRPHFRFMLHNNSLARKNLINQGGYVDNLLGGT

LRESLQIVRDAYFKNAEEFWSLDEFALPKEIANRGLDDTDRLPHYPYRDDGMLLWNAIEKFVSNYLSIYYPNPGDIK

DDRELQAWAAECVAADGGRVKGVPSQFENLQQLIDVVTGIIFTCGPQHSAVNYPQYEYMAFVPNMPLAGYQAV

DSNPNMDLKSLMAFLPPPNQTADQLQIIYGLSAYRYDRLGYYDREFSDPHAEEVVRLFQQDLNQVERKIELRNKNR

LVEYNFLKPSLVLNSISI

Codon-optimized coding sequence of WP_073641301.1mut - SEQ ID NO: 279

ATGAAACCGTGCCTGCCGCAGAACGACCCGGATCCGATTAAGCGTAAATACAGCCTG GAGCACAAGAAAGAG

GAATATGAATTCGACCACGATTTTCTGAGCCCGATGGCGATGCTGAAAGACGTGCCG GCGGTTGAGAACTTC

AGCACCCGTTACATTGCGGAACGTACCGTGGAGACCGCGGAACTGCCGATCAACATG CTGGCGGTTAAGACC

CGTGCGCTGTGGGACCCGCTGGATGAGCTGCAGGACTACGAAGATTATTTCCCGGTG CTGCCGAAGCCGAAC

GTTATCAAAACCTACCAGACCGACGATAGCTTTTGCGAGCAACGTCTGTGCGGTGCG AACCCGATGGCGCTGC

AGCAAATCAAAGAGATGGACGCGCGTTTCGAATTTACCATTGAGGAACTGCAGGAGA AATTCGGTGAAAGCA

TCAACCTGGTGGAGAAGCTGGCGGACGGCAACCTGTACGTGTGCGATTATCGTCCGC TGAGCTTTGTTAAGG

GTGGCACCTACGAACGTGGTAAGAAATATCTGCCGACCCCGCTGGCGTTCTTTTGCT GGCGTAGCAGCGGTTT

CAGCGATCGTGGCCAGCTGGTGCCGATCGCGATTCAACTGAACCCGGCGGTTGGCCG TCAGAGCCAACTGAT

TACCCCGTTCGACGATCCGCTGACCTGGTTTCACGCGAAACTGTGCGTGCAGATCGC GGACGCGAACCACCAC

GAGATGAGCAGCCACCTGTGCCGTACCCACTTCGTTATGGAACCGTTTGCGATTTGC ACCGCGCGTCAACTGG

CGGATAACCACCCGCTGAACCTGCTGCTGAAACCGCACTTCCGTTTTATGCTGGCGA ACAACAGCCTGGGTCG

TAAGCGTCTGGTGAACCGTGGTGGCCCGGTTGATGAGCTGCTGGCGGGCACCCTGCA GGAAAGCCTGCAAAT

TGTGGTTAACGCGTACAAAGAGTGGAGCCTGGATGAATTCGCGCTGCCGACCGAAAT CAAGAACCGTGGTAT

GGACGATAAGCTGAAACTGCCGCACTACCCGTATCGTGACGATGGCATGCTGCTGTG GAACGCGATTAAGAA

ATTTGTGAGCGAGTATCTGAAGCTGTACTATAAAACCCCGCAGGACCTGACCGCGGA TCTGGAACTGCAGGC

GTGGGCGCAAGAGTGCGTTAGCGAAAGCGGTGGCCGTGTGAAAGGTGTTCCGAGCCG TATCGAGAAGCTGG

AACAACTGGTGGACATCGCGACCGCGGTTATTTTTACCTGCGGTCCGCAGCATGCGG CGGTTAACTACAGCCA

ATACGAGTATATGACCTTTATGCCGAACATGCCGCTGGCGGCGTATAAGCAGATGAC CGCGGAAGGCACCAT

CGCGGATCGTAAAAGCCTGCTGAGCTTCCTGCCGCCGAGCAAGCAGACCGCGGACCA ACTGAGCATCCTGTTT

ATTCTGAGCGCGTACCGTTATGATCGTCTGGGCTACTATGACGATAAATTCGCGGACCCGGAGGCGCAAGATA

TTCTGGTGACCTTTCAGCAAGACCTGAACGAGGTTGAGCGTAAGATCGAACTGAACAACAAGAGCCGTCTGA

TTAAATACAACTATCTGAAGCCGCGTCTGGTGACCAACAGCATCAGCGTTTAA

Amino acid sequence for WP_073641301.1mut - SEQ ID NO: 280

MKPCLPQNDPDPIKRKYSLEHKKEEYEFDHDFLSPMAMLKDVPAVENFSTRYIAERTVETAELPINMLAVKTRALW

DPLDELQDYEDYFPVLPKPNVIKTYQTDDSFCEQRLCGANPMALQQIKEMDARFEFTIEELQEKFGESINLVEKLAD

GNLYVCDYRPLSFVKGGTYERGKKYLPTPLAFFCWRSSGFSDRGQLVPIAIQLNPAVGRQSQLITPFDDPLTWFHAK

LCVQIADANHHEMSSHLCRTHFVMEPFAICTARQLADNHPLNLLLKPHFRFMLANNSLGRKRLVNRGGPVDELLA

GTLQESLQIVVNAYKEWSLDEFALPTEIKNRGMDDKLKLPHYPYRDDGMLLWNAIKKFVSEYLKLYYKTPQDLTADL

ELQAWAQECVSESGGRVKGVPSRIEKLEQLVDIATAVIFTCGPQHAAVNYSQYEYMTFMPNMPLAAYKQMTAEG

TIADRKSLLSFLPPSKQTADQLSILFILSAYRYDRLGYYDDKFADPEAQDILVTFQQDLNEVERKIELNNKSRLIKYNYLK

PRLVTNSISV

Codon-optimized coding sequence of WP_096647440.1mut - SEQ ID NO: 281

ATGAAACCGTGCCTGCCGCAGAACGACCCGGAGCCGACCCAGCGTAAGAACTTCCTG GAACGTAAACAGGGC

GAGTACGAATTCGATCACAAGTTTCTGAAACCGATGGCGATGCTGAAGAACGTGCCG AGCATTGAGAACTTT

AGCACCAAATATATCGCGGAACGTACCGTGGAGACCGCGGAACTGCCGCTGAACATG CTGGCGGTTAAAACC

CGTAGCCTGTGGGACCCGCTGGATGAGCTGCAGGACTACGAAGATTATTTCCCGGTGCTGCCGAAGCCGAAC

GTTATCAAAACCTACCAGACCGACAACAGCTTTTGCGAGCAACGTCTGTGCGGTGCG AACCCGCTGGTTCTGC

GTCAGATTCAGCAAATGGATGCGCGTTTCGCGTTTACCATCAGCGAGCTGCAAGAAA AGTTCGGTGACAGCAT

TGATCTGGAGGAACGTCTGAAAACCGGCAACCTGTACGTGTGCGACTATCGTGCGCT GGCGTTTGTTAAGGG

TGGCACCTACGAGCGTGGTAAGAAATATCTGCCGACCCCGATCGCGTTCTTTTGCTG GCGTAGCAGCGGTTTC

AGCGATCGTGGCCAGCTGGTGCCGATCGCGATTCAAATCAACCCGACCGACGGCAAG CAGAGCCAACTGATC

ACCCCGTTCGATGAACCGCTGGTGTGGTTTCACGCGAAACTGTGCGTTCAGATTGCG GACGCGAACCACCACG

AGATGAGCAGCCACCTGTGCCGTACCCACTTCGTTATGGAACCGTTTGCGATTTGCA CCGCGCGTCAGCTGGC

GGATAACCACCCGCTGAACCTGCTGCTGAAGCCGCACTTCCGTTTTATGCTGGCGAA CAACAGCCTGGGTCGT

CAACGTCTGGTGAACCGTGGTGGCCCGGTTGATGAGCTGCTGGCGGGCACCCTGCAG GAAAGCCTGCAAATC

GTGGTTAACGCGTACAAAGAGTGGAGCCTGGATCAGTTCAGCCTGCCGACCGAACTG AAGAACCGTGGTATG

GACAACAGCGATAAACTGCCGCACTACCCGTATCGTGACGATGGCCTGCTGCTGTGG AACGCGATTAAGAAA

TTCGTGAGCGAATACCTGAAGCTGTACTATAAAACCCCGCAAGACCTGACCGCGGAT TTTGAGCTGCAGAGCT

GGGCGCAAGAATGCGTTAGCCAGAGCGGTGGCCGTGTGAAAGGTGTTAGCGACCGTATCACCACCCTGGACC

AACTGATTGATATCGCGACCGCGGTGATTTTTACCTGCGGTCCGCAGCATGCGGCGG TTAACTACAGCCAATA

CGAGTATATGACCTTTATCCCGAACATGCCGCTGGCGGCGTATAAGCAGATTACCAG CGAGGGTAACATCCCG

GACCGTAAGAGCCTGCTGAGCTTCCTGCCGCCGAGCAAACAGACCGCGGATCAACTG AGCATTCTGTTTATCC

TGAGCGCGTACCGTTATGACCGTCTGGGCTACTATGACGATAAATTCCTGGATCCGG AGGCGCAGGAAATCCT

GGTGACCTTTCAGCAAGAGCTGAACGAGGCGGAACGTCAAATTGAACTGAACAACAA GAGCCGTCTGATCAA

CTACGACTATCTGAAACCGCGTCTGGTGACCAACAGCATTAGCGTTTAA

Amino acid sequence for WP_096647440.1mut - SEQ ID NO: 282

MKPCLPQNDPEPTQRKNFLERKQGEYEFDHKFLKPMAMLKNVPSIENFSTKYIAERTVETAELPLNMLAVKTRSLW

DPLDELQDYEDYFPVLPKPNVIKTYQTDNSFCEQRLCGANPLVLRQIQQMDARFAFTISELQEKFGDSIDLEERLKTG

NLYVCDYRALAFVKGGTYERGKKYLPTPIAFFCWRSSGFSDRGQLVPIAIQINPTDGKQSQLITPFDEPLVWFHAKLC

VQIADANHHEMSSHLCRTHFVMEPFAICTARQLADNHPLNLLLKPHFRFMLANNSLGRQRLVNRGGPVDELLAG

TLQESLQIVVNAYKEWSLDQFSLPTELKNRGMDNSDKLPHYPYRDDGLLLWNAIKKFVSEYLKLYYKTPQDLTADFE

LQSWAQECVSQSGGRVKGVSDRITTLDQLIDIATAVIFTCGPQHAAVNYSQYEYMTFIPNMPLAAYKQITSEGNIP

DRKSLLSFLPPSKQTADQLSILFILSAYRYDRLGYYDDKFLDPEAQEILVTFQQELNEAERQIELNNKSRLINYDYLKPRL

VTNSISV

Codon-optimized coding sequence of WP_099099431.1mut - SEQ ID NO: 283

ATGAAACCGTGCCTGCCGCAGAAAGACCCGGATGTTAAAGTGCGTATCAACTGGCTG GACAAAAACCGTGAG

GAATACAAGTTCAACTACGACTATCTGGCGCCGCTGCCGGTTATCGATAAAGTGCCG CACAAGGAGATTTTTA

GCGCGGAATATACCACCAAACGTCTGGCGAGCATGGCGAGCCTGGCGCCGAACATGC TGGCGGCGAAGGCG

CGTAACTTCCTGGACCCGCTGGATGAGCTGGAGGAATACGAGGAACTGCTGAGCCTG CTGCCGAAGCCGGAC

GTTATCAAGAACTATAAAACCGATAGCTGCTTTGCGGAACAACGTCTGAGCGGTGCG AACCCGCTGGCGATCC

AAAAAATTGACGTTCTGGATGCGCGTTTCGCGGTGACCGACGCGCACTTTCAGAAGG TGGCGGGCACCGAGT

TCACCCTGGAAAAGGCGCTGAAAGAGGGCAAGCTGTACTTTTGCGACTATCCGCTGC TGAGCGATATCAAAG

GTGGCGTTTACAACAACGTGAAGAAATATCTGCCGAAGCCGCAGGCGCTGTTCTACT GGCAAAGCAACGACA

GCCCGAACGGTGGCAGCCTGGTTCCGGTGGCGATCCAGATTAACCACGATAGCGGTG GCAAAAGCGTTATCT

ATACCCCGGACGATCCGCACCTGGACTGGTTTCTGGCGAAGACCTGCGTGCAGATTG CGGATGGTAACCACC

AAGAGCTGGGCAGCCACTTCGCGTACACCCACGCGGTTATGGCGCCGTTTGCGATCT GCACCGCGCGTCAACT

GGCGGAAAACCACCCGATTGCGCTGCTGCTGAAACCGCACTTCCGTTTTATGCTGTT CGACAACAGCCTGGGT

CGTACCCAGTTTCTGCAACCGGGTGGCCCGGTTGATGAGTTCATGGCGGGTAGCCTG GCGGAAAGCCTGGGC

TTTGTTGCGAAGGTGTACGAGGAATGGAGCGTGGAGAAATTCACCTTTCCGCGTCTG ATCAAGAGCCGTCGT

ACCGACGATCCGGAAATTCTGCCGCACTTCCCGTTTCGTGACGATGGTATGCTGATC TGGAACGCGGTTGAGA

AATTCGTGTACGAATATCTGCAGCTGTACTATAAGACCAGCCAAGACCTGATTGACGATTATGAGCTGCAGAA

CTGGGCGCGTGAATGCGTTGCGCAAGATGGTGGCCGTGTGAAAGGCATGCCGGCGAAGATCGAGACCCTGG

AACAGCTGATTGAGATCATTAGCGTGGTTGTTTTTACCTGCGCGCCGCTGCACAGCG CGCTGAACTTCAGCCA

ATACGAATATATGGCGTTTGTTCCGAACATGCCGTACGCGGCGTATCACCCGATCCCGGAGACCAAAGGTGTG

GACCTGGAAACCATCATGAAAATTCTGCCGCCGTTCAAGCAGGCGGCGGACCAAGTG ATGTGGACCGAGATT

CTGACCAGCTACCACTATGATAAGCTGGGCTTCTACGACGAGGAATTTGCGGATCCG CTGGCGCAGGAAATC

GTTGTGCAATTCCAGCAAAACCTGCACGAGATTGAACGTCAGATCGATATTCGTAACCAAACCCGTCCGATCC

CGTACAACTATTTTAAACCGAGCCAGATCATTAACAGCATTAACACCTAA

Amino acid sequence for WP_099099431.1mut - SEQ ID NO: 284

MKPCLPQKDPDVKVRINWLDKNREEYKFNYDYLAPLPVIDKVPHKEIFSAEYTTKRLASMASLAPNMLAAKARNFL

DPLDELEEYEELLSLLPKPDVIKNYKTDSCFAEQRLSGANPLAIQKIDVLDARFAVTDAHFQKVAGTEFTLEKALKEGK

LYFCDYPLLSDIKGGVYNNVKKYLPKPQALFYWQSNDSPNGGSLVPVAIQINHDSGGKSVIYTPDDPHLDWFLAKT

CVQIADGNHQELGSHFAYTHAVMAPFAICTARQLAENHPIALLLKPHFRFMLFDNSLGRTQFLQPGGPVDEFMAG

SLAESLGFVAKVYEEWSVEKFTFPRLIKSRRTDDPEILPHFPFRDDGMLIWNAVEKFVYEYLQLYYKTSQDLIDDYEL

QNWARECVAQDGGRVKGMPAKIETLEQLIEIISVVVFTCAPLHSALNFSQYEYMAFVPNMPYAAYHPIPETKGVDL

ETIMKILPPFKQAADQVMWTEILTSYHYDKLGFYDEEFADPLAQEIVVQFQQNLHEIERQIDIRNQTRPIPYNYFKPS

QIINSINT

Codon-optimized coding sequence of WP_052672367.1mut - SEQ ID NO: 285

ATGAAACCGTGCCTGCCGCAACATGAGCCGGATGCGATTGCGCGTCAGAACCGTCTG ATTAAAAACCGTGCG

GACTACGTGCTGGATTACAACTATCTGCCGCCGATCCCGCTGCAGACCCCGGTTCCG CAGCAAGAGCGTTTCA

GCGCGGAATATACCGCGCGTCGTCTGGCGAGCTTTGCGAACCTGGTGCCGAACATGC TGATGGCGCGTGCGC

GTAACGCGTTTGACCCGCTGGATACCCTGGAGGAATATGCGGACCTGCTGCCGGTGCTGCCGAAGCCGAACG

TTATTAAAAACTATCAAGCGGATTGGTGCTTCGCGGAGCAGCGTCTGAGCGGTATCAACCCGCCGGCGATCCG

TCGTATTGACGCGCTGGATGCGCGTCTGCCGATTAGCAACAGCAGCTTTCAACACAG CGTTGGCGCGGAGCA

CAACCTGGAACAGGCGCTGAAGGAAGGTAAACTGTACTGCTGCGACTATCCGCTGCT GAGCGGCATCGGTGG

CGGTAACTACCAAAACCTGCCGAAGTATCTGCCGAAACCGCAGGCGCTGTTTTACTG GCGTAGCGATAACAGC

AAGATTGGCGGTAGCCTGGTGCCGGTTGCGATCAAGATTCTGAACGAGCTGGGCGGT AAAAACCTGGTGTAC

ACCCCGAACGACGCGCCGCTGGATTGGTTCCTGGCGAAGACCTGCGTTCAGATGGCG GACGCGAACCACCAA

GAACTGGGCACCCACTTTGCGAAAACCCACGCGGTTATGGCGCCGATTGCGGCGTGC ACCGCGCGTGAGCTG

GGTGAAAACCACCCGCTGACCCTGCTGCTGAAACCGCACTTCCGTTTTATGCTGTTC GATAACAGCCTGGGTC

GTACCCAGTTTCTGCAACCGACCGGTCCGACCGAGGAACTGCTGGCGGGCACCCTGG AGGAAAGCGTTCAGC

TGGTTGTGCAAGCGTACGAGGAATGGAGCATCGACACCACCTTCCCGCTGGAGCTGC AGCAACGTCAAATGC

ACGATCCGGAAATTCTGCCGCACTATCCGTTCCGTGACGATGGCATCCTGGTGTGGA ACGCGATTCACCAGTT

TGTTACCGAATACCTGCAAATTTACTATCACACCCCGCAGGACATCAGCGCGGATTA TGAGGTGCAGAACTGG

GCGCGTGAATGCGTGGACAGCGGTCGTGTTAAGGGTATGCCGGAGAGCATCGACACC CTGGCGCAACTGATT

GATATCATTGCGGTGGTTATCTTCACCTGCGCGCCGCTGCACAGCTGCCTGAACCTG GCGCAGTACGAATATA

TGACCTTTGTTCCGAACATGCCGTACGCGGCGTATCACCCGATCCCGACCACCAAGG GTGTGGATATGGCGAC

CATCGTTAAAATTATGCCGCCATTCCAGCGTGCGATCGACCAAATTCTGTGGACCGA TATTCTGAGCGCGTTTC

AATACGACAAGCTGGGCTTCTATGAGGAAGACTTTGCGGATCCGAAAGCGCAGGAAG TGCTGCAGCGTTTCC

AAGATAACCTGCAGCAAGTTGAGGAAAAGATCGAAATGCACAACCAGATCCGTCCGA TTCCGTACAACTATCT

GAAACCGAGCCGTATCATGAACAGCATTAACACCTAA

Amino acid sequence for WP_052672367.1mut - SEQ ID NO: 286

MKPCLPQHEPDAIARQNRLIKNRADYVLDYNYLPPIPLQTPVPQQERFSAEYTARRLASFANLVPNMLMARARNA

FDPLDTLEEYADLLPVLPKPNVIKNYQADWCFAEQRLSGINPPAIRRIDALDARLPISNSSFQHSVGAEHNLEQALKE

GKLYCCDYPLLSGIGGGNYQNLPKYLPKPQALFYWRSDNSKIGGSLVPVAIKILNELGGKNLVYTPNDAPLDWFLAK

TCVQMADANHQELGTHFAKTHAVMAPIAACTARELGENHPLTLLLKPHFRFMLFDNSLGRTQFLQPTGPTEELLA

GTLEESVQLVVQAYEEWSIDTTFPLELQQRQMHDPEILPHYPFRDDGILVWNAIHQFVTEYLQIYYHTPQDISADYE

VQNWARECVDSGRVKGMPESIDTLAQLIDIIAVVIFTCAPLHSCLNLAQYEYMTFVPNMPYAAYHPIPTTKGVDM

ATIVKIMPPFQRAIDQILWTDILSAFQYDKLGFYEEDFADPKAQEVLQRFQDNLQQVEEKIEMHNQIRPIPYNYLKP

SRIMNSINT

Codon-optimized coding sequence of WP_073631249.1mut - SEQ ID NO: 287

ATGAAACCGTGCCTGCCGCAGCATGACCCGAACCCGGAAGCGCGTCGTAACTGGCTG GAACAAAACCGTGAG

GACTACAAGTTTGATCACAACTATCTGGCGCCGATCCCGATTCTGGACAAGGTTCCG CACAAAGAGCTGTTCA

GCCCGCAGTACACCGCGAAACGTCTGGCGAGCATGGCGGATCTGGTGCCGAACATGC TGGCGGCGAAGGCG

CGTAACTTCTTTGACCCGCTGGATGAACTGGAGGAATACGAGGCGCTGCTGAGCATT CTGCCGAAACCGAGC

GTTATCAAGAACTATAAAACCGACAGCTGCTTTGCGGAACAGCGTCTGAGCGGTGCG AACCCGATGGCGATG

CACCGTATTGACGAGCTGGATGCGCGTTTCCCGGTTACCAACGATCACTTTCAAAAG GCGGTGGGTGCGGAA

CACAACCTGGAGGCGGCGCTGAAGGAAGGCAAACTGTACCTGTGCGACTATCCGCTG CTGTTTGATATTAAG

GGTGGCACCTACCAGAACATCAAGAAATATCTGCCGAAACCGCAGGCGCTGTTCTACTGGCAAAGCAACGGT

AACAAGAACAGCGGCAGCCTGGTGCCGATCGCGATTCAAATCCACAACGACACCGGT GGCGATAGCCTGATT

TATACCCCGGACGATCCGCACCTGGACTGGTTCCTGGCGAAAACCTGCGTTCAGATC GCGGATGCGAACCACC

AAGAACTGGGTAGCCATTTTGCGCGTACCCATGCGGTGATGGCGCCGTTTGCGATCT GCACCGCGCGTCAACT

GGGTGAAAACCACCCGCTGGCGCTGCTGCTGAAACCGCACTTCCGTTTTATGCTGTA CGACAACAGCCTGGGT

CGTACCCACTTCCTGCAGGCGGGTGGCCCGGTTGATGAATTTATGGCGGGCACCCTGCAAGAGAGCCTGGGC

TTTGTGGCGAAGGCGTACGAGGAATGGAGCCTGGACAACGCGGTTTTCCCGACCGAGGTGAAGAACCGTAA

AATGGACGATCCGGACATTCTGCCGCACTATCCGTTTCGTGACGATGGTATGCTGCT GTGGGATGCGGTTAAG

AAATTCGTGACCGAATACCTGCAGCTGTACTATAAGACCCCGCAAGACCTGAGCGAG GATTATGAACTGCAAA

ACTGGGCGCGTGAGTGCGCGGCGCAAGACGGTGGCTGCGTTAAGGGCATGCCGGAGA AAATTGAAACCATC

GAGCAGCTGATCCACGTGGTTACCGTGGTTGTGTTTACCTGCGCGCCGCTGCACAGC GCGCTGAACTTCAGCC

AATACGAATATATGGCGTTTGTTCCGAACATGCCGTACGCGGCGTACTATCCGGTTC CGGAGACCAAAGGTGT

GGATATGCAGACCATTATGAAGATGCTGCCGCCGTTCAAACAGGCGGCGGACCAAGT GATGTGGAGCGATAT

CCTGACCAGCTTCCACTACGACAAGCTGGGCCACTATGATGAGGAATTTGCGAACCC GATGGCGCAGGCGAT

CCTGCTGCAATTCCAGCAAAACCTGCACGAGGTGGAACGTCAGATTGAAATCAAGAA CCAAAGCCGTCCGATT

CCGTACAACTATCTGAAACCGAGCGAGATCATTAACAGCATCAACACCTAA

Amino acid sequence for WP_073631249.1mut - SEQ ID NO: 288

MKPCLPQHDPNPEARRNWLEQNREDYKFDHNYLAPIPILDKVPHKELFSPQYTAKRLASMADLVPNMLAAKARN

FFDPLDELEEYEALLSILPKPSVIKNYKTDSCFAEQRLSGANPMAMHRIDELDARFPVTNDHFQKAVGAEHNLEAAL

KEGKLYLCDYPLLFDIKGGTYQNIKKYLPKPQALFYWQSNGNKNSGSLVPIAIQIHNDTGGDSLIYTPDDPHLDWFL

AKTCVQIADANHQELGSHFARTHAVMAPFAICTARQLGENHPLALLLKPHFRFMLYDNSLGRTHFLQAGGPVDEF

MAGTLQESLGFVAKAYEEWSLDNAVFPTEVKNRKMDDPDILPHYPFRDDGMLLWDAVKKFVTEYLQLYYKTPQD

LSEDYELQNWARECAAQDGGCVKGMPEKIETIEQLIHVVTVVVFTCAPLHSALNFSQYEYMAFVPNMPYAAYYPV

PETKGVDMQTIMKMLPPFKQAADQVMWSDILTSFHYDKLGHYDEEFANPMAQAILLQFQQNLHEVERQIEIKN

QSRPIPYNYLKPSEIINSINT

Codon-optimized coding sequence of WP_013220336.1mut - SEQ ID NO: 289

ATGAACACCTGCCTGCCGCAGAACGACAGCGATCCGCAAGGTCGTAAGGATCGTCTG GAACGTCGTCGTGCG

CTGTACGTGTTCAACTACGATTATGTTCCGCCGATCCCGATGATTGACAAGGTTCCG CACGAGGAATACTTTAG

CCCGAAATATACCGCGGAGCGTCTGGCGAGCATGGCGAAACTGGCGCCGAACATGCT GGCGGCGAAGACCA

AACGTCTGTTCGATCCGCTGGACGAGCTGAACGAATACGATGAGATGTTCATCTTTC TGGACAAGCCGGGTAT

TGTTCGTGGCTATCGTACCGACGAAAGCTTCGGCGAGCAGCGTCTGAGCGGCGTGAA CCCGATGAGCATCCG

TCGTCTGGATAAACTGGACGCGCGTTTTCCGATTATGGATGAATACCTGGAGCAGAG CCTGGGTAGCCCGCAC

ACCCTGGCGCAGGCGCTGCAAGAAGGCCGTCTGTACTTCTGCGACTATCCGCAACTG GCGCACGTTAAAGAG

GGTGGTCTGTACCGTGGTCGTAAGAAATATCTGCCGAAACCGCGTGCGCTGTTTTGC TGGGATGGTAACCACC

TGCAGCCGGTGGCGATCCAGATTAGCGGCCAACCGGGTGGCCGTCTGTTCATTCCGC GTGACAGCGATCTGG

ACTGGTTTGTGGCGAAGCTGTGCGTTCAGATCGCGGACGCGAACCACCAAGAACTGG GCACCCACTTCGCGC

GTACCCACGTGGTTATGGCGCCGTTTGCGGTTTGCACCCATCGTCAGCTGGCGGAGA ACCACCCGCTGCACAT

TCTGCTGCGTCCGCACTTCCGTTTTATGCTGTACGATAACAGCCTGGGTCGTACCCG TTTCATCCAGCCGGATG

GTCCGGTGGAACACATGATGGCGGGCACCCTGGAGGAAAGCATCGGCATTAGCGCGG CGTTCTACAAGGAA

TGGCGTCTGGATGAGGCGGCGTTTCCGATCGAGATTGCGCGTCGTAAAATGGACGAT CCGGAAGTTCTGCCG

CACTACCCGTTCCGTGACGATGGTATGCTGCTGTGGGACGGCATTCAGAAGTTTGTT AAAGAGTATCTGGCGC

TGTACTATCAAAGCCCGGAAGATCTGGTGCAGGACCAAGAGCTGCGTAACTGGGCGC GTGAATGCACCGCGA

ACGATGGTGGCCGTGTGGCGGGTATGCCGGGTCGTATCGAAACCGTTGACCAGCTGA CCAGCATCCTGAGCA

CCGTGATTTATACCTGCGCGCCGCTGCACAGCGCGCTGAACTTTGCGCAATACGAGT ATATCGGTTATGTTCCG

AACATGCCGTACGCGGCGTATCACCCGATTCCGGAGGAAGGTGGCGTGGATATGGAG ACCCTGATGAAGATT

CTGCCGCCGTACGAACAGGCGGCGCTGCAACTGAAATGGACCGAGATCCTGACCAGC TACCACTATGACCGT

CTGGGCCACTATGATGAAAAGTTCGAGGACCCGCAGGCGCAAGCGGTGGTTGAACAGTTTCAGCAAGAGCTG

GCGGCGGTGGAGCAAGAAATTGATCAGCGTAACCAAGACCGTCCGCTGGCGTACACCTATCTGAAACCGAGC

GAAATCATTAACAGCATCAACACCTAA

Amino acid sequence for WP_013220336.1mut - SEQ ID NO: 290

MNTCLPQNDSDPQGRKDRLERRRALYVFNYDYVPPIPMIDKVPHEEYFSPKYTAERLASMAKLAPNMLAAKTKRL

FDPLDELNEYDEMFIFLDKPGIVRGYRTDESFGEQRLSGVNPMSIRRLDKLDARFPIMDEYLEQSLGSPHTLAQALQ

EGRLYFCDYPQLAHVKEGGLYRGRKKYLPKPRALFCWDGNHLQPVAIQISGQPGGRLFIPRDSDLDWFVAKLCVQI

ADANHQELGTHFARTHVVMAPFAVCTHRQLAENHPLHILLRPHFRFMLYDNSLGRTRFIQPDGPVEHMMAGTLE

ESIGISAAFYKEWRLDEAAFPIEIARRKMDDPEVLPHYPFRDDGMLLWDGIQKFVKEYLALYYQSPEDLVQDQELRN

WARECTANDGGRVAGMPGRIETVDQLTSILSTVIYTCAPLHSALNFAQYEYIGYVPNMPYAAYHPIPEEGGVDME

TLMKILPPYEQAALQLKWTEILTSYHYDRLGHYDEKFEDPQAQAVVEQFQQELAAVEQEIDQRNQDRPLAYTYLKP

SEIINSINT