Title:
IMPROVING EXPRESSION IN FERMENTATION PROCESSES
Document Type and Number:
WIPO Patent Application WO/2021/224152
Kind Code:
A1
Abstract:
The present invention is concerned with materials and methods for industrial fermentation processes. In particular, the invention is concerned with expression cassettes to facilitate the expression of a target gene under the control of a heterologous promoter. The invention further pertains to the construction of such promoters, to vectors and host cells comprising such expression cassettes, and to fermentation methods using such host cells. Furthermore, the invention provides materials obtained by such fermentation.

Inventors:
FELLE MAX FABIAN (DE)
SAUER CHRISTOPHER (DE)
APPELBAUM MATHIS (DE)
HILKMANN MAXIMILIAN (DE)
SCHWEDER THOMAS (DE)
Application Number:
PCT/EP2021/061506
Publication Date:
November 11, 2021
Filing Date:
May 03, 2021
Assignee:
BASF SE (DE)
International Classes:
C12N15/67; C12N15/75
Domestic Patent References:
WO2016046234A22016-03-31
WO2001051643A12001-07-19
WO2020169563A12020-08-27
WO2020169564A12020-08-27
WO1991002792A11991-03-07
WO2008148575A22008-12-11
WO2008140615A22008-11-20
WO2015181296A12015-12-03
WO2019016051A12019-01-24
WO1997003185A11997-01-30
Foreign References:
US5958728A1999-09-28
US5352604A1994-10-04
Other References:
JUNE-HYUNG KIM: "Comparison of PaprE, PamyE, and PP43 Promoter Strength for [beta]-Galactosidase and Staphylokinase Expression in Bacillus subtilis", 1 January 2008 (2008-01-01), XP055818301, Retrieved from the Internet [retrieved on 20210625]
JAN JANET ET AL: "Characterization of the 5P subtilisin (aprE) regulatory region from Bacillus subtilis", 1 January 1999 (1999-01-01), XP055818303, Retrieved from the Internet [retrieved on 20210625]
FERRARI,E.D.J.HENNERM.PEREGOJ.A.HOCH: "Transcription of Bacillus subtilis subtilisin and expression of subtilisin in sporulation mutants", J BACTERIOL, vol. 170, 1988, pages 289 - 295, XP000991317
HEN-NER,D.J.E.FERRARIM.PEREGOJ.A.HOCH: "Location of the targets of the hpr-97, sacU32(Hy), and sacQ36(Hy) mutations in upstream regions of the subtilisin promoter", J. BACTERIOL, vol. 170, 1988, pages 296 - 300
PARK,S.S.S.L.WONGL.F.WANGR.H.DOI.: "Bacillus subtilis subtilisin gene (aprE) is expressed from a sigma A (sigma 43) promoter in vitro and in vivo", J BACTERIOL, vol. 171, 1989, pages 2657 - 2665
GAUR,N.K.J.OPPENHEIMI.SMITH.: "The Bacillus subtilis sin gene, a regulator of alternate developmental processes, codes for a DNA-binding protein", J BACTERIOL, vol. 173, 1991, pages 678 - 686
KALLIO,P.T.J.E.FAGELSON, J.A.HOCHM.A.STRAUCH: "The transition state regulator Hpr of Bacillus subtilis is a DNA-binding protein", JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 266, 1991, pages 13411 - 13417
HELMANN,J.D.: "Compilation and analysis of Bacillus subtilis sigma A-dependent promoter sequences: evidence for extended contact between RNA polymerase and upstream promoter DNA", NUCLEIC ACIDS RES., vol. 23, 1995, pages 2351 - 2360, XP000990076
HAMBRAEUS ET AL., MICROBIOLOGY, vol. 146, 2000, pages 3051 - 3059
HAMBRAEUS ET AL., MICROBIOLOGY, vol. 148, 2002, pages 1795 - 1803
J. BACTERIOL, vol. 170, pages 296 - 300
STRAUCH,M.A.G.B.SPIEGELMANM.PEREGOW.C.JOHNSOND.BURBULYSJ.A.HOCH.: "The transition state transcription regulator abrB of Bacillus subtilis is a DNA binding protein", EMBO J, vol. 8, 1989, pages 1615 - 1621
JACOBS MELIASSON MUHLEN MFLOCK JI: "Cloning, sequencing and expression of subtilisin Carlsberg from Bacillus licheniformis", NUCLEIC ACIDS RES, vol. 13, 1985, pages 8913 - 8926, XP000651926
JACOBS,M.F: "Expression of the subtilisin Carlsberg-encoding gene in Bacillus licheniformis and Bacillus subtilis", GENE, vol. 152, 1995, pages 69 - 74, XP004042590, DOI: 10.1016/0378-1119(94)00655-C
HELMANN ET AL., NUCLEIC ACIDS RES., vol. 23, 1995, pages 2351 - 2360
NEEDLEMANWUNSCH ALGORITHM, J. MOL. BIOL., vol. 48, 1979, pages 443 - 453
MEINKOTHWAHL: "DNA-DNA hybrids", ANAL. BIOCHEM., vol. 138, 1984, pages 267 - 284
SAMBROOK,J.RUSSELL,D.W.: "Molecular cloning. A laboratory manual, 3rd", 2001, COLD SPRING HARBOR LABORATORY PRESS
VEHMAANPERA J., FEMS MICROBIO. LETT., vol. 61, 1989, pages 165 - 170
MALZAHN ET AL., CELL BIOSCI, vol. 7, 2017, pages 21
BORTESIFISCHER, BIOTECHNOLOGY ADVANCES, vol. 33, 2015, pages 41 - 52
CHENGAO, PLANT CELL REP, vol. 33, 2014, pages 575 - 583
HUE ET AL., JOURNAL OF BACTERIOLOGY, vol. 177, 1995, pages 3465 - 3471
BRIGIDI,P.MATEUZZI,D., BIOTECHNOL. TECHNIQUES, vol. 5, 1991, pages 5
BIRNBOIM, H. C.DOLY, J., NUCLEIC ACIDS RES, vol. 7, no. 6, 1979, pages 1513 - 1523
COBB, R. E.WANG, Y.ZHAO, H: "High-Efficiency Multiplex Genome Editing of Streptomyces Species Using an Engineered CRISPR/Cas System", ACS SYNTHETIC BIOLOGY, vol. 4, no. 6, 2015, pages 723 - 728, XP055204410, DOI: 10.1021/sb500351f
PROC. NATL. ACAD. SCI. U. S. A, vol. 86, pages 2172 - 2175
RADECK, J.MEYER, D.LAUTENSCHLAGER, N.MASCHER, T: "Bacillus SEVA siblings: A Golden Gate-based toolbox to create personalized integrative vectors for Ba cillus subtilis", SCI. REP, vol. 7, 2017, pages 14134
ALTENBUCHNER J.: "Editing of the Bacillus subtilis genome by the CRISPR-Cas9 system", APPL ENVIRON MICROBIOL, vol. 82, 2016, pages 5421 - 5
GUIZIOU,S.V.SAUVEPLANEH.J.CHANGC.CLERTEN.DECLERCKM.JULESJ.BONNET: "A part toolbox to tune genetic expression in Bacillus subtilis", NUCLEIC ACIDS RES., vol. 44, 2016, pages 7495 - 7508
HA-BICHER ET AL., BIOTECHNOL J, vol. 15, no. 2, 2019
MEISSNER ET AL., JOURNAL OF INDUSTRIAL MICROBIOLOGY & BIOTECHNOLOGY, vol. 42, no. 9, 2015, pages 1203 - 1215
SYVERTSSON ET AL., PLOS ONE, vol. 11, no. 3, 2016, pages e0151267
CAMACHO C.COULOURIS G.AVAGYAN V.MA N.PAPADOPOULOS J.BEALER K.MADDEN T.L.: "BLAST+: architecture and applications", BMC BIOINFORMATICS, vol. 10, 2008, pages 421
KATOHSTANDLEY: "MAFFT multiple sequence alignment software version 7: improvements in performance and usability", MOLECULAR BIOLOGY AND EVOLUTION, vol. 30, 2013, pages 772 - 780
WHEELER, TRAVIS JSEAN R EDDY: "nhmmer: DNA homology search with profile HMMs", BIOINFORMATICS (OXFORD, ENGLAND, vol. 29, no. 19, 2013, pages 2487 - 9
Attorney, Agent or Firm:
BASF IP ASSOCIATION (DE)
Claims:
CLAIMS

1. Expression cassette comprising a target gene under the control of a heterologous promoter of aprE-type, the promoter comprising a sigma factor A site comprising a -10 and a -35 motif, and a first enhancer element having a motif score of at least 100 and, preferably, a second enhancer element having a motif score of at least 100.

2. Expression cassette according to claim 1, wherein the first enhancer element is located in the region of -202 to -102, more preferably in the region of -191 to -169, even more preferably in the region of -186 to -174 and most preferably in the region of -183 to -177.

3. Expression cassette according to any of the previous claims, wherein the second enhancer element is located in the region of -189 to -100, more preferably in the region of -179 to -154, even more preferably in the region of -174 to -159 and most preferably in the region of -171 to -162, and preferably further comprising, immediately downstream of the second enhancer element, a region of 10 nucleotides length enriched in strong or weak type nucleotides such that the ratio of strong:weak nucleotides is either at least 6:4 or at least 4:6, more preferably at least 7:3 or at least 3:7, respectively.

4. Expression cassette according to any of the previous claims, wherein the promoter further comprises a third enhancer element having a motif score of at least 100, preferably located in the region of -152 to -100, more preferably in the region of -142 to -115, even more preferably in the region of -137 to -120 and most preferably in the region of -134 to -123.

5. Expression cassette according to any of the previous claims, wherein the promoter does not comprise a repressor element spanning position -131, wherein the repressor element is of SEQ ID NO. 3 or differs from SEQ ID NO. 3 by up to 2 nucleotides.

6. Expression cassette according to any of the previous claims, wherein the promoter further comprises a degU binding motif and optionally further comprises one or more binding motifs of regulatory factors selected from ScoC, hpr, SinR and AbrB.

7. Expression cassette, obtainable or obtained by a method comprising or consisting of the steps: i) obtaining a promoter sequence having an HMM score above 50, ii) if required altering the promoter sequence to conform to the promoter as defined in any of the preceding claims, iii) bringing a heterologous target gene under the control of the promoter.

8. Expression cassette according to any of the previous claims, wherein the target gene codes for an enzyme, preferably wherein the enzyme is selected from the group consisting of amylase, catalase, cellulase, chitinase, cutinase, galactosidase, beta-galactosidase, glucoamylase, glucosidase, hemicellulase, invertase, laccase, lipase, mannanase, mannosidase, nuclease, oxidase, pectinase, phosphatase, phytase, protease, ribonuclease, transferase and xylanase, more preferably a protease, amylase or lipase, most preferably a protease.

9. Vector comprising an expression cassette according to any of the previous claims.

10. Host cell comprising a vector according to claim 9 or, integrated into its genome, an expression cassette according to any of claims 1-8.

11. Host cell according to claim 10, wherein the host cell belongs to taxonomic genus Bacillus, preferably Bacillus amyloliquefaciens, Bacillus clausii, Bacillus halodurans, Bacillus lentus, Bacillus licheniformis, Bacillus paralicheniformis, Bacillus pumilus, Bacillus subtilis or Bacillus velezensis, more preferably Bacillus licheniformis.

12. Fermentation method for the production of a target protein, comprising cultivating a host cell according to any of claims 10 or 11 in a suitable medium to produce the protein, preferably further comprising the step of isolating or purifying the target protein.

13. Fermentation broth obtained by the method according to claim 12.

14. Use of an expression cassette according to any of claims 1-8, a vector according to claim 9 or a host cell according to claim 10 or 11 for the production of a target protein coded by the target gene of the expression cassette.

Description:
IMPROVING EXPRESSION IN FERMENTATION PROCESSES

The present invention is concerned with materials and methods for industrial fermentation processes. In particular, the invention is concerned with expression cassettes to facilitate the expression of a target gene under the control of a heterologous promoter. The invention further pertains to the construction of such promoters, to vectors and host cells comprising such expression cassettes, and to fermentation methods using such host cells. Furthermore, the invention provides materials obtained by such fermentation.

BACKGROUND

It is a general aim in the field of industrial fermentation processes to improve the yield of the expressed target gene. There are a number of techniques available to the skilled person to pursue this goal. One of those techniques is the selection of a proper promoter to control expression of the target gene.

Promoters for expression of target genes in industrial fermentation processes have been extensively studied. Of particular interest are promoters which are inducer independent. Such promoters allow the target gene to be expressed in a fermentation process without having to constantly supply an inducer to the fermentation medium. Well-known examples of such promoters are the aprE promoter, amyL promoter, veg promoter, bacteriophage SP01 promoter, and the cry3A promoter.

The native promoter from the gene encoding the Bacillus subtilisin Carlsberg protease, also referred to as aprE promoter, is well described in the art. The aprE gene is transcribed by sigma factor A (sigA) and its expression is highly controlled by several regulators: DegU acts as activator of aprE expression, whereas AbrB, ScoC (hpr) and SinR are repressors of aprE expression (Ferrari, E., D.J. Henner, M. Perego, and J.A. Hoch. 1988. Transcription of Bacillus subtilis subtilisin and expression of subtilisin in sporulation mutants. J Bacteriol 170: 289-295; Henner, D.J., E. Ferrari, M. Perego, and J.A. Hoch. 1988. Location of the targets of the hpr-97, sacU32(Hy), and sacQ36(Hy) mutations in upstream regions of the subtilisin promoter. J. Bacteriol. 170: 296-300; Park, S.S., S.L. Wong, L.F. Wang, and R.H. Doi. 1989. Bacillus subtilis subtilisin gene (aprE) is expressed from a sigma A (sigma 43) promoter in vitro and in vivo. J Bacteriol 171: 2657-2665; Gaur, N.K., J. Oppenheim, and I. Smith. 1991. The Bacillus subtilis sin gene, a regulator of alternate developmental processes, codes for a DNA-binding protein. J Bacteriol 173: 678-686; Kallio, P.T., J.E. Fagelson, J.A. Hoch, and M.A. Strauch. 1991. The transition state regulator Hpr of Bacillus subtilis is a DNA-binding protein. Journal of Biological Chemistry 266: 13411-13417).

The core promoter region comprising the sigma factor A binding sites -35 and -10 has been mapped to the region nt -1 to nt -45 relative to the transcriptional start site (Park, S.S., S.L. Wong, L.F. Wang, and R.H. Doi. 1989. Bacillus subtilis subtilisin gene (aprE) is expressed from a sigma A (sigma 43) promoter in vitro and in vivo. J Bacteriol 171: 2657-2665). WO0151643 describes the increase of expression by mutating the -35 site of the wild type aprE promoter from TACTAA to the canonical TTGACA -35 site motif (Helmann, J.D. 1995. Compilation and analysis of Bacillus subtilis sigma A-dependent promoter sequences: evidence for extended contact between RNA polymerase and upstream promoter DNA. Nucleic Acids Res. 23: 2351-2360).

The transcriptional start site (TSS) is located at nt -58 relative to the start GTG of the aprE gene. The 5’UTR comprises the ribosome binding site (Shine Dalgarno) and a sequence within nt -58 to nt -33 relative to the start GTG that forms a very stable stem-loop structure at the 5’-end of the mRNA, which is responsible for a high mRNA transcript stability of up to 25 min (Hambraeus et al., 2000, Microbiology 146 (Pt 12): 3051-3059; Hambraeus et al., 2002, Microbiology 148 (Pt 6): 1795-1803).

The region of nt -141 to nt -161 relative to the transcriptional start site has been shown to be responsible for full induction in a DegU (SacU) and DegQ (SacQ) dependent manner, whereas regions 5’ of nt -200 up to nt -600 are negatively regulated by ScoC (Hpr) (Henner, D.J., E. Ferrari, M. Perego, and J.A. Hoch. 1988. Location of the targets of the hpr-97, sacU32(Hy), and sacQ36(Hy) mutations in upstream regions of the subtilisin promoter. J. Bacteriol. 170: 296-300).

The ScoC (hpr) binding sites within the Bacillus subtilis aprE promoter region have been more precisely mapped revealing additional binding sites within the above 48 mentioned core pro moter region (Kallio, P.T., J.E.Fagelson, J.A.Hoch, and M.A.Strauch. 1991. The transition state regulator Hpr of Bacillus subtilis is a DNA-binding protein. Journal of Biological Chemistry 266: 13411-13417).

The binding site of the repressing transition state regulator AbrB has been mapped to nt -58 to nt +15 relative to the transcriptional start site (Strauch, M.A., G.B. Spiegelman, M. Perego, W.C. Johnson, D. Burbulys, and J.A. Hoch. 1989. The transition state transcription regulator abrB of Bacillus subtilis is a DNA binding protein. EMBO J 8: 1615-1621).

The binding sites of the repressor SinR have been mapped to nt -233 to nt -268 relative to the transcriptional start site (Gaur, N.K., J. Oppenheim, and I. Smith. 1991. The Bacillus subtilis sin gene, a regulator of alternate developmental processes, codes for a DNA-binding protein. J Bacteriol 173: 678-686).

Jacobs et al. (Jacobs M, Eliasson M, Uhlen M, Flock JI. 1985. Cloning, sequencing and expression of subtilisin Carlsberg from Bacillus licheniformis. Nucleic Acids Res 13: 8913-8926; Jacobs, M.F. 1995. Expression of the subtilisin Carlsberg-encoding gene in Bacillus licheniformis and Bacillus subtilis. Gene 152: 69-74) disclose the sequence of the aprE (subC) gene and its 5’ region of the Bacillus licheniformis NCIB6816 strain (GenBank accession No. X03341). The regulation of the expression of the subtilisin Carlsberg aprE (subC) gene and the DNA sequences involved are described. The transcriptional start site (TSS) is located at nt -73 and accordingly the 5’ UTR comprises nt -73 to nt -1 relative to the start ATG. The ribosome binding site (Shine Dalgarno) is located at position nt -16 to nt -9. The recognition sequence -10-site (TATAAT-box) of the sigma factor A is highly conserved and located at nt -84 to nt -79, whereas the -35 site (TACCAT) located 17 nt upstream of the -10 site is less conserved compared to standard sigma factor A dependent promoters in Bacillus (Helmann et al., 1995, Nucleic Acids Res. 23: 2351-2360). Promoter truncations from the 5’ end comprising nt -122 to nt -1 and nt -181 to nt -1 (mutant 771 and mutant 770, respectively, as described in Jacobs et al., 1995) show 20-40 fold reduced subtilisin Carlsberg protease expression activities compared to expression with promoter fragment nt -225 to nt -1 (mutant 769, as described in Jacobs et al., 1995) in Bacillus subtilis strains with elevated regulators DegU (degU32H) or DegQ (degQ36H). Therefore, the binding sites of the regulator degU stimulating subtilisin Carlsberg expression lie within the region comprising nt -225 to nt -182.

WO9102792 discloses the functionality of the promoter of the alkaline protease gene for the large-scale production of subtilisin Carlsberg-type protease in Bacillus licheniformis. The subtilisin Carlsberg is produced in a fermentation process using complex media components as nitrogen and carbon sources.

In particular, WO9102792 describes the 5’ region of the subtilisin Carlsberg protease encoding aprE gene of Bacillus licheniformis (Figure 27) comprising the functional aprE gene promoter and the 5’UTR comprising the ribosome binding site (Shine Dalgarno sequence). Moreover, the truncated fragment thereof starting with the AvaI restriction endonuclease site comprises the functional aprE gene promoter and the 5’UTR comprising the ribosome binding site (Shine Dalgarno sequence), as exemplified by expression of a subtilisin Carlsberg fusion protein consisting of the signal peptide of the aprE gene from Bacillus licheniformis and the propeptide sequence and mature sequence of the Bacillus lentus alkaline protease gene.

The invention thus aspires to provide materials and methods to improve industrial fermentation processes, in particular by providing an expression cassette to achieve increased expression of a target gene under the control of a promoter.

SUMMARY OF THE INVENTION

The invention provides an expression cassette comprising a target gene under the control of a heterologous promoter of aprE-type, the promoter comprising a sigma factor A site comprising a -10 and a -35 motif, and a first enhancer element having a motif score of at least 100 and, preferably, a second enhancer element having a motif score of at least 100.

The invention also provides an expression cassette, obtainable or obtained by a method comprising or consisting of the steps: i) obtaining a promoter sequence having an HMM score above 50, ii) if required altering the promoter sequence to conform to the promoter as defined in any of the preceding claims, iii) bringing a heterologous target gene under the control of the promoter.

Furthermore, the invention provides a vector comprising an expression cassette according to the present invention.

The invention further provides a host cell comprising a vector of the present invention or, integrated into its genome, an expression cassette according to the present invention.

The invention also provides a fermentation method for the production of a target protein, comprising cultivating a host cell according to the present invention in a suitable medium to produce the protein, preferably further comprising the step of isolating or purifying the target protein.

Correspondingly the invention provides a fermentation broth obtained by the method according to the present invention.

And the invention provides the use of an expression cassette according to the present invention, a vector according to the present invention or a host cell according to the present invention for the production of a target protein coded by the target gene of the expression cassette.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 shows an annotated sequence (SEQ ID NO. 6) of the 5' region to the aprE gene comprising a minimal promoter aprE (SEQ ID NO. 45) for use in an expression cassette according to the present invention.

Figure 2 shows an alignment of the aprE type promoters of SEQ ID NO. 2, 3, 4, 5 and 6.

Figure 3 shows the relative promoter activity in percent plotted against the indicated B. licheniformis strains for the 24h and 48h timepoints of cultivation. B. licheniformis strains M609.1A and M609.1B carry a GFP reporter gene under control of the truncated and full-length aprE promoter of B. licheniformis DSM641, respectively. B. licheniformis strains M609.2A and M609.2B carry the GFP reporter gene under control of the truncated and full-length aprE promoter of B. licheniformis DSM13, respectively. The figure shows that the promoter activity of the aprE type promoter of the present invention is significantly increased over similar promoters both within 24h and 48h of fermentation.

DETAILED DESCRIPTION

The technical teaching of the invention is expressed herein using the means of language, in particular by use of scientific and technical terms. However, the skilled person understands that the means of language, detailed and precise as they may be, can only approximate the full content of the technical teaching, if only because there are multiple ways of expressing a teaching, each necessarily failing to completely express all conceptual connections, as each expression necessarily must come to an end. With this in mind the skilled person understands that the subject matter of the invention is the sum of the individual technical concepts signified herein or expressed, necessarily in a pars-pro-toto way, by the innate constraints of a written description. In particular, the skilled person will understand that the signification of individual technical concepts is done herein as an abbreviation of spelling out each possible combination of concepts as far as technically sensible, such that for example the disclosure of three concepts or embodiments A, B and C is a shorthand notation of the concepts A+B, A+C, B+C, A+B+C. In particular, fallback positions for features are described herein in terms of lists of converging alternatives or instantiations. Unless stated otherwise, the invention described herein comprises any combination of such alternatives. The choice of more or less preferred elements from such lists is part of the invention and is due to the skilled person’s preference for a minimum degree of realization of the advantage or advantages conveyed by the respective features. Such multiple combined instantiations represent the adequately preferred form(s) of the invention.

As used herein, terms in the singular and the singular forms like "a", "an" and "the" include plural referents unless the content clearly dictates otherwise. Thus, for example, use of the term "a nucleic acid" optionally includes, as a practical matter, many copies of that nucleic acid molecule; similarly, the term "probe" optionally (and typically) encompasses many similar or identical probe molecules. Also as used herein, the word "comprising" or variations such as "comprises" or "comprising" will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.

As used herein, the term "and/or" refers to and encompasses any and all possible combinations of one or more of the associated listed items, as well as the lack of combinations when interpreted in the alternative ("or"). The term “comprising” also encompasses the term “consisting of”.

The term "about", when used in reference to a measurable value, for example an amount of mass, dose, time, temperature, sequence identity and the like, refers to a variation of ± 0.1%, 0.25%, 0.5%, 0.75%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15% or even 20% of the specified value as well as the specified value. Thus, if a given composition is described as comprising "about 50% X," it is to be understood that, in some embodiments, the composition comprises 50% X whilst in other embodiments it may comprise anywhere from 40% to 60% X (i.e., 50% ± 10%).
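The arithmetic behind this definition can be illustrated with a minimal sketch. It assumes that the listed variations are read as relative to the specified value, so that the 40% to 60% range of the example corresponds to a 20% variation of the value 50. The names `ABOUT_TOLERANCES` and `about_range` are hypothetical and do not appear in the description.

```python
# Variations listed in the definition of "about", in percent of the
# specified value (assumption: relative, not absolute, percentages).
ABOUT_TOLERANCES = (0.1, 0.25, 0.5, 0.75, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20)


def about_range(value, tolerance_percent):
    """Return the (lower, upper) bounds of 'about value' for one listed variation."""
    if tolerance_percent not in ABOUT_TOLERANCES:
        raise ValueError("variation is not one of the listed values")
    delta = abs(value) * tolerance_percent / 100.0
    return value - delta, value + delta


# "about 50% X" with the widest listed variation (20% of the specified value)
lower, upper = about_range(50, 20)
print(lower, upper)
```

Under this reading, the widest listed variation reproduces the 40% to 60% bounds given in the example.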

As used herein, the term "gene" refers to a biochemical information which, when materialised in a nucleic acid, can be transcribed into a gene product, i.e. a further nucleic acid, preferably an RNA, and preferably also can be translated into a peptide or polypeptide. The term is thus also used to indicate the section of a nucleic acid resembling said information and to the sequence of such nucleic acid (herein also termed "gene sequence").

Also as used herein, the term "allele" refers to a variation of a gene characterized by one or more specific differences in the gene sequence compared to the wild type gene sequence, regardless of the presence of other sequence differences. Alleles or nucleotide sequence variants of the invention have at least, in increasing order of preference, 30%, 40%, 50%, 60%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%-84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% nucleotide "sequence identity" to the nucleotide sequence of the wild type gene. Correspondingly, where an "allele" refers to the biochemical information for expressing a peptide or polypeptide, the respective nucleic acid sequence of the allele has at least, in increasing order of preference, 30%, 40%, 50%, 60%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%-84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% amino acid "sequence identity" to the respective wild type peptide or polypeptide.

Mutations or alterations of amino or nucleic acid sequences can be any of substitutions, deletions or insertions; the terms "mutations" or "alterations" also encompass any combination of these.

Protein or nucleic acid variants may be defined by their sequence identity when compared to a parent protein or nucleic acid. Sequence identity usually is provided as “% sequence identity” or “% identity”. To determine the percent-identity between two amino acid sequences, in a first step a pairwise sequence alignment is generated between those two sequences, wherein the two sequences are aligned over their complete length (i.e., a pairwise global alignment). The alignment is generated with a program implementing the Needleman and Wunsch algorithm (J. Mol. Biol. (1979) 48, p. 443-453), preferably by using the program “NEEDLE” (The European Molecular Biology Open Software Suite (EMBOSS)) with the program’s default parameters (gapopen=10.0, gapextend=0.5 and matrix=EBLOSUM62). The preferred alignment for the purpose of this invention is that alignment from which the highest sequence identity can be determined.

The following example is meant to illustrate two nucleotide sequences, but the same calculations apply to protein sequences:

Seq A: AAGATACTG length: 9 bases
Seq B: GATCTGA length: 7 bases

Hence, the shorter sequence is sequence B.

Producing a pairwise global alignment which is showing both sequences over their complete lengths results in

Seq A: AAGATACTG-
         ||| |||
Seq B: --GAT-CTGA

The “|” symbol in the alignment indicates identical residues (which means bases for DNA or amino acids for proteins). The number of identical residues is 6.

The “-” symbol in the alignment indicates gaps. The number of gaps introduced by alignment within the sequence B is 1. The number of gaps introduced by alignment at borders of sequence B is 2, and at borders of sequence A is 1.

The alignment length showing the aligned sequences over their complete length is 10.
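The counts stated above can be checked with a short sketch. It assumes the aligned rows `AAGATACTG-` and `--GAT-CTGA`, which are consistent with the stated numbers of identical residues, internal gaps and border gaps; the function name `alignment_stats` is hypothetical and is not part of the description or of EMBOSS.

```python
def alignment_stats(row_a, row_b, gap="-"):
    """Count identical residues and gaps in a pairwise alignment.

    row_a and row_b are the two aligned rows (same length, gaps as '-').
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")

    # A column counts as identical only if both rows carry the same residue.
    identical = sum(1 for x, y in zip(row_a, row_b) if x == y and x != gap)

    def gap_counts(row):
        stripped = row.strip(gap)             # remove gaps at the borders
        border = len(row) - len(stripped)     # gaps at either end
        internal = stripped.count(gap)        # gaps within the sequence
        return internal, border

    a_internal, a_border = gap_counts(row_a)
    b_internal, b_border = gap_counts(row_b)
    return {
        "length": len(row_a),
        "identical": identical,
        "a_internal_gaps": a_internal,
        "a_border_gaps": a_border,
        "b_internal_gaps": b_internal,
        "b_border_gaps": b_border,
    }


stats = alignment_stats("AAGATACTG-", "--GAT-CTGA")
print(stats)
```

For the assumed rows this gives an alignment length of 10, 6 identical residues, 1 internal and 2 border gaps in sequence B, and 1 border gap in sequence A.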

Producing a pairwise alignment which is showing the shorter sequence over its complete length according to the invention consequently results in:

Seq A: GATACTG-
       ||| |||
Seq B: GAT-CTGA

Producing a pairwise alignment which is showing sequence A over its complete length according to the invention consequently results in:

Seq A: AAGATACTG
         ||| |||
Seq B: --GAT-CTG

Producing a pairwise alignment which is showing sequence B over its complete length according to the invention consequently results in:

Seq A: GATACTG-
       ||| |||
Seq B: GAT-CTGA

The alignment length showing the shorter sequence over its complete length is 8 (one gap is present which is factored in the alignment length of the shorter sequence).

Accordingly, the alignment length showing sequence A over its complete length would be 9 (meaning sequence A is the sequence of the invention), the alignment length showing sequence B over its complete length would be 8 (meaning sequence B is the sequence of the invention).

After aligning the two sequences, in a second step, an identity value shall be determined from the alignment. Therefore, according to the present description the following calculation of percent-identity applies:

%-identity = (identical residues / length of the alignment region which is showing the respective sequence of this invention over its complete length) * 100

Thus, sequence identity in relation to comparison of two amino acid sequences according to the invention is calculated by dividing the number of identical residues by the length of the alignment region which is showing the respective sequence of this invention over its complete length. This value is multiplied by 100 to give “%-identity”. According to the example provided above, %-identity is: for sequence A being the sequence of the invention (6 / 9) * 100 = 66.7%; for sequence B being the sequence of the invention (6 / 8) * 100 = 75%.
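The calculation can be sketched as follows, as an illustration only. The sketch assumes the aligned rows `AAGATACTG-` and `--GAT-CTGA` for sequences A and B and takes the "alignment region which is showing the respective sequence over its complete length" to be the columns from the first to the last residue of that sequence. The function name `percent_identity` is hypothetical.

```python
def percent_identity(row_query, row_other, gap="-"):
    """%-identity relative to the region spanning row_query completely.

    row_query is the aligned row of the sequence of the invention,
    row_other the aligned row of the sequence it is compared to.
    """
    if len(row_query) != len(row_other):
        raise ValueError("aligned rows must have the same length")

    # Columns in which the query carries a residue (not a gap).
    residue_cols = [i for i, c in enumerate(row_query) if c != gap]
    if not residue_cols:
        raise ValueError("query row contains no residues")

    # Region from the first to the last residue of the query; internal
    # gaps of the query stay inside the region and count towards its length.
    start, end = residue_cols[0], residue_cols[-1] + 1
    region_length = end - start

    identical = sum(
        1
        for x, y in zip(row_query[start:end], row_other[start:end])
        if x == y and x != gap
    )
    return 100.0 * identical / region_length


row_a, row_b = "AAGATACTG-", "--GAT-CTGA"
print(round(percent_identity(row_a, row_b), 1))  # sequence A as reference
print(round(percent_identity(row_b, row_a), 1))  # sequence B as reference
```

With sequence A as the reference the region has length 9, with sequence B it has length 8 (one internal gap included), matching the two alignment lengths given in the example.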

The term "hybridisation" as defined herein is a process wherein substantially complementary nucleotide sequences anneal to each other. The hybridisation process can occur entirely in solution, i.e. both complementary nucleic acids are in solution. The hybridisation process can also occur with one of the complementary nucleic acids immobilised to a matrix such as magnetic beads, Sepharose beads or any other resin. The hybridisation process can furthermore occur with one of the complementary nucleic acids immobilised to a solid support such as a nitrocellulose or nylon membrane or immobilised by e.g. photolithography to, for example, a siliceous glass support (the latter known as nucleic acid arrays or microarrays or as nucleic acid chips). In order to allow hybridisation to occur, the nucleic acid molecules are generally thermally or chemically denatured to melt a double strand into two single strands and/or to remove hairpins or other secondary structures from single stranded nucleic acids.

The term “stringency” refers to the conditions under which a hybridisation takes place. The stringency of hybridisation is influenced by conditions such as temperature, salt concentration, ionic strength and hybridisation buffer composition. Generally, low stringency conditions are selected to be about 30°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. Medium stringency conditions are when the temperature is 20°C below Tm, and high stringency conditions are when the temperature is 10°C below Tm. High stringency hybridisation conditions are typically used for isolating hybridising sequences that have high sequence similarity to the target nucleic acid sequence. However, nucleic acids may deviate in sequence and still encode a substantially identical polypeptide, due to the degeneracy of the genetic code. Therefore, medium stringency hybridisation conditions may sometimes be needed to identify such nucleic acid molecules.
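As a small illustration of these definitions, the following sketch derives a hybridisation temperature from a given Tm using the offsets stated above (about 30°C, 20°C and 10°C below Tm). The names `STRINGENCY_OFFSETS_C` and `hybridisation_temperature` are hypothetical, and the Tm of 75°C in the usage line is an arbitrary example value, not a value from the description.

```python
# Offsets below Tm (in degrees Celsius) as given in the definition of stringency.
STRINGENCY_OFFSETS_C = {"low": 30.0, "medium": 20.0, "high": 10.0}


def hybridisation_temperature(tm_celsius, stringency):
    """Return the hybridisation temperature for a given Tm and stringency level."""
    try:
        offset = STRINGENCY_OFFSETS_C[stringency]
    except KeyError:
        raise ValueError("stringency must be 'low', 'medium' or 'high'")
    return tm_celsius - offset


# Arbitrary example: a probe/target pair with a Tm of 75 degrees Celsius.
for level in ("low", "medium", "high"):
    print(level, hybridisation_temperature(75.0, level))
```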

The “Tm” is the temperature under defined ionic strength and pH, at which 50% of the target sequence hybridises to a perfectly matched probe. The Tm is dependent upon the solution conditions and the base composition and length of the probe. For example, longer sequences hybridise specifically at higher temperatures. The maximum rate of hybridisation is obtained from about 16°C up to 32°C below Tm. The presence of monovalent cations in the hybridisation solution reduces the electrostatic repulsion between the two nucleic acid strands, thereby promoting hybrid formation; this effect is visible for sodium concentrations of up to 0.4M (for higher concentrations, this effect may be ignored). Formamide reduces the melting temperature of DNA-DNA and DNA-RNA duplexes by 0.6 to 0.7°C for each percent formamide, and addition of 50% formamide allows hybridisation to be performed at 30 to 45°C, though the rate of hybridisation will be lowered. Base pair mismatches reduce the hybridisation rate and the thermal stability of the duplexes. On average and for large probes, the Tm decreases about 1°C per % base mismatch. The Tm may be calculated using the following equations, depending on the types of hybrids:

• DNA-DNA hybrids (Meinkoth and Wahl, Anal. Biochem., 138: 267-284, 1984):

Tm = 81.5°C + 16.6 x log10[Na+]{a} + 0.41 x %[G/C]{b} - 500 x (L{c})^-1 - 0.61 x % formamide

• DNA-RNA or RNA-RNA hybrids:

Tm = 79.8 + 18.5 x log10[Na+]{a} + 0.58 x (%G/C{b}) + 11.8 x (%G/C{b})^2 - 820/L{c}

• oligo-DNA or oligo-RNA{d} hybrids:

for <20 nucleotides: Tm = 2 x l_n

for 20-35 nucleotides: Tm = 22 + 1.46 x l_n

wherein:

{a} or for other monovalent cation, but only accurate in the 0.01-0.4 M range

{b} only accurate for %GC in the 30% to 75% range

{c} L = length of duplex in base pairs

{d} oligo, oligonucleotide

l_n = effective length of primer = 2 x (no. of G/C) + (no. of A/T)
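As an illustration only, the DNA-DNA and oligonucleotide equations above can be evaluated with a few lines of Python. This is a minimal sketch: the function names are hypothetical, the sodium concentration is assumed to be given in mol/l and the G/C content in percent, and counting U like T for RNA oligonucleotides is an assumption not stated in the text.

```python
import math


def tm_dna_dna(na_molar, gc_percent, length_bp, formamide_percent=0.0):
    """Tm in degrees C for DNA-DNA hybrids (Meinkoth and Wahl equation)."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 500.0 / length_bp
            - 0.61 * formamide_percent)


def effective_length(oligo):
    """l_n = 2 x (no. of G/C) + (no. of A/T)."""
    s = oligo.upper()
    gc = s.count("G") + s.count("C")
    # Assumption: U in an RNA oligonucleotide is counted like T.
    at = s.count("A") + s.count("T") + s.count("U")
    return 2 * gc + at


def tm_oligo(oligo):
    """Tm in degrees C for oligonucleotides of up to 35 nucleotides."""
    n = len(oligo)
    l_n = effective_length(oligo)
    if n < 20:
        return 2.0 * l_n
    if n <= 35:
        return 22.0 + 1.46 * l_n
    raise ValueError("oligo equations only cover up to 35 nucleotides")
```

For instance, `tm_dna_dna(0.195, 50.0, 500, formamide_percent=50.0)` applies the formamide correction term to a 500 bp duplex; the footnoted accuracy ranges for sodium concentration and %GC are not enforced by this sketch.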

Non-specific binding may be controlled using any one of a number of known techniques such as, for example, blocking the membrane with protein containing solutions, additions of heterologous RNA, DNA, and SDS to the hybridisation buffer, and treatment with RNase. For non-related probes, a series of hybridisations may be performed by varying one of (i) progressively lowering the annealing temperature (for example from 68°C to 42°C) or (ii) progressively lowering the formamide concentration (for example from 50% to 0%). The skilled artisan is aware of various parameters which may be altered during hybridisation and which will either maintain or change the stringency conditions.

Besides the hybridisation conditions, specificity of hybridisation typically also depends on the function of post-hybridisation washes. To remove background resulting from non-specific hybridisation, samples are washed with dilute salt solutions. Critical factors of such washes include the ionic strength and temperature of the final wash solution: the lower the salt concentration and the higher the wash temperature, the higher the stringency of the wash. Wash conditions are typically performed at or below hybridisation stringency. A positive hybridisation gives a signal that is at least twice that of the background. Generally, suitable stringent conditions for nucleic acid hybridisation assays or gene amplification detection procedures are as set forth above. More or less stringent conditions may also be selected. The skilled artisan is aware of various parameters which may be altered during washing and which will either maintain or change the stringency conditions. For example, typical high stringency hybridisation conditions for DNA hybrids longer than 50 nucleotides encompass hybridisation at 65°C in 1x SSC or at 42°C in 1x SSC and 50% formamide, followed by washing at 65°C in 0.3x SSC. Examples of medium stringency hybridisation conditions for DNA hybrids longer than 50 nucleotides encompass hybridisation at 50°C in 4x SSC or at 40°C in 6x SSC and 50% formamide, followed by washing at 50°C in 2x SSC. The length of the hybrid is the anticipated length for the hybridising nucleic acid. When nucleic acids of known sequence are hybridised, the hybrid length may be determined by aligning the sequences and identifying the conserved regions described herein. 1x SSC is 0.15M NaCl and 15mM sodium citrate; the hybridisation solution and wash solutions may additionally include 5x Denhardt's reagent, 0.5-1.0% SDS, 100 µg/ml denatured, fragmented salmon sperm DNA, 0.5% sodium pyrophosphate.

Another example of high stringency conditions is hybridisation at 65°C in 0.1x SSC comprising 0.1% SDS and optionally 5x Denhardt's reagent, 100 µg/ml denatured, fragmented salmon sperm DNA, 0.5% sodium pyrophosphate, followed by the washing at 65°C in 0.3x SSC.
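To relate the SSC dilutions named above to the sodium term of the Tm equations, the sodium concentration of a diluted SSC buffer can be estimated as follows. This is a rough sketch with a hypothetical function name; it takes the 1x SSC composition stated above and assumes that the sodium citrate is trisodium citrate contributing three sodium ions per citrate, which the text itself does not specify.

```python
def sodium_molarity(ssc_fold, nacl_1x=0.15, citrate_1x=0.015, na_per_citrate=3):
    """Approximate [Na+] in mol/l of an SSC buffer at the given dilution.

    ssc_fold       -- e.g. 0.3 for 0.3x SSC
    nacl_1x        -- NaCl in 1x SSC (0.15 M, as stated in the text)
    citrate_1x     -- sodium citrate in 1x SSC (15 mM, as stated in the text)
    na_per_citrate -- assumption: trisodium citrate, three Na+ per citrate
    """
    return ssc_fold * (nacl_1x + na_per_citrate * citrate_1x)


def within_tm_accuracy_range(na_molar):
    """True if [Na+] lies in the 0.01-0.4 M range given for the Tm equations."""
    return 0.01 <= na_molar <= 0.4
```

Under these assumptions a 0.3x SSC wash corresponds to roughly 0.06 M sodium, which lies inside the stated accuracy range, whereas 4x SSC falls outside it.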

For the purposes of defining the level of stringency, reference can be made to Sambrook et al. (2001) Molecular Cloning: a laboratory manual, 3rd Edition, Cold Spring Harbor Laboratory Press, CSH, New York or to Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989 and yearly updates).

The term "nucleic acid construct" as used herein refers to a nucleic acid molecule, either single- or double-stranded, which is isolated from a naturally occurring gene or is modified to contain segments of nucleic acids in a manner that would not otherwise exist in nature or is synthetic.

The term "nucleic acid construct" is synonymous with the term "expression cassette" when the nucleic acid construct contains the control sequences required for expression of a polynucleotide.

The term "control sequence" is defined herein to include all sequences affecting the expression of a polynucleotide, including but not limited thereto, the expression of a polynucleotide encoding a polypeptide. Each control sequence may be native or foreign to the polynucleotide or native or foreign to each other. Such control sequences include, but are not limited to, promoter sequence, 5’-UTR (also called leader sequence), ribosomal binding site (RBS, Shine Dalgarno sequence), 3’-UTR, and transcription start and stop sites.

The term "functional linkage" or "operably linked" with respect to regulatory elements, is to be understood as meaning the sequential arrangement of a regulatory element (including but not limited thereto a promoter) with a nucleic acid sequence to be expressed and, if appropriate, further regulatory elements (including but not limited thereto a terminator) in such a way that each of the regulatory elements can fulfil its intended function to allow, modify, facilitate or otherwise influence expression of said nucleic acid sequence. For example, a control sequence is placed at an appropriate position relative to the coding sequence of the polynucleotide sequence such that the control sequence directs the expression of the coding sequence of a polypeptide.

A "promoter" or "promoter sequence" is a nucleotide sequence located upstream of a gene on the same strand as the gene that enables that gene's transcription. The promoter is followed by the transcription start site of the gene. A promoter is recognized by RNA polymerase (together with any required transcription factors), which initiates transcription. A functional fragment or functional variant of a promoter is a nucleotide sequence which is recognizable by RNA polymerase, and capable of initiating transcription.

An "active promoter fragment", "active promoter variant", "functional promoter fragment" or "functional promoter variant" describes a fragment or variant of the nucleotide sequences of a promoter, which still has promoter activity.

An "inducer dependent promoter" is understood herein as a promoter that is increased in its activity to enable transcription of the gene to which the promoter is operably linked upon addition of an "inducer molecule" to the fermentation medium. Thus, for an inducer-dependent promoter the presence of the inducer molecule triggers via signal transduction an increase in expression of the gene operably linked to the promoter. Gene expression prior to activation by the inducer molecule does not need to be absent, but can also be present at a low level of basal gene expression that is increased after addition of the inducer molecule. The "inducer molecule" is a molecule whose presence in the fermentation medium is capable of effecting an increase in expression of a gene by increasing the activity of an inducer-dependent promoter operably linked to the gene. Preferably the inducer molecule is a carbohydrate or an analogue thereof. In one embodiment, the inducer molecule is a secondary carbon source of the Bacillus cell. In the presence of a mixture of carbohydrates, cells selectively take up the carbon source that provides them with the most energy and growth advantage (primary carbon source). Simultaneously, they repress the various functions involved in the catabolism and uptake of the less preferred carbon sources (secondary carbon source). Typically, a primary carbon source for Bacillus is glucose, with various other sugars and sugar derivatives being used by Bacillus as secondary carbon sources. Secondary carbon sources include e.g. mannose or lactose without being restricted to these. In contrast thereto, promoters that do not depend on the presence of an inducer molecule (herein called "inducer-independent promoters") are either constitutively active or can be increased in their activity regardless of the presence of an inducer molecule that is added to the fermentation medium. In a preferred embodiment the inducer-independent promoter is an aprE promoter.

An "aprE promoter", "aprE-type promoter" or "aprE promoter sequence" is the nucleotide sequence (or parts or variants thereof) located upstream of an aprE gene, i.e., a gene coding for a Bacillus subtilisin Carlsberg protease, on the same strand as the aprE gene that enables that aprE gene’s transcription. The term "transcription start site" or "transcriptional start site" shall be understood as the location where the transcription starts at the 5’ end of a gene sequence. In prokaryotes the first nucleotide, referred to as +1, is in general an adenosine (A) or guanosine (G) nucleotide. In this context, the terms "sites" and "signal" can be used interchangeably herein.

The term "expression" or "gene expression" means the transcription of a specific gene or specific genes or specific nucleic acid construct. The term "expression" or "gene expression" in particular means the transcription of a gene or genes or genetic construct into structural RNA (e.g., rRNA, tRNA) or mRNA with or without subsequent translation of the latter into a protein. The process includes transcription of DNA and processing of the resulting mRNA product.

The term "vector" is defined herein as a linear or circular DNA molecule that comprises a polynucleotide that is operably linked to one or more control sequences that provide for the expression of the polynucleotide.

As used herein, the term "isolated DNA molecule" refers to a DNA molecule at least partially separated from other molecules normally associated with it in its native or natural state. The term "isolated" preferably refers to a DNA molecule that is at least partially separated from some of the nucleic acids which normally flank the DNA molecule in its native or natural state. Thus, DNA molecules fused to regulatory or coding sequences with which they are not normally associated, for example as the result of recombinant techniques, are considered isolated herein.

Such molecules are considered isolated when integrated into the chromosome of a host cell or present in a nucleic acid solution with other DNA molecules, in that they are not in their native state.

Any number of methods well known to those skilled in the art can be used to isolate and manipulate a polynucleotide, or fragment thereof, as disclosed herein. For example, polymerase chain reaction (PCR) technology can be used to amplify a particular starting polynucleotide molecule and/or to produce variants of the original molecule. Polynucleotide molecules, or fragments thereof, can also be obtained by other techniques, such as by directly synthesizing the fragment by chemical means, as is commonly practiced by using an automated oligonucleotide synthesizer. A polynucleotide can be single-stranded (ss) or double-stranded (ds). "Double-stranded" refers to the base-pairing that occurs between sufficiently complementary, anti-parallel nucleic acid strands to form a double-stranded nucleic acid structure, generally under physiologically relevant conditions. Embodiments of the method include those wherein the polynucleotide is at least one selected from the group consisting of sense single-stranded DNA (ssDNA), sense single-stranded RNA (ssRNA), double-stranded RNA (dsRNA), double-stranded DNA (dsDNA), a double-stranded DNA/RNA hybrid, anti-sense ssDNA, or anti-sense ssRNA; a mixture of polynucleotides of any of these types can be used.

The term "heterologous" (or exogenous or foreign or recombinant or non-native) polypeptide is defined herein as a polypeptide that is not native to the host cell, a polypeptide native to the host cell in which structural modifications, e.g., deletions, substitutions, and/or insertions, have been made by recombinant DNA techniques to alter the native polypeptide, or a polypeptide native to the host cell whose expression is quantitatively altered or whose expression is directed from a genomic location different from the native host cell as a result of manipulation of the DNA of the host cell by recombinant DNA techniques, e.g., a stronger promoter. Similarly, the term "heterologous" (or exogenous or foreign or recombinant or non-native) polynucleotide refers to a polynucleotide that is not native to the host cell, a polynucleotide native to the host cell in which structural modifications, e.g., deletions, substitutions, and/or insertions, have been made by recombinant DNA techniques to alter the native polynucleotide, or a polynucleotide native to the host cell whose expression is quantitatively altered as a result of manipulation of the regulatory elements of the polynucleotide by recombinant DNA techniques, e.g., a stronger promoter, or a polynucleotide native to the host cell, but integrated not within its natural genetic environment as a result of genetic manipulation by recombinant DNA techniques. With respect to two or more polynucleotide sequences or two or more amino acid sequences, the term "heterologous" is used to indicate that the two or more polynucleotide sequences or two or more amino acid sequences do not naturally occur in the specific combination with each other. In particular, the term "heterologous" when referring to a promoter-gene combination means that the specific combination of promoter and gene is not found in nature.
A promoter is heterologous to a gene and vice versa in particular when (a) a promoter, which in a wild type cell is operably linked to a gene A, is now operably linked instead to another gene B, or (b) where a promoter not found in nature is operably linked to a gene, or (c) where a promoter is operably linked to a gene of a sequence not found in nature.

The term "host cell", as used herein, includes any cell type that is susceptible to transformation, transfection, transduction, conjugation, and the like with a nucleic acid construct or expression vector. Thus, the term "host cell" includes cells that have the capacity to act as a host or expression vehicle for a newly introduced DNA sequence, in particular for expression of a target gene comprised in said newly introduced DNA sequence.

As used herein, "recombinant" when referring to nucleic acid or polypeptide, indicates that such material has been altered as a result of human application of a recombinant technique, such as by polynucleotide restriction and ligation, by polynucleotide overlap-extension, or by genomic insertion or transformation. A gene sequence open reading frame is recombinant if (a) that nucleotide sequence is present in a context other than its natural one, for example by virtue of being (i) cloned into any type of artificial nucleic acid vector or (ii) moved or copied to another location of the original genome, or if (b) the nucleotide sequence is mutagenized such that it differs from the wild type sequence. The term recombinant also can refer to an organism having a recombinant material, e.g., a plant that comprises a recombinant nucleic acid is a recombinant plant.

The term "transgenic" refers to an organism, preferably a plant or part thereof, or a nucleic acid that comprises a heterologous polynucleotide. Preferably, the heterologous polynucleotide is stably integrated within the genome such that the polynucleotide is passed on to successive generations. The heterologous polynucleotide may be integrated into the genome alone or as part of a recombinant expression cassette. "Transgenic" is used herein to refer to any cell or cell line the genotype of which has been so altered by the presence of heterologous nucleic acid including those transgenic organisms or cells initially so altered, as well as those created by crosses or asexual propagation from the initial transgenic organism or cell. A "recombinant" organism preferably is a "transgenic" organism.

As used herein, "mutagenized" refers to an organism or nucleic acid thereof having alteration(s) in the biomolecular sequence of its native genetic material as compared to the sequence of the genetic material of a corresponding wildtype organism or nucleic acid, wherein the alteration(s) in genetic material were induced and/or selected by human action. Methods of inducing mutations can induce mutations in random positions in the genetic material or can induce mutations in specific locations in the genetic material (i.e., can be directed mutagenesis techniques), such as by use of a genoplasty technique. In addition to unspecific mutations, according to the invention a nucleic acid can also be mutagenized by using mutagenesis means with a preference or even specificity for a particular site, thereby creating an artificially induced heritable allele according to the present invention. Such means, for example site specific nucleases, including for example zinc finger nucleases (ZFNs), meganucleases, transcription activator-like effector nucleases (TALENs) (Malzahn et al., Cell Biosci, 2017, 7:21) and clustered regularly interspaced short palindromic repeats/CRISPR-associated nuclease (CRISPR/Cas) with an engineered crRNA/tracrRNA (for example as a single-guide RNA, or as modified crRNA and tracrRNA molecules which form a dual molecule guide), and methods of using these nucleases to target known genomic locations, are well-known in the art (see reviews by Bortesi and Fischer, 2015, Biotechnology Advances 33: 41-52; and by Chen and Gao, 2014, Plant Cell Rep 33: 575-583, and references within).

As used herein, a "genetically modified organism" (GMO) is an organism whose genetic characteristics contain alteration(s) that were produced by human effort causing transfection that results in transformation of a target organism with genetic material from another or "source" organism, or with synthetic or modified-native genetic material, or an organism that is a descendant thereof that retains the inserted genetic material. The source organism can be of a different type of organism (e.g., a GMO plant can contain bacterial genetic material) or from the same type of organism (e.g., a GMO plant can contain genetic material from another plant).

The term "native" (or wildtype or endogenous) cell or organism and "native" (or wildtype or endogenous) polynucleotide or polypeptide refers to the cell or organism as found in nature and to the polynucleotide or polypeptide in question as found in a cell in its natural form and genetic environment, respectively (i.e., without there being any human intervention).

As used herein, "wildtype" or "corresponding wildtype plant" means the typical form of an organism or its genetic material, as it normally occurs, as distinguished from e.g. mutagenized and/or recombinant forms. Similarly, by "control cell" or "wildtype host cell" is intended a cell that lacks the particular polynucleotide of the invention that is disclosed herein. The use of the term "wildtype" is not, therefore, intended to imply that a host cell lacks recombinant DNA in its genome, and/or does not possess fungal resistance characteristics that are different from those disclosed herein.

The invention provides an expression cassette comprising a target gene under the control of a heterologous promoter. The promoter is of aprE-type. Preferred promoters are described herein and are available to the skilled person. The promoter comprises a sigma factor A site comprising a -10 and a -35 motif, and a first enhancer element having a motif score of at least 100 and preferably a second enhancer element having a motif score of at least 100. As described in the examples below, it has been surprisingly found that a promoter with these features leads to an increased expression of a target gene operably linked thereto. This was particularly unexpected as promoters of aprE-type had been well studied before.

The motif score is calculated as follows: The putative promoter sequence is aligned to the sequence according to SEQ ID NO. 44. Let n be the length of the motif to be assessed. Then, starting at the first nucleotide of the putative promoter, the nucleotides up to position n are compared to their respective scores in the corresponding motif score table and the respective scores are totalled. For example, to analyse if the nucleotides of position 1 to n in a putative promoter sequence conform to the minimal motif score for a first enhancer element, the score value for the first nucleotide according to table 1 is added to the score of the second nucleotide according to table 1 and so on until the score of the seventh nucleotide according to table 1 is added. Then, if the sum is at least 100, the presence of a first enhancer element at sequence position 1-7 is confirmed. If the respective element cannot be confirmed at position 1 to n, then the window of analysis is shifted to the next consecutive nucleotides 2 to n+1 of the putative promoter sequence and the process is repeated. The process is repeated until the presence of the respective enhancer has been confirmed. Of course, when the number of remaining nucleotides is smaller than n then the presence of the enhancer element cannot be confirmed for the remaining nucleotides; the respective enhancer element is thus not found at those positions.

Table 1: Nucleotide scores for the first enhancer element

For example, the first enhancer element of SEQ ID NO. 6 at the 120th to 126th nucleotide has a motif score of 20+16+10+15+8+17+18=104, whereas the same nucleotide window of SEQ ID NO. 3 only has a motif score of 20+16+7+15+5+17+18=98. Clearly, the sequence SEQ ID NO. 3 lacks a first enhancer element at this position.
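The sliding-window procedure described above may be sketched in Python as follows. The score table in this sketch is purely hypothetical: the values of Table 1 are not reproduced here, so the nucleotide assignments below were invented solely so that one toy window sums to 104 and a variant to 98, mirroring the arithmetic of the example. The function names are likewise illustrative, and the prior alignment to SEQ ID NO. 44 is not modelled.

```python
# HYPOTHETICAL per-position scores for a 7-nucleotide motif; these are
# placeholders for illustration, not the values of Table 1.
TOY_TABLE = [
    {"A": 20, "C": 2, "G": 3, "T": 1},
    {"A": 1, "C": 16, "G": 2, "T": 4},
    {"A": 7, "C": 1, "G": 10, "T": 3},
    {"A": 2, "C": 3, "G": 1, "T": 15},
    {"A": 8, "C": 5, "G": 2, "T": 1},
    {"A": 3, "C": 17, "G": 1, "T": 2},
    {"A": 18, "C": 1, "G": 2, "T": 4},
]


def window_score(window, table):
    """Sum the per-position scores of one window of len(table) nucleotides."""
    return sum(column.get(nt, 0) for nt, column in zip(window, table))


def find_element(sequence, table, threshold=100):
    """Return the 1-based start of the first window scoring >= threshold.

    Windows 1..n, 2..n+1, ... are tried in order; None is returned when
    fewer than n nucleotides remain without any window reaching the threshold.
    """
    n = len(table)
    seq = sequence.upper()
    for start in range(len(seq) - n + 1):
        if window_score(seq[start:start + n], table) >= threshold:
            return start + 1
    return None
```

With this toy table, the window `ACGTACA` scores 104 and is reported, while the variant `ACATCCA` scores 98 and is not.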

Table 2: Nucleotide scores for the second enhancer element

Table 3: Nucleotide scores for the third enhancer element

The invention is not limited to any particular biological enhancer or transcription factor recognizing any of the first, second or third enhancer elements or of the repressor element. The term "enhancer" or "repressor" is attached to each element only to denote that, for an enhancer element, the presence of this element is preferred according to the invention, or discouraged for a repressor element.

The first enhancer element is preferably located in the region of -202 to -102, more preferably in the region of -191 to -169, even more preferably in the region of -186 to -174 and most preferably in the region of -183 to -177. In line with common numbering in the art, the position of the 5' end of the -10 motif of the sigma A binding site is defined to be position number -12; all nucleotides more upstream of this are then numbered by decreasing integers (-13, -14 and so on). There is no position 0; instead, the nucleotide in 3' direction next to position -1 is numbered +1 and all further nucleotides in 3' direction are successively numbered by increasing integers (2, 3 and so on). This numbering is depicted in figure 1. As shown in the examples, absence of such first enhancer element leads to a significant decrease in expression of a target gene.
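The numbering convention just described (the 5' end of the -10 motif is position -12 and there is no position 0) can be expressed as a small conversion between string indices and promoter positions. This is only a sketch: the function names are hypothetical and the 0-based index of the 5' end of the -10 motif is assumed to be known already.

```python
def promoter_position(index, minus10_start_index):
    """Map a 0-based sequence index to a promoter position.

    minus10_start_index is the 0-based index of the 5' end of the -10 motif,
    which is defined to be position -12. Position 0 does not exist, so the
    nucleotide following -1 is numbered +1.
    """
    pos = -12 + (index - minus10_start_index)
    return pos if pos < 0 else pos + 1


def index_of_position(position, minus10_start_index):
    """Inverse mapping: promoter position to 0-based sequence index."""
    if position == 0:
        raise ValueError("there is no position 0 in this numbering")
    offset = position + 12 if position < 0 else position + 11
    return minus10_start_index + offset
```

For example, if the -10 motif starts at index 200, the most preferred region of -183 to -177 for the first enhancer element corresponds to indices 29 to 35.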

The second enhancer element is preferably located in the region of -189 to -100, more preferably in the region of -179 to -154, even more preferably in the region of -174 to -159 and most preferably in the region of -171 to -162. When comparing the promoters of the present invention, in particular according to SEQ ID NO. 6, 44 or 45, with other aprE-type promoters it has been found that the second enhancer element tends to be conserved even for promoters of taxonomically distant sources of AprE proteins, i.e. promoters having an HMM score of at least 50. Thus, even though the reason for such conservation is not known, it is nevertheless preferred to maintain or introduce this element in a promoter according to the present invention.

The promoter further preferably comprises, immediately downstream of the second enhancer element, a region of 10 nucleotides length enriched in strong or weak type nucleotides such that the ratio of strong:weak nucleotides is either at least 6:4 or at least 4:6, more preferably at least 7:3 or at least 3:7, respectively. Conforming to nomenclature of the art, the nucleotides adenine (A) and thymidine (T) are collectively called "weak" type nucleotides and the remaining cytidine and guanosine nucleotides are called "strong" type nucleotides. Interestingly, the exact sequence within the region of 10 nucleotides length downstream of the second enhancer element is not conserved for similar promoters, but the ratio of strong to weak type nucleotides is imbalanced as described herein. Thus, it is preferred to maintain such imbalance.
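The strong:weak criterion can be checked by simple counting. The sketch below uses hypothetical function names and reads "at least 6:4 or at least 4:6" as meaning that either the strong or the weak nucleotides make up at least 6 of the 10 positions (7 for the more preferred 7:3 or 3:7 case); this reading is an interpretation of the text above.

```python
def strong_weak_counts(region):
    """Count strong (G/C) and weak (A/T) nucleotides in a region."""
    s = region.upper()
    strong = s.count("G") + s.count("C")
    weak = s.count("A") + s.count("T")
    return strong, weak


def is_imbalanced(region, min_major=6):
    """True if strong or weak nucleotides fill at least min_major of 10 positions."""
    strong, weak = strong_weak_counts(region)
    if len(region) != 10 or strong + weak != 10:
        raise ValueError("expected exactly 10 nucleotides from A, C, G, T")
    return max(strong, weak) >= min_major
```

Calling `is_imbalanced(region, min_major=7)` tests the more preferred ratio.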

Further preferably the promoter comprises a third enhancer element having a motif score of at least 100. The third enhancer element, if present, is preferably located in the region of -152 to -100, more preferably in the region of -142 to -115, even more preferably in the region of -137 to -120 and most preferably in the region of -134 to -123. As shown in the examples, the presence of this enhancer element is not sufficient to achieve the significant increase in expression strength of the promoter of the present invention. However, the third enhancer element is found in preferred promoters of the present invention, for example according to SEQ ID NO. 6, 44 or 45, and it is thus preferred that the element is present.

The promoter according to the present invention preferably does not comprise a repressor element which has the sequence of SEQ ID NO. 47 or differs from SEQ ID NO. 47 by up to 2 nucleotides and wherein the repressor element covers position -131. Such repressor element is found in some aprE-type promoters. However, such repressor element would collide with the presence of the preferred third enhancer element. Thus, the promoter according to the present invention preferably does not comprise such repressor element, or if it does, then preferably position -131 of the promoter does not fall within the repressor element.
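A check of this kind might be sketched as below. Several assumptions are made and should be read as such: "differs by up to 2 nucleotides" is interpreted as at most two substitutions (no insertions or deletions), the caller supplies the 0-based index that corresponds to position -131, and the repressor sequence used in any test is a made-up placeholder, since SEQ ID NO. 47 is not reproduced here. The function names are hypothetical.

```python
def mismatches(a, b):
    """Number of differing positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def repressor_covers(promoter, repressor, target_index, max_mismatches=2):
    """True if a window matching the repressor element covers target_index.

    promoter     -- promoter sequence
    repressor    -- repressor element sequence (placeholder for SEQ ID NO. 47)
    target_index -- 0-based index of the nucleotide at position -131
    Only windows that contain target_index are examined; a window matches
    when it differs from the repressor by at most max_mismatches substitutions.
    """
    p, r = promoter.upper(), repressor.upper()
    m = len(r)
    first = max(0, target_index - m + 1)
    last = min(target_index, len(p) - m)
    for start in range(first, last + 1):
        if mismatches(p[start:start + m], r) <= max_mismatches:
            return True
    return False
```

A promoter for which this function returns False at the index of position -131 would satisfy the preference stated above, under the listed assumptions.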

According to the present invention the promoter preferably further comprises a degU binding motif. DegU is a transcription factor known to increase expression of aprE-type promoters. Thus, the presence of a corresponding transcription factor binding site is preferred. Optionally the promoter further comprises one or more binding motifs of regulatory factors selected from ScoC, hpr, SinR and AbrB. These regulatory factors are primarily negative regulators. However, presence of such binding sites is preferred to improve correct timing of expression of the target gene during an industrial fermentation process.

Further optionally the promoter comprises a 5'UTR. This is a transcribed but not translated region downstream of the -1 promoter position. Such untranslated region for example should contain a ribosome binding site to facilitate translation in those cases where the target gene codes for a peptide or polypeptide.

With respect to the 5'UTR the invention in particular teaches to combine the promoter of the present invention with a 5'UTR comprising one or more stabilising elements. This way the mRNAs synthesized from the promoter region may be processed to generate mRNA transcript with a stabilizer sequence at the 5' end of the transcript. Preferably such a stabilizer sequence at the 5' end of the mRNA transcripts increases their half-life as described by Hue et al, 1995, Journal of Bacteriology 177: 3465-3471. Suitable mRNA stabilizing elements are those described in

• WO08148575, preferably SEQ ID NO. 1 to 5 of WO08140615, or fragments of these sequences which maintain the mRNA stabilizing function, and in

• WO08140615, preferably Bacillus thuringiensis CryIIIA mRNA stabilising sequence or bacteriophage SP82 mRNA stabilising sequence, more preferably a mRNA stabilising sequence according to SEQ ID NO. 4 or 5 of WO08140615, more preferably a modified mRNA stabilising sequence according to SEQ ID NO. 6 of WO08140615, or fragments of these sequences which maintain the mRNA stabilizing function.

Preferred mRNA stabilizing elements are selected from the group consisting of aprE, grpE, cotG, SP82, RSBgsiB, CryIIIA mRNA stabilizing elements, preferably mRNA stabilising elements according to SEQ ID NO. 48 to 52 respectively (corresponding to SEQ ID NO. 1 to 5, respectively, of WO08148575), and according to SEQ ID NO. 53 or 54 (corresponding to SEQ ID NO. 4 and 6, respectively of WO08140615), or according to fragments of these sequences which maintain the mRNA stabilizing function. A preferred mRNA stabilizing element is the grpE mRNA stabilizing element, preferably according to SEQ ID NO. 49 (corresponding to SEQ ID NO. 2 of WO08148575).

The 5'UTR also preferably comprises a modified rib leader sequence located downstream of the promoter and upstream of a ribosome binding site (RBS). In the context of the present invention a rib leader is herewith defined as the leader sequence upstream of the riboflavin biosynthetic genes (rib operon) in a Bacillus cell, more preferably in a Bacillus subtilis cell. In Bacillus subtilis, the rib operon, comprising the genes involved in riboflavin biosynthesis, includes the ribG (ribD), ribB (ribE), ribA, and ribH genes. Transcription of the riboflavin operon from the rib promoter (Prib) in B. subtilis is controlled by a riboswitch involving an untranslated regulatory leader region (the rib leader) of almost 300 nucleotides located in the 5'-region of the rib operon between the transcription start and the translation start codon of the first gene in the operon, ribG. Suitable rib leader sequences are described in WO2015/181296, in particular pages 23-25, incorporated herein by reference.

The invention also provides an expression cassette obtainable or obtained by a method comprising or consisting of the steps: i) obtaining a promoter sequence having an HMM score above 50, ii) if required altering the promoter sequence to conform to the promoter as defined in any of the preceding claims, iii) bringing a heterologous target gene under the control of the promoter.

The "HMM score" is the score value obtained by the method used in Example 2. By following the steps of the above method suitable further promoters are found (for example the aprE-type promoters of sequences SEQ ID NO. 49 to 196 as described in example 2), converted into promoters according to the present invention and made use of by coupling them with a target gene to be expressed. It is a particular advantage of the present invention that suitable promoters can be reliably identified in highly varying sources, and still can be converted into strong promoters for the expression of a target gene, preferably for a microbial host as described herein. Thus, the present invention advantageously not only provides, for use in an expression cassette of the present invention, the promoters according to SEQ ID NO. 6, 44 or 45 but also promoters having at least 80%, more preferably at least 85%, more preferably at least 89%, more preferably at least 90%, more preferably at least 91%, more preferably at least 92%, more preferably at least 93%, more preferably at least 94%, more preferably at least 95%, more preferably at least 96%, more preferably at least 97%, more preferably at least 98%, more preferably at least 99% sequence identity to any of sequences SEQ ID NO. 6, 44 or 45, provided that the promoters that differ from SEQ ID NO. 44 comprise the first and preferably also one or more of the further elements (other than the putative repressor element at position -131) as described herein, further preferably at the respective locations as described with respect to the elements.

According to the present invention the target gene preferably codes for an enzyme, and more preferably the enzyme is selected from the group consisting of amylase, catalase, cellulase, chitinase, cutinase, galactosidase, beta-galactosidase, glucoamylase, glucosidase, hemicellulase, invertase, laccase, lipase, mannanase, mannosidase, nuclease, oxidase, pectinase, phosphatase, phytase, protease, ribonuclease, transferase and xylanase, more preferably a protease, amylase or lipase, most preferably a protease. It is a particular advantage of the present invention that the promoter of the expression cassette is capable of driving expression of a variety of target genes common in industrial fermentation processes.

The invention also provides a vector comprising an expression cassette as described herein. The vector facilitates the transformation of host cells to have the target gene expressed, preferably in an industrial fermentation process.

Correspondingly the invention also provides a host cell comprising a vector according to the present invention, and the invention provides a host cell having integrated into its genome an expression cassette according to the present invention.

The host cell according to the invention preferably belongs to the taxonomic genus Bacillus, more preferably to any of the species Bacillus amyloliquefaciens, Bacillus clausii, Bacillus halodurans, Bacillus lentus, Bacillus licheniformis, Bacillus paralicheniformis, Bacillus pumilus, Bacillus subtilis or Bacillus velezensis, and most preferably belongs to the species Bacillus licheniformis. Such Bacillus microorganisms are commonly used in fermentation processes. It is an advantage of the present invention that the expression cassette provided herein is adapted to facilitate or cause a strong expression of the target gene in such Bacillus microorganisms, thereby helping to increase the yield of a target protein in an industrial fermentation process. Preferably the Bacillus licheniformis is selected from the group consisting of Bacillus licheniformis ATCC 14580, ATCC 31972, ATCC 53926, ATCC 53757, ATCC 55768, DSM 13, DSM 394, DSM 641, DSM 1913, DSM 11259, and DSM 26543. Most preferably the host cell according to the invention belongs to a Bacillus licheniformis species encoding a restriction modification system having a recognition sequence GCNGC.

Further preferably the host cell may additionally contain modifications, e.g., deletions or disruptions, of other genes that may be detrimental to the production, recovery or application of a polypeptide of interest. In one embodiment, a bacterial host cell is a protease-deficient cell. The bacterial host cell, e.g., Bacillus cell, preferably comprises a disruption or deletion of extracellular protease genes including but not limited to aprE, mpr, vpr, bpr, and/or epr. Further preferably the bacterial host cell does not produce spores. Further preferably the bacterial host cell, e.g., Bacillus cell, comprises a disruption or deletion of spoIIAC, sigE, and/or sigG. Further preferably the bacterial host cell, e.g., Bacillus cell, comprises a disruption or deletion of one of the genes involved in the biosynthesis of surfactin, e.g., srfA, srfB, srfC, and/or srfD, see, for example, U.S. Patent No. 5,958,728. It is also preferred that the bacterial host cell comprises a disruption or deletion of one of the genes involved in the biosynthesis of polyglutamic acid. Other genes, including but not limited to the amyE gene, which are detrimental to the production, recovery or application of a polypeptide of interest may also be disrupted or deleted.

Having established the materials and methods to create key ingredients for industrial fermentation processes, the invention also provides a fermentation method for the production of a target protein, comprising cultivating a host cell according to the invention in a suitable medium to produce the protein, preferably further comprising the step of isolating or purifying the target protein. As described above, the host comprises an expression cassette according to the invention which, in turn, comprises a target gene coding for the target protein. Thus the fermentation process benefits from the increase in expression achieved by the use of the promoter as described above.

The invention also provides a fermentation broth obtained by the fermentation method according to the present invention. Such fermentation broth comprises high concentrations of the target protein when the protein is secreted or the host cells are ruptured during downstream processing; if the protein is not secreted, then the fermentation broth comprises host cells which, in turn, feature a high concentration of the target protein. Correspondingly the fermentation broth advantageously lends itself to the production of a product comprising or made using the target protein.

Furthermore, as described above, the invention teaches the use of an expression cassette according to the present invention, a vector according to the present invention or a host cell according to the present invention for the production of a target protein coded by the target gene of the expression cassette. Such use realises the advantages conferred by the promoter-gene combination provided by the present invention.

The invention is further described by way of examples. These examples are for illustrative purposes and are not intended to limit the scope of the invention or of the claims.

EXAMPLES

Example 1: Comparison of a promoter of the present invention to similar promoters

1.0 Common materials and methods

Unless otherwise stated, the following experiments have been performed by applying standard equipment, methods, chemicals, and biochemicals as used in genetic engineering and fermentative production of chemical compounds by cultivation of microorganisms. See also Sambrook et al. (Sambrook, J. and Russell, D.W. Molecular cloning. A laboratory manual, 3rd ed, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. 2001) and Chmiel et al. (Bioprozesstechnik 1. Einführung in die Bioverfahrenstechnik, Gustav Fischer Verlag, Stuttgart, 1991).

1.0.1 Electrocompetent Bacillus licheniformis cells and electroporation

Transformation of DNA into Bacillus licheniformis strain DSM641 is performed via electroporation. Preparation of electrocompetent Bacillus licheniformis cells and transformation of DNA is performed essentially as described by Brigidi et al. (Brigidi, P., Mateuzzi, D. (1991). Biotechnol. Techniques 5, 5) with the following modification: Upon transformation of DNA, cells are recovered in 1 ml LBSPG buffer and incubated for 60 min at 37°C (Vehmaanpera J., 1989, FEMS Microbio. Lett., 61: 165-170), followed by plating on selective LB-agar plates. In order to overcome the specific restriction modification system of Bacillus licheniformis strain DSM641, plasmid DNA is isolated from Ec#098 cells as described below. For transfer into Bacillus licheniformis restrictase knockout strains, plasmid DNA is isolated from E. coli INV110 cells (Life technologies).

1.0.2 Electrocompetent Bacillus subtilis cells and electroporation

Transformation of DNA into Bacillus subtilis ATCC6051a is performed via electroporation as described for Bacillus licheniformis. Plasmid DNA isolated from E. coli DH10B cells can be readily used for transfer into Bacillus subtilis.

1.0.3 Plasmid Isolation

Plasmid DNA was isolated from Bacillus and E. coli cells by standard molecular biology methods described in Sambrook and Russell (Sambrook, J. and Russell, D.W. Molecular cloning. A laboratory manual, 3rd ed, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. 2001) or by the alkaline lysis method (Birnboim, H. C., Doly, J. (1979). Nucleic Acids Res 7(6): 1513-1523). In contrast to E. coli cells, Bacillus cells were treated with 10 mg/ml lysozyme for 30 min at 37°C prior to cell lysis.

1.0.4 Annealing of oligonucleotides to form oligonucleotide-duplexes

Oligonucleotides were adjusted to a concentration of 100 µM in water. 5 µl of the forward and 5 µl of the corresponding reverse oligonucleotide were added to 90 µl 30 mM Hepes-buffer (pH 7.8). The reaction mixture was heated to 95°C for 5 min, followed by annealing by ramping from 95°C to 4°C while decreasing the temperature by 0.1°C/sec (Cobb, R. E., Wang, Y., & Zhao, H. (2015). High-Efficiency Multiplex Genome Editing of Streptomyces Species Using an Engineered CRISPR/Cas System. ACS Synthetic Biology, 4(6), 723-728).
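As a rough plausibility check of the annealing protocol, the following Python sketch computes the final concentration of each oligonucleotide in the reaction mixture and the duration of the cooling ramp. It assumes that the concentrations are given in µM and the volumes in µl; the function names are illustrative and not part of the described method.

```python
def final_concentration_um(stock_um, oligo_ul, total_ul):
    """Concentration (in µM) of one oligonucleotide after dilution into the mix."""
    return stock_um * oligo_ul / total_ul


def ramp_duration_s(start_c, end_c, rate_c_per_s):
    """Time (in s) needed to cool from start_c to end_c at a constant rate."""
    return (start_c - end_c) / rate_c_per_s


# Values taken from the protocol: 5 µl forward + 5 µl reverse + 90 µl buffer.
total_ul = 5 + 5 + 90
conc = final_concentration_um(stock_um=100, oligo_ul=5, total_ul=total_ul)
ramp = ramp_duration_s(start_c=95, end_c=4, rate_c_per_s=0.1)

print(f"each oligonucleotide: {conc:.1f} µM")
print(f"cooling ramp: about {ramp:.0f} s ({ramp / 60:.1f} min)")
```

Under these assumptions each oligonucleotide is present at 5 µM, and the ramp from 95°C to 4°C takes roughly 15 minutes in addition to the initial 5 min denaturation.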

1.0.5 Molecular biology methods and techniques

Standard methods in molecular biology, including but not limited to cultivation of Bacillus and E. coli microorganisms, electroporation of DNA, isolation of genomic and plasmid DNA, PCR reactions and cloning technologies, were performed essentially as described by Sambrook and Russell (Sambrook, J. and Russell, D.W. Molecular cloning. A laboratory manual, 3rd ed, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. 2001).

1.0.6 Strains

E. coli strain Ec#098

E. coli strain Ec#098 is an E. coli INV110 strain (Life technologies) carrying the DNA-methyltransferase encoding expression plasmid pMDS003 (WO2019016051).

1.0.7 Generation of Bacillus licheniformis gene k.o. strains

For gene deletion in Bacillus licheniformis strain DSM641 (US5352604) and derivatives thereof, deletion plasmids were transformed into E. coli strain Ec#098 made competent according to the method of Chung (Chung, C.T., Niemela, S.L., and Miller, R.H. (1989). One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc. Natl. Acad. Sci. U. S. A 86, 2172-2175), followed by selection on LB-agar plates containing 100 µg/ml ampicillin and 30 µg/ml chloramphenicol at 37°C. Plasmid DNA was isolated from individual clones and used for subsequent transfer into Bacillus licheniformis strains. The isolated plasmid DNA carries the DNA methylation pattern of Bacillus licheniformis strain DSM641 and is protected from degradation upon transfer into B. licheniformis.

1.0.8 Bacillus licheniformis P304: deleted restriction endonuclease

Electrocompetent Bacillus licheniformis DSM641 cells (US5352604) were prepared as described above and transformed with 1 µg of pDel006 restrictase gene deletion plasmid isolated from E. coli Ec#098, followed by plating on LB-agar plates containing 5 µg/ml erythromycin at 30°C.

The gene deletion procedure was performed as follows:

Plasmid-carrying Bacillus licheniformis cells were grown on LB-agar plates with 5 µg/ml erythromycin at 45°C, driving integration of the deletion plasmid via Campbell recombination into the chromosome with one of the homology regions of pDel006 homologous to the sequences 5’ or 3’ of the restrictase gene. Clones were picked and cultivated in LB-media without selection pressure at 45°C for 6 hours, followed by plating on LB-agar plates with 5 µg/ml erythromycin and incubation overnight at 30°C. Individual clones were picked and screened by colony-PCR analysis with oligonucleotides SEQ ID NO. 16 and SEQ ID NO. 17 for successful genomic deletion of the restrictase gene. Putative deletion-positive individual clones were picked and taken through two consecutive overnight incubations in LB media without antibiotics at 45°C to cure the plasmid and plated on LB-agar plates for overnight incubation at 37°C. Single clones were analyzed by colony PCR for successful genomic deletion of the restrictase gene. A single erythromycin-sensitive clone with the correctly deleted restrictase gene was isolated and designated Bacillus licheniformis P304.

1.0.9 Bacillus licheniformis P305: deleted sigF gene

Electrocompetent B. licheniformis P304 cells were prepared as described above and transformed with 1 µg of pDel005 sigF gene deletion plasmid isolated from E. coli INV110 cells (Life technologies), followed by plating on LB-agar plates containing 5 µg/ml erythromycin and incubation overnight at 30°C.

The gene deletion procedure was performed as described for the restrictase gene.

The deletion of the sigF gene was analyzed by PCR with oligonucleotides SEQ ID NO. 25 and SEQ ID NO. 26. The resulting B. licheniformis strain with a deleted sigF gene is designated B. licheniformis P305 and is no longer able to sporulate as described (WO9703185).

1.0.10 Bacillus licheniformis P307: deleted aprE gene

Electrocompetent Bacillus licheniformis P305 cells were prepared as described above and transformed with 1 µg of pDel003 aprE gene deletion plasmid isolated from E. coli INV110 cells, followed by plating on LB-agar plates containing 5 µg/ml erythromycin and incubation overnight at 30°C.

The gene deletion procedure was performed as described for the deletion of the restrictase gene. The deletion of the aprE gene was analyzed by PCR with oligonucleotides SEQ ID NO. 22 and SEQ ID NO. 23. The resulting Bacillus licheniformis strain with deleted aprE gene was named B. licheniformis P307.

1.0.11 Bacillus licheniformis M309: deleted poly-gamma glutamate synthesis genes

Electrocompetent Bacillus licheniformis P307 cells were prepared as described above and transformed with 1 µg of pDel007 pga gene deletion plasmid isolated from E. coli INV110 cells (Life technologies), followed by plating on LB-agar plates containing 5 µg/ml erythromycin and incubation overnight at 30°C.

The gene deletion procedure was performed as described for the deletion of the restrictase gene.

The deletion of the pga genes was analyzed by PCR with oligonucleotides SEQ ID NO. 19 and SEQ ID NO. 20. The resulting Bacillus licheniformis strain with deleted pga synthesis genes was named Bacillus licheniformis M309.

1.0.12 Bacillus licheniformis M609.1A

The gene integration of the expression construct of plasmid pCC043 comprising the PaprE trunc. DSM641-GFPmut2 fragment into the amylase amyB locus was performed as follows.

Electrocompetent Bacillus licheniformis M309 cells were prepared as described above and transformed with 1 µg of pCC043 plasmid isolated from E. coli INV110 cells, followed by plating on LB-agar plates containing 20 µg/ml kanamycin and incubation overnight at 37°C.

The next day, clones of the transformation reaction were subjected to colony-PCR with oligonucleotides SEQ ID NO. 42 and SEQ ID NO. 43 to analyze for successful CRISPR/Cas9-based integration of the PaprE-GFPmut2 expression cassette to replace the amyB gene of Bacillus licheniformis. Positive clones were transferred onto fresh LB-agar plates without antibiotics, followed by incubation at 48°C overnight for plasmid curing. Kanamycin-sensitive clones were again analyzed by PCR with oligonucleotides SEQ ID NO. 42 and SEQ ID NO. 43 to confirm that gene integration replacing the amylase amyB gene was successful. The resulting Bacillus licheniformis strain was named B. licheniformis M609.1A.

1.0.13 Bacillus licheniformis M609.1B

B. licheniformis strain M609.1B was constructed as described for B. licheniformis strain M609.1A, however plasmid pCC047 with expression construct PaprE fl. DSM641-GFPmut2 was used.

1.0.14 Bacillus licheniformis M609.2A

B. licheniformis strain M609.2A was constructed as described for B. licheniformis strain M609.1A, however plasmid pCC048 with expression construct PaprE trunc. DSM13-GFPmut2 was used.

1.0.15 Bacillus licheniformis M609.2B

B. licheniformis strain M609.2B was constructed as described for B. licheniformis strain M609.1A, however plasmid pCC049 with expression construct PaprE fl. DSM13-GFPmut2 was used.

1.0.16 Plasmids

1.0.16.1 pEC194RS - Bacillus temperature sensitive deletion plasmid.

The plasmid pE194 is PCR-amplified with oligonucleotides SEQ ID NO. 9 and SEQ ID NO. 10 with flanking PvuII sites, digested with restriction endonuclease PvuII and ligated into vector pCE1 digested with restriction enzyme SmaI. pCE1 is a pUC18 derivative, where the BsaI site within the ampicillin resistance gene has been removed by a silent mutation. The ligation mixture was transformed into E. coli DH10B cells (Life technologies). Transformants were spread and incubated overnight at 37°C on LB-agar plates containing 100 µg/ml ampicillin. Plasmid DNA was isolated from individual clones and analyzed for correctness by restriction digest. The resulting plasmid is named pEC194S.

The type-II-assembly mRFP cassette is PCR-amplified from plasmid pBSd141R (accession number: KY995200) (Radeck, J., Meyer, D., Lautenschlager, N., and Mascher, T. 2017. Bacillus SEVA siblings: A Golden Gate-based toolbox to create personalized integrative vectors for Bacillus subtilis. Sci. Rep. 7: 14134) with oligonucleotides SEQ ID NO. 11 and SEQ ID NO. 12, comprising additional nucleotides for the restriction site BamHI. The PCR fragment and pEC194S were restricted with restriction enzyme BamHI, followed by ligation and transformation into E. coli DH10B cells (Life technologies). Transformants were spread and incubated overnight at 37°C on LB-agar plates containing 100 µg/ml ampicillin. Plasmid DNA was isolated from individual clones and analyzed for correctness by restriction digest. The resulting plasmid pEC194RS carries the mRFP cassette with the open reading frame opposite to the reading frame of the erythromycin resistance gene.

1.0.16.2 pDel003 - aprE gene deletion plasmid

The gene deletion plasmid for the aprE gene of Bacillus licheniformis was constructed with plasmid pEC194RS and the gene synthesis construct SEQ ID NO. 21 comprising the genomic regions 5’ and 3’ of the aprE gene flanked by BsaI sites compatible to pEC194RS. The type-II-assembly with restriction endonuclease BsaI was performed as described (Radeck et al., 2017; Sci. Rep. 7: 14134) and the reaction mixture subsequently transformed into E. coli DH10B cells (Life technologies). Transformants were spread and incubated overnight at 37°C on LB-agar plates containing 100 µg/ml ampicillin. Plasmid DNA was isolated from individual clones and analyzed for correctness by restriction digest. The resulting aprE deletion plasmid is named pDel003.

1.0.16.3 pDel005 - sigF gene deletion plasmid

The gene deletion plasmid for the sigF gene (spoIIAC gene) of Bacillus licheniformis was constructed as described for pDel003, however the gene synthesis construct SEQ ID NO. 24 comprising the genomic regions 5’ and 3’ of the sigF gene flanked by BsaI sites compatible to pEC194RS was used. The resulting sigF deletion plasmid is named pDel005.

1.0.16.4 pDel006 - Restrictase gene deletion plasmid

The gene deletion plasmid for the restrictase gene (SEQ ID NO. 14) of the restriction modification system of Bacillus licheniformis DSM641 (SEQ ID NO. 13) was constructed with plasmid pEC194RS and the gene synthesis construct SEQ ID NO. 15 comprising the genomic regions 5’ and 3’ of the restrictase gene flanked by BsaI sites compatible to pEC194RS. The type-II-assembly with restriction endonuclease BsaI was performed as described above and the reaction mixture subsequently transformed into E. coli DH10B cells (Life technologies). Transformants were spread and incubated overnight at 37°C on LB-agar plates containing 100 µg/ml ampicillin. Plasmid DNA was isolated from individual clones and analyzed for correctness by restriction digest. The resulting restrictase deletion plasmid is named pDel006.

1.0.16.5 pDel007 - Poly-gamma-glutamate synthesis genes deletion plasmid

The deletion plasmid for deletion of the genes involved in poly-gamma-glutamate (pga) production, namely ywsC (pgsB), ywtA (pgsC), ywtB (pgsA), ywtC (pgsE) of Bacillus licheniformis was constructed as described for pDel006, however the gene synthesis construct SEQ ID NO. 18 comprising the genomic regions 5’ and 3’ flanking the ywsC (pgsB), ywtA (pgsC), ywtB (pgsA), ywtC (pgsE) genes flanked by BsaI sites compatible to pEC194RS was used. The resulting pga deletion plasmid is named pDel007.

1.0.16.6 Plasmid p689-T2A-lac

The E. coli plasmid p689-T2A-lac comprises the lacZ-alpha gene flanked by BpiI restriction sites, again flanked 5’ by the T1 terminator of the E. coli rrnB gene and 3’ by the T0 lambda terminator, and was ordered as gene synthesis construct (SEQ ID NO. 27).

1.0.16.7 Plasmid p890 PaprE trunc. DSM641-GFPmut2

The truncated promoter of the aprE gene from Bacillus licheniformis DSM641 (SEQ ID NO. 2) of plasmid pCB56C (US5352604) was PCR-amplified with oligonucleotides SEQ ID NO. 28 and SEQ ID NO. 29. The GFPmut2 gene variant (accession number AF302837) with flanking BpiI restriction sites (SEQ ID NO. 30) was ordered as gene synthesis fragment (Geneart Regensburg). The gene expression construct comprising the truncated (trunc.) PaprE promoter from Bacillus licheniformis DSM641 fused to the GFPmut2 variant was cloned into plasmid p689-T2A-lac by type-II-assembly with restriction endonuclease BpiI as described (Radeck et al., 2017; Sci. Rep. 7: 14134) and the reaction mixture subsequently transformed into electrocompetent E. coli DH10B cells. Transformants were spread and incubated overnight at 37°C on LB-agar plates containing 100 µg/ml ampicillin. Plasmid DNA was isolated from individual clones and analyzed for correctness by restriction digest and sequencing. The resulting plasmid is named p890 PaprE trunc. DSM641-GFPmut2.

1.0.16.8 Plasmid p891 PaprE fl. DSM641-GFPmut2

The plasmid p891 PaprE fl. DSM641-GFPmut2 was constructed as described for p890 PaprE trunc. DSM641-GFPmut2, however the full-length (fl.) promoter of the aprE gene from Bacillus licheniformis DSM641 (SEQ ID NO. 3) was PCR-amplified with oligonucleotides SEQ ID NO. 31 and SEQ ID NO. 32 using genomic DNA as a template.

1.0.16.9 Plasmid p892 PaprE trunc. DSM13-GFPmut2

The plasmid p892 PaprE trunc. DSM13-GFPmut2 was constructed as described for p891 PaprE fl. DSM641-GFPmut2, however the truncated (trunc.) promoter of the aprE gene from Bacillus licheniformis DSM13 (SEQ ID NO. 5) was PCR-amplified with oligonucleotides SEQ ID NO. 33 and SEQ ID NO. 32 using genomic DNA as a template.

1.0.16.10 Plasmid p893 PaprE fl. DSM13-GFPmut2

The plasmid p893 PaprE fl. DSM13-GFPmut2 was constructed as described for p892 PaprE trunc. DSM13-GFPmut2, however the full-length (fl.) promoter of the aprE gene from Bacillus licheniformis DSM13 (SEQ ID NO. 6) was PCR-amplified with oligonucleotides SEQ ID NO. 31 and SEQ ID NO. 32 using genomic DNA as a template.

1.0.16.11 Plasmid pJOE8999.1:

Plasmid pJOE8999.1 was produced as described in Altenbuchner J., 2016, Editing of the Bacillus subtilis genome by the CRISPR-Cas9 system, Appl Environ Microbiol 82:5421-5.

1.0.16.12 Plasmid pJOE-T2A

To allow for type-II-assembly (T2A) based one-step-cloning of the sgRNA and the homology regions for DSB repair, the CRISPR/Cas9 plasmid pJOE8999.1 was modified as follows. The type-II-assembly mRFP cassette from plasmid pBSd141R (accession number: KY995200) (Radeck et al., 2017; Sci. Rep. 7: 14134) was modified to remove multiple restriction sites and the BpiI restriction sites and ordered as gene synthesis fragment with flanking SfiI restriction sites (SEQ ID NO. 34). The plasmid is named p#732. Plasmid p#732 and plasmid pJOE8999.1 were digested with SfiI (New England Biolabs, NEB) and the mRFP cassette of p#732 ligated into SfiI-digested pJOE8999.1, followed by transformation into competent E. coli DH10B cells. Positive clones were screened on IPTG/X-Gal and kanamycin (20 µg/ml) containing LB agar plates for purple colonies (blue-white screening and mRFP1 expression). The resulting sequence-verified plasmid was named pJOE-T2A.

1.0.16.13 Plasmid pCC027 - T2A CRISPR destination vector

Plasmid pCC027 is a derivative of the plasmid pJOE-T2A, where the promoter PmanP was replaced by a promoter fragment (SEQ ID NO. 35) comprising in the 5’ to 3’ orientation the terminator region of pMutin2 (accession number AF072806) followed by a Pveg promoter variant derived from Guiziou et al. (Guiziou, S., V. Sauveplane, H.J. Chang, C. Clerte, N. Declerck, M. Jules, and J. Bonnet. 2016. A part toolbox to tune genetic expression in Bacillus subtilis. Nucleic Acids Res. 44: 7495-7508) by the Gibson assembly method (NEBuilder® HiFi DNA Assembly Cloning Kit, New England Biolabs).

1.0.16.14 pCC043 - GFP gene integration plasmid based on PaprE trunc. DSM641

The 20 bp target sequence of the amyB gene for the sgRNA was ordered as oligonucleotides SEQ ID NO. 36 and SEQ ID NO. 37 with 5' phosphorylation, followed by annealing to form an oligonucleotide duplex. The 5’ and 3’ regions of the amyB gene of Bacillus licheniformis were PCR-amplified with oligonucleotides SEQ ID NO. 38 and SEQ ID NO. 39, and SEQ ID NO. 40 and SEQ ID NO. 41, respectively.

The CRISPR/Cas9 based gene integration plasmid replacing the amyB gene of Bacillus licheniformis was constructed by type-II-assembly with restriction endonuclease BsaI with the following components: pCC027, the oligonucleotide duplex (SEQ ID NO. 36, SEQ ID NO. 37), the PCR-fragment of the 5’ homology region of the amyB gene, p890 PaprE trunc. DSM641-GFPmut2 and the PCR-fragment of the 3’ homology region of the amyB gene. The reaction mixture was transformed into E. coli INV110 cells (Life technologies). Transformants were spread and incubated overnight at 37°C on LB-agar plates containing 20 µg/ml kanamycin and IPTG/X-Gal. Plasmid DNA was isolated from individual clones and analyzed for correctness by restriction digest and sequencing. The resulting CRISPR/Cas9 based gene integration plasmid is named pCC043.

1.0.16.15 pCC047 - GFP gene integration plasmid based on PaprE fl. DSM641

The plasmid pCC047 was constructed as for plasmid pCC043, however the plasmid p891 was used.

1.0.16.16 pCC048 - GFP gene integration plasmid based on PaprE trunc. DSM13

The plasmid pCC048 was constructed as for plasmid pCC043, however the plasmid p892 was used.

1.0.16.17 pCC049 - GFP gene integration plasmid based on PaprE fl. DSM13

The plasmid pCC049 was constructed as for plasmid pCC043, however the plasmid p893 was used.

1.1 Analysis of promoter strength by single cell measurement of promoter activity.

Bacillus licheniformis strains M609.1A, M609.1B, M609.2A and M609.2B (as listed in Table 4) were cultivated in a fed-batch based process and promoter strength was recorded for individual cells by measurement of the fluorescence signal produced by GFP.

Table 4: Description of strains

Note: Bacillus licheniformis strains DSM13 and ATCC 14580 are isogenic.

Bacillus licheniformis strains were cultivated in a microtiter plate-based fed-batch process (Habicher et al., 2019 Biotechnol J.;15(2)). All cultivations were conducted in an orbital shaker with a shaking diameter of 25 mm (Innova 42, New Brunswick Scientific, Eppendorf AG; Hamburg, Germany) at 30 °C and 400 rpm. Strains were cultivated in two subsequent precultures in FlowerPlates (MTP-48-OFF, m2p-labs GmbH) for synchronization of growth. The first preculture was carried out in 800 µl TB medium inoculated with a fresh single colony from the strain streaked onto LB agar plates. After 20 h, the second preculture containing 800 µl V3 minimal medium (Meissner et al., 2015, Journal of industrial microbiology & biotechnology 42 (9): 1203-1215) was inoculated with 8 µl of the first preculture and cultivated for 24 h. Microtiter plate-based fed-batch main cultivations were conducted using 48-well round- and deep-well-microtiter plates with glucose-containing polymer on the bottom of each well (FeedPlate, article number: SMFP08004, Kuhner Shaker GmbH; Herzogenrath, Germany). 7 µl of the second preculture were used to inoculate 760 µl V3 minimal medium without glucose. Main cultures were incubated for 48 h. Precultures were covered with a sterile gas-permeable sealing foil (AeraSeal film, Sigma-Aldrich) to avoid contamination. FeedPlates were sealed with a sterile gas-permeable, evaporation reducing foil (F-GPR48-10, m2p-labs GmbH) to reduce evaporation and to avoid contamination.

For fluorescence microscopy-based single-cell analysis, samples were harvested and diluted in 0.9 % NaCl to an optical density OD600nm of 2.0. Microscope slides were prepared using 1.5 % agarose in phosphate-buffered saline, which was molded into a 125 µl GeneFrame (Thermo Fisher Scientific; Waltham, USA) to immobilize cells and ensure an even focus plane. To stain the nucleoid, required for data evaluation using NucTracer (see below), 2 µg/ml DAPI (4’,6-diamidino-2-phenylindole) were added to the agarose. 0.5 µl cell suspension were applied on an agarose gel slide and excess moisture was allowed to evaporate before applying the coverslip. Images were captured using a Zeiss Axio Imager.M2 epifluorescence microscope (Carl Zeiss Microscopy GmbH; Jena, Germany) equipped with an EC Plan-Neofluar objective (100x/1.3 Oil Ph3 M27) and a Zeiss AxioCam, running Zeiss ZEN software. Fluorescence filter sets used to visualize DAPI and GFP signals were obtained from Zeiss (Carl Zeiss Microscopy GmbH; Jena, Germany). Fluorescent signals of DAPI were visualized using Filter-Set 49 (wavelength: excitation 335 - 383 nm, emission 420 - 470 nm), and fluorescent signals of GFP were visualized using Filter-Set 38 (wavelength: excitation 450 - 490 nm, emission 500 - 550 nm). The exposure time for GFP was set to 100 ms and kept constant throughout the analyses of all samples.

The measurement of the mean GFP intensity of single cells was carried out using the ImageJ software with the embedded ObjectJ plugin NucTracer (Syvertsson et al., 2016, PloS one 11 (3): e0151267). NucTracer uses the DAPI stained nucleoid as identifier for individual cells and measures GFP fluorescence in the corresponding image obtained in the GFP channel. Microscopic images of phase-contrast, DAPI and GFP channels were imported into ImageJ software and stored in a TIFF-stack format containing the images arranged by channels. The analysis was performed with 16-bit tiff files. Single cells were marked in the DAPI channel, and NucTracer was used to transfer a circular region of interest (ROI) with a diameter of 0.5 µm to the GFP channel. Mean GFP intensity of each ROI was measured. For GFP background subtraction the measured values were blanked against the average background value of the image. The mean GFP intensity from 200 to 500 cells was used for calculation of mean aprE promoter activity. All mean values are relative to the fluorescence intensity obtained for B. licheniformis M609.1A (carrying the truncated DSM641 PaprE promoter reporter construct) after 24 h of cultivation, which is set to 100.
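The normalization described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names and all intensity values are hypothetical, the ROI measurement itself (done with ImageJ in the method above) is not reproduced, and far fewer cells are used than the 200 to 500 stated in the text.

```python
from statistics import mean


def background_corrected(roi_means, background):
    """Subtract the average image background from each per-cell ROI mean."""
    return [value - background for value in roi_means]


def relative_activity(sample_cells, reference_cells):
    """Mean sample intensity relative to the reference mean, which is set to 100."""
    return 100.0 * mean(sample_cells) / mean(reference_cells)


# Hypothetical per-cell GFP intensities (arbitrary units), not measured data.
reference = background_corrected([1200, 1100, 1300], background=200)
sample = background_corrected([1450, 1550, 1350], background=200)

print(relative_activity(reference, reference))  # reference against itself
print(relative_activity(sample, reference))
```

With these invented numbers the reference evaluates to 100 by construction and the sample to 125, i.e. 25% above the reference.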

Figure 3 summarizes the relative promoter strength at the timepoints of 24 h and 48 h of cultivation for the four different promoter constructs. At the 24 h timepoint the promoter activity of the full-length PaprE promoter from B. licheniformis DSM641 (M609.1B) is lower compared to the truncated version of the PaprE promoter (Bacillus licheniformis strain M609.1A). The promoter activities of both the truncated and full-length PaprE promoters from Bacillus licheniformis DSM13 (Bacillus licheniformis strains M609.2A and M609.2B) are higher compared to the truncated version of the PaprE promoter from Bacillus licheniformis DSM641 (Bacillus licheniformis strain M609.1A), with the full-length promoter PaprE fl. DSM13 showing a promoter activity over 150% compared to the reference PaprE trunc. DSM641. At the 48 h timepoint the PaprE promoters of B. licheniformis DSM641 (both truncated and full-length) as well as the truncated PaprE promoter of B. licheniformis DSM13 show comparable promoter activities, however lower compared to the 24 h timepoint PaprE trunc. DSM641 reference (B. licheniformis M609.1A). Surprisingly, the full-length PaprE promoter of B. licheniformis DSM13 (B. licheniformis strain M609.2B) shows the highest promoter activity of 150% relative to the reference.

Example 2: Calculation of HMM score

2.1 Extraction and alignment of Bacillus species promoters

A translated blast search using tblastn 2.5.0+ (Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., & Madden T.L. (2008) "BLAST+: architecture and applications." BMC Bioinformatics 10:421) was performed using the aprE protein sequence from Bacillus licheniformis (SEQ ID NO. 48) as a query against Genbank and Genbank WGS (Whole Genome Shotgun) databases, with options: -evalue 1e-20, -db_gencode 11, -max_target_seqs 60000. Full GenBank records were retrieved for BLAST hits above minimal protein identity of 40%.
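The identity filter applied to the BLAST hits could look like the following Python sketch. It assumes, which the text does not state, that the search results are available in the standard 12-column BLAST tabular format (-outfmt 6), and it treats the 40% threshold as inclusive; the function name and the example hits are hypothetical.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
OUTFMT6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(tabular_text, min_identity=40.0):
    """Return hits (as dicts) whose percent identity is at least min_identity."""
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        record = dict(zip(OUTFMT6_COLUMNS, row))
        if float(record["pident"]) >= min_identity:
            hits.append(record)
    return hits


# Two invented hits: one above and one below the identity threshold.
example = (
    "aprE_query\tcontigA\t87.5\t379\t47\t0\t1\t379\t1200\t2336\t1e-150\t650\n"
    "aprE_query\tcontigB\t35.2\t300\t190\t4\t10\t309\t500\t1399\t1e-25\t110\n"
)
kept = filter_hits(example)
print([hit["sseqid"] for hit in kept])
```

Only the subject sequences passing the filter would then be used to retrieve the full GenBank records.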

Using BLAST hit location information from the BLAST search results, upstream sequences of aprE-coding genes were extracted, subject to the following conditions:
a) The upstream extraction size was 200 nucleotides. If there was an upstream gene/CDS annotation closer than 200 nucleotides, a shorter fragment was extracted. If the fragment length was less than 50 nucleotides, the fragment was not extracted.
b) Extracted upstream sequences were grouped by BLAST hit bitscore and sorted in descending order by the same bitscore. To avoid bias, identical upstream sequences from the same bitscore group were deduplicated.
c) For each by-bitscore group of upstream sequences, a cumulative multiple alignment was performed (and saved separately) using mafft version 7.307 (Katoh, Standley. "MAFFT multiple sequence alignment software version 7: improvements in performance and usability", Molecular Biology and Evolution 30:772-780, 2013), with the keeplength option. The generated multiple nucleotide alignments were visualized as sequence logos and examined to identify the bitscore threshold at which conservation of the upstream regulatory sequences is most apparent: conserved fragments still have high information content, while non-conserved fragments have low information content.
d) Based on the identified threshold, all upstream sequences with a bitscore above the threshold (SEQ ID NO. 49 to 196) were multiple-aligned using mafft.
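Conditions a) and b) can be sketched in Python as follows. This is a simplified illustration, not the script actually used: it assumes forward-strand genes, 0-based half-open coordinates, and hypothetical function names.

```python
from collections import defaultdict

MAX_UPSTREAM = 200  # nucleotides extracted upstream of the aprE-coding gene
MIN_UPSTREAM = 50   # shorter fragments are not extracted


def upstream_interval(gene_start, prev_feature_end=None):
    """Return (start, end) of the upstream fragment, or None if too short.

    gene_start: index of the first base of the coding sequence.
    prev_feature_end: exclusive end of the nearest upstream gene/CDS
    annotation, if any. Forward strand only (assumption of this sketch).
    """
    start = max(0, gene_start - MAX_UPSTREAM)
    if prev_feature_end is not None:
        start = max(start, prev_feature_end)  # shorten at the annotation
    if gene_start - start < MIN_UPSTREAM:
        return None
    return (start, gene_start)


def group_by_bitscore(records):
    """Group (bitscore, sequence) pairs, dropping duplicates within a group.

    Returns a list of (bitscore, [sequences]) sorted by descending bitscore.
    """
    groups = defaultdict(list)
    for bitscore, sequence in records:
        if sequence not in groups[bitscore]:
            groups[bitscore].append(sequence)
    return sorted(groups.items(), key=lambda item: item[0], reverse=True)
```

For example, a gene starting at position 1000 with an upstream annotation ending at position 900 yields a 100-nucleotide fragment, whereas an annotation ending at position 970 leaves only 30 nucleotides and the fragment is skipped.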

2.2 Hidden Markov model (HMM) creation

Using the multiple alignment file created above, an HMM was built using HMMER 3.1b1 (Wheeler, Travis J, and Sean R Eddy. (2013) "nhmmer: DNA homology search with profile HMMs." Bioinformatics (Oxford, England) vol. 29,19: 2487-9), by running the command: hmmbuild -n PaprE PaprE.hmm {aligned.mfa}. This HMM was then pressed using: hmmpress PaprE.hmm, resulting in a model that can be run over any sequence.

2.3 Sequence extraction

In order to extract the sequence matching the model, the HMMER software can be run using the command: nhmmscan PaprE.hmm {sequence}, where {sequence} represents a FASTA-formatted file containing any DNA sequence. This will output a list of sequences matching the model (given by the start and end of the match), together with an e-value and a score. Calibration of the HMM indicated that any score above a cutoff of 50 is indicative of a match. Using this cutoff to extract matching sequences from a database of over 8000 non-Bacilli genomes, a false discovery rate of zero was confirmed.
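Applying the score cutoff to the nhmmscan results can be sketched as below. The sketch assumes that nhmmscan was additionally run with its --tblout option to obtain a whitespace-delimited table, and that the columns follow the layout documented for HMMER 3.1 DNA searches (alignment start and end in columns 7 and 8, e-value and score in columns 13 and 14); both points are assumptions rather than part of the procedure described above.

```python
SCORE_CUTOFF = 50.0  # scores above this value indicate a match (see text)


def matches_above_cutoff(tblout_text, cutoff=SCORE_CUTOFF):
    """Parse nhmmscan --tblout text and keep hits scoring above the cutoff.

    Assumed column layout (0-based): 0 model name, 2 query sequence name,
    6 alignment start, 7 alignment end, 11 strand, 12 e-value, 13 score.
    """
    matches = []
    for line in tblout_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        cols = line.split(maxsplit=15)
        score = float(cols[13])
        if score > cutoff:
            matches.append({
                "model": cols[0],
                "sequence": cols[2],
                "start": int(cols[6]),
                "end": int(cols[7]),
                "strand": cols[11],
                "evalue": float(cols[12]),
                "score": score,
            })
    return matches
```

Each returned entry gives the start and end of the match on the query sequence, from which the matching promoter sequence can be cut out of the FASTA record.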

Example 3: Explanation of elements found in SEQ ID NO. 6

Figure 1 shows the 5' region (SEQ ID NO. 6) of the B. licheniformis aprE gene coding for the AprE protease (SEQ ID NO. 48). The sigma factor A -10 and -35 motifs are each indicated in boldface. The promoter region is underlined, wherein the 5'UTR between the -1 promoter position and the start codon (SEQ ID NO. 46) is indicated by a curly line. The first, second and third enhancer elements are each enclosed in a box. Optional promoter regions are indicated by oblique typeface.

Figure 2 shows an alignment of the promoters according to SEQ ID NO. 2, 3, 4, 5 and 6. Only the sequence of SEQ ID NO. 3 is spelled out. For all other sequences only the nucleotides that differ at a given position from the corresponding nucleotide of SEQ ID NO. 3 are indicated; "." indicates that the nucleotide is identical to that of SEQ ID NO. 3 at the respective position; "-" indicates that the respective nucleotide is missing.