Title:
IMPROVED SYNTHESIS OF BIOSYNTHETIC PRODUCT BY ORDERED ASSEMBLY OF BIOSYNTHETIC ENZYMES GUIDED BY THE NUCLEOTIDE SEQUENCE MOTIF TEMPLATE
Document Type and Number:
WIPO Patent Application WO/2012/053985
Kind Code:
A1
Abstract:
The invention refers to a method of producing a compound by a biosynthetic pathway by culturing a genetically modified host cell, wherein the cell is modified with one or more nucleic acids comprising (i) a nucleotide sequence encoding chimeric proteins, each composed of a biosynthetic pathway enzyme or other functional polypeptide and a nucleic acid binding factor, and (ii) a program nucleic acid sequence that contains target nucleic acid elements recognized by said nucleic acid binding factors of the chimeric proteins, where the said program nucleic acid sequence provides the arrangement of biosynthetic pathway enzymes or other functional polypeptides along the program nucleic acid sequence, and said culturing provides for synthesis of said chimeric proteins in the genetically modified host cells, resulting in production of said compound. The invention also refers to host cells with a program nucleic acid sequence and to nucleic acid sequences expressing the chimeric proteins. The invention provides methods for improving product synthesis.

Inventors:
JERALA ROMAN (SI)
AVBELJ MONIKA (SI)
BENCINA MOJCA (SI)
MORI JERNEJA (SI)
GABER ROK (SI)
KOPRIVNJAK TOMAZ (SI)
ANDERLUH GREGOR (SI)
VOVK IRENA (SI)
LEBAR TINA (SI)
TURNSEK JERNEJ (SI)
ILC TINA (SI)
TOMSIC NEJC (SI)
STOSICKI TJASA (SI)
ZNIDARIC MATEJ (SI)
BORDON JURE (SI)
PETRONI MATTIA (IT)
GLAVNIK VESNA (SI)
Application Number:
PCT/SI2010/000058
Publication Date:
April 26, 2012
Filing Date:
October 22, 2010
Assignee:
KEMIJSKI INST (SI)
JERALA ROMAN (SI)
AVBELJ MONIKA (SI)
BENCINA MOJCA (SI)
MORI JERNEJA (SI)
GABER ROK (SI)
KOPRIVNJAK TOMAZ (SI)
ANDERLUH GREGOR (SI)
VOVK IRENA (SI)
LEBAR TINA (SI)
TURNSEK JERNEJ (SI)
ILC TINA (SI)
TOMSIC NEJC (SI)
STOSICKI TJASA (SI)
ZNIDARIC MATEJ (SI)
BORDON JURE (SI)
PETRONI MATTIA (IT)
GLAVNIK VESNA (SI)
EN FIST CT ODLICNOSTI (SI)
International Classes:
C12N15/10; C12N15/52; C12N15/62; C12P7/22; C12P17/16
Domestic Patent References:
WO 1998/056904 A1, 1998-12-17
WO 2009/108774 A2, 2009-09-03
WO 2003/068917 A2, 2003-08-21
WO 2002/018617 A2, 2002-03-07
WO 2002/050299 A2, 2002-06-27
WO 2010/003304 A1, 2010-01-14
WO 2006/125000 A2, 2006-11-23
Attorney, Agent or Firm:
ITEM d.o.o. (1000 Ljubljana, SI)
Claims:

1. A method of producing a compound by a biosynthetic pathway by culturing a genetically modified host cell, wherein the cell is modified with one or more nucleic acids comprising:

a) nucleotide sequence encoding at least three to about 100 chimeric proteins wherein each of the said chimeric proteins is composed of at least one biosynthetic pathway enzyme or other functional polypeptide and at least one nucleic acid binding factor, where the said biosynthetic pathway enzyme or other functional polypeptide and nucleic acid binding factor are connected by a linker polypeptide;

b) program nucleic acid sequence, a nucleic acid sequence that contains at least three to about 100 target nucleic acid elements that are each recognized by separate said nucleic acid binding factor from the chimeric polypeptide and where the said program nucleic acid sequence provides arrangement of biosynthetic pathway enzymes or other functional polypeptides along the program nucleic acid sequence;

c) a substrate for the biosynthetic pathway is either present in the host cell or provided to the cell extracellularly and

said culturing providing for synthesis of said chimeric proteins in the genetically modified host cells resulting in a production of said compound.

2. A method of producing a compound by a biosynthetic pathway by mixing one or more compounds comprising:

a) three to about 100 chimeric proteins wherein each of the said chimeric proteins is composed of at least one biosynthetic pathway enzyme or other functional polypeptide and at least one nucleic acid binding factor, where the said biosynthetic pathway enzyme or other functional polypeptide and nucleic acid binding factor are connected by a linker polypeptide;

b) program nucleic acid sequence, a nucleic acid sequence that contains at least three to about 100 target nucleic acid elements that are each recognized by the separate said nucleic acid binding factor from the chimeric protein and where the said program nucleic acid sequence provides arrangement of biosynthetic pathway enzymes or other functional polypeptides along the program nucleic acid sequence;

c) a substrate for biosynthetic pathway and cofactors for enzymes are provided to the mixture or are generated in the mixture, where the said program nucleic acid sequence can be either in solution or immobilized to another compound or solid phase.

3. The method according to any claim from 1 to 2 wherein the said program nucleic acid sequence is composed of target nucleic acid elements and spacer sequences separating target nucleic acid elements and the said target nucleic acid elements are of defined nucleotide sequence defined by recognition motif of nucleic acid binding factors; and spacer sequence is of any length preferentially from 1 to 50 nucleotides.

4. The method according to any claim from 1 to 3 wherein said nucleotide sequence of each target nucleic acid element is selected based on each motif recognized by nucleic acid binding factors used in the method as carrier of biosynthetic pathway enzyme or other functional polypeptide; and

each target nucleic acid element nucleotide motif is used once or many times, typically from 1 to 16; and

three or more different target nucleic acid element nucleotide motifs are used on a program nucleic acid sequence, typically 3 to 16, preferentially 3 to 7, and

depending on the state of oligomerization or the number of catalytic steps in the biosynthetic pathway.

5. The method according to any claim from 1 to 4 wherein said sequence motif of a target nucleic acid element is of any size above 4 nucleotides, optionally comprises from 4 to approximately 30 nucleotides, and the said nucleic acid binding factor recognizes at each position of the nucleic acid element a single nucleotide or any combination of two, three or four nucleotides and each target nucleic acid element is positioned on a program nucleic acid sequence as one or more copies and in the order that defines the spatial arrangement of chimeric proteins in relation to each other.

6. The method according to any claim from 1 to 5 wherein said biosynthetic pathway enzyme or other functional polypeptide is covalently linked via linker peptide from one to approximately 100 amino acid residues to a nucleic acid binding factor of natural or artificial design and any origin with characteristics of binding to a specific nucleic acid element; and

the nucleic acid binding factor is selected from but not limited to: helix-turn-helix nucleic acid binding domain, zinc finger nucleic acid binding domain, leucine zipper nucleic acid binding domain, winged helix nucleic acid binding domain, winged helix-turn-helix nucleic acid binding domain, helix-loop-helix nucleic acid binding domain, HMG-box nucleic acid binding domain, inactive restriction endonucleases, transcription factors.

7. The method according to any claim from 1 to 6 wherein the said program nucleic acid sequence is used as such or optionally cloned into host cell appropriate vector promoting self replication.

8. Nucleic acid for chimeric polypeptides composed of biosynthetic pathway enzymes or other functional polypeptides and binding factors used in a biosynthetic method according to any claim from 1 to 6 and said nucleic acid is cloned into the appropriate vector and functionally linked to regulating sequences promoting expression of chimeric proteins, and vector and regulating sequences are suitable for expression in host cell.

9. Genetically engineered host cell comprising program nucleic acid sequence according to any claim from 3 to 5 and a host cell is selected from prokaryotic and eukaryotic cell.

10. Genetically engineered host cell comprising one or more biosynthetic pathway enzymes or other functional polypeptides according to claim 6 and a host cell is selected from prokaryotic and eukaryotic cell.

11. Genetically engineered host cell comprising program nucleic acid sequence according to any claim from 3 to 5 and one or more functional polypeptide according to claim 6 and a host cell is selected from prokaryotic and eukaryotic cell.

12. Method according to any claim from 1 to 2 wherein biosynthetic pathway, catabolic or anabolic pathway, is synthesis of primary metabolites such as, but not limited to: amino acids, fatty acids, carbohydrates, pyrimidines, purines, citric acid, itaconic acid, ethanol, glycerol, methanol, butanol, propanol, isoprenoids, higher alcohols.

13. Method according to any claim from 1 to 2 wherein biosynthetic pathway is synthesis of secondary metabolites, such as, but not limited to: polyketides, nonribosomal peptides, hormones, terpenoids, antioxidants, pigments.

14. Method according to any claim from 1 to 2 wherein biosynthetic pathway is synthesis of carotenoids.

15. Method according to any claim from 1 to 2 wherein biosynthetic pathway is synthesis of violacein.

16. Method according to any claim from 1 to 2 wherein biosynthetic pathway is synthesis of resveratrol or methylated resveratrol.

Description:
Title:

Improved synthesis of biosynthetic product by ordered assembly of biosynthetic enzymes guided by the nucleotide sequence motif template

Field of the invention:

The field of the invention is improved product synthesis achieved by the ordered assembly of chimeric proteins, the said chimeric proteins being composed of biosynthetic enzymes or other functional polypeptides linked to nucleic acid binding factors, whereas the ordered assembly of chimeric proteins is based on binding to nucleic acid recognition motifs on the program nucleic acid sequence. The invention is a biotechnological invention.

State of the art:

For industrial applications, biosynthetic pathways composed of several enzymes and other proteins are engineered to achieve a high yield of desired biosynthetic products. Improving the efficiency of biosynthetic pathways to yield more reaction product faster has been of great interest. Various strategies for optimization have been undertaken so far. For example, the yield of the end product of a biosynthetic reaction has been improved by (i) increasing the pool of available substrate and/or overexpressing the enzymes of the limiting biosynthetic steps; (ii) introducing heterologous enzymes with preferred kinetic characteristics; (iii) blocking branching of the biosynthetic pathway; (iv) compartmentalizing biosynthetic pathways by directing enzymes of a particular biosynthetic pathway to specific cell compartments or artificially made compartments (e.g. metabolosomes); or (v) increasing the proximity of enzymes by assembling metabolic pathways on a protein scaffold (WO 2009/108774). While there are many advantages of a protein-based scaffold, this approach is limited by the number of available combinations of docking peptides and does not offer control of the spatial distribution and orientation of biosynthetic enzymes to optimally support the biosynthetic pathway.

None of the above-mentioned solutions guarantees an optimal arrangement of the enzymes of a biosynthetic pathway to provide the desired order of biosynthetic reactions. In living cells, biosynthetic pathway enzymes or other functional polypeptides are often confined to a microlocation through multi-protein complexes or anchoring mechanisms. This type of organization increases the local concentration of enzymes and the efficacy of biosynthetic pathways. To achieve an optimal concentration of individual enzymes and promote formation of the full multi-protein complex, two approaches have been adopted: (i) directing enzymes to a microlocation, e.g. targeting organelles; and (ii) forming a multi-protein complex on a polypeptide scaffold (WO 2009/108774).

However, the protein scaffold has many disadvantages and limitations.

Even though the number and distribution of enzymes in a multi-protein complex could be programmed with the sequence of a polypeptide backbone, three dimensional arrangement of polypeptides is unpredictable due to the flexibility of and between dimerization domains (Figure 1 and 2). In addition, designing of the polypeptide backbone with the scaffold guided protein domains can be hard due to the limited number of available protein dimerization domains. Additionally each protein dimerization domain has different conditions under which it folds and forms the functional interaction.

Summary of the invention:

The problem of ordering the biosynthetic pathway into ordered sequence of biosynthetic enzymes, which is not solved by protein dimerization domain, is solved by a nucleic acid sequence motif template also named program nucleic acid sequence that guides the order of chimeric proteins, biosynthetic enzymes or other functional polypeptides linked to nucleic acid binding factors.

The inventors discovered that the three dimensional order of biosynthetic enzymes or other functional polypeptides can be manipulated by a program nucleic acid. The inventors designed chimeric polypeptides which are comprised of a nucleic acid binding domain of a nucleic acid binding factor and a biosynthetic enzyme or other functional polypeptide. The said chimeric proteins bind to the target nucleic acid motif via the nucleic acid binding factor. The program nucleic acid sequence determines the order of the target nucleic acid motifs and, hence, the sequence and spatial arrangement of biosynthetic enzymes or other functional polypeptides. Such an ordered biosynthetic pathway of chimeric biosynthetic enzymes or other functional polypeptides guided by the program nucleic acid sequence increases the yield of the biosynthetic product of the pathway in comparison to the protein scaffold, which simply clusters the biosynthetic enzymes without any specified order. Use of a program nucleic acid sequence to immobilize and guide the order of chimeric biosynthetic pathway enzymes or other functional polypeptides provides for one or more of the following: i) increases the efficiency of pathway flux, ii) optimizes metabolic flux through the pathway, iii) reduces the metabolic burden on the host cell, iv) provides the possibility of alternative branching of metabolic pathways yielding novel products of the biosynthetic process, v) allows reprogramming of the chimeric enzymes within the organism by introducing a different program nucleic acid.

The program nucleic acid sequence carries more than two target nucleic acid elements; these elements are specific ligands for nucleic acid binding factors. When the program nucleic acid sequence is present in a solution with chimeric proteins, the chimeric proteins bind to target elements within the program sequence, form a nucleic acid-protein complex and facilitate formation of a multi-protein complex. The order of chimeric proteins in such a multi-protein complex is predictable because: i) the positioning of target nucleic acid motifs is defined by the nucleic acid sequence, owing to the known three dimensional structure of nucleic acids; ii) because the order of nucleic acid motifs is predictable, it promotes ordered positioning of chimeric proteins; iii) such ordered positioning of chimeric proteins facilitates the order of biosynthetic reactions defined by the program nucleic acid sequence and therefore minimizes the time for diffusion of reaction intermediates between different biosynthetic reaction steps. The inventors came to the conclusion that it is beneficial to have a large number of different nucleotide binding domains to assemble complex reaction pathways. For example, there are 262144 (4^9) different combinations of nucleotide motifs consisting of nine nucleotides, which can be recognized by different nucleotide binding domains, allowing extremely high variability to assemble different biosynthetic pathways.
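The figure of 262144 follows from four possible bases at each of nine positions. A minimal check in Python; the helper name `count_motifs` is introduced here only for illustration and does not appear in the application:

```python
# Count the distinct nucleotide motifs of a given length over the bases A, C, G, T.
# `count_motifs` is a hypothetical helper used only for this illustration.
def count_motifs(length: int, alphabet: str = "ACGT") -> int:
    # Each position can independently hold any base of the alphabet.
    return len(alphabet) ** length


print(count_motifs(9))  # 262144
```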

The present invention has several advantages over protein scaffolding. The program nucleic acid sequence has no maturation problems and the ordered nucleotide binding motif can be selected at will. Docking of the anchoring nucleic acid binding factor to the target nucleic acid element is well characterized. Due to the close proximity of chimeric proteins bound to the program nucleic acid sequence, other polypeptides that might redirect synthesis are spatially excluded from the multi-protein complex. Binding of all components of the biosynthetic pathway proceeds under the same type of reaction conditions, since all nucleic acid binding domains used for the construction of chimeric proteins can be based on the same protein fold and interact with the nucleotides using the same type of interactions. This is in contrast to the protein scaffold, where different types of protein domain dimers have to be used, which may fold correctly and interact optimally only under different conditions. The present invention has the further advantage that it can guide production of different reaction products depending on the sequence of the program nucleic acid that is introduced into production cells or reaction media, where a different program DNA assembles the biosynthetic pathway enzymes differently.

The present invention presents a method of producing a compound by a biosynthetic pathway. The method includes culturing genetically modified host cells, which are modified to express chimeric proteins (biosynthetic pathway enzymes or other functional polypeptides linked to nucleic acid binding factors) and to carry a program nucleic acid sequence which directs the ordered arrangement of the chimeric proteins into a multi-protein biosynthetic complex.

The present invention refers to processes that include at least three and up to approximately 100 biosynthetic reaction steps. Three steps is the minimal size at which the order of reaction steps can differ and the ordered arrangement described in this invention has an effect. With three steps, for example, the reaction steps can be ordered 1, 2, 3 as opposed to the different orders 1, 3, 2 or 3, 1, 2.
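One way to read the three-step minimum is sketched below. It rests on an assumption that the application does not state: on a linear template, an order and its mirror image place the same enzymes next to each other, so the two are counted as one arrangement. Under that assumption two steps give a single arrangement, while three steps give three, matching the orders 1, 2, 3; 1, 3, 2; and 3, 1, 2 (the mirror image of 2, 1, 3) named above. The function name `distinct_orders` is hypothetical.

```python
from itertools import permutations


def distinct_orders(n_steps: int) -> list[tuple[int, ...]]:
    """List orderings of steps 1..n_steps, counting an order and its reverse once.

    Treating mirror images as equivalent is an assumption made for this sketch.
    """
    seen = set()
    result = []
    for order in permutations(range(1, n_steps + 1)):
        if order[::-1] in seen:
            continue  # the mirror image of this order was already recorded
        seen.add(order)
        result.append(order)
    return result


for n in (2, 3, 4):
    print(n, len(distinct_orders(n)), distinct_orders(n)[:3])
```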

In another embodiment of invention the method includes mixing produced chimeric proteins, biosynthetic pathway enzymes or other functional polypeptides linked to nucleic acid binding factors produced in host cells and isolated from them and program nucleic acid sequence, which directs order of chimeric proteins into a multi-protein biosynthetic complex. In this way the biosynthetic pathway is assembled in vitro, either in solution or with program DNA immobilized to another molecule or to the solid phase. In order to perform the biosynthetic reaction the in vitro assembled complex has to be incubated under conditions that support biosynthesis with the addition of required substrates and cofactors required for biosynthetic reaction or enzyme system that generates the required cofactors.

The present invention provides description of program nucleic acid design for in vivo and in vitro use. It provides genetically modified host cells comprising a program nucleic acid sequence and nucleic acid sequence encoding chimeric proteins, biosynthetic enzymes or other functional polypeptides covalently linked to anchoring nucleic acid binding factor that bind to specific target nucleic acid elements coded in the program nucleic acid sequence. The present invention provides nucleic acids comprising program nucleic acid sequence for use in a method for improving yield of biosynthetic products.

The present invention enables design of bioprocesses towards specific end-products by excluding other polypeptides that might redirect synthesis to undesired products, due to the close proximity of functional groups of chimeric protein - bound to program nucleic acid.

The invention refers to genetically engineered host cells expressing chimeric proteins and/or replicating program nucleic acid sequence.

The invention also refers to the method which is used for producing a compound of any biosynthetic pathway that requires close proximity of biosynthetic enzymes to synthesize named compound.

The invention also refers to the processes such as information processing, chemical degradation, signaling, bio-sensing.

Brief description of the figures:

Figure 1. Schematic overview of benefits in using program nucleic acid sequence.

Figure 2. Effect of spacer sequence between target nucleic acid elements/motifs.

Figure 3. Chimeric proteins, biosynthetic enzymes linked to DNA binding factors, bind to the target nucleic acid motif. Each tested DNA binding factor, Znf_Zif268, Znf_Blues, Znf_PBSII, Znf_HivC and NicTAL (Table 1), binds specifically to its target nucleic acid motif and represses expression of a reporter protein, β-galactosidase. [A] Znf_Zif268, Znf_Blues, Znf_PBSII, Znf_HivC and NicTAL (Table 1) were tested. [B] Nucleic acid binding factors specifically recognize target nucleic acid motifs. The specific target nucleic acid motif was exchanged with a target nucleic acid motif specific for a different nucleic acid binding factor. Reporter protein β-galactosidase was not repressed when an inappropriate target nucleic acid motif was used.

Figure 4. In vitro binding of chimeric proteins to program nucleic acid. [A] Hybridization of program nucleic acid was performed by injecting 0.5-2 μM for up to 300 s to reach a final response of 300 RU. The binding of chimeric proteins was observed by injecting various concentrations of chimeric proteins for 1 min, followed by a 5 min dissociation step. Regeneration was achieved with two 30 s injections of 50 mM NaOH and a 24 s injection of 0.5 % SDS. [B] Chimeric proteins from top: Znf_Gli1, Znf_Zif268, Znf_Blues, Znf_HivC, Znf_PBSII, Znf_Jazz. [C] DNA was captured at the beginning of the cycle and the chimeric proteins were injected for 1 minute one after another in 2 different sequences: solid line: Znf_Jazz, Znf_Blues, Znf_Zif268, Znf_PBSII, Znf_HivC and Znf_Gli1; dashed line: Znf_Gli1, Znf_PBSII, Znf_HivC, Znf_Jazz, Znf_Blues and Znf_Zif268. The binding of all chimeric proteins in sequence showed that the binding of chimeric proteins was weaker if Znf_Gli1 was injected first.

Figure 5. In vitro binding of nucleic acid binding factors on a program nucleic acid sequence. Binding of individual nucleic acid binding factors, Znf_Jazz, Znf_Blues, Znf_Zif268, Znf_PBSII, Znf_HivC and Znf_Gli1, was determined by mobility shift assay on agarose gel. Arrows indicate the position of the protein-DNA complex.

Figure 6. In vivo reconstitution of two non-functional Split-GFPs linked to nucleic acid binding factors in the presence of program nucleic acid sequence. [A] Schematic presentation of reconstitution of Split-GFPs. [B] Plasmids, 75 ng each, carrying His_tag-Znf_Gli1-linker peptide-nCFP and cCFP-linker peptide-Znf_HIVC-His_tag were cotransfected with 350 ng of program nucleic acid (Program SPR, split, FRET; Table 1). Emission of CFP at 475 nm was detected after excitation at 433 nm. [C] Plasmids, 75 ng each, carrying split YFP chimeric proteins (His_tag-Znf_PBSII_linker peptide-nYFP and cYFP-linker peptide-Znf_Zif268-linker peptide-His_tag) were cotransfected with 350 ng of plasmid encoding program nucleic acid (Program SPR, split, FRET; Table 1). YFP emission signal at 529 nm was measured after excitation at 513 nm under a confocal microscope.

Figure 7. In vitro reconstitution of two non-functional Split-GFPs linked to nucleic acid binding factors in the presence of program nucleic acid sequence. Lysates of HEK293 cells transfected with plasmids encoding either His_tag-Znf_Gli1-linker peptide-nCFP and cCFP-linker peptide-Znf_HIVC-His_tag or His_tag-Znf_PBSII_linker peptide-nYFP and cYFP-linker peptide-Znf_Zif268-linker peptide-His_tag were mixed with 50 μg of program nucleic acid (Program SPR, split, FRET; Table 1). After 18 h incubation at 4 °C, fluorescence spectra for CFP and YFP were obtained. Emission peaks are indicated with arrows.

Figure 8. In vivo reconstitution of two pairs of split fluorescent proteins (split CFP and split YFP) determined with the FRET method. HEK293 cells were cotransfected with plasmids encoding His_tag-Znf_Gli1-linker peptide-nCFP, cCFP-linker peptide-Znf_HIVC-His_tag, His_tag-Znf_PBSII_linker peptide-nYFP, cYFP-linker peptide-Znf_Zif268-linker peptide-His_tag and a plasmid encoding program nucleic acid (Program SPR, split, FRET; Table 1). [A. Top left] Image of reconstituted CFP from split CFPs. Excitation at 453 nm, emission from 470 to 510. [A. Top right] Image of reconstituted YFP from split YFPs. Excitation at 415 nm, emission from 525 to 560. [A. Bottom left] Overlay of CFP and YFP images from above. [A. Bottom right] The same cells stained with Hoechst 34580 dye. Arrows indicate FRET positive cells, proving colocalisation of all four split GFP fusions within the cell's nucleus. [B. Top left] Image of donor, reconstituted CFP before photobleaching. [B. Top right] Image of acceptor, reconstituted mCitrine after photobleaching. Bleached area is marked with an arrow. [B. Bottom left] Image of donor, reconstituted CFP after photobleaching. Increased fluorescence of donor after acceptor photobleaching is marked with an arrow.

Figure 9. Biosynthesis of violacein in the presence of program nucleic acid. Overnight cultures of E. coli contained plasmids encoding chimeric proteins Znf_Blues-linker peptide-vioA, Znf_Zif268-linker peptide-vioB, Znf_PBSII-linker peptide-vioE, Znf_HivC-linker peptide-vioD, Znf_Gli1-linker peptide-vioC (see Table 1), with or without a plasmid encoding program nucleic acid (see Table 1, program biosynthesis (123456) or program biosynthesis (341256)). The amount of violacein was quantified at 17.5 h after extraction and analysis by TLC. The amount of violacein produced in the presence of program nucleic acid (program biosynthesis 123456) was much higher than when program nucleic acid (program biosynthesis 341256) or no program nucleic acid was present in E. coli.

Detailed description of the invention:

Definitions:

The term "program /nucleic acid/ sequence", used herein, refers to a nucleic acid which contains target nucleic acid sequences of any length and any number recognized by nucleic acid binding factors. These target nucleic acid sequences are separated by spacer nucleic acid sequences of any length. The program nucleic acid sequence can be used as such or inserted into a suitable vector to be introduced into and amplified in a host cell.

The terms "target /nucleic acid/ sequence" and "target /nucleic acid/ element/motif", used herein, refer to a nucleic acid sequence of any length that is a substrate for a binding factor.
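The relationship between a program sequence and its target elements can be sketched as a string assembled from motifs joined by spacers. Everything in this sketch is illustrative: the motif sequences are hypothetical placeholders rather than binding sites of any real binding factor, the two-nucleotide spacer is arbitrary, and the function names are not taken from the application.

```python
# Hypothetical placeholder motifs, keyed by the reaction step whose chimeric
# protein is assumed to recognise them. These are NOT real binding sites.
TARGET_MOTIFS = {
    1: "AAACCCGGG",
    2: "TTTGGGCCC",
    3: "ACGTACGTA",
}


def build_program(order, motifs, spacer="AT"):
    """Join target motifs in the given order, separated by a fixed spacer."""
    return spacer.join(motifs[step] for step in order)


def motif_positions(program, motifs):
    """Return (start_index, step) pairs for every motif hit, sorted by position."""
    hits = []
    for step, motif in motifs.items():
        start = program.find(motif)
        while start != -1:
            hits.append((start, step))
            start = program.find(motif, start + 1)
    return sorted(hits)


program = build_program([1, 2, 3], TARGET_MOTIFS)
print(program)
print(motif_positions(program, TARGET_MOTIFS))
```

Reading the hits from left to right recovers the order in which the corresponding chimeric proteins would be expected to line up along the program sequence; passing a different `order` yields a different program for the same set of proteins.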

The term "/nucleic acid/ binding factor", used herein, refers to any molecule with the ability to bind a nucleic acid molecule. The binding factor can be of natural origin or artificially designed, a whole protein or only a segment with the characteristic of binding to nucleic acid in a sequence specific manner. In this invention the nucleic acid binding factor is used as a carrier for an enzyme of a biosynthetic pathway or any other functional polypeptide. The nucleic acid binding molecule is connected to the enzyme molecule via a chemical bond. This is achieved via a linker peptide. The nucleic acid binding factor binds to one or more specific sites on the program nucleic acid sequence and ensures a high local concentration and proper spatial positioning of the enzymes of the biosynthetic pathway.

The term "spacer /nucleic acid/ sequence", used herein, refers to nucleic acid sequence of any length that separates target nucleic acid sequences.
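Spacer length affects both how far apart and at what rotational offset neighbouring target elements sit on a double helix. The sketch below assumes idealised B-form double-stranded DNA with textbook values of about 0.34 nm rise per base pair and about 10.5 base pairs per helical turn; these numbers, the nine-nucleotide motif length, and the function name are assumptions for illustration and are not taken from the application.

```python
RISE_NM_PER_BP = 0.34  # assumed textbook rise per base pair, idealised B-DNA
BP_PER_TURN = 10.5     # assumed textbook base pairs per helical turn, idealised B-DNA


def relative_placement(motif_len: int, spacer_len: int) -> tuple[float, float]:
    """Axial distance (nm) and rotation (degrees) between the starts of two
    adjacent target elements separated by one spacer."""
    offset_bp = motif_len + spacer_len
    distance_nm = offset_bp * RISE_NM_PER_BP
    rotation_deg = (offset_bp * 360.0 / BP_PER_TURN) % 360.0
    return distance_nm, rotation_deg


for spacer in (1, 2, 12):
    distance, rotation = relative_placement(9, spacer)
    print(f"spacer={spacer:2d} nt  distance={distance:.2f} nm  rotation={rotation:.0f} deg")
```

Under these assumptions, a spacer that brings the start-to-start offset to a whole number of helical turns places neighbouring proteins on the same face of the helix, while other spacer lengths rotate them relative to each other.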

The term "nucleic acids", used herein, refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxynucleotides, and includes but is not limited to single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, or a polymer with a phosphorothioate backbone, comprising purine and pyrimidine bases or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.

The terms "polypeptide", "protein", "peptide", used herein, refer to a polymeric form of amino acids of any length, which can include coded and non-coded amino acids, chemically or biochemically modified or derivatized amino acids, and polypeptides having modified peptide backbones.

The term "functional polypeptide", used herein, refers to a polymeric form of amino acids of any length, which expresses a function of any kind such as formation of structure, directing to a specific location, targeting organelles, facilitating and executing a chemical reaction, or binding to another functional polypeptide.

The term "biosynthetic pathway enzyme", used herein, refers to a polymeric form of amino acids of any length, which expresses a function of any kind such as formation of a new chemical bond.

The term "chimeric protein", used herein, has a general meaning and in the description refers to a polymeric form of amino acids of any length, composed of more than one protein/domain/segment, optionally linked to each other with a linker of any length, preferentially containing from one to 40 amino acids, where at least one protein/domain/segment is a nucleotide binding factor, whole or binding domain, and the other is a biosynthetic pathway enzyme or other functional polypeptide, whole or activity domain.

The term "heterologous", used herein in the context of a genetically modified host cell, refers to a polypeptide wherein at least one of the following is true: (a) the polypeptide is foreign ("exogenous") to (i.e., not naturally found in) the host cell; (b) the polypeptide is naturally found in (e.g., is "endogenous to") a given host microorganism or host cell but is either produced in an unnatural (e.g., greater than expected or greater than naturally found) amount in the cell, or differs in nucleotide sequence from the endogenous nucleotide sequence such that the same encoded protein (having the same or substantially the same amino acid sequence) as found endogenously is produced in an unnatural (e.g., greater than expected or greater than naturally found) amount in the cell.

The term "homologous", used herein, refers to proteins or nucleic acid with well preserved amino acid or nucleotide sequences, preferably with at least 50% conservation, with a minimum of 20% conservation, determined by protein or nucleic acid alignment techniques, known to experts in the field. Homologous proteins are characterized by performing the same function in the cell. Homologous nucleic acids are coding for homologous proteins.
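The text does not define how percent conservation is computed. One common convention, used here purely as an assumption, is percent identity over the columns of a pairwise alignment in which neither sequence has a gap. The sketch takes two already aligned sequences (with `-` marking gaps) and does not perform the alignment itself; the function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of non-gap alignment columns in which both sequences match.

    Assumes the inputs are already aligned to equal length, with '-' as the gap symbol.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Made-up aligned fragments: 4 of the 5 gap-free columns are identical.
print(percent_identity("ACGT-A", "ACCTTA"))  # 80.0
```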

The term "recombinant", used herein, means that a particular nucleic acid (DNA or RNA) is the product of various combinations of cloning, restriction, and/or ligation steps resulting in a construct having a structural coding or non-coding sequence distinguishable from endogenous nucleic acids found in natural systems. Generally, DNA sequences encoding the structural coding sequence can be assembled from cDNA fragments and short oligonucleotide linkers, or from a series of synthetic oligonucleotides, to provide a synthetic nucleic acid which is capable of being expressed from a recombinant transcriptional unit contained in a cell or in a cell-free transcription and translation system. Such sequences can be provided in the form of an open reading frame uninterrupted by internal non-translated sequences, or introns, which are typically present in eukaryotic genes. Genomic DNA comprising the relevant sequences can also be used in the formation of a recombinant gene or transcriptional unit. Sequences of non-translated DNA may be present 5' or 3' from the open reading frame, where such sequences do not interfere with manipulation or expression of the coding regions, and may indeed act to modulate production of a desired product by various mechanisms (see "DNA regulatory sequences", below).

The term "host cell", used herein, denotes an in vivo or in vitro eukaryotic cell, a prokaryotic cell, or a cell from a multicellular organism (e.g., a cell line) cultured as a unicellular entity, which eukaryotic or prokaryotic cells can be, or have been, used as recipients for a nucleic acid (e.g., an expression vector that comprises a nucleotide sequence encoding one or more biosynthetic pathway gene products such as mevalonate pathway gene products), and include the progeny of the original cell which has been genetically modified by the nucleic acid. It is understood that the progeny of a single cell may not necessarily be completely identical in morphology or in genomic or total DNA complement as the original parent, due to natural, accidental, or deliberate mutation. A "genetically modified host cell" (also referred to as a "recombinant host cell") is a host cell into which has been introduced a heterologous nucleic acid, e.g., an expression vector. For example, a prokaryotic host cell is a genetically modified prokaryotic host cell (e.g., a bacterium), by virtue of introduction into a suitable prokaryotic host cell of a heterologous nucleic acid, e.g., an exogenous nucleic acid that is foreign to (not normally found in nature in) the prokaryotic host cell, or a recombinant nucleic acid that is not normally found in the prokaryotic host cell; and a eukaryotic host cell is a genetically modified eukaryotic host cell, by virtue of introduction into a suitable eukaryotic host cell of a heterologous nucleic acid, e.g., an exogenous nucleic acid that is foreign to the eukaryotic host cell, or a recombinant nucleic acid that is not normally found in the eukaryotic host cell.

The term "biosynthetic pathway", used herein, refers to sequence of enzymatic or other reactions by which one compound is converted to another by making new covalent bonds in organisms or in vitro.

The term "in vitro", used herein, refers to a procedure which is performed not in a living organism or cell but in a controlled environment.

The term "linker peptide" refers to shorter amino acid sequences, whose role could be only to separate the individual domains of the fusion protein. The role of the linker peptide in the fusion protein, inclusion of which is optional, may also be the introduction of the splitting site or for posttranslational modifications, including the introduction of sites for improved processing of antigens. The length of the linker peptide is not restricted; however, it is usually up to 30 amino acids long.

In general, the heterologous nucleic acid is inserted into expression vector. Suitable vectors include, but are not limited to: plasmids, viral vectors, and others. Expression vectors, compatible with host organism cells are well known to experts in the field and contain appropriate control elements for transcription and translation of nucleic acids. Typically, the expression vector includes an antibiotic resistance cassette, a sequence for a chimeric protein under suitable promoter for guiding expression in host cells, polyadenylation signal and transcriptional terminator.

Method for producing a biosynthetic product/precursor.

The present invention provides method of producing a product or a precursor of a biosynthetic pathway in a genetically engineered host cell or in vitro. The method generally involves culturing the genetically engineered host cell under suitable conditions, under which genetically engineered host comprises: a) nucleic acids comprising nucleotide sequences encoding at least three or more binding factors linked to biosynthetic pathway enzymes or other functional polypeptides, and b) a nucleic acid comprising one or more program nucleic acid sequences.

In other embodiment of invention, the method generally involves formation of a product in vitro. In vitro method comprises: a) at least three or more binding factors which are linked to a chimeric biosynthetic pathway enzymes or other functional polypeptides, and b) a nucleic acid comprising one or more program nucleic acid sequences, c) a substrate for the first enzyme and cofactors for enzymes are provided to the mixture.

A binding factor binds to the target nucleic acid element within the program nucleic acid sequence with sufficient association, providing association of chimeric biosynthetic pathway functional polypeptides with program nucleic acid sequence. The association of the chimeric biosynthetic pathway functional polypeptides with the program nucleic acid sequence is of sufficient affinity that the chimeric biosynthetic pathway enzymes or other functional polypeptides are immobilized on a program nucleic acid sequence. The program nucleic acid sequence dictates the order of adjacent binding factors and therefore the sequential order of adjacent chimeric enzymes or other functional polypeptides. The order of target nucleic acid elements with bound functional polypeptides can be arranged in the same order as they act in a particular biosynthetic reaction. Alternatively, the order of adjacent target nucleic acid elements with corresponding functional polypeptides can be changed by changing the sequence of adjacent target sequences within the program nucleic acid sequence, thereby providing an alternative order of adjacent biosynthetic enzymes or other functional polypeptides. The host cell is cultured in such a way that a substrate for the first enzyme is: a) present in the cell, or b) provided to the cell extracellularly. Either a) the biosynthetic pathway enzymes are synthesized in the cell and convert the substrate into a product, or b) the biosynthetic enzymes are synthesized in the cell and released from the cell, and the substrate is converted to a product extracellularly.

Use of program nucleic acid sequence to immobilize biosynthetic pathway functional polypeptides provides for one or more of the following: increased efficiency of the pathway and optimized metabolic flux; reduced metabolic burden and reduced concentration of potentially toxic free intermediates in the cytosol; and alternative branching of the metabolic pathways, yielding novel products of the biosynthetic process. Balance of enzyme activity levels is achieved through the use of program nucleic acid sequence. For example, the sequence of the nucleic acid program can be changed in such a way that it includes multiple copies of target nucleic acid elements for chimeric polypeptides which have lower activity towards conversion of substrate, resulting in a higher copy number of the functional polypeptides with lower activity than the copy number of the functional polypeptide with higher activity. This can be advantageous where the lower activity functional polypeptide catalyzes a rate-limiting step in the biosynthetic pathway. Use of program nucleic acid sequence increases efficiency and optimizes pathway flux through increasing proximity of enzymes involved in the biosynthetic pathway; because of increased efficiency and optimized flux, equivalent or higher yields of product can be achieved with lower levels of enzyme. Lower levels of enzyme production in a host cell are advantageous, as they place less of a metabolic burden on the host cell. Because the turnover of biosynthetic pathway intermediates is more efficient with use of a program nucleic acid sequence, the amount/concentration of pathway intermediates free in the cytosol or cytoplasm is reduced. Reduced levels of the free pathway intermediates are advantageous where such intermediates are toxic for the host cell.

In some embodiments at least three chimeric biosynthetic pathway enzymes or other functional polypeptides are immobilized onto nucleic acid program. The first chimeric functional enzyme or polypeptide produces a first product that is a substrate for the second chimeric enzyme or other functional polypeptide. Chimeric proteins, biosynthetic pathway enzymes or other functional polypeptides are positioned in close proximity of another. In this way, the effective concentration of the first product is high and the second chimeric biosynthetic pathway enzyme can act efficiently on the first product.

Three or more (e.g. three, four, five, six, seven or more) functional polypeptides can be immobilized onto the program nucleic acid sequence. For example, in some embodiments, a program nucleic acid sequence includes (from 5' to 3'), a) one copy of a target nucleic acid element for the first chimeric polypeptide of biosynthetic pathway enzyme or other functional polypeptide, b) one copy of a target nucleic acid element for the second chimeric biosynthetic pathway functional polypeptide, and c) one copy of a target nucleic acid element for the third chimeric biosynthetic pathway functional polypeptide. In other embodiments, a program nucleic acid sequence includes (from 5' to 3'), a) one copy of a target sequence for the first chimeric biosynthetic pathway functional polypeptide, and b) two or more (e.g. two, three, four, or more) copies of a target nucleic acid element for the second chimeric biosynthetic pathway enzyme or other functional polypeptide. In this way, the ratio of any given chimeric functional polypeptide in a biosynthetic pathway can be varied. For example, the ratio of a first chimeric biosynthetic pathway functional polypeptide to a second chimeric biosynthetic pathway functional polypeptide can vary from about 0.1:10 to about 10:0.1, and the ratio of the first to the third chimeric protein from about 0.1:10 to about 10:0.1, etc.

In some embodiments, three or more (e.g. four, five, six, seven, or more) chimeric biosynthetic pathway enzymes are immobilized on a nucleic acid program. The first chimeric biosynthetic pathway enzyme produces a first product that is a substrate for the second chimeric biosynthetic pathway enzyme, which produces a second product that is a substrate for the third chimeric biosynthetic pathway enzyme. In these embodiments, the order and the copy number of chimeric biosynthetic pathway enzymes is dictated by the sequence of the target nucleic acid sequences for the particular chimeric biosynthetic pathway enzymes in a nucleic acid program.

Design of program nucleic acid sequence.

Program nucleic acid sequence is designed to organize biosynthetic pathway enzymes into a functional complex. Program nucleic acid sequence comprises three or more target nucleic acid elements for binding of nucleic acid binding factors. Binding of a nucleic acid binding factor, which is linked to a chimeric biosynthetic pathway enzyme, to the target nucleic acid element provides for immobilization of the enzyme on the program nucleic acid sequence. Each target nucleic acid element has a corresponding nucleic acid binding partner in a chimeric biosynthetic pathway enzyme or a functional polypeptide. A given target nucleic acid element can be immediately adjacent to another target nucleic acid element, or can be separated from an adjacent target nucleic acid element through a spacer nucleic acid sequence. In in vivo embodiments the program nucleic acid sequence can be introduced to a variety of different types of host cells in different forms. Examples of different forms of program nucleic acid sequence are, but are not limited to, baculovirus vectors, bacteriophage vectors, plasmids, phagemids, cosmids, fosmids, bacterial artificial chromosomes, viral vectors (for example, but not limited to, viral vectors based on vaccinia virus, poliovirus, adenovirus, adeno-associated virus, SV40, herpes simplex virus, and the like), P1-based artificial chromosomes, yeast plasmids, yeast artificial chromosomes, and other vectors. The program nucleic acid sequence can also be introduced into a variety of host cells as a linear nucleic acid molecule. All these forms of program nucleic acid sequence can also be used in in vitro embodiments of this invention. If program nucleic acid sequence is at least partially composed of RNA it can be used per se, but DNA coding for program nucleic acid sequence can also be used. DNA coding for program nucleic acid sequence that is used in in vitro or in vivo embodiments of this invention can be found in different forms, for example, but not limited to, any form mentioned above.

In a production system program nucleic acid sequence can be present in one or more copies. More copies of individual program nucleic acid sequence can provide for higher number of functional biosynthetic complexes constructed of immobilized chimeric biosynthetic pathway enzymes. Individual copies of program nucleic acid sequences can be present in one or more nucleic acid molecules. Thus, amplifying the number of individual nucleic acid molecules in the production system with one or more copies of program nucleic acid sequence also provides for higher number of total copies of program nucleic acid sequences in producing system.

For example, but not limited to, bacterial host cell can be transformed with self replicating plasmid, or any other type of nucleic acid molecule carrying program nucleic acid sequence in one or more copies, thus providing for higher total number of program nucleic acid sequences in the production system. Plasmid carrying one or more copies of program nucleic acid sequences can also be present in system in one or more copies (for example, but not limited to, high copy number plasmid can be used for transforming bacterial host cell). Higher number of plasmids in the production system with one or more copies of program nucleic acid sequences also provides for higher total number of program nucleic sequences in the production system.

For example, but not limited to, bacterial host cell can be transformed with a nucleic acid sequence coding for single, double, partially double, multi or partially multi stranded RNA molecule. The program nucleic acid sequence can be present in one or more copies on individual RNA molecule. More copies of program nucleic acid sequences on individual RNA molecule provide for higher total number of program nucleic acid sequences in the production system. Coding nucleic acid sequence for RNA molecule can also be present in the production system in one or more copies. Higher copy number of coding nucleic acid sequences for RNA molecule provides for higher number of RNA molecules in the production system and therefore also for higher total number of program nucleic acid sequences in the production system. Total number of individual RNA molecules in the production system can also be regulated with promoters used for driving transcription of coding nucleic acid sequences into RNA molecules consisting of one or more copies of program nucleic acid sequences. Stronger promoters provide for higher number of RNA molecules in the production system, thus providing for higher total number of individual program nucleic acid sequences in system. The higher total number of program nucleic acid molecules provides for the higher number of functional biosynthetic complexes constructed of immobilized chimeric biosynthetic pathway enzymes.
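The copy-number reasoning in the preceding paragraphs reduces to a simple product. The sketch below is illustrative only: the function name and the numbers are hypothetical and are not taken from the invention.

```python
def total_program_copies(carrier_molecules: int, copies_per_molecule: int) -> int:
    """Total program sequences = number of carrier molecules (plasmids or
    RNA transcripts) x program sequence copies carried on each molecule."""
    if carrier_molecules < 0 or copies_per_molecule < 0:
        raise ValueError("counts must be non-negative")
    return carrier_molecules * copies_per_molecule


# Hypothetical values: raising either factor raises the total.
single = total_program_copies(carrier_molecules=5, copies_per_molecule=1)
tandem = total_program_copies(carrier_molecules=5, copies_per_molecule=4)
print(single, tandem)
```

Under this simplification, a higher-copy carrier or a stronger promoter raises the first factor, and tandem repetition of the program sequence raises the second.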

A program nucleic acid sequence comprises at least three target nucleic acid sequences; thus, at least three corresponding nucleic acid binding elements in three chimeric biosynthetic pathway enzymes bind to the program nucleic acid sequence. A program nucleic acid sequence has one, two or more copies of each target nucleic acid element. Each target nucleic acid element can thus be present in one or more copies. The copies can be in tandem or separated by a spacer nucleic acid sequence. The copies of a target nucleic acid element can also be separated by one or more other target nucleic acid elements, which can also be present in one or more copies. If only three target nucleic acid elements are present on a program nucleic acid sequence, these can be different, thus providing proper spatial organization or high local concentration for three different chimeric biosynthetic pathway enzymes or other functional polypeptides. If only three target nucleic acid elements are present on a program nucleic acid sequence, these can also be of the same kind, thus providing proper spatial organization or high local concentration for three copies of the same chimeric biosynthetic pathway enzyme. A specific target nucleic acid element is defined by its nucleotide sequence from the 5' to the 3' end of the nucleic acid molecule. On a double stranded program nucleic acid sequence or a double stranded segment of program nucleic acid sequence, the target nucleic acid sequence can also be the reverse complement sequence of the original target nucleic acid sequence. The same nucleic acid binding factor binds to the target nucleic acid element and to the reverse complement sequence of the target nucleic acid sequence, but in a different orientation and at a different site of the program nucleic acid sequence. Thus, using the reverse complement of a target nucleic acid element instead of the target nucleic acid element can be useful if two chimeric biosynthetic pathway enzymes or other functional polypeptides are too large to be immediately adjacent to each other, or for achieving different spatial orientation of two copies of the same chimeric biosynthetic pathway enzymes or other functional polypeptides on a program nucleic acid sequence.

Spacer nucleic acid element can be placed on a program nucleic acid sequence in between two target nucleic acid elements to acquire enough space for large chimeric biosynthetic pathway enzymes or other functional polypeptides which act in succession in biosynthetic pathway. In double stranded program nucleic acid sequence or double stranded segment of program nucleic acid sequence, the spacer nucleic acid sequence can also provide for proper spatial orientation of two adjacent or nearby chimeric biosynthetic pathway enzymes or other functional polypeptides. For example, in some embodiments program nucleic acid sequence can be double stranded DNA sequence, which forms turning helix structure. Thus, varying the length of a spacer nucleic acid sequence results in a different spatial position of immobilized chimeric biosynthetic pathway enzymes and also in larger distance between two adjacent chimeric biosynthetic pathway enzymes (Figure 2).
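The effect of spacer length on position along and around the helix can be estimated with a short calculation. This is a minimal sketch assuming idealized B-form double stranded DNA (about 10.5 base pairs per helical turn and about 0.34 nm rise per base pair); these constants and the function name are assumptions introduced for illustration and are not stated in the description above.

```python
BP_PER_TURN = 10.5      # assumed helical repeat of idealized B-DNA
RISE_NM_PER_BP = 0.34   # assumed axial rise per base pair, in nm


def relative_placement(site_len: int, spacer_len: int) -> tuple[float, float]:
    """Return (axial distance in nm, rotation about the helix axis in degrees)
    between the start positions of two consecutive target elements."""
    if site_len <= 0 or spacer_len < 0:
        raise ValueError("site_len must be positive and spacer_len non-negative")
    offset_bp = site_len + spacer_len
    distance_nm = offset_bp * RISE_NM_PER_BP
    rotation_deg = (offset_bp * 360.0 / BP_PER_TURN) % 360.0
    return distance_nm, rotation_deg


# Two 9-bp target elements with spacers of different lengths.
for spacer in (0, 2, 5, 12):
    d, r = relative_placement(9, spacer)
    print(f"spacer={spacer:2d} bp  distance={d:.2f} nm  rotation={r:.1f} deg")
```

Under these assumptions a spacer changes two things at once: the distance between neighbouring bound polypeptides and the face of the helix on which the second one sits.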

Individual program nucleic acid sequence has a general formula [(Xn or Xn')(Sp)]m, where Xn is a target nucleic acid sequence, Xn' is a reverse complement of the target nucleic acid sequence, Sp is an optional spacer nucleotide sequence of any length, m is an integer of 3 or more and represents the number of target nucleic acid sequences with optional spacer nucleic acid sequence in the program nucleic acid, n is an indicator of the type of target nucleic acid element (e.g. X1, X2 and so on) and represents different individual target nucleic acid elements, and p is an indicator of the type of spacer element (e.g. S1, S2 and so on) and represents different individual spacer elements. An individual target nucleic acid sequence in the program nucleic acid sequence can be of any length, which depends on the corresponding nucleic acid binding element. For example, in some embodiments of our invention, three finger DNA binding zinc finger proteins can be used. In these cases, the length of target nucleic acid sequences is usually, but not always, 9 or 10 nucleotides.
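The general formula can be read as a recipe for concatenating target elements (or their reverse complements) with optional spacers. The sketch below assembles a DNA program sequence from such a description; the function names, the unit representation and all nucleotide sequences are hypothetical placeholders chosen for illustration, not sequences from the invention.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_program(units, targets, spacers):
    """Assemble a program sequence from units of the form
    (target name, use reverse complement?, spacer name or None)."""
    if len(units) < 3:
        raise ValueError("the formula requires m >= 3 target elements")
    parts = []
    for target_name, use_rc, spacer_name in units:
        site = targets[target_name].upper()
        parts.append(reverse_complement(site) if use_rc else site)
        if spacer_name is not None:          # the spacer is optional
            parts.append(spacers[spacer_name].upper())
    return "".join(parts)


# Placeholder 9-nt target elements and a 2-nt spacer (illustrative only).
targets = {"X1": "AAACCCGGG", "X2": "GATTACAGA", "X3": "TTTGGGCCC"}
spacers = {"S1": "AT"}
units = [("X1", False, "S1"), ("X2", True, "S1"), ("X3", False, None)]
print(build_program(units, targets, spacers))
```

Here the second unit uses X2', so the same binding factor would be expected to bind in the opposite orientation at that position.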

For example, in some embodiments, program nucleic acid sequence has the formula (X1)(S1)(X2)(S2)(X3)(S3)(X4), where X1 is the first target nucleic acid sequence for binding of the first chimeric biosynthetic pathway enzyme; where X2 is the second target nucleic acid sequence for binding of the second chimeric biosynthetic pathway enzyme; where X3 is the third target nucleic acid sequence for binding of the third chimeric biosynthetic pathway enzyme; where X4 is the fourth target nucleic acid sequence for binding of the fourth chimeric biosynthetic pathway enzyme; where S1, if present, is the first spacer nucleotide sequence that ensures proper spatial orientation of adjacent chimeric biosynthetic pathway enzymes; where S2, if present, is the second spacer nucleotide sequence that ensures proper spatial orientation of adjacent chimeric biosynthetic pathway enzymes; where S3, if present, is the third spacer nucleotide sequence that ensures proper spatial orientation of adjacent chimeric biosynthetic pathway enzymes.

As another example, in some embodiments, program nucleic acid sequence has the formula (X1)(S1)(X2)(S1)(X2)(S1)(X3), where X1 is the first target nucleic acid sequence for the binding of the first chimeric biosynthetic pathway enzyme; where X2 is the second target nucleic acid sequence for binding of the second chimeric biosynthetic pathway enzyme and is repeated twice in succession; where X3 is the third target nucleic acid sequence for binding of the third chimeric biosynthetic pathway enzyme; where S1, if present, is the spacer nucleotide sequence of the same type in between every pair of target nucleic acid sequences. Successively repeated target nucleotide sequences (in this case X2) can provide for higher local concentration of a rate limiting chimeric biosynthetic pathway enzyme, resulting in an increased efficiency of pathway flux. Successively repeated target nucleotide sequences (in this case X2) can also provide for increased dimerization (or multimerization if a specific target nucleic acid sequence is repeated more than two times in succession) of a dimeric (or multimeric) chimeric biosynthetic pathway enzyme. Successively repeated target nucleotide sequences (in this case X2) can also provide for increased efficiency of pathway flux in chimeric biosynthetic enzyme pathways where one enzyme acts on the substrate more than one time in succession.
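The copy-number ratio implied by a layout such as (X1)(S1)(X2)(S1)(X2)(S1)(X3) can be tallied directly. A minimal sketch follows; the list representation of the layout and the function name are assumptions made for illustration, and the tally assumes every target element is occupied by its binding factor.

```python
from collections import Counter


def target_copy_numbers(layout):
    """Count how many times each target element occurs in a layout.
    Spacers (names starting with 'S') are ignored; a trailing apostrophe
    marks a reverse complement and counts toward the same element."""
    return Counter(name.rstrip("'") for name in layout if name.startswith("X"))


layout = ["X1", "S1", "X2", "S1", "X2", "S1", "X3"]
copies = target_copy_numbers(layout)
print(dict(copies))                                    # copies per target element
print(f"X1:X2:X3 = {copies['X1']}:{copies['X2']}:{copies['X3']}")
```

For this layout the enzyme bound at X2 is present at twice the copy number of the enzymes bound at X1 and X3, which corresponds to the case where X2 hosts the rate limiting enzyme.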

As another example, in some embodiments, program nucleic acid sequence has the formula (X1)(S1)(X2)(S2)(X1)(S3)(X3), where X1 is the first target nucleic acid sequence for the binding of the first chimeric biosynthetic pathway enzyme, and is repeated once again in the program nucleic acid sequence but not in direct succession; where X2 is the second target nucleic acid sequence for the binding of the second chimeric biosynthetic pathway enzyme; where X3 is the third target nucleic acid sequence for the binding of the third chimeric biosynthetic pathway enzyme; where S1, if present, is the first spacer nucleotide sequence that ensures proper spatial orientation of adjacent chimeric biosynthetic pathway enzymes; where S2, if present, is the second spacer nucleotide sequence that ensures proper spatial orientation of adjacent chimeric biosynthetic pathway enzymes; where S3, if present, is the third spacer nucleotide sequence that ensures proper spatial orientation of adjacent chimeric biosynthetic pathway enzymes. This program nucleic acid sequence can be used for improving efficiency of chimeric biosynthetic enzyme pathways where one chimeric biosynthetic pathway enzyme (in this case the chimeric biosynthetic pathway enzyme that binds the X1 target nucleotide sequence) acts more than once in a chimeric biosynthetic enzyme pathway but not in immediate succession.

Design of (nucleic acid) binding factors.

The invention refers to chimeric polypeptides composed of nucleic acid binding factor (NABF) and biosynthetic pathway enzyme or other functional polypeptide attached together via amino acid linker. NABF are of natural origin or artificially designed, and could be isolated from any organism. NABF is any polypeptide, domain, protein or segment of protein that binds to nucleic acid. NABF is an independently folded protein domain that contains at least one motif that recognizes nucleic acid sequence. NABF interacts with nucleotides in a sequence-specific manner. The examples of polypeptides with NABF include, but are not limited to: helix-turn-helix, zinc finger, leucine zipper, winged helix, winged helix turn helix, helix-loop-helix and HMG-box.

Polypeptides with basic-helix-loop-helix motif are classified into leucine zipper factors, helix-loop-helix factors, helix-loop-helix / leucine zipper factors, NF-1, RF-X, and bHSH. They are characterized by two α-helices connected by a loop. Transcription factors that include this domain typically bind as dimers to a consensus sequence called the E-box (palindromic sequence CACGTG). bHLH transcription factors can also bind to non-palindromic sequences, which are often similar to the E-box.
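The statement that the E-box CACGTG is palindromic means that the sequence equals its own reverse complement, so both strands read the same 5' to 3'. The check below is a small illustration; the function names are hypothetical, and CACCTG is used only as an arbitrary E-box-like sequence for contrast.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic_site(seq: str) -> bool:
    """True if the site reads identically on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq)


print(is_palindromic_site("CACGTG"))   # E-box from the text
print(is_palindromic_site("CACCTG"))   # arbitrary E-box-like variant
```

For a palindromic target element, substituting the reverse complement leaves the sequence unchanged, so the orientation-based placement described for Xn' would not apply to such an element.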

As another example, in some embodiments, NABF are selected from the superclass of zinc-coordinating DNA-binding domains (Zinc-Finger domain), which is divided into subclasses: Cys4 zinc finger of nuclear receptor type, diverse Cys4 zinc fingers, Cys2His2 zinc finger domains, Cys6 cysteine-zinc cluster, and zinc fingers of alternating composition. Individual zinc finger domains typically occur as tandem repeats with two, three, or more fingers comprising the DNA-binding domain of the protein. These tandem arrays can bind in the major groove of DNA and are typically spaced at 3-bp intervals. The α-helix of each domain (often called the "recognition helix") can make sequence-specific contacts with nucleotides of nucleic acids; residues from a single recognition helix can contact 4 or more nucleotides to yield an overlapping pattern of contacts with adjacent zinc fingers.

As another example, in some embodiments, NABF are selected from the helix-turn-helix superclass of nucleotide binding polypeptides, which is classified into classes including: homeo domain, fork head / winged helix, heat shock factors, tryptophan clusters, and transcriptional enhancer factor (TEA) domain. Helix-turn-helix is a major structural motif capable of binding DNA, where recognition and binding to DNA is done by the two α-helices, one occupying the N-terminal end of the motif, the other at the C-terminus.

Beta-Scaffold Factors with Minor Groove Contacts is the fourth superclass that is divided into more classes: RHR (Rel homology region), STAT, p53, MADS box, beta-Barrel alpha-helix transcription factors, TATA binding proteins, heteromeric CCAAT factors, grainyhead, Cold-shock domain factors and Runt.

Other factors that bind to a specific nucleotide sequence, such as mutated restriction enzymes without restriction activity but with sequence recognition ability, can also be used.

The invention refers but is not limited to, polypeptides with Zinc-Finger domain, that form a small, independently folded zinc containing mini domain that recognizes specific nucleic acid sequence. Zinc-Finger domains typically occur as tandem repeats with at least two, three, or more fingers comprising the DNA-binding domain of the protein. Each finger recognizes and binds to 3 base-pair subsites. Specific binding is mediated by amino acids of mini domain of Zinc-Finger on position 1, 2, 3 and 6 relative to the start of the alpha-helix. With modular or combinatorial approach we can multiply repeated mini domains and achieve chemical distinctiveness through variations in certain key amino acid residues. Thus, fingers with different triplet specificities are combined to give the specific recognition of longer nucleic acid sequences. With combining at least two or more Zinc-Finger domains, we can minimize non-specific binding on nucleic acids present in host organisms and increase the specificity of Zinc-Finger containing polypeptides binding to the program nucleic acid.
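Because each finger recognizes a 3 base-pair subsite, the target element of a multi-finger domain can be viewed as a concatenation of triplets. The sketch below splits a target element into such subsites; the function name and the example 9-nt sequence are illustrative assumptions, and no claim is made about which finger contacts which triplet.

```python
def finger_subsites(target: str, bp_per_finger: int = 3) -> list[str]:
    """Split a target element into consecutive subsites, one per finger."""
    target = target.upper()
    if len(target) == 0 or len(target) % bp_per_finger != 0:
        raise ValueError("target length must be a positive multiple of the subsite length")
    return [target[i:i + bp_per_finger] for i in range(0, len(target), bp_per_finger)]


site = "GCGTGGGCG"                     # illustrative 9-nt target element
subsites = finger_subsites(site)
print(len(subsites), subsites)         # three fingers, three triplets
```

Combining fingers with different triplet specificities then amounts to choosing one finger per triplet; a longer recognized sequence is expected to occur by chance less often in the host nucleic acids.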

Zinc-Finger-binding motifs are stable structures that rarely undergo conformational changes upon binding to their target. In this invention, functional polypeptide domain has been linked to at least one linker sequence that is covalently bound to biosynthetic pathway enzyme of interest. Linker sequence determines the biosynthetic pathway enzyme or other functional polypeptide position with respect to Zinc-Finger binding domain and the adjacent biosynthetic pathway enzyme or other functional polypeptide. Linker sequence can vary in length and amino acid sequence, and ensures that the covalently bound enzyme can form appropriate tertiary structure and thus retains the biological function.

The number of chimeric biosynthetic pathway enzymes or other functional polypeptides attached to NABF could be, but are not limited to three, four, five or more, and could be different from the number of individual target nucleic acid elements on a program nucleic acid sequence. The number of chimeric biosynthetic pathway enzymes or other functional polypeptides attached to NABF binding factors depends on the number of steps in a biosynthetic pathway. If the biosynthetic pathway requires three, four, five different enzymes or other functional polypeptides to convert substrate to product or precursor at least three, four, five different binding elements, respectively, will be included into program nucleic acid sequence in a defined order. The nucleic acid program which contains several repeats of individual targets element are also included in this invention.

In some embodiments of our invention RNA molecules can be used as program nucleic acid sequences. In this case nucleic acid binding factors that recognize specific RNA sequences instead of specific DNA sequences should be used. These can be, but are not limited to: RNA recognition motif (RRM, also known as RBD or the RNP motif), heterogeneous nuclear RNP K-homology domain (KH domain), zinc finger domains (the best characterized RNA binding zinc finger domains are those of CCHH and CCCH type, but other types can also be used) and Pumilio (Puf) domains. Although RNA binding domains that recognize specific secondary structures on RNA molecules could also be used as nucleic acid binding factors (for example, but not limited to, S1 domains), the most suitable are sequence specific RNA binding domains (for example, but not limited to, zinc finger RNA binding domains). The inventors prefer usage of RNA binding domains that recognize specific double stranded RNA sequences, because proper spatial orientation can be assured with them. Methods for designing sequence specific double stranded RNA binding zinc fingers are described in the literature.

Design of synthetic pathways.

The present invention refers to a method for increasing yield of production of biosynthetic products. This method combines: (i) program nucleic acid sequence; and (ii) chimeric proteins of biosynthetic pathway enzymes or other functional polypeptides with NABF. The required components (i) and (ii) can either (a) be mixed in vitro, in a solution in the presence of substrates and cofactors for the biosynthetic pathway enzymes, or (b) be inserted and expressed in host cells in vivo.

In an embodiment of the invention, the biosynthetic pathway refers to any cascade of enzymes, naturally occurring or artificially designed, that forms new chemical bonds on the substrate in a defined order. Artificial biosynthetic pathway refers to production of a product not known in nature. Artificial biosynthetic pathway also refers to production of a product known in nature where the enzymes of the biosynthetic pathway are not from the same organism or are obtained by genetic manipulation.

The invention refers to biosynthetic pathway, to any coupled enzymatic reaction that has to be performed sequentially with three or more enzymes. The biosynthetic pathway (flux, efficiency) could be measured by determining the concentration of precursors and end products by the appropriate methods known to the experts in the field.

The invention refers to any biosynthetic pathway including, but not limited to:

• catabolic or anabolic pathways,

• primary metabolic pathways such as, but not limited to synthesis of amino acids, fatty acids, carbohydrates, pyrimidine and purine,

• secondary metabolic pathways such as, but not limited to, synthesis of secondary metabolites with biological activities, and pharmacological properties (polyketides), unusual peptides, hormones, terpenoids (carotenoids, ...), pigments (violacein, melanin, ...) etc.

Carotenoid biosynthetic pathway enzymes

Carotenoids belong to the category of tetraterpenoids. Astaxanthin, zeaxanthin and canthaxanthin are derived from β-carotene and their synthesis is mediated by carotenoid biosynthetic enzymes crtE, crtB, crtI, crtY, crtO, crtZ, where the order and participation of enzymes leads to different products. For example, a biosynthetic pathway that includes crtE, crtB, crtI, crtY, crtO, crtZ leads to the synthesis of astaxanthin; crtE, crtB, crtI, crtY, crtZ leads to the synthesis of zeaxanthin; and crtL-E, crtB, crtI, crtY, crtO leads to the synthesis of canthaxanthin.

Violacein biosynthetic pathway

Genes for the violacein pigment biosynthesis are in an operon constituted of vioE, vioD, vioC, vioB, and vioA genes. VioA is an FAD dependent L-Trp oxidase, which generates IPA imine. The latter is the substrate for VioB, a hemoprotein oxidase, which converts the IPA imine into a compound with an unknown chemical formula (compound X). VioE is a recently discovered unique protein with no characterized homologues, and is a key enzyme in the violacein biosynthetic pathway - it acts by converting compound X to intermediates which can be taken over by VioD and VioC, FAD dependent monooxygenases which hydroxylate these compounds to form violacein.

Resveratrol biosynthetic pathway

Resveratrol (3,5,4'-trihydroxy-trans-stilbene) is a plant-produced polyphenol. The resveratrol biosynthetic pathway consists of four enzymes: phenylalanine ammonia lyase (PAL), cinnamic acid 4-hydroxylase (C4H), 4-coumarate:CoA ligase (4CL) and stilbene synthase (STS). PAL and C4H can be replaced by a single enzyme, tyrosine ammonia lyase (TAL). The first two enzymes, PAL and C4H, transform the amino acid phenylalanine into p-coumaric acid (4-coumaric acid). The third enzyme, 4CL, attaches p-coumaric acid to the pantetheine group of coenzyme A (CoA) to produce 4-coumaroyl-CoA. The final enzyme in the pathway, STS, catalyzes the formation of resveratrol by condensing one molecule of 4-coumaroyl-CoA with three molecules of malonyl-CoA, which originate from fatty acid biosynthesis.

In the present invention, enzymes from particular biosynthetic pathways, such as the violacein, resveratrol and carotenoid biosynthetic pathways, are fused to NABFs to increase the proximity of these chimeric proteins. With sequential binding of enzymes guided by the program nucleic acid sequence, we are able to arrange the specific order of enzymatic reactions. This enables faster, more efficient reactions and also makes it possible to create a novel product by rearranging the order of enzymatic reactions (e.g. a different compound is synthesized if the enzyme order is 1-2-3 versus 1-3-2). An advantage of our invention is the ability to produce artificial compounds with desired characteristics.
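The idea that the program sequence dictates enzyme order can be illustrated with a minimal sketch. The site labels "1", "2", "3" and the enzyme names below are hypothetical placeholders; the specification does not state which digit of a program corresponds to which NABF or enzyme.

```python
# Hypothetical mapping (an assumption for illustration only):
# binding-site label -> enzyme fused to the NABF that recognizes that site.
SITE_TO_ENZYME = {"1": "enzyme_1", "2": "enzyme_2", "3": "enzyme_3"}


def enzyme_order(program, site_to_enzyme):
    """Return the enzymes in the order their binding sites occur along the program.

    `program` is a string of site labels read 5' to 3'; labels without a
    matching chimeric protein are skipped.
    """
    return [site_to_enzyme[site] for site in program if site in site_to_enzyme]


if __name__ == "__main__":
    # Same chimeric proteins, two different programs, two different spatial orders.
    print(enzyme_order("123", SITE_TO_ENZYME))
    print(enzyme_order("132", SITE_TO_ENZYME))
```

Only the program string changes between the two calls; the set of chimeric proteins stays the same, which is the property the paragraph above relies on.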

Recombinant DNA

Standard molecular biology methods are used in the invention that are generally known to experts in the field.

The proteins of the invention can be synthesized by expressing DNA coding for the proteins in a suitable host organism. The DNA coding for the proteins is inserted into an appropriate expression vector. Suitable vectors include, but are not limited to: plasmids, viral vectors, etc. Expression vectors that are compatible with the host organism cells are well known to experts in the field and include the appropriate control elements for transcription and translation of the nucleic acid sequence.

Expression vectors may be prepared for expression in prokaryotic and eukaryotic cells. Examples of prokaryotic cells are bacteria, primarily Escherichia coli. According to the invention, prokaryotic cells are used to obtain a sufficient quantity of nucleic acid. An expression vector generally contains control elements that are operably linked to the DNA of the invention, which codes for the protein. The control elements are selected so as to trigger efficient and tissue-specific expression. The promoter may be constitutive or inducible, depending on the desired pattern of expression. The promoter may be of native or foreign origin (not represented in the cells where it is used), and may be natural or synthetic. The promoter must be chosen so that it works in the target cells of the host organism. In addition, initiation signals for efficient translation of the fusion protein are included, which comprise the ATG and the corresponding sequences. When the vector used in the invention includes two or more reading frames, the reading frames should be operably linked to control elements independently, and the control elements may be the same or different, depending on the desired production of proteins.

Before the present invention is further described, it is to be understood that this invention is not limited to particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims.

Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention, the preferred methods and materials are now described. All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited.

It must be noted that as used herein and in the appended claims, the singular forms "a," "an," and "the", include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "an enzyme" includes a plurality of such enzymes and reference to "the scaffold polypeptide" includes reference to one or more scaffold polypeptides and equivalents thereof known to those skilled in the art, and so forth. It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as "solely," "only" and the like in connection with the recitation of claim elements, or use of a "negative" limitation.

The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates which may need to be independently confirmed.

Examples:

Cloning and purification of polypeptides

DNA sequences for the nucleic acid binding factors and biosynthetic pathway enzymes or other functional polypeptides described above were designed based on the amino acid sequences of the selected protein domains using the tool Designer from DNA2.0 Inc. The Designer enables the user to design DNA fragments and optimize expression for the desired hosts (e.g. E. coli) by organism-specific codon optimization. Genes were ordered from GENEART AG (Im Gewerbepark B32, D-93059 Regensburg), digested with restriction enzymes and cloned into the appropriate vector containing appropriate regulatory sequences, by procedures known to experts in the field. DNA sequences for split fluorescent proteins were amplified by PCR using the DNA sequence of fluorescent proteins as a template (accession number or reference needed). Vectors used include the commercial vectors pET, pBluescript and pCDNA, the vectors pSB1A2 (http://partsregistry.org/Part:pSB1A2), pSB1C3 (http://partsregistry.org/wiki/index.php?title=Part:pSB1C3) and pSB1AK3 (http://partsregistry.org/wiki/index.php?title=Part:pSB1AK3), and the low copy plasmid vectors pSB4K5 (http://partsregistry.org/Part:pSB4K5) and pSB4C5 (http://partsregistry.org/Part:pSB4C5), carrying all necessary features such as antibiotic resistance, origin of replication and multiple cloning site.

Molecular biology methods (DNA fragmentation with restriction enzymes, DNA amplification using polymerase chain reaction (PCR), PCR ligation, DNA concentration detection, agarose gel electrophoresis, purification of DNA fragments from agarose gels, ligation of DNA fragments into a vector, transformation of chemically competent E. coli DH5α cells, isolation of plasmid DNA with commercially available kits, screening and selection) were used for preparation of DNA constructs. All procedures were performed under sterile conditions (aseptic technique). DNA segments were characterized by restriction analysis and sequencing.

Molecular cloning procedures are well known to experts in the field and are described in detail in any molecular biology handbook.

DNA constructs and the corresponding chimeric proteins are described in Table 1. All DNA constructs have a start codon (ATG) before the histidine tag or coding region. The constructs coding for chimeric proteins were cloned into the pET19b vector for high-level expression, or into pSB4K5 and pSB4C5 for monitoring in vivo production of carotenoids and violacein, respectively. The expression cassette includes, in the 5' to 3' direction, the T7 promoter, a multiple cloning site for the fusion protein, and the T7 terminator. These regulatory elements enable expression of the protein in a prokaryotic E. coli cell line carrying T7 RNA polymerase.

DNA constructs were prepared using molecular biology methods that are described in any molecular biology handbook and known to experts. Plasmids, constructs and intermediate constructs were introduced by chemical transformation into E. coli DH5α or BL21 (DE3) pLysS.

Several constructs have been prepared to demonstrate the feasibility of production of chimeric proteins listed in Table 1.

Plasmids encoding open reading frames of fusion proteins in Table 1 (SEQ ID No. 52 to SEQ ID No. 57) were introduced by chemical transformation into competent E. coli BL21 (DE3) pLysS cells. Selected bacterial colonies grown on LB plates with the selected antibiotic (ampicillin) were inoculated into 10 mL of LB broth supplemented with the selected antibiotic. After several hours of growth at 37 °C, 10-100 μL of the culture were inoculated into 100 mL of the selected growth medium and left shaking overnight at 37 °C. The overnight culture was diluted 20- to 50-fold so that the OD600 of the diluted culture was between 0.1 and 0.2. Culture flasks with 500 mL of diluted culture were put on the shaker and bacteria were grown at 25 °C, 30 °C or 37 °C until the OD600 reached 0.6-0.8, when protein expression was induced by addition of the inducer IPTG (1 mM). Four hours after induction the culture broth was centrifuged, and the bacterial cells were resuspended in lysis buffer (10 mM Tris pH 8.0, 1 M NaCl, 0.1% deoxycholate, 0.1 mM zinc sulphate, 10 mM DTT, supplemented with protease inhibitor cocktail) and frozen at -80 °C at least overnight. The thawed cell suspension was lysed by sonication and then centrifuged. The precipitate (cell membranes, inclusion bodies) and the supernatant were checked for expression of our constructs by SDS-PAGE and Western blot, using anti-His-tag antibodies as primary antibodies when necessary. The designed fusion proteins were mainly present in the insoluble fraction (inclusion bodies), which was composed of >80% of the chosen protein. Inclusion bodies were washed twice with lysis buffer, twice with 1 M urea in 10 mM Tris pH 8.0 and twice with 2 M urea in 10 mM Tris pH 8.0. This treatment usually resulted in >95% protein purity. Purified inclusion bodies were dissolved in 8 M urea in 50 mM Tris pH 8.0. Proteins were eluted with 8 M urea in 50 mM Tris pH 8.0 and 250 mM imidazole.
Protein refolding was carried out by dialysis against a buffer containing 50 mM Tris pH 8.0, 500 mM NaF, 500 μM ZnSO4, 5 mM DTT, 0.005% Tween and 10% glycerol. Dialysis was carried out at least twice for 12 h at 4 °C with no mixing. Upon completion of the process the dialysate was carefully removed from the dialysis tubes and placed into centrifuge tubes. Protein that failed to refold precipitated and was separated from the refolded protein in the soluble phase by centrifugation at 10,000 rpm. Protein concentration was determined using a spectrophotometer and the Bradford protein assay, and protein size was determined by SDS-PAGE with Coomassie stain and by Western blot analysis using anti-His antibodies.
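The dilution of the overnight culture in the expression protocol above can be sketched as a small calculation. This is a minimal sketch that assumes OD600 scales linearly with dilution, which is only approximately true; the overnight OD600 of 4.0 in the usage example is a hypothetical value, not a figure from the specification.

```python
def inoculum_volume_ml(overnight_od600, target_od600, final_volume_ml):
    """Volume of overnight culture needed to reach target_od600 in final_volume_ml.

    Assumes OD600 is proportional to cell density over the range used.
    """
    if overnight_od600 <= 0 or target_od600 <= 0:
        raise ValueError("OD600 values must be positive")
    if target_od600 > overnight_od600:
        raise ValueError("target OD600 cannot exceed the overnight OD600")
    return final_volume_ml * target_od600 / overnight_od600


def dilution_factor(overnight_od600, target_od600):
    """Fold dilution implied by the two OD600 values."""
    return overnight_od600 / target_od600


if __name__ == "__main__":
    # Hypothetical overnight culture at OD600 = 4.0, diluted into a 500 mL flask.
    volume = inoculum_volume_ml(4.0, 0.1, 500.0)
    fold = dilution_factor(4.0, 0.1)
    print(f"inoculate {volume:.1f} mL ({fold:.0f}-fold dilution)")
```

With these example numbers the dilution falls inside the 20- to 50-fold window stated in the protocol.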

Table 1: Composition of plasmids prepared for demonstration of the invention.

Seq ID No. (DNA) | Seq ID No. (protein) | Construct composition | Plasmid backbone

Binding factors
1 | 2 | Znf_Gli1 | pSBAK3
3 | 4 | Znf_HivC | pSBAK3
5 | 6 | Znf_Zif268 | pSBAK3
7 | 8 | Znf_Jazz | pSBAK3
9 | 10 | Znf_PBSII | pSBAK3
11 | 12 | Znf_Blues | pSBAK3
13 | 14 | Znf_Tyr | pSBAK3
15 | 16 | NicTAL | pSBAK3

Biosynthetic pathway enzymes or other functional polypeptides
17 | 18 | vioA | pSB1A2
19 | 20 | vioB | pSB1A2
39 | 40 | vioC | pSB1A2
41 | 42 | vioD | pSB1A2
43 | 44 | vioE | pSB1A2
45 | 46 | nCFP (N-terminal mCerulean) | pSBAK3
47 | 48 | nYFP (N-terminal mCitrine) | pSBAK3
49 | 50 | cCFP (C-terminal mCerulean) | pSBAK3
51 | 52 | cYFP (C-terminal mCitrine) | pSBAK3

Auxiliary polypeptides
53 | 54 | linker peptide | pSBAK3
55 | - | his_tag | pSBAK3

Program nucleic acid sequence
56 | - | program SPR, split, FRET | pBluescript
57 | - | program NICTAL | pBluescript
58 | - | program biosynthesis (123456) | pBluescript
59 | - | program biosynthesis (12346) | pBluescript
60 | - | program biosynthesis (341256) | pBluescript

Chimeric proteins
61 | 62 | Znf_Blues-linker peptide-vioA | pSB1A2
63 | 64 | Znf_Zif268-linker peptide-vioB | pSB1A2
65 | 66 | Znf_PBSII-linker peptide-vioE | pSB1A2
67 | 68 | Znf_HivC-linker peptide-vioD | pSB1A2
69 | 70 | Znf_Gli1-linker peptide-vioC | pSB1A2
71 | 72 | His_tag-Znf_Gli1-linker peptide-nCFP | pSBAK8
73 | 74 | His_tag-Znf_Blues-linker peptide-nCFP | pSBAK8
75 | 76 | His_tag-Znf_PBSII-linker peptide-nYFP | pSBAK8
77 | 78 | cCFP-linker peptide-Znf_HivC-His_tag | pSBAK8
79 | 80 | cCFP-linker peptide-Znf_Jazz-linker peptide-His_tag | pSBAK8
81 | 82 | cYFP-linker peptide-Znf_Zif268-linker peptide-His_tag | pSBAK8

Binding of binding factor to target nucleic acid element

β-Galactosidase Enzyme Assay

Bacterial cultures carrying plasmid constructs listed in Table 1 were incubated in 10 mL LB broth supplemented with 5 μL of 1 M IPTG (isopropyl-β-D-thiogalactoside) and increasing concentrations of L-arabinose (0%, 0.0025%, 0.1% and 1%) at 37 °C on a rotary shaker at 180 rpm for 18 hours. Bacterial density was determined by optical density (OD600). The measurements of β-galactosidase activity were performed in an ELISA reader preheated to 28 °C. 5 μL of each culture was transferred to the wells of a 96-well clear-bottom microtiter plate, to which 100 μL of Z-buffer with chloroform was added (Z-buffer: 0.06 M Na2HPO4 x 7H2O, 0.04 M NaH2PO4 x H2O, 0.1 M KCl, 0.001 M MgSO4 x 7H2O, pH 7; Z-buffer with chloroform: Z-buffer, 1% β-mercaptoethanol, 10% chloroform). Bacterial cells were lysed by addition of 50 μL of Z-buffer with SDS (Z-buffer, 1.6% SDS) followed by incubation for 10 min at 28 °C. 50 μL of 0.4% ONPG solution in Z-buffer was added to each well, and changes in OD405 were measured for a period of 20 min at 30 s intervals.
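One way to turn the kinetic OD405 readings into a single activity value per well is sketched below. The specification does not state how the readings were processed, so the least-squares slope and the normalization by OD600 and sample volume are assumptions, and the resulting number is in arbitrary units rather than Miller units.

```python
def slope_per_min(od405, interval_s=30.0):
    """Least-squares slope of OD405 versus time, in OD units per minute."""
    n = len(od405)
    if n < 2:
        raise ValueError("need at least two readings")
    times = [i * interval_s / 60.0 for i in range(n)]  # minutes
    t_mean = sum(times) / n
    y_mean = sum(od405) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, od405))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


def relative_activity(od405, od600, culture_volume_ml=0.005, interval_s=30.0):
    """ONPG hydrolysis rate normalized to cell density and sample volume.

    The normalization is an assumed convention (arbitrary units), with
    5 uL of culture per well as in the assay description.
    """
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return slope_per_min(od405, interval_s) / (od600 * culture_volume_ml)


if __name__ == "__main__":
    # Synthetic, perfectly linear readings: 41 points cover 20 min at 30 s steps.
    readings = [0.10 + 0.01 * i for i in range(41)]
    print(slope_per_min(readings))
    print(relative_activity(readings, od600=2.0))
```

Comparing this value across arabinose concentrations for the same construct gives the repression trend that the reporter assay is designed to reveal.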

Results

To test whether binding of an NABF (zinc fingers Zif268, Blues, PBSII, HivC) or His-NicTAL to its corresponding specific nucleic acid target occurs in vivo, we designed a reporter system. The reporter consists of a plasmid carrying a) lacZ under a synthetic promoter containing a DNA binding site for the particular DNA binding protein tested and b) the corresponding DNA binding protein under an arabinose promoter. Successful binding of the NABF to the synthetic promoter would prevent transcription of lacZ, resulting in lower β-galactosidase activity. E. coli cultures containing plasmids for various NABFs were grown overnight in LB medium supplemented with increasing concentrations of arabinose. β-Galactosidase activity was measured as described above. In all constructs tested, β-galactosidase activity decreased with increasing concentrations of arabinose, suggesting that the NABFs (Table 1) bind to the target nucleic acid element within the synthetic promoter in vivo (Figure 3A). Furthermore, β-galactosidase activity was unaffected when the NABF and the DNA target sequence were mismatched (e.g. the Blues target element blues_0 was replaced with the PBSII target element PBSII_0, and the HivC_0 target element was swapped with the Gli1_0 target element), confirming the specificity of our testing system (Figure 3B).

Surface plasmon resonance (SPR)

SPR analysis was performed on a Biacore T100 (GE Healthcare). A biotinylated single-stranded DNA probe at 100 nM was immobilized on a Series S sensor chip SA (GE Healthcare) to a final response of approx. 300 RU. Hybridization of complementary DNA (program DNA sequence with binding sites for the Zn-fingers HivC, Zif268, Jazz, Blues, PBSII, Gli1) was performed by injecting 0.5-2 μM DNA in running buffer (10 mM HEPES, 150 mM NaCl, 0.1 mM EDTA, 0.005% P20, pH 7.4) for up to 300 s to reach a final response of approx. 300 RU. Analyte binding was observed by injecting different concentrations for 1 min, followed by dissociation, which was monitored for 5 min. The surface was regenerated with two 30 s injections of 50 mM NaOH and one 24 s injection of 0.5% SDS. After that, new DNA was injected to obtain a fresh binding surface.

Avidin (3000 RU) was immobilized on a Series S sensor chip CM5 (GE Healthcare) via amine coupling using the manufacturer's protocol. The carboxymethylated surface of the CM5 chip was activated using a 7 min injection pulse of 1:1 NHS:EDC. Avidin in Na-acetate pH 5.5 was then injected in several short pulses to reach a response of 3000 RU. The avidin was coupled only to the second flow cell, while the first flow cell served as a reference surface to subtract the nonspecific binding of analytes to the dextran matrix on the sensor chip. Unreacted sites on the sensor surface were blocked with a 7 min injection pulse of 1 M ethanolamine (pH 8.5).
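The reference-surface correction described above amounts to a point-by-point subtraction of the two flow-cell traces. The following is a minimal sketch with hypothetical response values; instrument software normally performs this step, and the function name is illustrative rather than part of any Biacore API.

```python
def reference_subtract(active_ru, reference_ru):
    """Subtract the reference flow-cell trace from the active flow-cell trace.

    Both traces are lists of responses (RU) sampled at the same time points.
    """
    if len(active_ru) != len(reference_ru):
        raise ValueError("traces must have the same number of points")
    return [a - r for a, r in zip(active_ru, reference_ru)]


if __name__ == "__main__":
    # Hypothetical responses: flow cell 2 (avidin-coupled) and flow cell 1 (reference).
    flow_cell_2 = [0, 120, 310]
    flow_cell_1 = [0, 20, 10]
    print(reference_subtract(flow_cell_2, flow_cell_1))
```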

Results:

In vitro binding of chimeric proteins (biosynthetic enzymes linked to NABF) to program DNA is shown in Figure 4. Figure 4A presents binding of program DNA, then binding of a chimeric protein containing the Gli1 NABF, and regeneration of the support. Hybridization of DNA was performed by injecting 0.5-2 μM DNA for up to 300 s to reach a final response of 300 RU. Analyte binding was observed by injecting different concentrations for 1 min, followed by a 5 min dissociation. Regeneration was achieved with two 30 s injections of 50 mM NaOH and one 24 s injection of 0.5% SDS.

Analyte binding to program DNA, with fresh DNA captured for each protein, is presented in Figure 4B. Analytes from top: Gli1, Zif268, Blues, Znf_HivC, PBSII, Jazz.

Sequential binding of NABFs to program DNA is presented in Figure 4C. DNA was captured at the beginning of the cycle and the proteins were injected for 1 minute each, one after another, in 2 different sequences: solid line: Znf_Jazz, Znf_Blues, Znf_Zif268, Znf_PBSII, Znf_HivC and Znf_Gli1; dashed line: Znf_Gli1, Znf_PBSII, Znf_HivC, Znf_Jazz, Znf_Blues and Znf_Zif268. Sequential injection of all analytes showed that analyte binding was weaker when Gli1 was injected first.

EMSA (electrophoretic mobility shift assay)

Specific DNA binding of synthetic zinc finger domains to the program nucleic acid was tested by electrophoretic mobility shift assay. 1 μg of purified protein was incubated with increasing amounts of program nucleic acid (500 ng, 750 ng, 1000 ng) for 3 hours. Samples diluted with high-grade laboratory water to 20 μL were loaded on a 2.0% agarose gel prestained with ethidium bromide and run at 70 V for 40 minutes. Nucleic acid-protein complexes were detected under UV light.

Results:

Migration of ethidium bromide-stained program nucleic acid depended on the presence of NABF (i.e. a zinc finger DNA binding domain), which resulted in slower migration compared to program nucleic acid alone (Figure 5). These data demonstrate binding of NABF to a specific target sequence within a DNA program.

Confocal microscopy

A Leica TCS SP5 laser confocal microscope was used for detection of chimeric split fluorescent proteins (e.g. NABF linked to nCFP, NABF linked to cCFP) and of FRET reconstitution on a nucleic acid program in vivo. Transfected HEK293 cells that had been cultured overnight on an 8-well microscope slide at 37 °C were placed on top of the objective lens carrying a droplet of immersion oil. Cells were excited at 433 nm for CFP and 515 nm for YFP. FRET was detected according to the methodology provided by the manufacturer's software (FRET AB and FRET SE wizards within the Leica LAS AF computer software).

Split, FRET experiments

HEK293 cells were transfected with mammalian expression vectors carrying split fluorescent fusion proteins under the CMV promoter using the jetPEI™ transfection reagent protocol and ultimately fixed with 4% paraformaldehyde. Split fluorescent protein reconstitution as well as the FRET effect were observed under the confocal microscope (described under "Confocal microscopy"). CFP (Znf_Gli1-linker peptide-nCFP and cCFP-linker peptide-Znf_HivC) and YFP (Znf_PBSII-linker peptide-nYFP and cYFP-linker peptide-Znf_Zif268) were reconstituted only in the presence of specific, but not random, program nucleic acid.

Results:

Split fluorescent proteins bind to adjacent target nucleotide sequences when plasmids carrying split fluorescent protein fusions are cotransfected with a plasmid carrying the target program nucleic acid. As split fluorescent proteins cannot form a complete chromophore in vivo by accident, the fluorescent signal arises as a consequence of split protein-zinc finger fusions bound to the program nucleic acid sequence.

Split GFP reconstitution. Figure 6 shows in vivo reconstitution of two non-functional fluorophores (split GFPs) linked to NABFs only in the presence of the program nucleic acid sequence (Figure 6A).

Plasmids, 75 ng each, carrying Znf_Gli1-linker peptide-nCFP and cCFP-linker peptide-Znf_HivC were cotransfected with 350 ng of program DNA into HEK293 cells. The cyan fluorescent protein emission signal was observed by exciting cells with 433 nm laser light (Figure 6B).

Plasmids, 75 ng each, carrying split yellow fluorescent protein fusions (Znf_PBSII-linker peptide-nYFP and cYFP-linker peptide-Znf_Zif268) were cotransfected with 350 ng of program DNA. A YFP emission signal with a 529 nm peak was observed under the confocal microscope (Figure 6C).

Split GFP reconstitution in vitro. HEK293 cells were cotransfected with Znf_Gli1-linker peptide-nCFP and cCFP-linker peptide-Znf_HivC, or with Znf_PBSII-linker peptide-nYFP and cYFP-linker peptide-Znf_Zif268, respectively. Cells were lysed in 150 μL of Promega Passive Lysis Buffer after 72 hours, when 50 μg of program nucleic acid was introduced. A functional fluorophore was observed after incubating cell lysates with the DNA program for 18 h at 4 °C, using a PerkinElmer LS55 Luminescence Spectrometer.

FRET efficiency. The FRET efficiency of four chimeric proteins (split GFPs linked to NABFs) in the presence of program DNA is presented in Table 2. FRET was measured using the FRET AB wizard within the LAS AF software provided with the Leica TCS SP5 laser confocal microscope. Emission signals in the CFP and YFP channels were adjusted. The acceptor was photobleached with the 515 nm laser, and the FRET efficiency within the photobleached area was then calculated. HEK293 cells were transfected with program DNA and Znf_Gli1-linker peptide-nCFP, cCFP-linker peptide-Znf_HivC, Znf_PBSII-linker peptide-nYFP and cYFP-linker peptide-Znf_Zif268. For the negative control, cells were transfected with cytosolic CFP and YFP without DNA binding factors, and for the positive control, cells were transfected with CFP linked to YFP via a linker peptide.

Table 2: FRET efficiency of four split GFPs, each linked to a different DNA binding factor.

FRET positive control | FRET negative control | FRET on a DNA program
23 ± 2 | 1 ± 1 | 4 ± 2
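For orientation, acceptor-photobleaching FRET efficiency is commonly computed from the donor intensity before and after bleaching the acceptor, as E = (D_post - D_pre) / D_post. The sketch below assumes this standard formula; the specification does not spell out the calculation, and the intensities in the example are hypothetical values, not measurements from this work.

```python
def fret_efficiency_ab(donor_pre, donor_post):
    """Acceptor-photobleaching FRET efficiency as a fraction.

    donor_pre  : donor (CFP) intensity before bleaching the acceptor (YFP)
    donor_post : donor intensity after bleaching the acceptor
    Returns 0.0 when the donor did not get brighter, i.e. no detectable FRET.
    """
    if donor_post <= 0:
        raise ValueError("donor_post must be positive")
    if donor_post <= donor_pre:
        return 0.0
    return (donor_post - donor_pre) / donor_post


if __name__ == "__main__":
    # Hypothetical donor intensities in arbitrary units.
    efficiency = fret_efficiency_ab(donor_pre=80.0, donor_post=100.0)
    print(f"FRET efficiency: {efficiency * 100:.0f}%")
```

Clamping to zero when the donor signal does not increase is a design choice of this sketch; some analyses keep negative values to expose bleaching or drift artifacts.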

Synthesis of Violacein

Overnight cultures of E. coli containing plasmids encoding chimeric proteins (see Table 1, construct combination Znf_Blues-linker peptide-vioA, Znf_Zif268-linker peptide-vioB, Znf_PBSII-linker peptide-vioE, Znf_HivC-linker peptide-vioD, Znf_Gli1-linker peptide-vioC), with or without a plasmid encoding the nucleic acid program (see Table 1, program biosynthesis (123456) or program biosynthesis (341256)), were diluted in fresh Luria Bertani broth to an OD600 of 0.05 and grown at 30 °C in the presence of appropriate antibiotics. Samples (3 ml total) were taken at various time points. A representative time point (17.5 h) is shown in Figure 9. Bacteria were lysed by addition of an equal volume (1.5 ml) of 10% SDS, and violacein was extracted with ethyl acetate (1:1, vol/vol). After brief vortexing, the organic phase was collected and its absorbance at 575 nm was measured. Samples were analyzed by TLC, and quantitative determination of violacein was performed by densitometry at 575 nm after scanning the spectra directly on the HPTLC plates. Violacein was also identified by mass spectrometry. The quantity of violacein produced with or without the nucleic acid program, or in the presence of a scrambled nucleic acid program, was compared (Figure 9).

While the present invention has been described with reference to the specific embodiments thereof, it should be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the true spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation, material, composition of matter, process, process step or steps, to the objective, spirit and scope of the present invention. All such modifications are intended to be within the scope of the claims appended hereto.