Title:
METHOD FOR PROTEIN EXPRESSION
Document Type and Number:
WIPO Patent Application WO/2018/158141
Kind Code:
A1
Abstract:
The present invention relates to an improved method for transient protein expression which is generally applicable to production of any therapeutic protein that can be produced using mammalian cell lines, in particular Chinese Hamster Ovary (CHO) cells. The method combines expression construct components improving the post-transcriptional processing of a gene of interest, the introduction of a one-time host cell line selection workflow using a template protein of interest expression construct to generate a production competent cell line, and approaches enabling inactivation of the template protein of interest. Any desired protein of interest can then be expressed and produced using said production competent cell line in transient protein expression processes with improved yields and efficiencies. In a preferred related embodiment of the invention the production competent cell line generated can also be used for cell line development (CLD) using targeted integration, and hence the same cell line can be used for production of a desired protein of interest using both transient expression processes and production processes utilizing stable production cells generated using fast CLD workflows.

Inventors:
IVANSSON DANIEL (SE)
Application Number:
PCT/EP2018/054459
Publication Date:
September 07, 2018
Filing Date:
February 23, 2018
Assignee:
GE HEALTHCARE BIO SCIENCES AB (SE)
International Classes:
C12P21/00; C12N15/90
Domestic Patent References:
WO2014205192A2    2014-12-24
WO2009130598A1    2009-10-29
Other References:
LIN ZHANG ET AL: "Recombinase-mediated cassette exchange (RMCE) for monoclonal antibody expression in the commercially relevant CHOK1SV cell line", BIOTECHNOLOGY PROGRESS., vol. 31, no. 6, 13 October 2015 (2015-10-13), US, pages 1645 - 1656, XP055383248, ISSN: 8756-7938, DOI: 10.1002/btpr.2175
WIRTH ET AL: "Road to precision: recombinase-based targeting technologies for genome engineering", CURRENT OPINION IN BIOTECHNOLOGY, LONDON, GB, vol. 18, no. 5, 1 October 2007 (2007-10-01), pages 411 - 419, XP022350911, ISSN: 0958-1669, DOI: 10.1016/J.COPBIO.2007.07.013
SONJA WILKE ET AL: "Streamlining Homogeneous Glycoprotein Production for Biophysical and Structural Applications by Targeted Cell Line Development", PLOS ONE, vol. 6, no. 12, 9 December 2011 (2011-12-09), pages e27829, XP055330606, DOI: 10.1371/journal.pone.0027829
YOSHINORI KAWABE ET AL: "Repeated integration of antibody genes into a pre-selected chromosomal locus of CHO cells using an accumulative site-specific gene integration system", CYTOTECHNOLOGY., vol. 64, no. 3, 25 September 2011 (2011-09-25), NL, pages 267 - 279, XP055333210, ISSN: 0920-9069, DOI: 10.1007/s10616-011-9397-y
Attorney, Agent or Firm:
ALDENBÄCK, Ulla et al. (SE)
Claims:
CLAIMS

1. A method for creating a mammalian cell bank for transient protein production comprising the following steps:

(a) providing a recombinant suspension growing mammalian cell comprising a template DNA construct at a genomic region that is transcriptionally active during culture in a serum free culture medium, said template DNA construct having a region containing elements for expression of a template protein of interest;

(b) based on said recombinant cell generating one or several candidate cells having improved characteristics for production of said template protein of interest;

(c) measuring production traits of said generated candidate cells and selecting a top candidate cell;

(d) introducing vector(s) coding for nucleic acid processing enzyme(s) and optionally a donor DNA construct into said top candidate cell resulting in the generation of a final cell containing a receiving DNA construct lacking said template protein of interest coding region(s) and having one or several sequence elements enabling the introduction of an expression vector DNA construct into said receiving DNA construct; and

(e) from said generated final cell creating a cell bank for transient protein production of a desired protein of interest belonging to the same class as said template protein of interest, wherein said transient protein production is achieved via introduction of a DNA expression vector plasmid or an in-vitro generated mRNA set encoding said desired protein of interest into a cell from said cell bank.

2. Method according to claim 1, wherein;

(a) said template DNA construct comprises in 5' to 3' sequence order a first recombinase recognition sequence, such as an attP or attB sequence, a first copy of a second recombinase recognition sequence, such as a loxP sequence, a promoter less selection marker gene and a region coding for a template protein of interest in any order and orientation followed by an additional copy of said second recombinase recognition sequence, such as a second loxP sequence; and

(b) the promoter driving said promoter less selection marker gene can be placed upstream of said first recombinase recognition sequence or between said first recombinase recognition sequence and said first copy of said second recombinase recognition sequence or downstream of said second copy of said second recombinase recognition sequence; and

(c) said first and second recombinase recognition sequences act as recognition sequences for different recombinases; and

(d) said receiving DNA construct is created by the introduction of a vector coding for a recombinase with specificity for said second recognition sequence, such as Cre, catalyzing excision of the region being flanked by said second recombinase recognition sequences and leaving only a single second recombinase recognition sequence, such as a loxP sequence, downstream of said first recombinase recognition sequence; and

(e) said final cell or cell population is selected based on lack of selection marker activity and/or genetic characterization.
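
The excision step in claim 2(d) can be illustrated with a minimal sketch that models the template DNA construct as an ordered list of named elements. The element labels (`promoter`, `selection_marker`, `template_POI`) and the function name `excise_between` are hypothetical and chosen only for illustration; the promoter position follows one of the options allowed in claim 2(b). The sketch shows construct bookkeeping only, not the biochemistry of recombination.

```python
from typing import List


def excise_between(construct: List[str], site: str = "loxP") -> List[str]:
    """Remove the region flanked by the first and last copy of `site`,
    leaving a single copy of `site` behind (as after Cre-mediated excision)."""
    positions = [i for i, element in enumerate(construct) if element == site]
    if len(positions) < 2:
        raise ValueError(f"need two copies of {site!r} to excise a region")
    first, last = positions[0], positions[-1]
    # Keep everything up to the first site, one copy of the site, and the rest.
    return construct[:first] + [site] + construct[last + 1:]


# Hypothetical template construct in 5' to 3' order, promoter upstream of attP.
template = ["promoter", "attP", "loxP", "selection_marker", "template_POI", "loxP"]
receiving = excise_between(template)
print(receiving)  # ['promoter', 'attP', 'loxP']
```

In this toy representation the receiving construct no longer contains the selection marker, which is consistent with selecting the final cell on lack of selection marker activity in claim 2(e).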

3. Method according to claim 1, wherein said top candidate cell is used as said final cell and step (d) is omitted and said template protein of interest region comprises inducible promoter(s) for expression of said template protein of interest, and wherein said template protein of interest expression can be silenced via changing culture conditions to inactivate said inducible promoter(s), and wherein said transient expression of said desired protein of interest is achieved via introduction of a plasmid or an in-vitro generated mRNA set encoding said desired protein of interest into a cell from said cell bank, and wherein said expression is achieved using culture conditions wherein said inducible promoter(s) is inactive.

4. Method according to any of claims 1-3, wherein said candidate cells are generated according to any of the following procedures:

(a) Isolating clones from a culture of said recombinant cell or descendants thereof;

or

(b) Isolating pools or clones after (1) different culture time in a continuous culture format such as a perfusion culture or (2) different number of individual cultures starting with inoculation of a first culture using said recombinant cell or descendants thereof and using a volume from a finished culture to inoculate a next culture or (3) a combination of (1) and (2); or

(c) performing targeted engineering by applying gene editing methods to introduce, remove or modify genetic material in the genome of said recombinant cell or descendants thereof; or

(d) using an isolated clone from a culture of said recombinant cell or descendants thereof to inoculate a culture in procedure (b); or

(e) any combination of procedures (a) to (d).

5. Method according to any of claims 1-4, wherein prior to generating said cell bank, said top candidate cell is engineered to increase the genomic and/or epi-genomic stability by using tools such as targeted gene editing nucleases or synthetic epi-genetic regulators.

6. Method according to any of claims 1-5, wherein said template DNA construct comprises a region coding for one or several selection markers such as a fluorescent protein gene such as GFP, CFP or RFP; a toxin or antibiotic resistance gene such as NeoR; or a metabolic enzyme such as DHFR or GS, and wherein cultures used to generate said candidate cells or cell populations are designed to require an active selection marker for survival and growth of cells, such as a Neomycin resistance protein as selection marker and presence of Neomycin during culture.

7. Method according to any of claims 1-6, wherein said cell bank for transient protein production also functions as a cell bank for cell line development, wherein cell line development is achieved by a donor DNA construct comprising a region coding for a desired protein of interest belonging to the same class as the template protein of interest being introduced via the use of an expression vector and the action of a DNA processing enzyme, and wherein said cell bank is first used for selection of a desired protein of interest via evaluation of candidate desired proteins of interest using transient protein production and/or for producing a desired protein of interest using transient protein production, and wherein said cell bank is then used to generate a stable cell line by cell line development and produce said desired protein of interest at a larger scale.

8. Method according to any of claims 1-6, wherein in addition said top candidate cell is also used to create a cell bank for cell line development, wherein cell line development is achieved by a donor DNA construct containing a region coding for a desired protein of interest belonging to the same class as the template protein of interest being introduced via the use of an expression vector and the action of a DNA processing enzyme, and wherein said cell bank for transient protein production is first used for selection of a desired protein of interest via evaluation of candidate desired proteins of interest using transient protein production and/or for producing a desired protein of interest using transient protein production, and wherein said cell bank for cell line development is then used to generate a stable cell line by cell line development and produce said desired protein of interest at a larger scale.

9. Method according to any of claims 1-8, wherein said production traits comprise at least one trait selected from the following list; cell growth, cell viability, cell specific productivity for the template protein of interest, template protein of interest aggregate level, level of template protein of interest charge variants, level of truncated protein of interest variants, glycosylation profile for the template protein of interest, glycosylation site occupancy for the template protein of interest, level of tertiary structure heterogeneity for the template protein of interest, level of host cell proteins secreted into the culture medium, level of lactate production, level of ammonium production, level of glucose consumption rate, performance in a pre-defined platform cell culture media, performance in a pre-defined platform bioprocess, fit with a defined host cell proteomic profile, fit with a defined host cell mRNA profile, fit with a defined host cell miRNA profile, fit with a defined host cell line metabolic profile, half-life of cellular template product mRNA, half-life of in-vitro generated template product mRNA, transcription activity of a defined genomic region, overall genomic stability and transfection efficiency in a transient expression process.
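
As an illustration of step (c) of claim 1 combined with the trait list in claim 9, the sketch below ranks candidate cells by a weighted sum of measured production traits. The weighted-sum scoring, the trait names, the weights and all numeric values are assumptions made for this example only; the claims do not prescribe how measured traits are combined into a selection decision.

```python
from typing import Dict


def select_top_candidate(candidates: Dict[str, Dict[str, float]],
                         weights: Dict[str, float]) -> str:
    """Return the name of the candidate with the highest weighted trait score.

    Traits are assumed to be pre-normalized to the range 0..1. A negative
    weight penalizes a trait that should be low (e.g. aggregate level).
    """
    def score(traits: Dict[str, float]) -> float:
        return sum(weight * traits[name] for name, weight in weights.items())

    return max(candidates, key=lambda name: score(candidates[name]))


# Hypothetical, made-up measurements for three candidate clones.
candidates = {
    "clone_A": {"specific_productivity": 0.90, "viability": 0.80, "aggregate_level": 0.30},
    "clone_B": {"specific_productivity": 0.70, "viability": 0.95, "aggregate_level": 0.10},
    "clone_C": {"specific_productivity": 0.95, "viability": 0.60, "aggregate_level": 0.50},
}
# Hypothetical weights; a real campaign would choose these per project.
weights = {"specific_productivity": 0.5, "viability": 0.3, "aggregate_level": -0.2}

top = select_top_candidate(candidates, weights)
print(top)  # clone_A
```

Other decision rules, such as hard thresholds on individual traits before ranking, would fit the same interface.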

10. Method according to any of claims 1-9, wherein the template protein of interest is coded by:

(a) a single gene of interest, such as ones encoding growth factors, blood clotting factors, cytokines, hormones, erythropoietins, albumins, virus proteins, virus protein mimics, bacterial proteins, bacterial protein mimics, domain antibodies, ScFvs, Affibodies, DARPINs, multimerization domains, IgG Fc domains, albumin binding domains, Fc receptor binding domains or fusion proteins based on combinations of the above single gene/mRNA of interest coded groups; or

(b) two or more genes of interest such as monoclonal antibodies based on naturally occurring scaffolds, bi-specific antibodies based on naturally occurring scaffolds, Fabs, virus like particles, native virus particles, multiple chain proteins based on association of two or more different protein chains selected from the list in (a).

11. Method according to any of claims 1-10, wherein; (i) the promoter(s) used for expression of said template protein of interest using said template DNA construct or template DNA expression plasmid are strong promoter(s) such as mCMV or hCMV, (ii) some or all of the 5'- UTRs, coding regions and 3'-UTRs in said template DNA construct have been designed for improved translational efficiency.

12. Method according to claim 11, wherein said improved translation efficiency is achieved via the presence of translation enhancement elements in the 5'-UTR of said template protein of interest gene construct(s) and/or via nucleotide sequence optimization of the protein coding sequence of said template protein of interest coding gene(s).

13. Method according to any of claims 1-12 wherein steps (a) to (c) of claim 1 are iterated in the following way prior to generating said cell bank:

(i) in a next iteration the top candidate cell from previous step (c) is used as said recombinant mammalian cell in step (a);

(ii) steps (b) to (c) are repeated using a modified template DNA construct having been modified to provide increased expression potential for said template protein of interest compared to earlier iterations;

and

(iii) repeating (i) and (ii) n times.
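
The iteration described in claim 13 can be sketched as a simple loop in which the top candidate of one round becomes the starting cell of the next round, each round using a template construct with higher expression potential. The function names, the string representation of cells and the toy `generate` and `measure` callables are hypothetical stand-ins for laboratory procedures and are not part of the claimed method.

```python
from typing import Callable, List, Tuple


def iterate_selection(start_cell: str,
                      constructs: List[str],
                      generate: Callable[[str, str], List[str]],
                      measure: Callable[[str], float]) -> Tuple[str, List[str]]:
    """Run one selection round per construct, feeding each round's top
    candidate into the next round as the starting cell."""
    cell = start_cell
    lineage = [cell]
    for construct in constructs:                 # constructs ordered by increasing potential
        candidates = generate(cell, construct)   # stands in for step (b)
        cell = max(candidates, key=measure)      # stands in for step (c)
        lineage.append(cell)
    return cell, lineage


def toy_generate(cell: str, construct: str) -> List[str]:
    # Toy stand-in: derive three labelled "clones" from the current cell.
    return [f"{cell}/{construct}-clone{i}" for i in range(3)]


def toy_measure(candidate: str) -> float:
    # Toy stand-in: use the clone index as the measured production trait.
    return float(candidate[-1])


final_cell, lineage = iterate_selection(
    "host", ["weak", "medium", "strong"], toy_generate, toy_measure)
print(final_cell)  # host/weak-clone2/medium-clone2/strong-clone2
```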

14. Method according to claim 13, wherein differences in template protein of interest expression potential are achieved by using promoters with increasing strength and/or by using different combinations of translation enhancement elements in the 5'-UTR of genes coding for said template protein of interest.

15. Method according to claim 14, wherein exchange of template DNA constructs is achieved in the following way;

(a) each modified template DNA construct is designed to have conserved sequence stretches X' and Y' in their 5'- and 3'-ends that are homologous to two sequences X and Y flanking said template DNA construct and being unique or rare in the genome of said mammalian host cell line;

(b) each modified template DNA construct is designed to have a gene editing nuclease target sequence where the sequence differs between generation z and z+1;

(c) modified template DNA constructs of generation z and z+1 contain different selection marker(s);

(d) a modified template DNA construct of generation z+1 is introduced together with a gene editing expression vector construct into a cell containing a template DNA construct of generation z and wherein the gene editing expression vector codes for a gene editing nuclease with specificity for said target sequence of the modified template DNA construct of generation z;

(e) Cells having undergone the correct exchange via double strand break catalyzed homologous recombination are enriched for by using the difference in selection markers between DNA constructs of generation z and z+1;

and

(f) optionally, DNA analysis methods are applied to ensure correct exchange only for cells passed on to next step.

16. Method according to claims 13 or 14, wherein exchange of modified template DNA constructs is achieved in the following way;

(a) template DNA constructs are of the type described in claim 2;

(b) said top candidate cell in a specific iteration is treated with a recombinase, such as Cre, to excise said region being flanked by said second recombinase recognition sequence;

(c) providing a DNA vector containing in clockwise sequence order (i) said second recombinase recognition sequence, such as loxP, followed by a promoter less clockwise encoded selection marker gene and a region encoding said template protein of interest including promoter(s) in a modified version compared to corresponding region in said top candidate cell or cell population, or (ii) a promoter less anti-clockwise encoded selection marker gene followed by said second recombinase recognition sequence, such as loxP, and a region encoding said template protein of interest including promoter(s) in a modified version compared to corresponding region in said top candidate cell or cell population ;

(d) contacting a cell having excised said region being flanked by said second recombinase recognition sequence with said DNA vector and a matching recombinase, such as Cre, to catalyze the creation of a next generation template DNA construct containing said region encoding said template protein of interest including promoter(s) in a modified version;

(e) selecting a cell having undergone the correct modification using said selection marker gene and/or genetic characterization.

17. Method according to any of claims 1-16 wherein said cell bank is based on a final cell containing a receiving DNA construct version containing a sequence z2 being unique or rare in the total genome sequence of said cell and two sequences X and Y flanking said receiving DNA construct and being unique or rare in the genome of said cell.

18. Method according to any of claims 1-16 wherein said cell bank is based on a final cell containing a receiving DNA construct version containing two recombinase recognition sequences flanking a first selection marker coding region and wherein said recombinase recognition sequences are both of the same type such as a serine recombinase type such as attP/attB or a tyrosine recombinase type such as Lox, Rox or FRT.

19. Method according to any of claims 1-16 wherein said cell bank is based on a final cell containing a receiving DNA construct version containing a single recombinase recognition sequence followed by an optional first selection marker coding region and wherein said recombinase recognition sequence is of a type such as a serine recombinase type such as attP/attB or a tyrosine recombinase type such as Lox, Rox or FRT.

20. Method according to any of claims 17-19, wherein said receiving DNA construct is introduced into said genomic region in the following way:

(a) providing a donor DNA construct containing said receiving DNA construct flanked by two sequences X' and Y' being homologous to two corresponding sequences X and Y in a template DNA construct containing cell and wherein said sequences X and Y are unique or rare in the genome of said template DNA construct containing cell and flanking said template DNA construct;

(b) defining a gene editing nuclease recognition sequence within said template DNA construct being unique or rare in the genome of said template DNA construct containing cell;

(c) vector(s) coding for a gene editing nuclease, such as a zinc finger nuclease/meganuclease/TALEN or a CRISPR/Cas9 combination, with specificity for said gene editing nuclease sequence is introduced together with said donor DNA construct into said top candidate cell;

(d) said gene editing nuclease creating a double strand break at said gene editing nuclease sequence catalyzing integration of said receiving DNA construct by cellular DNA repair mechanisms such as homologous recombination;

(e) cells having undergone correct introduction are selected via the use of a selection marker and/or genetic characterization.

21. A method for mammalian cell line development according to claims 7 and 8 and based on the cell bank developed according to claim 17 comprising the following steps:

(a) providing a matching expression vector in the form of a plasmid containing sequences X' and Y' being homologous to said sequences X and Y and wherein said sequences X' and Y' flank a desired protein of interest coding region including promoter(s) and a second selection marker coding region including promoter(s);

(b) introducing vector(s) coding for a gene editing nuclease with specificity for said sequence z2 together with said expression vector into said cell and wherein said gene editing nuclease is designed using any available technology platform such as zinc finger nucleases, meganucleases, TALENs or CRISPR/Cas9 designs;

(c) said genome editing nuclease generates a double strand break at sequence z2 catalyzing exchange of the regions flanked by said sequences X and Y in said receiving DNA construct and X' and Y' in said expression vector via cellular DNA repair mechanisms;

(d) only cells having undergone correct cassette exchange are selected via the use of selection marker(s) and/or genetic characterization.

22. Method for mammalian cell line development according to claims 7 and 8 and based on the cell bank developed according to claim 18 comprising the following steps:

(a) providing a matching expression vector in the form of a plasmid containing two recombinase recognition sequences flanking a desired protein of interest coding region and a second selection marker coding region in any order and orientation, and wherein said recombinase recognition sequences in the expression vector are of the same type as and matching said recombinase recognition sequences in said receiving DNA construct;

(b) introducing said expression vector together with vector(s) coding for a recombinase, and wherein said recombinase is of any type matching said recombinase recognition sequences in said receiving DNA construct and said expression vector such as one selected from the group of PhiC31, Cre, Dre, or Flp;

(c) said recombinase catalyzes the exchange of the regions flanked by said recognition sequences in said receiving DNA construct and said expression vector;

(d) only cells having undergone correct cassette exchange are selected via the use of selection marker(s) and/or genetic characterization.
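
The cassette exchange in claim 22(c) can be illustrated by modelling the receiving DNA construct and the expression vector as ordered lists and swapping the regions flanked by matching recognition sequences. The labels `site_L`, `site_R`, `genome_5prime`, `genome_3prime`, `marker_1`, `marker_2` and `desired_POI`, and the function names, are hypothetical; the sketch tracks element order only and does not model recombinase specificity or directionality.

```python
from typing import List, Tuple


def flanked_region(elements: List[str], left: str, right: str) -> Tuple[int, int]:
    """Return the indices of the left and right recognition sequences."""
    start = elements.index(left)
    end = elements.index(right, start + 1)
    return start, end


def cassette_exchange(receiving: List[str], vector: List[str],
                      left: str, right: str) -> List[str]:
    """Replace the region between `left` and `right` in the receiving
    construct with the corresponding flanked region of the vector."""
    r_start, r_end = flanked_region(receiving, left, right)
    v_start, v_end = flanked_region(vector, left, right)
    return receiving[:r_start + 1] + vector[v_start + 1:v_end] + receiving[r_end:]


# Hypothetical receiving construct: two recognition sequences flanking a first marker.
receiving = ["genome_5prime", "site_L", "marker_1", "site_R", "genome_3prime"]
# Hypothetical expression vector: same sites flanking the desired POI and a second marker.
vector = ["site_L", "desired_POI", "marker_2", "site_R"]

exchanged = cassette_exchange(receiving, vector, "site_L", "site_R")
print(exchanged)
# ['genome_5prime', 'site_L', 'desired_POI', 'marker_2', 'site_R', 'genome_3prime']
```

After the exchange the first marker is absent and the second marker is present, which mirrors the use of selection marker(s) to identify correctly exchanged cells in step (d).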

23. Method for mammalian cell line development according to claims 7 and 8 and based on the cell bank developed according to claim 19 comprising the following steps:

(a) providing a matching expression vector in the form of a plasmid containing a recombinase recognition sequence followed by a desired protein of interest coding region and a second selection marker coding region in any order, and wherein said recombinase recognition sequence in the expression vector is of the same type as and matching said recombinase recognition sequence in said receiving DNA construct;

(b) introducing said expression vector together with vector(s) coding for a recombinase, and wherein said recombinase is of any type matching said recombinase recognition sequences in said receiving DNA construct and said expression vector such as one selected from the group of PhiC31, Cre, Dre, or Flp;

(c) said recombinase catalyzes the integration of said expression vector at said recognition sequence in said receiving DNA construct resulting in the presence of a functional region for expression of said desired protein of interest;

(d) only cells having undergone correct integration of the expression vector are selected via the use of selection marker(s) and/or genetic characterization.

24. Method for mammalian cell line development according to claims 7 and 8 and based on the cell bank developed according to claim 2 comprising the following steps:

(a) providing a matching expression vector in the form of a plasmid containing in clockwise sequence order (i) a recombinase recognition sequence followed by a promoter less selection marker gene and a desired protein of interest coding region including promoters, or (ii) a promoter less anti-clockwise encoded selection marker gene, a recombinase recognition sequence and a desired protein of interest coding region including promoters, and wherein said recombinase recognition sequences in the expression vector are of the same type as and matching said recombinase recognition sequences in said receiving DNA construct;

(b) introducing said expression vector together with vector(s) coding for a recombinase, and wherein said recombinase is of any type matching said recombinase recognition sequences in said receiving DNA construct and said expression vector such as one selected from the group of PhiC31, Cre, Dre, or Flp;

(c) said recombinase catalyzes the targeted integration of said expression vector into said receiving DNA construct;

(d) only cells having undergone correct integration are selected via the use of selection marker(s) and/or genetic characterization.

25. Method according to any of claims 1-24, wherein said mammalian host cell line is a CHO cell line such as CHO DG44, CHO K1, CHO M, CHO-S or a CHO GS knockout cell line.

26. Method according to any of claims 1-25, wherein the 5'-UTR(s), signal peptide(s) and 3'-UTR(s) of said candidate desired protein of interest and desired protein of interest are identical to those used for said template protein of interest.

27. Method according to any of claims 1-26, wherein transient expression of a desired protein of interest is achieved using in-vitro generated mRNA(s) and the amount of in-vitro generated mRNA(s) introduced into cells from said cell bank is controlled so as to closely equal mRNA levels measured during a previous culture for expression of said template protein of interest.

Description:
Title: Method for protein expression

Field of the invention

The present invention relates to an improved method for transient protein expression, for some embodiments in combination with cell line development based on targeted gene integration using the same host cell line, which is generally applicable to production of any therapeutic protein or protein based format that can be produced using mammalian cell lines and in particular Chinese Hamster Ovary (CHO) cells.

Background of the invention

Over the last 30 years, recombinant protein therapeutics have evolved from a novelty to a dominating position among marketed drugs. Recombinant production of therapeutic proteins has surpassed a market volume of $100 billion per year and plays an important role in the global economy as well as in advanced medical care. The therapeutic protein class includes replacement proteins (insulin, growth factors, cytokines and blood factors), vaccines (antigens, VLPs) and monoclonal antibodies. Monoclonal antibodies are by far the dominating format. Some of the recombinant proteins can be produced in simple microbial cells such as E. coli, but for more complex proteins, including the monoclonal antibody class, Chinese Hamster Ovary (CHO) cells are the dominating host for production [1]. The monoclonal antibody class is projected to continue being the dominating format but with a larger heterogeneity in molecular structure within this class, including different multi-specific formats, fusion proteins, alternative scaffolds and antibody drug conjugates (ADCs). However, since most of these formats will still require advanced protein processing capacity (including glycosylation, disulfide formation and advanced folding machinery) not offered by microbial cells, CHO will likely continue to be the dominating production host for many years to come.

Increased knowledge about the molecular details underlying human diseases has revealed a huge heterogeneity of main diagnoses. As an example, breast cancer is no longer considered to be one disease but consists of at least several 10s of sub-diagnoses. Hence, protein therapeutics is becoming more targeted towards specific molecular mechanisms and will most likely be even more so in the future. Thus an increased number of drugs are needed to enable treatment of whole populations displaying different variants of disease at the molecular level. At the same time there is an increasing pressure to decrease the cost of healthcare, including drugs. Major contributors to the cost of therapeutic protein drugs are the long development time and frequent late failure of drug candidates. One approach to mitigate the risk of late failure and increase the development speed is to evaluate multiple drug candidates early for their developability potential (titer in intended production host, aggregation tendency, formulation stability, immunogenicity). For this to work the production of early protein material must be highly similar to the intended final process and require a minimum of time and effort. For complex protein therapeutics the final production is generally performed in a clonal CHO cell line carrying the recombinant genes stably inserted into the genome by a process referred to as Cell Line Development (CLD).

Currently the mainstream approach to cell line development using e.g. CHO (Chinese hamster ovary) cells is to use random integration of genes of interest followed by (a) selection of cells having the GOI (gene of interest) integrated and (b) a massive screening of clones to find specific clones with favorable production characteristics. The reasons why screening is needed are twofold. (i) As a GOI is integrated randomly into the genome, the resulting transcription level will be impacted by epigenetic regulation in the region of insertion. A clone having the GOI integrated into one or several highly active and stable genomic locations is needed. Typical cell lines generated generally contain between 5 and 20 copies of the GOI. (ii) A clone adapted to the burden of expressing a foreign protein at very high levels and with maintained good growth characteristics is needed. However, a CHO host cell is, for example, not a very competent secretor. Further, the CHO genome is highly plastic. By introducing expression of foreign secreted proteins at very high levels an evolutionary pressure towards increased folding and secretory capacity is introduced. By screening many clones, cells better adapted for high secretion can be found. The best random integration platforms today can yield high protein titers in a relatively short time period (~3 months) albeit using a very resource intensive workflow. Further, generated cell clones will be different at the genetic and phenotypic level between different cell line development efforts. This makes early developability assessment to improve efficiency of development difficult and increases process development efforts. One potentially major improvement is to utilize targeted integration (site-directed integration; SDI) of genes of interest. In such a scenario a pre-identified genomic location known to support high and stable transcription is used as a target destination for GOIs in all CLD efforts. Using intelligent combinations of pre-introduced sequences and vector designs, including the use of co-transfected nucleic acid enzymes such as nucleases or recombinases, will facilitate targeted insertion and ensure that all cells in culture will contain correctly inserted GOIs and hence have a high transcription rate [2]. This will significantly reduce the number of clones in the screening campaign. All clones will have the same relatively high transcription rate and hence all clones will also have an evolutionary pressure towards improved handling of the recombinant protein production burden. However, at least two challenges remain: (1) SDI generally only integrates a single copy of the GOI and hence the level of expression and evolutionary pressure is generally lower than what can be achieved using random integration and (2) one still needs to find a clone that has undergone genetic changes adapting it to e.g. high secretion.

An alternative approach to increase the speed of protein production is to skip cell line development altogether and use transient protein production. In such a scenario a host cell line is cultured and an expression vector is introduced into cells to achieve expression of the target protein without integration of genetic material into the genome of cells [13]. Hurdles for taking this approach into clinical production have been low titers, high costs when scaling up and safety concerns due to potential risks of random integration of expression vector DNA into the genome of production cells with unknown effects on the quality and purity of a target protein of interest. With a trend towards reduced volumes of clinical production batches, transient production could be feasible if the yields could be increased. However, one major obstacle is the low protein production competency of CHO host cell lines. It has been shown that numerous changes exist at the genetic and phenotypic level between a typical host cell line lacking recombinant genes in its genome and a final production clone selected during CLD and being of a high producer phenotype [3]. These differences represent the transformation towards an increased capacity to handle the metabolic burden of producing a foreign recombinant protein at very high levels. Changes will likely include an increased capacity for (i) amino acid synthesis and tRNA charging, (ii) protein folding and (iii) protein secretion together with an efficient basic metabolic phenotype. Finding a clone having undergone the desired transformation generally requires substantial screening. The CHO genome is highly plastic and this plasticity forms the engine for introducing variation for screening. However, it is highly likely that the evolutionary pressure of high recombinant protein expression is needed as an inherent selection agent to guide and maintain accumulation of a large number of changes beneficial for recombinant protein production (and avoid accumulation of negative changes) in a single clone. Thus, to increase speed and enable highly parallel developability assessments, there exists a need for an improved method for transient protein expression utilizing a host cell line that can increase yields from such a process.

Summary of the invention

The present invention relates to an improved method for transient protein expression which is generally applicable to production of any therapeutic protein that can be produced using mammalian cell lines and in particular Chinese Hamster Ovary (CHO) cells.

The method combines expression construct components improving the post-transcriptional processing of a gene of interest, the introduction of a one-time host cell line selection workflow using a template protein of interest expression construct to generate a production competent cell line, and approaches enabling inactivation of the template protein of interest. Expression and protein production of any desired protein of interest can then be performed using said production competent cell line in transient protein expression processes with improved yields and efficiencies. In a preferred related embodiment of the invention the production competent cell line generated can also be used for CLD using targeted integration, and hence the same cell line can be used for production of a desired protein of interest using both transient expression processes and production processes utilizing stable production cells generated using fast CLD workflows.

Thus, the invention relates to a method for creating a mammalian cell bank for transient protein production comprising the following steps:

(a) providing a recombinant suspension growing mammalian cell comprising a single copy of a template DNA construct at a genomic region that is transcriptionally active during culture in a serum free culture medium, said template DNA construct having a region containing elements for expression of a template protein of interest;

(b) based on said recombinant cell generating one or several candidate cells having improved characteristics for production of said template protein of interest;

(c) measuring production traits of said generated candidate cells and selecting a top candidate cell;

(d) introducing vector(s) coding for nucleic acid processing enzyme(s) and optionally a donor DNA construct into said top candidate cell enabling catalyzing the creation of a receiving DNA construct defined by removal or silencing of said introduced template protein of interest coding region(s) and the presence of one or several sequence elements 0 enabling the introduction of an expression vector DNA construct into said receiving DNA construct;

(e) selecting a final cell having undergone the correct modification; and

(f) from said final cell creating a cell bank for transient protein production of a desired protein of interest belonging to the same class as said template protein of interest, wherein said transient protein production is achieved via introduction of a DNA expression vector plasmid or an in-vitro generated mRNA set encoding said desired protein of interest into a cell from said cell bank.
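
Step (c) above can be illustrated with a minimal sketch. The trait names (specific productivity, peak viable cell density, viability), the example values and the multiplicative score are all hypothetical assumptions made for illustration; the method as described does not prescribe which production traits are measured or how they are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate cell from step (b). All trait fields are hypothetical."""
    name: str
    qp_pg_per_cell_day: float   # specific productivity of the template protein
    peak_vcd_e6_per_ml: float   # peak viable cell density
    viability_pct: float        # end-of-culture viability


def score(c: Candidate) -> float:
    # Illustrative proxy only: productivity x cell density, discounted by viability.
    return c.qp_pg_per_cell_day * c.peak_vcd_e6_per_ml * (c.viability_pct / 100.0)


def select_top_candidate(candidates):
    """Step (c): rank candidates on measured traits and return the best one."""
    if not candidates:
        raise ValueError("no candidate cells to rank")
    return max(candidates, key=score)


# Made-up measurements for three candidate clones.
candidates = [
    Candidate("clone_A", 12.0, 8.0, 90.0),
    Candidate("clone_B", 20.0, 6.0, 95.0),
    Candidate("clone_C", 25.0, 3.0, 80.0),
]
top = select_top_candidate(candidates)
print(top.name)
```

In practice the ranking would likely also weigh product quality attributes, which this sketch leaves out.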

The same class in respect of protein of interest and template protein refers to a group of proteins sharing a common sequence or structural feature. Examples of protein classes include antibodies of the same class (such as IgG1 antibodies), fusion proteins sharing at least one conserved domain (such as Fc-fusion proteins), or in general proteins sharing a conserved scaffold sequence where sequence variation is introduced only at defined regions.

Preferably the method is a method wherein steps (d) and (e) are omitted and said template protein of interest region comprises inducible promoter(s) for expression of said template protein of interest, and wherein said silencing of said template protein of interest expression is achieved via changing culture conditions to inactivate said inducible promoter(s), and wherein said transient expression of said desired protein of interest is achieved via introduction of a plasmid or an in-vitro generated mRNA set encoding said desired protein of interest into a cell from said cell bank, and wherein said expression is achieved using culture conditions wherein said inducible promoter(s) is inactive.

The candidate cells are preferably generated according to any of the following procedures:

(a) Isolating clones from a culture of said recombinant cell or descendants thereof;

or

(b) Isolating pools or clones after (1) different culture time in a continuous culture format such as a perfusion culture or (2) different number of individual cultures starting with inoculation of a first culture using said recombinant cell or descendants thereof and using a volume from a finished culture to inoculate a next culture or (3) a combination of (1) and (2);

or

(c) performing targeted engineering by applying gene editing methods to introduce, remove or modify genetic material in the genome of said recombinant cell or descendants thereof;

or

(d) using an isolated clone from a culture of said recombinant cell or descendants thereof to inoculate a culture in procedure (b); or

(e) any combination of procedures (a) to (d).

Prior to generating said cell bank, said top candidate cell is preferably engineered to increase the genomic stability of said top candidate cell using targeted gene editing methods.

The cell bank for transient protein production may also function as a cell bank for cell line development, wherein cell line development is achieved by a donor DNA construct containing a region coding for a desired protein of interest belonging to the same class as the template protein of interest being introduced via the use of an expression vector and the action of a DNA processing enzyme, and wherein said cell bank is first used for selection of a desired protein of interest via evaluation of candidate desired proteins of interest using transient protein production and/or for producing a desired protein of interest using transient protein production, and wherein said cell bank is then used to generate a stable cell line by cell line development and produce said desired protein of interest at a larger scale.

The top candidate cell may additionally be used to create a cell bank for cell line development, wherein cell line development is achieved by a donor DNA construct containing a region coding for a desired protein of interest belonging to the same class as the template protein of interest being introduced via the use of an expression vector and the action of a DNA processing enzyme, and wherein said cell bank for transient protein production is first used for selection of a desired protein of interest via evaluation of candidate desired proteins of interest using transient protein production and/or for producing a desired protein of interest using transient protein production, and wherein said cell bank for cell line development is then used to generate a stable cell line by cell line development and produce said desired protein of interest at a larger scale.

The template protein of interest may be coded by:

(a) a single gene of interest, such as ones encoding growth factors, blood clotting factors, cytokines, hormones, erythropoietins, albumins, virus proteins, virus protein mimics, bacterial proteins, bacterial protein mimics, domain antibodies, ScFvs, Affibodies, DARPINs, multimerization domains, IgG Fc domains, albumin binding domains, Fc receptor binding domains or fusion proteins based on combinations of the above single gene/mRNA of interest coded groups; or

(b) two or more genes of interest such as monoclonal antibodies based on naturally occurring scaffolds, bi-specific antibodies based on naturally occurring scaffolds, Fabs, virus like particles, native virus particles, multiple chain proteins based on association of two or more different protein chains selected from the list in (a).

In one embodiment the steps (a) to (c) of the method are iterated in the following way prior to generating said cell bank:

(i) in a next iteration the top candidate cell from previous step (c) is used as said recombinant mammalian cell in step (a);

(ii) steps (b) to (c) are repeated using a modified template DNA construct having been modified to provide increased expression potential for said template protein of interest compared to earlier iterations; and

(iii) repeating (i) and (ii) n times.
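
The iteration in (i) to (iii) can be sketched as a simple loop. This is only an illustration under stated assumptions: a cell is reduced to a single hypothetical "capacity" number, and `toy_generate` and `toy_measure` are made-up stand-ins for candidate generation (step (b)) and trait measurement (step (c)), not procedures taken from the description.

```python
def iterate_improvement(start_cell, loads, generate_candidates, measure):
    """Run steps (a)-(c) once per expression load.

    The top candidate of each round becomes the starting cell of the next
    round, and the expression load must increase from round to round.
    """
    loads = list(loads)
    if any(later <= earlier for earlier, later in zip(loads, loads[1:])):
        raise ValueError("expression load must increase between iterations")
    cell = start_cell
    history = []
    for load in loads:
        candidates = generate_candidates(cell, load)             # step (b)
        cell = max(candidates, key=lambda c: measure(c, load))   # step (c)
        history.append((load, cell))
    return cell, history


def toy_generate(capacity, load):
    # Hypothetical: descendants differ slightly in production capacity.
    return [capacity * factor for factor in (0.9, 1.0, 1.1)]


def toy_measure(capacity, load):
    # Hypothetical: measured output scales with both capacity and load.
    return capacity * load


final_cell, history = iterate_improvement(10.0, [1.0, 2.0, 4.0], toy_generate, toy_measure)
print(len(history), "rounds completed")
```

The loop returns the full history as well as the final cell, so that intermediate improved cells from each round remain available.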

The cell bank may be based on a final cell comprising a receiving DNA construct version comprising a sequence z2 being unique or rare in the total genome sequence of said cell and two sequences X and Y flanking said receiving DNA construct and being unique or rare in the genome of said cell.

Alternatively said cell bank is based on a final cell comprising a receiving DNA construct version containing two recombinase recognition sequences flanking a first selection marker coding region, wherein said recombinase recognition sequences are both of the same type, such as a serine recombinase type such as attP/attB or a tyrosine recombinase type such as Lox, Rox or FRT.

In a further alternative said cell bank is based on a final cell comprising a receiving DNA construct version comprising a single recombinase recognition sequence followed by an optional first selection marker coding region and wherein said recombinase recognition sequence is of a type such as a serine recombinase type such as attP/attB or a tyrosine recombinase type such as Lox, Rox or FRT.

The receiving DNA construct may be introduced into said genomic region in the following way:

(a) providing a donor DNA construct containing said receiving DNA construct flanked by two sequences X' and Y' being homologous to two corresponding sequences X and Y in a template DNA construct containing cell and wherein said sequences X and Y are unique or rare in the genome of said template DNA construct containing cell and flanking said template DNA construct;

(b) defining a gene editing nuclease recognition sequence within said template DNA construct being unique or rare in the genome of said template DNA construct containing cell;

(c) introducing vector(s) coding for a gene editing nuclease, such as a zinc finger nuclease, a meganuclease, a TALEN or a CRISPR/Cas9 combination, with specificity for said gene editing nuclease recognition sequence, together with said donor DNA construct into said top candidate cell;

(d) said gene editing nuclease creating a double strand break at said gene editing nuclease recognition sequence, catalyzing integration of said receiving DNA construct by cellular DNA repair mechanisms such as homologous recombination; and

(e) selecting cells having undergone correct introduction via the use of a selection marker and/or genetic characterization.
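
The requirement that the flanking sequences and the nuclease recognition sequence be "unique or rare" in the genome can be checked computationally. The sketch below is a minimal illustration that counts exact matches on both strands of a toy sequence; the function names, the `max_hits` threshold and the example sequences are hypothetical. A real assessment would typically use a genome-scale search tool and would also consider near matches, since nucleases can tolerate some mismatches.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def count_occurrences(genome: str, site: str) -> int:
    """Count exact, possibly overlapping matches of `site` on both strands."""
    genome = genome.upper()
    site = site.upper()
    # Using a set means a palindromic site is searched only once.
    patterns = {site, reverse_complement(site)}
    total = 0
    for pattern in patterns:
        start = genome.find(pattern)
        while start != -1:
            total += 1
            start = genome.find(pattern, start + 1)
    return total


def is_unique_or_rare(genome: str, site: str, max_hits: int = 1) -> bool:
    """True if the site is present and occurs at most `max_hits` times.

    `max_hits` is an assumed threshold; the description does not quantify "rare".
    """
    return 1 <= count_occurrences(genome, site) <= max_hits


# Toy stand-in for a genome sequence; not real CHO sequence.
toy_genome = "CCGATTACAGGTTTCCAGGTTT"
print(is_unique_or_rare(toy_genome, "GATTACA"))  # candidate site z
print(is_unique_or_rare(toy_genome, "AGGTTT"))   # repeated sequence
```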

The invention also relates to different embodiments of methods for mammalian cell line development based on the cell bank developed as described above, which will be described more closely in the detailed description of the invention. In all the embodiments of the invention the mammalian host cell line is preferably a CHO cell line such as CHO DG44, CHO K1, CHO M, CHO-S or a CHO GS knockout cell line.

In the methods of the invention transient expression of a desired protein of interest is preferably achieved using in-vitro generated mRNA(s) and the amount of in-vitro generated mRNA(s) introduced into cells from said cell bank is controlled so as to closely equal mRNA levels measured during a previous culture for expression of said template protein of interest.
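
As a rough worked example of such mRNA level matching, the following sketch converts a target number of mRNA copies per cell into a mass of in-vitro generated mRNA to add to a culture. Everything in it is an assumption made for illustration: the function name, the delivery efficiency parameter, the example numbers, and the approximation of about 320.5 g/mol per nucleotide for single-stranded RNA. It also ignores mRNA decay after delivery, so it only addresses the initial dose.

```python
AVOGADRO = 6.02214076e23          # molecules per mole
RNA_G_PER_MOL_PER_NT = 320.5      # approximate average mass of one RNA nucleotide


def mrna_dose_ng(copies_per_cell: float, n_cells: float,
                 length_nt: int, delivery_efficiency: float) -> float:
    """Estimate the mRNA mass (ng) needed to reach a target copy number per cell.

    copies_per_cell:     mRNA copies per cell measured for the template protein
    n_cells:             number of cells to be transfected
    length_nt:           length of the mRNA in nucleotides
    delivery_efficiency: assumed fraction of added mRNA that ends up in cells
    """
    if not 0.0 < delivery_efficiency <= 1.0:
        raise ValueError("delivery_efficiency must be in (0, 1]")
    molecules = copies_per_cell * n_cells / delivery_efficiency
    moles = molecules / AVOGADRO
    grams = moles * length_nt * RNA_G_PER_MOL_PER_NT
    return grams * 1e9


# Hypothetical numbers: 1000 copies per cell, 1e6 cells, 1500 nt mRNA, 10 % delivery.
dose = mrna_dose_ng(1000, 1e6, 1500, 0.1)
print(f"{dose:.2f} ng")
```

For a multi-chain protein the same calculation would be repeated for each mRNA in the set, using the copy number measured for the corresponding chain.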

Brief description of the drawings

Fig 1 is a general description of a method for generation of an improved cell C1 using a single TGOI (template gene of interest) load (Fig 1a) as well as the use of said improved cell for transient expression (Fig 1b) or targeted CLD generating a protein producing cell C2 (Fig 1c);

Fig 2 is a general description of a method for generation of an improved cell using an iterative workflow in which the expression load is gradually increased and intermittent improved cells are isolated after each increase in expression load until a final improved cell Cn+1 is generated;

Fig 3 describes specific implementations of the method described in Fig 1, wherein Fig 3a describes the TGOI under control of an inducible promoter enabling the TGOI to be actively expressed during the improvement workflow; and Fig 3b describes the TGOI inactive during conditions for transient expression of a desired gene of interest using a plasmid expression vector or a set of synthetic mRNAs;

Fig 4 describes a specific implementation for generation of the improved cell C1 carrying a template DNA construct T enabling excision of the TGOI via the transient expression of a recombinase;

Fig 5 describes a special embodiment of the method of Fig 2 utilizing reversible recombinase sites RS2 together with a matching recombinase (Rec-RS2) to perform iterative introductions and excisions of TGOI variants with increasing expression load;

Fig 6 describes cell line development using the cell C2 generated according to Fig 4;

Fig 7 describes a specific embodiment of the iterative approach described in Fig 2 wherein cassette exchange mediated by a double strand break inducing site specific nuclease is used to exchange between TGOI variants with increasing expression load;

Fig 8 describes gene editing approaches for modification of the TGOI containing template recombinant DNA construct T of the improved cell C1 to generate a cell C2 containing a desired receiving recombinant DNA construct R compatible with targeted CLD;

Fig 9 describes a specific example of a receiving recombinant DNA construct R based on a cassette of two recombinase recognition sequences RS together with different promoter placement alternatives, and further gives an overview of targeted CLD using such an improved cell C2 together with matching expression vectors EV and recombinase Rec-RS to generate a protein producing cell C3; and

Fig 10 describes another specific example of a receiving recombinant DNA construct R based on a single recombinase recognition sequence RS together with different promoter placement alternatives, and further gives an overview of targeted CLD using such an improved cell C2 together with matching expression vectors EV and recombinase Rec-RS to generate a protein producing cell C3.

Detailed description of the invention

The invention will now be described more closely in association with the accompanying drawings and some non-limiting Examples.

A unifying concept underlying the invention is the time limited utilization of an actively expressed template protein of interest for guiding the transformation of an initial mammalian host cell line having limited production capacity for a certain class of proteins into a high productivity host cell line that can be used for efficient production of desired proteins of interest using transient protein production methods, where the desired protein of interest belongs to the same class as the template protein of interest. The invention includes methods enabling removal or inactivation of template protein of interest expression before expression of a desired protein of interest is initiated. Compared to potential methods that use targeted engineering without the continuous expression of a template protein of interest to enable the desired transformation, the current invention offers improvements in that the effects of single or multiple changes can be directly assessed using a protein indicative of the performance for a desired class of proteins and that the expression conditions used for the template protein of interest (mRNA levels and mRNA design) can be reproduced for any desired protein of interest. In a preferred embodiment a key aspect of the invention is the opportunistic utilization of the inherent plasticity of the genome of typical mammalian host cell lines, such as the CHO genome, as an engine for generating epi-genetic and/or genetic changes enabling the desired transformation into a high productivity state, where the continuous high level expression of a template protein of interest during the transformation is used as an inherent selection agent and/or physiological state sensor directing and/or enabling detection of accumulation of positive changes leading towards the desired end goal.
The potential for template protein of interest expression acting as an inherent selection agent is based on the fact that a high recombinant expression load imposed on all cells will have an impact on their viability and growth. Cells that do not handle the recombinant expression load well are hypothesized (and this is generally accepted in the field) to be subjected to stress responses (amino acid shortage, charged tRNA shortage, hold up of the ribosomal machinery on recombinant mRNAs, hold up of the folding machinery on recombinant proteins, build-up of soluble or aggregated forms of recombinant protein within cells) reducing viability and growth. Further, cells having genetic/epigenetic changes leading to an improved handling of the recombinant expression load are hypothesized to have a higher viability and growth. Hence, by culturing cells for many generations, far exceeding what is used in typical CLD workflows, a large diversity of genetic/epigenetic changes is sampled, and enrichment of cells having accumulated multiple positive changes is hypothesized based on this directed evolution mechanism. An alternative way of screening through a massive diversity, far exceeding the current scope in CLD workflows, is to utilize the template protein expression as a physiological state sensor and perform iterative stages of clone screening followed by periods of culturing. Each stage of culturing introduces new diversity, and diversity can also be increased above natural levels via the use of epi-genetic de-regulators or means to increase mutation rates.
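
The hypothesized enrichment over many generations can be made concrete with a deliberately simple two-population model. This is an illustrative assumption, not a model given in the description: an "improved" subpopulation starts at fraction f0 and produces (1 + advantage) times as many descendants per generation as the remaining cells, with no new mutations, no cell death and constant growth rates.

```python
def improved_fraction(f0: float, advantage: float, generations: int) -> float:
    """Fraction of improved cells after a number of generations.

    f0:          starting fraction of improved cells (0 < f0 < 1)
    advantage:   relative growth advantage per generation (0.05 means 5 %)
    generations: number of generations in culture
    """
    if not 0.0 < f0 < 1.0:
        raise ValueError("f0 must be between 0 and 1")
    if advantage <= -1.0:
        raise ValueError("advantage must be greater than -1")
    gain = (1.0 + advantage) ** generations
    return f0 * gain / (f0 * gain + (1.0 - f0))


# Hypothetical numbers: 1 improved cell in 10,000 with a 5 % growth advantage.
short_run = improved_fraction(1e-4, 0.05, 60)
long_run = improved_fraction(1e-4, 0.05, 300)
print(f"after 60 generations: {short_run:.4f}")
print(f"after 300 generations: {long_run:.4f}")
```

Under these assumptions the improved cells remain a small minority after 60 generations but dominate the culture after 300, which is consistent with the argument that culture times far beyond those of typical CLD workflows are needed for enrichment by growth advantage alone.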

In a further preferred implementation the genome of the improved host cell line is stabilized ("Frozen") using targeted gene editing methods [8] following the desired transformation. Compared to potential approaches using targeted engineering only to achieve the desired transformation, the current invention offers potentially large improvements. As shown in comparisons of host cell lines and producer cell lines generated using cell line development screening efforts, a large number of more or less subtle differences, rather than a few dramatic differences, exist between the low productivity and the high productivity state [5]. Examples known in the art have also shown limited success using single or a few targeted changes only, and typically such changes yield competitive performance only after being combined with changes introduced using selection workflows (and hence the plasticity of genomes) following introduction of genes coding for a protein of interest [12]. Hence it is likely that a large number (10s to 100s) of more or less subtle changes spanning many different cellular pathways are needed to transform for example a CHO cell to a high productivity state. Further, it is likely that different protein classes will need at least partly different modifications for optimal performance. To achieve such large numbers of changes, and changes with such precision in their effect, will be highly challenging using targeted engineering approaches. First, the desired changes must be understood, and this will likely include a combination of major screening efforts and advanced systems biology modeling, and even so combinatorial effects might easily be missed. Secondly, generating clones having the correct set of multiple changes will require a range of gene editing tools and major selection and screening.
The approach disclosed in the current invention, based on utilizing the plasticity of host cell genomes during a prolonged time period, offers a means to sculpt the complete genome to achieve the desired transformation without the need for massive screening to understand and define the changes needed or the need for a large range of costly genome editing reagents. Further, the postulated cost and resource effectiveness of the proposed approach could enable the generation of a range of host cell lines adapted for the optimal production of different protein/biotherapeutic classes or even sub-groups with different characteristics (such as amino acid frequencies) within a given class. Finally, the non-targeted sculpting of the genome can be combined with specific targeted changes introduced by genome editing. Such changes could include modulation/control of glycosylation, further boosting of specific parts of the secretion machinery, introduction of machinery for non-natural amino acids, introduction of machinery for specific post translational protein modifications and the already mentioned changes for stabilizing the genome of a host cell.

In a most preferred general implementation of the invention the same transformed host cell line used for transient protein production of a certain desired protein class can also be used for cell line development using site directed integration to enable production of this protein class using a stable production cell line with identical or at least highly similar phenotype/genotype.

As outlined in the background there exists a need in the art for faster and more flexible protein expression methodologies capable of producing major complex biotherapeutic protein classes and simultaneously enabling: (a) predictive early developability assessment of a large number of slightly different protein variants targeting the same disease target to enable increased efficiency and success rates in drug development; (b) reduced time and efforts in developing new protein drugs to enable parallel development of an increased number of more targeted drugs and (c) expression technologies supporting the design of flexible and cost effective manufacturing facilities enabling production of many different protein drugs in the same facility with amount and number of drugs changing over time.

For drugs requiring only moderate protein production capacity, which is likely to become common in future targeted medicine scenarios, transient protein production is a highly attractive concept.

However, there are several technical obstacles which must be overcome in order to make transient expression a viable option for protein expression in manufacturing stages. These include the risk of stable introduction of genetic material when using plasmid expression vectors, the difficulty in delivering genetic material with high efficiency to cells growing to high densities in nutrient rich growth media without the need for growth media exchange, and the fact that cell lines such as CHO are not of a high producer phenotype without the selection approaches following stable integration of components for expression of a protein of interest. The risk of stable integration of genetic material is not covered in the current invention but can likely be overcome by using in-vitro generated mRNAs instead of plasmid DNA vectors as genetic material to achieve transient expression, and this will also likely give faster expression and increased control over intracellular mRNA levels. Efficient transfection of genetic material in growth media supporting high densities of suspension growing cells is not covered in this invention, but ongoing developments in the field are encouraging. With the methods disclosed in the current invention, efficient transient production of major biotherapeutic protein classes using e.g. in-vitro generated mRNA sets could be possible. Transient expression using mRNA enables production without any selection or screening steps and hence enables fast and highly parallel developability assessment in a straightforward set-up with minimal hands-on needs. This approach is faster and easier even when compared to a platform utilizing site directed integration into a host cell line that has been transformed into a high productivity state using the procedures described in the current invention, since no selection/screening step is needed. Further, if transient expression using the exact same host cell line is used also for manufacturing stages, the developability assessment should be fully predictive, as the same conditions can later be reproduced at larger scale. As a result increased efficiency in drug development could be achieved.

Transient expression also offers potential to increase flexibility and efficiency at manufacturing stages. Since the same host cell line can be used for production of multiple products belonging to the same class, less complex multi-product facilities utilizing a single common seed train can be designed. Further, due to the increased control over timing and level of intracellular mRNA, optimal expression profiles allowing the cells to better utilize their resources can be designed, e.g. by separating growth and protein production into two stages. This could also reduce product heterogeneity by minimizing product occupancy in the culture broth and ensuring that all product is produced using the same cellular physiological state. Further, the control over timing and levels of different intracellular mRNA species could enable improved performance for production of currently difficult to express protein classes such as certain IgGs, certain multi-chain bi-specific antibodies and certain virus like particles or virus vaccines. Specifically, transient expression approaches could potentially enable manufacturing of different classes of flu vaccines or other virus or virus like particle based vaccines using e.g. suspension growing CHO cell lines, enabling very fast response to e.g. pandemic threats. For scenarios where transient expression is not desired at clinical or manufacturing stages, either due to regulatory hurdles or cost concerns at larger scales, the embodiment of the current invention enabling both transient expression and site-directed integration of genes of interest into the same improved host cell line offers an expression platform with major improvements over what is currently available in the art. Using this platform transient expression can be used to enable fast and straightforward developability assessment during pre-clinical stages.
Since the phenotypes will be highly similar, the same mRNA design can be used, and the mRNA levels can be made to closely match between transient and stable expression scenarios, the developability assessment will likely be highly predictive. The invention will enable increased possibilities to evaluate developability traits (immunogenicity, protein titers, aggregation levels, formulation stability etc.) with high precision for an increased number of protein candidates in each drug development program, with cell lines and conditions nearly identical to the ones used in final production and without the data being corrupted by variation coming from differences in the physiological state of cell lines. This can in turn enable even larger cost savings and efficiency increases in drug development by increasing the likelihood of success and reducing the rates of late failures. The similarity in cellular phenotype and expression conditions should also enable the stage for switching between transient expression and stable expression to be highly flexible, since comparability of product quality or titers should not be an issue. Hence, the switch could be performed during entry into clinical stages, before entry into clinical stage III or only after an increased demand appears after launch of a new drug.

After this description of the general aspects and benefits of the current invention, specific embodiments will now be described with support of accompanying drawings. A conceptual general workflow for generating a host cell line with improved properties can be found in Figure 1a. First, an initial mammalian cell (C) carrying a single copy of a template DNA construct T, containing a fully functional template gene of interest (TGOI), at a defined location in the genome (HS) is provided. This cell is then put through a selection workflow (S1) to isolate/select a cell (C1) with highly increased capacity to produce the template protein of interest. The improved C1 cell will have a modified genome (G1) and/or transcriptome compared to the original cell genome (G) and/or transcriptome, reflecting multiple accumulated changes in diverse cellular pathways and processes which together give rise to a phenotype with the improved expression capacity. In a following step the template DNA construct (T) of the improved cell C1 is modified to create a receiving DNA construct (R) lacking the TGOI but containing sequences 0 enabling targeted integration of a desired gene of interest (GOI) from a matching expression vector (EV). This can either be achieved via the use of a DNA modifying enzyme (I) alone, cutting out the TGOI containing region from T, or via the combined use of a DNA modifying enzyme (I) and a donor DNA construct vector (DV). The resulting cell C2 will contain a receiving DNA construct (R) carrying sequences 0 enabling site directed integration of a GOI.
This improved cell line C2 can then be used either to enable highly improved performance in transient expression workflows by transfecting the cell with non-integrating expression plasmids or synthetic mRNA (Figure 1b) or be used in a streamlined CLD workflow (Figure 1c) in which C2 is contacted with an expression vector (EV) containing a matching recombinant construct R' enabling a targeted integration of a desired protein of interest gene by introduction of R' from EV with the aid of a DNA modifying enzyme with specificity for 0 (0) to create a cell C3 maintaining the expression phenotype and genome (G1) of C2 but now expressing the desired protein of interest. The genomic location should preferably be a hot spot region, meaning that it supports high transcription of introduced genes and that this transcription is stable over time and reproducible for different genes and different culture conditions. In particular, the transcription activity should be high and stable when using a serum free culture medium and growth under suspension conditions. Hot spot regions can be identified either via screening approaches, bioinformatics or a combination of these. The current invention presupposes that a defined genomic location has been selected and that the sequence of this genomic site is known. The template DNA construct T contains a TGOI and optionally gene(s) coding for selection marker(s) (SM(s)). Preferably the TGOI design contains genetic elements either outside or inside the coding sequence(s) providing a high level of translation power for the corresponding protein to maximize the expression load/potential.
Such elements can include strong promoters such as mCMV, hCMV or synthetic promoters [13], 5'-UTR designs providing increased mRNA stability and/or increased translation, 3'-UTR designs providing increased mRNA stability and/or increased translation, signal peptides providing improved secretion properties and optimized sequence stretches in coding regions based on synonymous codon changes. Preferentially, the design of these sequence elements is based on TEEs in the 5'-UTR and RESCUE-modification of the coding region [6, 7]. Typical mRNA transcriptional levels using a certain promoter at the hot spot region are preferably known so that the mRNA levels during transient transfection conditions can be matched.

The generation of an improved cell can either be performed using a single TGOI expression load as outlined in Figure 1 or by an iterative improvement workflow in which the expression load is gradually increased and intermediate improved cells are isolated after each increase in expression load until a final improved cell is generated, as outlined in Figure 2. In this workflow the first-step improved cell, carrying a recombinant DNA construct T enabling a first TGOI expression load, is contacted with exchange vector(s) enabling modification of T, including introduction of a new template DNA construct T1 enabling a second, higher TGOI expression load. This in turn enables the selection of a second-step improved cell. This gradual increase in expression load can be repeated any number of times until a final improved cell is generated. The expression load can be varied by using different promoter strengths in different TGOI construct generations, by utilizing 5'-UTR and 3'-UTR variants promoting different mRNA stability and/or translational efficiency, by utilizing coding sequences promoting different mRNA stability and/or translational efficiency or by changing the TGOI copy number between different generations. One specific embodiment of this iterative approach is to utilize cassette exchange mediated by a double strand break inducing site specific nuclease (Nz) as outlined in Figure 7. Here the template DNA construct T contains a sequence X at the 5'-end, a sequence Y at the 3'-end and an internal sequence z. All sequences X, Y and z are unique or rare in the genome (G1) of cell C1. To exchange T for T1 the cell C1 is contacted with an exchange vector V carrying T1 and a site specific nuclease (Nz) with specificity for z. T1 contains sequences at the ends that are either identical to or highly similar to the sequences X and Y in T.
The site specific nuclease creates a double strand break at z, catalyzing cassette exchange between T and T1 via homologous recombination repair mechanisms using T1 as a repair template. If additional cassette exchange reactions are planned, T1 should also contain a sequence z1 that is unique or rare in the genome (G1) to enable cassette exchange using a second site specific nuclease with specificity for z1. The site specific nuclease can be any type of gene editing solution such as zinc finger nucleases, homing endonucleases, TALENs or CRISPR/Cas9 variants. An alternative to using gene editing assisted homologous recombination for exchanging TGOI variants is to utilize reversible recombinase systems (see Figure 5), such as solutions based on Cre/loxP (RS2/Rec-RS2 in general terms), in which a TGOI variant (TGOIn) can be introduced at a loxP (RS2) sequence with the aid of the Cre recombinase (Rec-RS2), resulting in an introduced TGOI flanked by two loxP (RS2) sequences. Following isolation of an improved cell, the previous generation TGOI can be removed via the action of Cre (Rec-RS2), which re-creates the single loxP (RS2) sequence, enabling introduction of a new generation TGOI.
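The reversible insert/excise cycle described above can be pictured with a small bookkeeping sketch. It is purely illustrative: the locus is modelled as a list of labels, the names `insert_tgoi` and `excise_tgoi` are hypothetical, and no recombination kinetics or site orientation is modelled.

```python
from typing import List

LOXP = "loxP"  # stands in for the generic reversible site RS2


def insert_tgoi(locus: List[str], tgoi: str) -> List[str]:
    """Cre-mediated insertion at a lone loxP site: loxP -> loxP, TGOI, loxP."""
    if locus.count(LOXP) != 1:
        raise ValueError("insertion needs exactly one loxP site at the locus")
    i = locus.index(LOXP)
    return locus[:i] + [LOXP, tgoi, LOXP] + locus[i + 1:]


def excise_tgoi(locus: List[str]) -> List[str]:
    """Cre-mediated excision: the segment between two loxP sites is removed,
    leaving a single loxP site behind."""
    sites = [i for i, element in enumerate(locus) if element == LOXP]
    if len(sites) != 2:
        raise ValueError("excision needs exactly two loxP sites")
    first, second = sites
    return locus[:first] + [LOXP] + locus[second + 1:]


# Two TGOI generations cycled through the same (hypothetical) hot spot locus.
locus = ["hot_spot_5'", LOXP, "hot_spot_3'"]
for generation in ("TGOI_1", "TGOI_2"):
    locus = insert_tgoi(locus, generation)
    print("loaded: ", locus)
    # Isolation of an improved cell would take place at this point.
    locus = excise_tgoi(locus)
    print("cleared:", locus)
```

After each excision the locus returns to a single loxP site, which is what allows the next TGOI generation to be introduced.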

Two main approaches can be used to isolate an improved cell from an initial TGOI carrying cell. The first approach utilizes the inherent plasticity of the genome of typical mammalian host cell lines used for recombinant protein production. One embodiment of this approach to generate an improved cell line is to screen clones from a culture for a desired set of protein production traits and select the top performing clone. Protein production traits could be, but are not limited to: template protein of interest production rate or culture titer, template protein of interest aggregation level, template protein of interest charge heterogeneity, template protein of interest size heterogeneity, glycosylation site occupancy and glycosylation profile for the template protein of interest, cell growth characteristics and cell metabolic characteristics, tertiary structure profile for the template protein of interest, template protein of interest self-association tendency, DNA sequence profiles, mRNA profiles, miRNA profiles, proteomic profiles and genomic stability of cells. This can in principle be performed in analogy with current CLD screening approaches used in the field. There, initial screens of many clones using simple parallel culture formats and a few measured parameters such as titer and growth are followed by more extensive screening, including protein quality attributes as described above, of a smaller number of selected clones in more predictive culture formats such as shake flasks or bioreactors.
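The two-stage screening funnel described above can be sketched as follows. This is a minimal illustration with invented clone data; the field names, the ranking rule and the aggregate threshold are assumptions and are not taken from the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Clone:
    name: str
    titer_g_per_l: float        # simple first-stage readout
    growth_rate_per_day: float  # simple first-stage readout
    aggregate_pct: float        # quality attribute, second-stage readout


def primary_screen(clones: List[Clone], keep: int) -> List[Clone]:
    """Stage 1: rank many clones on titer (ties broken by growth), keep the top few."""
    ranked = sorted(
        clones,
        key=lambda c: (c.titer_g_per_l, c.growth_rate_per_day),
        reverse=True,
    )
    return ranked[:keep]


def secondary_screen(clones: List[Clone], max_aggregate_pct: float) -> Clone:
    """Stage 2: apply a quality filter, then pick the highest-titer survivor."""
    passing = [c for c in clones if c.aggregate_pct <= max_aggregate_pct]
    if not passing:
        raise ValueError("no clone meets the quality criterion")
    return max(passing, key=lambda c: c.titer_g_per_l)


# Hypothetical measurements, for illustration only.
clones = [
    Clone("A", 1.2, 0.60, 8.0),
    Clone("B", 2.5, 0.50, 1.5),
    Clone("C", 3.0, 0.40, 6.0),
    Clone("D", 0.4, 0.70, 1.0),
    Clone("E", 2.8, 0.55, 2.0),
]
shortlist = primary_screen(clones, keep=3)
best = secondary_screen(shortlist, max_aggregate_pct=5.0)
print([c.name for c in shortlist], "->", best.name)
```

In this toy data set the highest-titer clone is rejected in the second stage on quality grounds, which is the reason for carrying more than one clone forward from the first stage.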

A second embodiment is based on directed evolution of the cells via prolonged culture of the cells with recombinant expression pressure present. The high recombinant expression load imposed on all cells will have an impact on viability and growth. Cells that do not handle the recombinant expression load well are hypothesized to be subjected to stress responses (amino acid shortage, charged tRNA shortage, hold up of the ribosomal machinery on recombinant mRNAs, hold up of the folding machinery on recombinant proteins, build-up of soluble or aggregated forms of recombinant protein within cells) reducing viability and growth. Further, cells having genetic/epigenetic changes leading to an improved handling of the recombinant expression load are hypothesized to have a higher viability and growth. Hence, by culturing cells for many generations, far exceeding what is used in typical CLD workflows, a large diversity of genetic/epigenetic changes is sampled, and enrichment of cells having accumulated multiple positive changes is hypothesized based on this directed evolution mechanism. Preferentially the TGOI codes for a template protein of interest representing an important class of proteins such as IgG1 antibodies or Fc-fusion proteins, and preferentially a difficult to express protein of this class, to promote isolation of the highest possible production competency of the generated host cell line. Preferentially the culture of the cells is performed using conditions highly similar to a platform process defined for production of protein for clinical phases or commercial purposes, to enable the adaptation through directed evolution to be directly compatible with these conditions. This could for example mean using a bioreactor fed-batch culture with defined culture medium, feed medium and process parameters. Prolonged culture in this format could for example be achieved by inoculation of next generation cultures using a fraction of the culture from the previous culture.
Prolonged culture could also be achieved in a chemostat reactor or a perfusion culture, potentially repeated multiple times using seeding of cells from a previous culture stage. Preferentially a selection marker, such as Neomycin resistance, a DHFR gene or a GS gene, is used together with culture conditions that put a strong selection pressure for the presence of an active selection marker. This could for example be the use of a neomycin resistance gene as selection marker and the use of neomycin during culture.

Another potential selection marker design could utilize a genetic circuit coupling cellular survival directly to expression of the TGOI. Such a genetic circuit could be based on non-native miRNAs binding both to a sequence stretch of TGOI mRNA and a sequence stretch on a selection marker gene such as NeoR, GS or DHFR. This is to further ensure, in addition to the use of a transcription hot spot region, that the expression construct is not silenced during culture, which would lead to the enrichment of cells that are not expressing the template protein of interest. This approach has the potential to generate superior protein production clones as compared to approaches based on mere screening of clones. Typically in screening approaches a first culture is performed from which a first set of clones is selected. Individual clones are cultured for assessment, followed by a second selection of clones. This is repeated a few times. As genetic variants are removed early and a low number of generations are allowed between selection steps, a relatively low amount of genome variation is sampled using this approach. Using directed evolution and prolonged culture for many generations keeps all the genetic variation and allows time for accumulation of rare modifications and, most importantly, rare combinations of changes. Importantly, using this approach on a cell line lacking a TGOI would most certainly not lead to the same accumulation of positive protein production traits, as most such changes would not be favored without the evolutionary pressure of high recombinant expression load and would not be possible to detect without the presence of a TGOI. Directed evolution and screening can also be combined, and preferentially at least one final step including screening of production traits should be included.
Intermediate screening steps in a workflow based on directed evolution can be used to further ensure that the rare event of clones having managed to silence the SM/TGOI does not lead to such cells being enriched in cultures. If the workflow starts with a clone screening and selection step, a range of medium to high performance clones should preferably be selected for a round of culturing to ensure a large genetic diversity. Finally, a clone or a pool of cells isolated from any of these workflows is used to create a master cell bank (MCB) of a final improved host cell line. The final host cell line has accumulated genetic and/or epigenetic changes compared to the initial host cell line and recombinant mammalian host cell. In addition to the phenotypic diversity generated during cell growth, phenotypic diversity could also be artificially increased between selection/screening rounds by use of chemicals such as epigenetic de-regulators or by radiation increasing mutation rates. Besides utilizing the natural or artificially enhanced plasticity in the genome to sample or promote random changes, a second approach based on targeted engineering can also be used to generate the final host cell line for CLD. A cell (C) according to Figure 1a that is subjected to a defined TGOI expression load is subjected to targeted changes to the genome (G) via the use of genome editing enzymes (such as zinc finger nucleases, meganucleases, TALENs, CRISPR/Cas9 variants) and recombinant nucleic acid donor constructs to knock out genetic functionality or add novel genetic functionality. After selection of cells having undergone correct/desired changes, the effect of these targeted changes is evaluated in follow-up cultures to look at protein production traits such as those described above.
A range of different individual targeted changes can be evaluated, and cells with targeted changes having positive effects on protein production traits can then be subjected to an iterative approach adding and evaluating additional changes. This process can be repeated until a final clone or cell pool with desired properties (based on the accumulation of one or multiple targeted changes) can be isolated. Preferentially the evaluation of protein production traits is performed using culture conditions highly similar to a platform process defined for production of proteins for clinical phases or commercial purposes to enable a fit to these conditions. This could for example mean using a bioreactor fed-batch culture with defined culture medium, feed medium and process parameters. Compared to targeted engineering approaches applied on host cell lines lacking a TGOI, the method according to the present invention provides several major advantages. First, the presence of a TGOI with controlled expression properties that can be reproduced for any GOI of the same class following CLD enables evaluation of targeted changes to be performed in conditions that are predictive of the intended final use. Secondly, the continuous presence of an expression load during the engineering workflow reduces the risk of loss of functionality due to genetic instability. As an added feature, directed evolution, screening of natural genetic diversity and targeted changes can be combined in any form, together with conditions predictive of the final use, to generate the final improved host cell line. In one embodiment of the invention the instability of the host cell genome is first used to enable generation of an improved host cell via multiple genetic and/or epigenetic changes throughout the genome that would likely be difficult to generate using targeted engineering alone. In a second stage the instability of the genome is reduced either via directed evolution/selection or via targeted engineering.
Research is currently underway to define engineering targets enabling stabilization of the genome of for example CHO cells [8].

As previously described, the isolation/selection steps can also be repeated multiple times using a gradually increased expression load as outlined in Figure 2. The rationale for a stepwise increase in the expression load is to enable a gradual move towards a higher performance phenotype. If the initial expression load is too high, only a low number of clones will be capable of matching this expression load and show up as competent producers during screening, and hence only a low number of clones can be passed on to a second round of growth and screening. This leads to an early, potentially detrimental reduction of phenotypic diversity. There is also a high risk of a low survival frequency and very low growth of cultured cells, again leading to inefficient sampling of clonal variation and slow accumulation of improved phenotypes. The initial clonal variation might not even be sufficient to enable survival of any cells at all. By gradually increasing the expression load, cells can gradually adapt and an increased genetic variation can be sampled and taken forward in each iteration.
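The rationale for the stepwise increase can be illustrated with a deliberately crude toy model. Every quantity in it is an assumption made for illustration only: each cell lineage is reduced to a single "capacity" number, a lineage survives a round only if its capacity is at least the imposed load, and each surviving lineage gains a fixed capacity increment per round. It is not a biological model and makes no quantitative prediction.

```python
from typing import List, Sequence


def run_workflow(
    capacities: Sequence[float],
    loads: Sequence[float],
    gain_per_round: float,
) -> List[float]:
    """Apply successive expression loads to a population of lineages.

    Lineages whose capacity is below the current load are lost; the
    survivors adapt by a fixed (assumed) gain before the next round.
    """
    population = list(capacities)
    for load in loads:
        population = [c for c in population if c >= load]
        if not population:
            break  # nothing left to adapt or to carry forward
        population = [c + gain_per_round for c in population]
    return population


start = [3, 4, 5, 6]  # hypothetical initial clonal variation

single_step = run_workflow(start, loads=[10], gain_per_round=3)
stepwise = run_workflow(start, loads=[4, 7, 10], gain_per_round=3)

print("single high load:", single_step)
print("stepwise loads:  ", stepwise)
```

With these invented numbers, no lineage survives when the final load is imposed directly, while three of the four lineages reach the same final load when it is approached in steps.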

Some examples of improvements over typical procedures in the field have been described in previous art. First, directed evolution of a final host cell line has been proposed [9]. However, in this case directed evolution is performed on an initial cell line lacking the expression of a template protein of interest and a hot spot integration site, and selection traits are not directly linked to protein production traits. Further, the generated host cell line then utilizes random integration for CLD, and no mention is made of either targeted integration or transient expression workflows. Host cells generated using this approach will not have been subjected to a pressure to accumulate changes improving protein production traits, and as adaptation to specific culture conditions has been done without the recombinant expression burden, there is a risk of sub-optimal adaptation to the conditions experienced during production of a recombinant protein. The present invention represents several improvements over this approach. The expression of a template protein of interest enables selection/evolution of protein production traits matching the combined demands of the specific culture conditions and a high level recombinant expression pressure. Further, the cassette exchange approach to CLD or the use of transient expression processes enables the conditions experienced by the cells during production of a desired protein of interest to be highly similar to the conditions used during generation of the host cell line. Utilization of pre-adapted cells has also been proposed for targeted integration based CLD [10]. In this approach it is proposed that a cell line generated using random integration CLD and displaying desired protein production traits should be selected as a source for generating a final host cell line.
In the proposed procedure, the genomic location is identified (must be a single site) and the recombinant constructs are cut out using gene editing based homologous recombination and exchanged for a construct carrying a selection marker flanked by recombinase sequences. After isolation of cells having undergone correct exchange, the genomic site is treated with a recombinase to cut out the selection marker and leave a single recombinase site flanked by a promoter. This host cell line can then be used for targeted integration of a second expression construct. In this approach there is not a match between the expression load provided by the multiple copies of the first expression construct and the single copy of a second expression construct following CLD. Hence, the properties of the host cell line are not likely to be fully suitable to the new conditions. This mismatch can be further increased if the culture conditions are different between the initial cell line and the second cell line. Most importantly no mention of the possibility to use this host cell line for improved transient protein production is made. Hence, the current invention offers multiple improvements over this approach in that the selection/evolution of traits can be better matched between host cell line and the cell line producing the desired protein of interest using transient protein expression processes or using a stable cell line based on CLD using targeted integration. In addition the increased sampling of diversity possible by directed evolution and the possibility to add targeted modifications as described in the current invention has the potential to generate production clones with superior protein production traits.

Using the natural diversity of cells has recently been discussed and highlighted as a potentially superior approach in a GEN article [14]. Using selection to generate a high performance cell expressing a certain template protein is contemplated there. However, instead of isolating this cell and using it directly in subsequent CLD workflows, the article proposes identifying engineering targets by detailed omics characterization, to enable reproduction of a high productivity cellular phenotype using targeted engineering approaches.

Following the detailed outline of the general concept of the invention and its benefits above, some non-limiting specific implementations will now be described. In a first specific implementation (Figure 3) the TGOI is under control of an inducible promoter, enabling the TGOI to be actively expressed during the improvement workflow (Figure 3a) but inactive during conditions for transient expression of a desired gene of interest using a plasmid expression vector or a set of synthetic mRNAs (Figure 3b). This implementation is only applicable for transient expression and not for complementing targeted CLD for expression of a desired gene of interest (GOI). In a preferred specific implementation the improved cell C1 carrying a receiving DNA construct R is generated according to Figure 4, using a recombinase (Rec-RS2) to excise a TGOI region flanked by reversible recombinase target sequences RS2. CLD using this cell is outlined in Figure 6. The cell C1, containing one irreversible recombinase recognition sequence (RS1) and one reversible recombinase recognition sequence (RS2), is contacted with a recombinase with specificity for RS1/RS1' and a matching expression vector (EV) containing a recombinant DNA construct R' with a matching irreversible recombinase recognition sequence RS1', a desired gene of interest (GOI) and a selection marker (SM2). Following insertion, a cell or a pool of cells having undergone correct modification is selected using the selection marker (SM2). The recombinases used can be of any reversible and irreversible type, such as attP/attB/PhiC31 as an irreversible system and loxP/Cre as a reversible system.

In a third set of specific implementations, the template recombinant DNA construct T of the improved cell C1 is modified to a receiving recombinant DNA construct R via the use of gene editing approaches as outlined in Figure 8. The improved cell C1 contains two long sequences (> 300 bp and typically >= 1 kb) X and Y that flank the template gene of interest and an optional selection marker and are unique or rare in the genome G1. Further, the flanked region contains an additional shorter sequence z (15-40 nt) that is unique or rare in the genome G1. When C1 is contacted with a donor vector (DV) containing a recombinant DNA construct T' and a site specific gene editing DNA nuclease with specificity for z (Nz), a cassette exchange between T and T' occurs via homologous repair mechanisms catalyzed by a double strand break at z. The two sequences X' and Y' are identical or highly homologous to the sequences X and Y in the genome G1. The resulting recombinant DNA construct R contains sequences 0 enabling targeted integration of a GOI into the hot spot region HS. The site specific nuclease can be based on any gene editing solution such as zinc finger nucleases, homing endonucleases, TALENs or CRISPR/Cas9 variants or other CRISPR systems.
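The requirement that z be unique or rare in the genome can be checked computationally when the genome sequence is available. The sketch below is a minimal exact-match count on both strands, assuming the genome is held as a plain string. The function names and the `max_hits` threshold are hypothetical, and a real off-target analysis would also consider near matches, which this sketch does not.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_occurrences(genome: str, site: str) -> int:
    """Count exact, possibly overlapping, occurrences of site on one strand."""
    if not site:
        raise ValueError("site must not be empty")
    genome, site = genome.upper(), site.upper()
    count, start = 0, genome.find(site)
    while start != -1:
        count += 1
        start = genome.find(site, start + 1)
    return count


def count_both_strands(genome: str, site: str) -> int:
    """Count exact occurrences of site on either strand of the genome."""
    forward = count_occurrences(genome, site)
    rc = reverse_complement(site)
    if rc == site.upper():
        return forward  # palindromic site: both strands give the same hits
    return forward + count_occurrences(genome, rc)


def is_usable_nuclease_site(genome: str, site: str, max_hits: int = 1) -> bool:
    """True if site has the stated length (15-40 nt) and at most max_hits exact hits."""
    if not 15 <= len(site) <= 40:
        raise ValueError("z is described as 15-40 nt long")
    return count_both_strands(genome, site) <= max_hits


# Hypothetical 20 nt site embedded once in a toy "genome".
z = "ACGTTGCAAGGCTTAACCGG"
toy_genome = "TTTT" + z + "TTTT"
print(is_usable_nuclease_site(toy_genome, z))
print(is_usable_nuclease_site(toy_genome + reverse_complement(z), z))
```

The second call adds a copy of the site on the opposite strand, so the site is no longer unique under the default threshold.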

One approach utilizes a receiving DNA construct design R in which optional SM gene(s) are flanked by two recombinase recognition sequences (RS) and an expression vector design in which GOI(s) and optional second SM gene(s) are flanked by matching recombinase recognition sequences (RS'). By co-transfecting the improved cell C1 with the expression vector in the form of a plasmid and a plasmid encoding a recombinase (Rec-RS) with specificity for RS/RS', a cassette exchange between R and R' is achieved. Only a cell C2 having undergone the correct exchange can be selected via the difference in SMs between R and R'. The resulting recombinant DNA construct E contains recombined recombinase recognition sequences RC. Depending on the recombinase system used, these can either be different from RS and RS' and differ between the 5' and 3' sequences (as for attP/attB/PhiC31) or be identical to RS/RS' (as for loxP/Cre). The recombinase recognition sequences used can be of any type, such as a serine recombinase type such as attP/attB or a tyrosine recombinase type such as Lox, Rox or FRT, together with matching recombinases such as PhiC31, Cre, Dre or Flp. Different examples of this approach based on varying the promoter placement for R and R' are outlined in Figure 9.

In another approach the receiving DNA construct R contains a single recombinase recognition sequence RS followed by optional SM gene(s), and the matching expression vector EV contains a single matching recombinase recognition sequence RS' followed by GOI(s) and SM gene(s) in any order. By co-transfecting the improved cell C1 with the expression vector in the form of a plasmid and a plasmid encoding a recombinase (Rec-RS) with specificity for RS/RS', the R' construct is introduced into the cell, simultaneously changing the relative position of the original R construct, so that the resulting recombinant DNA construct E is a combination of R and R'. Different examples of this approach based on varying the promoter placement for R and R' are outlined in Figure 10.

In any of the above embodiments of the invention the TGOI/GOI could contain a single gene of interest coding for (or mRNA set encoding) proteins such as growth factors, blood clotting factors, cytokines, hormones, erythropoietins, albumins, virus proteins, virus protein mimics, bacterial proteins, bacterial protein mimics, domain antibodies, ScFvs, Affibodies, DARPINs, multimerization domains, IgG Fc domains, albumin binding domains, Fc receptor binding domains or fusion proteins based on combinations of the above single gene of interest coded groups. The TGOI/GOI could also contain two or more genes of interest coding for (or an mRNA set encoding) proteins such as monoclonal antibodies based on naturally occurring scaffolds, bi-specific antibodies based on naturally occurring scaffolds, Fabs, virus like particles, native virus particles, multiple chain proteins based on association of two or more different protein chains selected from the list of single gene coded proteins above. In preferred embodiments of the invention the TGOI (or mRNA set) of the host cell line and the GOI (or mRNA set) used for protein production of a desired protein of interest encodes proteins belonging to the same protein class. In further preferred embodiments the TGOI (or mRNA set) represents a hard to express protein of that protein class. In further preferred embodiments a single copy of TGOI and GOI is used or the mRNA levels are designed to match between host cell line generation and conditions for production of a desired protein of interest. 
In further preferred embodiments genetic elements, such as promoter(s), 5'-UTR(s), signal peptide(s), design principle for synonymous nucleotide encoding in the coding region and 3'-UTR(s), used in the TGOI and GOI (or mRNA set) are identical (for transient expression during generation of the host cell line and use of stable cell lines for production of a desired protein of interest, the mRNA amounts in said mRNA set should be designed to match typical levels generated using a certain promoter at said genomic region in said host cell line). In further preferred embodiments a TGOI construct design (or mRNA set design) enabling a very high expression load is used, such as genetic elements having previously been shown to support an expression rate of a recombinant protein in excess of 40, 60 or 80 picogram of protein per cell and day or a final culture titer of 1, 3 or 5 g/l. In some embodiments multiple final host cell lines having utilized different TGOIs (or mRNA sets) of the same protein class but with different amino acid ratios are available, and the specific cell line used for CLD or transient protein production using a specific GOI (or mRNA set) is selected based on the closest match between amino acid ratios of the template protein of interest (encoded by TGOI) and the desired protein of interest (encoded by GOI). In some embodiments the TGOI (or mRNA set) and GOI can represent identical proteins at the amino acid sequence level but be encoded by different nucleotide sequences. In some embodiments multiple final host cell lines having utilized identical TGOIs (or mRNA sets) but selected/derived to display a specific protein quality profile, such as a specific glyco-profile, are available. The specific cell line used for CLD or transient protein production using a specific GOI (or mRNA set coding for the desired protein of interest) is selected based on the closest match between the desired protein quality profile and the available protein quality profiles.
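Selecting a host cell line by closest match of amino acid ratios can be sketched as a nearest-neighbour lookup on amino acid composition. The description does not specify a similarity measure; the Euclidean distance between fractional compositions used here is an assumption, as are the function names and the toy sequences.

```python
from collections import Counter
from math import sqrt
from typing import Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def composition(protein: str) -> Dict[str, float]:
    """Fraction of each standard amino acid in a protein sequence."""
    counts = Counter(protein.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {a: counts[a] / total for a in AMINO_ACIDS}


def composition_distance(p: str, q: str) -> float:
    """Euclidean distance between two compositions (an assumed metric)."""
    cp, cq = composition(p), composition(q)
    return sqrt(sum((cp[a] - cq[a]) ** 2 for a in AMINO_ACIDS))


def closest_host(goi_protein: str, host_templates: Dict[str, str]) -> str:
    """Name of the host cell line whose template protein best matches the GOI."""
    return min(
        host_templates,
        key=lambda name: composition_distance(goi_protein, host_templates[name]),
    )


# Toy sequences, for illustration only; real templates would be full proteins.
hosts = {"host_A": "AAAAKKKK", "host_B": "GGGGSSSS"}
print(closest_host("AAAKKKKG", hosts))
```

The same lookup could be applied to other profile vectors, such as a glyco-profile, provided that a suitable distance measure is chosen for that profile.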

References

[1] Hacker, D.L., De Jesus, M., Wurm, F.M., 2009. 25 years of recombinant proteins from reactor-grown cells - where do we go from here? Biotechnology Advances 27,1023-1027.

[2] Wirth, D., et al., Road to precision: recombinase-based targeting technology for genome engineering. Current Opinion in Biotechnology, 2007.

[3] D. Wirth, L. Gama-Norton, R. Schucht, K. Nehlsen "Site-Directed Engineering of Defined Chromosomal Sites for Recombinant Protein and Virus Expression- Site-directed engineering of defined chromosomal sites", BioPharm International, Volume 22, Issue 7 (2009)

[4] Alexandra Baer and Jurgen Bode, Coping with kinetic and thermodynamic barriers: RMCE, an efficient strategy for the targeted integration of transgenes, Current Opinion in Biotechnology 2001, 12:473-480

[5] Wei-shou Hu; Cell Culture Bioprocessing Engineering; ISBN 978-0-9856626-0-8, pages 127-146

[6] WO2009/075886 (Translation Enhancer Elements, TEEs)

[7] WO2010/98861 (RESCUE)

[8] http://www.chorus.co.at/projects/genomic-stability-of-the-host-cell-line.html, December 3rd 2014

[9] Nazanin Dadehbeigi et al.; Robust and efficient recombinant mAb production using a proprietary CHO host cell with improved characteristics identified through directed evolution, conference poster http://www.fujifilmdiosynth.com/pdfs/CCE_XIV_Poster_Fay_Saunders.pdf

[10] Eric Rhodes; Gene editing approaches for viable commercial production; conference presentation Bioprocess summit Boston Aug 2012

[11] US 6 632 672 (Stanford att-site patent).

[12] Kim J.Y., et al.; CHO cells in biotechnology for production of recombinant proteins: current state and further potential; Appl Microbiol Biotechnol (2012) 93:917-930

[13] Hacker D.L., et al.; Polyethyleneimine-based transient gene expression processes for suspension-adapted HEK-293E and CHO-DG44 cells; Protein Expression and Purification 92 (2013) 67-76.

[14] Angelo DePalma; Cell-Line Optimization: Nature or Nurture? Are Great Cell Lines Born or Made? Both!; GEN Nov 1, 2015 (Vol. 35, No. 19)