SHRIMP ALKALINE PHOSPHATASE - NORWEGIAN INST OF FISHERIES &

Title:

SHRIMP ALKALINE PHOSPHATASE

Document Type and Number:

WIPO Patent Application WO/2002/031157

Kind Code:

A2

Abstract:

The present invention relates to a nucleic acid molecule comprising a nucleotide sequence as set forth in SEQ ID Nos. 1 or 20 or having at least 55% sequence identity therewith and/or capable of hybridising under medium stringency conditions to the complementary sequence of SEQ ID Nos. 1 or 20, or the complementary sequence of said sequences, wherein said nucleotide sequence encodes or is complementary to a sequence which encodes a heat labile alkaline phosphatase, and to a recombinant heat labile alkaline phosphatase comprising: (a) all or a significant part of an amino acid sequence as shown in SEQ ID No. 2; or (b) all or a significant part of an amino acid sequence which has at least 60% sequence identity with SEQ ID No. 2, as well as methods for the manufacture thereof.

Inventors:

GARDNER REBECCA (GB)
NILSEN INGE (NO)
OEVERBOE KERSTI (NO)

Application Number:

PCT/GB2001/004509

Publication Date:

April 18, 2002

Filing Date:

October 10, 2001

Assignee:

NORWEGIAN INST OF FISHERIES & (NO)
GARDNER REBECCA (GB)
NILSEN INGE (NO)
OEVERBOE KERSTI (NO)

International Classes:

A01K67/027; A01K67/033; C12N1/15; C12N1/19; C12N1/21; C12N5/10; C12N9/16; C12N15/09; (IPC1-7): C12N15/52

Domestic Patent References:

WO2000034476A2

2000-06-15

Other References:

R.L. OLSEN ET AL: "alkaline phosphatase from the hepatopancreas of shrimp (pandalus borealis): a dimeric enzyme with catalytically active subunits" BIOCHEM. PHYSIOL., vol. 99b, no. 4, 1991, pages 755-761, XP001062205
I.W. NIELSEN ET AL.: "Thermolabile alkaline phosphatase from northern shrimp (pandalus borealis) : protein and cDNA sequence" COMPARATIVE BIOCHEMISTRY AND PHYSIOLOGY, 2001, pages 853-861, XP001057780
DATABASE EMBL [Online] Acc. No.: Q9EQ79, XP002192892

Attorney, Agent or Firm:

Gardner, Rebecca (179 Queen Victoria Street, London EC4V 4EL, GB)

Claims:

1.	A nucleic acid molecule comprising a nucleotide sequence as set forth in SEQ ID Nos. 1 or 20 or having at least 55% sequence identity therewith and/or capable of hybridising under medium stringency conditions to the complementary sequence of SEQ ID Nos. 1 or 20, or the complementary sequence of said sequences, wherein said nucleotide sequence encodes or is complementary to a sequence which encodes a heat labile alkaline phosphatase.

2.	A nucleic acid molecule as claimed in claim 1 comprising a nucleotide sequence which has at least 75% sequence identity with the nucleotide sequence of SEQ ID Nos. 1 or 20.

3.	A nucleic acid molecule as claimed in claim 1 or claim 2 comprising a nucleotide sequence which has at least 85% sequence identity with the nucleotide sequence of SEQ ID Nos. 1 or 20.

4.	An isolated nucleic acid molecule which encodes or is complementary to a sequence which encodes a heat labile alkaline phosphatase, said alkaline phosphatase comprising the amino acid sequence of SEQ ID No. 2 or variants of said amino acid sequence which are at least 50% identical thereto.

5.	A nucleic acid molecule as claimed in claim 3 wherein said alkaline phosphatase has an amino acid sequence which is at least 70% identical to the amino acid sequence of SEQ ID No. 2.

6.	A nucleic acid molecule as claimed in claim 3 wherein said alkaline phosphatase has an amino acid sequence which is at least 80% identical to the amino acid sequence of SEQ ID No. 2.

7.	A nucleic acid molecule as claimed in any preceding claim wherein the alkaline phosphatase is Pandalus borealis alkaline phosphatase.

8.	A genetic construct comprising a nucleic acid molecule as claimed in any one of claims 1 to 7 operably linked to a promoter sequence.

9.	A recombinant expression vector comprising a nucleic acid molecule as claimed in any one of claims 1 to 7 and which is capable of propagation in a host cell.

10.	A vector as claimed in claim 9 which is a plasmid or viral vector.

11.	A cell transformed with a construct or vector as claimed in any one of claims 9 to 11.

12.	A cell as claimed in claim 11 which is a prokaryotic cell.

13.	A cell as claimed in claim 11 which is part of a eukaryotic cell culture.

14.	A recombinant cell or organism having a nucleic acid molecule comprising a nucleotide sequence as defined in any one of claims 1 to 3.

15.	A recombinant heat labile alkaline phosphatase comprising: (a) all or a significant part of an amino acid sequence as shown in SEQ ID No. 2; or (b) all or a significant part of an amino acid sequence which has at least 60% sequence identity with SEQ ID No. 2.

16.	A recombinant alkaline phosphatase as claimed in claim 15 having an amino acid sequence as shown in SEQ ID No. 2 or an amino acid sequence which has at least 75% sequence identity with the sequence of SEQ ID No. 2.

17.	A recombinant alkaline phosphatase as claimed in claim 15 or 16 which is a Pandalus borealis alkaline phosphatase.

18.	A recombinant cell or organism which is capable of expressing a heat labile alkaline phosphatase which comprises the amino acid sequence of SEQ ID No. 2 or an amino acid sequence which has at least 60% sequence identity with SEQ ID No. 2.

19.	A recombinant cell or organism which is capable of expressing a heat labile alkaline phosphatase which comprises the amino acid sequence of SEQ ID No. 2 or an amino acid sequence which has at least 75% sequence identity with SEQ ID No. 2.

20.	A method for the manufacture of heat labile alkaline phosphatase, comprising transforming a host cell with an expression vector as claimed in claim 9, maintaining the cell in a culture medium and isolating heat labile alkaline phosphatase expressed by said cell.

21.	A nucleic acid molecule comprising a nucleotide sequence as set forth in SEQ ID No. 15 or having at least 55% sequence identity therewith and/or capable of hybridising under medium stringency conditions to the complementary sequence of SEQ ID No. 15, or the complementary sequence of said sequences, wherein said nucleotide sequence encodes or is complementary to a sequence which encodes a heat labile alkaline phosphatase.

22.	A recombinant heat labile alkaline phosphatase comprising: (a) all or a significant part of an amino acid sequence as shown in SEQ ID No. 16; or (b) all or a significant part of an amino acid sequence which has at least 60% sequence identity with SEQ ID No. 16.

23.	A recombinant alkaline phosphatase as claimed in claim 22 having an amino acid sequence as shown in SEQ ID No. 16 or an amino acid sequence which has at least 75% sequence identity with the sequence of SEQ ID No. 16.

24.	A method for the preparation of an expression vector as claimed in claim.

Description:

Shrimp Alkaline Phosphatase

The present invention relates to the enzyme alkaline phosphatase and more particularly to this enzyme derived from Pandalus borealis and to nucleic acid molecules encoding it.

Alkaline phosphatases (EC 3.1.3.1), having the alternative names alkaline phosphomonoesterase, phosphomonoesterase or glycerophosphatase, are orthophosphoric-monoester phosphohydrolases with enzyme activity optima at alkaline conditions. Examples of alkaline phosphatase (ALP) substrates are DNA, RNA, and ribo- as well as deoxyribonucleoside triphosphates. The hydrolysis produces an alcohol and an orthophosphate.

In other words, ALPs dephosphorylate DNA, RNA, rNTPs and dNTPs. Dephosphorylation of proteins by various ALPs has also been reported. ALPs are widespread in nature, found in organisms ranging from bacteria to humans.

Complex organisms usually contain both tissue-specific and non-specific ALPs. The polypeptides differ in size from 15 to 170 kDa. Some of these proteins are bound or "anchored" to cellular membranes. Most commonly the enzymes have a requirement for divalent metal cations such as Mg2+ or Zn2+.

Current usage of ALPs in molecular biology is focused on, but not limited to, three main applications of DNA analysis or preparation: 1) dephosphorylation of vector DNA after restriction enzyme digestions to minimise self-ligation of the cloning vector, thus favouring ligation of insert to the vector and creating a recombinant construct, 2) dephosphorylation of dNTPs after PCR amplifications, in combined use with a single-strand exonuclease that hydrolyses primers to dNTPs, to omit the need of further clean-up before direct DNA sequencing of PCR products, or 3) dephosphorylation of DNA ends for subsequent labelling with 32P using [γ-32P]NTP and T4 polynucleotide kinase. The described ALP enzyme reactions are intermediate steps in DNA analysis processes. Other important applications known in the art include those where the enzyme activities are used as reporters such as in enzyme-linked immunosorbent assay (ELISA), in gene-fusion or gene-delivery systems, or in conjugation to oligonucleotides used as hybridisation probes.

Three main ALPs are used in commercially available products:

i) calf intestinal alkaline phosphatase (CIAP);
ii) shrimp alkaline phosphatase (SAP) from the arctic shrimp Pandalus borealis; and
iii) bacterial alkaline phosphatase (BAP).

The animal CIAP and SAP enzymes have similar specific activities of 2000-4000 units/mg protein compared to 50 units/mg protein for BAP, which makes the two former enzymes more attractive as efficient enzymes.

The SAP enzyme is inactivated by moderate heat as discussed below, and is thus preferred for several applications in which the enzyme activity needs to be removed prior to further steps in the processes.

A genetically engineered temperature sensitive BAP mutant was reported [Shandilya, H. and Chatterjee, D. K., 1995. An engineered thermosensitive alkaline phosphatase for dephosphorylating DNA. Focus, 17 (3): 93-95] with no details given to describe the engineering. This mutant enzyme (TsAP), sold by LifeTechnologies, Inc., is inactivated (95% or more) by heat (65°C for 15 min) in the presence of EDTA only.

The recommended reaction temperature for the mutant and the wild-type enzyme is also 65°C. TsAP has at least 40-fold higher activity than wild-type BAP. With reference to high specific activity, TsAP is almost comparable to other ALPs such as CIAP and SAP.

Two heat-labile ALPs from a psychrophilic microorganism have been purified and characterised [de Prada, P., and Brenchley, J. E., 1997, Purification and characterization of two extracellular alkaline phosphatases from a psychrophilic Arthrobacter isolate, Appl. Env. Microbiol., 63 (7): 2928-2931]. The enzymes, which varied with respect to substrate specificities and kinetic properties, displayed different heat labilities, the most labile resembling SAP in this respect. No specific activity in units/mg protein and no primary structures were reported.

A cold-adapted ALP from Atlantic cod was isolated and characterised [Asgeirsson, B., Hartemink, R., and Chlebowski, J. F., 1995, Alkaline phosphatase from atlantic cod (Gadus morhua). Kinetic and structural properties which indicate adaption to low temperatures. Comp. Biochem. Physiol., 110B (2): 315-329]. The enzyme showed thermolability similar to SAP. No primary structure of the protein/gene was provided.

A study on ALP isozymes from trout [Whitmore, D. H., and Goldberg, E., 1972, Trout intestinal alkaline phosphatases II. The effect of temperature upon enzymatic activity in vitro and in vivo. J. Exp. Zool., 182: 59-68] showed that the temperature of the environment affects the isozyme pattern, and that some isoforms are thermolabile.

Shrimps from the warm water region outside Taiwan contain several ALPs [Lee, A.-C., and Chuang, N.-N., 1991, Characterization of different molecular forms of alkaline phosphatase in the hepatopancreas from shrimp Penaeus monodon (Crustacea: Decapoda). Comp. Biochem. Physiol., 99B (4): 845-850], and the enzymes were concluded to be heat-stable. No primary structures were provided.

An alkaline phosphatase activity from the hepatopancreas of arctic shrimp Pandalus borealis was found to be contained in the processing wastewater from the shrimp industry [Olsen, R. L., Johansen, A., and Myrnes, B., 1990, Recovery of enzymes from shrimp waste. Process Biochem. 25: 67-68], and the enzyme was later purified from the hepatopancreas [Olsen, R. L., Øverbø, K., and Myrnes, B., 1991, Alkaline phosphatase from hepatopancreas of shrimp (Pandalus borealis): a dimeric enzyme with catalytically active subunits. Comp. Biochem. Physiol. 99B (4): 755-761]. This purified protein had an apparent molecular weight of 65 kDa (each subunit) and was shown to be a dimeric enzyme with catalytically active subunits, in contrast to most other animal ALPs, which require dimerisation for activity.

According to the report SAP has an isoelectric point of 3.7. The shrimp enzyme efficiently removes terminal 5' phosphate from any DNA strand termini (5' and 3' overhang or blunt ends) produced by restriction endonucleases, although 5' protruding ends are more reactive than blunt ends or 5' recessive ends.

Relative to CIAP, the SAP enzyme has a slight shift to lower temperatures for activity, but it is not considered to be truly cold-active. With a maximum enzyme activity at about 40°C (45°C for CIAP), almost 40% activity is retained at 10°C or at 50°C compared to 10% and 90% activity, respectively, for CIAP. Although the temperature for maximum activity is close to 40°C, the SAP enzyme starts to lose activity when pre-incubated for a period of 15 min beyond 37°C. After pre-incubation at 65°C the SAP activity is reduced by 95% or more and the activity is undetectable after pre-incubation at 70°C. In comparison, after similar heat treatments CIAP retains 40% and 20% of its activity, respectively.

Thus, relative to its commercial competitor, the SAP enzyme is heat-labile and cold-active making it particularly suited for use in multi-step laboratory protocols where a simple heating step can de-activate the enzyme so that it plays no part in further method steps.

BIOTEC ASA, Tromsø, Norway, produces the commercial SAP enzyme from the shrimp industry wastewater.

Onboard the trawlers, freshly collected shrimps are frozen in large blocks. When landed the shrimps are carefully thawed by re-circulated cold water.

Approximately 1000 l of water is used for 4000 kg of shrimp. During the process of freezing and thawing, the shrimp hepatopancreas breaks and the contents are released to the water. This wastewater is then concentrated and several chromatographic steps are used to purify the SAP enzyme.

The producer supplies SAP for the world market through well-known companies like USB, Boehringer Mannheim or Amersham Pharmacia Biotech.

Due to its high specific activity, its "versatility" regarding DNA termini, and its relative temperature-lability, SAP is frequently used in dephosphorylation of cloning vectors prior to ligation reactions, and in treatment of PCR amplification product-mixtures prior to DNA sequencing reactions as described in US patents 5,741,676 and 5,756,285.

The present production of SAP suffers from varying quality of the wastewater, which again affects the production efficiency. Two factors cause this variation; i) natural seasonal variation of enzyme production in the shrimp; and ii) the handling of the shrimp source prior to or during freezing and the handling of shrimps or water during or after the thawing process.

There is also a concern about the future availability of wastewater: i) as a natural resource shrimps are not guaranteed to be available at all times; and ii) the shrimp industry is now looking into possible new ways to freeze the shrimps, i.e. single-freezing, from which the wastewater has been found on testing to contain only small amounts of enzymes such as SAP.

In addition to these practical problems, the market has a demand for a recombinant SAP product as recombinant products are frequently preferred in molecular biology techniques, particularly where product purity is an issue, e.g. in the production of DNA based therapeutics or in forensic science. SAP has highly advantageous enzymatic and physicochemical properties and is a preferred enzyme for the laboratory protocols in which it is useful. There is therefore an appreciable need for a synthetic or recombinant source of SAP, which is produced in a uniform and pure fashion. However, neither DNA nor amino acid sequences for SAP have been previously elucidated.

In order to isolate the gene and to subsequently clone and produce a recombinant SAP, several attempts have been made to obtain an N-terminal sequence of the purified protein. None of these attempts have succeeded. The sequence analyses have revealed multiple (four or more) alternative amino acids for each specific N-terminal position. Thus, no protein sequence information has been available for use, even with highly degenerate oligonucleotide probes, to hunt for the SAP gene. The reason(s) for the non-homologous N-termini in an apparently homogeneous enzyme preparation can only be speculated upon. It is possible that SAP isoforms are produced by different genes, by alternative splicing or by varying post-translational modifications of the protein, or that SAP is attacked by proteases at the N-terminus with no detectable reduction in molecular weight.

It has also been found by the present inventors that the limited homology between the alkaline phosphatase from the hepatopancreas of arctic shrimps (Pandalus borealis) and alkaline phosphatase from species which have been sequenced prevents the routine isolation of nucleic acid sequences encoding alkaline phosphatase from Pandalus borealis.

In spite of these real difficulties in obtaining a coherent consensus amino acid sequence for SAP, the inventors have surprisingly been able to isolate a cDNA encoding SAP from a Pandalus borealis cDNA library and thus elucidate the coding sequence for SAP. The invention relates to the provision of an SAP cDNA and its use in providing an alternative source of the enzyme.

The sequence of SEQ ID No. 1 corresponds to the full cDNA sequence of the enzyme minus a small portion at the 5' end. Nevertheless, the expression product of a nucleic acid molecule comprising this sequence is believed to have all or at least a significant proportion (i.e. at least 60%, 70%, 80% or typically at least 90%) of the activity of the native enzyme. The few N-terminal amino acids missing from this sequence do not contain binding sites for the substrate or cofactor.

Hence reference herein to SAP or to a heat labile alkaline phosphatase is taken to also include molecules which are smaller than naturally occurring ALPs but have the same or substantially the same activity.

Advantageously, such expression products and other recombinant ALP enzymes according to the invention may have a greater activity than the native enzyme as isolated. Preferably the specific activity is higher than the published value for SAP of 1900 U mg⁻¹. A suitable test for enzymatic activity is described herein [Olsen et al.].

Thus in one aspect, the present invention provides a (an isolated) nucleic acid molecule comprising (preferably consisting essentially of) a nucleotide sequence as set forth in SEQ ID No. 1 or having at least 55% sequence identity therewith and/or capable of hybridising under medium stringency conditions to the complementary sequence of SEQ ID No. 1, or the complementary sequence of said sequences, wherein said nucleotide sequence encodes or is complementary to a sequence which encodes a heat labile alkaline phosphatase.

A nucleic acid molecule according to the invention may thus be single or double stranded DNA, cDNA or RNA.

A nucleic acid molecule of the invention may be an isolated nucleic acid molecule (in other words isolated or separated from the components with which it is normally found in nature) or it may be a recombinant or a synthetic nucleic acid molecule.

Reference is made herein to sequences consisting essentially of the sequence of SEQ ID No. 1 and its variants. Thus flanking regions at either end may also be present. In particular, a 5' flanking region may be present of 1 to 81 or 1-60 nucleotides, more particularly 1-30 nucleotides, which may incorporate a signalling sequence for directing processing of the active mature form of the enzyme during transcription/translation. 3' flanking regions will typically contain 1-60, e.g. 1-30 additional nucleotides.

The complement of a given sequence will be a single precise sequence which can be determined according to the normal rules of base pairing.
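By way of illustration only, the base pairing rule referred to above can be sketched in a few lines of Python. The function names and the six-base test string are hypothetical and are not taken from the sequence listing; the sketch assumes plain DNA letters (A, C, G, T) with no ambiguity codes.

```python
# Watson-Crick pairing for DNA (A-T, C-G); names here are illustrative only.
_PAIR = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Return the base-by-base complement, written in the same order as the input."""
    return seq.translate(_PAIR)


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5' to 3'."""
    return complement(seq)[::-1]


toy = "ATGCCG"  # hypothetical example sequence, not from the patent
print(complement(toy))
print(reverse_complement(toy))
```

Because each base has exactly one partner, applying the reverse complement twice returns the starting sequence, which is the sense in which the complement is "a single precise sequence".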

Preferably the above nucleotide sequence will have at least 65%, more preferably at least 75%, e.g. at least 85% or at least 95% sequence identity with SEQ ID No. 1 or its complement. Likewise, the nucleotide sequence will preferably hybridise under high stringency conditions to SEQ ID No. 1 or its complement. While the present invention is based on the identification of the amino acid sequence and cDNA sequence for SAP, it is well understood in the art that variants of these sequences, e.g. allelic variants and related sequences modified by single or multiple base substitution, addition or deletion, will exist or can be generated which are functionally equivalent to the newly identified sequences.

A particularly preferred embodiment of the present invention is an isolated nucleic acid molecule which encodes a thermolabile eukaryotic alkaline phosphatase, preferably such an enzyme derived from shrimp, and derivatives of such enzymes. These derivatives may incorporate conservative amino acid substitutions, in that amino acids from the naturally occurring enzyme may be replaced by amino acids having similar functional characteristics in terms, e.g., of size, lipophilicity and polarity. Such functionally related groups of amino acids are understood by the man skilled in the art.

Further additions, deletions or substitutions may also be included which do not significantly affect the enzyme's catalytic activity. Typically the enzyme derivatives will have 1-50, preferably 1-30, e.g. 1-15 non-conservative alterations as compared to the native enzyme.

By "functionally equivalent" is meant nucleic acid sequences which encode polypeptides (or polypeptides themselves) having alkaline phosphatase activity, preferably of similar or greater specific activity compared to the native enzyme isolated from Pandalus borealis processing wastewater. In other words, functionally equivalent nucleic acid sequences encode polypeptides which can catalyse the removal of terminal 5' phosphate groups from DNA strand termini (5' and 3' overhang or blunt ends) produced, for example, by restriction endonucleases.

Variations in the alkaline phosphatase-encoding nucleotide sequence may occur between different populations of Pandalus borealis within the species, between similar populations of different geographical origin, and also within the same organism. Such variations are included within the scope of this invention.

For the purposes of defining the level of stringency, a medium stringency comprises a hybridisation and/or a wash carried out in 0.2 x SSC-2 x SSC buffer, 0.1% (w/v) SDS at 42°C to 65°C, while a high stringency comprises a hybridisation and/or a wash carried out in 0.1 x SSC-0.2 x SSC buffer, 0.1% (w/v) SDS at a temperature of at least 55°C. Conditions for hybridisations and washes are well understood by one normally skilled in the art.
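The two stringency definitions above can be written down as data, which makes it easy to see where they overlap. The following is a minimal sketch only: the class and function names are hypothetical, the SDS content is not modelled, and the ranges are simply those stated in the preceding paragraph.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stringency:
    name: str
    ssc_min: float               # weakest buffer, in multiples of SSC
    ssc_max: float               # strongest buffer, in multiples of SSC
    temp_min_c: float            # lowest temperature in degrees C
    temp_max_c: Optional[float]  # None where the text states no upper limit


# Ranges as given in the text; SDS is left out of this sketch.
MEDIUM = Stringency("medium", 0.2, 2.0, 42.0, 65.0)
HIGH = Stringency("high", 0.1, 0.2, 55.0, None)


def matching_levels(ssc: float, temp_c: float) -> List[str]:
    """Return the names of the stringency levels whose ranges include the conditions."""
    hits = []
    for level in (MEDIUM, HIGH):
        in_buffer = level.ssc_min <= ssc <= level.ssc_max
        in_temp = temp_c >= level.temp_min_c and (
            level.temp_max_c is None or temp_c <= level.temp_max_c
        )
        if in_buffer and in_temp:
            hits.append(level.name)
    return hits


print(matching_levels(1.0, 50.0))
print(matching_levels(0.2, 60.0))
```

As written, the two ranges share the boundary case of 0.2 x SSC between 55°C and 65°C, so a wash under those conditions falls within both definitions.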

Included within the scope of the invention are nucleotide sequences which hybridise to SEQ ID NO. 1 or its complement, or to parts thereof (i.e. to hybridisation probes derived from SEQ ID NO. 1), under high stringency conditions and which preferably encode or are complementary to a sequence which encodes an alkaline phosphatase enzyme or part thereof. Conditions of high stringency may readily be determined according to techniques well known in the art, as described for example in Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, 2nd Edition. Hybridising sequences included within the scope of the invention are those binding under non-stringent conditions (6 x SSC/50% formamide at room temperature) and washed under conditions of high stringency (e.g. 0.1 x SSC, 68°C), where SSC = 0.15 M NaCl, 0.015 M sodium citrate, pH 7.2.
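As a worked example of the buffer arithmetic, the definition of SSC given above (0.15 M NaCl, 0.015 M sodium citrate) can be scaled to the strengths mentioned in the text. This is only an illustrative sketch; the function name is hypothetical, and decimal arithmetic is used so that the printed concentrations are exact.

```python
from decimal import Decimal
from typing import Dict

# 1 x SSC as defined in the text.
SSC_1X = {"NaCl": Decimal("0.15"), "sodium citrate": Decimal("0.015")}


def ssc_molarity(strength: str) -> Dict[str, Decimal]:
    """Molar concentration of each solute in an n x SSC buffer (strength given as text)."""
    factor = Decimal(strength)
    return {solute: conc * factor for solute, conc in SSC_1X.items()}


for strength in ("6", "0.1"):
    print(strength, "x SSC:", ssc_molarity(strength))
```

On this basis the 6 x SSC binding buffer is 0.9 M NaCl and 0.09 M sodium citrate, and the 0.1 x SSC wash is 0.015 M NaCl and 0.0015 M sodium citrate.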

A hybridisation probe may be a part of the SEQ ID No. 1 sequence (or complementary sequence), which is of sufficient base length and composition to function to hybridise to sample or test nucleic acid sequences to determine whether or not hybridisation under high stringency conditions occurs. The probe may thus be at least 15 bases in length, preferably at least 30, 40, 50, 75, 100 or 200 bases in length. Representative probe lengths thus include 30-500 bases, e.g. 30-300, 50-200, 50-150, 75-100.

Nucleotide sequence identity may be determined using the BestFit program of the Genetics Computer Group (GCG) Version 10 Software package from the University of Wisconsin. The program uses the local homology algorithm of Smith and Waterman with the default values: Gap creation penalty = 50, Gap extension penalty = 3, Average match = 10.000, Average mismatch = -9.000.
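To make the percentage thresholds concrete, the following sketch computes percent identity from two sequences that have already been aligned. It is not the BestFit program and does not perform the Smith and Waterman alignment itself; the function name is hypothetical, and the choice to count only columns in which neither sequence has a gap is an assumption made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns in which both sequences carry the same letter."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: three of four columns agree.
print(percent_identity("ACGT", "ACGA"))
```

A sequence would meet, for example, the 55% threshold if the value returned for its alignment with the reference sequence is at least 55.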

Methods for producing variants of the specific sequences set out herein, for example by site-directed mutagenesis, random mutagenesis, or enzymatic cleavage and/or ligation of nucleic acids are well known in the art, as are methods for determining whether the thus-modified nucleic acid has significant homology to the subject sequence, for example by hybridisation.

Alternatively viewed, the present invention provides an isolated nucleic acid molecule which encodes or is complementary to a sequence which encodes a heat labile alkaline phosphatase, said alkaline phosphatase comprising the amino acid sequence of SEQ ID No. 2 or variants of said amino acid sequence which are at least 50%, preferably at least 60%, e.g. at least 70% or 80% identical thereto.

The invention also provides a cDNA, preferably derived from Pandalus borealis, encoding a heat labile alkaline phosphatase comprising the amino acid sequence of Figure 2 (SEQ ID NO. 2) or a variant or fragment thereof being heat labile and having alkaline phosphatase activity.

The invention also provides a cDNA, preferably derived from Pandalus borealis, encoding a heat labile alkaline phosphatase consisting of an amino acid sequence of which one or more amino acid residues are deleted from, substituted for, or added to the amino acid sequence of SEQ ID NO. 2.

A further aspect of the invention provides a polypeptide encoded by a nucleic acid molecule of the invention as defined herein. Provision of a nucleic acid molecule according to the invention thus enables recombinant alkaline phosphatase enzymes particularly SAP and variants thereof to be obtained in quantities and with a purity previously unavailable, thereby permitting a sustainable supply of the enzyme in a form that can be manufactured wholly under controlled conditions or according to certifiable conditions as specified at any time.

Accordingly, the present invention also extends to recombinant polypeptides comprising (or consisting essentially of) an amino acid sequence constituting a heat labile alkaline phosphatase enzyme encoded by the nucleotide sequence as shown in Figure 1 (SEQ ID No. 1), or a functionally-equivalent variant thereof.

Alternatively viewed, the invention also provides recombinant polypeptides comprising an amino acid sequence constituting a heat labile alkaline phosphatase enzyme, encoded by the nucleotide sequences as shown in Figure 1 (SEQ ID No. 1), or a functionally-equivalent variant thereof, substantially free from other Pandalus borealis components.

The term "polypeptide" as used herein includes both full length protein, and shorter peptide sequences.

"Functionally equivalent" as used above in relation to the polypeptide amino acid sequences defines polypeptides related to or derived from the above-mentioned polypeptide sequences where the amino acid sequence has been modified by single or multiple amino acid substitution, addition or deletion, and also sequences where the amino acids have been chemically modified, including by glycosylation or deglycosylation, but which nonetheless retain catalytic activity. Such functionally-equivalent variants may occur as natural biological variations or may be prepared using known techniques, for example functionally equivalent recombinant polypeptides may be prepared using the known techniques of site-directed mutagenesis, random mutagenesis, or enzymatic cleavage and/or ligation of amino acids.

The synthetic polypeptides according to this aspect of the invention may be prepared by expression in a host cell containing a recombinant DNA molecule which comprises a nucleotide sequence as broadly described above operatively linked to an expression control sequence, or a recombinant DNA cloning vehicle or vector containing such a recombinant DNA molecule, e.g. a plasmid, cosmid, bacteriophage molecule or yeast artificial chromosome. Alternatively the polypeptides may be expressed by direct injection of a naked DNA molecule according to the invention into a host cell.

The synthetic polypeptide so expressed may be a fusion polypeptide comprising a catalytically functional portion of an alkaline phosphatase enzyme and an additional polypeptide coded for by the DNA of the recombinant molecule fused thereto. For example, it may be desirable to produce a fusion protein comprising a synthetic alkaline phosphatase or other polypeptide according to the invention coupled to a protein such as β-galactosidase, phosphatase, glutathione-S-transferase, urease and the like. Most fusion proteins are formed by expression of a recombinant gene in which two coding sequences have been joined together with reading frames in phase. It may also be desirable to fuse synthetic peptides of the invention to proteins or peptides with a ligand binding function, for example antibodies, to produce, for example, enzyme-linked reporter antibodies, as well known in the art. All such fusion or hybrid derivatives of alkaline phosphatase-encoding nucleic acid molecules and their respective amino acid sequences are encompassed by the present invention. Such suitable recombinant DNA and polypeptide expression techniques are described for example in Sambrook et al., 1989.

Alternatively, the polypeptides of the invention may be produced by chemical means, such as the well-known Merrifield solid phase synthesis procedure.

More particularly, this aspect of the invention provides a recombinant polypeptide comprising (or consisting essentially of): (a) all or part of an amino acid sequence as shown in SEQ ID No. 2; or (b) all or part of an amino acid sequence which has at least 60% sequence identity with all or part of SEQ ID No. 2.

The term 'recombinant' distinguishes these molecules from naturally occurring polypeptides and signifies that the molecule has been generated using 'recombinant DNA technology', a phrase well understood in the art. The enzyme having the amino acid sequence of SEQ ID No. 2 is believed to be a truncated version of the naturally occurring enzyme in Pandalus borealis and therefore if the corresponding nucleic acid is used in recombinant production then the produced recombinant enzyme will in any event not be the same as the non-recombinant naturally produced molecule.

In particular the amino acid sequence may exhibit at least 65%, 70%, 75%, 80%, 85%, 90%, 95% or 98% identity with the polypeptide of SEQ ID No. 2.

Alternatively, the amino acid sequence may exhibit at least 70%, 75%, 80%, 85%, 90%, 95% or 98% similarity with the polypeptide of SEQ ID No. 2.

In particular, variations may occur at the N terminus. The polypeptides of the invention may also include a further 4-20, e.g. 5-10 amino acids at the N terminus. It should also be noted that the incomplete cDNA sequence of SEQ ID NO. 1 corresponds to the amino acid sequence of SEQ ID NO. 2 but begins only at residue 7, aspartic acid. In the recombinant polypeptide, amino acid variations, in particular substitutions, may occur within these first 6 amino acids as compared to the sequence given in Fig. 2. Furthermore, the N terminal lysine residue of Fig. 2 may be replaced by arginine.

A cDNA sequence corresponding to the amino acid sequence of SEQ ID No. 2 can be generated and this is made up of SEQ ID No. 1 together with the derived nucleotide sequence at the 5' end. This region is also, for the most part, provided by the forward primer of SEQ ID No. 7, itself derived from a sequenced fragment of native SAP. Due to the degeneracy of the genetic code, it is not possible from the amino acids KAYWNK to specify the sequence more precisely than AArGCnTAyTGGAAyAAr [SEQ ID No. 19] wherein r = A or G, n = T, C, A or G and y = C or T. The full cDNA sequence corresponding to SEQ ID No. 2 which comprises SEQ ID No. 19 and SEQ ID No. 1 is referred to herein as SEQ ID No. 20.

However, the sequencing work carried out as part of Example 4 has established the full cDNA sequence for Pandalus borealis alkaline phosphatase. Thus the uncertainties discussed above in respect of SEQ ID No. 19 have been resolved and the cDNA encoding KAYWNK is as follows: AAAGCATATTGGAACAAA [SEQ ID No. 21]. This sequence can replace SEQ ID No. 19 within SEQ ID No. 15 to produce, together with the part of SEQ ID No. 17 encoding the amino acid sequence NPITEED, the precise cDNA sequence encoding the recombinant enzyme. This full sequence is referred to as SEQ ID No. 22.
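By way of illustration only (this sketch is not part of the original disclosure), the degeneracy described above can be checked in a few lines of Python. The snippet expands the degenerate sequence of SEQ ID No. 19 into every concrete sequence it represents, confirms that each translates to KAYWNK, and confirms that the resolved sequence of SEQ ID No. 21 is one of them. The helper names `expand` and `translate` and the compact encoding of the standard genetic code are assumptions made for the example; only the two sequences are taken from the text.

```python
from itertools import product

# IUPAC codes used in SEQ ID No. 19 (r = A or G, y = C or T, n = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def expand(degenerate):
    """Return every concrete sequence encoded by a degenerate sequence."""
    pools = [IUPAC[base] for base in degenerate.upper()]
    return ["".join(bases) for bases in product(*pools)]


def translate(dna):
    """Translate complete codons from the first base; ignore a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


SEQ_ID_19 = "AArGCnTAyTGGAAyAAr"   # degenerate sequence from the text
SEQ_ID_21 = "AAAGCATATTGGAACAAA"   # resolved sequence from the text

variants = expand(SEQ_ID_19)
print(len(variants))                      # number of concrete sequences
print({translate(v) for v in variants})   # set of encoded peptides
print(SEQ_ID_21 in variants)              # is SEQ ID No. 21 one of them?
```

The five degenerate positions (2 x 4 x 2 x 2 x 2 choices) give 64 concrete sequences, all encoding the same hexapeptide.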

Amino acid sequence identity or similarity may be determined using the BestFit program of the Genetics Computer Group (GCG) Version 10 Software package from the University of Wisconsin. The program uses the local homology algorithm of Smith and Waterman with the default values: Gap creation penalty = 8, Gap extension penalty = 2, Average match = 2.912, Average mismatch = -2.003.

A "part" of the amino acid sequence of SEQ ID No. 2 or its variants as defined above may comprise at least 20 contiguous amino acids, preferably at least 30, 40, 50, 70, 100, 150, 200, 300, 400 or 450 contiguous amino acids. Parts having above 200, particularly above 300, amino acids can be considered 'significant parts' and such parts will typically have the enzymatic active site as identified herein and preferably the co-factor binding site.

The recombinant polypeptide is preferably functionally active according to the definitions given above, e.g. is enzymatically active. Alternatively, it may not itself be functionally active but may provide regions with functional properties of the whole, e.g. represent the active site or co-factor binding site required for enzymatic activity.

As discussed above, SEQ ID No. 1 and SEQ ID No. 2 do not correspond to the complete cDNA or amino acid sequence of the native SAP. The truncated polypeptides which exhibit the same or substantially the same enzymatic activity as native SAP and retain that enzyme's thermolability constitute a further aspect of the present invention. The complete cDNA molecules or recombinant polypeptides which incorporate the truncated sequences defined herein or functionally equivalent variants thereof comprise further preferred aspects of the present invention.

According to a further aspect of the present invention there is provided a polypeptide having the same or substantially the same catalytic activity and thermal properties as SAP but lacking part of its N-terminal region. Preferably, these polypeptides lack the N terminal amino acid and 2-50, preferably 5-30, e.g. 5-10, amino acids which are adjacent thereto in the native enzyme. By 'substantially the same catalytic activity' is meant that the polypeptide has at least 75%, preferably at least 85%, more preferably at least 90% of the activity of native SAP. Apart from these missing N-terminal parts, the polypeptides preferably have at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% or 98% identity with the polypeptide of SEQ ID No. 2.

According to the present invention, an alkaline phosphatase can be considered heat labile if its catalytic activity is undetectable after pre-incubation at 65°C for 15 minutes, preferably 10 minutes.

A suitable assay for the catalytic activity of enzymes of the invention follows that of Olsen et al. [R. L. Olsen, K. Øverbø and B. Myrnes (1991) Alkaline phosphatase from the hepatopancreas of shrimp (Pandalus borealis): a dimeric enzyme with catalytically active subunits, Comp. Biochem. Physiol., 99B: 755-761].

Alkaline phosphatase activity is determined by incubating enzyme with 6 mM p-nitrophenyl phosphate at 37°C in 0.1 M glycine/NaOH buffer pH 10.4 with 1 mM MgCl2 and 1 mM ZnCl2 in a total volume of 0.5 ml. After 15 min, the reaction is stopped by 0.5 ml of 2 M NaOH and the amount of p-nitrophenol released is determined by measuring the absorbance at 405 nm (ε405 = 18.5 mM⁻¹ cm⁻¹).

One unit of enzyme activity will produce 1 µmol of p-nitrophenol per min. SAP is not negatively affected by zinc, and there is no significant demand of magnesium above a minimal concentration.
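The arithmetic behind the unit definition can be sketched as follows. This is an illustrative calculation only, under stated assumptions that are not in the original text: a 1 cm optical path, a blank-corrected absorbance, and a final measured volume of 1.0 ml (0.5 ml reaction plus 0.5 ml NaOH stop solution). The function name `alp_units` and its parameter names are hypothetical.

```python
def alp_units(a405, epsilon_mM_cm=18.5, path_cm=1.0,
              final_volume_ml=1.0, minutes=15.0):
    """Enzyme units (µmol p-nitrophenol per min) in one stopped assay.

    Assumes a blank-corrected absorbance, a 1 cm light path and a final
    volume of 1.0 ml (0.5 ml reaction + 0.5 ml stop solution).
    """
    # Beer-Lambert law: concentration in mM, which equals µmol per ml.
    conc_mM = a405 / (epsilon_mM_cm * path_cm)
    # Total p-nitrophenol released in the measured volume, in µmol.
    micromol = conc_mM * final_volume_ml
    # One unit produces 1 µmol per minute.
    return micromol / minutes


# Example: an absorbance of 0.925 corresponds to 0.05 mM p-nitrophenol,
# i.e. 0.05 µmol released over 15 min.
print(alp_units(0.925))
```

Dividing the result by the volume of enzyme added to the assay would give an activity per ml of enzyme preparation.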

Also envisaged within the scope of the invention are variants of SEQ ID No. 1 or SEQ ID No. 2 which encode or comprise proteins which retain and exhibit the catalytic properties of SAP, but which are thermostable, i.e. do not suffer from complete loss of activity after heating at 70°C for 15 min.

Also provided are methods for preparing recombinant nucleic acid molecules according to the invention, comprising inserting the nucleic acid molecules containing the nucleotide sequences of the invention into another nucleic acid molecule, e.g. into vector nucleic acid, e.g. vector DNA.

Thus the invention further provides a recombinant vector, which is a vector DNA possessing a DNA fragment comprising any one of cDNAs of the invention.

In the recombinant vector, the vector DNA is preferably an expression vector, and is capable of propagation in a host cell such as Escherichia coli, Saccharomyces cerevisiae or Pichia pastoris.

Expression vectors of the invention may include appropriate control sequences such as for example translational (e.g. start and stop codons, ribosomal binding sites) and transcriptional control elements (e.g. promoter-operator regions, termination stop sequences) linked in matching reading frame with the nucleic acid molecules of the invention.

Vectors according to the invention may include plasmids and viruses (including both bacteriophage and eukaryotic viruses) according to techniques well known and documented in the art, and may be expressed in a variety of different expression systems, also well known and documented in the art.

A variety of techniques are known and may be used to introduce such vectors into prokaryotic or eukaryotic cells for expression, or into germ line or somatic cells to form transgenic animals. Suitable transformation or transfection techniques are well described in the literature.

A genetic construct comprising a nucleic acid molecule of the invention as defined herein operably linked to a promoter sequence constitutes a further aspect of the present invention.

The invention also includes transformed or transfected prokaryotic or eukaryotic host cells, or transgenic organisms containing a nucleic acid molecule according to the invention as defined above. For convenience, herein, no distinction is made between transformation and transfection. Such host cells may for example include prokaryotic cells such as E. coli, Streptomyces and other bacteria, eukaryotic cells such as yeasts (e.g. Pichia pastoris) or the baculovirus-insect cell system, transformed mammalian cells and transgenic animals and plants. Prokaryotic host cells are preferred.

Thus the invention also provides a transformant, which is a cell transformed with the recombinant vector of the invention.

In a further aspect, the present invention provides a recombinant cell or organism having a nucleic acid molecule comprising a nucleotide sequence as set forth in SEQ ID Nos. 1 or 20, or a functionally equivalent variant thereof as defined herein, and which expresses or is capable of expressing a thermolabile alkaline phosphatase. The cell or organism is 'recombinant' in that it contains genetic material (in this case encoding a thermolabile alkaline phosphatase) which is not native to that organism. Thus the shrimp Pandalus borealis is excluded, for while it expresses a thermolabile alkaline phosphatase, the gene therefor is found naturally in that species.

Alternatively viewed, the cell or organism expressing a thermolabile alkaline phosphatase could be considered "modified" in that such a cell or organism of that genotype and phenotype does not exist naturally.

Thus the cell or organism could also be termed a "non-naturally occurring" cell or organism due to the presence in it of non-native genetic material and its ability to produce a protein (a thermolabile alkaline phosphatase) not produced by naturally occurring members of that species or cell type.

The above described cells and organisms can be termed transformants in that they or their ancestors have been transformed by introduction of genetic material encoding a thermolabile alkaline phosphatase as described herein.

The invention also provides another transformant, which is Escherichia coli or Saccharomyces cerevisiae transformed with the recombinant vector. Suitable expression strains of bacteria, yeasts and other cultured cells are known in the art.

The invention also provides a process for producing a heat labile alkaline phosphatase, comprising culturing the transformants of the invention in a culture medium and isolating the alkaline phosphatase from the cultured transformant. More particularly, the invention provides a method for the manufacture of a heat labile alkaline phosphatase, comprising transforming a host cell with an expression vector that comprises a nucleotide sequence as set forth in SEQ ID Nos. 1 or 20 or having at least 55% sequence identity therewith and/or capable of hybridising under medium stringency conditions to the complementary sequence of SEQ ID Nos. 1 or 20, or the complementary sequence of said sequences, maintaining the cell in a culture medium and isolating heat labile alkaline phosphatase expressed by said cell. Methods of cell culturing are well known in the art and will typically allow for cell multiplication and simultaneous or subsequent expression of the ALP gene. The gene product may accumulate intracellularly or extracellularly. Methods for harvesting and purifying proteins generated in this way are well known in the art.

Further work has been done and the precise sequence information refined. The changes in the revised cDNA and protein sequences are minor. The revised cDNA sequence is shown in Fig. 4 and called SEQ ID No. 15.

As well as incorporating an additional 5' sequence (the equivalent amino acid sequence of which was always present in SEQ ID No. 2), it lacks 9 nucleotides as shown in the comparison of sequences in Figure 7. When compared, the sequences are 99.4% identical.

On the protein level, there is a slightly bigger change as the corrections created a change in the reading frame in a small part of the protein. The revised amino acid sequence is shown in Figure 5 and is called SEQ ID No. 16. A comparison with the original amino acid sequence is shown in Figure 8. When compared, the sequences are around 96% identical.

Thus, all references hereinbefore to SEQ ID Nos. 1 and 2 may be replaced with SEQ ID Nos. 15 and 16 respectively. These sequences constitute preferred embodiments of the invention already described.

In addition, a sequence for the full cDNA of SAP has recently been elucidated and thus the whole molecule corresponds to SEQ ID No. 17 followed by SEQ ID No. 15 and the full amino acid sequence of SAP (plus signal sequence) corresponds to SEQ ID No. 18 followed by SEQ ID No. 2 or 16. It is believed that of the 47 amino acids of SEQ ID No. 18, the last 7 amino acids (NPITEED) or some of them, e.g. PITEED or TEED, are found at the N terminus of the mature SAP protein. The remaining amino acids are thought to be a signal polypeptide. Molecules having these sequences constitute further aspects of the present invention. In a preferred embodiment of the present invention the recombinant alkaline phosphatases have the amino acid sequence NPITEED at their N terminus.

The invention will now be further described in the following non-limiting examples in which:

Figure 1 shows the cDNA sequence encoding SAP from Pandalus borealis (minus a small portion at the 5' end). The stop codon and the cDNA downstream sequence including the polyA tail are underlined;

Figure 2 shows the amino acid sequence encoded by the SAP cDNA shown in Figure 1 (and incorporates an additional N terminal region). Homologies to sequenced protein fragments are underlined and specified. The sequence of the enzyme active site (according to known alkaline phosphatases) is shown in bold type; and

Figure 3 shows the SAP sequence aligned to its top scoring homologs of tissue-nonspecific ALPs found in the Swiss-Prot database; mouse (Accession number P09242), human (A. n. P05186), chicken (A. n. Q92058) and an ALP of the silk moth Bombyx mori (A. n. P29523).

Figure 4 shows the revised cDNA encoding SAP from Pandalus borealis. The polyA-tail is indicated (A)23. Five nucleotide positions in the 5'-sequence (lower case) are biased by the degenerated PCR primer used for amplification and sequencing, and the positions are thus not specified.

Figure 5 shows the amino acid sequence encoded by the SAP cDNA shown in Figure 4. Homologies to sequenced protein fragments are underlined and specified. The sequence of the proposed enzyme active site is shown in bold. The N-terminal lysine residue (lower case) could alternatively be arginine because of the biasing of the degenerated PCR primer as discussed in respect of Figure 4 above.

Figure 6 corresponds to Figure 3 above but incorporates the revised amino acid sequence of Figure 5.

Figure 7 shows an alignment of the revised cDNA sequence of Figure 4 (top) with the original sequence of Figure 1 (additionally comprising the 5' end originally only presented in the equivalent amino acid sequence of Figure 2). The comparison was performed using the Needle program, which uses the Needleman-Wunsch global alignment algorithm to find the optimum alignment (including gaps) of two sequences [Needleman et al., J. Mol. Biol. (1970) 48, pp 443-453], with matrix Blosum 62 and gap penalties of Gap open = 10.0 and Gap extend = 0.5. The overall %-identity = 99.4%.

Figure 8 shows an alignment of the revised amino acid sequence of Figure 5 (top) with the original amino acid sequence of Figure 2. The comparison was performed using the above-described Needle program, matrix Blosum 62 with gap penalties of Gap open = 10.0 and Gap extend = 0.5. The overall %-identity = 96.0%.

Figure 9 gives the cDNA [SEQ ID No. 17] and amino acid [SEQ ID No. 18] sequence of the proposed signal sequence and N-terminal region of alkaline phosphatase from Pandalus borealis. The SAP amino acid sequence continues then with KAYWNK... as described in Figures 2 and 4.

Figure 10 is a photo from an SDS-PAGE of total cell protein extracts from the expression cultures of Example 4, in which: Lanes 1 and 10: Broad Range protein standard; Lanes 2 and 9: SAP 1.5 µg; Lane 3: negative control culture, non-induced; Lane 4: negative control culture, induced; Lane 5: SAP-culture 1, non-induced; Lane 6: SAP-culture 1, induced; Lane 7: SAP-culture 2, non-induced; Lane 8: SAP-culture 2, induced. The proteins in the Broad Range protein standard are: myosin (200 kD), β-galactosidase (116 kD), phosphorylase b (97.4 kD), serum albumin (66 kD), ovalbumin (45 kD), carbonic anhydrase (31 kD), trypsin inhibitor (21.5 kD), lysozyme (14.4 kD) and aprotinin (6.5 kD).

Figure 11 is a photo of an immunoblot of total cell protein extracts from the expression cultures of Example 4, in which: Lanes 1 and 8: SAP 50 ng; Lane 2: negative control culture, non-induced; Lane 3: negative control culture, induced; Lane 4: SAP-culture 1, non-induced; Lane 5: SAP-culture 1, induced; Lane 6: SAP-culture 2, non-induced; Lane 7: SAP-culture 2, induced.
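The percent identity values quoted for the alignments of Figures 7 and 8 come from a Needleman-Wunsch global alignment. As a purely illustrative sketch of that technique, the following Python code aligns two short sequences and reports identity over the full alignment length. It is deliberately simplified and is not the Needle program: it assumes a flat match/mismatch score instead of a substitution matrix, and a single linear gap penalty instead of separate gap open and gap extend penalties. The function names and scoring values are assumptions chosen for the example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty (simplified sketch)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(top, bottom):
    """Identical aligned positions as a percentage of alignment length."""
    same = sum(1 for x, y in zip(top, bottom) if x == y and x != "-")
    return 100.0 * same / len(top)


# Example with two peptide strings that appear in the text.
top, bottom = needleman_wunsch("KAYWNK", "AYWNK")
print(top)
print(bottom)
print(round(percent_identity(top, bottom), 1))
```

With these inputs the shorter peptide is aligned against the last five residues of the longer one, leaving a single leading gap.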

EXAMPLES

Example 1 - Sequencing the SAP protein and isolating the SAP gene

Protein isolation, fragmentation and sequence analysis

Shrimp alkaline phosphatase (SAP) was purified to apparent homogeneity as described [R. Olsen et al., 1991]. A single protein band of approximately 55 kDa was detected by Coomassie as well as silver staining after electrophoresis in a 10% NuPAGE Bis-Tris gel system (Novex) using a 50 mM 2-(N-morpholino)ethanesulphonic acid running buffer containing 0.1% SDS.

1 mg protein was lyophilised and submitted to a commercial sequence analysis service (Innovagen, Sweden) where the protein was fragmented by trypsin and the fragments were separated by reverse-phase HPLC. More than 30 fragments were produced and 4 fragments were selected for further analysis. Mass spectrometry mediated sequence analysis of the selected fragments (H23:18, H23:30, 5ReRP6:26, 5ReRP6:17) produced 12, 10, 8 and 5 amino acids in sequence, respectively.

Synthesis of oligonucleotides derived from protein fragment sequences

Based on the amino acid sequences, standard codons were predicted and degenerated deoxyribo-oligonucleotides of forward and reverse complementary sequences were subsequently custom made (Eurogentec, Belgium).

Synthesis of shrimp cDNA

Freshly collected P. borealis shrimps were dissected and individual hepatopancreas were stored on liquid nitrogen until use. mRNA was isolated from a single hepatopancreas by the use of PolyATract System 1000 (Promega). Isolated mRNA was used for first strand cDNA synthesis before second strand synthesis and cDNA amplification was performed following instructions given in the Smart PCR cDNA synthesis kit (Clontech) and the Advantage cDNA PCR kit (Clontech).

PCR amplification and sequence analysis of a SAP-gene internal fragment

A small aliquot of the synthesised cDNA was used as template in PCRs primed by pair-wise combinations (forward and reverse directions) of the protein-derived oligonucleotides. The amplification reactions, having standard mixture compositions, were run in an Eppendorf gradient thermocycler. PCR products were detected after agarose gel electrophoresis. The primer combination of oligonucleotides 17F and 26R gave a distinct amplification product of approximately 600 bp which was sequenced using the Thermo Sequenase radiolabelled terminator cycle sequencing kit (Amersham) by exploiting the same PCR primers for sequencing.

Rapid amplification of cDNA ends (RACE)

Based on the DNA sequence found in the PCR product, new specific forward and reverse primers for 5' and 3' RACE reactions were synthesised as described for use with the Marathon cDNA amplification kit (Clontech) in amplification from the cDNA template.

By the use of the SAP gene-specific forward primer MalpF in combination with the kit-contained AP1 primer, having sequence identity to the ligated adaptor, a 1.4 kb 3'-RACE fragment was produced. Similarly, the reverse gene-specific primer MalpR gave rise to a 0.5 kb 5'-RACE product. DNA sequencing analysis of the 3'-RACE product confirmed that the RACE product overlapped the 600 bp PCR product. The 3'-RACE product was subcloned into the pCR-Script vector (Stratagene) to produce the pMalpF plasmid for further sequencing of the cDNA gene downstream of the sequence region contained in the 600 bp PCR product.

The 600 bp PCR product and the pMalpF insert were sequenced several times in both directions using new gene specific primers.

High-fidelity PCR amplification of the entire cDNA gene

A final PCR reaction was performed using primers SapF and SapR and the Pfu polymerase (Stratagene) and cDNA as template for a 1.5 kb SAP cDNA gene amplification product. The 1.5 kb PCR product was analysed for confirmation of the obtained sequence. The cDNA gene was found to contain an open reading frame encoding a 478 amino acid polypeptide and a 3' sequence that includes a putative signal for polyadenylation of the transcript (see Figures 1 and 2).

Thermal cycling profiles used in PCR

The following cycling profiles were used for amplification reactions:

- total cDNA amplification
  1x: 95°C, 2 min
  16x: 95°C, 5 sec; 65°C, 5 sec; 68°C, 6 min
- internal 600 bp fragment PCR
  1x: 94°C, 2 min
  36x: 94°C, 10 sec; 51°C, 10 sec; 72°C, 1 min
  1x: 72°C, 5 min
- 5' RACE
  1x: 94°C, 30 sec
  5x: 94°C, 5 sec; 72°C, 4 min
  1x: 94°C, 30 sec
  5x: 94°C, 5 sec; 70°C, 4 min
  1x: 94°C, 30 sec
  23x: 94°C, 5 sec
- 3' RACE
  1x: 94°C, 30 sec
  36x: 94°C, 5 sec; 68°C, 4 min
- high-fidelity PCR
  1x: 94°C, 3 min
  40x: 94°C, 10 sec; 55°C, 15 sec; 72°C, 3 min
  1x: 72°C, 5 min

RESULTS

The SAP protein subunit was originally estimated to have a molecular mass of 65 kDa. The USB product sheet describes a 59 kDa SAP protein, and in the Inventors' SDS-PAGE system, SAP migrates in a similar fashion to a 55 kDa protein of the molecular weight standard. Discrepancies are probably explained by variations in buffer systems and gel qualities used.

The isolated cDNA encodes a polypeptide having sequence identities to the analysed SAP protein fragments (see Figure 2).

The SAP protein sequence was used as the query sequence in homology searches of publicly available databases. The software programs BLAST and ClustalW were used for homology search and multi-sequence alignments, respectively. The SAP sequence was aligned to its top scoring homologs of tissue-nonspecific ALPs found in mouse (Accession number P09242), human (A. n. P05186), chicken (A. n. Q92058) and an ALP of the silk moth Bombyx mori (A. n. P29523). The alignment is shown in Figure 3. No sequence equivalent of the cDNA was found by homology searches in publicly available databases. The derived protein sequence, however, scores relatively high homology to a number of known ALPs, having 44% identity and 60% similarity in amino acid sequences. Protein homologies were searched in the SWALL database (non-redundant protein sequence database; Swissprot+Trembl+TremblNew) running the Fasta3 program with default parameters (blosum50 matrix, gap open penalty = -12, gap extension penalty = -2, and KTUP = 2) using the internet server of the European Bioinformatics Institute [W. R. Pearson and D. J. Lipman (1988), "Improved tools for biological sequence comparison", PNAS 85: 2444-2448; W. R. Pearson (1990), "Rapid and sensitive sequence comparison with FASTP and FASTA", Methods in Enzymology 183: 63-98]. Also, the cDNA-encoded protein contains the motif VTDSAASAT, which corresponds to the ALP active site pattern defined by Prosite PS00123. Thus, the isolated cDNA of an mRNA transcript from shrimp hepatopancreas represents the gene for P. borealis SAP.

There is one amino acid discrepancy between the sequenced protein fragment 5ReRP6:26 and the corresponding cDNA-derived sequence. This could be explained by allelic variations in the SAP gene or by an error in the protein sequence analysis.

Sequences Used in Example 1

Protein fragments:
H23:18 DINFRYASAAP(V)K [SEQ ID No. 3]
H23:30 HLITDWLDDK [SEQ ID No. 4]
5ReRP6:26 VIMGGERR [SEQ ID No. 5]
5ReRP6:17 AYWNK [SEQ ID No. 6]

Oligonucleotides:
17F: GCITAYTGGAAYAAR [SEQ ID No. 7]
26R: CKYTCICCICCCATDATIAC [SEQ ID No. 8]
where D = A + G + T, I = Inosine, K = G + T, R = A + G, Y = C + T
MalpF: ATTTCGTGGGAAGAATTCGACTTTGC [SEQ ID No. 9]
MalpR: GATCTGCCAGCCTCCTGGAACCA [SEQ ID No. 10]
SapF: AARGCNTAYTGGAAYAARGAT [SEQ ID No. 11]
SapR: GAAGGTAATCATCTACATCTCA [SEQ ID No. 12]

Example 1 has been repeated and the sequence information refined. As reflected in Figures 4 and 5, the cDNA gene was found to contain an open reading frame encoding a 475 amino acid protein. The protein encoded by the cDNA of Fig. 4 has a theoretical molecular mass of 53 kDa as compared to an estimate of the mass of the purified native SAP based on SDS-PAGE of 54-55 kDa. From the multi-sequence alignments, it is believed that the SAP contains 5-10 additional amino acids at its N-terminus, bringing the molecular mass close to 54 kDa. As discussed earlier, this N-terminal region is thought to comprise some or all of the heptapeptide NPITEED; see also Example 4.

Example 2 - Amino acid composition of cDNA-derived protein sequence compared to native SAP

Freeze-dried SAP protein was dissolved in 6 M HCl and hydrolysed at 110°C for 24 h. After evaporation of HCl the sample was resuspended in 0.2 M sodium citrate buffer, pH 2.2, and subjected to HPLC chromatography for identification and percentage molar determination of amino acids in the hydrolysed product.

Note: This system does not differentiate between aspartate and asparagine or between glutamate and glutamine. Thus, the figures for these amino acids in the native SAP protein are combined. Also, the numbers of cysteine residues are supposedly underestimated since the protein was not oxidised.

Amino acid               % in cDNA-derived   Native SAP
Alanine                  8.3                 9.6
Aspartate + Asparagine   (13.7)              13.6
Aspartate                10.0                -
Asparagine               3.7                 -
Glutamate + Glutamine    (9.0)               11.7
Glutamate                6.7                 -
Glutamine                2.3                 -
Cysteine                 1.0                 0.2
Phenylalanine            4.1                 4.1
Glycine                  8.4                 8.4
Histidine                3.7                 2.8
Isoleucine               5.2                 5.4
Lysine                   4.6                 4.2
Leucine                  7.3                 8.0
Methionine               1.6                 1.9
Proline                  2.7                 3.1
Arginine                 5.2                 5.2
Serine                   4.6                 4.6
Threonine                8.7                 8.5
Valine                   5.4                 4.9
Tryptophan               1.6                 -
Tyrosine                 4.1                 3.6

Conclusion: For the majority of amino acids the molar ratios of each amino acid in the cDNA-derived protein sequence are close to identical to the ratios found in the purified native protein. The apparently small disagreements are discussed above.

Thus, compared to the SAP protein the isolated cDNA encodes a protein of very similar or identical amino acid composition.
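The cDNA-derived side of such a comparison is simple to compute. The sketch below is illustrative only: it derives molar percentages from a protein sequence and then pools aspartate with asparagine and glutamate with glutamine, mirroring the limitation of the hydrolysis method noted above. The function names are hypothetical, and the short input is the sequenced fragment H23:30 from Example 1, used here only as a small worked input rather than the full SAP sequence.

```python
from collections import Counter


def molar_percent(sequence):
    """Percentage of each residue (one-letter code) in a protein sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {aa: round(100.0 * n / total, 1) for aa, n in counts.items()}


def pool_acid_amide_pairs(percent):
    """Combine D+N and E+Q, which acid hydrolysis cannot tell apart."""
    pooled = dict(percent)
    pooled["D+N"] = round(pooled.pop("D", 0.0) + pooled.pop("N", 0.0), 1)
    pooled["E+Q"] = round(pooled.pop("E", 0.0) + pooled.pop("Q", 0.0), 1)
    return pooled


fragment = "HLITDWLDDK"  # fragment H23:30 [SEQ ID No. 4]
composition = molar_percent(fragment)
print(composition)
print(pool_acid_amide_pairs(composition))
```

Applied to the full cDNA-derived sequence, the pooled values would correspond to the bracketed entries in the table, which can then be set against the figures measured for the native protein.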

This Example and analysis has been repeated for the revised cDNA sequence and the results are presented below:

Amino acid composition of cDNA-derived protein sequence compared to native SAP.

Amino acid               % in cDNA-derived   Native SAP
Alanine                  8.8                 9.6
Aspartate + Asparagine   -                   13.6
Aspartate                10.1                -
Asparagine               3.4                 -
Glutamate + Glutamine    -                   11.7
Glutamate                7.1                 -
Glutamine                2.3                 -
Cysteine                 0.8                 0.2
Phenylalanine            4.0                 4.1
Glycine                  8.0                 8.4
Histidine                3.3                 2.8
Isoleucine               5.4                 5.4
Lysine                   4.6                 4.2
Leucine                  7.6                 8.0
Methionine               2.1                 1.9
Proline                  2.7                 3.1
Arginine                 4.8                 5.2
Serine                   4.6                 4.6
Threonine                9.0                 8.5
Valine                   5.4                 4.9
Tryptophan               1.5                 -
Tyrosine                 4.0                 3.6

Example 3 - Recombinant expression of SAP in Pichia pastoris

The cDNA encoding SAP is used to express the enzyme heterologously in a suitable micro-organism for production of recombinant enzyme by fermentation.

Several expression systems are available, and eukaryotic expression systems are preferred since these may express the enzyme at reasonable levels in its native state.

Such a system is the methylotrophic yeast Pichia pastoris (Invitrogen) which has been reported to express recombinant proteins at levels up to 12 g/L by intracellular expression [Clare, J. J., Rayment, F. B., Ballantine, S. P., Sreekrishna, K. and Romanos, M. A. (1991): High-level expression of tetanus toxin fragment C in Pichia pastoris strains containing multiple tandem integrations of the gene. Bio/Technology 9, 455-460] and up to 2.6 g/L secreted into the medium [Paifer, E., Margolles, E., Cremata, J., Montesino, R., Herrera, L. and Delgado, J. M. (1994): Efficient expression and secretion of recombinant alpha amylase in Pichia pastoris using two different signal sequences. Yeast 10, 1416-1419].

The inserted gene is regulated by the AOX1 (alcohol oxidase) promoter so expression may be induced by methanol.

The SAP cDNA is inserted into the commercially available vector pPIC9K. Recombinant SAP may then be expressed as a secreted product by fusion to the signal peptide sequence derived from S. cerevisiae α-factor contained in the vector. When the construct is integrated into a his- strain (KM71 or GS115), recombinants are selected as his+ clones in a histidine-free medium, since the his- genotype is complemented by the vector insert. Furthermore, recombinants are screened for multiple gene inserts by resistance to increased concentrations of Geneticin (G418, Invitrogen) conferred by the plasmid-borne kanamycin resistance gene.

In detail, the PCR-generated SAP cDNA is used as a template in a new PCR reaction in order to generate a new PCR product containing proper restriction sites for integration into the pPIC9K vector in frame with the α-factor signal sequence. As the forward primer for the PCR reaction, a modified "17F" primer [SEQ. ID. No. 13] is used, replacing ATG codon (Met) with a lysine codon (AAG) in order to reconstruct a putative lysine residue cleaved off by trypsin. The reverse primer [SEQ. ID. No. 14] is constructed in order to attach a NotI restriction site 3' to the stop codon of the SAP cDNA sequence.

The resulting PCR product is cut with NotI and ligated into a pPIC9K vector previously cut with SnaBI and NotI (resulting in plasmid pPIC9K-SAP). Since the vector is blunt-ended by SnaBI at the 5' joining site, and both termini are in reading frame, no further modification of the 5'-terminus of the modified cDNA is necessary.

The vector is then propagated by transformation of a competent E. coli strain like TOP10F' (Invitrogen) or equivalent, and recombinants are selected by their resistance to ampicillin. Isolated pPIC9K-SAP plasmid may then be linearized by restriction cutting with SacI in order to create a vector cassette to be inserted into the AOX1 gene of Pichia pastoris. After transformation of Pichia pastoris cells (strains GS115 or KM71) with linearized pPIC9K-SAP construct, recombinants are selected from his+ mutants according to the manufacturer's protocol.

Isolated colonies are tested for expression of recombinant SAP into the medium by growing cells in the presence of methanol according to the method recommended by the manufacturer. If desired, further increases in expression levels are achieved by screening recombinant strains for clones containing multiple copies of the recombinant gene. This is done by cultivating his+ mutants in the presence of increasing levels of the antibiotic G418 (Invitrogen). Clones growing at high G418 levels are likely to contain multiple copies of the gene.

All methods of cultivation, transformation and selection are described in detail in protocols available from the supplier of the Pichia pastoris system (Invitrogen, The Netherlands).

The modified SAP construct yields a recombinant product where 5 amino acids (EAEAY) are added from the remaining signal recognition sequence of the signal peptidase KEX2 of Pichia pastoris. This N-terminal sequence results in a final product that is slightly truncated, but in reasonable consensus with other alkaline phosphatases, still containing a recognition sequence for Pichia pastoris signal peptidase.

Primers Used in Example 3

Forward: AAGGCITAYTGGAAYAAR [SEQ. ID. No. 13]
where I = Inosine, R = A + G, Y = C + T
Reverse: TACAGCGGCCGCCATCTCATTTTTCG [SEQ. ID. No. 14]

Example 4 - Expression of SAP in Escherichia coli TOP10

Determination of the SAP-signal sequence

The N-terminal signal sequence of SAP was determined by 5'-RACE using cDNA as a template. mRNA was isolated from 100 mg of shrimp (Pandalus borealis) hepatopancreas using the Oligotex Direct mRNA Mini Kit (Qiagen) as described by the manufacturer. Shrimp hepatopancreas cDNA was made using the SMART PCR cDNA synthesis Kit (Clontech) following the protocol described by the manufacturer.

The cDNA was blunt ended and purified by adding 40 µg proteinase K to 50 µl of the cDNA and incubating at 45°C for 45 min and 90°C for 8 min, and then adding 15 U T4-DNA polymerase (Promega) and incubating at 16°C for 30 min and at 72°C for 10 min. Finally the cDNA was extracted two times in phenol/chloroform and ethanol precipitated. RACE-adaptors were ligated to the cDNA as described in the Marathon cDNA amplification kit (Clontech).

The 5'-RACE reaction was done in a final volume of 50 µl using the Advantage cDNA polymerase mix (Clontech), an AP1-primer (supplied with the Marathon cDNA amplification kit), a SAP-specific primer (5'-GCG TGG TGC ATA TGG TCA ATC CGT CC-3') and 5 µl of 50 times diluted adaptor-ligated cDNA as template. The RACE PCR-reaction was done in a Biometra TGradient thermocycler with an initial denaturation step of 94°C for 30 seconds, followed by 5 cycles of 94°C for 5 sec and 72°C for 3 min, 5 cycles of 94°C for 5 sec and 70°C for 3 min, and 30 cycles of 94°C for 5 sec and 68°C for 3 min. The 1200 bp SAP-fragment generated was purified from the agarose gel, re-amplified using PCR and sequenced.

Cloning of SAP into pBAD/gIII A expression vector

To clone the SAP gene into the pBAD/gIII A vector, the SAP gene was PCR-amplified with two primers containing a SacI and a HindIII site, respectively (underlined). Based on the full-length SAP sequence obtained (signal sequence in addition to the main sequence), the primers were designed to amplify the SAP gene with the amino acids NPITEEDKAYWNK... as N-terminus. In addition, three bases were changed in primer 1 (from C to A, A to C, and A to C) in order to codon-optimize three of the N-terminal codons (denoted with small letters in the primer sequence).

Primer 1: 5'-ATG GAG CTC AAC CCa ATc ACc GAA GAA GAC AA-3'
Primer 2: 5'-ATG AAG CTT TCA TTT TTC GTC ACA GAA AGT G-3'

The PCR was carried out in a final volume of 50 µl containing 10 µM of each of the two primers, 1.5 U Pfu polymerase (Promega) with the buffer supplied, 0.2 mM dNTP and 10 ng shrimp hepatopancreas cDNA as template. PCR was done with an initial denaturation step of 94°C for 30 sec followed by 30 cycles of 94°C for 15 sec, 55°C for 1 min and 72°C for 3 min. The PCR product was purified with the Qiaquick PCR purification kit (Qiagen).
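As a consistency check on the two cloning primers, the following Python sketch locates the restriction sites and translates the coding portion of each primer with the standard genetic code. It is a minimal illustration, not part of the described method: the helper names are hypothetical, and the reading frame is an assumption, namely that the first SAP codon directly follows the SacI site in primer 1 and that the stop codon in the reverse complement of primer 2 directly precedes the HindIII site.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PRIMER_1 = "ATGGAGCTCAACCCaATcACcGAAGAAGACAA"   # forward primer, from the text
PRIMER_2 = "ATGAAGCTTTCATTTTTCGTCACAGAAAGTG"    # reverse primer, from the text
SACI, HINDIII = "GAGCTC", "AAGCTT"              # recognition sequences


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from position 0; a trailing partial codon is dropped."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Primer 1: assumed frame starts immediately after the SacI site.
p1 = PRIMER_1.upper()
n_term = translate(p1[p1.index(SACI) + len(SACI):])

# Primer 2: take the coding strand, keep what lies upstream of the HindIII site,
# and align the frame to the end of that stretch (where the stop codon sits).
coding = reverse_complement(PRIMER_2)
upstream = coding[:coding.index(HINDIII)]
c_term = translate(upstream[len(upstream) % 3:])

print("N-terminal residues encoded by primer 1:", n_term)
print("C-terminal residues encoded by primer 2:", c_term)
```

Under the stated frame assumption, primer 1 encodes NPITEED, in agreement with the intended N-terminus NPITEEDKAYWNK..., and the reverse complement of primer 2 ends in a TGA stop codon ahead of the HindIII site.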

Both the pBAD/gIII A vector and the SAP PCR product were each treated with 5 U of the restriction enzymes SacI and HindIII. The SAP PCR product was ligated into the vector in a 10 µl reaction using 1 U T4 DNA ligase (USB) and approximately 1 µg of both plasmid and PCR product.

The reaction mix was incubated at 16°C for 16 h. The ligation mix was then used to transform E. coli TOP10 cells: it was added to 40 µl electrocompetent E. coli TOP10 cells, which were electroporated at 1800 V. Immediately after electroporation, 1 ml SOC medium was added and the cells were incubated at 37°C for 1 h. The cells were then plated on LB plates containing 100 µg/ml ampicillin for selection of transformed cells.

Expression of SAP in E. coli TOP10

Two E. coli TOP10 colonies containing the recombinant pBAD/gIII A vector with the SAP gene were selected for expression. An E. coli TOP10 strain containing an empty pBAD/gIII vector was used as negative control.

Three Erlenmeyer flasks containing 50 ml LB medium were inoculated with 2.5 ml overnight cultures in LB medium supplemented with 100 µg/ml ampicillin. The cells were grown at 30°C with shaking until they reached a density of OD600 0.5. Each of the three cultures was then split into two (2 x 25 ml), and one of each pair was induced by adding L-arabinose to a final concentration of 0.05 % (w/v). All cell cultures were grown further at 30°C with shaking for 3 h before the cells were harvested.
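The induction step above specifies only the final L-arabinose concentration. As a worked example of the dilution arithmetic, the following Python sketch computes the volume of stock solution needed for one 25 ml culture. The 20 % (w/v) stock concentration is a hypothetical value chosen for illustration; the text does not state which stock was used.

```python
def stock_volume_ml(culture_ml, final_pct, stock_pct):
    """Volume of stock (ml) to add so that the mixture reaches final_pct (w/v).

    Solves stock_pct * v = final_pct * (culture_ml + v) for v, i.e. the volume
    added by the stock itself is taken into account.
    """
    if stock_pct <= final_pct:
        raise ValueError("stock must be more concentrated than the target")
    return culture_ml * final_pct / (stock_pct - final_pct)


CULTURE_ML = 25.0          # culture volume after splitting (from the text)
FINAL_PCT = 0.05           # final L-arabinose concentration, % w/v (from the text)
ASSUMED_STOCK_PCT = 20.0   # hypothetical stock concentration, % w/v (assumption)

volume_ml = stock_volume_ml(CULTURE_ML, FINAL_PCT, ASSUMED_STOCK_PCT)
# % (w/v) means grams per 100 ml, so 1 % corresponds to 10 mg/ml.
arabinose_mg = ASSUMED_STOCK_PCT * 10 * volume_ml

print(f"add {volume_ml * 1000:.1f} µl of stock ({arabinose_mg:.2f} mg L-arabinose)")
```

With a different stock concentration only ASSUMED_STOCK_PCT needs to be changed; for a 25 ml culture the required mass of L-arabinose remains close to 12.5 mg.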

Analysing the expression cultures

From each of the expression cultures, a 200 µl sample was centrifuged and the supernatant was removed. To each cell pellet, 100 µl 1X SDS-PAGE sample buffer was added, and the sample was incubated at 100°C for 3 min and centrifuged at 18,000 g for 3 min. The cell lysates were then applied directly to an SDS-PAGE gel and analysed. An additional SDS-PAGE gel was electroblotted to transfer the proteins to a nitrocellulose blotting membrane (0.45 µm, Bio-Rad) and immunostained using an antibody raised against SAP purified from the shrimp processing water and Goat Anti-Rabbit IgG (H + L) (Human IgG Adsorbed) Horseradish Peroxidase Conjugate (Bio-Rad). Standard protocols for SDS-PAGE, electroblotting and immunostaining were followed.

As shown in figures 10 and 11, recombinant SAP is expressed using the pBAD/gIII A vector and E. coli TOP10 cells as a host. Both recombinant E. coli strains containing recombinant pBAD/gIII A-SAP vectors produce recombinant SAP, whereas the negative control did not give any signal on the western blot (Figure 11).
