

Title:
A MICROBIOLOGICAL DETECTION METHOD
Document Type and Number:
WIPO Patent Application WO/2010/109173
Kind Code:
A1
Abstract:
The present invention relates to a method for detecting the presence of a hydrocarbon deposit in a geographical location. The method comprises the steps of determining the concentration of a first and a second polynucleotide present at the location, wherein both polynucleotides encode proteins capable of metabolising hydrocarbons. The first polynucleotide encodes a different protein from the second polynucleotide. A mathematical operation is calculated using the concentration of the first and second polynucleotides, and it is indicative of the presence or absence of a hydrocarbon deposit at the location. Optionally the concentration of a third polynucleotide, which encodes a generic gene, is calculated and is used to normalise the data.

Inventors:
HATTON RICHARD (GB)
SLEAT ROBERT (GB)
Application Number:
PCT/GB2010/000531
Publication Date:
September 30, 2010
Filing Date:
March 23, 2010
Assignee:
ENVIROGENE LTD (GB)
HATTON RICHARD (GB)
SLEAT ROBERT (GB)
International Classes:
C12Q1/68
Domestic Patent References:
WO2009013516A1    2009-01-29
WO2005103284A2    2005-11-03
WO1991002086A1    1991-02-21
WO2003012390A2    2003-02-13
Other References:
SHANE M POWELL ET AL: "Using Real-Time PCR to Assess Changes in the Hydrocarbon-Degrading Microbial Community in Antarctic Soil During Bioremediation", MICROBIAL ECOLOGY, SPRINGER-VERLAG, NE LNKD- DOI:10.1007/S00248-006-9131-Z, vol. 52, no. 3, 31 August 2006 (2006-08-31), pages 523 - 532, XP019458675, ISSN: 1432-184X
ROY I ET AL: "Microbes as an indicator of underground petroleum deposits", FUEL, IPC SCIENCE AND TECHNOLOGY PRESS, GUILDFORD, GB LNKD- DOI:10.1016/0016-2361(89)90093-8, vol. 68, no. 3, 1 March 1989 (1989-03-01), pages 311 - 314, XP025456539, ISSN: 0016-2361, [retrieved on 19890301]
ALTSCHUL, STEPHEN F.; THOMAS L. MADDEN; ALEJANDRO A. SCHAFFER; JINGHUI ZHANG; ZHENG ZHANG; WEBB MILLER; DAVID J. LIPMAN: "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", NUCLEIC ACIDS RES., vol. 25, 1997, pages 3389 - 3402, XP002905950, DOI: 10.1093/nar/25.17.3389
OIL; GAS, SCIENCE AND TECHNOLOGY - REV. IFP, vol. 58, 2003, pages 427 - 440
MIGUEZ ET AL., MICROBIOL ECOLOGY, vol. 33, 1997, pages 21 - 31
APPLIED & ENVIRONMENTAL MICROBIOLOGY, vol. 65, 1999, pages 80 - 87
APPLIED & ENVIRONMENTAL MICROBIOLOGY, vol. 56, 1990, pages 254 - 259
APPLIED & ENVIRONMENTAL MICROBIOLOGY, vol. 67, 2001, pages 1542 - 1550
APPLIED & ENVIRONMENTAL MICROBIOLOGY, vol. 66, 2000, pages 806837 - 8678
APPLIED & ENVIRONMENTAL MICROBIOLOGY, vol. 69, 2003, pages 3350 - 3358
SUZUKI ET AL., AEM, vol. 66, no. 11, 2000, pages 4605 - 4614
DING C ET AL.: "Quantitative Analysis of Nucleic Acids - The Last Few Years of Progress", J. BIOCHEM MOL. BIOL., vol. 37, no. 1, 31 January 2004 (2004-01-31), pages 1 - 10, XP002541401
APPLIED & ENVIRONMENTAL MICROBIOLOGY, vol. 69, 2003, pages 6597 - 6604
ENVIRONMENTAL MICROBIOLOGY, vol. 6, 2004, pages 754 - 759
FEMS MICROBIOLOGY ECOLOGY, vol. 41, 2002, pages 141 - 150
ENVIRONMENTAL MICROBIOLOGY, vol. 1, 1999, pages 307 - 317
Attorney, Agent or Firm:
ARENDS, William, Gerrit (90 Long Acre, London WC2E 9RA, GB)
Claims:
CLAIMS:

1. A method for detecting the presence of a hydrocarbon deposit in a geographical location comprising the steps of:

i) determining the concentration, at a site in the location or in a sample taken from the site, of a first polynucleotide encoding a protein capable of metabolising a hydrocarbon;

ii) determining the concentration, at the same site in the location or in a sample taken from the site, of a second polynucleotide encoding a protein capable of metabolising a hydrocarbon; and

iii) calculating a mathematical operation on the concentration of the first polynucleotide relative to the concentration of the second polynucleotide at the site;

wherein the result of the mathematical operation is indicative of the presence or absence of a hydrocarbon deposit at the location, and wherein the first polynucleotide encodes a different protein from the second polynucleotide.

2. A method according to claim 1 wherein the mathematical operation is calculating the ratio between the first and second polynucleotide.

3. A method according to claim 1 wherein steps i) and ii) further comprise the step of dividing the first and second polynucleotide concentrations by their respective population medians to achieve a common amplitude scale.

4. A method according to claim 1 wherein the step of determining the concentration of the first polynucleotide comprises determining the concentration of a subsequence of the first polynucleotide sequence.

5. A method according to either of claims 1 or 4 wherein the step of determining the concentration of the second polynucleotide comprises determining the concentration of a subsequence of the second polynucleotide sequence.

6. A method according to either of claims 4 or 5 wherein the subsequence comprises a consensus sequence present in a plurality of homologous genes from different micro-organisms encoding a selected protein capable of metabolising a hydrocarbon.

7. A method according to any one of the preceding claims wherein the first and/or second polynucleotide is DNA.

8. A method according to any one of the preceding claims wherein the first and/or second polynucleotide is RNA.

9. A method according to any one of the preceding claims wherein the first and/or second polynucleotide encodes a protein which metabolises a hydrocarbon selected from a C1 to C20 alkane, an optionally substituted single or multi-ring aromatic hydrocarbon, or a naphthene.

10. A method according to any one of the preceding claims wherein the first and/or second polynucleotide encodes a biphenyl dioxygenase, a toluene monooxygenase, an alkane hydroxylase, a naphthalene dioxygenase, a toluene dioxygenase, a xylene monooxygenase, a butane monooxygenase, a methane monooxygenase, a catechol 2,3, dioxygenase, a bacterial P450 oxygenase, or a eukaryotic P450 oxygenase.

11. A method according to claim 10 wherein the first or second polynucleotide is an AlkB gene.

12. A method according to claim 10 wherein the first or second polynucleotide is a XylM gene.

13. A method according to claim 10 wherein the first or second polynucleotide is a TolD gene.

14. A method according to claim 10 wherein the first or second polynucleotide encodes a methane monooxygenase, such as particulate methane monooxygenase (pmoA).

15. A method according to claim 10 wherein the first or second polynucleotide encodes a catechol 2,3, dioxygenase.

16. A method according to any one of the preceding claims wherein one of the first or second polynucleotides encodes a protein capable of metabolising a short chain alkane, and wherein the other polynucleotide encodes a protein capable of metabolising a long chain alkane or an aromatic compound.

17. A method according to any one of claims 1 to 15 wherein one of the first or second polynucleotides encodes a protein capable of metabolising an alkane, and wherein the other polynucleotide encodes a protein capable of metabolising an aromatic compound.

18. A method according to either of claims 16 and 17 wherein the second polynucleotide encodes a catechol 2,3, dioxygenase.

19. A method according to any one of the preceding claims wherein the concentration of the first and/or second polynucleotide is determined by quantitative PCR.

20. A method according to any one of the preceding claims wherein the sample is taken from a depth of between 0 and 100cm, preferably between 0 and 50cm, and most preferably between 10 and 40cm below the solid surface of the Earth.

21. A method according to any one of the preceding claims wherein the sample is a soil sample.

22. A method according to any one of the preceding claims wherein the sample is a marine sediment sample.

23. A method according to any one of the preceding claims wherein the sample is a freshwater sediment sample.

24. A method according to any one of the preceding claims, further comprising the step of, after taking the sample, stabilising the nucleic acids in the sample.

25. A method according to any one of the preceding claims wherein steps (i), (ii) and (iii) are carried out with respect to a plurality of different sites at the location.

26. A method according to claim 25, further comprising the step of correlating the results of steps (i), (ii) and (iii) at each site, thereby determining the variations in the results of the mathematical operations at different sites within the location.

27. A method according to any one of the preceding claims further comprising the step of determining the concentration of a third polynucleotide.

28. A method according to claim 27 further comprising the step of calculating a mathematical operation on the concentration of the first and/or second polynucleotide relative to the concentration of the third polynucleotide.

29. A method according to any one of the preceding claims, wherein step iii) further comprises, prior to calculating the mathematical operation on the concentration of the first polynucleotide relative to the concentration of the second polynucleotide, the step of determining the concentration of the third polynucleotide and calculating a mathematical operation on the concentration of the first and/or second polynucleotide relative to the concentration of the third polynucleotide.

30. A method according to any one of claims 27 to 29 wherein the third polynucleotide encodes a generic bacterial protein.

31. A method according to any one of claims 27 to 30 wherein the third polynucleotide is a EuBac gene.

32. The use of a catechol 2,3,dioxygenase gene as a biomarker for normalising the concentration, in an environmental sample, of one or more other genes encoding a protein or proteins capable of metabolising one or more hydrocarbons, for detecting hydrocarbon deposits.

33. A method of selecting a polynucleotide encoding a protein capable of metabolising one or more hydrocarbons as a biomarker suitable for normalising the concentration, in an environmental sample, of one or more other polynucleotides encoding a protein or proteins capable of metabolising one or more hydrocarbons, in a method for detecting hydrocarbon deposits, comprising;

i) determining the concentration of a plurality of polynucleotides encoding proteins capable of metabolising one or more hydrocarbons, at each of a plurality of sites;

ii) calculating a set of pairwise mathematical operations on the concentrations of the selected polynucleotides;

iii) determining a set of correlations between the results of the mathematical operations on the concentration of the selected polynucleotides, and a further data set indicative of the presence or absence of a hydrocarbon deposit obtained with respect to each of the sites; and

iv) identifying a pairwise mathematical operation which shows a correlation with the further data set, and from this determining a polynucleotide suitable for use as a normalising biomarker.

34. A method as defined in claim 33 wherein the further data set comprises geochemical data.

35. A method as defined in claim 34 wherein the geochemical data relates to the concentration of free hydrocarbons within a solid substrate at each site.

36. A method as defined in claim 34 wherein the geochemical data relates to the concentration of free hydrocarbon gas or vapour above each site.

Description:
A Microbiological Detection Method

Technical Field

The present invention relates to a method for detecting the presence of a hydrocarbon deposit and, more specifically, a method of detecting a naturally occurring hydrocarbon deposit in a geographical location.

Background

The increasing demand for fossil fuels gives rise to an ongoing need to find new, naturally occurring deposits of oil and gas which can be industrially extracted. There are many known methods of prospecting for oil and gas such as by analysing the geology of a location suspected of containing a hydrocarbon deposit. For example, a gravity or magnetic survey may be used or, for a particularly promising location, a seismic survey may be carried out.

It is also known that significant hydrocarbon deposits can result in visible surface features such as oil and natural gas or vapour seeps. However, even if such seeps are not visible, it is possible to prospect for hydrocarbon deposits by detecting the seeps in other ways. For example, methods of geochemical prospecting can be used. These methods involve obtaining a subsurface sample from a depth of at least 1 to 4m, which can be difficult to obtain, and the sensitivity of the results can vary.

WO-A-91/02086 reports on the detection of oil and gas deposits by taking soil samples from a location where there is potentially a hydrocarbon deposit. This approach is based on the theory that microbes, in particular bacteria, which are capable of metabolising hydrocarbons have a selective advantage in areas where subsurface hydrocarbon gases or vapours are present. Therefore the concentration of such microbes is higher than average in a soil sample which is located above an oil or gas deposit.

In the approach described in WO-A-91/02086, a first portion of a soil sample obtained from a location is exposed to a hydrocarbon gas such as ethane and the amount of a metabolite resulting from the metabolism of the hydrocarbon gas is measured over a predetermined length of time. This gives an indication of the activity of microbes within the soil sample which are capable of metabolising the hydrocarbon gas. A second portion of the soil sample is exposed to a substrate, such as glucose, which can generally be metabolised by all bacteria and the production of the metabolite thereof is also measured. This gives an indication of the overall microbial population in the soil sample. The ratio of the microbial population which is capable of metabolising hydrocarbons to the overall microbial population is calculated to provide a normalised index of the presence of microbes which are capable of metabolising hydrocarbons. The detection of an index value is indicative of the presence or absence of a hydrocarbon deposit at the location. Furthermore, a plurality of samples can be taken at different sites across a location and the index determined at each site. The variation of index values across the location can then be mapped out to indicate the presence of a deposit within the location.

WO-A-91/02086 also reports on combining the index with the concentration of free hydrocarbon gas at each site in order to arrive at a multivariate index that maps seepage more reliably.

The problem with the oil and gas exploration approach reported in WO-A-91/02086 is that, in practice, it is beset with a number of technical difficulties. For example, in many cases, a location which holds a potential hydrocarbon deposit is very remote and therefore it is usually some time between a soil sample being obtained from a location and the sample being analysed under laboratory conditions. During this period of time, it is possible that the ratio of the number of hydrocarbon metabolising microbes to the overall number of microbes will fall as the microbes are withdrawn from their source of hydrocarbons. Thus this approach can give false results. Another problem is that the detection process requires the provision of radioactive hydrocarbon isotopes such as carbon 14 so that the (isotopic) metabolic products can be detected and distinguished from the products of other metabolic processes. However, radioactive isotopes require careful storage, handling and disposal. Furthermore, supplies of hydrocarbon isotopes can easily become contaminated with compounds that disrupt the normal metabolism of microorganisms leading to inaccurate results.

US2002/0065609A1 reports a different approach to mineral exploration. It discloses the analysis of microbial populations in relation to sequences of their small subunit ribosomal DNA (rDNA) sequence. It hypothesises that specific polymorphisms of the 16S rDNA sequence of bacteria can be correlated to a sample parameter such as the geographical location of populations of bacteria. If the sample parameter is the presence of hydrocarbon deposits at geographical locations then the presence of bacteria containing the polymorphisms at a geographical location is indicative of the presence of a hydrocarbon deposit at the geographical location. That is to say, specific 16S rDNA polymorphisms are putative markers for bacteria suited for survival over or around hydrocarbon deposits.

However, there is a problem with the approach reported in US2002/0065609A1. The 16S rDNA gene simply encodes the 16S subunit rRNA. Although the gene is highly conserved between taxonomic groups, it is not related, per se, to the ability of a microbe to survive and reproduce in an environment with a higher than average concentration of free hydrocarbon gas. Therefore, since there is no direct link between the supposed marker and the environmental survival trait, this approach is not expected to be very reliable.

WO2005/103284 relates to multi-targeted microbial screening and monitoring methods. It involves testing for the presence/absence of microbial markers that are shared by both 'target' and 'index' microbes. Index microbes are genetically distinct from target microbes but behave in a similar way under equivalent conditions. The results are used to calculate an aggregate index value. The index values are useful when the number of markers detected is not sufficient to indicate the presence of the target microbe. A threshold index value can be calculated, and if the index value is above the threshold it is indicative of the presence of target microbes.

Another approach for the detection of hydrocarbons is provided in WO03/012390. It discloses the use of micro-arrays to analyse samples for the presence of hydrocarbons and to perform multiple tests in parallel. The probes used specifically bind to analytes of targets associated with hydrocarbons.

However, the results obtained via the methods of WO2005/103284 and WO03/012390 do not distinguish between the possible causes for an increase in the number of target/hydrocarbon metabolising genes detected in the bacterial population, i.e. they do not indicate whether the detected increase is due to changes in conditions which are favourable to all bacteria in general, or whether it is caused by changes in conditions that are favourable to hydrocarbon metabolising bacteria only.

WO2009/013516 also reports a method for the detection of hydrocarbon deposits in a geographical location. The method of WO09/013516 relies on comparing the concentration of a hydrocarbon metabolising marker gene with the number of bacteria at the location. The comparative results are given as a ratio. This 'normalised' data, termed an index value, provides information about the abundance of hydrocarbon metabolising genes at the location. It is indicative of the presence or absence of a hydrocarbon deposit. However, there is always demand for enhanced methods of hydrocarbon deposit detection with improved accuracy and which utilise more refined processes.

The present invention seeks to alleviate one or more of the above problems.

Statements of Invention

According to one aspect of the present invention, there is provided a method for detecting the presence of a hydrocarbon deposit in a geographical location comprising the steps of:

i) determining the concentration, at a site in the location or in a sample taken from the site, of a first polynucleotide encoding a protein capable of metabolising a hydrocarbon;

ii) determining the concentration, at the same site in the location or in a sample taken from the site, of a second polynucleotide encoding a protein capable of metabolising a hydrocarbon; and

iii) calculating a mathematical operation on the concentration of the first polynucleotide relative to the concentration of the second polynucleotide at the site;

wherein the result of the mathematical operation is indicative of the presence or absence of a hydrocarbon deposit at the location, and wherein the first polynucleotide encodes a different protein from the second polynucleotide.

Preferably, the mathematical operation is calculating the ratio between the first and second polynucleotide.

Advantageously, steps i) and ii) further comprise the step of dividing the first and second polynucleotide concentrations by their respective population medians to achieve a common amplitude scale.
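The ratio and median scaling described above can be illustrated with a minimal sketch. It assumes that the "population" is simply the set of sites sampled at the location; the function names and the copy numbers are hypothetical and are not taken from the specification.

```python
from statistics import median


def median_scale(concentrations):
    """Divide each concentration by the median of its own population."""
    m = median(concentrations)
    if m == 0:
        raise ValueError("median is zero; cannot scale")
    return [c / m for c in concentrations]


def site_ratios(first, second):
    """Ratio of the median-scaled first polynucleotide to the
    median-scaled second polynucleotide at each site."""
    scaled_first = median_scale(first)
    scaled_second = median_scale(second)
    return [a / b for a, b in zip(scaled_first, scaled_second)]


# Hypothetical gene copy numbers measured at five sites (illustrative only).
first_gene = [100.0, 200.0, 400.0, 200.0, 100.0]
second_gene = [10.0, 10.0, 5.0, 20.0, 10.0]

ratios = site_ratios(first_gene, second_gene)
print(ratios)
```

In this sketch a site where the second polynucleotide is not detected (a concentration of zero) would raise a ZeroDivisionError; how such sites should be handled is not addressed here.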

Conveniently, the step of determining the concentration of the first polynucleotide comprises determining the concentration of a subsequence of the first polynucleotide sequence.

Preferably, the step of determining the concentration of the second polynucleotide comprises determining the concentration of a subsequence of the second polynucleotide sequence.

Advantageously, the subsequence comprises a consensus sequence present in a plurality of homologous genes from different micro-organisms encoding a selected protein capable of metabolising a hydrocarbon.

Preferably, the first and/or second polynucleotide is DNA.

Alternatively, the first and/or second polynucleotide is RNA.

Conveniently, the first and/or second polynucleotide encodes a protein which metabolises a hydrocarbon selected from a C1 to C20 alkane, an optionally substituted single or multi-ring aromatic hydrocarbon, or a naphthene.

Advantageously, the first and/or second polynucleotide encodes a biphenyl dioxygenase, a toluene monooxygenase, an alkane hydroxylase, a naphthalene dioxygenase, a toluene dioxygenase, a xylene monooxygenase, a butane monooxygenase, a methane monooxygenase, a catechol 2,3, dioxygenase, a bacterial P450 oxygenase, or a eukaryotic P450 oxygenase.

Preferably, the first or second polynucleotide is an AlkB gene.

Alternatively, the first or second polynucleotide is a XylM gene.

Alternatively, the first or second polynucleotide is a TolD gene.

Conveniently, the first or second polynucleotide encodes a methane monooxygenase, such as particulate methane monooxygenase (pmoA).

Alternatively, the first or second polynucleotide encodes a catechol 2,3,dioxygenase.

Preferably, one of the first or second polynucleotides encodes a protein capable of metabolising a short chain alkane, and wherein the other polynucleotide encodes a protein capable of metabolising a long chain alkane or an aromatic compound.

Conveniently, one of the first or second polynucleotides encodes a protein capable of metabolising an alkane, and wherein the other polynucleotide encodes a protein capable of metabolising an aromatic compound.

Advantageously, the second polynucleotide encodes a catechol 2,3,dioxygenase.

Preferably, the concentration of the first and/or second polynucleotide is determined by quantitative PCR.

Conveniently, the sample is taken from a depth of between 0 and 100cm, preferably between 0 and 50cm, and most preferably between 10 and 40cm below the solid surface of the Earth.

Preferably, the sample is a soil sample.

Alternatively, the sample is a marine sediment sample.

Alternatively, the sample is a freshwater sediment sample.

Advantageously, the method further comprises the step of, after taking the sample, stabilising the nucleic acids in the sample.

Preferably, steps (i), (ii) and (iii) are carried out with respect to a plurality of different sites at the location.

Conveniently, the method further comprises the step of correlating the results of steps (i), (ii) and (iii) at each site, thereby determining the variations in the results of the mathematical operations at different sites within the location.
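One possible way of examining the variation between sites is sketched below. The rule used here (flagging any site whose result exceeds a chosen multiple of the median across all sites) is an assumption made for illustration only; the specification does not prescribe a threshold. The site labels and values are hypothetical.

```python
from statistics import median


def flag_anomalies(site_values, factor=2.0):
    """Return the sites whose value exceeds `factor` times the median
    of all sites. `factor` is an illustrative assumption."""
    m = median(site_values.values())
    return [site for site, value in site_values.items() if value > factor * m]


# Hypothetical results of the mathematical operation at five sites.
ratios_by_site = {"S1": 0.9, "S2": 1.1, "S3": 3.5, "S4": 1.0, "S5": 0.8}

anomalous = flag_anomalies(ratios_by_site)
print(anomalous)
```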

Advantageously, the method further comprises the step of determining the concentration of a third polynucleotide.

Preferably, the method further comprises the step of calculating a mathematical operation on the concentration of the first and/or second polynucleotide relative to the concentration of the third polynucleotide.

Conveniently, step iii) of the method further comprises, prior to calculating the mathematical operation on the concentration of the first polynucleotide relative to the concentration of the second polynucleotide, the step of determining the concentration of the third polynucleotide and calculating a mathematical operation on the concentration of the first and/or second polynucleotide relative to the concentration of the third polynucleotide.

Advantageously, the third polynucleotide encodes a generic bacterial protein.

Preferably, the third polynucleotide is a EuBac gene.
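Normalisation against the third polynucleotide can be sketched as follows, assuming the mathematical operation is a simple division of each hydrocarbon-metabolising gene concentration by the generic (for example EuBac) concentration at the same site. The function name, site labels and values are hypothetical.

```python
def normalised_index(gene_conc, generic_conc):
    """Divide a gene concentration by the generic (e.g. EuBac)
    concentration measured at the same site."""
    if generic_conc <= 0:
        raise ValueError("generic concentration must be positive")
    return gene_conc / generic_conc


# Hypothetical concentrations at two sites (illustrative only).
sites = [
    {"site": "S1", "first": 50.0, "second": 25.0, "generic": 1000.0},
    {"site": "S2", "first": 400.0, "second": 40.0, "generic": 2000.0},
]

indices = {
    s["site"]: (
        normalised_index(s["first"], s["generic"]),
        normalised_index(s["second"], s["generic"]),
    )
    for s in sites
}
print(indices)
```

Note that in a plain ratio of two values normalised by the same generic concentration the generic term cancels, so under this assumption the normalisation matters mainly when the normalised values are used on their own or are scaled (for example by population medians) before being combined.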

According to another embodiment of the present invention, there is provided the use of a catechol 2,3,dioxygenase gene as a biomarker for normalising the concentration, in an environmental sample, of one or more other genes encoding a protein or proteins capable of metabolising one or more hydrocarbons, for detecting hydrocarbon deposits.

In a further embodiment of the present invention, there is provided a method of selecting a polynucleotide encoding a protein capable of metabolising one or more hydrocarbons as a biomarker suitable for normalising the concentration, in an environmental sample, of one or more other polynucleotides encoding a protein or proteins capable of metabolising one or more hydrocarbons, in a method for detecting hydrocarbon deposits, comprising:

i) determining the concentration of a plurality of polynucleotides encoding proteins capable of metabolising one or more hydrocarbons, at each of a plurality of sites;

ii) calculating a set of pairwise mathematical operations on the concentrations of the selected polynucleotides;

iii) determining a set of correlations between the results of the mathematical operations on the concentration of the selected polynucleotides, and a further data set indicative of the presence or absence of a hydrocarbon deposit obtained with respect to each of the sites; and

iv) identifying a pairwise mathematical operation which shows a correlation with the further data set, and from this determining a polynucleotide suitable for use as a normalising biomarker.

Preferably, the further data set comprises geochemical data.

Advantageously, the geochemical data relates to the concentration of free hydrocarbons within a solid substrate at each site.

Conveniently, the geochemical data relates to the concentration of free hydrocarbon gas or vapour above each site.
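Steps i) to iv) of the selection method can be sketched as below. This is a minimal illustration under stated assumptions: the pairwise mathematical operation is taken to be a ratio, the correlation is taken to be Pearson's coefficient, and the denominator of the best-correlated ratio is treated as the candidate normalising biomarker. The gene names, concentrations and geochemical values are hypothetical.

```python
from itertools import permutations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)


def rank_pairs(concentrations, geochem):
    """For every ordered pair of genes, correlate the per-site ratio with
    the geochemical data and sort by absolute correlation, strongest first."""
    results = []
    for num, den in permutations(concentrations, 2):
        ratios = [a / b for a, b in zip(concentrations[num], concentrations[den])]
        results.append((pearson(ratios, geochem), num, den))
    return sorted(results, key=lambda t: abs(t[0]), reverse=True)


# Hypothetical per-site concentrations for three genes at four sites.
data = {
    "geneA": [10.0, 20.0, 30.0, 40.0],
    "geneB": [10.0, 10.0, 10.0, 10.0],
    "geneC": [5.0, 9.0, 4.0, 8.0],
}
# Hypothetical geochemical measurements at the same four sites.
geochem = [1.0, 2.0, 3.0, 4.0]

best_r, numerator, denominator = rank_pairs(data, geochem)[0]
print(best_r, numerator, denominator)
```

With so few sites a high correlation can arise by chance, so in practice any candidate identified in this way would need to be checked against further data.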

Advantageously, the presence of the first and/or second polynucleotide, wherein said polynucleotide encodes a small subunit rRNA, is determined using a forward primer comprising SEQ ID NO:33 or SEQ ID NO:39, or a sequence with at least 80% identity thereto, a reverse primer comprising SEQ ID NO:34 or SEQ ID NO:40 respectively, or a sequence with at least 80% identity thereto, and optionally a probe comprising SEQ ID NO:35 or SEQ ID NO:41 respectively, or a sequence with at least 80% identity thereto.

Conveniently, the presence of the first and/or second polynucleotide, wherein said polynucleotide encodes 16S rRNA, is determined using a forward primer comprising SEQ ID NO:36 or SEQ ID NO:42, or a sequence with at least 80% identity thereto, a reverse primer comprising SEQ ID NO:37 or SEQ ID NO:43 respectively, or a sequence with at least 80% identity thereto, and optionally a probe comprising SEQ ID NO:38 or SEQ ID NO:44 respectively, or a sequence with at least 80% identity thereto.

Conveniently, the protein capable of metabolising a hydrocarbon is encoded by a nucleotide sequence comprising a sequence with at least 80% sequence identity to a sequence referred to in Table 2. It is preferred that the sequence has at least 90%, 95%, 99% or 100% sequence identity to a sequence referred to in Table 2. Table 2 provides the GenBank accession numbers of the sequences. The preferred sequences are those present in the GenBank Database on 28th July 2008.

Advantageously, the presence of the biphenyl dioxygenase protein is determined using a forward primer comprising SEQ ID NO: 13 or a sequence with at least 80% identity thereto, a reverse primer comprising SEQ ID NO: 14 or a sequence with at least 80% identity thereto, and optionally a probe comprising SEQ ID NO: 15 or a sequence with at least 80% identity thereto.

Conveniently, the presence of the catechol 2,3,dioxygenase protein is determined using a forward primer comprising SEQ ID NO:1 or a sequence with at least 80% identity thereto, a reverse primer comprising SEQ ID NO:2 or a sequence with at least 80% identity thereto, and optionally a probe comprising SEQ ID NO:3 or a sequence with at least 80% identity thereto.

Preferably, the presence of the naphthalene dioxygenase protein is determined using a forward primer comprising SEQ ID NO:4 or a sequence with at least 80% identity thereto, a reverse primer comprising SEQ ID NO:5 or a sequence with at least 80% identity thereto, and optionally a probe comprising SEQ ID NO:6 or a sequence with at least 80% identity thereto.

Advantageously, the presence of the toluene dioxygenase protein is determined using a forward primer comprising SEQ ID NO:7 or a sequence with at least 80% identity thereto, a reverse primer comprising SEQ ID NO:8 or a sequence with at least 80% identity thereto, and optionally a probe comprising SEQ ID NO:9 or a sequence with at least 80% identity thereto.

Conveniently, the presence of the xylene monooxygenase protein is determined using a forward primer comprising SEQ ID NO: 10 or a sequence with at least 80% identity thereto, a reverse primer comprising SEQ ID NO:11 or a sequence with at least 80% identity thereto, and optionally a probe comprising SEQ ID NO: 12 or a sequence with at least 80% identity thereto.

Preferably, the presence of the butane monooxygenase protein is determined using a forward primer comprising SEQ ID NO:28 or a sequence with at least 80% identity thereto, a reverse primer comprising SEQ ID NO:29 or a sequence with at least 80% identity thereto, and optionally a probe comprising SEQ ID NO:30 or a sequence with at least 80% identity thereto.

Advantageously, the presence of the alkane dehydrogenase protein is determined using a forward primer comprising one of SEQ ID NO:s 16, 19, 22 or 25, or a sequence with at least 80% identity thereto, a reverse primer comprising SEQ ID NO: 17, 20, 23 or 26 respectively, or a sequence with at least 80% identity thereto, and optionally a probe comprising SEQ ID NO: 18, 21, 24 or 27 respectively, or a sequence with at least 80% identity thereto.

Conveniently, the presence of the methane monooxygenase protein is determined using a forward primer comprising SEQ ID NO:31, or a sequence with at least 80% identity thereto, and a reverse primer comprising SEQ ID NO:32, or a sequence with at least 80% identity thereto.

Conveniently, the presence of the particulate methane monooxygenase protein is determined using a forward primer and a reverse primer.

Alternatively, the forward primers, reverse primers and probes used for the detection of the hydrocarbon metabolising and generic polynucleotide genes have 80, 85, 90, 95, 99 or 99.5% identity with the respective SEQ ID NOs.

In this specification, the percentage "identity" between two sequences is determined using the BLASTP algorithm version 2.2.2 (Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402) using default parameters. In particular, the BLAST algorithm can be accessed on the Internet using the URL http://www.ncbi.nlm.nih.gov/blast/.
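For orientation only, the idea of a percentage identity threshold can be illustrated with a much simpler calculation than BLAST. The sketch below is not the BLAST algorithm: it assumes two sequences that are already aligned and of equal length, and counts matching positions. The sequences shown are hypothetical and do not correspond to any SEQ ID NO in this specification.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two pre-aligned, equal-length
    sequences carry the same residue. A simplification, not BLAST."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Hypothetical sequences (illustrative only).
reference = "ACGTACGTAC"
variant = "ACGTACGTTT"

identity = percent_identity(reference, variant)
meets_threshold = identity >= 80.0
print(identity, meets_threshold)
```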

In this specification the term "hydrocarbon" means an organic chemical compound comprising hydrogen and carbon atoms.

In this specification, where a protein is described as being "capable of metabolizing a hydrocarbon" this means that the protein has activity in facilitating or causing a chemical reaction on a hydrocarbon under in vivo conditions. This can be tested, for example, by adding a sample hydrocarbon to a reaction medium that replicates intracellular conditions and detecting the decrease in the presence of the hydrocarbon over a predetermined length of time in comparison with a control medium from which the protein is absent.

The term "mathematical operation" as used in this specification encompasses any procedure for generating a value from one or more other values (the operands). It includes analytical or algorithmic operations, for example, calculating a ratio or a median value from a data set. In this instance, the variable operands can be the concentrations of the first, second and/or third polynucleotides.
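As an illustration only, the two operations named above (a ratio and a median) can be sketched as follows. The concentration values, units and variable names are hypothetical and are not taken from the experimental examples.

```python
from statistics import median

def ratio(first_conc, second_conc):
    """Ratio of two polynucleotide concentrations measured in the same sample."""
    if second_conc <= 0:
        raise ValueError("second concentration must be positive")
    return first_conc / second_conc

# Hypothetical concentrations (e.g. gene copies per gram) at four sample sites.
first = [1200.0, 300.0, 4500.0, 800.0]
second = [400.0, 600.0, 500.0, 800.0]

# One ratio per sample site, then the median of those ratios.
ratios = [ratio(a, b) for a, b in zip(first, second)]
median_ratio = median(ratios)
```

Here the operands are the first and second polynucleotide concentrations; a third concentration could be introduced as a further operand in the same way.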

In this specification the term "short chain alkane" refers to alkanes comprising 6 carbon atoms or fewer. The term "long chain alkane" encompasses alkanes with more than 6 carbon atoms.

In this specification, the term "live petroleum" is used to indicate petroleum deposits with a high abundance of the biochemically labile short chain normal alkanes relative to the biochemically less labile aromatics.

The term "EuBac", as used in this specification, refers to a polynucleotide which encodes a small subsection of bacterial 16S rRNA. It is highly conserved amongst most bacteria, and therefore the EuBac copy number is an example of a useful indicator of the total microbial population.

Brief Description of the Figures

Embodiments of the present invention will now be described with reference to the accompanying figures:-

Figure 1 is a graph showing free or adsorbed gas, vapour or liquid hydrocarbon concentrations in surface soils versus location.

Figure 2 is a schematic cross-sectional view of a geographical location containing a subsurface hydrocarbon deposit.

Figure 3 is a block diagram demonstrating the principle of the present invention.

Figure 4 is a sectional display of the field site of Experimental example 1 illustrating the relationship between gene copy number values and the location of an oil seep. The index value populations are each divided by their respective medians in order to reduce them to a common amplitude scale. The sections are referenced in the plan view shown in Figure 5.

Figure 5 is a plan view of the field site of Experimental example 1 illustrating the relationship between Transform 1 index values and the location of the oil seep. The index value populations are each divided by their respective medians in order to reduce them to a common amplitude scale. The letter notation (A-D) refers to the section line in Figure 6.

Figure 6 is a sectional view of the field site of Experimental example 1 illustrating the relationship between Transform 1 index values and the location of the oil seep. The sections are referenced in the plan view, Figure 5. The index value populations are each divided by their respective medians in order to reduce them to a common amplitude scale.

Figure 7 is a plan view of the field site of Experimental example 1 illustrating the spatial and scalar variation in Transform 2 index values in relation to the location of the oil seep. The index value populations are each divided by their respective medians in order to reduce them to a common amplitude scale. The letter notation (A-D) refers to the section line in Figure 8.

Figure 8 is a sectional view of the field site of Experimental example 1 illustrating the relationship between Transform 2 index values and the location of the oil seep. The sections are referenced in the plan view, Figure 7. The index value populations are each divided by their respective medians in order to reduce them to a common amplitude scale.

Figure 9 is a sectional display of the field site of Experimental example 1 showing the relationship between the Transform 3 index values and the location of the Formby oil seep. Sections ABC and BD are referenced in Figure 5.

Figure 10 is a plan view of the field site of Experimental example 2 illustrating the spatial and scalar variation in pMoA index Transform 2 values in relation to a virgin gas field and a dry hole. The index values are scaled and truncated at the low end in order to facilitate this visualisation.

Detailed Description

Referring to Figures 1 and 2, some of the principles that underlie the present invention will be described.

At a geographical location 1, a subsurface hydrocarbon deposit 2 is not visible from the surface 3. However, the vertical migration of hydrocarbon gases, vapours or liquids 4 to the surface 3 from the hydrocarbon deposit results in the generation of an anomaly at the surface 3 of increased concentrations of hydrocarbons in surface soils. In some situations, the anomaly is directly above the hydrocarbon accumulation (an apical anomaly) but in other situations, the anomaly takes the form of a halo around the periphery of the hydrocarbon deposit (a halo anomaly). Referring to Figure 1, the concentration of migrant hydrocarbons across the geographical location 1 either as an apical or a halo anomaly is shown.

The presence of hydrocarbon gas in surface soils results in the development of soil microbial populations capable of using the hydrocarbons as a nutrient and energy source. Thus the presence of these bacterial populations in soils (or, to be precise, the presence of elevated concentrations of these bacterial populations in comparison to soils which are not influenced by migrant hydrocarbons) indicates that hydrocarbons are migrating through the soils. This, in turn, indicates that there is an underlying oil or gas reservoir.

Therefore, the present invention concerns detecting the presence of microbial and, in particular, bacterial populations which are capable of metabolising hydrocarbons. This involves determining the concentration of a first and a second polynucleotide present in a sample obtained from a site, wherein the site is to be tested for the presence of a petroleum deposit. Each of the first and second polynucleotides encodes a different hydrocarbon metabolising gene. The next step utilises a comparative mathematical operation performed on the respective concentration values of the first and second polynucleotides. The results are interpreted and are indicative of the presence or absence of a hydrocarbon deposit in the vicinity of the sample site. In some embodiments the concentrations of the first and second polynucleotide are each divided by their respective population medians in order to bring them to a common amplitude scale before they are compared with each other. In some embodiments the concentration of the second polynucleotide serves to act as a normalising biomarker in relation to the concentration of the first polynucleotide.

The method is applicable to analysis of a previously obtained sample (e.g. transported to a different location) as well as analysis in situ.

In a further embodiment of the present invention the concentration of a third polynucleotide is also detected. This third polynucleotide encodes a generic gene, such as EuBac, which is common to the majority of bacteria and hence is useful in calculating the total microbial population. The concentrations of the first and second polynucleotides can be normalised with respect to the concentration of the third polynucleotide to give an index value before they are compared with each other. This can take place before or after the optional division of the concentrations of the first and second polynucleotides by their respective population medians.
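A minimal sketch of this normalisation follows, assuming that the index value is a simple quotient of the target concentration over the generic (EuBac) concentration and that the final comparison is a ratio. The function names and all numbers are hypothetical.

```python
from statistics import median

def index_values(target, generic):
    """Normalise each target concentration by the generic gene concentration."""
    return [t / g for t, g in zip(target, generic)]

def median_scale(values):
    """Divide a population of values by its own median (common amplitude scale)."""
    m = median(values)
    if m == 0:
        raise ValueError("median is zero; cannot scale")
    return [v / m for v in values]

# Hypothetical gene copy numbers at three sample sites.
first = [200.0, 400.0, 1600.0]
second = [100.0, 100.0, 200.0]
eubac = [1000.0, 2000.0, 4000.0]

# Normalise to the generic gene, then bring both populations to a common scale.
first_scaled = median_scale(index_values(first, eubac))
second_scaled = median_scale(index_values(second, eubac))

# Assumed comparative operation: ratio of the two scaled index values per site.
comparison = [a / b for a, b in zip(first_scaled, second_scaled)]
```

Because both steps are divisions, applying the median scaling before or after the normalisation to the generic gene is a matter of choice, as the paragraph above notes.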

In a still further embodiment, a plurality of polynucleotides are detected, each of which encodes a different hydrocarbon metabolizing gene, at a plurality of sample sites. The concentration of each polynucleotide is compared pairwise via a mathematical operation to the concentration of each of the other polynucleotides detected at the same sample site. The results are then assessed (e.g. classical statistical criteria of significance are applied) and the pairwise comparison that yields the best results in terms of accurately indicating the presence of a hydrocarbon deposit can then be selected, i.e. the optimal normalising biomarker combination is chosen. This selection is then verified by an alternative method of hydrocarbon prospecting. It is advantageous to compare the concentration of a variety of different hydrocarbon metabolising genes and then select the pairing which gives the best results. This is because the environmental conditions at each sample site vary, as does the age of the petroleum deposit, so different hydrocarbon metabolising genes will be present, and it can be difficult to predict which ones will yield the best results at the outset.
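The pairwise selection described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the experimental examples: the comparison is taken to be a log10 ratio, the "statistical criterion" is taken to be the absolute value of Welch's t statistic between sites known (from an independent survey) to overlie a deposit and background sites, and the gene labels and concentrations are hypothetical.

```python
from itertools import combinations
from math import log10, sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def best_pair(concs, over_deposit):
    """Score every gene pair by how well its log ratio separates the two site groups.

    concs: dict mapping a gene label to a list of concentrations, one per site.
    over_deposit: list of booleans, True where a deposit is independently known.
    """
    scores = {}
    for g1, g2 in combinations(concs, 2):
        log_ratio = [log10(a / b) for a, b in zip(concs[g1], concs[g2])]
        on = [r for r, flag in zip(log_ratio, over_deposit) if flag]
        off = [r for r, flag in zip(log_ratio, over_deposit) if not flag]
        scores[(g1, g2)] = abs(welch_t(on, off))
    return max(scores, key=scores.get), scores

# Hypothetical concentrations at six sites; the first three overlie a deposit.
concs = {
    "alkB": [900.0, 1100.0, 1000.0, 100.0, 120.0, 90.0],
    "C23DO": [100.0, 110.0, 95.0, 100.0, 105.0, 98.0],
    "xylene_mo": [50.0, 200.0, 20.0, 180.0, 30.0, 150.0],
}
over_deposit = [True, True, True, False, False, False]

best, scores = best_pair(concs, over_deposit)
```

With these invented values the alkB / C23DO pairing separates the two groups most clearly; in practice the chosen pairing would still need to be verified by an alternative prospecting method, as stated above.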

In an alternative embodiment, the present invention is utilised in synergistic combination with other technology. For example, the technique of controlled source electromagnetism, in which sensors are deployed over an area in order to detect hydrocarbon deposits directly, is adapted so that the sensors simultaneously retrieve a core sample at 50cm beneath the soil surface. The sample is then tested in accordance with the present invention. The effectiveness of the present invention when analysing a core sample from a shallow depth, such as 50cm, is advantageous in this instance. Similarly, underwater vehicles, such as submarines, which are currently used in various aspects of oil prospecting, are also adapted to collect shallow core samples.

Detection of Microbial Populations

In embodiments of the present invention, microbial populations in the vicinity of a hydrocarbon deposit are detected by detecting the presence of genes which encode a protein capable of metabolising a hydrocarbon or a family of hydrocarbons. The principle which lies behind this concept will now be described with reference to Figure 3 which shows a block diagram indicating the relationship between the presence of hydrocarbons and the expression of genes encoding hydrocarbon metabolising proteins.

Starting at block 5, the presence of a hydrocarbon in the environment surrounding a microbe (in particular a bacterium) is detected by the microbial cell via a number of mechanisms, principally via receptors on the cell surface. The detection of the presence of hydrocarbon results in a raised level of transcription of mRNA 6 which encodes a protein capable of metabolising the hydrocarbon. The protein 7 is, in turn, translated and expressed within the cell. The protein duly oxidises the hydrocarbon 8, releasing energy and resulting in growth 9 and multiplication of the cell. The multiplication of the cell leads to an increase in the number of genes 10 encoding hydrocarbon metabolising proteins in that locale, i.e. an increase in the concentration of such genes.

In contrast, a microbe which does not contain a gene encoding a hydrocarbon metabolising protein does not metabolise hydrocarbons and does not benefit from the hydrocarbons as a source of energy in the environment. Thus where hydrocarbon deposits are present, the microbial population is skewed in favour of microbes containing genes encoding hydrocarbon metabolising proteins. Therefore, the concentration of DNA in a sample population is skewed in favour of DNA encoding hydrocarbon metabolising proteins. In environments where hydrocarbons are the only source of energy, all the microbes present will contain one or more genes encoding a hydrocarbon metabolising protein in order to survive.

In specific embodiments of the present invention, the concentration of genes encoding hydrocarbon metabolising proteins varies not only due to the number of cells containing such genes in the population but also the number of copies of such genes within each cell. For example, bacterial cells containing a plasmid with such a gene will have a selective advantage in a hydrocarbon-rich environment over cells which do not contain such a plasmid. Furthermore, a cell which contains multiple plasmids incorporating such a gene may have a selective advantage over a cell containing a plasmid with only a single copy of a hydrocarbon metabolising protein encoding gene. Therefore, in such embodiments, the detection of genes encoding hydrocarbon metabolising proteins is particularly sensitive.

The above described embodiments relate to the detection of DNA encoding hydrocarbon metabolising proteins. However, in some alternative embodiments, RNA and, in particular, mRNA molecules encoding hydrocarbon metabolising proteins are detected instead. It is to be appreciated that mRNA transcripts exist for a relatively short period of time within a cell and therefore the detection of such an mRNA transcript is indicative of the cell actively metabolising hydrocarbons at the time of sampling. In contrast, the DNA gene copy number is an integrative measure of the microorganisms' exposure to specific hydrocarbons over a period of time.

In further embodiments of the present invention, the relative concentrations of DNA and mRNA encoding hydrocarbon metabolising proteins are compared to give an indication of the history of the presence of hydrocarbons in the environment of the microbial population. For example, if the concentration of DNA encoding hydrocarbon metabolising proteins in a microbial population is found to be average but the concentration of mRNA encoding hydrocarbon metabolising proteins is found to be well above average then this may be an indication that there is no underlying hydrocarbon deposit in the environment of the microbial population and that the presence of the high concentration of mRNA transcripts is due to human intervention (e.g. the vehicle of an individual taking the samples).

Genes encoding Hydrocarbon metabolising proteins

In order to carry out embodiments of the present invention, it is necessary to identify hydrocarbons which are metabolised by microbes; proteins which effect the oxidation process; and genes which encode the proteins.

Petroleum hydrocarbons comprise thousands of organic compounds. While there are far fewer compounds found in thermogenically derived gas or vapour, they are still plentiful. Hydrocarbons which migrate from a naturally occurring subsurface reservoir may be grouped into categories based not on their structure but on their volatility, namely volatile, semi-volatile or liquid. The volatiles are generally found as gases or vapours at standard temperature and pressure conditions while semi-volatiles are generally liquid under similar conditions but can volatilise very easily.

The main chemical component of gas or vapour migrating from a naturally occurring sub-surface reservoir to the surface is methane. Methane can be generated thermogenically and also biogenically, and therefore there is a risk of "false positive" results if detecting the presence of methane. However, it is still useful to determine the concentration of this gene. In preferred embodiments, a C2 to C20 alkane is detected (longer chained alkanes are found in decreasing concentrations in migrating gas or vapour) or straight chain alkanes and branch chain alkanes are detected. Alkenes are also detected, as are simple and alkylated single and multi-ring aromatics and saturated rings (naphthenes).

Alkane Oxidation

Alkanes are types of organic hydrocarbon compounds which only have single carbon-carbon bonds. Acyclic alkanes have the general formula CnH(2n+2). The enzymology of microbial alkane oxidation is well known in the art with many reviews available (see for example Oil & Gas Science and Technology - Rev. IFP, Vol. 58 (2003) pp 427-440).

Straight-chain hydrocarbons are oxidized by a group of enzymes known as alkane hydroxylases. These enzymes introduce oxygen atoms derived from molecular oxygen into the alkane substrate. Alkane degrading yeast strains contain multiple alkane hydroxylases belonging to the P450 superfamily, while many bacteria contain membrane-bound alkane hydroxylase systems. Short-chain alkanes are thought to be oxidized by alkane hydroxylases related to the soluble and particulate methane monooxygenases.

Some embodiments involve the detection and determination of the concentration of methane monooxygenases, e.g. the detection of soluble methane monooxygenase (sMMO) using primers mmoX1-mmoX2 (see Table 1), as described in Miguez et al., Microbial Ecology (1997), 33:21-31. Alternatively, polynucleotides encoding e.g. the particulate methane monooxygenase gene (pmoA) are detected. Alternative embodiments comprise assays based on the membrane bound alkane hydroxylase (alkB), which is thought to target longer chain alkanes.

Aromatic Oxidation

Aromatic hydrocarbons are hydrocarbons which incorporate one or more planar sets of six carbon atoms connected by delocalized electrons, i.e. a benzene ring. The enzymology of microbially mediated aromatic oxidation is also well known in the art.

Most aerobic aromatic-hydrocarbon biodegradation pathways converge through catechol-like intermediates that are typically cleaved by ortho- or meta-cleavage dioxygenases. One of the functions of catechol 2,3 dioxygenases (C23DO) is to metabolise aromatic hydrocarbons. The determination of the concentration of this gene is particularly useful in the present invention.

The individual pathways of aromatic biodegradation are usually initiated through the action of either a dioxygenase or a monooxygenase. For example biphenyl dioxygenase is involved in the oxidative biodegradation of phenol whilst toluene monooxygenase is involved in the oxidative biodegradation of toluene.

References of reports on detection of genes responsible for aromatic oxidation in environmental samples include the following: Applied & Environmental Microbiology, 65, 80-87 (1999); Applied & Environmental Microbiology, 56, 254-259 (1990); Applied & Environmental Microbiology, 67, 1542-1550 (2001); Applied & Environmental Microbiology, 66, 80-8678-6837 (2000); and Applied & Environmental Microbiology, 69, 3350-3358 (2003)

Naphthenes

Naphthenes are cycloalkanes, i.e. they are types of alkanes which have one or more rings of carbon atoms in their chemical structure. They consist of carbon and hydrogen only and there are no double bonds between the carbon atoms.

Preferred embodiments of the present invention involve the detection of genes encoding one of the following enzymes: Alkane hydroxylase (alkB related); Catechol 2,3 dioxygenase; Naphthalene dioxygenase; Toluene monooxygenase; Toluene dioxygenase; Xylene monooxygenase; Biphenyl dioxygenase; or particulate methane monooxygenases (pmoA).

Other suitable genes are those which encode one of the following enzymes: Butane monooxygenases (similar to pMMO and sMMO); Bacterial P450 oxygenases (C4-C16 n-alkanes); or Eukaryotic P450 oxygenases (C10-C16 n-alkanes).

Hydrocarbon Metabolising Genes

Typically, when petroleum deposits occur in nature, leading to an abundance of hydrocarbons in the environment, the number of hydrocarbon metabolizing microbes in the vicinity increases. These hydrocarbon degrading microbes are capable of using the hydrocarbons as an energy source. Therefore, in the vicinity of the deposit, the number of microbes with hydrocarbon metabolizing genes is elevated. Hence, the number of hydrocarbon metabolizing genes present in the microbial population increases. As a general rule, the first hydrocarbons to be metabolized by microbes are short chain alkanes, followed by long chain alkanes. Next, the alkenes present in the petroleum are metabolized, and lastly microbes degrade the aromatic hydrocarbons. Without wishing to be bound by theory, it is believed that comparing the concentration of different types of hydrocarbon metabolizing genes is advantageous for the present invention, i.e. genes that are more or less active at different stages in the degradation of the petroleum hydrocarbons. For example, a mathematical operation that compares the concentration of alkane-metabolizing genes, which are relatively abundant in a fresh petroleum deposit, with the concentration of aromatic-metabolizing genes, which are relatively resistant to degradation and therefore more likely to be relatively abundant in a degraded petroleum occurrence, is indicative of the presence or absence of 'fresh petroleum'.

In one embodiment of the present invention it is preferred that the first polynucleotide encodes a short chain alkane metabolizing gene and that the second polynucleotide encodes an aromatic metabolizing gene, or a long chain alkane metabolizing gene. In an alternative embodiment the first polynucleotide encodes a long chain alkane metabolizing gene, and the second polynucleotide encodes an aromatic metabolizing gene. In a still further embodiment, the first polynucleotide encodes a methane, toluene or xylene metabolizing gene and the second polynucleotide encodes the aromatic metabolizing gene Cat 2,3.

Further details of such exemplary enzymes and the primers that can be used for their identification are provided in Table 1. However, it is to be appreciated that the genes referred to in Table 1 are by no means exhaustive and further genes could be used instead. Other suitable genes are identified by, for example, searching public databases (e.g. GenBank and Ribosomal Database Project) for genes reported to encode hydrocarbon-metabolising proteins. Having identified a plurality of genes and their sequences in this way, it is preferred that the genes are aligned and areas of homology located in order to identify potential motifs that characterize genes encoding proteins that have this functionality. Such potential motifs are then compared with gene databases and those potential motifs that are found in genes encoding proteins not associated with hydrocarbon metabolism are discarded. Confirmed motifs (that is to say, motifs/subsequences only found in genes encoding hydrocarbon-metabolising proteins) are then used as target polynucleotides in the method of the present invention. Preferably, the motif subsequences range from 50 nucleotides to 100 nucleotides in length.
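The motif search outlined above can be sketched in simplified form. The sketch assumes the sequences have already been aligned to equal length by an external tool, treats a "motif" as a run of alignment columns on which every sequence agrees, and uses a plain substring test to discard motifs that also occur in unrelated genes. The toy sequences and the short minimum length are hypothetical; the specification prefers motifs of 50 to 100 nucleotides.

```python
def conserved_runs(aligned, min_len=50):
    """Return half-open (start, end) index pairs where all aligned sequences agree."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    runs, start = [], None
    for i in range(length + 1):
        # A column counts as conserved if it is not a gap and is identical in all rows.
        same = (
            i < length
            and aligned[0][i] != "-"
            and all(s[i] == aligned[0][i] for s in aligned)
        )
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs

def specific_motifs(motifs, off_target_genes):
    """Discard candidate motifs that also occur in genes of unrelated function."""
    return [m for m in motifs if not any(m in g for g in off_target_genes)]

# Toy alignment of three hypothetical gene fragments.
aligned = [
    "ATGCCGTTAGGCTA",
    "ATGACGTTAGGGTA",
    "ATGTCGTTAGGATA",
]
runs = conserved_runs(aligned, min_len=4)
candidates = [aligned[0][a:b] for a, b in runs]
confirmed = specific_motifs(candidates, off_target_genes=["GGGGAAAACCCC"])
```

A real workflow would replace the substring test with a database search (for example a BLAST query) and would tolerate a small number of mismatches within a motif.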

For example, the DNA sequence(s) coding for a specific catabolic gene can be searched for in GenBank (http://www.ncbi.nlm.nih.gov/), e.g. see Table 2, and imported into software for manipulating DNA sequences (such as DS Gene, www.accelrys.com). Using the software tools provided within the program, the sequences are aligned and phylogenetic analysis performed. These selected downloaded sequences are examined and consensus regions identified. These consensus sequences must show a high percentage of conformity across the sequences obtained if the resulting assay is to be specific for the desired gene. The consensus sequence is then exported into Primer Express software (www.appliedbiosystems.com), which analyses the consensus sequence and provides suggestions of primer/probe combinations that can be used in qPCR assays.

Table 1

Table 2 - GenBank Accession Numbers of partial or complete nucleotide sequences encoding hydrocarbon metabolising enzymes, each of which is incorporated herein by reference.

Target Accession Number

The present invention involves identifying the concentration of two genes each encoding a different hydrocarbon metabolizing protein in a sample. The concentrations of each hydrocarbon metabolizing gene are then compared and one gene acts as a normalizing biomarker with respect to the other gene. In a further embodiment the concentration is determined with reference to the total microbial population in the sample so as to give an indication of the relative concentration of such genes with respect to the total microbial population, i.e. a normalized value.

The total microbial population is calculated by measuring the concentration of generic oligonucleotide sequences which are present in all or almost all microbes, irrespective of their capacity to metabolize hydrocarbons. Examples of suitable genes from a bacterial population are provided in Table 3. Two exemplary EuBac genes are provided, the first is that provided and used in the assay described in Suzuki et al. 2000, AEM, 66(11) p4605-4614, and the second is used in a modified assay which optimises the results obtained in the normalisation assays. In the modified version the same EuBac primer sequences have been used as reported in Suzuki et al., and the probe sequence is as reported with the exception that a FAM-minor groove binder probe has been used rather than the FAM-TAMRA probe. In addition, when the modified EuBac assay is run the PCR thermo-cycling conditions differ from those described in Suzuki et al. The modified cycle conditions comprise 10 minutes at 95 °C followed by 40 cycles of 95 °C for 15 seconds and 57 °C for 1 minute.

It is also to be noted that in some embodiments, the microbial population is determined by carrying out quantitative PCR using generic primers of oligonucleotide sequences that vary slightly between strains of bacteria. Because the primers are generic, amplification of the oligonucleotide sequences takes place and is indicative of the microbial population notwithstanding differences between the oligonucleotide sequences of different bacterial strains.

It is to be appreciated that other suitable generic nucleotide sequences could be used instead of those disclosed in Table 3. Such sequences can be identified by, for example, searching public databases for nucleotide motifs which are present in a high proportion of microorganisms, e.g. at least 80% of microorganisms.

In some embodiments it is not necessary to calculate the total microbial population in this way. For example, in some instances, e.g. an oil deposit beneath the abyssal sea floor, which is relatively devoid of life, the total microbial population will be strongly correlated with the hydrocarbon metabolizing population. Therefore, the total gene count is dominated by hydrocarbon consumers, and hence primarily controlled by the abundance of hydrocarbon substrate.

Table 3

Gene Quantification

In the present invention, the concentrations of a first and second polynucleotide, each encoding a different hydrocarbon metabolizing protein, and optionally a third polynucleotide encoding a generic protein, are determined. The preferred technique for determining these concentrations is quantitative polymerase chain reaction (qPCR) which is also known as "real time PCR" (see, for example, Ding C et al "Quantitative Analysis of Nucleic Acids - The Last Few Years of Progress" J. Biochem. Mol. Biol. 2004 Jan 31; 37(1):1-10).

The principle underlying qPCR is that during the course of a PCR assay, the number of amplicons generated is monitored PCR cycle by PCR cycle. This is usually achieved by introducing a fluorophor into the assay system. The amount of fluorescence generated is directly proportional to the number of amplicons generated at each PCR cycle whilst the number of amplicons is a function of starting copy number and the number of PCR cycles. Therefore, by measuring the intensity of the signal (e.g. fluorescence) as the PCR cycles progress, the starting concentration of a target sequence may be determined. In particular, the PCR is monitored during the exponential phase where the first significant increase in the amount of PCR product correlates to the initial amount of target template. The higher the starting copy number of the nucleic acid target, the sooner a significant increase in fluorescence is observed. A significant increase in fluorescence above the baseline value indicates the detection of accumulated PCR product (measured by the Ct value).
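The relationship between starting copy number and Ct can be illustrated with an idealised model, sketched below. It assumes a constant amplification efficiency per cycle (1.0 meaning perfect doubling), ignores the plateau phase, and uses an arbitrary detection threshold expressed in amplicon copies; the function name and all parameter values are hypothetical.

```python
def threshold_cycle(start_copies, threshold=1e10, efficiency=1.0, max_cycles=40):
    """Return the first cycle at which the amplicon count reaches the threshold.

    Returns None if the threshold is not reached within max_cycles.
    """
    amplicons = float(start_copies)
    for cycle in range(1, max_cycles + 1):
        # Idealised exponential phase: each cycle multiplies by (1 + efficiency).
        amplicons *= 1.0 + efficiency
        if amplicons >= threshold:
            return cycle
    return None

ct_high_template = threshold_cycle(1e6)  # many starting copies
ct_low_template = threshold_cycle(1e3)   # few starting copies
```

In this model the sample with more starting copies crosses the threshold earlier, which is the behaviour the paragraph above describes.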

Absolute quantitation in PCR requires a standard curve of known copy numbers, which can be constructed using a synthesized oligonucleotide or amplicon. This amplicon is of a known concentration and by serial dilution can give a wide range of known standards. These standards then undergo PCR using exactly the same conditions as the target DNA sequence. The copy number of the target DNA sequence in the sample is then extrapolated from the calibration graph or standard curve, which is constructed by plotting the log of copy number against the Ct value.
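A standard curve of this kind can be sketched as an ordinary least-squares line of Ct against log10 copy number, which is then inverted to read off an unknown sample. The dilution series and Ct values below are invented for illustration, and the function names are hypothetical; instrument software such as that supplied with a sequence detector performs the equivalent calculation internally.

```python
from math import log10

def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    x = [log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical ten-fold serial dilution of a standard amplicon and its Ct values.
standards = [1e2, 1e3, 1e4, 1e5, 1e6]
cts = [33.2, 29.9, 26.6, 23.3, 20.0]

slope, intercept = fit_standard_curve(standards, cts)
unknown_copies = copies_from_ct(25.0, slope, intercept)
```

The slope is negative because a higher starting copy number gives a lower Ct. Estimates are most reliable when the sample Ct lies within the range spanned by the standards.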

Exemplary apparatus includes the ABI 7300 sequence detector which performs qPCR analyses in 96 parallel wells, determines a standard curve and calculates the amount of target DNA in each of the sample wells. Thus the output is a direct measure of the abundance of the target sequence in the sample.

While a number of variants of the chemistry of the assay system are known in the art, a Taqman 5' nuclease assay or a SyBr Green system are particularly preferred. The Taqman 5' nuclease assay is more sensitive to the presence of a target sequence and is more specific thereto but it requires three closely linked conserved regions in the target gene. The SyBr Green assay system is less sensitive and specific but only requires the presence of two conserved regions.

The concentrations of the first, second and third polynucleotides, or a subsequence thereof, can be determined simultaneously in a multiplex qPCR reaction. In such an assay system, forward and reverse primers and a probe are provided for each gene but the signaling system is different for each probe (e.g. the label fluoresces at a different wavelength) so that the relative quantities of each gene can be determined independently during PCR.

The detection of generic gene sequences using Taqman qPCR assays is described in: Aromatics - Applied & Environmental Microbiology, 69, 3350-3358 (2003); and 16S RNA for total microbial populations - Applied & Environmental Microbiology, 69, 6597-6604 (2003).

The detection of gene sequences using SyBr Green qPCR assays is disclosed in Environmental Microbiology, 6, 754-759 (2004); FEMS Microbiology Ecology, 41, 141-150 (2002); and Environmental Microbiology, 1, 307-317 (1999).

Core Sample Stabilization

The core samples suitable for testing according to the present invention are obtained from a variety of locations. For example, the sample is a soil sample extracted from over or around a prospective oil field. Alternatively, the sample is sediment or sand taken from a fresh or salt-water environment.

It is to be appreciated that under normal conditions, the microbial population of a core sample will change over time, once the sample has been removed from its original geographical location. More specifically, if a core sample is taken from a location where a subsurface hydrocarbon deposit is present then, upon removal of the sample, the microbes within the sample are deprived of their hydrocarbon source. This may affect the concentration of the first and/or second polynucleotide detected. In practice, it may be necessary for samples to be taken from a locality and analyzed by the methods of the present invention under laboratory conditions. It is therefore important that the gene copy numbers within the sample remain unchanged between removal of the sample from a location and analysis in the laboratory.

Accordingly, in some embodiments of the present invention, specific steps are taken to stabilize the presence of nucleic acids within the sample. In one embodiment, core samples are kept frozen at 0 °C or lower, for example, in liquid nitrogen at -196 °C. However, it is preferred that gene copy numbers are stabilized by the addition of a nucleic acid stabilizing compound such as RNALater™ from Ambion™ or RNAprotect™ from Qiagen™. The provision of a chemical nucleic acid stabiliser avoids the need for cumbersome freezing apparatus to be transported to the geographical location from which the core sample is obtained. Although these stabilizing compounds are specifically designed for stabilization of RNA, which is generally less stable than DNA, the compounds are also effective in stabilizing DNA.

Core Sample Extraction

In summary, the present invention involves obtaining a core sample from a site. The sample is dried and milled, and DNA or RNA is extracted from a representative subsample and the concentration of the DNA/RNA of interest is quantified.

In embodiments of the present invention, RNA and/or DNA is extracted from soil samples. Soil samples may contain contaminating substances which interfere with the PCR reaction and thus the quantitation process. Comparing the concentration of the first polynucleotide with that of a second hydrocarbon-metabolizing polynucleotide or a generic microbial gene environmentally normalizes the result, so that the data obtained are not detrimentally affected by contamination. Different soil types may affect the efficiency of the extraction of nucleic acids, but this can also be normalized by quantification of a second hydrocarbon-metabolizing gene or a generic microbial gene sequence in soil samples.

Kits for extracting an isolated nucleic acid from soil samples are sold commercially by MoBio Laboratories, Inc. These kits require a soil sample to be added to a bead beating tube for rapid homogenization. Cell lysis occurs by both chemical and mechanical means (vortex adapter). Total genomic DNA is captured on a silica membrane in a conventional spin column format. DNA is washed and then eluted from the spin column.

Epicentre Limited produces the SoilMaster DNA extraction kit, which utilizes a hot detergent lysis process combined with a chromatographic step that removes enzymatic inhibitors known to co-extract with DNA from soil and sediment samples.

Qbiogene Inc also produces a range of kits for extraction of DNA and RNA from soil samples. The kits are based on their FastPrep™ system.

Experimental

Example 1 - Field Trial 1: Survey of Confirmed Active Oil Seep at the Formby Oilfield, Onshore Lancashire, UK

The first field trial of the survey technique was performed at the now depleted Formby oil field in Lancashire, England. Oil from Carboniferous age rocks of the East Irish Sea Basin migrated to and was trapped in the Formby oilfield, where it leaks to the surface at places where agricultural drainage ditches cut the Pleistocene Gault Clay. The slightly biodegraded seep oil visibly stains the ditch banks and the associated vapours can be smelled within a few tens of metres of the seep emergence point. Fresh oil and associated gas or vapour can be released by inserting a spade into the ditch bottoms within oil-stained reaches of the ditches. This oil seepage serves as an absolute natural control against which to demonstrate the present invention.

Soil samples were collected within and around the confirmed seepage location in order to test the survey tool. The sampling sites were at approximately 100m horizontal intervals in undisturbed soil sections located between cultivated land and drainage ditches in the spatial pattern illustrated in Figure 5. The soil samples were collected from the 20 cm to 30 cm subsurface interval using a spade and placed into polyethylene bags that were then stored in an ice chest until frozen at -18°C in the laboratory, pending analysis.

Quantitative PCR was used to obtain the copy number of the following genes, which are defined in this specification:

Gene Assay    Target Microbial Group

EuBac         Total bacteria
pMoA          Methane consumers
2,3Cat        Aromatics consumers, generally
AlkB-P1       Higher alkane consumers
ToM           Toluene consumers
XyM           Xylene consumers

The resulting gene copy number data are listed in Table 4 and presented graphically in Figure 4.

Development of Data Transforms

Various arithmetic data transforms were developed to enhance the signal/noise ratio and thus to locate seepage sites more accurately. Two fundamental bases for transformation were considered: the first (Transform 1) is a simple normalisation based upon the total microbial population to reflect environmental factors; the others (Transforms 2 and 3) are based upon the geochemical principle that the alkane/aromatic ratio is an indication of seepage activity, high values being associated with 'live' (fresh or only slightly altered) petroleum seepage.

As explained elsewhere in the specification, live petroleum is characterised geochemically by high abundances of biochemically labile normal alkanes relative to less labile aromatics. The abundance of alkanes may be simulated using the abundance of normal alkane consumers, which is in turn estimated using the AlkB-P1 gene. The abundance of the less labile aromatics may be simulated using the generic index of aromatic consumers, the 2,3Cat gene. Thus high abundances of the AlkB-P1 gene relative to the 2,3Cat gene are deemed to signal occurrences of 'live' (fresh or only slightly altered) petroleum. High relative abundances of the pMoA gene have the same meaning in respect of methane.

The AlkB-P1 to 2,3Cat gene copy number ratio also serves the purpose of an environmental normaliser because the subject genes do not discriminate between metabolic substrates according to provenance (soil or plant-derived hydrocarbons versus migrant petroleum).

High abundances of the pMoA, ToM and XyM genes relative to the 2,3Cat gene also signal 'live' petroleum for reasons analogous to those given above. This normalisation strategy is based upon the fact that the 2,3Cat gene is generic to soils whereas the ToM and XyM genes are relatively limited in their occurrence, significant numbers being closely associated with petroleum occurrences.

Transform 1

In Transform 1, for every soil sample, the copy number of the hydrocarbon metabolising gene was divided by the EuBac copy number and expressed as a percentage. This gives index data normalised on the total microbial abundance (the ratio of the microbial population capable of metabolizing hydrocarbons to the overall microbial population). The normalised index data were rescaled by dividing each population by its median value in order to render all of the variable data to a common amplitude scale for visualisation purposes. The result is called Transform 1. Figures 5 and 6 show that Transform 1 data identify the visible seep location poorly, while also identifying a number of additional anomalies presumed to represent invisible (micro) seepage. The outwardly weak performance of Transform 1 at Formby is potentially attributable to the fact that the survey coverage does not extend to truly background areas outside the extent of the former oilfield.
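The arithmetic of Transform 1 can be sketched as follows. This is a minimal illustration only: the function name `transform1` and the copy numbers in the usage lines are hypothetical and are not taken from the field data, and the sketch assumes one copy-number value per soil sample for each gene.

```python
from statistics import median


def transform1(gene_copies, eubac_copies):
    """Sketch of Transform 1 for one hydrocarbon metabolising gene.

    gene_copies and eubac_copies hold one value per soil sample,
    in the same sample order. All names here are illustrative.
    """
    # Step 1: express each gene copy number as a percentage of the
    # EuBac (total bacteria) copy number for the same sample.
    percentages = [100.0 * g / e for g, e in zip(gene_copies, eubac_copies)]
    # Step 2: rescale by the population median so that different genes
    # share a common amplitude scale (a median of zero would fail here).
    med = median(percentages)
    return [p / med for p in percentages]


# Hypothetical copy numbers for three samples (not from the field trial).
index = transform1([1e6, 2e6, 4e6], [1e9, 1e9, 1e9])
```

After rescaling, the median sample has an index of 1, so values above 1 mark samples in which the gene is over-represented relative to the typical sample.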

Transform 2

Transform 2 involved a single ratio followed by rescaling. The copy number of each of the hydrocarbon metabolising genes was divided by the corresponding 2,3Cat copy number, and each resulting ratio value was divided by its respective population median in order to render it to a common amplitude scale. The transform values thus derived identify the visible seep location more precisely and exhibit an improved signal/noise ratio (Figures 7 and 8).
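A minimal sketch of the Transform 2 arithmetic is given below, under the same assumptions as before: the function name `transform2` and the example copy numbers are hypothetical, and each list holds one value per sample in matching order.

```python
from statistics import median


def transform2(gene_copies, cat23_copies):
    """Sketch of Transform 2 for one hydrocarbon metabolising gene.

    gene_copies and cat23_copies hold one value per sample; the
    2,3Cat copy number is the denominator. Names are illustrative.
    """
    # Step 1: ratio of the gene copy number to the 2,3Cat copy number.
    ratios = [g / c for g, c in zip(gene_copies, cat23_copies)]
    # Step 2: divide each ratio by the population median of the ratios.
    med = median(ratios)
    return [r / med for r in ratios]


# Hypothetical copy numbers for three samples (not from the field trial).
index = transform2([2e6, 1e6, 8e6], [1e6, 1e6, 2e6])
```

Because both numerator and denominator come from the same extract, a sample-wide loss in extraction or amplification efficiency scales both equally and cancels in the ratio.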

Transform 3

Transform 3 involved three steps, effectively combining Transforms 1 and 2. The higher alkane index, AlkB-P1, that is most directly indicative of 'live' petroleum/oil, is used to illustrate the transform. The 2,3Cat and AlkB-P1 index values were firstly normalised on the total microbial population by dividing them by the homologous EuBac index values. Each resulting index population was then rescaled by dividing each constituent value by the population median, enabling direct graphical comparison and also arithmetic recombination with minimal scalar artifact. Finally the transformed 2,3Cat index values thus derived were subtracted from the homologous AlkB-P1 index values to arrive at the live petroleum index illustrated in Figure 9. The Formby oil seep is identified by positive transform values.
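The three steps of Transform 3 can be sketched as below. The helper `median_rescale`, the function `transform3` and the example copy numbers are hypothetical names and values introduced for illustration. The percentage factor is omitted because any constant scale factor cancels when a population is divided by its own median.

```python
from statistics import median


def median_rescale(values):
    """Divide every value by the population median (illustrative helper)."""
    med = median(values)
    return [v / med for v in values]


def transform3(alkb_copies, cat23_copies, eubac_copies):
    """Sketch of Transform 3, the 'live petroleum' index per sample."""
    # Step 1: normalise both genes on the total bacterial population.
    alkb_norm = [a / e for a, e in zip(alkb_copies, eubac_copies)]
    cat_norm = [c / e for c, e in zip(cat23_copies, eubac_copies)]
    # Step 2: rescale each index population by its own median.
    alkb_idx = median_rescale(alkb_norm)
    cat_idx = median_rescale(cat_norm)
    # Step 3: subtract the aromatics index from the alkane index;
    # positive values are read as indicating 'live' petroleum.
    return [a - c for a, c in zip(alkb_idx, cat_idx)]


# Hypothetical copy numbers for three samples (not from the field trial).
live_index = transform3([2e6, 1e6, 4e6], [1e6, 1e6, 1e6], [1e9, 1e9, 1e9])
```

In this toy input the third sample has the largest alkane-consumer index relative to its aromatics index, so it receives the only positive value.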

Table 1 Results of Field Trial 1. Formby

498,185 5,934,382 10 1 0 3E+09 3E+07 1E+06 2E+08 1E+06 6E+05 194 279 177 148 322 090 221 104 121 131
498,146 5,934,462 11 2 89 1E+08 1E+06 5E+04 5E+06 2E+05 6E+04 151 238 152 548 754 019 051 024 077 -310
498,117 5,934,557 12 3 188 5E+09 5E+07 6E+05 1E+08 1E+06 3E+05 172 073 100 087 092 135 098 100 059 -014
498,108 5,934,649 13 4 281 7E+09 6E+07 1E+06 2E+08 2E+06 8E+06 143 099 109 073 1944 135 160 131 1490 026
498,109 5,934,744 14 5 376 5E+09 3E+07 1E+06 2E+08 1E+06 2E+07 102 158 099 070 6704 100 265 124 5367 088
498,109 5,934,839 15 8 482 7E+09 3E+07 8E+05 9E+07 6E+05 1E+07 064 068 042 027 3135 164 298 137 6560 041
498,116 5,934,858 9 10 503 5E+09 6E+07 2E+06 2E+08 2E+06 6E+05 221 219 174 100 205 152 257 152 115 119
498,104 5,934,865 73 11 517 6E+09 6E+07 2E+06 2E+08 1E+06 6E+05 157 139 096 055 156 198 298 153 160 084
498,104 5,934,866 6 12 518 2E+09 5E+07 4E+05 1E+08 1E+06 7E+05 371 100 153 131 503 194 089 102 214 -031
498,103 5,934,873 5 14 525 4E+09 5E+07 7E+05 2E+08 9E+05 3E+05 178 094 121 068 096 182 163 157 079 026
498,104 5,934,902 2 17 555 4E+09 1E+07 2E+05 5E+07 3E+05 2E+05 065 029 045 022 100 207 156 183 259 007
498,097 5,934,957 18 19 610 5E+09 3E+07 1E+06 2E+08 3E+06 2E+05 097 140 113 182 062 037 090 054 019 -042
498,088 5,935,065 20 21 761 8E+09 1E+07 3E+05 1E+08 3E+06 2E+05 027 019 040 116 042 016 019 030 020 -097
498,194 5,935,132 32 22 861 7E+09 9E+07 8E+05 3E+08 3E+06 3E+05 200 061 123 105 069 130 067 101 036 -045
498,313 5,935,147 31 23 961 6E+09 3E+07 4E+05 1E+08 1E+06 4E+05 069 034 054 064 111 075 062 074 097 -030
498,409 5,935,168 30 24 1061 2E+09 4E+07 6E+05 1E+08 4E+06 5E+05 310 153 145 567 338 038 032 022 033 -414
498,582 5,935,147 29 25 1161 8E+09 2E+07 3E+05 4E+07 3E+06 5E+05 037 020 016 107 093 024 022 013 048 -087
498,756 5,935,269 28 26 1261 8E+09 3E+07 7E+05 2E+08 2E+06 2E+05 068 050 085 062 041 075 094 119 037 -012
498,855 5,935,306 27 27 1361 8E+09 3E+07 2E+06 2E+08 3E+06 4E+05 054 108 089 092 068 040 137 085 042 016
497,951 5,935,089 26 28 1461 6E+09 3E+07 2E+06 2E+08 6E+06 2E+05 072 234 139 326 057 015 084 037 010 -092
497,873 5,935,042 25 29 1561 4E+09 1E+06 6E+03 5E+06 2E+05 4E+04 004 001 003 015 013 016 006 020 049 -014
497,775 5,934,986 24 30 1661 5E+09 7E+06 1E+05 4E+07 5E+05 2E+05 025 016 029 034 077 050 053 073 125 -019
497,673 5,934,921 23 31 1761 6E+09 2E+07 1E+06 2E+08 3E+06 3E+06 052 113 093 162 727 022 082 050 250 -049
497,573 5,934,866 22 32 1861 3E+09 2E+07 1E+06 2E+08 3E+06 2E+05 100 255 180 299 094 023 100 053 018 -044
497,478 5,934,803 21 33 1961 4E+09 3E+07 1E+06 2E+08 3E+06 4E+05 138 178 123 214 159 044 097 050 041 -036

Example 2 - Field Trial 2: Survey of Virgin Gas Field, Barents Sea, Offshore Norway

The second field trial was performed in a sub-sea location, at a virgin gas field under the Barents Sea, offshore Norway. At this site some exploratory wells drilled in c. 300m of water have encountered commercially viable gas accumulations (gas discoveries) at depths exceeding 2 km sub-bottom. However, other wells drilled in the same region were unsuccessful (dry). A gas field comprising the discovery well and also a dry well were surveyed in order to test the tool in a comparative and commercially realistic context.

The trial survey comprised collection of seabed sediment cores for subsampling and later laboratory analysis. The cores were collected using standard gravity coring equipment in c. 300m water depths at c. 1,000m horizontal intervals in a polygonal traverse across the gas field and the dry hole, as shown in Figure 10. Subsamples for DNA analysis of the acquired sediment cores were collected from within the 40 - 50 cm sub-bottom interval, placed in polyethylene bags and then stored at 4°C pending laboratory analysis.

Quantitative PCR was used to obtain the copy number of the following indicator genes, which are defined in this specification:

Gene Assay    Target Microbial Group

EuBac         Total bacteria
pMoA          Methane consumers
2,3Cat        Aromatics consumers, generally
AlkB-P1       Higher alkane consumers
ToM           Toluene consumers
XyM           Xylene consumers

Transform 2 was applied as explained in Example 1, with the exception that the detected polynucleotide copy numbers were not divided by the population median to render them to a common amplitude scale. Each hydrocarbon metabolising gene copy number was expressed as a percentage of the EuBac copy number. Separately, the copy number of the hydrocarbon metabolising gene was expressed as a percentage of the 2,3Cat copy number. The index values thus transformed have the same meaning attributed to them in Example 1.
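The two percentage indices used in this trial, computed without median rescaling, can be sketched as follows. The function name `percent_of` and the copy numbers are hypothetical and do not reproduce the survey data.

```python
def percent_of(gene_copies, reference_copies):
    """Express each gene copy number as a percentage of a reference gene.

    The reference is either EuBac (total bacteria) or 2,3Cat. No median
    rescaling is applied in this variant. Names are illustrative.
    """
    return [100.0 * g / r for g, r in zip(gene_copies, reference_copies)]


# Hypothetical pMoA, EuBac and 2,3Cat copy numbers for two sediment samples.
pmoa = [3e7, 6e6]
eubac = [3e9, 3e9]
cat23 = [1e8, 1e8]

pmoa_vs_total = percent_of(pmoa, eubac)      # normalised on total bacteria
pmoa_vs_aromatic = percent_of(pmoa, cat23)   # normalised on 2,3Cat
```

Without the median step the values stay in percentage units, so indices for different genes are not on a common amplitude scale and are best compared gene by gene.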

The resulting pMoA Transform 2 index values are displayed in plan view in Figure 10 in relation to the field circumstances. The symbol size is proportional to the index value. The larger symbols (high index value scores) represent methane seepage anomalies whereas the small symbols indicate the background condition where methane seepage is subdued. The anomaly amplitude increases from the margin of the gas field towards the crest of the closure where the discovery well penetrates the maximum thickness of reservoired gas. The index value anomalies cluster and reach their maximum in the immediate vicinity of the discovery well, thereby identifying active methane seepage emanating from the underlying reservoir. Conversely, the dry hole locality exhibits few, lower amplitude anomalies. Together these results accurately represent the relative prospectivity of the two well sites surveyed. These results were corroborated by independently acquired geochemical and geomicrobial survey data.

Sequence Listing Free Text

<210> 1

<223> 23CAT - Gr1 Forward Primer

<210> 2

<223> 23CAT - Gr1 Reverse Primer

<210> 3

<223> 23CAT - Gr1 Probe

<210> 4

<223> NapD - Gr1a Forward Primer

<210> 5

<223> NapD - Gr1a Reverse Primer

<210> 6

<223> NapD - Gr1a probe

<210> 7

<223> TolD - Gr1 Fwd primer

<210> 8

<223> TolD - Gr1 Reverse Primer

<210> 9

<223> TolD - Gr1 Probe

<210> 10

<223> XylM - Gr2b Fwd primer

<210> 11

<223> XylM - Gr2b Reverse Primer

<210> 12

<223> XylM - Gr2b Probe

<210> 13

<223> BiPh - Gr1a Fwd Primer

<210> 14

<223> BiPh - Gr1a Reverse Primer

<210> 15

<223> BiPh - Gr1a Probe

<210> 16

<223> AlkB - P g1 Fwd Primer

<210> 17

<223> AlkB - P g1 Reverse Primer

<210> 18

<223> AlkB - P g1 Probe

<210> 19

<223> AlkB - P g2 Fwd Primer

<210> 20

<223> AlkB - P g2 Reverse Primer

<210> 21

<223> AlkB - P g2 Reverse Primer

<210> 22

<223> AlkB - R g1 Fwd Primer

<210> 23

<223> AlkB - R g1 Reverse Primer

<210> 24

<223> AlkB - R g1 Probe

<210> 25

<223> AlkB - R g2 Forward primer

<210> 26

<223> AlkB - R g2 Reverse Primer

<210> 27

<223> AlkB - R g2 Probe

<210> 28

<223> ButM - AY093933 Fwd Primer

<210> 29

<223> ButM - AY093933 Reverse Primer

<210> 30

<223> ButM - AY093933 Probe

<210> 31

<223> Methane monooxygenase (mmoX1-mmoX2) Fwd primer

<210> 32

<223> mmoX1-mmoX2 Reverse Primer

<210> 33

<223> EuBac Fwd primer

<210> 34

<223> EuBac Reverse Primer

<210> 35

<223> EuBac Probe

<210> 36

<223> Modified EuBac Fwd Primer

<210> 37

<223> Modified EuBac Reverse primer

<210> 38

<223> Modified EuBac Probe

<210> 39

<223> Total bacteria small-subunit rRNA -Fwd Primer

<210> 40

<223> Total bacteria small-subunit rRNA - Reverse Primer

<210> 41

<223> Total bacteria small-subunit rRNA - Probe

<210> 42

<223> Total bacteria 16S RNA - Fwd Primer

<210> 43

<223> Total bacteria 16S RNA - Reverse Primer

<210> 44

<223> Total bacteria 16S RNA - Probe