Title:
A METHOD TO COMPUTE COMPOSITE DISTANCE MATRIX FROM A MULTITUDE OF DATA ATTRIBUTES
Document Type and Number:
WIPO Patent Application WO/2017/039660
Kind Code:
A1
Abstract:
Example implementations described herein create a rich feature set based on observed/recorded attributes as well as attributes derived from those, and models each well as a data vector in this multi-dimensional attribute space. Example implementations then compute composite similarity between wells which provides better insights into their behavior. This composite similarity can be calculated along all dimensions or subsets of dimensions, and serve as input to any clustering algorithm for further analysis. Finally, the similarity can be incrementally computed by incorporating more attributes as required. Such implementations can be used to provide insights into behavior of oil wells, especially horizontal wells by integrating features from multiple upstream processes.

Inventors:
SAHU ANSHUMAN (US)
VENNELAKANTI RAVIGOPAL (US)
DAYAL UMESHWAR (US)
Application Number:
PCT/US2015/048186
Publication Date:
March 09, 2017
Filing Date:
September 02, 2015
Assignee:
HITACHI LTD (JP)
International Classes:
E21B44/00
Foreign References:
US20090132458A1 2009-05-21
US20140032580A1 2014-01-30
US20150022517A1 2015-01-22
US20050149307A1 2005-07-07
US20130161096A1 2013-06-27
US20130282862A1 2013-10-24
US20110225173A1 2011-09-15
US20100268737A1 2010-10-21
Attorney, Agent or Firm:
HUANG, Ernest C. et al. (US)
Claims:
CLAIMS

What is claimed is:

1. A management server configured to be coupled to a plurality of rig systems by a network, each of the plurality of rig systems having a plurality of sensors, a rig and a rig node, the management server further operable to access a database, the management server comprising: a memory configured to store rig information of the plurality of rig systems, the rig information comprising a plurality of attributes from a plurality of phases of operation of a rig, and a processor configured to: calculate a similarity score between a first rig of a first rig system of the plurality of rig systems, and a second rig of a second rig system of the plurality of rig systems based on attribute interaction values between a first attribute and a second attribute of the plurality of attributes associated with the first rig and the second rig that are in a same phase of the plurality of phases of operation, and attribute interaction values between the first attribute and a third attribute from the plurality of attributes associated with the first rig and the second rig that are in different phases of the plurality of phases of operation.

2. The management server of claim 1, wherein the first attribute, the second attribute and the third attribute each comprise a set of values.

3. The management server of claim 1, wherein the processor is configured to convert ones of the plurality of attributes having continuous values into discrete values.

4. The management server of claim 1, wherein the processor is configured to, for missing ones of the plurality of attributes, calculate values for the missing ones of the plurality of attributes such that the similarity score is maintained as a same score.

5. The management server of claim 1, wherein the processor is configured to, for the missing ones of the plurality of attributes, incorporate values for the missing ones of the plurality of attributes from a rig of the plurality of rig systems having a highest similarity score.

6. The management server of claim 1, wherein the processor is configured to, for missing ones of the plurality of attributes, calculate values for the missing ones of the plurality of attributes by calculating an average of values from a set of rigs of the plurality of rig systems having a similarity score within a threshold.

7. The management server of claim 1, wherein the processor is configured to: include a new attribute into the plurality of attributes; calculate another similarity score between the first rig and the second rig for the new attribute; and add the another similarity score to the similarity score.

8. A method of managing a plurality of rig systems, each of the plurality of rig systems having a plurality of sensors, a rig and a rig node, comprising: storing rig information of the plurality of rig systems, the rig information comprising a plurality of attributes from a plurality of phases of operation of a rig, and calculating a similarity score between a first rig of a first rig system of the plurality of rig systems, and a second rig of a second rig system of the plurality of rig systems based on attribute interaction values between a first attribute and a second attribute of the plurality of attributes associated with the first rig and the second rig that are in a same phase of the plurality of phases of operation, and attribute interaction values between the first attribute and a third attribute from the plurality of attributes associated with the first rig and the second rig that are in different phases of the plurality of phases of operation.

9. The method of claim 8, wherein the first attribute, the second attribute and the third attribute each comprise a set of values.

10. The method of claim 8, further comprising converting ones of the plurality of attributes having continuous values into discrete values.

11. The method of claim 8, further comprising, for missing ones of the plurality of attributes, calculating values for the missing ones of the plurality of attributes such that the similarity score is maintained at a same score.

12. The method of claim 8, further comprising, for the missing ones of the plurality of attributes, incorporate values for the missing ones of the plurality of attributes from a rig of the plurality of rig systems having a highest similarity score.

13. The method of claim 8, further comprising, for missing ones of the plurality of attributes, calculating values for the missing ones of the plurality of attributes by calculating an average of values from a set of rigs of the plurality of rig systems having a similarity score within a threshold.

14. The method of claim 8, further comprising: including a new attribute into the plurality of attributes; calculating another similarity score between the first rig and the second rig for the new attribute; and add the another similarity score to the similarity score.

15. A computer program for managing a plurality of rig systems, each of the plurality of rig systems having a plurality of sensors, a rig and a rig node, the computer program having instructions comprising: storing rig information of the plurality of rig systems, the rig information comprising a plurality of attributes from a plurality of phases of operation of a rig, and calculating a similarity score between a first rig of a first rig system of the plurality of rig systems, and a second rig of a second rig system of the plurality of rig systems based on attribute interaction values between a first attribute and a second attribute of the plurality of attributes associated with the first rig and the second rig that are in a same phase of the plurality of phases of operation, and attribute interaction values between the first attribute and a third attribute from the plurality of attributes associated with the first rig and the second rig that are in different phases of the plurality of phases of operation.

Description:
A METHOD TO COMPUTE COMPOSITE DISTANCE MATRIX FROM A MULTITUDE OF DATA ATTRIBUTES

BACKGROUND

Field

[0001] The present disclosure relates generally to oil and gas data analytics, and more specifically, to generating composite scores across rigs or wells from cross-dependent data sets in oil and gas data sets.

Related Art

[0002] In the related art, oil and gas rigs utilize computerized systems to assist the operators of the rigs throughout the different phases of the oil or gas rigs (e.g., exploration, drilling, production, completions). Such computer systems are deployed for the development of energy sources such as shale gas, oil sands, and deep water resources. In the related art, attention has shifted to the development of shale gas for supplying future energy needs. Related art advances in horizontal directional drilling and hydraulic fracturing technologies have unlocked the potential for recovering natural gas from shale to become a viable energy source.

[0003] However, the issue of maximizing output from an oil and gas reservoir, particularly shale gas reservoirs, is not well understood, even with the assistance from present computer systems. The process of making production decisions and sizing top-side facilities is a manual process that depends on the judgment of the rig operator. Furthermore, operators often struggle with real-time performance of support for down-hole gauges, semi-submersible pumps, and other equipment. Non-Productive Time (NPT) for a rig may constitute over 30% of the cost of drilling operations.

[0004] One aspect of the issue of output maximization is the lack of effective data processing and data analytics, along with the sheer volume of data received from oil and gas wellsites. The data sets obtained from different upstream processes can be substantial in terms of number of available attributes. Manually developing applications that utilize these attributes can be very time-consuming.

[0005] Further, related art implementations of applications have been limited in scope. Such related art implementations include performing field operations based on the subterranean formation, or obtaining a drilling model based on the drilling tool and the subterranean formation. Such related art systems do not consider monitoring attributes across multiple phases of operation.

SUMMARY

[0006] Aspects of the present disclosure include a management server configured to be coupled to a plurality of rig systems by a network, each of the plurality of rig systems having a plurality of sensors, a rig and a rig node, wherein the management server is further operable to access a database. The management server may include a memory configured to store rig information of the plurality of rig systems, the rig information including a plurality of attributes from a plurality of phases of operation of a rig, and a processor configured to calculate a similarity score between a first rig of a first rig system of the plurality of rig systems, and a second rig of a second rig system of the plurality of rig systems based on attribute interaction values between a first attribute and a second attribute of the plurality of attributes associated with the first rig and the second rig that are in a same phase of operation of the plurality of phases of operation, and attribute interaction values between the first attribute and a third attribute from the plurality of attributes associated with the first rig and the second rig that are in different phases of the plurality of phases of operation.

[0007] Aspects of the present disclosure include a method of managing a plurality of rig systems, each of the plurality of rig systems having a plurality of sensors, a rig and a rig node. The method may involve storing rig information of the plurality of rig systems, the rig information having a plurality of attributes from a plurality of phases of operation of a rig, and calculating a similarity score between a first rig of a first rig system of the plurality of rig systems, and a second rig of a second rig system of the plurality of rig systems based on attribute interaction values between a first attribute and a second attribute of the plurality of attributes associated with the first rig and the second rig that are in a same phase of operation of the plurality of phases of operation, and attribute interaction values between the first attribute and a third attribute from the plurality of attributes associated with the first rig and the second rig that are in different phases of the plurality of phases of operation.

[0008] Aspects of the present disclosure may further include a computer program for managing a plurality of rig systems, each of the plurality of rig systems having a plurality of sensors, a rig and a rig node. The computer program may have instructions that include storing rig information of the plurality of rig systems, the rig information having a plurality of attributes from a plurality of phases of operation of a rig, and calculating a similarity score between a first rig of a first rig system of the plurality of rig systems, and a second rig of a second rig system of the plurality of rig systems based on attribute interaction values between a first attribute and a second attribute of the plurality of attributes associated with the first rig and the second rig that are in a same phase of operation of the plurality of phases of operation, and attribute interaction values between the first attribute and a third attribute from the plurality of attributes associated with the first rig and the second rig that are in different phases of the plurality of phases of operation. Such instructions may be stored on a non-transitory computer readable medium for execution by one or more processors.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] FIG. 1(a) illustrates a system involving a plurality of rig systems and a management server, in accordance with an example implementation.

[0010] FIG. 1(b) illustrates an example timeline for a rig system, in accordance with an example implementation.

[0011] FIG. 2 illustrates a rig in accordance with an example implementation.

[0012] FIG. 3 illustrates an example configuration of a rig system, in accordance with an example implementation.

[0013] FIG. 4 illustrates a configuration of a management server, in accordance with an example implementation.

[0014] FIG. 5 illustrates a flow diagram for constructing a composite score, in accordance with an example implementation.

[0015] FIG. 6(a) illustrates example vectors, in accordance with an example implementation.

[0016] FIG. 6(b) illustrates an example conversion from continuous values for attributes to discrete values, in accordance with an example implementation.

[0017] FIG. 6(c) illustrates a management of a set of attributes for each timeline phase of operation from FIG. 1(b), in accordance with an example implementation.

[0018] FIG. 6(d) to 6(f) illustrate an example change to the management information from imputation of values for missing values in the management information, in accordance with an example implementation.

[0019] FIG. 7 illustrates an example flow diagram for calculation of the composite similarity score from the mutual information values and average interaction values between attributes, in accordance with an example implementation.

[0020] FIGS. 8(a) and 8(b) illustrate example management of composite score values, in accordance with an example implementation.

[0021] FIG. 9 illustrates an example system architecture for the management server, in accordance with an example implementation.

[0022] FIG. 10 illustrates an example of attributes from different upstream processes, in accordance with an example implementation.

[0023] FIG. 11 illustrates the management of independent and dependent attributes across subsystems, in accordance with an example implementation.

[0024] FIG. 12 illustrates an example flow diagram for recalculation of the composite index after incorporating new attributes, in accordance with an example implementation.

[0025] FIG. 13(a) to 13(d) illustrate an example change to the management information from imputation of values, in accordance with an example implementation.

[0026] FIG. 14 illustrates an example output of composite score values in accordance with an example implementation.

DETAILED DESCRIPTION

[0027] The following detailed description provides further details of the figures and example implementations of the present application. Reference numerals and descriptions of redundant elements between figures are omitted for clarity. Terms used throughout the description are provided as examples and are not intended to be limiting. For example, the use of the term "automatic" may involve fully automatic or semi-automatic implementations involving user or administrator control over certain aspects of the implementation, depending on the desired implementation of one of ordinary skill in the art practicing implementations of the present application. The term "rig" and "well" may also be used interchangeably. "Rig systems" and "wellsites" may also be utilized interchangeably.

[0028] Example implementations of the present disclosure include systems and methods that analyze the commonalities among attributes available across different processes, and leverage that analysis to generate composite scores to facilitate the comparison of one well to another. Oil and gas data are analyzed from exploration, drilling, completions, and production processes which collectively fall under the category of upstream operations, as known to those skilled in the art.

[0029] Data sets from multiple upstream processes can be complex and interdependent. The related art considers attributes from single siloed processes when comparing behavior of wells. Thus, the related art datasets may be unable to capture the rich interaction between attributes across different processes. Example implementations disclosed herein create a feature set based on observed/recorded attributes as well as attributes derived from those, and model each well as a data vector in this multidimensional attribute space. Example implementations then compute composite similarity between wells to provide insights into well behavior. This composite similarity can be calculated along all dimensions or subsets of dimensions, and serve as input to any clustering algorithm for further analysis. Finally, the similarity can be incrementally computed by incorporating more attributes as required.

[0030] FIG. 1(a) illustrates a system involving a plurality of rig systems and a management server, in accordance with an example implementation. One or more rig systems 101-1, 101-2, 101-3, 101-4, and 101-5 can each involve a corresponding rig 200-1, 200-2, 200-3, 200-4, 200-5 as illustrated in FIG. 2 along with a corresponding rig node 300-1, 300-2, 300-3, 300-4, and 300-5 as illustrated in FIG. 3. Each of the rig systems 101-1, 101-2, 101-3, 101-4, and 101-5 is connected to a network 100 which is connected to a management server 102. The management server 102 manages a database 103, which contains data aggregated from the rig systems in the network 100. In alternate example implementations, the data from the rig systems 101-1, 101-2, 101-3, 101-4, and 101-5 can be aggregated to a central repository or central database such as public databases that aggregate data from rigs or rig systems, such as for government compliance purposes, and the management server 102 can access or retrieve the data from the central repository or central database.

[0031] FIG. 1(b) illustrates an example timeline for a rig system, in accordance with an example implementation. The timeline for the rig system 101 may include multiple phases of rig operation. These phases can include (but are not limited to) an exploration phase, a drilling phase, a completions phase, a production phase, a processing phase and a pipeline phase. In the following description, the term "process" may also be used interchangeably with the term "phase". Example implementations may involve data or attributes associated with one or more of the phases of the timeline, depending on the desired implementation.

[0032] During the exploration phase, the well is initially drilled to determine whether reservoirs with oil or gas are present, and the initial construction of the rig takes place. In example implementations described herein, the rig node may be configured to assist the user in determining how to configure the rig and the parameters for the drilling during the exploration phase.

[0033] The drilling phase follows the exploration phase as determined in the exploration phase, e.g., if promising amounts of oil and gas are confirmed from the exploration phase. During the drilling phase, the size and characteristics of the discovery are determined and technical information is utilized to allow for more optimal methods for recovery of the oil and gas. An appraisal drilling can be performed and a rig is established. In example implementations described herein, the rig node may be configured to assist the user in determining appropriate parameters for the drilling and assist in the management and obtaining of desired characteristics for the rig.

[0034] The completions phase is directed to the determination as to whether the well should be completed as a well, or whether it should be abandoned as a dry hole. The completions phase transforms the drilled well into a producing well. During this phase, the casing of the rig may be constructed, along with the perforations. Various aspects of the construction of the rig, such as cementing, gravel packing and production tree installation may be employed. Sensors may be employed to determine various parameters for facilitating the completion of the rig, such as rate of flow, flow pressure and gas to oil ratio measurements, but not limited thereto.

[0035] The production phase follows the completions phase and is directed to the facilitation of production of oil or gas. The production phase includes the operation of wells and compressor stations or pump stations, waste management, and maintenance and replacement of facility components. Sensors may be utilized to observe the above operations, as well as determining environmental impacts from parameters such as sludge waste accumulation, noise, and so on. Example implementations described herein may provide feedback to rig system operators to maximize the production of the rig based on the use of model signatures.

[0036] During the processing and pipelining phase, the produced oil or gas is processed and transferred to refineries through a pipeline.

[0037] FIG. 2 illustrates an example rig 200 in accordance with an example implementation. The example implementation depicted in FIG. 2 is directed to a shale gas rig. However, similar concepts can be employed at other types of rigs as well without departing from the inventive scope of the present disclosure. For example, example implementations described herein can also be applied to horizontal oil wells by integrating features from multiple upstream processes. The well 201 may include one or more gas lift valves 201-1 which are configured to control hydrostatic pressure of the tubing 201-2. Tubing 201-2 is configured to extract gas from the well 201. The well 201 may include a casing 201-3 which can involve a pipe constructed within the borehole of the well. One or more packers 201-4 can be employed to isolate sections of the well 201. Perforations 201-5 within the casing 201-3 allow for a connection between the shale gas reservoir and the tubing 201-2.

[0038] The rig 200 may include multiple sub-systems directed to injection of material into the well 201 and to the production of material from the well 201. For the injection system 250 of the rig 200, there may be a compressor system 202 that includes one or more compressors that are configured to inject material into the well such as air or water. A gas header system 202 may involve a gas header 202-1 and a series of valves to control the injection flow of the compressor system 202. A choke system 203 may include a controller or casing choke valve which is configured to reduce the flow of material into the well 201.

[0039] For the production system 260 of the rig 200, there may be a wing and master valve system 204 which contains one or more wing valves configured to control the flow of production of the well 201. A flowline choke system 205 may include a flowline choke to control flowline pressure from the well 201. A production header system 206 may employ a production header 206-1 and one or more valves to control the flow from the well 201, and to send produced fluids from the well 201 to either testing or production vessels. A separator system 207 may include one or more separators configured to separate material such as sand or silt from the material extracted from the well 201.

[0040] As illustrated in FIG. 2, various sensors may be applied throughout the rig to measure various data or attributes for a rig node, which are described in further detail below. The sensors are identified by an "8" in an octagon in FIG. 2. These sensors provide feedback to the rig node which can interact with the system as illustrated in FIGS. 1(a) and 1(b), and can be fed to the management server 102 for database storage 103, and/or supplied to a central repository or database such as a public database, which can then be harvested by management server 102.

[0041] FIG. 3 illustrates an example configuration of a rig system 101, in accordance with an example implementation. The rig system 101 includes a rig 200 as illustrated in FIG. 2 which contains a plurality of sensors 210. The rig system 101 includes a rig node 300 which may be in the form of a server or other computer device and can contain a processor 301, a memory 302, a storage 303, a data interface (I/F) 304 and a network I/F 305. The data I/F 304 interacts with the one or more sensors 210 of the rig 200 and stores raw data in the storage 303, which can be sent to a management server for processing, or to a central repository or central database. The network I/F 305 provides an interface to connect to the network 100.

[0042] FIG. 4 illustrates a configuration of a management server 102, in accordance with an example implementation. Management server 102 may involve a processor 401, a memory 402, a storage I/F 404 and a network I/F 405. The processor 401 is configured to execute one or more programs in the memory 402, to process data and to calculate composite similarity scores. The storage I/F 404 is the interface to facilitate connections between the management server 102 and the database 103. The network I/F 405 facilitates interactions between the management server 102 and the plurality of rig systems. Data is aggregated to the management server by the network I/F and then subsequently stored in the database, for example, for future analytics. Alternatively, the plurality of rig systems may send the data to a central database or repository, which is then processed by the management server 102. Management server 102 may execute a process for constructing composite similarity scores by using programs stored in memory 402 and executed by processor 401. Such processes can include the flow diagrams as illustrated in FIG. 5 and FIGS. 7 and 12.

[0043] Memory 402 may be configured to store rig information for one or more of the rig systems managed by the management server. Such information can include attributes from multiple phases of operation of a rig as illustrated and described in more detail in FIGS. 6(a) to 6(c), 8(a) and 8(b) and 10. Such information can be obtained from rig systems directly, or from the central database or repository.

[0044] Processor 401 can be configured to calculate a similarity score between rigs of rig systems managed by the management server, based on attribute interaction values between the attributes stored in memory 402. The attribute interaction values can include composite similarity scores of rigs across multiple phases of operation, or in a same phase of operation. Further, some of the attributes can be associated with sets of values, or continuous values which can be discretized according to the desired implementation as described herein. For rigs having missing values that are not in the database or not retrievable from the associated rig system, the missing values can be imputed for similarity score calculations. The imputing of values can be based on other similar wells within a threshold as determined by the operator of the management server. Example implementations of imputation of values can include solving for values such that the similarity score between two rigs is maintained given a set of values from another similar rig, directly incorporating values from the rig having the highest similarity score, or taking an average of values from a set of rigs that have a similarity score within a threshold as defined by the operator of the management server.
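As a minimal sketch of two of the imputation options described above (copying from the rig having the highest similarity score, and averaging over rigs within a threshold), the following Python fragment may be illustrative. The function names, the dictionary layout, the sample numbers, and the reading of "within a threshold" as "similarity greater than or equal to the threshold" are all assumptions made for illustration and are not taken from the disclosure.

```python
# Hypothetical data layout: values[rig][attribute] and similarity[(target, other)].

def impute_from_most_similar(target, attribute, values, similarity):
    """Copy the attribute value from the most similar rig that has one."""
    candidates = [r for r in values
                  if r != target and values[r].get(attribute) is not None]
    if not candidates:
        return None
    best = max(candidates, key=lambda r: similarity[(target, r)])
    return values[best][attribute]

def impute_from_average(target, attribute, values, similarity, threshold):
    """Average the attribute over rigs whose similarity to target meets the threshold."""
    near = [values[r][attribute] for r in values
            if r != target
            and values[r].get(attribute) is not None
            and similarity[(target, r)] >= threshold]  # assumed meaning of "within"
    return sum(near) / len(near) if near else None

# Illustrative numbers only.
values = {
    "W1": {"water_volume": None},
    "W2": {"water_volume": 100.0},
    "W3": {"water_volume": 140.0},
    "W4": {"water_volume": 300.0},
}
similarity = {("W1", "W2"): 0.9, ("W1", "W3"): 0.8, ("W1", "W4"): 0.2}

copied = impute_from_most_similar("W1", "water_volume", values, similarity)
averaged = impute_from_average("W1", "water_volume", values, similarity, 0.75)
```

The third option, solving for values such that an existing similarity score is maintained, depends on the specific similarity function and is not sketched here.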

[0045] For new attributes that are derived or input by the operator of the management server, the processor 401 can be configured to include the new attribute into the attributes stored in the memory 402, and recalculate the composite similarity scores by calculating the similarity score for the new attribute, and then adding the additional similarity score to the composite similarity score based on the weighting of the new attribute. Further details are provided with respect to FIG. 5 and 11.
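The incremental update described above can be sketched as follows, assuming (as an illustration only) that the composite score for each pair of rigs is held in a dictionary and that the contribution of the new attribute is its per-pair similarity score multiplied by a weight. The function name and data layout are hypothetical.

```python
def add_attribute(composite, new_scores, weight):
    """Return a copy of the composite scores with the weighted new-attribute scores added."""
    updated = dict(composite)  # leave the stored composite untouched
    for pair, score in new_scores.items():
        updated[pair] = updated.get(pair, 0.0) + weight * score
    return updated

# Illustrative numbers only.
composite = {("W1", "W2"): 1.5, ("W1", "W3"): 0.75}
new_scores = {("W1", "W2"): 0.5}  # similarity for the new attribute alone

updated = add_attribute(composite, new_scores, weight=0.5)
```

Under this assumption, only the scores for the new attribute need to be computed; the previously accumulated composite values are reused rather than recomputed.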

[0046] FIG. 5 illustrates a flow diagram for constructing a composite score, in accordance with an example implementation. At 501, the process begins and the management server 102 ingests data from multiple data sources, such as data streamed from rig systems or obtained from databases (e.g., central repository or central/public database) regarding the wellsites under management. Such data can contain information about attributes from different processes (from multiple phases in the timeline for rig operation) as explained in FIG. 1(b). The data can be multi-faceted: streaming or stored, structured or unstructured, and so on.

[0047] At 502, the management server 102 maps the data to the respective processes, wherein the management server 102 determines which data attributes belong to which process. This can be implemented, for example, by identifying attributes from a table of oil and gas management attributes.
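A minimal sketch of the mapping at 502 is given below, assuming a simple lookup table from attribute name to phase. The table contents, the phase assigned to each attribute, and the "unmapped" fallback are hypothetical and serve only to illustrate the idea of identifying attributes from a table.

```python
from collections import defaultdict

# Hypothetical lookup table; the phase assignments are illustrative only.
ATTRIBUTE_TO_PHASE = {
    "geology_rock": "exploration",
    "measured_depth": "drilling",
    "fracking_fluid_composition": "completions",
    "water_volume": "completions",
    "oil_production_slope": "production",
}

def map_to_phases(record):
    """Group one well's attribute values by phase of operation."""
    by_phase = defaultdict(dict)
    for name, value in record.items():
        phase = ATTRIBUTE_TO_PHASE.get(name, "unmapped")  # keep unknown attributes visible
        by_phase[phase][name] = value
    return dict(by_phase)

record = {"measured_depth": 9500.0, "water_volume": 120000.0, "vendor_code": "X1"}
grouped = map_to_phases(record)
```

Attributes that do not appear in the table are collected under "unmapped" so that they can be reviewed rather than silently dropped.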

[0048] At 503, the management server 102 models each well as a vector in the space spanned by all of the attributes. The vector may be derived from an implementation of the attributes in a cube, having scalar, temporal and spatial dimensions. The attributes are classified into at least the scalar, spatial, and temporal categories, which therefore define the cube that can automate the generation of different key performance indicators (KPIs) representative of the oil and gas management and operations across a well or many wells. Further, the hierarchy and grain can be defined along each dimension to facilitate such a cube implementation. An example of how the well is incorporated into a cube implementation can be found in PCT Application PCT/US2015/023399, filed on March 30, 2015, the disclosure of which is incorporated herein by reference in its entirety for all purposes. In example implementations, the cube can be utilized to provide the vector in the space spanned by all attributes. Once the attributes are determined, the management server ingests the values from the rigs or from the central database. Because some of the attributes may be a set of continuous values instead of individual values, the management server may discretize the set of values to generate the composite similarity score. More detail of the discretization process is given in FIG. 6(a) and 6(b).

[0049] FIG. 6(a) illustrates example vectors, in accordance with an example implementation. The vectors of FIG. 6(a) are examples of output generated by the modeling of each well as a vector from FIG. 5 at 503, wherein each well is modeled as a vector in the space spanned by the attributes (observed and/or derived). FIG. 6(a) illustrates an example where the entries in each cell represent values for a well for that attribute. Note that a well is uniquely identified by its American Petroleum Institute (API) well number. Each row in the table illustrated in FIG. 6(a) describes a vector, and the first column represents the well ID or API.
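The row-per-well layout described above can be sketched as follows. The attribute names, their ordering, the sample values, and the use of None for a missing entry are assumptions made for illustration; they do not reproduce the contents of FIG. 6(a).

```python
# Hypothetical fixed attribute order defining the vector space.
ATTRIBUTES = ["geology_rock", "measured_depth", "well_spacing", "water_volume"]

def to_vector(well_record, attributes=ATTRIBUTES):
    """Return the well's values in a fixed attribute order; None marks a missing value."""
    return [well_record.get(name) for name in attributes]

# Illustrative records keyed by well ID; an entry may be a single value or a set of values.
wells = {
    "W1": {"geology_rock": "shale", "measured_depth": [9000, 9400], "well_spacing": 660},
    "W2": {"geology_rock": "shale", "measured_depth": [8700, 9900],
           "well_spacing": 880, "water_volume": 120000.0},
}

# One row (vector) per well, with the well ID as the key.
table = {well_id: to_vector(record) for well_id, record in wells.items()}
```

Keeping the attribute order fixed means that position i of every vector refers to the same attribute, which is what allows wells to be compared dimension by dimension.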

[0050] FIG. 6(b) illustrates an example conversion from continuous values for attributes to discrete values, in accordance with an example implementation. Specifically, FIG. 6(b) utilizes the continuous set of values for each of the attributes of FIG. 6(a) and characterizes the values as a discrete value. The discrete value can be determined based on a desired threshold or other criteria related to the values. For example, for the set of values of the geology rock attribute and the fracking fluid composition attribute, each set of values is already discrete so the discretized values are the same. For an attribute such as the slope of oil production, measured depth, well spacing, and water volume, the set of values can be discretized based on the ranges of the set of values or can be discretized by other methods such as average, median, and so on, depending on the attribute and the desired implementation. In the example of FIGS. 6(a) and 6(b), the set of values of oil production for W1 is discretized as 'WA1, WA2...', and the set of values for W2 is discretized as 'WB1, WB2...' in FIG. 6(b) to represent a range of production (e.g., each of WA and WB representing ranges above a first value and below a second value). As the oil production is discretized as WA and WB, the discretized values can imply that the range of production values for W1 is different from the range of values of W2. Similarly, in the example of Measured Depth, as the sets for both W1 and W2 are discretized as MD1, this can imply that the sets of values for W1 and W2 all fall within the same range. The discretization can be implemented in any manner according to the desired implementation (average, median, mode, etc.) and is not limited to ranges as illustrated in FIG. 6(b).
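The range-based discretization described above can be sketched as follows. This is a minimal illustration only: the function name `discretize`, the bin boundaries, and the depth readings are hypothetical and are not taken from FIG. 6(b).

```python
from bisect import bisect_right

def discretize(values, edges, prefix):
    """Map continuous values to range labels such as 'MD1', 'MD2', ...

    edges: sorted bin boundaries (hypothetical thresholds).
    Values below edges[0] get label 1, values in [edges[0], edges[1])
    get label 2, and so on.
    """
    return {f"{prefix}{bisect_right(edges, v) + 1}" for v in values}

# Hypothetical measured-depth readings for two wells.
w1_depth = [8200.0, 8450.0, 8900.0]
w2_depth = [8100.0, 8700.0]
edges = [10000.0, 12000.0]  # assumed bin boundaries

print(discretize(w1_depth, edges, "MD"))  # both wells fall in the first range
print(discretize(w2_depth, edges, "MD"))
```

Because both hypothetical wells land in the same range, they share the label MD1, which mirrors the Measured Depth case discussed above.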

[0051] FIG. 6(c) illustrates a management of a set of attributes for each timeline phase of operation from FIG. 1(b), in accordance with an example implementation. By applying analytic flows, the management server may identify attributes that affect each of the phases of operation in the timeline. The attributes can be derived based on the data analytics performed in example implementations as described, for example, in PCT Application No. PCT/US2014/032394, filed on March 31, 2014, the disclosure of which is herein incorporated by reference in its entirety for all purposes. Attributes may be initially entered manually or by other methods such as being derived from one or more models supplied by the management server operator, and then automatically updated based on new data from rig systems and/or analytics applied to past rig systems through learning based methods. When new attributes are obtained for processing, from being derived by analytics or input by the administrator, the similarity scores can be computed for the new attributes and added on to the composite score as described in example implementations herein.

[0052] At the flow of 504 from FIG. 5, a check is performed to determine if there are missing values in the vector. If so (Yes), the flow proceeds to 505 to impute the missing feature values, if present. In an example implementation, missing feature values are imputed for a well by computing a composite similarity between that well and other wells based on the features already available for the well with the missing feature value. The composite similarity value is calculated by equation (1) as described below. Based on the top similar wells for which the feature value is available, those values can be similarly assigned to the well with the missing feature values. Examples of imputation of values include using the value of the most similar well, averaging a desired number of values from a desired number of wells, and determining a value that would maintain the composite similarity score. In the description, the terms "composite similarity score" and "composite similarity value" may be used interchangeably.

[0053] The example implementation as illustrated at FIGS. 6(d) to 6(f) depicts how imputation of values may be conducted in accordance with the flow at 505 of FIG. 5. In the example illustrated in FIG. 6(d), well W3 is missing feature values for the fracking fluid. Thus as shown at FIG. 6(e), the composite similarity score for W3 is determined with respect to the other wells having known fracking fluid values. In the examples as illustrated in FIGS. 6(d) to 6(f), the well having the highest composite similarity score is utilized. At FIG. 6(f), the value from W1, the well with the highest composite score, is imputed for the value of W3, as indicated in bold and underline. In example implementations, the imputed value may also be replaced once the actual value is received.
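The "most similar well" imputation option can be sketched as below. The well records, the similarity values, and the helper name `impute_from_most_similar` are hypothetical; the similarity scores are assumed to have been computed already from the features that the target well does have.

```python
def impute_from_most_similar(target, feature, wells, similarity):
    """Copy `feature` from the most similar well that has a value for it.

    wells: dict of well id -> dict of feature name -> set of values (or None).
    similarity: dict of (target, other) -> precomputed composite score.
    Returns None when no other well has the feature.
    """
    candidates = [w for w in wells if w != target and wells[w].get(feature)]
    if not candidates:
        return None
    best = max(candidates, key=lambda w: similarity[(target, w)])
    return set(wells[best][feature])

# Hypothetical data: W3 lacks a fracking fluid value.
wells = {
    "W1": {"fracking_fluid": {"FF1"}},
    "W2": {"fracking_fluid": {"FF2"}},
    "W3": {"fracking_fluid": None},
}
similarity = {("W3", "W1"): 0.8, ("W3", "W2"): 0.4}  # made-up scores

wells["W3"]["fracking_fluid"] = impute_from_most_similar(
    "W3", "fracking_fluid", wells, similarity
)
print(wells["W3"])
```

The other options named in the text, such as averaging over several similar wells, would replace the `max` selection with an aggregation over the top candidates.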

[0054] At the flow of 506 of FIG. 5, the composite similarity score between wells is calculated. Related art methods such as Gower's similarity, Random Forest similarity, weighted Jaccard similarity coefficient, and so on have been utilized; however, they have several problems in implementation. In some related art implementations, similarity calculators can handle the mixed nature of data but do not consider feature interaction. Other related art implementations take the mixed nature of data and feature interaction into consideration, but cannot handle the case when the data vector for each feature consists of sets. Moreover, the training time is very expensive.

[0055] When the data is mixed (contains numerical, categorical, and possibly sets), is extremely sparse in nature, and relatively huge in volume, the related art implementations may fail. The computations have to be performed very fast. Moreover, related art implementations cannot incorporate new attributes into the similarity computation without redoing the previous calculation.

[0056] Thus, example implementations address the above problems with a composite similarity computation method shown below:

[0057] Let the feature (attribute) set be denoted by $f_i$, $i = 1, 2, \ldots, d$. Let the wells be denoted by $W_l$, $l = 1, 2, \ldots, n$. Let the corresponding entry in the $n \times d$ matrix be denoted by $C_{il}$. In case a feature is continuous, let the feature be discretized into several categories. Note that $C_{il}$ can be a set, possibly singleton as well as empty (null). Thus, each well is a $d$-dimensional vector. The similarity coefficient between $W_l$ and $W_m$ is denoted by $\mathrm{sim}(W_l, W_m)$, which is calculated as follows:

[0058] [Equation (1)]

[0059] abs(.) denotes absolute value, and |.| denotes the cardinality of the set. One possible value for the weights $\alpha_i$ is to take into account the sparseness of the feature $i$: $\alpha_i = \frac{1}{n} \sum_{l=1}^{n} I(C_{il} \neq \emptyset)$

[0060] Other examples for weights are also possible, depending on the desired implementation. Weights may also be adjusted by the operator based on the desired emphasis on certain attributes.
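One way to read the sparseness-based weight is as the fraction of wells that have a non-empty entry for a feature. The sketch below follows that reading, which is an assumption about the intended formula; the function name `sparseness_weights` and the sample matrix are hypothetical.

```python
def sparseness_weights(matrix):
    """Return one weight per feature: the share of wells with a non-empty entry.

    matrix: dict of well id -> list of sets, one set per feature.
    Assumes every well lists the same number of features.
    """
    rows = list(matrix.values())
    n = len(rows)
    d = len(rows[0])
    return [sum(1 for row in rows if row[i]) / n for i in range(d)]

# Hypothetical two-well, three-feature matrix; empty sets mark missing values.
matrix = {
    "W1": [{"R1"}, {"F1"}, set()],
    "W2": [{"R1"}, set(), set()],
}
print(sparseness_weights(matrix))
```

Under this reading a feature recorded for every well gets weight 1, and a feature recorded for no well gets weight 0 and so drops out of the score.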

[0061] Feature $i$'s importance may also be taken into consideration if known previously. Here $I(\cdot)$ is the indicator function: if the condition inside $(\cdot)$ is satisfied, it takes the value 1, otherwise 0. The factor $\beta_{is}$ calculates mutual information between features $i$, $s$ based on the total data set.

[0062] When similarity is computed in equation (1), the effect of similarity for each feature is considered as well as the effect of the interaction value between features. Since the features belong to multiple phases in the timeline of the rig operation, the interaction value is derived based on features associated with different phases of the timeline, or features associated with the same phase of the timeline. Equation (1) takes into account the weight assigned to the feature through $\alpha_i$ and the average interaction with other features through the term $\exp(\cdot)$, and computes the ratio of how many values for a feature are shared between the two wells to the total number of values of that feature across the two wells.

[0063] In example implementations as described herein, the similarity computation considers interaction values where attributes from one phase of operation may affect attributes of other phases of operation (or attributes from one phase of operation may affect another attribute of the same phase). For example, the characteristics of the geology rock in the exploration and drilling phase of operation may affect the values for oil production in the production phase of operation. Thus, even if wells have different raw production values, the similarity score between wells may be higher for the oil production based on the values of the geology rock. The average interaction with other features can be computed based on the implementations as illustrated in FIGS. 6(a), 6(b) and FIG. 7. For example, once the values are discretized, the mutual information factor $\beta_{is}$ can be calculated between features $i$, $s$ based on the total data set. The mutual information between the two features can be calculated as:

[0064] $\beta_{is} = \sum_{I \in i} \sum_{S \in s} p(I, S) \log \frac{p(I, S)}{p(I)\, p(S)}$

[0065] wherein $p(I, S)$ represents the joint probability distribution function of $i$ and $s$, and $p(I)$ and $p(S)$ represent the marginal probability distribution functions of $i$ and $s$ respectively. Taking the example of the discretized features of FIG. 6(b), there are two wells and two features across which information has been collected. The similarity is computed based on equation (1) above.

[0066] Let $f_1$ be Geology rock, $f_2$ be Fracking fluid composition, and $f_3$ be Oil Production from FIG. 6(b). In this case, assume $\alpha_1 = \alpha_2 = \alpha_3 = 1$. Other factors are calculated as below:

[0067] $\beta_{12} = \beta_{21} = 2 \times 0.5 \times \log(\cdot) = 0$, $\beta_{13} = 2 \times 0.5 \times \log(\cdot) = 0$, $\beta_{23} = 2 \times 0.5 \times \log(\cdot) = 0.301$
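The mutual information factor can be checked numerically with a small sketch. Two simplifying assumptions are made here: each well's discretized entry is treated as a single categorical label, and the logarithm is taken to base 10, since 0.301 is approximately log10(2). The labels and the function name `mutual_information` are hypothetical.

```python
from collections import Counter
from math import log10

def mutual_information(xs, ys):
    """Mutual information (base-10 log) between two equally long label lists."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # joint counts over wells
    px = Counter(xs)               # marginal counts for the first feature
    py = Counter(ys)               # marginal counts for the second feature
    return sum(
        (c / n) * log10((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )

# Hypothetical labels for two wells (one label per well per feature).
rock = ["R1", "R1"]    # same rock type for both wells
fluid = ["FA", "FB"]   # different fluid labels
oil = ["WA", "WB"]     # different production ranges

print(round(mutual_information(rock, fluid), 3))  # 0.0
print(round(mutual_information(fluid, oil), 3))   # 0.301
```

A feature that takes the same value for every well carries no information about any other feature, which is why its factor is zero, while two features that vary together across the two wells give log10(2).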

[0068] Once the attribute interaction values of $\beta_{is}$ are calculated, the similarity score can be computed based on the average interaction with other features through the term $\exp(\cdot)$, and $\frac{|C_{il} \cap C_{im}|}{|C_{il} \cup C_{im}|}$ can also be computed to determine the ratio of how many values for a feature are shared between the two wells to the total number of values of that feature across the two wells. In the example of FIG. 6(b), $\frac{|C_{11} \cap C_{12}|}{|C_{11} \cup C_{12}|} = 1$, $\frac{|C_{21} \cap C_{22}|}{|C_{21} \cup C_{22}|} = 0.33$, and $\frac{|C_{31} \cap C_{32}|}{|C_{31} \cup C_{32}|} = 0$.
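The ratio of shared values to total values for one feature is a set intersection over a set union. A minimal sketch follows; the value sets are hypothetical and chosen only to give ratios of 1, about 0.33, and 0, and returning 0 when both sets are empty is an assumption.

```python
def overlap_ratio(a, b):
    """Shared values divided by all distinct values across two wells.

    Returns 0.0 when both sets are empty (an assumed convention).
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

# Hypothetical discretized entries for wells W1 and W2.
rock_w1, rock_w2 = {"R1"}, {"R1"}
fluid_w1, fluid_w2 = {"FA", "FB"}, {"FB", "FC"}
oil_w1, oil_w2 = {"WA1", "WA2"}, {"WB1", "WB2"}

print(overlap_ratio(rock_w1, rock_w2))              # 1.0
print(round(overlap_ratio(fluid_w1, fluid_w2), 2))  # 0.33
print(overlap_ratio(oil_w1, oil_w2))                # 0.0
```

Keeping this ratio per feature is also what makes the later drill down possible, since each feature's contribution to the composite score stays visible.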

[0069] From the above calculations, example implementations ensure that the interaction values across features from the same phase of operation and across different phases of operation are considered in the similarity score calculation.

[0070] Upon computing the similarity between two wells, example implementations can drill down to which features contributed to high or low similarity by looking at the individual feature similarity values.

[0071] FIG. 7 illustrates an example flow diagram for calculation of the composite similarity score from the mutual information values and average interaction values between attributes, in accordance with an example implementation. As described above, because each attribute is weighted according to the desired implementation, it is possible to aggregate the weighted similarity score of the new attribute to the composite similarity score, instead of recalculating the composite similarity score. In this manner, computation time by the server can be reduced. The flow diagram of FIG. 7 illustrates the process from FIG. 6(a) to 6(b) and how the composite score is calculated in accordance with the flow at 506 of FIG. 5.

[0072] The new attribute can be derived, or can be input manually by the operator (e.g., insertion of proprietary attributes), or by other methods depending on the desired implementation. At 700, the management server calculates mutual information values between the discretized values by solving for $\beta_{is}$ for each pair of attributes as described in FIG. 6(b). At 701, the management server then proceeds to calculate the average interaction values with the other features and the ratio of shared values after $\beta_{is}$ is solved for the pair of attributes. That is, because $\beta_{is}$ is solved, the management server can proceed to solve for $\exp(\cdot)$ and take the ratio of the discretized values. At 702, the composite similarity score can thereby be calculated by equation (1). Because $\beta_{is}$ is solved for each attribute, the management server therefore facilitates the relation of mutual information values between pairs of attributes that may span across different phases of operation, which can provide a more accurate composite score in comparison to analyzing only attributes spanning the same phase of operation.

[0073] At 507 of FIG. 5, the management server stores values for rig management as illustrated in FIGS. 8(a) and 8(b). FIGS. 8(a) and 8(b) illustrate example management of composite score values, in accordance with an example implementation. In the example of FIG. 8(a), composite scores are normalized from a scale of 0 (completely dissimilar) to 1 (identical) from the equation above; however, the example implementations are not limited to this configuration and other implementations are also possible depending on the desired implementation. The composite score between wells includes a comparison of one or more attributes as described above. FIG. 8(b) illustrates an example drill down of the composite score to the attribute components, and the similarity scores across each attribute.

[0074] FIG. 9 illustrates an example system architecture for the management server, in accordance with an example implementation. The lower layer of the management server includes the data ingestion layer 900, which is configured to receive data from the network. The data can come from external data sources and be received directly from rig systems, or can be extracted from central repositories or databases. In an example implementation involving a central database, the data ingestion layer 900 may periodically query the central database, such as a public database that tracks oil and gas management information from rig systems, and obtain data from corresponding rig systems based on API. The data ingestion layer 900 may perform an extract, transform, and load (ETL) to process and forward the data to the raw operational data 901.

[0075] Data is ingested and processed as raw operational data in the raw operational data layer 901. The data that can be processed can include data such as operational data 901-1, Geographic Information Systems (GIS) data 901-2, Geology data 901-3, and so on. In the raw operational data layer 901, a feature extraction process is executed to process the data and extract features for feeding to the core engine layer 902.

[0076] The core engine layer 902 conducts pre-processing and feature extractions 902-1 to extract features from the raw operational data. Core engine layer 902 may also utilize hidden Markov models (HMM) 902-2 to determine relationships between attributes, including dependencies. Rule based analytics and analytical models 902-3 can be utilized to conduct analytics on the data for generating models, such as the example implementations described herein for generating the composite similarity scores. The outputs are processed and sent to the processed data layer 903, which can manage the features 903-1, co-relations between attributes 903-2 and other information for the decision cube 904. The decision cube layer 904 can also send cube queries to the core engine 902 to manage cube information as necessary. Core engine 902 and decision cube 904 may work in tandem to implement the calculation of the similarity scores between wells as described above, and store the data in the processed data 903. In particular, rule based analytics and analytical models 902-3 can include algorithms needed to calculate the similarity score in accordance with the example implementations.

[0077] Visual analytics can also be provided to the User Interface (UI) layer 905 for generating output. The UI layer 905 can be configured to take any two attributes from the cube and display them in a desired output format (e.g., bar graph, pie graph, line graph, etc.), with the x-axis and y-axis being based on the cube dimension of the attribute. The units of the x-axis and the y-axis can be scaled up and down according to the hierarchy of the attributes (i.e., along the scalar, spatial and temporal hierarchy).

[0078] FIG. 10 illustrates an example of attributes from different upstream processes, in accordance with an example implementation. The attributes as illustrated from FIG. 10 may be derived from other processes. With respect to the measured depth, for each well, along the measured depth, example implementations compute the spacing between each well and other wells. Example implementations compute the physical distance between the segments of wells where they are producing (e.g., marked by perforation points). For some wells, the number of completion stages may be known. For the wells where the completion stage is known, example implementations can compute the minimum physical distance between the frac stage segments.

[0079] Examples for the measured depth, oil production, proppant volume, and rock geology are described as follows.

[0080] Measured depth: For each well, example implementations can consider the depth drilled for each well, as well as the spacing between wells along a measured depth. Example implementations extract features over periods of time throughout the drilling phase, up until the completion of the drilling phase for each well. The features can also include spacing between wells at specific depths as new wells are drilled or existing wells continue their drilling throughout the drilling phase of operation, and such data can be updated over time across wells.

[0081] Oil Production: For each well, example implementations consider the oil produced from the well as time series data. Example implementations extract features from the time series in consecutive segments of k time periods (k = 3, 6, ...). The features include maximum production, minimum production, slope of production, etc. Let the $i$-th segment be denoted by $[l_i, u_i)$. Let $x[t]$ denote the oil production at the $t$-th time point, $t \in [l_i, u_i)$. The slope is computed by $(x[u_i - 1] - x[l_i]) / (u_i - 1 - l_i)$.
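The per-segment production features can be sketched as follows, reading each segment as a half-open index range whose slope is the difference between its last and first points divided by the number of steps between them. That reading, the function name `segment_features`, and the production figures are assumptions for illustration.

```python
def segment_features(x, k):
    """Split series x into consecutive segments of k points and summarize each.

    Each segment covers indices [lo, hi); the slope uses its two end points.
    A trailing partial segment is ignored (an assumed convention).
    """
    if k < 2:
        raise ValueError("k must be at least 2 to define a slope")
    feats = []
    for lo in range(0, len(x) - k + 1, k):
        hi = lo + k
        seg = x[lo:hi]
        slope = (x[hi - 1] - x[lo]) / (hi - 1 - lo)
        feats.append({"max": max(seg), "min": min(seg), "slope": slope})
    return feats

# Hypothetical monthly production for one well, with k = 3.
production = [100, 90, 80, 70, 65, 60]
for f in segment_features(production, 3):
    print(f)
```

The resulting slopes are continuous values, so they would then be discretized before entering the similarity computation.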

[0082] Proppant volume: Proppants are a component used to complete a well. The derived feature is calculated by the formula: (Volume of proppant used) / (# completion stages).

[0083] Rock Geology: During the exploration phase of operation, the geology of the rock formations is stored for each well. Further, during the drilling phase of operation, additional geology information may be incorporated based on new geology information resulting from drilling of the wells.

[0084] The attributes of FIG. 10 are examples, and the present disclosure is not limited to the attributes described in FIG. 10. Other attributes may also be utilized, depending on the desired implementation.

[0085] FIG. 11 illustrates the management of independent and dependent attributes across subsystems, in accordance with an example implementation. The independent and dependent variables can be determined from any form of analytics as known in the art, or initialized from manual entry from the management server operator or derived from one or more models. In one example from the circulation system, the pump strokes/minute attribute is an independent variable, and pump pressure is a dependent variable on the pump strokes/minute attribute. The independent and dependent attributes are not exhaustive, and can include other attributes depending on the analytics utilized or the attributes provided. The example implementations of oil and gas management analytics of PCT Application No. PCT/US2014/032394 can be utilized for the construction of the management information of FIGS. 10 and 11, which can be used to generate the vectors of FIGS. 6(a) and 6(b).

[0086] In example implementations, the operator of the management server may desire to incorporate new attributes during the calculation of the similarity score. The new attribute may be an attribute not previously considered in the scoring that the operator desired to incorporate into the scoring, or can also be a proprietary attribute as defined by the operator.

[0087] FIG. 12 illustrates an example flow diagram for recalculation of the composite similarity score after incorporating new attributes, in accordance with an example implementation. As described above, because each attribute is weighted according to the desired implementation, it is possible to aggregate the weighted similarity score of the new attribute to the composite similarity score, instead of recalculating the composite similarity score. In this manner, computation time by the server can be reduced. At 1201, the management server receives a new attribute and incorporates the new attribute into the set of data attributes. The new attribute can be derived, or can be input manually by the operator (e.g., insertion of proprietary attributes), or by other methods depending on the desired implementation. At 1202, the management server calculates the value for the new attribute for each well (e.g., imputing the value based on values of other wells), or incorporates the value from the associated rig system or from the database. At 1203, the management server then proceeds to impute missing values for the new attribute for the wells that are missing values for the new attribute, as illustrated in the example of FIGS. 6(d) to 6(f). At 1204, the management server then proceeds to calculate the similarity score for the new attribute between wells based on the values obtained for the new attribute in accordance with the flow diagram of FIG. 7. At 1205, the weighted similarity score is added to the composite similarity score to update the composite similarity score between wells.
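The additive update at 1205 can be sketched as below. The pairwise scores, the weight, and the function name `add_attribute` are hypothetical, and the sketch does not renormalize the composite score after the update, which a given implementation might choose to do.

```python
def add_attribute(composite, new_similarity, weight):
    """Add the weighted similarity of a new attribute to existing composite scores.

    composite: dict of (well, well) -> current composite score.
    new_similarity: dict of (well, well) -> similarity on the new attribute.
    Pairs without a score for the new attribute are left unchanged.
    """
    return {
        pair: score + weight * new_similarity.get(pair, 0.0)
        for pair, score in composite.items()
    }

# Hypothetical scores before and after adding a 'Rotary RPM' attribute.
composite = {("W1", "W2"): 0.50, ("W1", "W3"): 0.75}
rpm_similarity = {("W1", "W2"): 1.0, ("W1", "W3"): 0.0}

updated = add_attribute(composite, rpm_similarity, weight=0.25)
print(updated)
```

Only the new attribute's pairwise similarities are computed here; the contributions of the existing attributes are reused as they stand, which is the source of the computation-time saving described above.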

[0088] FIGS. 13(a) to 13(d) illustrate an example change to the management information from imputation of values, in accordance with an example implementation. Specifically, FIGS. 13(a) to 13(d) illustrate the changes in management information from the execution of the flow diagram of FIG. 12. In FIG. 13(a), the administrator of the management server includes a new attribute 'Rotary RPM' into the management information in accordance with 1201 of FIG. 12. The management server then retrieves values for Rotary RPM from each of the wells in accordance with the flow at 1202 of FIG. 12. In the example of FIG. 13(a), well W1 has a missing value for Rotary RPM as it was not recorded by the manager of the rig node for well W1. Thus the value is imputed for well W1 as illustrated in FIG. 13(c) and in accordance with the flow of 1203 of FIG. 12, wherein the management server determines the wells that have the highest composite similarity score with W1 to impute the value as illustrated in FIG. 13(b). In the example implementation illustrated in FIG. 13(c), the management server copies the Rotary RPM values from the well with the highest similarity score at well W3 from FIG. 13(b), and imputes the values for W1 as shown in bold and underline. In FIG. 13(d), based on the weighting of the new attribute, the similarity score is calculated between each well for the new attribute in accordance with the flow at 1204 of FIG. 12, and the new similarity score is weighted and added to the composite score as shown in bold and underline in accordance with the flow at 1205 of FIG. 12. Although the example implementation of FIG. 13(d) is directed to selecting the values from the well having the highest composite score, other implementations are also possible.
For example, values for the new attribute can be solved for such that the similarity score between two wells (e.g., between two wells of the highest similarity score) is maintained, or can be taken as an average of values from a set of rigs that have a similarity score within a threshold as defined by the administrator of the management server.

[0089] FIG. 14 illustrates an example output of composite score values in accordance with an example implementation. In the example of FIG. 14, wells W1 and W2 are selected at the 2D map of pane 1401. The selection is made on pane 1401 to select two wells in the well reservoir with the freeform dashed line, wherein the composite score between the two wells as managed by the management server 102 is provided at pane 1400. Further, should a selection be made on pane 1400 to drill down the components of the composite score, output pane 1402 can also be generated to provide the composite scores for the parameters based on the phase of operation. In the example implementation as depicted in FIG. 14, further drill down can also be possible based on selection of the phase of operation score in pane 1402. For example, if the drilling parameter score is selected, then further panes can be generated illustrating the scores for each parameter in the drilling phase of operation. Should the parameter also have dependent attributes as shown in FIG. 11, further drill down to the dependent attributes is also possible. The present disclosure is not limited to the interface as described, and can be implemented in any way according to the desired implementation.

[0090] Some portions of the detailed description are presented in terms of algorithms and symbolic representations of operations within a computer. These algorithmic descriptions and symbolic representations are the means used by those skilled in the data processing arts to convey the essence of their innovations to others skilled in the art. An algorithm is a series of defined steps leading to a desired end state or result. In example implementations, the steps carried out require physical manipulations of tangible quantities for achieving a tangible result.

[0091] Unless specifically stated otherwise, as apparent from the discussion, it is appreciated that throughout the description, discussions utilizing terms such as "processing," "computing," "calculating," "determining," "displaying," or the like, can include the actions and processes of a computer system or other information processing device that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system's memories or registers or other information storage, transmission or display devices.

[0092] Example implementations may also relate to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may include one or more general-purpose computers selectively activated or reconfigured by one or more computer programs. Such computer programs may be stored in a computer readable medium, such as a computer-readable storage medium or a computer-readable signal medium. A computer-readable storage medium may involve tangible mediums such as, but not limited to optical disks, magnetic disks, read-only memories, random access memories, solid state devices and drives, or any other types of tangible or non-transitory media suitable for storing electronic information. A computer readable signal medium may include mediums such as carrier waves. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Computer programs can involve pure software implementations that involve instructions that perform the operations of the desired implementation.

[0093] Various general-purpose systems may be used with programs and modules in accordance with the examples herein, or it may prove convenient to construct a more specialized apparatus to perform desired method steps. In addition, the example implementations are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the example implementations as described herein. The instructions of the programming language(s) may be executed by one or more processing devices, e.g., central processing units (CPUs), processors, or controllers.

[0094] As is known in the art, the operations described above can be performed by hardware, software, or some combination of software and hardware. Various aspects of the example implementations may be implemented using circuits and logic devices (hardware), while other aspects may be implemented using instructions stored on a machine-readable medium (software), which if executed by a processor, would cause the processor to perform a method to carry out implementations of the present application. Further, some example implementations of the present application may be performed solely in hardware, whereas other example implementations may be performed solely in software. Moreover, the various functions described can be performed in a single unit, or can be spread across a number of components in any number of ways. When performed by software, the methods may be executed by a processor, such as a general purpose computer, based on instructions stored on a computer-readable medium. If desired, the instructions can be stored on the medium in a compressed and/or encrypted format.

[0095] Moreover, other implementations of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the teachings of the present application. Various aspects and/or components of the described example implementations may be used singly or in any combination. It is intended that the specification and example implementations be considered as examples only, with the true scope and spirit of the present application being indicated by the following claims.