Title:
MACHINE LEARNING WORKFLOW FOR PREDICTING HYDRAULIC FRACTURE INITIATION
Document Type and Number:
WIPO Patent Application WO/2023/283544
Kind Code:
A1
Abstract:
Systems and methods include a computer-implemented method for predicting hydraulic fracture initiation. A fracking operations dataset is prepared using historical field information for fracking wells. A set of hyper-parameters is tuned for use in a machine learning algorithm configured to predict fracture initiation for new fracturing wells. The dataset is divided into training and test datasets. A regression algorithm is applied to train the training dataset and to validate with the test dataset. A target variable of a breakdown pressure for a new hydraulic fracturing treatment is determined. A prediction dataset is updated using at least the target variable. The training dataset is trained using a classifier of the machine learning algorithm. A prediction is made using the prediction dataset whether the new hydraulic fracturing treatment can be initiated or not. The breakdown pressure is incrementally adjusted, and the method is repeated until successful hydraulic fracture initiation is predicted.

Inventors:
XIA KAIMING (SA)
Application Number:
PCT/US2022/073415
Publication Date:
January 12, 2023
Filing Date:
July 05, 2022
Assignee:
SAUDI ARABIAN OIL CO (SA)
ARAMCO SERVICES CO (US)
International Classes:
E21B43/16; E21B43/26
Domestic Patent References:
WO2018117890A1 2018-06-28
WO2021119313A1 2021-06-17
Foreign References:
US20210087925A1 2021-03-25
US20150218925A1 2015-08-06
US9135475B2 2015-09-15
US202117369678A 2021-07-07
Attorney, Agent or Firm:
BRUCE, Carl E. et al. (US)
Claims:
CLAIMS

What is claimed is:

1. A computer-implemented method, comprising: preparing a fracking operations dataset based on historical information for a group of fracking wells from fields; tuning a set of hyper-parameters of a machine learning algorithm configured to predict fracture initiation for new fracturing wells; dividing the fracking operations dataset into a training dataset and a test dataset, wherein the training dataset is configured to train a fracture initiation prediction model; applying a regression algorithm to train the training dataset and to validate with the test dataset; determining, using the fracture initiation prediction model and the training dataset, a target variable of a breakdown pressure for a new hydraulic fracturing treatment; updating a prediction dataset based on at least the target variable; training the training dataset using a classifier of the machine learning algorithm; predicting, using the prediction dataset, whether the new hydraulic fracturing treatment can be initiated or not; and incrementally adjusting the breakdown pressure and repeating the updating, training, and predicting until successful hydraulic fracture initiation is predicted.

2. The computer-implemented method of claim 1, wherein the set of hyper-parameters includes at least learning rate, gamma, max depth, and max leaves.

3. The computer-implemented method of claim 1, wherein the fracking operations dataset includes, for each fracking well: well survey data including landing depth of TVD, azimuth, and deviation; a stress regime, in-situ stresses and orientation; rock types and rock mechanical properties; perforation related properties; and a breakdown pressure and a fracture initiation classification.

4. The computer-implemented method of claim 1, wherein the machine learning algorithm is an XGBoost algorithm.

5. The computer-implemented method of claim 1, wherein incrementally adjusting the breakdown pressure includes increasing the breakdown pressure in increments of 200 pounds per square inch (psi).

6. The computer-implemented method of claim 1, wherein predicting whether the new hydraulic fracturing treatment can be initiated or not includes using a prediction function based on previous predictions.

7. The computer-implemented method of claim 6, wherein the prediction function includes learning rate multipliers applied to previous predictions.

8. A non-transitory, computer-readable medium storing one or more instructions executable by a computer system to perform operations comprising: preparing a fracking operations dataset based on historical information for a group of fracking wells; tuning a set of hyper-parameters of a machine learning algorithm configured to predict fracture initiation for new fracturing wells; dividing the fracking operations dataset into a training dataset and a test dataset, wherein the training dataset is configured to train a fracture initiation prediction model; applying a regression algorithm to train the training dataset and to validate with the test dataset; determining, using the fracture initiation prediction model and the training dataset, a target variable of a breakdown pressure for a new hydraulic fracturing treatment; updating a prediction dataset based on at least the target variable; training the training dataset using a classifier of the machine learning algorithm; predicting, using the prediction dataset, whether the new hydraulic fracturing treatment can be initiated or not; and incrementally adjusting the breakdown pressure and repeating the updating, training, and predicting until successful hydraulic fracture initiation is predicted.

9. The non-transitory, computer-readable medium of claim 8, wherein the set of hyper-parameters includes at least learning rate, gamma, max depth, and max leaves.

10. The non-transitory, computer-readable medium of claim 8, wherein the fracking operations dataset includes, for each fracking well: well survey data including landing depth of TVD, azimuth, and deviation; a stress regime, in-situ stresses and orientation; rock types and rock mechanical properties; perforation related properties; and a breakdown pressure and a fracture initiation classification.

11. The non-transitory, computer-readable medium of claim 8, wherein the machine learning algorithm is an XGBoost algorithm.

12. The non-transitory, computer-readable medium of claim 8, wherein incrementally adjusting the breakdown pressure includes increasing the breakdown pressure in increments of 200 pounds per square inch (psi).

13. The non-transitory, computer-readable medium of claim 8, wherein predicting whether the new hydraulic fracturing treatment can be initiated or not includes using a prediction function based on previous predictions.

14. The non-transitory, computer-readable medium of claim 13, wherein the prediction function includes learning rate multipliers applied to previous predictions.

15. A computer-implemented system, comprising: one or more processors; and a non-transitory computer-readable storage medium coupled to the one or more processors and storing programming instructions for execution by the one or more processors, the programming instructions instructing the one or more processors to perform operations comprising: preparing a fracking operations dataset based on historical information for a group of fracking wells; tuning a set of hyper-parameters of a machine learning algorithm configured to predict fracture initiation for new fracturing wells; dividing the fracking operations dataset into a training dataset and a test dataset, wherein the training dataset is configured to train a fracture initiation prediction model; applying a regression algorithm to train the training dataset and to validate with the test dataset; determining, using the fracture initiation prediction model and the training dataset, a target variable of a breakdown pressure for a new hydraulic fracturing treatment; updating a prediction dataset based on at least the target variable; training the training dataset using a classifier of the machine learning algorithm; predicting, using the prediction dataset, whether the new hydraulic fracturing treatment can be initiated or not; and incrementally adjusting the breakdown pressure and repeating the updating, training, and predicting until successful hydraulic fracture initiation is predicted.

16. The computer-implemented system of claim 15, wherein the set of hyper-parameters includes at least learning rate, gamma, max depth, and max leaves.

17. The computer-implemented system of claim 15, wherein the fracking operations dataset includes, for each fracking well: well survey data including landing depth of TVD, azimuth, and deviation; a stress regime, in-situ stresses and orientation; rock types and rock mechanical properties; perforation related properties; and a breakdown pressure and a fracture initiation classification.

18. The computer-implemented system of claim 15, wherein the machine learning algorithm is an XGBoost algorithm.

19. The computer-implemented system of claim 15, wherein incrementally adjusting the breakdown pressure includes increasing the breakdown pressure in increments of 200 pounds per square inch (psi).

20. The computer-implemented system of claim 15, wherein predicting whether the new hydraulic fracturing treatment can be initiated or not includes using a prediction function based on previous predictions.

Description:
MACHINE LEARNING WORKFLOW FOR PREDICTING HYDRAULIC FRACTURE INITIATION

CLAIM OF PRIORITY

[0001] This application claims priority to U.S. Patent Application No. 17/369,678 filed on July 7, 2021, the entire contents of which are hereby incorporated by reference.

TECHNICAL FIELD

[0002] The present disclosure applies to predicting breakdown pressures and fracture initiation in fracking wells.

BACKGROUND

[0003] In conventional oil fracking practices, analytical solutions have been developed for calculating breakdown pressure (for example, based on elasticity) that have generally been applicable to vertical open holes for drilling. Improvements in industry solutions have included developing a computational framework for calculating the breakdown pressure for hydraulic fracturing treatment. The framework can be applicable to deviated, cased-hole and clustered perforation hydraulic fracturing treatments. However, these analytical solutions still cannot fully capture the effect of three-dimensional (3D) complex configurations of perforated wellbores and rock damage behaviors. Further, models used in conventional solutions require the definition (for example, user entry) of several parameters along the borehole, such as parameters classified as mechanical properties, geometric parameters, and in-situ stresses and orientation. The determination of breakdown pressure can be challenging and can involve uncertainties. In practice, it can be difficult to accurately account for certain factors for calculating breakdown pressure.

SUMMARY

[0004] The present disclosure describes techniques that can be used for predicting hydraulic fracture breakdown pressure and fracture initiation. In some implementations, a computer-implemented method includes the following. A fracking operations dataset is prepared based on historical information for a group of fracking wells from fields. The fracking operations dataset is divided into a training dataset and a test dataset, where the training dataset is configured to train a fracture initiation prediction model. The training dataset should be a random selection of a high percentage of the original data, such as 80% or more, and the test dataset should be the remaining 20% or less of the original data. A regression algorithm is first applied to the training dataset to create the prediction model, and the model is then tested with the test dataset to show the accuracy of the prediction model. A set of hyper-parameters is tuned for use in a machine learning algorithm configured to predict fracture initiation for new fracturing wells. A target variable of a breakdown pressure for a new hydraulic fracturing treatment is determined using the fracture initiation prediction model and the training dataset. The training dataset is updated based on at least the target variable of breakdown pressure. A classifier of the machine learning algorithm is then trained using the training dataset. A prediction is made, using the updated training dataset, of whether the new hydraulic fracturing treatment can be initiated. The breakdown pressure is incrementally adjusted, and the updating, training, and predicting are repeated until successful hydraulic fracture initiation is predicted.
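The random division into training and test data described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `split_dataset`, the fixed seed, and the placeholder well records are hypothetical and do not come from the disclosure, which specifies only that roughly 80% or more of the data is randomly selected for training.

```python
import random


def split_dataset(records, train_fraction=0.8, seed=0):
    """Randomly split records into (training, test) lists.

    train_fraction=0.8 mirrors the "80% or more" guidance in the text;
    the seed only makes the illustration repeatable.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # random selection of wells
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Placeholder records standing in for historical fracking-well data.
wells = [{"well_id": i} for i in range(50)]
train, test = split_dataset(wells)
print(len(train), len(test))
```

With 50 placeholder records and the default fraction, 40 records go to training and 10 to testing.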

[0005] The previously described implementation is implementable using a computer-implemented method; a non-transitory, computer-readable medium storing computer-readable instructions to perform the computer-implemented method; and a computer-implemented system including a computer memory interoperably coupled with a hardware processor configured to perform the computer-implemented method, the instructions stored on the non-transitory, computer-readable medium.

[0006] The subject matter described in this specification can be implemented in particular implementations, so as to realize one or more of the following advantages. The present disclosure describes machine learning (ML) techniques for predicting breakdown pressure and fracture initiation, which is critical to resolving breakdown issues when fracturing deep and tight gas reservoirs. The present disclosure presents an approach to predicting hydraulic fracturing breakdown issues that is an improvement over conventional techniques and that allows inclusion of more factors than analytical solutions. Techniques of the present disclosure can account for more factors or parameters related to hydraulic fracturing treatment than techniques used in conventional analytical solutions. The techniques can be used to reasonably predict the results even when a few parameters are missing.

[0007] The details of one or more implementations of the subject matter of this specification are set forth in the Detailed Description, the accompanying drawings, and the claims. Other features, aspects, and advantages of the subject matter will become apparent from the Detailed Description, the claims, and the accompanying drawings.

DESCRIPTION OF DRAWINGS

[0008] FIG. 1 is a diagram of an example system for hydraulic fracturing treatment using a plug-and-perf method, according to some implementations of the present disclosure.

[0009] FIG. 2 is a diagram of an example of a deviated, cased hole, and clustered perforation hydraulic fracturing well, according to some implementations of the present disclosure.

[0010] FIG. 3 is a diagram of an example of in-situ stresses and a maximum horizontal stress angle for a deviated well and perforation hydraulic fracturing treatment, according to some implementations of the present disclosure.

[0011] FIG. 4 is a diagram showing an example of a new tree that is generated along the direction of negative gradient of loss function, according to some implementations of the present disclosure.

[0012] FIG. 5 is a diagram showing an example of a workflow for applying the XGBoost method to predict breakdown pressure and to classify whether hydraulic fractures can be successfully initiated or not, according to some implementations of the present disclosure.

[0013] FIG. 6 is a flowchart of an example of a method for predicting hydraulic fracture initiation, according to some implementations of the present disclosure.

[0014] FIG. 7 is a block diagram illustrating an example computer system used to provide computational functionalities associated with described algorithms, methods, functions, processes, flows, and procedures as described in the present disclosure, according to some implementations of the present disclosure.

[0015] Like reference numbers and designations in the various drawings indicate like elements.

DETAILED DESCRIPTION

[0016] The following detailed description describes techniques for predicting hydraulic fracture initiation. The techniques can be used to estimate breakdown pressure and hydraulic fracture initiation for hydraulic fracturing treatment, especially for fracturing deep and tight gas/oil reservoirs. The techniques can be extended to predict more variables related to hydraulic fracturing treatment. Various modifications, alterations, and permutations of the disclosed implementations can be made and will be readily apparent to those of ordinary skill in the art, and the general principles defined may be applied to other implementations and applications, without departing from scope of the disclosure. In some instances, details unnecessary to obtain an understanding of the described subject matter may be omitted so as to not obscure one or more described implementations with unnecessary detail and inasmuch as such details are within the skill of one of ordinary skill in the art. The present disclosure is not intended to be limited to the described or illustrated implementations, but to be accorded the widest scope consistent with the described principles and features.

[0017] Estimating breakdown pressure is essential for a successful hydraulic fracturing treatment, especially for fracturing deep and tight reservoirs. For hydraulic fracturing treatment, the best practices of hydraulic fracturing design can include estimating the breakdown pressure as accurately as possible, in order to select the right casing, treatment tubing, and wellhead based on the estimates. Unfortunately, typical conventional hydraulic fracturing simulators cannot accurately predict the breakdown pressure, for example, due to model simplifications. New computational frameworks may be developed for calculating breakdown pressure applicable to deviated, cased hole, and clustered perforation hydraulic fracturing treatment (see FIG. 2). Models used with reference to FIG. 2 represent the first time that a breakdown pressure mechanics model accounts for the casing-cement-formation interaction effects that can be used in the calculation of breakdown pressure. The calculation of breakdown pressure requires several parameters along the borehole, which can be classified as mechanical properties, geometric parameters, and in-situ stresses and orientation. Realistic inputs of these parameters involve significant uncertainties due to spatial variability, logging measurements, and data processing. Analytical solutions cannot account for the effects of three-dimensional (3D) complex configurations of perforated wellbore and rock damage behavior. The current disclosure offers a new approach to predict breakdown pressure using machine learning (ML), which is a subset of artificial intelligence (AI) and uses computer algorithms to analyze hydraulic fracturing data from the field and make intelligent decisions based on what is learned. Instead of following rule-based algorithms, machine learning builds models to regress, classify, and make predictions from the data collected from fields. There is no need to worry about whether the mechanics model has defects. For the breakdown or fracture initiation issue, the parameters that are likely related to the breakdown pressure calculation are known. Machine learning can make it possible to represent different factors related to hydraulic fracturing breakdown pressure. Some parameters not appearing in an analytical model, for example, can be included in machine learning. A top priority of machine learning is to collect enough hydraulic fracturing data, including the corresponding breakdown pressure that was used and success or failure information (for example, as a success/failure flag). With machine learning, and given the dataset of hydraulic fracturing data, a model can be created. Further, using data inputs, the model can learn and predict the breakdown pressure for a new hydraulic fracturing design.

[0018] To better include all the parameters likely related to breakdown pressure and fracture initiation issue, ML can be used. As a subset of AI, ML is an area of computational science that focuses on analyzing and interpreting patterns and structures in data to enable learning, reasoning, and decision-making outside of human interaction. Machine learning allows the user to feed a computer algorithm an immense amount of data, from which the computer analyzes the data and makes data-driven recommendations and decisions based on only the input data. If any corrections are identified, the algorithm can incorporate that information to improve its future decision making.

[0019] In some implementations, a workflow for predicting hydraulic fracture initiation can include the following. Potential factors are identified that impact hydraulic fracturing breakdown pressure and hydraulic fracture initiation. A workflow is established to conduct machine learning to predict the breakdown pressure required for a hydraulic fracture initiation. A machine learning workflow is established to classify the fractability issue for deep and tight gas reservoirs.

[0020] FIG. 1 is a diagram of an example system 100 for hydraulic fracturing treatment using a plug-and-perf method, according to some implementations of the present disclosure. The method can be used in a wellbore 102 drilled to a selected true vertical depth 104, for example, measured from the Kelly bushing point at the drilling rig. Fractures 106 and packers 108 are located downstring in the wellbore 102.

[0021] Accurate prediction of breakdown pressure, such as in the wellbore 102, is a first step in designing a successful hydraulic fracturing treatment. For deep and tight gas reservoirs, hydraulic fracturing treatments may fail at the beginning due to the breakdown issue. Conventional solutions do not capture all the related factors used in calculating the breakdown pressure for deviated, cased hole, and clustered perforation hydraulic fracturing treatments.

[0022] In conventional solutions, machine learning has been used to predict closure pressure, production rate, and wellhead pressure, which are somewhat related to hydraulic fracturing treatment. However, none of these parameters is directly related to predicting breakdown pressure for hydraulic fracturing treatment.

[0023] The present disclosure describes a new approach for predicting breakdown pressure related to hydraulic fracturing treatment using ML, and making intelligent decisions based on what has been learned from field data on hydraulic fracturing treatment. This allows the representation of even more factors than the mechanical properties, geometries, and perforation friction used in the available analytical models. Instead of following mechanics-based algorithms, machine learning can build models to regress/classify and make predictions from previous field data. For the breakdown or fracture initiation issue, parameters likely relating to breakdown pressure calculation are known. Machine learning allows the representation of different factors related to hydraulic fracturing breakdown pressure. In some cases, hydraulic fracturing parameters might be missing; however, the machine learning algorithm used here can still predict the results. Some parameters not appearing in the analytical model can also be included in the machine learning. With machine learning, and given the dataset of hydraulic fracturing data, a model can be created and trained that, given inputs, can predict the breakdown pressure for a new hydraulic fracturing design. Further, the model can be used to classify the fractability in terms of fracture initiation being successful or not.
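The two roles of the model described above, regression for breakdown pressure and classification of fractability, combine into the iterative search recited in the claims. The sketch below is a simplified illustration, not the disclosed implementation: `predict_breakdown` and `predict_initiation` are hypothetical stand-ins for trained models, the feature names are invented, and the classifier retraining that the claims repeat in each pass is omitted. Only the 200 psi increment is taken from the claims.

```python
from typing import Callable, Dict, Optional

Features = Dict[str, float]


def find_initiation_pressure(
    features: Features,
    predict_breakdown: Callable[[Features], float],
    predict_initiation: Callable[[Features], int],
    step_psi: float = 200.0,   # increment recited in the dependent claims
    max_steps: int = 50,       # hypothetical safety limit, not from the source
) -> Optional[float]:
    """Return the first pressure classified as a successful initiation, else None."""
    pressure = predict_breakdown(features)           # regression estimate
    for _ in range(max_steps + 1):
        # Update the prediction record with the current candidate pressure.
        candidate = dict(features, breakdown_pressure=pressure)
        if predict_initiation(candidate) == 1:       # 1 = fractures initiate
            return pressure
        pressure += step_psi                         # raise the pressure and retry
    return None


# Toy stand-ins for trained models, for illustration only.
def toy_regressor(features: Features) -> float:
    return 9000.0


def toy_classifier(features: Features) -> int:
    return 1 if features["breakdown_pressure"] >= 9500.0 else 0


print(find_initiation_pressure({"tvd_ft": 12000.0}, toy_regressor, toy_classifier))
```

With these toy models the search starts at 9000 psi, fails three times, and returns 9600.0 on the fourth attempt. Passing the models in as callables keeps the loop independent of any particular library.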

Parameters Impacting Hydraulic Fracturing Breakdown

[0024] Hydraulic fracturing is a well stimulation technique that has been used to stimulate production from tight reservoirs with low permeability. The process involves the high-pressure injection of fracking fluid (primarily slick water, containing proppants) into a wellbore first, with the fluid then flowing through perforations to create fractures in the reservoir (see FIG. 2). The small proppants hold the fracture open after hydraulic pressure declines or hydraulic fluid flows back. Fracturing occurs whenever effective stress in the formation is overcome by the fluid pressure within the fractures. The minimum principal stress becomes tensile and exceeds the tensile strength of the rock. Hydraulic fractures are generally oriented in the direction perpendicular to the minimum principal stress and therefore propagate along the maximum horizontal (hor.) stress direction in the subsurface. For cased hole/perforation hydraulic fracturing treatment, the breakdown pressure refers to the internal casing pressure when a fracture initiates from the perforation tunnel. From a mechanics point of view, the breakdown pressure of a perforation is a three-dimensional mechanics problem and is very difficult to obtain as a closed-form analytical solution. Rock breakdown, or fracture initiation, is the first requirement for a successful hydraulic fracturing treatment. For hydraulic fracturing treatment, accurately estimating the breakdown pressure of the formation is critical, because it guides the selection of the right tubing size and grade, the wellhead grade and burst pressure limit, and the pump schedule design. The best practice of hydraulic fracturing design is to estimate the breakdown pressure as accurately as possible in the first place, and to select the right treatment tubing and wellhead thereafter. Otherwise, if the breakdown pressure is underestimated, the hydraulic fracturing pump schedule cannot be injected as planned. This can eventually lead to abandoning the hydraulic fracturing treatment.

[0025] Unfortunately, most of the currently used hydraulic fracturing simulators cannot accurately predict the breakdown pressure due to model simplifications. The reasons include, for example: (1) the 3D complex configuration of the perforated wellbore cannot be captured in the modeling; (2) numerical models have to use a very coarse mesh size; (3) rock damage behavior cannot be considered; and (4) only a maximum tensile stress criterion is used in the available analytical solutions. However, published models on this topic can clearly help in understanding which parameters are important in determining rock breakdown and fracture initiation.

[0026] FIG. 2 is a diagram of an example of a deviated, cased hole, and clustered perforation hydraulic fracturing well 200, according to some implementations of the present disclosure. FIG. 2 shows perforations 202 downstring in the wellbore 102.

[0027] FIG. 3 is a diagram of an example of in-situ stresses and a maximum horizontal stress angle for a deviated well 300 and perforation hydraulic fracturing treatment, according to some implementations of the present disclosure. For the deviated well 300 with cased hole and clustered perforation hydraulic fracturing treatment, the breakdown pressure calculation is much more complex than for an open vertical hole. For the analytical approach, the far-field in-situ stresses are first transformed from a global coordinate system to a local coordinate system associated with the individual perforation (see FIG. 3). The coordinate system can be used to define coordinates of points 302 and 304 along the wellbore, with the second point 304 being at a perforation 306. In some frameworks, the global coordinate system can be defined such that the x-axis aligns with the North direction, the y-axis aligns with the East direction, and the z-axis points vertically downwards.

[0028] This coordinate system is denoted as the global coordinate system $(x_G, y_G, z_G)$. To be general, assume the azimuth of the maximum principal stress is $\theta_{\sigma_H}$, which is the angle turning clockwise from North to the maximum principal stress. For a well survey, any point along the well trajectory can be determined by three parameters: MD (measured depth), $\alpha_D$ (wellbore deviation), and $\alpha_A$ (wellbore azimuth). The coordinate system at any point along the well trajectory can be tracked and obtained by the following rotations about the global coordinate system $(x_G, y_G, z_G)$, in two steps: (1) rotation of deviation $\alpha_D$ about the $y_G$-axis; and (2) rotation of azimuth $\alpha_A$ about the $z_G$-axis. The coordinate system for different phase angles of perforation tunnels can be obtained by two further rotations: (1) rotation of $\alpha_{y_B} = \pi/2$ about the $y_B$-axis of the wellbore coordinate system; and (2) rotation of phase angle $\alpha_{z_B}$ about the $z_B$-axis of the wellbore coordinate system. The breakdown pressure for each phase perforation of the perforation cluster should be calculated, and the minimum of the calculated pressures is taken as the required breakdown pressure for the perforation cluster. The landing depth of TVD, azimuth, and deviation are therefore important parameters in the analytical solutions and should be included in the ML dataset.
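The first pair of rotations can be illustrated with a small stress-tensor transformation. This is a sketch under assumptions that the text does not spell out: rotations are treated as active, right-handed rotations about the fixed global axes (x North, y East, z down), the far-field principal stresses are taken as maximum horizontal, minimum horizontal, and vertical, and all function names and stress values are hypothetical. The further rotations into the perforation frame are omitted.

```python
import math


def rot_y(angle):
    """Active rotation matrix about the y-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(angle):
    """Active rotation matrix about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def global_stress(sigma_hmax, sigma_hmin, sigma_v, theta_h):
    """Stress tensor in global (North, East, Down) axes.

    theta_h is the azimuth of the maximum horizontal stress, clockwise from North.
    """
    r = rot_z(theta_h)
    principal = [[sigma_hmax, 0.0, 0.0], [0.0, sigma_hmin, 0.0], [0.0, 0.0, sigma_v]]
    return matmul(matmul(r, principal), transpose(r))


def wellbore_stress(sigma_g, deviation, azimuth):
    """Express the global tensor in the wellbore frame.

    The frame is built by rotating deviation about y_G, then azimuth about z_G;
    the columns of r are the wellbore axes written in global coordinates.
    """
    r = matmul(rot_z(azimuth), rot_y(deviation))
    return matmul(matmul(transpose(r), sigma_g), r)


# Illustrative values in psi and degrees, not field data.
sigma_g = global_stress(9000.0, 7000.0, 10000.0, math.radians(30.0))
sigma_b = wellbore_stress(sigma_g, math.radians(60.0), math.radians(45.0))
for row in sigma_b:
    print([round(value, 1) for value in row])
```

For a vertical well (zero deviation and azimuth) the tensor is returned unchanged, and for a horizontal well drilled due North the axial stress component equals the North-North component of the global tensor, which gives two quick sanity checks on the convention.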

[0029] For cased hole, the impact of casing and cement on breakdown pressure has been considered in recent innovations. The mechanical properties of casing and cement should be accounted for when estimating breakdown pressure. Also, the corresponding inner diameter (ID) and outer diameter (OD) of casing and cement should be considered. Fracture initiation from the wellbore-perforation interface hinges on the fluid pressure inside the perforation tunnel. Calculations of the fluid pressure acting on the perforation tunnel should account for the pressure loss across the perforation, which is important for hydraulic fracturing pump schedule design. Currently, a sharp-edge orifice equation is widely used to estimate the pressure drop as follows:

$$\Delta p_{perf} = \frac{0.2369\, \rho\, Q^2}{C_d^2\, N^2\, d^4} \qquad (1)$$

where $\Delta p_{perf}$ is the pressure drop across the perforations in pounds per square inch (psi); $\rho$ is the fluid density in pounds per gallon (lb/gal); $d$ is the initial perforation diameter in inches (in.); $C_d$ is the perforation coefficient of discharge; $Q$ is the flow rate in barrels per minute; and $N$ is the number (no.) of perforations (perf). From the above equation, it can be seen that the perforation diameter, flow rate, and number of perforations should be included in the ML data list.

Hydraulic Fracturing Training Dataset
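Perforation diameter, flow rate, and number of perforations enter the dataset because of how strongly they drive perforation friction. The sketch below evaluates the sharp-edge orifice relation of paragraph [0029], assuming the commonly used field-unit form with the constant 0.2369, which yields the pressure drop in psi. The function name and the input values are hypothetical and purely illustrative.

```python
def perforation_pressure_drop(q_bpm, rho_ppg, n_perfs, d_in, c_d):
    """Perforation friction pressure drop in psi (sharp-edge orifice form).

    q_bpm   : total flow rate, barrels per minute
    rho_ppg : fluid density, pounds per gallon
    n_perfs : number of open perforations
    d_in    : perforation diameter, inches
    c_d     : coefficient of discharge (dimensionless)
    The constant 0.2369 is the usual field-unit value and is an assumption here.
    """
    if n_perfs <= 0 or d_in <= 0 or c_d <= 0:
        raise ValueError("n_perfs, d_in, and c_d must be positive")
    return 0.2369 * q_bpm ** 2 * rho_ppg / (n_perfs ** 2 * d_in ** 4 * c_d ** 2)


# Illustrative inputs only: slick water at 60 bbl/min through 0.40 in. perforations.
base = perforation_pressure_drop(60.0, 8.34, 30, 0.40, 0.80)
fewer = perforation_pressure_drop(60.0, 8.34, 15, 0.40, 0.80)
print(round(base, 1), round(fewer / base, 2))
```

Because the drop scales with the inverse square of the perforation count and the inverse fourth power of the diameter, halving the number of open perforations quadruples the friction, which is why these quantities are worth carrying as model features.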

[0030] It can be seen that many related parameters impact the hydraulic fracturing treatment. The key parameters are collectively listed in Tables 1A and 1B (where Table 1B contains the additional columns of Table 1A). The parameters are classified into several groups, which are related to well trajectory and orientation, rock type, stress regime type, in-situ stresses and orientation, reservoir (res.) pressure, rock mechanical properties, and perforation parameters. As evidenced in recent innovations, well trajectory and orientation control the perforation orientation and have a large impact on the required magnitude of breakdown pressure as well as on the in-situ stress tensor. Generally, hydraulic fracturing treatment is likely to be executed in one of two stress regimes: normal fault or strike-slip. In this dataset, normal fault can be classified as 1 and strike-slip as 2. Rock types do not appear directly in the analytical formulations but likely impact the breakdown pressure and hydraulic fracturing outcome, an effect that is generally reflected through mechanical properties. In this machine learning approach, rock type is also included in the dataset to train the model (for example, classified as 1 for shale, 2 for sandstone, and 3 for carbonate). Rock properties such as porosity are also included in the dataset for machine learning prediction. As indicated in Equation (1), perforation quality can directly impact the breakdown pressure. The number of perforations is included for each perforation cluster, along with the perforation interval and perforation diameter. These different groups of variables can be factored into predicting the required breakdown pressure for a successful hydraulic fracture initiation. Therefore, the corresponding breakdown pressure is included in the dataset for machine learning as the prediction target variable. Also, the success or failure of fracture initiation can be included in the machine learning dataset as another predicted target variable. The success/failure is labeled as 1 if hydraulic fractures successfully initiate and as 0 if hydraulic fractures fail to initiate.
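As a minimal sketch of how one row of such a dataset might be encoded, the Python below uses the integer codes given in the text for stress regime, rock type, and the success/failure label. The field names, the example values, and the helper `perforation_pressure_drop` are hypothetical and introduced only for illustration. The helper assumes the commonly quoted field-unit form of the sharp-edged orifice relation (constant 0.2369, flow rate in barrels per minute, density in lb/gal, diameter in inches).

```python
from dataclasses import dataclass, asdict

# Integer codes follow the examples given in the text.
STRESS_REGIME_CODES = {"normal_fault": 1, "strike_slip": 2}
ROCK_TYPE_CODES = {"shale": 1, "sandstone": 2, "carbonate": 3}


@dataclass
class FracStage:
    """One row of the training dataset (field names are hypothetical)."""
    tvd_ft: float
    azimuth_deg: float
    deviation_deg: float
    stress_regime: int
    rock_type: int
    porosity: float
    n_perforations: int
    perforation_diameter_in: float
    breakdown_pressure_psi: float  # regression target
    initiated: int                 # classification target: 1 success, 0 failure


def perforation_pressure_drop(q_bpm, rho_ppg, n_perf, d_in, c_d):
    """Sharp-edged orifice pressure drop in psi (assumed field-unit form)."""
    return 0.2369 * q_bpm ** 2 * rho_ppg / (n_perf ** 2 * d_in ** 4 * c_d ** 2)


# Made-up values for a single fracturing stage.
stage = FracStage(
    tvd_ft=10500.0, azimuth_deg=45.0, deviation_deg=88.0,
    stress_regime=STRESS_REGIME_CODES["strike_slip"],
    rock_type=ROCK_TYPE_CODES["sandstone"],
    porosity=0.08, n_perforations=30, perforation_diameter_in=0.4,
    breakdown_pressure_psi=11200.0, initiated=1,
)
row = asdict(stage)  # plain dictionary, ready to append to a table
dp = perforation_pressure_drop(60.0, 8.5, stage.n_perforations,
                               stage.perforation_diameter_in, 0.8)
```

Keeping the two targets (`breakdown_pressure_psi` and `initiated`) in the same record makes it straightforward to reuse one table for both the regression and the classification parts of the workflow.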

Table 1A - Variables for machine learning for predicting hydraulic fracturing treatment (cont.)

Table 1B - Variables for machine learning for predicting hydraulic fracturing treatment

Introduction to Machine Learning - XGBoost Method

[0031] Machine learning has applications in many industrial areas. Machine learning is typically made up of three parts: (1) the computational algorithm at the core of making determinations; (2) variables and features that make up the decision; and (3) a knowledge base for which the answer is known that enables training the system to learn. Initially, the model is fed parameter data for which the answer is known. The algorithm is then run, and adjustments are made until the algorithm's output (learning) agrees with the known answer. At this point, increasing amounts of data can help the system learn and process more complex computational decisions.

[0032] XGBoost has emerged as one of the most popular machine learning algorithms currently used in the industry, regardless of the type of prediction task at hand, regression or classification. The algorithm is widely reported to perform well compared with many other machine learning algorithms and is often regarded as a "state-of-the-art" machine learning algorithm for structured data. XGBoost has enhanced performance and speed in tree-based (sequential decision trees) machine learning algorithms. In gradient boosting algorithms, the loss function is optimized. XGBoost is an advanced implementation of gradient boosting along with some regularization factors.

[0033] XGBoost is used for supervised learning problems such as regression, classification, and ranking, where the training data is used with multiple features $x_i$ to predict a target variable $y_i$. The process can start with an arbitrary initial prediction. This can be the average in the case of regression. At first, the residuals are calculated and placed into a single (root) leaf, and the similarity score is calculated with $\lambda$ set to 0. An attempt is made to split the residuals into two groups by clustering similar residuals. The similarity score is calculated for each group; the groups are named the left leaf and the right leaf. A quantification is made regarding how much better the leaves cluster similar residuals than the root by calculating the gain. The gain is given by:

$$\text{Gain} = \text{Left leaf similarity} + \text{Right leaf similarity} - \text{Root leaf similarity} = \frac{G_L^{2}}{H_L + \lambda} + \frac{G_R^{2}}{H_R + \lambda} - \frac{(G_L + G_R)^{2}}{H_L + H_R + \lambda} \quad (2)$$

[0034] In the above equation, $G_L$ is the sum of the residuals in the left leaf, $G_R$ is the sum of the residuals in the right leaf, $H_L$ is the number of residuals in the left leaf, and $H_R$ is the number of residuals in the right leaf. The gain can then be compared with the gains corresponding to other thresholds to select the largest gain for a better split. Pruning can start by picking a number as a threshold $\gamma$, which represents the user-definable penalty and is meant to encourage pruning. The difference (Gain - $\gamma$) between the gain associated with the lowest branch in the tree and the value of $\gamma$ is calculated. If the difference between the gain and $\gamma$ is negative, the branch is removed. If the difference is positive, the branch is kept and pruning is continued.

[0035] The process continues with the original residuals, and a tree is built just like before. The similarity score and gain are calculated in the same way. When $\lambda$ is greater than 0, the similarity and gain will be smaller, making it easier to prune leaves. The value of $\lambda$ can help prevent over-fitting of the training data. After the tree is built, the output value of each leaf of the tree is determined, which is calculated by:

$$\text{Output value} = \frac{\text{Sum of residuals}}{\text{Number of residuals} + \lambda} \quad (3)$$
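The similarity score, gain, pruning test, and leaf output value described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the XGBoost library's implementation: it assumes, as the text does, that $H$ is simply the number of residuals in a leaf, and the function names and residual values are made up for the example.

```python
def similarity(residuals, lam=0.0):
    """Similarity score: (sum of residuals)^2 / (number of residuals + lambda)."""
    return sum(residuals) ** 2 / (len(residuals) + lam)


def gain(left, right, lam=0.0):
    """Gain = left leaf similarity + right leaf similarity - root similarity."""
    root = left + right
    return similarity(left, lam) + similarity(right, lam) - similarity(root, lam)


def keep_branch(gain_value, gamma):
    """Pruning test: the branch is kept only if (Gain - gamma) is positive."""
    return gain_value - gamma > 0


def output_value(residuals, lam=0.0):
    """Leaf output: sum of residuals / (number of residuals + lambda)."""
    return sum(residuals) / (len(residuals) + lam)


# Made-up residuals (psi) on either side of a candidate split threshold.
left, right = [-300.0, -200.0], [250.0, 350.0]
g0 = gain(left, right, lam=0.0)  # no regularization
g1 = gain(left, right, lam=1.0)  # lambda > 0 shrinks the gain
```

With these numbers the gain with $\lambda = 1$ is smaller than the gain with $\lambda = 0$, so a given $\gamma$ prunes the branch more readily, which matches the behavior described in the text.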

[0036] A prediction can then be based on the tree. The first prediction is:

$$\text{Prediction}_1 = \hat{y}_i^{(0)} + \text{Learning Rate} \times \text{Output} \quad \text{(first prediction)} \quad (4)$$

[0037] The process can then compare the new residuals with the previous residuals to determine whether small steps are being taken in the right direction. The process can keep building other trees based on the new residuals. Making new predictions can result in smaller residuals, and the process can repeat until the residuals are very small or a limit (for example, a maximum number of trees) has been reached.

[0038] Assuming that a dataset contains a total of $n$ samples (that is, $n$ rows), $i$ can index each sample in the dataset. XGBoost can use a loss function to build trees by minimizing the following value:

$$\mathcal{L} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^{2} \quad (5)$$

[0039] In this equation, the first part represents the loss function, which calculates the pseudo residuals between the predicted value $\hat{y}_i$ and the true value $y_i$ in each leaf. The second part contains two terms, as described previously. The last term in Equation (5) contains the regularization term $\lambda$, which is intended to reduce the sensitivity of the prediction to individual observations. The term $w$ represents the leaf weight, which can be considered as the output value for the leaf. Also, $T$ represents the number of terminal nodes or leaves in a tree, and $\gamma$ represents the user-definable penalty, which is meant to encourage pruning.
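To make the three parts of this objective concrete, the following sketch evaluates a data-fit term plus the $\gamma T$ and $\lambda$ penalties for one small tree. The squared-error loss is an assumption made for the example (the text does not fix a particular loss), and the function name and numbers are hypothetical.

```python
def objective(y_true, y_pred, leaf_weights, gamma, lam):
    """Regularized objective: sum of losses + gamma * T + 0.5 * lambda * sum(w^2)."""
    # Assumption: squared-error loss, 0.5 * (y - y_hat)^2, per sample.
    loss = sum(0.5 * (y - p) ** 2 for y, p in zip(y_true, y_pred))
    n_leaves = len(leaf_weights)  # T, the number of terminal nodes
    penalty = gamma * n_leaves + 0.5 * lam * sum(w ** 2 for w in leaf_weights)
    return loss + penalty


# Two samples, a tree with two leaves, made-up numbers.
value = objective(
    y_true=[10.0, 12.0], y_pred=[9.0, 13.0],
    leaf_weights=[1.0, -1.0], gamma=2.0, lam=1.0,
)
```

Here the loss term contributes 1.0, the $\gamma T$ term contributes 4.0, and the $\lambda$ term contributes 1.0, which shows how adding leaves or enlarging leaf weights raises the objective even when the fit is unchanged.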

[0040] For the XGBoost method, an additive strategy can be used, fixing what has been learned, and adding one new tree at a time. A new tree can be generated along the direction of a negative gradient of a loss function defined by Equation (5). The loss becomes smaller and smaller as the number of tree models increases. The residuals can be used to construct a decision tree. The process can be repeated until the training reaches the maximum number of estimators (for example, using a default of 100). Once training is completed for the model, the predictions made by the XGBoost model as a whole include the sum of the initial prediction and the predictions made by each individual decision tree multiplied by the learning rate.

[0041] FIG. 4 is a diagram showing an example of a new tree 402k that is generated along the direction of negative gradient of loss function, according to some implementations of the present disclosure. The tree 402k can be generated as part of a prediction process 400 using a tree model 404 that progresses with predictions 402a, 402b, 402c, and so on before a final prediction at the new tree 402k.

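[0042] The prediction value at step $t$ can be written as $\hat{y}_i^{(t)}$. Then:

$$\hat{y}_i^{(0)} = \text{Initial Prediction} \quad (6)$$

$$\hat{y}_i^{(1)} = \hat{y}_i^{(0)} + \text{Learning Rate} \times \text{Prediction}_1 \quad \text{(first prediction)} \quad (7)$$

$$\hat{y}_i^{(2)} = \hat{y}_i^{(1)} + \text{Learning Rate} \times \text{Prediction}_2 \quad \text{(second prediction)} \quad (8)$$

$$\hat{y}_i^{(k)} = \hat{y}_i^{(k-1)} + \text{Learning Rate} \times \text{Prediction}_k \quad (k\text{th prediction}) \quad (9)$$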
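The additive update in these equations can be illustrated with a small, self-contained boosting loop. This is a simplified sketch, not the XGBoost library: it assumes a single feature, one-split trees (stumps), and squared-error residuals, and the names `fit_stump` and `boost` as well as the data are hypothetical.

```python
def fit_stump(xs, residuals, lam=0.0):
    """Fit a one-split tree to the residuals by maximizing the gain."""
    def sim(r):
        return sum(r) ** 2 / (len(r) + lam)

    def out(r):
        return sum(r) / (len(r) + lam)

    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0  # candidate threshold between neighbors
        left = [r for x, r in zip(xs, residuals) if x < thr]
        right = [r for x, r in zip(xs, residuals) if x >= thr]
        g = sim(left) + sim(right) - sim(residuals)
        if best is None or g > best[0]:
            best = (g, thr, out(left), out(right))
    _, thr, w_left, w_right = best
    return lambda x: w_left if x < thr else w_right


def boost(xs, ys, n_trees=20, learning_rate=0.3):
    """Additive training: prediction_k = prediction_(k-1) + rate * tree_k."""
    initial = sum(ys) / len(ys)  # initial prediction: the average
    preds = [initial] * len(ys)
    trees = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for p, x in zip(preds, xs)]
    return lambda x: initial + learning_rate * sum(t(x) for t in trees)


# Made-up data: one feature and a target.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [10.0, 12.0, 20.0, 22.0]
model = boost(xs, ys)
mean_y = sum(ys) / len(ys)
sse_before = sum((y - mean_y) ** 2 for y in ys)
sse_after = sum((y - model(x)) ** 2 for x, y in zip(xs, ys))
```

Each pass fits a tree to the current residuals and adds only a fraction of its output, so the squared error after boosting is lower than the error of the initial average prediction.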

Workflow for Applying Machine Learning XGBoost to Predict Breakdown Pressure and Fracture Initiation

[0043] FIG. 5 is a diagram showing an example of a workflow 500 for applying the XGBoost method to predict breakdown pressure and to classify whether hydraulic fractures can be successfully initiated or not, according to some implementations of the present disclosure. A first step of the workflow 500 is to collect as much hydraulic fracturing field data as is feasible. For example, at 502, a dataset is prepared for machine learning, including a well survey (TVD, azimuth, and deviation); a stress regime, in-situ stresses and orientation; rock types and rock mechanical properties; perforation related properties; and a breakdown pressure and a fracture initiation classification. A large amount of uncertainty can exist in the input data describing the design of a hydraulic fracturing treatment. Collecting enough field data can take some time and can require special skills to filter out unreliable noise from the field data, since the data are full of uncertainties. At 504, hyper-parameters for the XGBoost algorithm are adjusted or tuned. The dataset is divided into training data and test data. Generally, most (for example, 90%) of the dataset can be used as training data, and the rest (for example, 10%) can be used as test data. At 506, XGBoost regression is applied to train on the training data and validate with the test data. For example, the training data can be used to train the boosted tree algorithms to make accurate predictions of the targeted variable $y_i$. Once the trees are determined, the accuracy of the prediction can be tested with the help of the test data. The more data from field wells that is used for training, the more objective and accurate the XGBoost model will be. Once the XGBoost model is validated, the model can be used to predict the targeted variable $y_i$. In this disclosure, the first targeted variable is the breakdown pressure for hydraulic fracturing treatment. XGBoost regression will be used to establish the relationships between the features $x_i$ and the result $y_i$, where $y_i$ is a continuous variable. Once the breakdown pressures for new hydraulic fracturing treatments are predicted at 508, the breakdown pressures can be added to a prediction dataset at 510 as initial breakdown pressures for the new hydraulic fracturing design. The second machine learning part can be used to predict whether the new hydraulic fracture design with the predicted breakdown pressure can initiate a hydraulic fracture successfully or not. Fracture initiation can be labeled as a discrete class label $y_i$ based on many input features $x_i$. The XGBoost Classifier can be used to predict fracture initiation success/failure for this part of the machine learning. In this case, if the first XGBoost Classifier run predicts, at 512, that no fracture initiation can be achieved, the breakdown pressure is increased, at 516, by an increment of 200 pounds per square inch (psi) or another value. Then, iteration occurs using the XGBoost Classifier to check, at 514, whether the updated breakdown pressure can successfully initiate a hydraulic fracture or not. It might take several iterations to find a breakdown pressure for which a successful hydraulic fracture initiation is predicted. The workflow can be implemented using a Python script, for example.

Example Study for Demonstration

[0044] In this section, an example study is presented on applying the XGBoost method to predict the breakdown pressure for hydraulic fracturing treatment. As collectively shown in Tables 1A and 1B, more parameters are collected for the machine learning study than those used for the analytical approach. Some of the potential parameters not used for an analytical solution are also included in the machine learning model, such as stress regime, rock type, porosity, number of perforations, perforation interval, and perforation diameter. As a demonstration, 100 sets of fracturing stage cases were generated. Increasing the number of samples that are provided to a supervised learning algorithm generally leads to improved precision in regressing and classifying new data. The complete dataset is listed in Tables 2A and 2B (where Table 2B contains the additional columns of Table 2A). In this case, 90% of the dataset can be used to train the model. Once trained and validated, the model can be used to predict the breakdown pressure based on the input dataset of new hydraulic fracturing stages listed in Tables 3A and 3B (the prediction dataset). The input dataset is also referred to as the prediction dataset in this disclosure. The new predicted breakdown pressures are bolded in the last column of Table 3B (where Table 3B contains the additional columns of Table 3A). Compared to the machine learning example study shown in Tables 2A, 2B, 3A, and 3B, a parameter of porosity is added to the dataset for training the model as shown in Tables 4A and 4B (where Table 4B contains the additional columns of Table 4A). Based on the dataset shown in Tables 4A and 4B for machine learning, the predicted breakdown pressures are listed in Tables 5A and 5B (where Table 5B contains the additional columns of Table 5A), which are different from the predicted breakdown pressures in Tables 3A and 3B. A feature of the XGBoost algorithm is that the algorithm can handle missing values. The algorithm is designed in such a way that it can identify trends in the missing values and account for them.
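One way to picture the missing-value handling is as a learned default direction: at a candidate split, rows whose feature is missing are tried on the left and on the right, and the direction with the larger gain is kept. The sketch below illustrates that idea in plain Python under the same residual-count assumption used for the gain. It is not the library's code, and the function names and data are hypothetical.

```python
def similarity(residuals, lam=0.0):
    """Similarity score; an empty leaf contributes nothing."""
    if not residuals:
        return 0.0
    return sum(residuals) ** 2 / (len(residuals) + lam)


def best_default_direction(xs, residuals, threshold, lam=0.0):
    """Choose where rows with a missing feature (None) go at one split."""
    left = [r for x, r in zip(xs, residuals) if x is not None and x < threshold]
    right = [r for x, r in zip(xs, residuals) if x is not None and x >= threshold]
    missing = [r for x, r in zip(xs, residuals) if x is None]
    root = similarity(residuals, lam)
    gain_left = similarity(left + missing, lam) + similarity(right, lam) - root
    gain_right = similarity(left, lam) + similarity(right + missing, lam) - root
    if gain_left >= gain_right:
        return "left", gain_left
    return "right", gain_right


# Made-up porosity values with one missing entry, and made-up residuals (psi).
porosity = [0.05, 0.07, None, 0.12, 0.15]
residuals = [-300.0, -250.0, -280.0, 260.0, 310.0]
direction, g = best_default_direction(porosity, residuals, threshold=0.10)
```

In this example the row with missing porosity has a residual similar to the low-porosity rows, so sending it left gives the larger gain and "left" becomes the default direction for that split.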

[0045] As previously described, the XGBoost method can be used to classify new data. In the current case, some sets of the hydraulic fracturing dataset can be collected, including labels of hydraulic fracturing initiation success/failure, which will be used to build the classification model. Tables 6A and 6B collectively show the part of the dataset used to classify hydraulic fracturing initiation (where Table 6B contains the additional columns of Table 6A). Success/failure can be labeled as 1 if hydraulic fracture initiation is successful or labeled as 0 if hydraulic fracture initiation is a failure. Compared to the regression part of training for breakdown pressure, one more variable is added to represent hydraulic fracture initiation. This part of machine learning is achieved by using the XGBoost classifier function. Tables 6A and 6B list the training dataset, and Tables 7A and 7B list the machine learning prediction (where Table 7B contains the additional columns of Table 7A). As listed in Tables 7A and 7B, four cases of hydraulic fracturing designs with the planned breakdown pressure are predicted to initiate hydraulic fractures, and the other two are predicted to fail to initiate hydraulic fractures. In this situation, the designed breakdown pressure of the failed cases listed in Tables 7A and 7B can be increased using an increment of 200 psi, 300 psi, or 400 psi. Then, the XGBoost classifier part can be rerun to see whether the updated hydraulic fracturing breakdown pressure can achieve a fracture initiation or not. Several iterations might be needed to find a breakdown pressure for which the XGBoost Classifier predicts a successful fracture initiation.
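The iterate-and-increment step can be sketched as a short loop. In the sketch below, `predict_initiation` stands in for a trained classifier and `toy_classifier` is a hypothetical placeholder with a made-up threshold. The iteration cap `max_iterations` is also an assumption added so that the loop always terminates; the text does not specify one.

```python
def find_initiating_pressure(features, initial_pressure_psi, predict_initiation,
                             increment_psi=200.0, max_iterations=25):
    """Raise the breakdown pressure until the classifier predicts initiation (1)."""
    pressure = initial_pressure_psi
    for _ in range(max_iterations):
        if predict_initiation(features, pressure) == 1:
            return pressure
        pressure += increment_psi
    return None  # no success predicted within the iteration budget


# Hypothetical stand-in for a trained classifier: it predicts success once the
# pressure reaches a made-up threshold stored with the features.
def toy_classifier(features, pressure_psi):
    return 1 if pressure_psi >= features["required_psi"] else 0


result = find_initiating_pressure({"required_psi": 10450.0}, 10000.0, toy_classifier)
capped = find_initiating_pressure({"required_psi": 10450.0}, 10000.0,
                                  toy_classifier, max_iterations=2)
```

In a real workflow the callable would wrap the trained classifier and rebuild the feature row with the updated pressure on each pass. Returning `None` when the cap is reached leaves the decision about what to do next with the engineer.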

[0046] Machine learning can be used to direct hydraulic fracturing (fract.) design, which can open an avenue toward an expert system for advising field engineers on determining an optimal set of hydraulic fracturing design parameters. An optimal set of parameters, as defined in the present disclosure, can represent a set of parameters, for example, as listed in Tables 1A and 1B. The optimal set of parameters can include parameters such as landing depth of TVD, well trajectory, stress regime, in-situ stresses and maximum (max.) stress angle, reservoir pressure, rock type, rock properties, perforation parameters, and required breakdown pressure. The required breakdown pressure can be calculated either using analytical or numerical methods, or using an artificial intelligence (AI)/machine learning (ML) approach.

Table 2A - Dataset for machine learning to predict hydraulic fracturing breakdown pressure

Table 2B - Dataset for machine learning to predict hydraulic fracturing breakdown pressure

Table 3A - Prediction of hydraulic fracturing breakdown pressure by machine learning XGBoost regression.

Table 3B - Prediction of hydraulic fracturing breakdown pressure by machine learning XGBoost regression.

Table 4A - Dataset used for machine learning to predict hydraulic fracturing breakdown pressure

Table 4B - Dataset used for machine learning to predict hydraulic fracturing breakdown pressure

Table 5A - Prediction of hydraulic fracturing breakdown pressure by machine learning XGBoost regression

Table 5B - Prediction of hydraulic fracturing breakdown pressure by machine learning XGBoost regression

Table 6A - Dataset used for machine learning to predict hydraulic fracture initiation success or not

Table 6B - Dataset used for machine learning to predict hydraulic fracture initiation success or not

Table 7A - Prediction of hydraulic fracture initiation by machine learning XGBoost classifier.

Table 7B - Prediction of hydraulic fracture initiation by machine learning XGBoost classifier.

[0047] FIG. 6 is a flowchart of an example of a method 600 for predicting hydraulic fracture initiation, according to some implementations of the present disclosure. For clarity of presentation, the description that follows generally describes method 600 in the context of the other figures in this description. However, it will be understood that method 600 can be performed, for example, by any suitable system, environment, software, and hardware, or a combination of systems, environments, software, and hardware, as appropriate. In some implementations, various steps of method 600 can be run in parallel, in combination, in loops, or in any order.

[0048] At 602, a fracking operations dataset is prepared based on historical information for a group of fracking wells from fields. For example, historical information from past fracking and production operations can be accessed, including parameters used over time and the results of fracking. From 602, method 600 proceeds to 604.

[0049] At 604, a set of hyper-parameters is tuned for use in a machine learning algorithm configured to predict fracture initiation in new fracturing wells. As an example, the set of hyper-parameters can include learning_rate, gamma (minimum loss reduction required to make a further split on a leaf node of the tree), max_depth (maximum depth of a tree; increasing this value will make the model more complex and more likely to over-fit), and max_leaves (maximum number of leaves in a tree). In some implementations, the machine learning algorithm can be an XGBoost algorithm. From 604, method 600 proceeds to 606.
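A simple way to tune such hyper-parameters is an exhaustive grid search over candidate values. The sketch below enumerates every combination of the four parameters named above and picks the one with the lowest validation score. The candidate values and the scoring function `toy_score` are illustrative assumptions; in practice the score would come from training a model with each parameter set and measuring its error on the test data.

```python
from itertools import product

# Candidate values are illustrative assumptions, not values from the source.
PARAM_GRID = {
    "learning_rate": [0.05, 0.1, 0.3],
    "gamma": [0.0, 1.0],
    "max_depth": [3, 6],
    "max_leaves": [8, 16],
}


def iter_param_sets(grid):
    """Yield every combination in the grid as a keyword dictionary."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def tune(grid, score):
    """Return the parameter set with the lowest validation score."""
    return min(iter_param_sets(grid), key=score)


# Hypothetical stand-in for a validation error returned by a trained model.
def toy_score(params):
    return abs(params["learning_rate"] - 0.1) + abs(params["max_depth"] - 6)


best = tune(PARAM_GRID, toy_score)
n_candidates = len(list(iter_param_sets(PARAM_GRID)))
```

Grid search is only one option; the number of combinations grows multiplicatively with each added parameter, so a random or staged search may be preferable for larger grids.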

[0050] At 606, the fracking operations dataset is divided into a training dataset and a test dataset, where the training dataset is configured to train a fracture initiation prediction model. For example, the fracking operations dataset can include, for each fracking well: well survey data including landing depth of TVD, azimuth, and deviation; a stress regime, in-situ stresses and orientation; rock types and rock mechanical properties; perforation related properties; and a breakdown pressure and a fracture initiation classification. From 606, method 600 proceeds to 608.
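A minimal sketch of the division into training and test datasets follows. It assumes the 90%/10% proportion mentioned earlier in the description; the function name, the fixed random seed, and the integer stand-in rows are hypothetical.

```python
import random


def split_dataset(rows, test_fraction=0.1, seed=0):
    """Shuffle the rows and split them into training and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(100))  # stand-in for 100 fracturing-stage records
train_rows, test_rows = split_dataset(rows)
```

Shuffling before splitting avoids a test set made up only of the most recently recorded wells, and holding the seed fixed keeps the split repeatable between runs.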

[0051] At 608, a regression algorithm is applied to train the training dataset and to validate with the test dataset. The algorithm can include use of Equations (2) to (5), for example. From 608, method 600 proceeds to 610.

[0052] At 610, a target variable of a breakdown pressure for a new hydraulic fracturing treatment is set using the fracture initiation prediction model and the training dataset. The target can be, for example, an initial starting point for the breakdown pressure that is known to be a good starting point. For example, a sample dataset for machine learning to predict hydraulic fracturing breakdown pressure is provided in Tables 2A and 2B. From 610, method 600 proceeds to 612.

[0053] At 612, the prediction dataset is updated based on at least the target variable. Tables 1A and 1B provide example values of the variables. From 612, method 600 proceeds to 614.

[0054] At 614, the training dataset is trained using a classifier of the machine learning algorithm. From 614, method 600 proceeds to 616.

[0055] At 616, a prediction is made, using the prediction dataset, whether the new hydraulic fracturing treatment can be initiated. Examples of predicted values are provided in Tables 3A, 3B, 5A, and 5B. Predicting whether the new hydraulic fracturing treatment can be initiated or not can include using a prediction function based on previous predictions, as described in Equations (6)-(9). The prediction function can include learning rate multipliers applied to previous predictions. From 616, method 600 proceeds to 618.

[0056] At 618, the breakdown pressure is incrementally adjusted, and the updating, training, and predicting are repeated until successful hydraulic fracture initiation is predicted.

[0057] FIG. 7 is a block diagram of an example computer system 700 used to provide computational functionalities associated with algorithms, methods, functions, processes, flows, and procedures described in the present disclosure, according to some implementations of the present disclosure. The illustrated computer 702 is intended to encompass any computing device such as a server, a desktop computer, a laptop/notebook computer, a wireless data port, a smart phone, a personal data assistant (PDA), a tablet computing device, or one or more processors within these devices, including physical instances, virtual instances, or both. The computer 702 can include input devices such as keypads, keyboards, and touch screens that can accept user information. Also, the computer 702 can include output devices that can convey information associated with the operation of the computer 702. The information can include digital data, visual data, audio information, or a combination of information. The information can be presented in a graphical user interface (UI) (or GUI).

[0058] The computer 702 can serve in a role as a client, a network component, a server, a database, a persistency, or components of a computer system for performing the subject matter described in the present disclosure. The illustrated computer 702 is communicably coupled with a network 730. In some implementations, one or more components of the computer 702 can be configured to operate within different environments, including cloud-computing-based environments, local environments, global environments, and combinations of environments.

[0059] At a top level, the computer 702 is an electronic computing device operable to receive, transmit, process, store, and manage data and information associated with the described subject matter. According to some implementations, the computer 702 can also include, or be communicably coupled with, an application server, an email server, a web server, a caching server, a streaming data server, or a combination of servers.

[0060] The computer 702 can receive requests over network 730 from a client application (for example, executing on another computer 702). The computer 702 can respond to the received requests by processing the received requests using software applications. Requests can also be sent to the computer 702 from internal users (for example, from a command console), external (or third) parties, automated applications, entities, individuals, systems, and computers.

[0061] Each of the components of the computer 702 can communicate using a system bus 703. In some implementations, any or all of the components of the computer 702, including hardware or software components, can interface with each other or the interface 704 (or a combination of both) over the system bus 703. Interfaces can use an application programming interface (API) 712, a service layer 713, or a combination of the API 712 and service layer 713. The API 712 can include specifications for routines, data structures, and object classes. The API 712 can be either computer-language independent or dependent. The API 712 can refer to a complete interface, a single function, or a set of APIs.

[0062] The service layer 713 can provide software services to the computer 702 and other components (whether illustrated or not) that are communicably coupled to the computer 702. The functionality of the computer 702 can be accessible for all service consumers using this service layer. Software services, such as those provided by the service layer 713, can provide reusable, defined functionalities through a defined interface. For example, the interface can be software written in JAVA, C++, or a language providing data in extensible markup language (XML) format. While illustrated as an integrated component of the computer 702, in alternative implementations, the API 712 or the service layer 713 can be stand-alone components in relation to other components of the computer 702 and other components communicably coupled to the computer 702. Moreover, any or all parts of the API 712 or the service layer 713 can be implemented as child or sub-modules of another software module, enterprise application, or hardware module without departing from the scope of the present disclosure.

[0063] The computer 702 includes an interface 704. Although illustrated as a single interface 704 in FIG. 7, two or more interfaces 704 can be used according to particular needs, desires, or particular implementations of the computer 702 and the described functionality. The interface 704 can be used by the computer 702 for communicating with other systems that are connected to the network 730 (whether illustrated or not) in a distributed environment. Generally, the interface 704 can include, or be implemented using, logic encoded in software or hardware (or a combination of software and hardware) operable to communicate with the network 730. More specifically, the interface 704 can include software supporting one or more communication protocols associated with communications. As such, the network 730 or the interface’s hardware can be operable to communicate physical signals within and outside of the illustrated computer 702.

[0064] The computer 702 includes a processor 705. Although illustrated as a single processor 705 in FIG. 7, two or more processors 705 can be used according to particular needs, desires, or particular implementations of the computer 702 and the described functionality. Generally, the processor 705 can execute instructions and can manipulate data to perform the operations of the computer 702, including operations using algorithms, methods, functions, processes, flows, and procedures as described in the present disclosure.

[0065] The computer 702 also includes a database 706 that can hold data for the computer 702 and other components connected to the network 730 (whether illustrated or not). For example, database 706 can be an in-memory database, a conventional database, or another database storing data consistent with the present disclosure. In some implementations, database 706 can be a combination of two or more different database types (for example, hybrid in-memory and conventional databases) according to particular needs, desires, or particular implementations of the computer 702 and the described functionality. Although illustrated as a single database 706 in FIG. 7, two or more databases (of the same, different, or combination of types) can be used according to particular needs, desires, or particular implementations of the computer 702 and the described functionality. While database 706 is illustrated as an internal component of the computer 702, in alternative implementations, database 706 can be external to the computer 702.

[0066] The computer 702 also includes a memory 707 that can hold data for the computer 702 or a combination of components connected to the network 730 (whether illustrated or not). Memory 707 can store any data consistent with the present disclosure. In some implementations, memory 707 can be a combination of two or more different types of memory (for example, a combination of semiconductor and magnetic storage) according to particular needs, desires, or particular implementations of the computer 702 and the described functionality. Although illustrated as a single memory 707 in FIG. 7, two or more memories 707 (of the same, different, or combination of types) can be used according to particular needs, desires, or particular implementations of the computer 702 and the described functionality. While memory 707 is illustrated as an internal component of the computer 702, in alternative implementations, memory 707 can be external to the computer 702.

[0067] The application 708 can be an algorithmic software engine providing functionality according to particular needs, desires, or particular implementations of the computer 702 and the described functionality. For example, application 708 can serve as one or more components, modules, or applications. Further, although illustrated as a single application 708, the application 708 can be implemented as multiple applications 708 on the computer 702. In addition, although illustrated as internal to the computer 702, in alternative implementations, the application 708 can be external to the computer 702.

[0068] The computer 702 can also include a power supply 714. The power supply 714 can include a rechargeable or non-rechargeable battery that can be configured to be either user- or non-user-replaceable. In some implementations, the power supply 714 can include power-conversion and management circuits, including recharging, standby, and power management functionalities. In some implementations, the power supply 714 can include a power plug to allow the computer 702 to be plugged into a wall socket or a power source to, for example, power the computer 702 or recharge a rechargeable battery.

[0069] There can be any number of computers 702 associated with, or external to, a computer system containing computer 702, with each computer 702 communicating over network 730. Further, the terms "client," "user," and other appropriate terminology can be used interchangeably, as appropriate, without departing from the scope of the present disclosure. Moreover, the present disclosure contemplates that many users can use one computer 702 and one user can use multiple computers 702.

[0070] Described implementations of the subject matter can include one or more features, alone or in combination.

[0071] For example, in a first implementation, a computer-implemented method includes the following. A fracking operations dataset is prepared based on historical information for a group of fracking wells from fields. A set of hyper-parameters of a machine learning algorithm configured to predict fracture initiation for new fracturing wells is tuned. The fracking operations dataset is divided into a training dataset and a test dataset, where the training dataset is configured to train a fracture initiation prediction model. A regression algorithm is applied to train the training dataset and to validate with the test dataset. A target variable of a breakdown pressure for a new hydraulic fracturing treatment is determined using the fracture initiation prediction model and the training dataset. A prediction dataset is updated based on at least the target variable. The training dataset is trained using a classifier of the machine learning algorithm. Using the prediction dataset, a prediction is made whether the new hydraulic fracturing treatment can be initiated or not. The breakdown pressure is incrementally adjusted, and the updating, training, and predicting are repeated until successful hydraulic fracture initiation is predicted.

[0072] The foregoing and other described implementations can each, optionally, include one or more of the following features:

[0073] A first feature, combinable with any of the following features, where the set of hyper-parameters includes at least learning_rate, gamma, max_depth, and max_leaves.

[0074] A second feature, combinable with any of the previous or following features, where the fracking operations dataset includes, for each fracking well: well survey data including landing depth of TVD, azimuth, and deviation; a stress regime, in-situ stresses and orientation; rock types and rock mechanical properties; perforation related properties; and a breakdown pressure and a fracture initiation classification.

[0075] A third feature, combinable with any of the previous or following features, where the machine learning algorithm is an XGBoost algorithm.

[0076] A fourth feature, combinable with any of the previous or following features, where incrementally adjusting the breakdown pressure includes increasing the breakdown pressure in increments of 200 pounds per square inch (psi).

[0077] A fifth feature, combinable with any of the previous or following features, where predicting whether the new hydraulic fracturing treatment can be initiated or not includes using a prediction function based on previous predictions.

[0078] A sixth feature, combinable with any of the previous or following features, where the prediction function includes learning rate multipliers applied to previous predictions.

[0079] In a second implementation, a non-transitory, computer-readable medium stores one or more instructions executable by a computer system to perform operations including the following. A fracking operations dataset is prepared based on historical information for a group of fracking wells from fields. A set of hyper-parameters of a machine learning algorithm configured to predict fracture initiation for new fracturing wells is tuned. The fracking operations dataset is divided into a training dataset and a test dataset, where the training dataset is configured to train a fracture initiation prediction model. A regression algorithm is applied to train the training dataset and to validate with the test dataset. A target variable of a breakdown pressure for a new hydraulic fracturing treatment is determined using the fracture initiation prediction model and the training dataset. A prediction dataset is updated based on at least the target variable. The training dataset is trained using a classifier of the machine learning algorithm. Using the prediction dataset, a prediction is made whether the new hydraulic fracturing treatment can be initiated or not. The breakdown pressure is incrementally adjusted, and the updating, training, and predicting are repeated until successful hydraulic fracture initiation is predicted.

[0080] The foregoing and other described implementations can each, optionally, include one or more of the following features:

[0081] A first feature, combinable with any of the following features, where the set of hyper-parameters includes at least learning rate, gamma, max depth, and max_leaves.
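As an illustration of the hyper-parameters named in the first feature, the following sketch enumerates a small tuning grid. The key names follow the parameter names used by the XGBoost scikit-learn-style interface (`learning_rate`, `gamma`, `max_depth`, `max_leaves`). The candidate values and the helper `expand_grid` are assumptions chosen for illustration and are not values from the present disclosure.

```python
from itertools import product

# Candidate values are assumptions for illustration only.
param_grid = {
    "learning_rate": [0.05, 0.1, 0.3],
    "gamma": [0.0, 1.0],
    "max_depth": [4, 6],
    "max_leaves": [0, 31],
}


def expand_grid(grid):
    """Yield one dict per combination of hyper-parameter values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


# Every combination of the candidate values above.
candidates = list(expand_grid(param_grid))
```

Each candidate dictionary could then be passed as keyword arguments to a model constructor and scored on the test dataset, keeping the best-scoring combination.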

[0082] A second feature, combinable with any of the previous or following features, where the fracking operations dataset includes, for each fracking well: well survey data including landing depth of TVD, azimuth, and deviation; a stress regime, in-situ stresses and orientation; rock types and rock mechanical properties; perforation related properties; and a breakdown pressure and a fracture initiation classification.
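For example, one record of such a dataset could be represented as follows. This is a sketch only: the class name `FrackingWellRecord`, every field name, and the placeholder values are hypothetical, and the grouping of properties into dictionaries is an assumption made to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class FrackingWellRecord:
    """One well in the fracking operations dataset (all field names assumed)."""

    # Well survey data
    landing_depth_tvd: float
    azimuth: float
    deviation: float
    # Stress regime, in-situ stresses and orientation
    stress_regime: str
    in_situ_stresses: Dict[str, float] = field(default_factory=dict)
    # Rock types and rock mechanical properties
    rock_type: str = ""
    rock_mechanical_properties: Dict[str, float] = field(default_factory=dict)
    # Perforation related properties
    perforation_properties: Dict[str, float] = field(default_factory=dict)
    # Targets: regression target and classification label
    breakdown_pressure: float = 0.0
    fracture_initiated: bool = False


def split_targets(record: FrackingWellRecord) -> Tuple[float, bool]:
    """Return (regression target, classification label) for one record."""
    return record.breakdown_pressure, record.fracture_initiated


# Placeholder values, not field data.
example = FrackingWellRecord(
    landing_depth_tvd=10000.0,
    azimuth=45.0,
    deviation=90.0,
    stress_regime="strike-slip",
    breakdown_pressure=12000.0,
    fracture_initiated=True,
)
```

Keeping the breakdown pressure and the fracture initiation classification in the same record lets the regression step and the classification step draw their targets from one dataset.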

[0083] A third feature, combinable with any of the previous or following features, where the machine learning algorithm is an XGBoost algorithm.

[0084] A fourth feature, combinable with any of the previous or following features, where incrementally adjusting the breakdown pressure includes increasing the breakdown pressure in increments of 200 pounds per square inch (psi).
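The incremental adjustment can be sketched as a simple loop. In the sketch below, `can_initiate` is a stand-in for the trained classifier, and the function name, the `max_pressure` cap, and the toy threshold are assumptions that do not appear in the present disclosure. Only the 200 psi step is taken from the text.

```python
from typing import Callable, Optional

PSI_STEP = 200.0  # increment named in the text, in psi


def find_initiation_pressure(
    start_pressure: float,
    can_initiate: Callable[[float], bool],
    step: float = PSI_STEP,
    max_pressure: float = 20000.0,  # assumed safety cap, not from the source
) -> Optional[float]:
    """Raise the breakdown pressure by `step` until initiation is predicted.

    `can_initiate` stands in for the trained classifier. Returns the first
    pressure predicted to succeed, or None if the cap is exceeded first.
    """
    pressure = start_pressure
    while pressure <= max_pressure:
        if can_initiate(pressure):
            return pressure
        pressure += step
    return None


# Stand-in classifier (assumed): succeeds at or above 10,500 psi.
result = find_initiation_pressure(10000.0, lambda p: p >= 10500.0)
```

The cap guards against an unbounded loop when the classifier never predicts success. A real workflow would also update the prediction dataset with each trial pressure before calling the classifier.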

[0085] A fifth feature, combinable with any of the previous or following features, where predicting whether the new hydraulic fracturing treatment can be initiated or not includes using a prediction function based on previous predictions.

[0086] A sixth feature, combinable with any of the previous or following features, where the prediction function includes learning rate multipliers applied to previous predictions.

[0087] In a third implementation, a computer-implemented system includes one or more processors and a non-transitory computer-readable storage medium coupled to the one or more processors and storing programming instructions for execution by the one or more processors. The programming instructions instruct the one or more processors to perform operations including the following. A fracking operations dataset is prepared based on historical information for a group of fracking wells from fields. A set of hyper-parameters of a machine learning algorithm configured to predict fracture initiation for new fracturing wells is tuned. The fracking operations dataset is divided into a training dataset and a test dataset, where the training dataset is configured to train a fracture initiation prediction model. A regression algorithm is applied to train the training dataset and to validate with the test dataset. A target variable of a breakdown pressure for a new hydraulic fracturing treatment is determined using the fracture initiation prediction model and the training dataset. A prediction dataset is updated based on at least the target variable. The training dataset is trained using a classifier of the machine learning algorithm. Using the prediction dataset, a prediction is made whether the new hydraulic fracturing treatment can be initiated or not. The breakdown pressure is incrementally adjusted, and the updating, training, and predicting are repeated until successful hydraulic fracture initiation is predicted.

[0088] The foregoing and other described implementations can each, optionally, include one or more of the following features:

[0089] A first feature, combinable with any of the following features, where the set of hyper-parameters includes at least learning rate, gamma, max depth, and max_leaves.

[0090] A second feature, combinable with any of the previous or following features, where the fracking operations dataset includes, for each fracking well: well survey data including landing depth of TVD, azimuth, and deviation; a stress regime, in-situ stresses and orientation; rock types and rock mechanical properties; perforation related properties; and a breakdown pressure and a fracture initiation classification.

[0091] A third feature, combinable with any of the previous or following features, where the machine learning algorithm is an XGBoost algorithm.

[0092] A fourth feature, combinable with any of the previous or following features, where incrementally adjusting the breakdown pressure includes increasing the breakdown pressure in increments of 200 pounds per square inch (psi).

[0093] A fifth feature, combinable with any of the previous or following features, where predicting whether the new hydraulic fracturing treatment can be initiated or not includes using a prediction function based on previous predictions.

[0094] Implementations of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Software implementations of the described subject matter can be implemented as one or more computer programs. Each computer program can include one or more modules of computer program instructions encoded on a tangible, non-transitory, computer-readable computer-storage medium for execution by, or to control the operation of, data processing apparatus. Alternatively, or additionally, the program instructions can be encoded in/on an artificially generated propagated signal. For example, the signal can be a machine-generated electrical, optical, or electromagnetic signal that is generated to encode information for transmission to a suitable receiver apparatus for execution by a data processing apparatus. The computer-storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of computer-storage mediums.

[0095] The terms “data processing apparatus,” “computer,” and “electronic computer device” (or equivalent as understood by one of ordinary skill in the art) refer to data processing hardware. For example, a data processing apparatus can encompass all kinds of apparatuses, devices, and machines for processing data, including by way of example, a programmable processor, a computer, or multiple processors or computers. The apparatus can also include special purpose logic circuitry including, for example, a central processing unit (CPU), a field-programmable gate array (FPGA), or an application-specific integrated circuit (ASIC). In some implementations, the data processing apparatus or special purpose logic circuitry (or a combination of the data processing apparatus or special purpose logic circuitry) can be hardware- or software-based (or a combination of both hardware- and software-based). The apparatus can optionally include code that creates an execution environment for computer programs, for example, code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of execution environments. The present disclosure contemplates the use of data processing apparatuses with or without conventional operating systems, such as LINUX, UNIX, WINDOWS, MAC OS, ANDROID, or IOS.

[0096] A computer program, which can also be referred to or described as a program, software, a software application, a module, a software module, a script, or code, can be written in any form of programming language. Programming languages can include, for example, compiled languages, interpreted languages, declarative languages, or procedural languages. Programs can be deployed in any form, including as stand-alone programs, modules, components, subroutines, or units for use in a computing environment. A computer program can, but need not, correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data, for example, one or more scripts stored in a markup language document, in a single file dedicated to the program in question, or in multiple coordinated files storing one or more modules, sub-programs, or portions of code. A computer program can be deployed for execution on one computer or on multiple computers that are located, for example, at one site or distributed across multiple sites that are interconnected by a communication network. While portions of the programs illustrated in the various figures may be shown as individual modules that implement the various features and functionality through various objects, methods, or processes, the programs can instead include a number of sub-modules, third-party services, components, and libraries. Conversely, the features and functionality of various components can be combined into single components as appropriate. Thresholds used to make computational determinations can be statically, dynamically, or both statically and dynamically determined.

[0097] The methods, processes, or logic flows described in this specification can be performed by one or more programmable computers executing one or more computer programs to perform functions by operating on input data and generating output. The methods, processes, or logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, for example, a CPU, an FPGA, or an ASIC.

[0098] Computers suitable for the execution of a computer program can be based on one or more of general and special purpose microprocessors and other kinds of CPUs. The elements of a computer are a CPU for performing or executing instructions and one or more memory devices for storing instructions and data. Generally, a CPU can receive instructions and data from (and write data to) a memory.

[0099] Graphics processing units (GPUs) can also be used in combination with CPUs. The GPUs can provide specialized processing that occurs in parallel to processing performed by CPUs. The specialized processing can include artificial intelligence (AI) applications and processing, for example. GPUs can be used in GPU clusters or in multi-GPU computing.

[00100] A computer can include, or be operatively coupled to, one or more mass storage devices for storing data. In some implementations, a computer can receive data from, and transfer data to, the mass storage devices including, for example, magnetic disks, magneto-optical disks, or optical disks. Moreover, a computer can be embedded in another device, for example, a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive.

[00101] Computer-readable media (transitory or non-transitory, as appropriate) suitable for storing computer program instructions and data can include all forms of permanent/non-permanent and volatile/non-volatile memory, media, and memory devices. Computer-readable media can include, for example, semiconductor memory devices such as random access memory (RAM), read-only memory (ROM), phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and flash memory devices. Computer-readable media can also include, for example, magnetic devices such as tape, cartridges, cassettes, and internal/removable disks. Computer-readable media can also include magneto-optical disks and optical memory devices and technologies including, for example, digital video disc (DVD), CD-ROM, DVD+/-R, DVD-RAM, DVD-ROM, HD-DVD, and BLU-RAY. The memory can store various objects or data, including caches, classes, frameworks, applications, modules, backup data, jobs, web pages, web page templates, data structures, database tables, repositories, and dynamic information. Types of objects and data stored in memory can include parameters, variables, algorithms, instructions, rules, constraints, and references. Additionally, the memory can include logs, policies, security or access data, and reporting files. The processor and the memory can be supplemented by, or incorporated into, special purpose logic circuitry.

[00102] Implementations of the subject matter described in the present disclosure can be implemented on a computer having a display device for providing interaction with a user, including displaying information to (and receiving input from) the user. Types of display devices can include, for example, a cathode ray tube (CRT), a liquid crystal display (LCD), a light-emitting diode (LED), and a plasma monitor. Display devices can include a keyboard and pointing devices including, for example, a mouse, a trackball, or a trackpad. User input can also be provided to the computer through the use of a touchscreen, such as a tablet computer surface with pressure sensitivity or a multi-touch screen using capacitive or electric sensing. Other kinds of devices can be used to provide for interaction with a user, including to receive user feedback including, for example, sensory feedback including visual feedback, auditory feedback, or tactile feedback. Input from the user can be received in the form of acoustic, speech, or tactile input. In addition, a computer can interact with a user by sending documents to, and receiving documents from, a device that the user uses. For example, the computer can send web pages to a web browser on a user’s client device in response to requests received from the web browser.

[00103] The term “graphical user interface,” or “GUI,” can be used in the singular or the plural to describe one or more graphical user interfaces and each of the displays of a particular graphical user interface. Therefore, a GUI can represent any graphical user interface, including, but not limited to, a web browser, a touch-screen, or a command line interface (CLI) that processes information and efficiently presents the information results to the user. In general, a GUI can include a plurality of user interface (UI) elements, some or all associated with a web browser, such as interactive fields, pull-down lists, and buttons. These and other UI elements can be related to or represent the functions of the web browser.

[00104] Implementations of the subject matter described in this specification can be implemented in a computing system that includes a back-end component, for example, as a data server, or that includes a middleware component, for example, an application server. Moreover, the computing system can include a front-end component, for example, a client computer having one or both of a graphical user interface or a Web browser through which a user can interact with the computer. The components of the system can be interconnected by any form or medium of wireline or wireless digital data communication (or a combination of data communication) in a communication network. Examples of communication networks include a local area network (LAN), a radio access network (RAN), a metropolitan area network (MAN), a wide area network (WAN), Worldwide Interoperability for Microwave Access (WIMAX), a wireless local area network (WLAN) (for example, using 802.11 a/b/g/n or 802.20 or a combination of protocols), all or a portion of the Internet, or any other communication system or systems at one or more locations (or a combination of communication networks). The network can communicate with, for example, Internet Protocol (IP) packets, frame relay frames, asynchronous transfer mode (ATM) cells, voice, video, data, or a combination of communication types between network addresses.

[00105] The computing system can include clients and servers. A client and server can generally be remote from each other and can typically interact through a communication network. The relationship of client and server can arise by virtue of computer programs running on the respective computers and having a client-server relationship.

[00106] Cluster file systems can be any file system type accessible from multiple servers for read and update. Locking or consistency tracking may not be necessary since the locking of exchange file system can be done at application layer. Furthermore, Unicode data files can be different from non-Unicode data files.

[00107] While this specification contains many specific implementation details, these should not be construed as limitations on the scope of what may be claimed, but rather as descriptions of features that may be specific to particular implementations. Certain features that are described in this specification in the context of separate implementations can also be implemented, in combination, in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations, separately, or in any suitable sub-combination. Moreover, although previously described features may be described as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can, in some cases, be excised from the combination, and the claimed combination may be directed to a sub-combination or variation of a sub-combination.

[00108] Particular implementations of the subject matter have been described. Other implementations, alterations, and permutations of the described implementations are within the scope of the following claims as will be apparent to those skilled in the art. While operations are depicted in the drawings or claims in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed (some operations may be considered optional), to achieve desirable results. In certain circumstances, multitasking or parallel processing (or a combination of multitasking and parallel processing) may be advantageous and performed as deemed appropriate.

[00109] Moreover, the separation or integration of various system modules and components in the previously described implementations should not be understood as requiring such separation or integration in all implementations. It should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

[00110] Accordingly, the previously described example implementations do not define or constrain the present disclosure. Other changes, substitutions, and alterations are also possible without departing from the spirit and scope of the present disclosure.

[00111] Furthermore, any claimed implementation is considered to be applicable to at least a computer-implemented method; a non-transitory, computer-readable medium storing computer-readable instructions to perform the computer-implemented method; and a computer system including a computer memory interoperably coupled with a hardware processor configured to perform the computer-implemented method or the instructions stored on the non-transitory, computer-readable medium.