Title:
ALGORITHMS TO IDENTIFY PATIENTS WITH HEPATOCELLULAR CARCINOMA
Document Type and Number:
WIPO Patent Application WO/2015/050921
Kind Code:
A1
Abstract:
A method for identifying patients with a high risk of liver cancer development includes receiving patient data describing a plurality of patients and executing a patient identification algorithm on the patient data to identify at least some of the plurality of patients as having a high risk of developing liver cancer. The patient identification algorithm is generated based on an application of machine learning techniques to a training data set, and the patient identification algorithm is validated based on both the training data set and an external validation data set. Further, the method includes generating a grouping of the plurality of patients based on the identification of the at least some of the plurality of patients.

Inventors:
WALJEE AKBAR (US)
ZHU JI (US)
MUKERJEE ASHIN (US)
MARRERO JORGE (US)
HIGGINS PETER (US)
SINGAL AMIT (US)
Application Number:
PCT/US2014/058519
Publication Date:
April 09, 2015
Filing Date:
October 01, 2014
Assignee:
UNIV MICHIGAN (US)
International Classes:
G01N33/573; A61B5/00; G01N33/49; G06Q50/22; G16Z99/00
Domestic Patent References:
WO2013043644A12013-03-28
WO2008107134A22008-09-12
Foreign References:
US20100297018A12010-11-25
KR20120055252A2012-05-31
US8357489B22013-01-22
Attorney, Agent or Firm:
RUETH, Randall, G. (GERSTEIN & BORUN LLP, 233 S. Wacker Drive, 6300 Willis Tower, Chicago, IL, US)
Claims:
CLAIMS

We claim:

1. A computer-implemented method for identifying patients with a high risk of liver cancer development, the method comprising:

receiving, via a network interface, patient data describing a plurality of patients; executing, with one or more processors, a patient identification algorithm on the patient data to identify at least some of the plurality of patients as having a high risk of developing liver cancer,

wherein the patient identification algorithm is generated based on an application of machine learning techniques to a training data set, and

wherein the patient identification algorithm is validated based on both the training data set and an external validation data set; and

generating, with the one or more processors, a grouping of the plurality of patients based on the identification of the at least some of the plurality of patients.

2. The computer-implemented method of claim 1, wherein the grouping of the plurality of patients includes forming a group of patients with a high risk of liver cancer development and a group of patients with a low risk of liver cancer development.

3. The computer-implemented method of claim 1, wherein the machine learning techniques include a random forest analysis.

4. The computer-implemented method of claim 1, wherein the patient data includes indications of age, gender, race, body mass index (BMI), past medical history, lifetime alcohol use, and lifetime tobacco use.

5. The computer-implemented method of claim 1, wherein the patient data includes indications of underlying etiology and a presence of ascites, encephalopathy, and esophageal varices.

6. The computer-implemented method of claim 1, wherein the patient data includes indications of platelet count, aspartate aminotransferase (AST), alanine aminotransferase (ALT), alkaline phosphatase, bilirubin, albumin, international normalized ratio (INR), and AFP.

7. The computer-implemented method of claim 1, wherein the patient data, the training data set, and the external validation data set each include indications of at least three of age, gender, race, body mass index (BMI), past medical history, lifetime alcohol use, lifetime tobacco use, underlying etiology, presence of ascites, presence of encephalopathy, presence of esophageal varices, platelet count, aspartate aminotransferase (AST), alanine aminotransferase (ALT), alkaline phosphatase, bilirubin, albumin, international normalized ratio (INR), and AFP.

8. The computer-implemented method of claim 7, wherein the application of machine learning techniques to the training data set includes generating a variable importance ranking of variables in the training data set.

9. The computer-implemented method of claim 1, wherein the application of machine learning techniques to the training data set includes quantifying an importance of longitudinal variables, and wherein the longitudinal variables are represented by at least one of a maximum, mean, minimum, baseline, slope, and acceleration.

10. The computer-implemented method of claim 9, wherein the identification of the at least some of the plurality of patients is based at least partially on temporal models and wherein the temporal models utilize the longitudinal variables.

11. A computer device for identifying patients with a high risk of liver cancer development, the computer device comprising:

one or more processors; and

one or more memories coupled to the one or more processors; wherein the one or more memories include computer executable instructions stored therein that, when executed by the one or more processors, cause the one or more processors to:

receive, via a network interface, patient data describing a plurality of patients; execute a patient identification algorithm on the patient data to identify at least some of the plurality of patients as having a high risk of developing liver cancer,

wherein the patient identification algorithm is generated based on an application of machine learning techniques to a training data set, and

wherein the patient identification algorithm is validated based on both the training data set and an external validation data set; and

generate a grouping of the plurality of patients based on the identification of the at least some of the plurality of patients.

12. The computer device of claim 11, wherein the patient data, the training data set, and the external validation data set each include indications of at least three of age, gender, race, body mass index (BMI), past medical history, lifetime alcohol use, lifetime tobacco use, underlying etiology, presence of ascites, presence of encephalopathy, presence of esophageal varices, platelet count, aspartate aminotransferase (AST), alanine aminotransferase (ALT), alkaline phosphatase, bilirubin, albumin, international normalized ratio (INR), and AFP.

13. The computer device of claim 12, wherein the application of machine learning techniques to the training data set includes generating a variable importance ranking of variables in the training data set.

14. The computer device of claim 13, wherein the most important variable in the variable importance ranking is AST.

15. The computer device of claim 12, wherein the computer executable instructions further cause the one or more processors to: send, via the network interface, an indication of the grouping of the plurality of patients to a remote computer device.

Description:
ALGORITHMS TO IDENTIFY PATIENTS WITH HEPATOCELLULAR CARCINOMA CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application No. 61/885,283, entitled "Algorithms to Identify Patients with Hepatocellular Carcinoma," which was filed on October 1, 2013, the disclosure of which is hereby incorporated herein by reference in its entirety for all purposes.

TECHNICAL FIELD

[0002] The present disclosure generally relates to identifying patients at high risk for liver cancer and, more particularly, to a machine learning method for predicting patient outcomes.

BACKGROUND

[0003] Currently, hepatocellular carcinoma (HCC) is the third leading cause of cancer-related death worldwide and one of the leading causes of death among patients with cirrhosis. The incidence of HCC in the United States is increasing due to the current epidemic of hepatitis C virus (HCV) infection and non-alcoholic fatty liver disease (NAFLD). Prognosis for patients with HCC depends on tumor stage, with curative options available for patients diagnosed at an early stage. Patients with early HCC achieve five-year survival rates of seventy percent with resection or transplantation, whereas those with advanced HCC have a median survival of less than one year.

[0004] Frequently, surveillance methods use ultrasound with or without alpha fetoprotein (AFP) every six months to detect HCC at an early stage. Such methods are recommended in high-risk populations. However, one difficulty in developing an effective surveillance program is the accurate identification of a high-risk target population. Patients with cirrhosis are at particularly high risk for developing HCC, but this risk may not be uniform across all patients and etiologies of liver disease. Retrospective case-control studies have identified risk factors for HCC among patients with cirrhosis, such as older age, male gender, diabetes, and alcohol intake, and subsequent studies have developed predictive regression models for the development of HCC using several of these risk factors. However, these predictive models are limited by moderate accuracy, and none of the predictive models have been validated in independent cohorts.

SUMMARY

[0005] In one embodiment, a computer-implemented method for identifying patients with a high risk of liver cancer development comprises receiving, via a network interface, patient data describing a plurality of patients, and executing, with one or more processors, a patient identification algorithm on the patient data to identify at least some of the plurality of patients as having a high risk of developing liver cancer. The patient identification algorithm is generated based on an application of machine learning techniques to a training data set, and the patient identification algorithm is validated based on both the training data set and an external validation data set. Further, the method includes generating, with the one or more processors, a grouping of the plurality of patients based on the identification of the at least some of the plurality of patients.

[0006] In another embodiment, a computer device for identifying patients with a high risk of liver cancer development comprises one or more processors and one or more memories coupled to the one or more processors. The one or more memories include computer executable instructions stored therein that, when executed by the one or more processors, cause the one or more processors to receive, via a network interface, patient data describing a plurality of patients, and execute a patient identification algorithm on the patient data to identify at least some of the plurality of patients as having a high risk of developing liver cancer. The patient identification algorithm is generated based on an application of machine learning techniques to a training data set, and the patient identification algorithm is validated based on both the training data set and an external validation data set. Further, the computer executable instructions cause the one or more processors to generate a grouping of the plurality of patients based on the identification of the at least some of the plurality of patients.

BRIEF DESCRIPTION OF THE DRAWINGS

[0007] Fig. 1 illustrates cumulative incidences of HCC development in an internal training data set;

[0008] Fig. 2 illustrates an example classification tree for HCC development.

[0009] Fig. 3 illustrates the importance of variables in an example outcome prediction algorithm.

[0010] Fig. 4 is a summary table of results for an example outcome prediction algorithm such as an outcome prediction algorithm based on the variables illustrated in Fig. 3.

[0011] Fig. 5 is another summary table of results for an example outcome prediction algorithm such as an outcome prediction algorithm based on the variables illustrated in Fig. 3.

[0012] Fig. 6 is a flow diagram of an example method for identifying patients with a high risk of HCC development.

[0013] Fig. 7 is a block diagram of an example computing system that may implement the method of Fig. 6.

DETAILED DESCRIPTION

[0014] Although the following text sets forth a detailed description of numerous different embodiments, it should be understood that the legal scope of the description is defined by the words of the claims set forth at the end of this disclosure. The detailed description is to be construed as exemplary only and does not describe every possible embodiment since describing every possible embodiment would be impractical, if not impossible. Numerous alternative embodiments could be implemented, using either current technology or technology developed after the filing date of this patent, which would still fall within the scope of the claims.

[0015] It should also be understood that, unless a term is expressly defined in this patent using the sentence "As used herein, the term ' ' is hereby defined to mean..." or a similar sentence, there is no intent to limit the meaning of that term, either expressly or by implication, beyond its plain or ordinary meaning, and such terms should not be interpreted to be limited in scope based on any statement made in any section of this patent (other than the language of the claims). To the extent that any term recited in the claims at the end of this patent is referred to in this patent in a manner consistent with a single meaning, that is done for the sake of clarity only so as to not confuse the reader, and it is not intended that such claim term be limited, by implication or otherwise, to that single meaning. Finally, unless a claim element is defined by reciting the word "means" and a function without the recital of any structure, it is not intended that the scope of any claim element be interpreted based on the application of 35 U.S.C. § 112, sixth paragraph.

[0016] The techniques of the present disclosure may be utilized to identify patients at high risk for liver cancer, such as Hepatocellular Carcinoma (HCC), by executing a patient identification algorithm with one or more processors of a computing device (see Fig. 7 for further discussion of an example computing device). As such, the patient identification algorithm may allow clinicians to stratify patients with regard to their risk of HCC development.

[0017] In some implementations, the patient identification algorithm may be both internally and externally validated. External validation may be an important aspect of the development of the algorithm, in some scenarios, given that the performance of regression models is often substantially higher in derivation (i.e., training) datasets than in validation sets. Further, given the marked heterogeneity among at-risk populations in terms of etiologies of liver disease, degree of liver dysfunction, and prevalence of other risk factors (such as diabetes, smoking or alcohol use), validation of any predictive model for HCC development is likely crucial.

[0018] In some implementations, health care providers or clinicians may use the patient identification algorithm as a basis for an electronic health record decision support tool to aid with real-time assessments of HCC risk and recommendations regarding HCC surveillance. For example, the patient identification algorithm may identify high-risk individual cases and transmit annotated data back to a provider, thus facilitating changes to a clinical assessment. Moreover, the patient identification algorithm may form the basis for a publicly available online HCC risk calculator.

[0019] Accurate assessment of HCC risk among patients with cirrhosis, via execution of the patient identification algorithm on patient data, may allow targeted application of HCC surveillance programs, in some implementations. High-risk patients, as identified by the validated learning algorithms, may benefit from a relatively intense HCC surveillance regimen. For example, although surveillance with cross-sectional imaging is not recommended among all patients with cirrhosis, such surveillance may be cost-effective among a subgroup of cirrhotic patients.

[0020] Moreover, contrary to existing trends to use only static laboratory tests (e.g., test for AFP), the patient identification algorithm may account for and quantify the importance of both static variable values and temporal characteristics (e.g., base, mean, max, slope, and acceleration) of variables. Based on this quantification, the patient identification algorithm may be refined (e.g., with machine learning techniques) to more efficiently and effectively identify high risk patients, in some implementations.

[0021] To generate, validate, and refine the patient identification algorithm, a computing device (e.g., a server) may execute an algorithm generation routine in two phases. First, the algorithm generation routine may analyze a set of internal training data to generate an outcome prediction algorithm and internally validate the outcome prediction algorithm. Second, the algorithm generation routine may externally validate the outcome prediction algorithm to produce an internally and externally validated patient identification algorithm.

Machine Learning and Internal Training Data

[0022] The algorithm generation routine may include machine learning components to identify patterns in large data sets and make predictions about future outcomes. For example, the algorithm generation routine may include neural network, support vector machine, and decision tree components. Specifically, a type of decision tree analysis called a random forest analysis may divide large groups of cases (e.g., within an internal training data set) into distinct outcomes (e.g., HCC or no HCC), with a goal of minimizing false positives and false negatives.

[0023] A random forest analysis, or other suitable machine learning approach, used to generate an outcome prediction algorithm may have several characteristics: (i) a lack of required hypotheses, which may allow important but unexpected predictor variables to be identified; (ii) "out-of-bag" sampling, which facilitates validation and reduces the risk of overfitting; (iii) consideration of all possible interactions between variables as potentially important interactions; and (iv) a requirement of only minimal input from a statistician to develop a model. Further, machine learning models may easily incorporate new data to continually update and optimize algorithms, leading to improvements in predictive performance over time.
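For illustration only, the following is a minimal sketch, in R, of fitting a random forest of the kind described above. It assumes a hypothetical data frame "train" whose outcome column "hcc" is a factor ("no"/"yes") and whose predictor column names mirror the variables discussed below; the column names, seed, and tree count are assumptions, and the sketch is not the patented algorithm itself.

```r
# Minimal sketch (not the patented algorithm): fit a random forest to a
# hypothetical training data frame "train" with binary outcome factor "hcc".
library(randomForest)

set.seed(1)
rf_model <- randomForest(
  hcc ~ age + bmi + ast + alt + alk_phos + bilirubin + albumin +
        inr + afp + platelets + ascites + varices,   # hypothetical names
  data = train,
  ntree = 500,        # number of in-bag/out-of-bag pairings (trees)
  importance = TRUE   # record variable importance for later ranking
)

# The out-of-bag error acts as an internal, cross-validated estimate.
print(rf_model)                                   # includes OOB confusion matrix
oob_error <- rf_model$err.rate[rf_model$ntree, "OOB"]
```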

[0024] An internal training data set, used by the algorithm generation routine to generate an outcome prediction algorithm, may include demographic, clinical, and laboratory training data. Demographics data may include variables such as age, gender, race, body mass index (BMI), past medical history, lifetime alcohol use, and lifetime tobacco use. Clinical data may include variables such as underlying etiology and a presence of ascites, encephalopathy, or esophageal varices, and laboratory data may include variables such as platelet count, aspartate aminotransferase (AST), alanine aminotransferase (ALT), alkaline phosphatase, bilirubin, albumin, international normalized ratio (INR), and AFP.

[0025] In general, a complete blood count may include any set of the following variables: hemoglobin, hematocrit, red blood cell count, white blood cell count, platelet count, mean cell volume (MCV), mean cell hemoglobin (MCH), mean cell hemoglobin concentration (MCHC), mean platelet volume (MPV), neutrophil count (NEUT), basophil (BASO) count, monocyte count (MONO), lymphocyte count (LYMPH), and eosinophil count (EOS). Also, chemistries may include any set of the following variables: aspartate aminotransferase (AST), alanine aminotransferase (ALT), alkaline phosphatase (ALK), bilirubin (TBIL), calcium (CAL), albumin (ALB), sodium (SOD), potassium (POT), chloride (CHLOR), bicarbonate, blood urea nitrogen (UN), creatinine (CREAT), and glucose (GLUC).

[0026] The internal training data set may also include data about patients who underwent prospective evaluations over time. For example, the internal training data set may include data about patients who underwent evaluations every 6 to 12 months by physical examination, ultrasound, and AFP. If an AFP level was greater than 20 ng/mL or any mass lesion was seen on ultrasound, the data may also indicate triple-phase computed tomography (CT) or magnetic resonance imaging (MRI) data to further evaluate the presence of HCC. In this manner, outcome prediction algorithms and the patient identification algorithm may be at least partially based on temporal changes in variables.

[0027] In one example scenario, an internal training set (referred to as the "internal university training set") includes 442 patients with cirrhosis but without prevalent HCC. The median age of the patients in the internal university training set is 52.8 years (range 23.6-82.4), and more than 90% of the patients are Caucasian. More than 58.6% of the patients are male, and the most common etiologies of cirrhosis in the internal university training set are hepatitis C (47.3%), cryptogenic (19.2%), and alcohol-induced liver disease (14.5%). A total of 42.9% of patients in the internal university training set were Child Pugh class A and 52.5% were Child Pugh class B. Median Child Pugh and MELD scores at enrollment of patients in the internal university training set are 7 and 9, respectively. Median baseline AFP levels are 5.9 ng/mL in patients who developed HCC, and 3.7 ng/mL in patients who did not develop HCC during follow-up (p<0.01), in the example scenario. Median follow-up of the internal university training set is 3.5 years (range 0-6.6), with at least one year of follow-up in 392 (88.7%) patients. Over a 1454 person-year follow-up period, 41 patients with data in the internal university training set developed HCC, for an annual incidence of 2.8% (see Fig. 1). The cumulative 3- and 5-year probabilities of HCC development are 5.7% and 9.1%, respectively. Of the 41 patients with HCC in the internal university training set, 4 (9.8%) tumors are classified as very early stage (BCLC stage 0) and 19 (46.3%) as BCLC stage A.

[0028] Although the internal university training set will be referred to below in reference to the generation and internal validation of outcome prediction algorithms, it is understood that any suitable internal training set may be used to generate and validate outcome prediction algorithms.

[0029] In general, several parameters may be measured to determine how well an outcome prediction algorithm performs. Sensitivity is the proportion of true positive subjects (e.g., subjects with HCC) who are assigned a positive outcome by the outcome prediction model. Similarly, specificity is defined as the proportion of true negative subjects (e.g., subjects without HCC) who are assigned a negative outcome by the outcome prediction model. The Area Under the Receiver Operating Characteristic curve (AuROC) is another way of representing the overall accuracy of a test and ranges between 0 and 1.0, with an area of 0.5 representing test accuracy no better than chance alone. A higher AuROC indicates better performance.

[0030] ROC curves are often helpful in diagnostic settings as the outcome is determined and can be compared to a gold standard. However, in general, any statistic may be used to assess the effectiveness of an outcome prediction algorithm. For example, a c-statistic may describe how well an outcome prediction algorithm can rank cases and non-cases, but the c-statistic is not a function of the actual predicted probabilities or the probability of an individual being classified correctly. This property makes the c-statistic a less accurate measure of the prediction error. Yet, in some implementations, an algorithm generation routine may generate an outcome prediction algorithm such that the algorithm provides risk predictions with little change in the c-statistic. In addition, the overall performance of an outcome prediction model may be measured using a Brier score, which captures aspects of both calibration and discrimination. Brier scores can range from 0 to 1, with lower Brier scores being consistent with higher accuracy and better model performance.
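As a concrete illustration of the measures just described, the sketch below computes sensitivity, specificity, AuROC, and a Brier score in R for a vector of predicted probabilities "p" and observed outcomes "y" coded 0/1. The variable names and the 0.5 decision threshold are assumptions made only for the example.

```r
# Minimal sketch: performance measures for predicted probabilities "p"
# against observed outcomes "y" coded 0/1 (hypothetical variable names).
library(pROC)

roc_obj <- roc(response = y, predictor = p)
auroc   <- auc(roc_obj)                        # 0.5 = chance, higher is better

pred_pos    <- p >= 0.5                        # illustrative cut-off only
sensitivity <- sum(pred_pos & y == 1) / sum(y == 1)   # true-positive rate
specificity <- sum(!pred_pos & y == 0) / sum(y == 0)  # true-negative rate

brier <- mean((p - y)^2)                       # 0 to 1; lower is better
```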

Random Forest

[0031] In some implementations, a computing device (e.g., a server) may execute an algorithm generation routine which includes a random forest analysis. The random forest analysis may identify baseline risk factors associated with the development of HCC in an internal cohort of patients with corresponding data in the internal training data set (e.g., the internal university training set), for example.

[0032] The random forest approach may divide the initial cohort into an "in-bag" sample and an "out-of-bag" sample. The algorithm generation routine may generate the in-bag sample using random sampling with replacement from the initial cohort, thus creating a sample equivalent in size to the initial cohort. A routine may then generate the out-of-bag sample using the unsampled data from the initial cohort. In some implementations, the out-of-bag sample includes about one-third of the initial cohort. The routine may perform this process a pre-determined number of times (e.g., five hundred times) to create multiple pairings of in-bag and out-of-bag samples. For each pairing, the routine may construct a decision tree based on the in-bag sample and using a random set of potential candidate variables for each split. Once a decision tree is generated, the routine may internally validate the tree using the out-of-bag sample. Fig. 2 includes an example decision tree based on an in-bag sample.
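The following minimal sketch, in R, illustrates a single in-bag / out-of-bag pairing of the kind described above; "cohort" is a hypothetical data frame holding the initial cohort, and the sampling shown is generic bootstrap sampling rather than the patented procedure.

```r
# Minimal sketch of one in-bag / out-of-bag pairing, as described above.
# "cohort" is a hypothetical data frame holding the initial cohort.
n <- nrow(cohort)

in_bag_idx <- sample(n, size = n, replace = TRUE)   # sampling with replacement
in_bag     <- cohort[in_bag_idx, ]                  # same size as the cohort
out_of_bag <- cohort[-unique(in_bag_idx), ]         # unsampled rows (~1/3)

# Repeating this a pre-determined number of times (e.g., 500) yields the
# paired samples used to grow and then internally validate each tree.
```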

[0033] As each tree is generated, the routine may only consider a random subset of the predictor variables as possible splitters for each binary partitioning, in an implementation. The routine may use predictions from each tree as "votes," and the outcome with the most votes is considered the dichotomous outcome prediction for that sample. Using such a process, the routine may construct multiple decision trees to create the final classification prediction model and determine overall variable importance.

[0034] The algorithm generation routine may calculate accuracies and error rates for each observation using the out-of-bag predictions and then average over all observations, in an implementation. Because the out-of-bag observations are not used in the fitting of the trees, the out-of-bag estimates serve as cross-validated accuracy estimates (i.e., for internal validation).

[0035] In some implementations, random forest modeling may produce algorithms that have variable importance results similar to those of other machine learning methods, such as boosted tree modeling, except with a greater AuROC in the internal training set. The effectiveness of the algorithm generated by the random forest model in predicting clinical response is illustrated in Figs. 3-5. An example illustration of the proportional variable importance of each of the variables is shown in graph form in Fig. 3. In one scenario, the most important independent variables in differentiating patients who develop HCC from those without HCC were as follows: AST, ALT, the presence of ascites, bilirubin, baseline AFP level, and albumin.

[0036] It should be noted that the random forest machine learning approach, as well as any of the other sophisticated tree-generating approaches (including boosted trees), may produce very complex algorithms (e.g., huge sets of if-then conditions) that can be applied to future cases with computer code. However, such a complex algorithm (e.g., with 10,000 or more decision trees) is difficult to illustrate in graphical form for inclusion in an application. Instead, the selection of variables used as inputs to any of the regression and classification tree techniques used to generate an algorithm, together with the relative importance of those variables, uniquely identifies the algorithm. Alternatively, a graph of variable importance percentages can be used to uniquely characterize each algorithm. In fact, both the ratios and the ranges of the variable importance percentages uniquely identify the set of decision trees or algorithms produced by the random forest model. For example, while only a subset of the total list of variables may be used in generating further algorithms, the ratios of relative importance between the remaining variables remain roughly the same, and can be gauged based on the values provided in a variable importance ranking.

[0037] Any random forest tree generated according to a data set is suitable according to the present disclosure, but will be characterized by relative variable importance substantially the same as those displayed in Fig. 3. For example, if all of the variables depicted in Fig. 3 are used, the relative importance of each variable will be about the same proportion within a range of about twenty-five percent (either lower or higher). As another example, if only ten of the variables depicted in Fig. 3 are used, the relative importance of one variable to another (e.g., the ratio of the importance of one variable divided by the importance of the other variable) will remain substantially the same, where the ratios differ by only about 7%.
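As an illustration of how such a ranking and its ratios can be extracted, the sketch below queries the hypothetical fitted forest from the earlier example. The variable names ("ast", "alt") and the choice of mean decrease in accuracy as the importance measure are assumptions.

```r
# Minimal sketch: extract a variable importance ranking from the fitted
# forest and express it as percentages and pairwise ratios, which (per the
# discussion above) characterize the resulting set of trees.
imp        <- importance(rf_model, type = 1)        # mean decrease in accuracy
imp_ranked <- sort(imp[, 1], decreasing = TRUE)
imp_pct    <- 100 * imp_ranked / sum(imp_ranked)    # relative importance (%)

# Ratio of one variable's importance to another's (hypothetical names).
ratio_ast_alt <- imp_pct["ast"] / imp_pct["alt"]

varImpPlot(rf_model)                                # graphical ranking
```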

[0038] In one scenario, an outcome prediction algorithm generated using random forest analysis has a c-statistic of 0.71 (95% CI 0.63-0.79) in the internal university training set. Further, using a previously accepted cut-off of 3.25 to identify high-risk patients, the outcome prediction algorithm has a sensitivity and specificity of 80.5% and 57.9%, respectively, in the internal university training set. In addition, the Brier score for the outcome prediction algorithm is 0.08 in the internal university training set, in the scenario. See Figs. 4 and 5 for summaries of results for the outcome prediction algorithm and two other existing regression models for comparison.

[0039] In some implementations, the outcome prediction algorithm may be based on both fixed, or static, variables, like AST and ALT, and longitudinal variables, like weight, AFP, CTP, and MELD, to build a record for each patient (one row for each patient). The values associated with the longitudinal variables and used by the outcome prediction algorithm may include the base, the mean, the max, the slope, and the acceleration of the longitudinal variables. Based on the longitudinal variables, an outcome prediction algorithm may include three kinds of models called baseline, predict-6-month, and predict-12-month, in an implementation. The baseline model is associated with a final outcome, and the predict-6-month model is associated with the outcome within 6 months of the patient's last visit. Likewise, the predict-12-month model is associated with an outcome within 12 months of the patient's last visit.
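The sketch below shows one way such longitudinal summaries might be built in R, collapsing repeated AFP measurements into baseline, mean, max, slope, and acceleration values so each patient occupies one row. The data frame "visits", its column names, and the use of linear and quadratic fits as proxies for slope and acceleration are all assumptions for illustration.

```r
# Minimal sketch: collapse a longitudinal variable (AFP here) into the
# baseline, mean, max, slope, and acceleration summaries named above.
# "visits" is a hypothetical data frame with columns patient_id, months, afp.
summarize_afp <- function(months, afp) {
  slope <- if (length(afp) > 1) unname(coef(lm(afp ~ months))[2]) else NA
  accel <- if (length(afp) > 2)                      # quadratic term as a proxy
    unname(coef(lm(afp ~ poly(months, 2, raw = TRUE)))[3]) else NA
  c(afp_base  = afp[which.min(months)],
    afp_mean  = mean(afp),
    afp_max   = max(afp),
    afp_slope = slope,
    afp_accel = accel)
}

rows        <- lapply(split(visits, visits$patient_id),
                      function(d) summarize_afp(d$months, d$afp))
patient_tbl <- do.call(rbind, rows)                  # one row per patient
```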

External Validation

[0040] In some implementations, the algorithm generation routine may externally validate an outcome prediction algorithm to generate a both internally and externally validated patient identification algorithm. Although the outcome prediction algorithm may not need separate external validation, as it is generated and internally validated using the out-of-bag samples, the algorithm generation routine may still perform both out-of-bag internal validation (e.g., in the internal university training set) and external validation (e.g., in an external validation set).

[0041] For example, the algorithm generation routine may use several complementary types of analysis to assess different aspects of outcome prediction algorithm performance with respect to an external validation data set. First, the algorithm generation routine may compare model discrimination for the outcome prediction algorithm using receiver operating characteristic (ROC) curve analysis. The algorithm generation routine may then assess gain in diagnostic accuracy with the net reclassification improvement (NRI) statistic, using the Youden model, and the integrated discrimination improvement (IDI) statistic, in an implementation. Further, the algorithm generation routine may obtain risk thresholds in the outcome prediction algorithm to maximize sensitivity and capture all patients with HCC.

[0042] Still further, using risk cut-offs to define a low-risk and a high-risk group, the algorithm generation routine may assess the ability of the outcome prediction algorithm to differentiate the risk of HCC development among low-risk and high-risk patients. Also, the algorithm generation routine may again assess the overall performance of the outcome prediction algorithm using Brier scores and the Hosmer-Lemeshow χ2 goodness-of-fit test.
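For illustration, the sketch below applies a few of the external-validation checks described above to predicted probabilities "p_ext" and observed outcomes "y_ext" (0/1) from a hypothetical external cohort. The variable names and the sensitivity-maximizing threshold rule are assumptions; the NRI and IDI statistics are omitted for brevity.

```r
# Minimal sketch of external-validation checks on a hypothetical cohort.
library(pROC)

roc_ext <- roc(response = y_ext, predictor = p_ext)
auc(roc_ext)                                       # discrimination (ROC analysis)

coords(roc_ext, "best", best.method = "youden")    # Youden-based cut-off

thr       <- min(p_ext[y_ext == 1])                # lowest score among true cases,
high_risk <- p_ext >= thr                          # so all HCC cases are captured

brier_ext <- mean((p_ext - y_ext)^2)               # overall performance
```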

[0043] In general, the algorithm generation routine may use any suitable complementary types of analysis to assess aspects of outcome prediction algorithm performance with respect to an external validation data set. As a result of these complementary types of analysis, the algorithm generation routine may generate a both externally and internally validated patient identification algorithm. Further, in some cases, the algorithm generation routine may refine an outcome prediction algorithm (e.g., with machine learning techniques) based on assessments with respect to external validation data, thus producing a further refined patient identification algorithm.

[0044] The complementary types of analysis discussed above and, in general, all or part of the algorithm generation routine may be implemented using any suitable statistical programming techniques and/or applications. For example, the algorithm generation routine may be implemented using the STATA statistical software and/or the R statistical package.

[0045] In one example scenario, an external validation data set (referred to as the "external cohort validation set") includes data about 1050 patients, with a mean age of 50 years and 71% being male. Cirrhosis is present at baseline in 41% of patients, with all cirrhotic patients having Child-Pugh A disease. The mean baseline platelet count in the external cohort validation set was 159 × 10⁹/L, with 18% of patients having a platelet count below 100 × 10⁹/L. Also, the mean baseline AFP level was 17 ng/mL, with 19% of patients having AFP levels >20 ng/mL. Over a 6120 person-year follow-up period, 88 patients in the example external cohort validation set developed HCC. Of those patients who developed HCC, 19 (21.1%) tumors are classified as TNM stage T1 and 47 (52.2%) as TNM stage T2.

[0046] In the scenario, the algorithm generation routine validates an outcome prediction algorithm to produce an internally and externally validated patient identification algorithm. During validation, the outcome prediction algorithm, generated using random forest analysis as discussed above, had a c-statistic of 0.64 (95% CI 0.60-0.69). Further, the outcome prediction algorithm is able to correctly identify 71 (80.7%) of the 88 patients who developed HCC, while still maintaining a specificity of 46.8%. The outcome prediction algorithm also had a Brier score of 0.08 in the external cohort validation set. See Figs. 4 and 5 for summaries of results for the outcome prediction algorithm and two other existing regression models for comparison.

[0047] Also, after using four-bin calibration to adjust for differences between the internal university training set and the external cohort validation set, the algorithm generation routine may evaluate model calibration using the Hosmer-Lemeshow χ2 goodness-of-fit test, in the example scenario. Such a test may be used to evaluate the agreement between predicted and observed outcomes, in an implementation. A significant value for the Hosmer-Lemeshow statistic indicates a significant deviation between predicted and observed outcomes. In the example scenario discussed above, the Hosmer-Lemeshow statistic was not significant for the outcome prediction algorithm.
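One possible way to run such a goodness-of-fit check in R is sketched below, using the ResourceSelection package's Hosmer-Lemeshow implementation on calibrated probabilities "p_calibrated" and outcomes "y_ext". The package choice, variable names, and the four-group binning are assumptions made to mirror the scenario; any equivalent implementation would do.

```r
# Minimal sketch: Hosmer-Lemeshow goodness-of-fit on calibrated probabilities
# "p_calibrated" against observed outcomes "y_ext" (hypothetical names).
library(ResourceSelection)

hl <- hoslem.test(y_ext, p_calibrated, g = 4)   # four bins, as in the scenario
hl$p.value   # a non-significant p-value suggests predicted and observed agree
```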

[0048] The algorithm generation routine may utilize the results of a validation, such as in the example scenario above, to further refine the outcome prediction algorithm, or the algorithm generation routine may output the outcome prediction algorithm as an internally and externally validated patient identification algorithm. Subsequently, clinicians may utilize the patient identification algorithm to identify newly encountered patients with a high risk for HCC.

Identifying High Risk Patients

[0049] Fig. 6 is a flow diagram of an example method 600 for applying a patient identification algorithm to identify risk (e.g., of HCC) associated with a patient. The method may be implemented by a computing device or system such as the computing system 10 illustrated in Fig. 7, for example.

[0050] To begin, data about a patient is received (block 602). For example, a computing device may receive data about a patient from a clinician operating a remote computer (e.g., laptop, desktop, or tablet computer). The data may be received by the computing device according to any appropriate format and protocol, such as the Hypertext Transfer Protocol (HTTP).

[0051] The data about the patient (i.e., "patient data") may include at least some of the variables illustrated in Fig. 3, in an implementation. For example, the data about the patient may include AST, ALT, the presence of ascites, bilirubin, baseline AFP level, and albumin. In general, the data about the patient may include any data related to the development of HCC, and the data about the patient may vary in amount and/or type from patient to patient. Further, the patient data may include data about only one patient, such that a risk of HCC may be predicted for a specific patient, or the patient data may include data about multiple patients, such that patient risks may be prioritized or ranked.

[0052] Next, a patient identification algorithm, such as the internally and externally validated patient identification algorithm described above, is executed. In some cases, the patient identification algorithm is flexible and dynamic, allowing execution based on any amount and/or type of patient data received at block 602. Such flexibility may arise from the patient identification algorithm's basis in machine learning techniques, such as random forest analysis.
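As a minimal illustration of scoring newly received patient data that may be incomplete, the sketch below fills missing values before prediction. The data frame "new_patients", the fitted object "rf_model", and the use of median/mode imputation are assumptions; the received data must carry the same predictor columns the model was trained on.

```r
# Minimal sketch: score newly received patient data that may be missing
# some variables; na.roughfix() fills numeric NAs with medians and factor
# NAs with modes before prediction.
library(randomForest)

filled   <- na.roughfix(new_patients)
hcc_risk <- predict(rf_model, newdata = filled, type = "prob")[, "yes"]
```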

[0053] In some implementations, execution of the patient identification algorithm may be at least partially directed to the analysis of temporal variables. For example, means, maximums, slopes, accelerations, etc. of input variables (e.g., longitudinal variables) may be calculated and utilized to determine the patient's risk of developing HCC. In some implementations, the patient identification algorithm may execute a variety of models or modules. For example, the patient identification algorithm may execute a variety of models to predict outcomes at a respective variety of times, such as a current time, six months from the last patient visit, etc.

[0054] Then, at block 606, one or more outcome predictions are output as a result of executing the patient identification algorithm. In some implementations, the outcome predictions are output as a grouping of cirrhotic patients into groups of high risk patients and low risk patients. However, it is understood that any suitable grouping may be output from the patient identification algorithm. For example, the outcome predictions from the patient identification algorithm may include a grouping of patients into groups of high risk patients, medium risk patients, low risk patients, short term risk patients, long term risk patients, etc. Alternatively, the outcome predictions may include numerical data representing relative risk scores, probabilities, or other numerical representations of risk.
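For illustration, the sketch below converts predicted probabilities into the high-risk/low-risk grouping described above; the 0.10 cut-off and the "hcc_risk" vector are assumptions carried over from the earlier scoring example.

```r
# Minimal sketch: convert predicted probabilities into risk groups
# (the 0.10 cut-off is illustrative only).
risk_group <- ifelse(hcc_risk >= 0.10, "high risk", "low risk")
table(risk_group)

# The raw probabilities themselves can instead be reported as relative
# risk scores.
```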

[0055] As such, the patient identification algorithm may be utilized by clinicians to identify cirrhotic patients at high risk for HCC development. Further, the patient identification algorithm may be utilized to risk stratify patients with cirrhosis regarding their risk of HCC development.

Computer Implementation

[0056] The algorithm generation routine, the outcome prediction algorithm, and the internally and externally validated patient identification algorithm may be coded as a program for execution on a computing device such as that illustrated in Fig. 7. Generally, Fig. 7 illustrates an example of a suitable computing system environment 10 that may operate to display and provide the user interface described by this specification. It should be noted that the computing system environment 10 is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the method and apparatus of the claims. Neither should the computing environment 10 be interpreted as having any dependency or requirement relating to any one component or combination of components illustrated in the exemplary operating environment 10.

[0057] With reference to Fig. 7, an exemplary system for implementing the blocks of the claimed method and apparatus includes a general purpose computing device in the form of a computer 12. Components of computer 12 may include, but are not limited to, a processing unit 14 and a system memory 16. The computer 12 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 70, via a local area network (LAN) 72 and/or a wide area network (WAN) 73 via a modem or other network interface 75.

[0058] Computer 12 typically includes a variety of computer readable media that may be any available media that may be accessed by computer 12 and includes both volatile and nonvolatile media, removable and non-removable media. The system memory 16 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) and random access memory (RAM). The ROM may include a basic input/output system (BIOS). RAM typically contains data and/or program modules that include operating system 20, application programs 22, other program modules 24, and program data 26. The computer 12 may also include other removable/non-removable, volatile/nonvolatile computer storage media such as a hard disk drive, a magnetic disk drive that reads from or writes to a magnetic disk, and an optical disk drive that reads from or writes to an optical disk.

[0059] A user may enter commands and information into the computer 12 through input devices such as a keyboard 30 and pointing device 32, commonly referred to as a mouse, trackball or touch pad. Other input devices (not illustrated) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 14 through a user input interface 35 that is coupled to a system bus, but may be connected by other interface and bus structures, such as a parallel port, game port or a universal serial bus (USB). A monitor 40 or other type of display device may also be connected to the processor 14 via an interface, such as a video interface 42. In addition to the monitor, computers may also include other peripheral output devices such as speakers 50 and printer 52, which may be connected through an output peripheral interface 55.

[0060] Generally, the tree classification models, such as random forest, may be coded in the R language (a statistical programming language developed and distributed by the GNU system) or any other computing language for execution on computer 12. Once the classification model program (e.g., random forest) is loaded onto computer 12, the program may be executed on observed data, such as the training set of patient results indicating clinical response and values for blood counts, blood chemistry, and patient age. This observed data may be loaded onto any of the computer storage devices of computer 12 to generate an appropriate tree algorithm (e.g., using boosted trees or random forest). Once generated, the tree algorithm, which may take the form of a large set of if-then conditions, may then be coded using any general computing language for test implementation. For example, the if-then conditions can be captured using C/C++ and compiled to produce an executable, which, when run, accepts new patient data and outputs a calculated prediction or grouping of HCC risk. The output of the executable program may be displayed on a display (e.g., a monitor 40) or sent to a printer 52. The output may be in the form of a graph or table indicating the prediction or probability value along with related statistical indicators.
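Although the paragraph above contemplates compiling the if-then rules into a C/C++ executable, the minimal sketch below stays in R for consistency with the earlier modeling examples: it persists a fitted forest and applies it to new patient data. The file name, function name, and "rf_model" object are assumptions for illustration only.

```r
# Minimal sketch: persist the fitted forest and score new patient data,
# analogous to compiling the if-then rules into an executable.
saveRDS(rf_model, "hcc_rf_model.rds")          # hypothetical file name

score_patients <- function(new_data) {
  model <- readRDS("hcc_rf_model.rds")
  predict(model, newdata = new_data, type = "prob")[, "yes"]
}

# Example usage: risk <- score_patients(new_patient_data)
```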




 