

Title:
MULTI-MODAL MACHINE LEARNING TO DETERMINE RISK STRATIFICATION
Document Type and Number:
WIPO Patent Application WO/2023/201054
Kind Code:
A1
Abstract:
Presented herein are systems, methods, and non-transient computer readable media for determining risk scores using multimodal feature sets. A computing system may identify a first feature set for a first subject at risk of a condition. The first feature set may include (i) a first radiological feature derived from a tomogram of a section associated with the condition within the first subject, (ii) a first histologic feature acquired using a whole slide image of a sample having the condition from the first subject, and (iii) a first genomic feature obtained from gene sequencing of the first subject for genes associated with the condition. The computing system may apply the first feature set to a model. The computing system may determine, from applying the first feature set to the model, a predicted risk score of the condition for the first subject.

Inventors:
AHERNE EMILY (US)
BOEHM KEVIN (US)
LAKHMAN YULIA (US)
NIKOLOVSKI INES (US)
ZAMARIN DMITRIY (US)
ELLENSON LORA (US)
PATEL DRUV (US)
GAO JIANJIONG (US)
SHAH SOHRAB P (US)
VAZQUEZ GARCIA IGNACIO (US)
Application Number:
PCT/US2023/018678
Publication Date:
October 19, 2023
Filing Date:
April 14, 2023
Assignee:
MEMORIAL SLOAN KETTERING CANCER CENTER (US)
MEMORIAL HOSPITAL FOR CANCER AND ALLIED DISEASES (US)
SLOAN KETTERING INST CANCER RES (US)
International Classes:
G06T7/00; G01N33/574; G06T7/11; G16H50/20; G16H50/30; G06T7/12
Foreign References:
US20170270666A1, 2017-09-21
US20200166523A1, 2020-05-28
US20210255200A1, 2021-08-19
US20210319906A1, 2021-10-14
US20180016642A1, 2018-01-18
US20190242894A1, 2019-08-08
US20160117439A1, 2016-04-28
Other References:
ERGEN BURHAN, SERDAR ABUT: "Gender Recognition Using Facial Images", INTERNATIONAL PROCEEDINGS OF CHEMICAL, BIOLOGICAL AND ENVIRONMENTAL ENGINEERING (IPCBEE), IACSIT PRESS, SINGAPORE, vol. 60, 1 January 2013 (2013-01-01), Singapore, pages 112-117, XP093102168, ISSN: 2010-4618
Attorney, Agent or Firm:
KHAN, Shabbi S. et al. (US)
Claims:
WHAT IS CLAIMED IS:

1. A method of determining risk stratification for subjects, comprising: identifying, by a computing system, a first feature set for a first subject at risk of a condition, the first feature set comprising:

(i) a first radiological feature derived from a tomogram of a section associated with the condition within the first subject;

(ii) a first histologic feature acquired using a whole slide image of a sample having the condition from the first subject, and

(iii) a first genomic feature obtained from gene sequencing of the first subject for genes associated with the condition; applying, by the computing system, the first feature set to a model, wherein the model is established using a plurality of second feature sets and a plurality of expected risk scores for a corresponding plurality of second subjects; determining, by the computing system, from applying the first feature set to the model, a predicted risk score of the condition for the first subject; and storing, by the computing system, using one or more data structures, an association between the predicted risk score and the first feature set for the first subject.

2. The method of claim 1, further comprising classifying, by the computing system, the first subject into one of a plurality of risk level groups based on a comparison between the predicted risk score indicating a likelihood of an occurrence of an event due to the condition in the first subject and a threshold for each of the plurality of risk level groups.

3. The method of claim 1, further comprising establishing, by the computing system, the model comprising a multivariate model using one or more features selected from the plurality of second feature sets using one or more corresponding univariate models.

4. The method of claim 1, wherein determining the predicted risk score further comprises determining a survival function identifying the predicted risk score for the first subject over a period of time.

5. The method of claim 1, wherein identifying the first feature set further comprises selecting, from a plurality of radiological features, the first radiological feature based on a hazard ratio of each of the plurality of radiological features determined using a univariate model for radiological features.

6. The method of claim 1, wherein identifying the first feature set further comprises selecting, from a plurality of histological features, the first histological feature based on a hazard ratio of each of the plurality of histological features determined using a univariate model for histological features.

7. The method of claim 1, wherein the first radiological feature is derived from the tomogram using a Coif-wavelet transform, and comprises at least one of: (i) a gray level co-occurrence matrix (GLCM), (ii) a gray level dependence matrix (GLDM), (iii) a gray level run length matrix (GLRLM), (iv) a gray level size zone matrix (GLSZM), or (v) a neighboring gray tone difference matrix.

8. The method of claim 1, wherein the first histologic feature further comprises at least one of: (i) a tissue type of the sample from which the whole slide image is derived, (ii) an area of cell nuclei corresponding to the condition within the sample, or (iii) a length of a portion of the sample corresponding to the tissue type.

9. The method of claim 1, wherein the first genomic feature identifies a status of Homologous recombination deficiency (HRD) or Homologous recombination proficiency (HRP) in the first subject, the status determined using at least one of: (i) variants in genes associated with HRD DNA damage response or (ii) subtypes for disjoint tandem duplicator and foldback inversion mutations.

10. The method of claim 1, further comprising providing, by the computing system, information based on the association between the predicted risk score and the first feature set for the first subject.

11. A system for determining risk stratification for subjects, comprising: a computing system having one or more processors coupled with memory, configured to: identify a first feature set for a first subject at risk of a condition, the first feature set comprising: (i) a first radiological feature derived from a tomogram of a section associated with the condition within the first subject;

(ii) a first histologic feature acquired using a whole slide image of a sample having the condition from the first subject, and

(iii) a first genomic feature obtained from gene sequencing of the first subject for genes associated with the condition; apply the first feature set to a model, wherein the model is established using a plurality of second feature sets and a plurality of expected risk scores for a corresponding plurality of second subjects; determine, from applying the first feature set to the model, a predicted risk score of the condition for the first subject; and store, using one or more data structures, an association between the predicted risk score and the first feature set for the first subject.

12. The system of claim 11, wherein the computing system is further configured to classify the first subject into one of a plurality of risk level groups based on a comparison between the predicted risk score indicating a likelihood of an occurrence of an event due to the condition in the first subject and a threshold for each of the plurality of risk level groups.

13. The system of claim 11, wherein the computing system is further configured to establish the model comprising a multivariate model using one or more features selected from the plurality of second feature sets using one or more corresponding univariate models.

14. The system of claim 11, wherein the computing system is further configured to determine a survival function identifying the predicted risk score for the first subject over a period of time.

15. The system of claim 11, wherein the computing system is further configured to select, from a plurality of radiological features, the first radiological feature based on a hazard ratio of each of the plurality of radiological features determined using a univariate model for radiological features.

16. The system of claim 11, wherein the computing system is further configured to select, from a plurality of histological features, the first histological feature based on a hazard ratio of each of the plurality of histological features determined using a univariate model for histological features.

17. The system of claim 11, wherein the first radiological feature is derived from the tomogram using a Coif-wavelet transform, and comprises at least one of: (i) a gray level co-occurrence matrix (GLCM), (ii) a gray level dependence matrix (GLDM), (iii) a gray level run length matrix (GLRLM), (iv) a gray level size zone matrix (GLSZM), or (v) a neighboring gray tone difference matrix.

18. The system of claim 11, wherein the first histologic feature further comprises at least one of: (i) a tissue type of the sample from which the whole slide image is derived, (ii) an area of cell nuclei corresponding to the condition within the sample, or (iii) a length of a portion of the sample corresponding to the tissue type.

19. The system of claim 11, wherein the first genomic feature identifies a status of Homologous recombination deficiency (HRD) or Homologous recombination proficiency (HRP) in the first subject, the status determined using at least one of: (i) variants in genes associated with HRD DNA damage response or (ii) subtypes for disjoint tandem duplicator and foldback inversion mutations.

20. The system of claim 11, wherein the computing system is further configured to provide information based on the association between the predicted risk score and the first feature set for the first subject.

Description:
Multi-Modal Machine Learning to Determine Risk Stratification

CROSS-REFERENCE TO RELATED PATENT APPLICATIONS

[0001] The present application claims priority to U.S. Provisional Patent Application No. 63/331,390, titled “Multi-Modal Machine Learning to Determine Risk Stratification,” filed April 15, 2022, which is incorporated herein by reference in its entirety.

BACKGROUND

[0002] A computing system may apply various machine learning (ML) techniques on an input to generate an output.

SUMMARY

[0003] Aspects of the present disclosure are directed to systems and methods of determining risk scores using multimodal feature sets. A computing system may identify a first feature set for a first subject at risk of a condition. The first feature set may include (i) a first radiological feature derived from a tomogram of a section associated with the condition within the first subject, (ii) a first histologic feature acquired using a whole slide image of a sample having the condition from the first subject, and (iii) a first genomic feature obtained from gene sequencing of the first subject for genes associated with the condition. The computing system may apply the first feature set to a model. The model may be established using a plurality of second feature sets and a plurality of expected risk scores for a corresponding plurality of second subjects. The computing system may determine, from applying the first feature set to the model, a predicted risk score of the condition for the first subject. The computing system may store, using one or more data structures, an association between the predicted risk score and the first feature set for the first subject.
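As a non-limiting illustration, the storing step may be sketched as a small in-memory data structure that keeps each predicted risk score together with the feature set it was computed from. The class names, field names, and subject identifier below are hypothetical and chosen only for this sketch; the disclosure does not prescribe a particular data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class FeatureSet:
    """Hypothetical container for the three feature modalities of one subject."""
    radiological: Dict[str, float]
    histologic: Dict[str, float]
    genomic: Dict[str, str]


@dataclass
class RiskStore:
    """Hypothetical store mapping a subject identifier to (feature set, risk score)."""
    _records: Dict[str, Tuple[FeatureSet, float]] = field(default_factory=dict)

    def store(self, subject_id: str, features: FeatureSet, risk_score: float) -> None:
        # Keep the score and the features it was derived from in one record.
        self._records[subject_id] = (features, risk_score)

    def lookup(self, subject_id: str) -> Tuple[FeatureSet, float]:
        return self._records[subject_id]


example_features = FeatureSet(
    radiological={"glcm_autocorrelation": 12.5},  # illustrative value only
    histologic={"mean_tumor_nuclear_area": 40.0},  # illustrative value only
    genomic={"hrd_status": "HRD"},
)
store = RiskStore()
store.store("subject-001", example_features, 0.42)
```

A persistent database table keyed by subject would serve the same purpose; the in-memory dictionary is used here only to keep the sketch self-contained.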

[0004] In some embodiments, the computing system may classify the first subject into one of a plurality of risk level groups based on a comparison between the predicted risk score indicating a likelihood of an occurrence of an event due to the condition in the first subject and a threshold for each of the plurality of risk level groups. In some embodiments, the computing system may establish the model comprising a multivariate model using one or more features selected from the plurality of second feature sets using one or more corresponding univariate models. In some embodiments, the computing system may provide information based on the association between the predicted risk score and the first feature set for the first subject.
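By way of example only, the comparison of a predicted risk score against a threshold for each risk level group may be sketched as follows. The function name, the threshold values, and the convention that a score equal to a threshold falls into the higher group are assumptions made for this sketch.

```python
from bisect import bisect_right
from typing import Sequence


def classify_risk(score: float, thresholds: Sequence[float], labels: Sequence[str]) -> str:
    """Assign a risk level group by comparing the score to ascending thresholds.

    With k thresholds there are k + 1 groups. A score equal to a threshold is
    placed in the higher group (an assumption of this sketch).
    """
    if len(labels) != len(thresholds) + 1:
        raise ValueError("need exactly one more label than thresholds")
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in ascending order")
    # bisect_right counts how many thresholds are <= score.
    return labels[bisect_right(thresholds, score)]


# Two-group example with a single hypothetical threshold of 0.5.
low_example = classify_risk(0.3, [0.5], ["low", "high"])
high_example = classify_risk(0.7, [0.5], ["low", "high"])
```

The same function covers more than two groups, for instance `classify_risk(0.6, [0.33, 0.66], ["low", "intermediate", "high"])`.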

[0005] In some embodiments, the computing system may determine a survival function identifying the predicted risk score for the first subject over a period of time. In some embodiments, the computing system may select, from a plurality of radiological features, the first radiological feature based on a hazard ratio of each of the plurality of radiological features determined using a univariate model for radiological features. In some embodiments, the computing system may select, from a plurality of histological features, the first histological feature based on a hazard ratio of each of the plurality of histological features determined using a univariate model for histological features.
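For illustration, one way a survival function over time may be obtained from a risk score is sketched below, assuming a Cox proportional hazards form in which S(t | x) = S0(t) ** exp(eta), where eta is the weighted sum of the subject's features and exp(coefficient) is the hazard ratio of a feature. The baseline survival values, coefficient, and feature name are hypothetical and are not taken from the disclosure.

```python
import math
from typing import Dict, List, Tuple


def linear_predictor(coefs: Dict[str, float], features: Dict[str, float]) -> float:
    """Log relative hazard: sum of coefficient * feature value."""
    return sum(coefs[name] * features[name] for name in coefs)


def survival_curve(
    baseline: List[Tuple[float, float]],
    coefs: Dict[str, float],
    features: Dict[str, float],
) -> List[Tuple[float, float]]:
    """Return (time, survival probability) pairs for one subject.

    `baseline` holds (time, S0(time)) pairs for a subject whose linear
    predictor is zero; the Cox form raises S0 to the relative hazard.
    """
    relative_hazard = math.exp(linear_predictor(coefs, features))
    return [(t, s0 ** relative_hazard) for t, s0 in baseline]


# Hypothetical baseline at 12 and 24 months and one feature with hazard ratio 2.
baseline_example = [(12.0, 0.9), (24.0, 0.8)]
coefs_example = {"feature_a": math.log(2.0)}
curve = survival_curve(baseline_example, coefs_example, {"feature_a": 1.0})
```

With a hazard ratio of 2 and a feature value of 1, each baseline probability is squared, so the example subject's survival at 12 months is 0.9 squared, or about 0.81.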

[0006] In some embodiments, the first radiological feature may be derived from the tomogram using a Coif-wavelet transform, and comprises at least one of: (i) a gray level co-occurrence matrix (GLCM), (ii) a gray level dependence matrix (GLDM), (iii) a gray level run length matrix (GLRLM), (iv) a gray level size zone matrix (GLSZM), or (v) a neighboring gray tone difference matrix. In some embodiments, the first histologic feature further comprises at least one of: (i) a tissue type of the sample from which the whole slide image is derived, (ii) an area of cell nuclei corresponding to the condition within the sample, or (iii) a length of a portion of the sample corresponding to the tissue type.
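As a simplified, non-limiting sketch of a GLCM-based texture feature, the following computes a normalized gray level co-occurrence matrix for a two-dimensional image and an autocorrelation value from it. The sketch makes several assumptions: the image is already discretized to gray levels 1 through `levels`, a single pixel offset is used, the wavelet filtering step is omitted, and autocorrelation is taken as the sum of p(i, j) * i * j, a convention commonly used in radiomics.

```python
from typing import List, Tuple


def glcm(image: List[List[int]], levels: int, offset: Tuple[int, int] = (0, 1)) -> List[List[float]]:
    """Symmetric, normalized co-occurrence matrix for one pixel offset.

    `image` is a list of rows of integer gray levels in 1..levels.
    """
    d_row, d_col = offset
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c] - 1, image[r2][c2] - 1
                # Count the pair in both directions to make the matrix symmetric.
                counts[i][j] += 1
                counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs exist for this offset")
    return [[value / total for value in row] for row in counts]


def glcm_autocorrelation(p: List[List[float]]) -> float:
    """Sum of p(i, j) * i * j with gray levels indexed from 1."""
    n = len(p)
    return sum(p[i][j] * (i + 1) * (j + 1) for i in range(n) for j in range(n))


# A uniform image of gray level 2 puts all probability mass at (2, 2).
uniform_value = glcm_autocorrelation(glcm([[2, 2], [2, 2]], levels=2))
```

Under this definition, regions dominated by co-occurring high gray levels yield larger autocorrelation values than regions dominated by low gray levels.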

BRIEF DESCRIPTION OF THE DRAWINGS

[0007] FIG. 1. Schematic outline of the architecture. (a) Multiple data modalities were acquired through routine diagnostics to inform clinical decision making: (b) pretreatment contrast-enhanced CT (CE-CT) scans of the abdomen and pelvis, (c) pretreatment H&E-stained diagnostic biopsies, and (d) HRD status inferred from hybridization capture-based targeted sequencing or clinical HRD-DDR gene panels. (e) Integrated multimodal analyses by late fusion to stratify patients by overall survival. (Abbreviations: CT: computed tomography, GLSZM-SAE: gray level size zone matrix small area emphasis, GLRLM-GLV: gray level run length matrix gray level variance, H&E: hematoxylin and eosin, Var: variance, Nuc: nuclear, NGS: next-generation sequencing, LSTs: large-scale state transitions, NtAI: number of subchromosomal regions with allelic imbalance extending to the telomere, LOH: loss of heterozygosity, HRD: homologous recombination deficiency, CRS: chemotherapy response score, OS: overall survival).

[0008] FIG. 2(a)-(c). Overview of cohorts and data types acquired. (a) Venn diagram of patients in the training cohort with available clinical imaging and inferred HRD status. (b) Inferred subtypes, sequencing modality, dataset of origin, genes with five or more variants, and signature 3 status of each patient. Gray represents sequenced genes without the aberrations shown, and white represents an unsequenced gene. (c) Kaplan-Meier analysis on overall survival stratified by HRD status (N = 377 patients). P-values were calculated using the log-rank test. (Abbreviations: Sig.: mutational signature, SNV: simple nucleotide variation, Amp.: copy number amplification, WES: whole-exome sequencing).

[0009] FIG. 3(a)-(h). High-autocorrelation omental implants are associated with shorter overall survival. (a) Segmented omental lesion (red) on CE-CT. (b) The log hazard ratio is depicted for each radiomic feature derived from omental implants (N = 600 features). Features above the line were statistically significant by Cox regression after multiple testing correction of interquartile range-filtered features. (c) Adnexal radiomic features (N = 600 features) were not significant by Cox regression after correction of interquartile range-filtered features. (d) The hazard ratio with 95% C.I. as estimated by Cox regression is shown for the feature in the final model, the autocorrelation derived from the gray level co-occurrence matrix for the wavelet-filtered image. (e) The value of this feature against OS is plotted for patients in the training set (N = 251 patients). (f) Training and test concordance indices for the model alone are shown: the height of each bar shows the c-Index, and the lower and upper points of the respective error bars depict the 95% C.I. by 100-fold leave-one-out bootstrapping. (g, h) Two risk groups based on the model’s predicted risk score are shown for the training and test sets. P-values were derived using the log-rank test. (Abbreviations: glcm: gray level co-occurrence matrix, gldm: gray level dependence matrix, glrlm: gray level run length matrix, glszm: gray level size zone matrix, ngtdm: neighboring gray tone difference matrix, HLL: high-low-low wavelet filter, OS: overall survival, c: Harrell’s concordance index).

[0010] FIG. 4(a)-(d). Weakly supervised deep learning accurately infers HGSOC tissue type on H&E. (a) Annotated tiles normalized using Macenko’s method chosen at random. The number of tiles for each tissue type is shown. (b) Workflow of ResNet-18 model trained using the annotated regions. (c) Example of the model’s predictions for an annotated region. (d) The confusion matrix aggregated across folds of cross validation for each of the tissue classes.

[0011] FIG. 5(a)-(g). Interpretable histopathologic features stratify HGSOC patients by OS. (a) Tissue map from H&E slides with nuclear detections yielding tissue-type and cell-type features. (b) Log hazard ratios of the two chosen histologic features (with 95% C.I. as estimated by Cox regression; fit on N = 243 patients). (c) Training and test concordance indices are shown: the height of each bar shows the c-Index, and the lower and upper points of the respective error bars depict the 95% C.I. by 100-fold leave-one-out bootstrapping. (d, e) Kaplan-Meier survival analysis and log-rank test statistics for the training (d) and test (e) sets. (f, g) H&E of extreme examples of the model’s inferred mean tumoral nuclear area (scale bar is 50 µm for each image).

[0012] FIG. 6(a)-(h). Multimodal integration improves stratification and identifies clinically significant subgroups. (a) The test c-indices for integration of combinations of multimodal features are shown: the height of each bar shows the c-Index, and the lower and upper points of the respective error bars depict the 95% C.I. by 100-fold leave-one-out bootstrapping. Asterisks denote 95% confidence of significant ordering of the test set by 1000-fold permutation test. (b) Log hazard ratios of imaging without (top) and with (bottom) HRD integration. Two modalities are shown in the top panel (fit on N = 122 patients), and three are shown in the bottom (fit on N = 114 patients). (c) Kaplan-Meier plot comparing high- and low-risk groups determined by the GRH model on the training set. P-value calculated using the log-rank test. (d) Kaplan-Meier plot comparing high- and low-risk groups on the test set. P-value calculated using the log-rank test. (e) Unique patients at risk of early death are identified by radiologic, histopathologic, and genomic modalities. Only patients in the test set with uncensored outcomes (N = 23 patients) are shown. (f) Kendall rank correlation coefficient of the risk quantile across pairs of the individual modalities, indicating low mutual ordering information between individual modalities in the training set. (g) KM plot of GHR model risk groups on progression-free survival in the test set. (One patient has unknown PFS.) P-value calculated using the log-rank test. (h) Distributions of GHR model score of low (blue) and high (green) chemotherapy response score (CRS) in the training set (N = 46 patients). Boxes denote interquartile range, with the center depicting the median and the whiskers denoting the entire distribution excluding any outliers. Significance was assessed by a one-sided Mann-Whitney U test: p = 0.0044. ** denotes p<0.01. (Abbreviations: perm.: permutation test, G: genomic model, H: histopathologic model, R: radiologic model, C: clinical model, GHR: combined genomic, histopathologic, and radiologic model, GHRC: combined genomic, histopathologic, radiologic, and clinical model, NET: no evidence of tumor, PFS: progression-free survival, OS: overall survival).

[0013] FIG. 7. Segmenting radiologist and CT vendor in training and test sets. (a) The same three expert radiologists segmented the discovery and test cases. (b) The most common scanner vendors were General Electric and Siemens for both cohorts, with other vendors being less represented. The test set contained one scan acquired on an Imatron device.

[0014] FIG. 8. Genomic features of the training and test sets. (a) The distribution of large-scale state transitions in the discovery cohort is depicted. The threshold for LSThigh versus LSTlow may be set at 7 LSTs, which is lower than previously reported thresholds for whole-exome sequencing. This is because the cohort is a targeted gene panel, and LSTs occurring at the same rate will measure lower on targeted panels compared to more comprehensive sequencing. (b) Signature three was detected by SigMA as the dominant signature with high confidence (HC) and low confidence (LC) in a significant number of cases, and the next most prevalent was the clock signature. (c) The COSMIC SBS3 frequencies for all TCGA-OV cases with sequencing from are shown, and the distribution is clearly bimodal but imbalanced. (d, e) Patients with HRD-type disease have longer OS than those with HRP-type disease in the training and test sets. (f) Incorporating thresholded LST counts as indicators of HRD status worsened the significance of the separation of the HRD and HRP curves and was thus not used in the definition of HRD status. (g) Using BRCA2 SNVs, BRCA1 SNVs, CCNE1 CNAs, and CDK12 SNVs, a subset of all patients were categorized into the following mutational subtypes: HRD-Deletion (HRD-DEL), HRD-Duplication (HRD-DUP), Foldback Inversion (FBI), and Tandem Duplications (TD), respectively. The patients stratify as expected by PFS, with HRP-type patients suffering earlier progression of disease (p value for log-rank test between aggregated HRD patients and aggregated HRP patients). (h) The stratification is ordered as expected but fails to reach significance for OS. (i) Using only patients with explicit evidence of HRP or HRD disease also yields groups with significantly different OS.

[0015] FIG. 9. Radiomic feature values by segmenting radiologist, CT scanner, and site. The radiomic feature chosen for the model is not confounded by (a) segmenting radiologist, (b) CT vendor, or (c) whether the scan was acquired at the institution or elsewhere.

[0016] FIG. 10. Example cross-validation histopathologic tissue type classifications.

[0017] FIG. 11. Histopathologic feature discovery. The logarithm of the univariate hazard ratio is depicted for each histopathologic feature, with the cluster in the upper right quadrant being primarily features describing tumor nuclear diameter and size.

[0018] FIG. 12. Histopathologic embeddings by specimen size and histopathologic feature selection. The embeddings in UMAP space of the two-feature histopathologic signature do not appear influenced by the relative specimen size (here depicted as the quantile of the number of foreground tiles detected). The larger specimens appear relatively evenly distributed, with the exception of a preponderance of smaller specimens toward the bottom left of the plot.

[0019] FIG. 13. Test performance of histopathologic-radiomic model. (a) The RH model separates the high- and low-risk groups by OS, but with a reduced separation (45% and 70% survival at 36 months), (b) However, the RH model-determined curves do not separate significantly by PFS.

[0020] FIG. 14. Learning only from cases with full information (N=114) worsens performance. (a) Ovarian and (b) omental features do not reach significance during discovery. (c) Histopathologic feature discovery is similar. (d) Performance on the test set against overall survival is worse. (e) The GRH model fails to significantly stratify the test set by OS. (f) It also fails to significantly stratify the test set by PFS.

[0021] FIG. 15. No robust association exists between individual modalities in the test set. (a) The maximal magnitude of the Pearson correlation between individual modalities is 0.178. (b) The maximal magnitude of the Spearman correlation between individual modalities is 0.135.

[0022] FIG. 16. Chemotherapy response scores for all models on the test set. (a-o) for C, G, GC, GH, GHC, GR, GRC, GRH, GRHC, H, HC, R, RC, RH, and RHC models, respectively.

[0023] FIG. 17 depicts a block diagram of a system for determining risk scores using multimodal feature sets in accordance with an illustrative embodiment.

[0024] FIG. 18A depicts a block diagram of a process of extracting multimodal features in the system for determining risk scores in accordance with an illustrative embodiment.

[0025] FIG. 18B depicts a block diagram of a process of applying risk prediction models to multimodal features in accordance with an illustrative embodiment.

[0026] FIG. 19 depicts a flow diagram of a method of determining risk scores using multimodal feature sets in accordance with an illustrative embodiment.

[0027] FIG. 20 depicts a block diagram of a server system and a client computer system in accordance with an illustrative embodiment.

DETAILED DESCRIPTION

[0028] Following below are more detailed descriptions of various concepts related to, and embodiments of, systems and methods for determining risk stratification using multi-modal machine learning models. It should be appreciated that various concepts introduced above and discussed in greater detail below may be implemented in any of numerous ways, as the disclosed concepts are not limited to any particular manner of implementation. Examples of specific implementations and applications are provided primarily for illustrative purposes.

[0029] Section A describes multi-modal machine learning to improve risk stratification of high-grade serous ovarian cancer;

[0030] Section B describes systems and methods of determining risk scores using multimodal features; and

[0031] Section C describes a network environment and computing environment which may be useful for practicing various embodiments described herein.

A. Multi-Modal Machine Learning to Improve Risk Stratification of High-Grade Serous Ovarian Cancer

[0032] Patients with high-grade serous ovarian cancer (HGSOC) suffer poor prognosis and variable response to treatment. Known prognostic factors for this disease include homologous recombination deficiency status, age, pathologic stage, and residual disease status after debulking surgery. Other approaches have highlighted important prognostic information captured in computed tomography and histopathologic specimens, which can be exploited through machine learning. However, little is known about the capacity of combining features from these disparate sources to improve prediction of treatment response. Here, a multimodal dataset of 444 patients with primarily late-stage HGSOC is assembled, and quantitative features, such as tumor nuclear size on H&E and omental texture on CE-CT, associated with prognosis are discovered. It was found that these features contributed complementary prognostic information relative to one another and clinico-genomic features. By fusing histopathologic, radiologic, and clinico-genomic machine learning models, a path toward improved risk stratification of cancer patients through multimodal data integration is demonstrated.

Introduction

[0033] High-grade serous ovarian cancer (HGSOC) is the most common cause of death from gynecologic malignancies, with a five-year survival rate of less than 30% for metastatic disease. Initial clinical management relies on either primary debulking surgery (PDS), or neoadjuvant chemotherapy followed by interval debulking surgery (NACT-IDS). Endogenous mutational processes are an established determinant of clinical course, with improved response of homologous recombination deficient (HRD) disease to platinum-based chemotherapy and poly-ADP ribose polymerase (PARP) inhibitors. More nuanced genomic analyses integrating point mutation and structural variation patterns further refine this stratification into four biologically and prognostically meaningful subtypes including distinct sub-groups of HRD, foldback inversion enriched tumors and those with distinctive accrual of large tandem duplications. Beyond genomic factors, clinical indicators such as patient age, pathologic stage, and residual disease (RD) status after debulking surgery are also prognostic. However, these clinico-genomic factors alone fail to adequately account for the heterogeneity of clinical outcomes. Identifying patients at risk of poor response to standard treatment remains a critical unmet need. Improved risk stratification models would aid gynecologic oncologists in selecting primary treatment, planning surveillance frequency, making decisions about maintenance therapy, and counseling patients about clinical trials of investigative agents.

[0034] Beyond clinico-genomic features, multi-scale clinical imaging is routinely acquired during the course of care, including contrast-enhanced computed tomography (CE-CT) at the mesoscopic scale and hematoxylin and eosin (H&E)-stained slides at the microscopic scale. Digital forms of these diagnostics present opportunities to develop computational models and test whether integrating these data modalities improves identification of risk groups for HGSOC. At the mesoscopic scale, other radiologic studies have uncovered quantitative CE-CT features that are predictive of early progression, time to recurrence, and overall survival in HGSOC. Other approaches have analyzed the prognostic information captured within adnexal lesions or the whole burden of disease and variably use either deep learning or empirically reproducible radiomic features from the Imaging Biomarker Standardization Initiative. However, a radiomic prognostic model based on omental lesions has not yet been developed even though omental implants are ubiquitous in advanced-stage disease. Such a model would be advantageous because it is possible even for less experienced observers to delineate omental implants, and it would alleviate the need for highly challenging and time-consuming segmentation of the total burden of disease.

[0035] At the microscopic scale, H&E-stained tissue biopsies enable pathologic diagnosis and are routinely acquired before the start of therapy. A quantitative histopathologic study of HGSOC identified patterns of immune infiltration on H&E slides that correlate with mutational subtypes. In other cancer types, studies of whole slide images (WSIs) have advanced the ability to quantify the histopathologic architecture of tumors using deep and interpretable features. Apart from stage, HGSOC lacks independent pretreatment pathologic factors by which to stratify patients, and quantitative approaches thus present an opportunity to systematically develop scaled models that are beyond qualitative human interpretation. Interpretable features are less prone to overfitting in small cohorts and can be more easily interrogated by human pathologists.

[0036] Conceptually, genomic sequencing does not account for spatial context, and it is thus hypothesized that multiscale imaging contains complementary information, rather than merely recapitulating genomic prognostication. There is also the potential for clinical multimodal machine learning to outperform unimodal systems by combining information from multiple routine data sources. In the present disclosure, the complementary prognostic information of multimodal features derived from clinical, genomic, histopathologic, and radiologic data obtained during the routine diagnostic workup of HGSOC patients is examined (Fig. 1a). The prognostic relevance of ovarian and omental radiomic features derived from CE-CT is tested, and a model based on omental features (Fig. 1b) and a histopathologic model based on pre-treatment tissue samples to risk stratify patients (Fig. 1c) are developed. The models were validated on a test cohort and integrated with clinical and genomic information (Fig. 1d) using a late fusion multimodal statistical framework (Fig. 1e). These results revealed the empirical advantages of cross-modal integration and demonstrated the ability of multimodal machine learning models to improve risk stratification of HGSOC patients.
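As a minimal sketch of late fusion, each unimodal model may first produce its own risk score, and the scores may then be combined in a second stage. The weighted sum below is one simple form of such a second stage and is an assumption of this sketch; the weights and scores shown are hypothetical, and in practice the weights could be coefficients fit by a second-stage survival model on training data.

```python
from typing import Dict


def late_fusion(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-modality risk scores into one multimodal risk score.

    Every modality named in `weights` must have a score; the modality
    letters (G, H, R) follow the abbreviations used in the figure legends.
    """
    missing = [name for name in weights if name not in scores]
    if missing:
        raise KeyError(f"missing scores for modalities: {missing}")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical unimodal scores for one subject and hypothetical fusion weights.
unimodal_scores = {"G": 1.0, "H": 2.0, "R": -1.0}
fusion_weights = {"G": 0.5, "H": 0.25, "R": 1.0}
fused_score = late_fusion(unimodal_scores, fusion_weights)
```

Because the fusion operates on scores rather than raw features, each unimodal model can be trained on every subject who has that modality, even when other modalities are unavailable for that subject.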

Results

Cohort and clinical characteristics

[0037] 444 patients with HGSOC, including 296 patients treated at Memorial Sloan Kettering Cancer Center (MSKCC) and 148 TCGA-OV cases, were analyzed. The 40 test cases were randomly sampled from the entire pool of cases with all data modalities available for analysis; the remaining 404 cases were used for training. The training set contained 160 patients with stage IV disease, 225 with stage III, 10 with stage II, 8 with stage I, and 1 with unknown stage (Supplementary Table 1). The test cohort contained 31 stage IV and 9 stage III patients. Median age at diagnosis was 63 years [IQR 55-71] for the training set and 66 years [IQR 59-70] for the test set. In the training cohort, 175 patients received neoadjuvant chemotherapy followed by interval debulking surgery (NACT-IDS), and the remaining 82 underwent primary debulking surgery (PDS). In the test cohort, 31 received NACT-IDS and 8 underwent PDS. 61 MSKCC patients were known to have received PARP inhibitors (Supplementary Table 1). Treatment regimens are not annotated for the remaining 148 TCGA patients. Median OS was 38.7 months [IQR 25-55] for training patients and 37.6 months [IQR 26-49] for testing patients. 132 training patients and 17 testing patients had censored OS outcomes (Supplementary Table 2).

[0038] Among 404 patients in the training cohort, 243 patients had H&E WSIs, 245 patients had adnexal lesions on pretreatment CE-CT, and 251 patients had omental implants on pretreatment CE-CT (Fig. 2a). All 40 patients in the internal test cohort had omental lesions on CE-CT, H&E WSIs, and available sequencing by construction; 29 patients had ovarian lesions on CE-CT. Three gynecologic radiologists volumetrically segmented adnexal lesions and representative omental lesions on all sections containing these lesions (FIG. 7a). The training and testing data were acquired with similar CT scanners (FIG. 7b).

[0039] Clinical sequencing is used to infer HRD status, in particular variants in genes associated with HRD DNA damage response (DDR) such as BRCA1 and BRCA2, and those specific to disjoint tandem duplicator and foldback inversion-enriched mutational subtypes (CDK12 and CCNE1 respectively, Fig. 1d, Fig. 2b-c). The genomes of 130 patients with appropriate consent are examined for direct evidence of homologous recombination deficiency, namely COSMIC single base substitution (SBS) signature 3, which is associated with defective HRD-DDR. In this subset of MSKCC patients, signature 3 was detected by SigMA with high confidence in 48 cases, detected with low confidence in 30 cases, and found not to be the dominant signature in 52 cases (FIG. 8b). In the TCGA, signature 3 was high in 6 cases and low in 51 (FIG. 8c). Patients with available sequencing and without evidence for HRD or HRP (N=126) were treated as HRP. Patients with conflicting evidence (N=6) or without sequencing (N=61) were assigned a label of “ambiguous” and excluded from all analyses involving HRD status. In total, the training cohort contained 218 HRP and 119 HRD cases (Fig. 2c). The test set contained 12 HRD and 28 HRP cases. HRD status alone (excluding ambiguous) stratified patients by OS with a c-Index of 0.55 in the training cohort and 0.52 in the test set (without fitting any model parameters; FIG. 8d-e). Aberrations specific to distinct endogenous mutational processes also stratified patients as expected: that is, patients with HRP disease had worse outcomes than those with HRD disease (p=7e-3; FIG. 8g, 2i).

CE-CT imaging feature selection and stratification

[0040] The prognostic relevance of features derived from radiology scans either obtained at the institution (91; 27%) using GE Medical Systems CT scanners or acquired at outside institutions (247; 73%) from a variety of CT scanners (FIG. 7; Supplementary Table 3) is studied. The majority of CE-CT scans were acquired with a peak kilovoltage of 120 (median 120 kVp, range: 90-140; Supplementary Table 3) and reconstructed with the standard convolutional kernel using 5 mm slice thickness (median 5 mm; range: 2.5-7.5; Supplementary Table 3). Three fellowship-trained radiologists with expertise in gynecologic oncologic imaging manually segmented all adnexal masses and representative omental implants on each pretreatment CE-CT scan (Fig. 1b, 3a).

[0041] Radiomic features are extracted from Coif wavelet-transformed images, yielding a 444-dimensional radiomic vector per site per patient. Using the training cohort, the hazard ratios and prognostic significance of omental and ovarian radiomic features are calculated using univariate Cox proportional hazards models (Supplementary Table 4). After correction for multiple hypothesis testing, omental features (Fig. 3b) but none of the ovarian features (Fig. 3c) exhibited statistically significant hazard ratios. Hence, going forward, only the omental implants are considered. Cox models are iteratively fit and pruned for multivariable significance on the nine omental features (Algorithm 1), yielding a univariate model based on the autocorrelation of the gray level co-occurrence matrix derived from the HLL Coif wavelet-transformed images (Fig. 3d). This feature exhibited a log(HR) of 1.68 (corrected p < 0.01; Fig. 3e) and was invariant to CT scanner manufacturers and segmenting radiologists (FIG. 9). The model stratified patients in the training and the test sets with concordance indices of 0.55 [95% C.I. 0.549-0.554] and 0.53 [95% C.I. 0.517-0.547], respectively (Fig. 3f). Kaplan-Meier analysis of the high- and low-risk groups (as determined by inferred risk) showed statistically different overall survival by the log-rank test (p < 0.01) in the training set (Fig. 3g), with median survival of 44 and 57 months, respectively, but not in the test set, with median survival of 38 and 47 months, respectively (Fig. 3h).
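The concordance indices reported above measure how often a model ranks a pair of patients in the same order as their observed survival. The following is a minimal sketch of Harrell's concordance index in plain Python, not the implementation used in the study. The function name and the handling of tied follow-up times (such pairs are simply skipped) are assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's c-index: share of comparable pairs ranked correctly by risk.

    times  -- observed follow-up times
    events -- 1 if death was observed, 0 if the patient was censored
    risks  -- predicted risk scores (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # ASSUMPTION: pairs with tied times are skipped. A pair is comparable
        # only if the shorter time corresponds to an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why the unimodal indices of 0.53 to 0.55 indicate only modest stratification.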

Histopathologic tissue type classifier for interpretable features

[0042] Next, a tissue type classifier is trained from histology images using a weakly supervised approach. Tissue types on 60 H&E WSIs are annotated, yielding more than 1.4 million partially overlapping tiles, each measuring 128x128 pixels (64x64 µm) and containing 4096 µm² of tissue (Fig. 4a). A ResNet-18 convolutional neural network (CNN) pretrained on ImageNet (Fig. 4b) classified tissue types with an accuracy of 0.88 (range 0.77-0.95) on pathologist-annotated areas labeled as fat, stroma, necrosis, and tumor (Fig. 4c) by four-fold slide-wise cross validation. Notably, the model correctly identified small regions of fat within stromal annotations and necrotic regions within the tumor, supporting the suitability of weakly supervised deep learning for this task and refining annotations into more granular classifications.

[0043] The cross-validation confusion matrix aggregated across folds showed good performance overall (Fig. 4d), with the most significant confusion being necrotic tiles predicted to be tumor and stroma. However, one disadvantage of weakly supervised learning is that neither the training data nor the validation data are exactly labeled. Hence, the cross-validation metrics are not computed against the exact truth. On visual inspection, the predictions were qualitatively concordant, with only moderate confusion of necrosis with tumor and stroma (FIG. 10).

Histopathologic stratification

[0044] The tissue type classifier is applied to the 243 training H&E WSIs of lesions from pretreatment specimens (Fig. 1c). These inferred tissue type maps are combined with detected cellular nuclei, yielding labeled nuclei (Fig. 5a). Subsequently, cell-type features are extracted from these nuclei and tissue-type features from the tissue-type maps, as described in the Methods. This yielded a histopathologic vector of 216 features. Next, the hazard ratios of features are estimated using univariate Cox models fit on slides in the training cohort. Several tissue-type features, such as overall tumoral area, were partially determined by specimen size, so specimen size was controlled for during selection. Of the 24 features with a log(hazard ratio) found to be significantly different from 0 with 95% confidence, 20 related to tumor nuclear diameter or size, with larger values being associated with shorter OS (FIG. 11; Supplementary Table 5). Again, Cox models were iteratively fit and pruned per Algorithm 1, yielding a multivariable model with two features: the mean tumor nuclear area and the major axis length of the stroma (Fig. 5b). This histopathologic signature was not confounded by specimen size (FIG. 12). This model stratified the training and test sets, with concordance indices of 0.56 [95% C.I. 0.559-0.564] and 0.54 [95% C.I. 0.527-0.560], respectively (Fig. 5c). High- and low-risk groups established based on the inferred risk scores separate well for the training set, with median survival of 34 and 49 months, respectively (Fig. 5d; p < 0.01). For the test set, the risk groups trended toward, but did not attain, significantly different separation, with median survival of 37 and 50 months (Fig. 5e; p=0.076). To probe the interpretability of the histopathologic features, the mean tumor nuclear area is investigated: examples of low (Fig. 5f) and high (Fig. 5g) values are shown, which were associated with better and worse prognosis, respectively.
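The median survival times quoted for the high- and low-risk groups are the kind of quantity a Kaplan-Meier estimate provides. The sketch below, in plain Python, shows how a median can be read off a Kaplan-Meier curve in the presence of censoring. The function names are illustrative, and the convention that the median is the first event time at which estimated survival drops to 0.5 or below is an assumption rather than something stated in the text.

```python
def kaplan_meier(times, events):
    """Return the Kaplan-Meier curve as [(event_time, survival_probability)]."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # Observed deaths exactly at time t (censored patients do not count).
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= (at_risk - deaths) / at_risk
        curve.append((t, survival))
    return curve


def median_survival(times, events):
    """First event time at which estimated survival is at most 0.5, else None."""
    for t, s in kaplan_meier(times, events):
        if s <= 0.5:
            return t
    return None  # the median is not reached within follow-up
```

Applied separately to the patients assigned to each risk group, this gives one median per group. The log-rank test that compares the two curves is not shown here.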

Multimodal prognostication

[0045] The following were tested for prognostic significance: patient age, pathologic stage, residual disease status after debulking surgery, NACT-IDS versus PDS treatment paradigm, receipt of PARP inhibitors in the first two years after diagnosis, and the presence or absence of adnexal lesions (Supplementary Table 6). Ultimately, a model was trained on residual disease status and PARP inhibitor administration. This model stratified the test set with c=0.51 [95% C.I. 0.493-0.528]. A late-fusion approach was then implemented to integrate histopathologic, radiomic, genomic, and clinical data into multimodal models (Fig. 1e). Specifically, each patient’s log partial hazard is predicted using the Cox model trained on the respective modality, and a final Cox model is then trained to integrate these predictions (Methods). In the test set, the model combining both imaging modalities (radiomic-histopathologic, RH model) significantly outperformed the HRD status-based model, clinical model, and individual imaging models, with a test concordance index of 0.62 [95% C.I. 0.604-0.638] (Fig. 6a). The model with genomic, radiomic, and histopathologic (GRH) modalities performed comparably, with a test concordance index of 0.61 [95% C.I. 0.594-0.625]. The histopathologic submodel score remained significant upon addition of HRD status (Fig. 6b). The high- and low-risk groups established by the GRH model were significantly different by log-rank test in the training set (median survival of 34 and 50 months, respectively; p=0.026; Fig. 6c). In the test set, the GRH risk groups also showed significantly different OS, with median survival of 30 months for the high-risk group and 50 months for the low-risk group (p=0.023; Fig. 6d). At 36 months, 68% and 34% of patients in the low- and high-risk groups, respectively, had survived in the test set. The separation of the RH model’s risk groups was inferior (FIG. 13). Notably, analysis of only training cases with full information (n=114) resulted in poor performance (FIG. 14), reinforcing the ability of late fusion models to learn in the setting of missing data. No robust association was found between modalities to enable interpolation of missing values (FIG. 15).
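Because the final integrating model is itself a Cox model over the unimodal log partial hazards, the multimodal risk score is a weighted sum of the unimodal scores. The sketch below illustrates only that scoring step. The coefficient values in the usage example are hypothetical, and the treatment of a missing modality (it contributes zero to the sum) is an assumption made for illustration, since the text here does not specify how missing modalities enter the final model.

```python
def late_fusion_risk(unimodal_scores, fusion_coefs):
    """Combine per-modality log partial hazards into one multimodal risk score.

    unimodal_scores -- dict: modality name -> log partial hazard, or None if
                       that modality is unavailable for the patient
    fusion_coefs    -- dict: modality name -> coefficient of the final Cox model
    """
    total = 0.0
    for modality, coef in fusion_coefs.items():
        score = unimodal_scores.get(modality)
        if score is None:
            # ASSUMPTION: a missing modality contributes nothing to the sum.
            continue
        total += coef * score
    return total


# Hypothetical coefficients for a radiomic-histopathologic (RH) model.
rh_coefs = {"radiomic": 0.8, "histopathologic": 1.2}
patient = {"radiomic": 0.5, "histopathologic": None}  # no WSI for this patient
risk = late_fusion_risk(patient, rh_coefs)
```

Keeping the fusion step this small is what the text means by a late fusion architecture with few parameters: only one coefficient per modality is fit at the integration stage.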

[0046] The c-indices for individual imaging modalities were similar, but identified distinct patient subgroups with good prognosis (Fig. 6e). This is consistent with radiologic and histologic features containing complementary information content, whereby some patients with good outcomes were identified as high risk by the radiomic sub-model but correctly assigned a lower risk score by the histopathologic sub-model, and vice versa. Patients with HRD and HRP disease were distributed relatively evenly, agnostic to unimodal imaging risk scores.

[0047] Corroborating this, absolute Kendall rank correlation coefficient values were low between individual modalities (<0.14) (Fig. 6f), demonstrating that the radiomic and histopathologic models ordered patients differently as compared to the genomic model and to one another. The same two risk groups identified by the model in the test set also showed significantly different progression-free survival (p=0.040; Fig. 6g). Finally, as an orthogonal validation, the inferred risk of all models except the G and GH models associated with pathologic chemotherapy response score (CRS) in the training set, including the GHR model (Fig. 6h). The test set had only 21 patients with known CRS, and only HRD status exhibited statistically significantly different distributions of CRS by the Mann-Whitney U test in the test set (FIG. 16).
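The Kendall rank correlation used above compares how two models order the same patients. A minimal sketch follows, assuming the tau-a variant, which counts concordant minus discordant pairs over all pairs. The text does not state which variant was used, and the variants coincide when there are no tied scores.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a between two equally long lists of risk scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    score = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            score += 1  # both models order this pair the same way
        elif product < 0:
            score -= 1  # the models disagree on this pair
        # Pairs tied in either list contribute 0 under tau-a.
    n_pairs = len(x) * (len(x) - 1) // 2
    return score / n_pairs
```

Values near zero, such as the reported magnitudes below 0.14, mean that knowing a patient's rank under one modality says little about the rank under another.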

Discussion

[0048] Machine learning in cancer prognostics is a growing field with great potential, but the contribution of common diagnostic modalities to multimodal risk stratification remains poorly understood. Here, it is shown that integrating multi-scale clinical imaging and genomic data increases predictive capacity. These results, in addition to the low correlation between risk scores derived from individual modalities, support the hypothesis that clinical imaging contains complementary prognostic information that is independent of clinico-genomic information. Histopathologic and radiologic imaging characterize the tumor architecture at microscopic and mesoscopic scales, respectively. Therefore, it stands to reason that these data channels complement one another and HRD status, which is derived from spatially-agnostic sequencing. The full GHRC model did not perform as well as the RH and GRH models, suggesting that multimodality is not a universal guarantee of improved performance. In this case, the most likely reason is that the clinical model (based on history of PARP inhibitor administration and residual disease status after debulking surgery) does not stratify the test cohort, likely due to its small size. Furthermore, the TCGA cohort did not have these informative clinical variables available. The late fusion architecture benefits from having few parameters to fit, which reduces overfitting, and from the ability to learn from partial information cases, but it cannot gate information from noisy modalities. With larger datasets enabling more parameter fitting without overfitting, mechanisms such as attention can be explored to adaptively adjust unimodal contributions.

[0049] In addition to multimodal integration, two unimodal models are presented to stratify late-stage HGSOC patients using routine clinical imaging; these models were validated on a test set, and the relative contributions of each modality to risk-stratifying HGSOC patients were studied. For radiologic imaging, it is discovered that omental autocorrelation computed from the gray level co-occurrence matrix derived from the HLL Coif wavelet-filtered image was a prognostic feature. This Imaging Biomarker Standardization Initiative-defined feature has been found to be strongly or very strongly reproducible in multiple studies. It describes the coarseness of the lesion texture and also depends on tissue density. Seven of the other nine omental features with significant log(HR) values were explicitly designed to measure high-density zones, and these features did not exhibit log(HR) values significantly different from zero on multivariable regression with the autocorrelation. Hence, the most parsimonious explanation is that higher-density, rather than coarser, omental implants are an adverse prognostic factor, which could be due to more solid tumors with reduced cystic or fatty components. Omental textures captured by autocorrelation may also reflect differing intratumoral heterogeneity.

[0050] Other HGSOC radiomic models have not explored the prognostic information captured within omental implants, relying instead on more demanding segmentations of adnexal lesions or the entire tumor burden. Interestingly, it was found that none of the radiomic features derived from adnexal masses had log(HR) values significantly different from zero after correction for multiple hypothesis testing, which is possibly due to the late stage of this cohort: the omentum is the most common site of metastasis in HGSOC and may drive further peritoneal seeding. An omental model is advantageous over an adnexal model because omental implants are ubiquitous in advanced stage disease, even in patients with primary peritoneal high-grade serous cancer who lack adnexal mass(es). Furthermore, an omental implant can be readily segmented even by less experienced observers, whereas adnexal masses can be challenging to distinguish from adjacent loculated ascites, serosal and pouch of Douglas implants, and adjacent anatomic structures such as the uterus, especially in the presence of leiomyomas. An omental model is also more practical than a radiomic model based on the whole tumor burden; routine segmentation of the whole tumor volume is impractical in daily practice using current tools due to prohibitively high demand for time and expertise.

[0051] For histopathologic imaging, an H&E WSI-based model is developed to stratify HGSOC patients. Although none of the features exhibited log(HR) values significantly different from zero after correction for multiple hypothesis testing, the presence of 20 features highly related to mean tumor nuclear size (e.g., 60th percentile of tumor nuclear size, 50th percentile of tumor nuclear diameter) with similar hazard ratios among the 24 features with uncorrected significant p-values for univariate log(HR) values supports the prognostic relevance of tumor nuclear size. This is further supported by the good stratification of the test set. The larger nuclear size may be associated with events such as whole-genome doubling or cellular fusion and warrants direct study of matched genomes and histopathologic sections. The major axis length of stroma is difficult to interpret for a two-dimensional slice of tissue but may reflect distinct patterns of disease infiltration into surrounding stroma. The trained weights are included for the HGSOC model, and the source code is included for extension to other cancer types.

[0052] The lack of usable large datasets is one of the main challenges for multimodal machine learning in oncology. Data from the 296 MSKCC HGSOC patients are made available to enable work toward improving upon the models presented here. These results demonstrate the benefit of learning from cases with only partial information in multimodal studies: the smaller, full-information sub-cohort yielded a significantly less generalizable risk stratification model. The dataset also offers the advantage of comprising H&E images and CE-CT scans originally acquired at multiple institutions: this improves confidence in the generalizability of the results. Furthermore, data generated during the standard of care were intentionally mined. Using these data instead of specialty research data drastically reduces adoption costs in the clinical workflow for resultant models, but the data were not collected specifically with computational modeling in mind. For example, some patients with only germline sequencing of HRD-DDR genes were included, a clinically relevant but biologically imperfect measure of HRD status: each risk group is enriched for, but not exclusively composed of, the genomic subtype of interest. It is expected that clinical whole-genome sequencing will enable more robust genomic analyses.

[0053] The improved risk stratification models developed herein show the promise of extracting and integrating quantitative clinical imaging features toward aiding gynecologic oncologists in selecting primary treatment, planning surveillance frequency, making decisions about maintenance therapy, and counseling patients about clinical trials of investigative agents. The statistical robustness and clinical relevance of the risk groups by both PFS and OS in the test set substantiate the utility of this multimodal machine learning approach, establishing proof of principle. Next steps include scaled and inter-institutional retrospective cohort assembly for further model training and refinement before prospective validation of clinical benefit in randomized controlled trials.

[0054] In summary, a multimodal dataset of HGSOC patients is assembled and this dataset is used to develop and integrate radiologic, histopathologic, and clinico-genomic models to risk-stratify patients. It is discovered that the autocorrelation of omental implants on CE-CT and average tumor nuclear size on H&E are prognostic factors, that these modalities are demonstrably orthogonal, and that their computational integration improves stratification beyond previously known clinico-genomic factors in a test set. These results motivate further large-scale studies driven by multimodal machine learning to stratify cancer patients, both in HGSOC and other cancer subtypes.

Methods

[0055] This study complies with all relevant ethical regulations, and its protocols were approved by an institutional review board. Informed consent was waived for this retrospective study, and participants were not compensated.

Cohort curation

[0056] Patients were eligible for this retrospective study if they had biopsy-proven newly diagnosed high-grade serous ovarian cancer and at least one of (A) pre-treatment whole-slide images of H&E depicting high-grade serous carcinoma or (B) pre-treatment contrast-enhanced abdominal/pelvic computed tomography (CE-CT). Most of the MSKCC cohort was sourced from a retrospective clinical database of patients who underwent diagnostic workup and NACT-IDS at the institution. This database also contained information on the residual disease status after debulking surgery, pathologic stage, administration of neoadjuvant chemotherapy, and patient age at diagnosis from the electronic medical record. To expand the cohort, the institutional data warehouse is searched for patients with sequencing and available pretreatment CT studies or H&E images. In addition to this retrospective curation, 36 patients were also included from the prospective project. Pathologic stage was unavailable for 14 patients, and the clinical stage was instead recorded as in the Institutional Database for these patients. Race was also collected for all patients from the institutional data warehouse. Overall and progression-free survival were calculated using the date of CT as a start date, when available, or the date of pathologic diagnosis otherwise.

[0057] To collect the H&E imaging, the EHR is reviewed to find associated pathology cases with peritoneal lesions (primarily omental), and expert pathologists reviewed the slides to select high-quality specimens for digitization. The institutional data repository was also reviewed for scanned slides associated with the diagnostic biopsy, and those containing tumors were included. All H&E imaging was pretreatment.

[0058] Subsequently, the associated CE-CT scans are reviewed for the following inclusion criteria: 1) intravenous contrast-enhanced images acquired in the portal venous phase, 2) absence of streak artifacts or motion-related image blur obscuring lesion(s) of interest, and 3) adequate signal to noise ratio (Supplementary Table 7). All CE-CT imaging was pretreatment. All CT scans were available in the digital imaging and communications in medicine (DICOM) format through an institutional picture archiving and communication system (PACS, Centricity, GE Medical Systems v. 7.0).

TCGA cohort selection

[0059] From the TCGA-OV project, patients were identified who had clinical data annotated in the TCGA Clinical Data Resource (CDR), pathologic grade, and at least one of a diagnostic FFPE H&E WSI or an abdominal/pelvic CE-CT scan in the TCIA. All clinical and demographic information was extracted from the TCGA CDR. Only diagnostic WSIs of formalin-fixed, paraffin-embedded H&E-stained specimens from the TCGA-OV project were included. All H&E imaging was pre-treatment.

[0060] All CT scans met the following inclusion criteria: 1) intravenous contrast-enhanced images acquired in the portal venous phase, 2) absence of streak artifacts or motion-related image blur obscuring lesion(s) of interest, and 3) adequate signal to noise ratio (Supplementary Table 7). All CE-CT imaging was pretreatment.

[0061] Inferring HRD status. In the MSKCC cohort, MSK-IMPACT clinical sequencing is used, when available, to infer HRD status. Variant calling for these genes and copy number analysis of CCNE1 were performed using a clinical pipeline. For patients with appropriate consent for further genomic re-analysis, COSMIC SBS3 activity is also inferred using SigMA (for cases with at least five mutations across all 505 genes), and large-scale state transitions (LSTs) are searched for using another pipeline. OncoKB and Hotspot annotations were also used for variant significance in genes involved in HRD-DDR to assign patients to the HRD subtype. Patients with high-confidence dominant signature 3 or at least one significant variant or deep deletion in the HRD-DDR genes were assigned to the HRD subtype, except when there was evidence that patients belonged to the foldback inversion- or tandem duplicator-enriched subgroups (via CCNE1 amplification or CDK12 SNVs, specifically); these patients with conflicting evidence were assigned to the ambiguous subtype and excluded from analysis. Low-confidence signature 3 results were not used for HRD status definition. Incorporating LST thresholding to define HRD status was found to diminish the separation of the HRD and HRP-defined groups in the training set (FIG. 8a, f), and thus it was not used in the final HRD status definition. Patients with available results from clinical HRD-DDR panels or BRCA1/2 send-out panels were assigned HRP unless there were variants of known significance (as determined by the test provider) in at least one reported gene.

[0062] In the TCGA cohort, CNA and SNV data were downloaded from the TCGA-OV project on cBioPortal for the same set of genes implicated in HRD-DDR, CDK12, and CCNE1, again filtering to variants deemed significant by OncoKB. Using these criteria, patients with at least one SNV or deep deletion in HRD-DDR genes were assigned the HRD subtype. Patients without aberrations in these HRD-DDR-associated genes were assigned the HRP subtype. Patients with an SNV in CDK12 or amplification in CCNE1 and also with an SNV in at least one of the HRD-DDR genes were assigned the ambiguous subtype and excluded from analysis. Patients without available SNV and CNA data in cBioPortal were assigned to the ambiguous subtype and excluded. COSMIC SBS3 frequencies, whose distribution is clearly bimodal (FIG. 9c), were downloaded from Synapse, and patients with SBS3 frequency greater than 15% and without conflicting evidence of HRP were assigned to the HRD subtype.
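The TCGA assignment rules above amount to a small decision procedure. The following sketch encodes them as a single function for illustration. The function and parameter names are hypothetical, and one simplification is an assumption: any evidence of HRD (an HRD-DDR alteration or SBS3 frequency above the threshold) combined with a CDK12 SNV or CCNE1 amplification is treated as conflicting, whereas the text spells out the conflict rule only for SNVs in HRD-DDR genes.

```python
def assign_hrd_subtype(has_data, hrd_ddr_altered, cdk12_snv, ccne1_amp,
                       sbs3_fraction=None, sbs3_threshold=0.15):
    """Return "HRD", "HRP", or "ambiguous" for one TCGA patient.

    has_data        -- SNV and CNA data are available for the patient
    hrd_ddr_altered -- significant SNV or deep deletion in an HRD-DDR gene
    cdk12_snv       -- significant SNV in CDK12
    ccne1_amp       -- amplification of CCNE1
    sbs3_fraction   -- COSMIC SBS3 frequency in [0, 1], or None if unknown
    """
    if not has_data:
        return "ambiguous"
    sbs3_high = sbs3_fraction is not None and sbs3_fraction > sbs3_threshold
    hrd_evidence = hrd_ddr_altered or sbs3_high
    hrp_evidence = cdk12_snv or ccne1_amp
    if hrd_evidence and hrp_evidence:
        # ASSUMPTION: every HRD/HRP conflict is labeled ambiguous.
        return "ambiguous"
    return "HRD" if hrd_evidence else "HRP"
```

Patients labeled "ambiguous" would then be dropped from any analysis that involves HRD status, as described for both cohorts.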

Adnexal and omental lesions segmentation

[0063] Three fellowship-trained radiologists manually segmented ovarian lesions and representative omental implants on each pretreatment CE-CT scan for all patients (MSKCC and TCGA-OV/TCIA). Using the Insight Segmentation and Registration Toolkit-SNAP version 3.8.0 software, each radiologist traced the outer contour of ovarian and omental lesions on every tumor-containing axial section. All questions that arose during segmentation were resolved via joint review and consensus.

Train-test split

[0064] 40 test cases were sampled randomly before analysis from the patients with available H&E WSI, unambiguous HRD status, known stage, and omental lesion on CE-CT. This strategy is used to enable fair comparisons across unimodal and multimodal models, preventing spurious differences in test concordance indices due to patient exclusion for some models but not for others. Both TCGA-OV and MSKCC cases are included in the training and test sets: this is because only 4 TCGA cases had complete information from all modalities and thus could not support a fully external test set.

Radiologic feature extraction

[0065] All DICOM series are converted to volumetric images in Hounsfield Units, and an abdominal window (level 50, width 400) is applied. Using PyRadiomics, images were resampled to isotropic 1 mm³ voxels using the SimpleITK B-spline interpolator and discretized with a bin size of 25 HU. Features in 3D were extracted from Coif wavelet-transformed images. Features were extracted from the gray level size zone, neighboring gray tone difference, gray level run length, gray level dependence, and gray level co-occurrence matrices, yielding a representation of each study’s representative omental lesion(s) or individual adnexal lesion(s).
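As a worked illustration of the intensity preprocessing, a window with level 50 and width 400 spans -150 HU to 250 HU, and a fixed bin width of 25 HU maps each intensity to an integer gray level. The sketch below assumes the window is applied by clipping and uses the fixed-bin-width discretization documented for PyRadiomics, floor(x / W) - floor(min / W) + 1. The function names are illustrative, and this is not the study's code.

```python
import math


def apply_window(hu, level=50.0, width=400.0):
    """Clip a Hounsfield Unit value to the window [level - w/2, level + w/2].

    ASSUMPTION: windowing is implemented as clipping.
    """
    low = level - width / 2
    high = level + width / 2
    return min(max(hu, low), high)


def bin_index(hu, minimum, bin_width=25.0):
    """Gray level (1-based) of `hu` under fixed-bin-width discretization."""
    return math.floor(hu / bin_width) - math.floor(minimum / bin_width) + 1


# The abdominal window limits, and the number of gray levels they span.
low, high = apply_window(-1000.0), apply_window(3000.0)
n_levels = bin_index(high, minimum=low)
```

Under these assumptions a fully windowed image spans at most 17 gray levels, which bounds the size of the texture matrices from which the radiomic features are computed.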

Histopathologic annotation

[0066] Two expert pathologists partially annotated 60 H&E WSIs using the Slide Viewer. The approach was to label example regions of necrosis, lymphocyte-rich tumor, lymphocyte-poor tumor, lymphocyte-rich stroma, lymphocyte-poor stroma, veins, arteries, and fat with reasonable but imperfect accuracy. These annotations are exported as bitmaps and converted to GeoJSON objects. Lymphocyte-rich/poor tumor labels and lymphocyte-rich/poor stroma labels are amalgamated for training, and vessels are omitted from the training data for the models presented herein. Next, these annotations are used to generate tissue-type tiles.

Training the histopathologic tissue type classifier

[0067] Tiles measuring 64 µm x 64 µm (128 x 128 pixels) with 50% overlap are generated, using the above annotations to delineate regions to be tiled. No other tile sizes were explored; this size was chosen because it offered good resolution while still depicting multiple cells in each tile. Putative tile squares within an annotation but with <20% foreground as assessed by Otsu’s method were not tiled. Macenko stain normalization was used. A ResNet-18 model (pretrained on ImageNet) is trained for 30 epochs with a learning rate of 5e-4, 1e-4 L2 regularization, and the Adam optimizer. The objective function was class-balanced cross entropy, and mini batches of 96 tiles are used on a single NVIDIA Tesla V100 GPU. Four-fold, slide-wise cross-validation is used for model evaluation and hyperparameter tuning. The number of epochs for training the final model is selected as the epoch with the highest lower 95% C.I. bound, estimated using the mean and standard deviation of the cross-validation F1 scores. The final model is trained on tiles from all 60 slides for 21 epochs.
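The tile geometry described above implies a resolution of 0.5 µm per pixel and, at 50% overlap, a stride of 64 pixels between neighboring tiles. The following sketch enumerates tile origins over a rectangular region and applies the 20% foreground rule. The function names and the rectangular-region simplification are assumptions, since real annotations are arbitrary polygons, and the foreground fraction is taken as given rather than computed with Otsu's method.

```python
MICRONS_PER_PIXEL = 64 / 128  # 0.5 µm per pixel, implied by 64 µm = 128 px


def tile_origins(width_px, height_px, tile_px=128, overlap=0.5):
    """Top-left corners (x, y) of all full tiles inside a rectangular region."""
    stride = int(tile_px * (1 - overlap))
    if stride <= 0:
        raise ValueError("overlap must be below 1")
    xs = range(0, width_px - tile_px + 1, stride)
    ys = range(0, height_px - tile_px + 1, stride)
    return [(x, y) for y in ys for x in xs]


def keep_tile(foreground_fraction, min_fraction=0.2):
    """Tiles with less than 20% foreground are not used."""
    return foreground_fraction >= min_fraction


# Tissue area covered by one tile, in square micrometers.
tile_area_um2 = (128 * MICRONS_PER_PIXEL) ** 2
```

With a 64-pixel stride, a 256 x 256 pixel region yields nine tiles instead of the four that non-overlapping tiling would give, which is how 60 partially annotated slides can produce more than a million training tiles.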

Histopathologic feature extraction and selection

[0068] The WSIs associated with the patients in this cohort are tiled without overlap, and inference is performed using mini batches of 800 across four NVIDIA Tesla V100 GPUs. Macenko stain normalization is used for all slides because staining intensity differences from the predominantly MSKCC-based training cohort confounded inference. Tile predictions are assembled into downscaled bitmaps, which were then used to calculate tissue-type features. The region properties from scikit-image are included for both the largest connected component and the entirety of each tissue type. Features such as the area ratio of one tissue type to another and the entropy of tumor and stroma are also calculated. Using the StarDist method for QuPath, individual nuclei are segmented and characterized, using nuclei with a detection probability greater than 0.5. A lymphocyte classifier trained iteratively using manual annotations is used to distinguish lymphocytes from other cells. A tissue parent type is assigned to each nucleus using the inferred tissue type maps, and aggregate statistics of the QuPath-extracted nuclear morphologic and staining features, such as variance in eosin staining or circularity, are calculated by tissue type and cell type. Together, these cell type features and tissue type features based on tumor, stroma, and necrosis constituted the histopathologic embedding for each slide.

Clinical data encoding

[0069] Residual disease status after debulking surgery was encoded as a binary variable, where patients with <1 cm residual disease (including complete gross resection) were assigned a value of 1, and patients with >1 cm residual disease were assigned a value of 0. The presence of adnexal lesions on CE-CT was also included as a binary variable. Age at diagnosis was modeled as a continuous variable scaled by the training set range. Tumor stage was encoded as one-hot categorical variables for I, II, III, IV, and Unknown. Similarly, the primary treatment approach was encoded as a one-hot categorical variable with values NACT-IDS, PDS, and Unknown.
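
The encoding described above may be sketched as follows. The record field names are hypothetical, residual disease is taken as an already dichotomized flag, and "scaled by the training set range" is assumed here to mean min-max scaling.

```python
STAGES = ["I", "II", "III", "IV", "Unknown"]
TREATMENTS = ["NACT-IDS", "PDS", "Unknown"]


def one_hot(value, categories):
    """One-hot encode a value; anything unrecognized falls back to 'Unknown'."""
    if value not in categories:
        value = "Unknown"
    return [1 if value == category else 0 for category in categories]


def encode_clinical(record, age_min, age_max):
    """Build the clinical feature vector for one patient.

    age_min and age_max come from the training set (assumed min-max scaling).
    """
    return (
        [
            1 if record["residual_under_1cm"] else 0,
            1 if record["adnexal_lesion_on_ct"] else 0,
            (record["age"] - age_min) / (age_max - age_min),
        ]
        + one_hot(record["stage"], STAGES)
        + one_hot(record["treatment"], TREATMENTS)
    )


# Hypothetical patient record.
patient = {
    "residual_under_1cm": True,
    "adnexal_lesion_on_ct": False,
    "age": 60,
    "stage": "III",
    "treatment": "PDS",
}
vector = encode_clinical(patient, age_min=40, age_max=80)
print(vector)
```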

Feature selection

[0070] The same strategy was used to select radiomic, histopathologic, and clinical features. For each feature, a univariate Cox Proportional Hazards model was fit to the full training set using the Python lifelines package without regularization, and the univariate coefficient and significance were plotted. For features whose model failed to converge, fitting was re-attempted with L2 regularization (C=0.2), and any model still failing to converge was assigned a log hazard ratio of 0 and a p-value of 1. For histopathology, relative specimen size was controlled for by including it in each Cox model. Next, features with a scaled interquartile range below 0.1 were removed. Subsequently, for radiomics, which is the largest feature space, the Benjamini-Hochberg method was used to correct for multiple hypothesis testing. Taking the ordered list of features significant with 95% confidence, Algorithm 1 was applied to select features, yielding modality signatures with low multicollinearity.
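
The multiple-testing correction step may be illustrated with a plain implementation of the Benjamini-Hochberg step-up procedure. The feature names and p-values below are invented for the example, and the univariate Cox fits that would produce such p-values are not shown.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking hypotheses rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = [False] * m
    for index in order[:cutoff_rank]:
        rejected[index] = True
    return rejected


# Hypothetical univariate p-values for five radiomic features.
names = ["f_a", "f_b", "f_c", "f_d", "f_e"]
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
keep = benjamini_hochberg(p_values, alpha=0.05)
significant = [name for name, kept in zip(names, keep) if kept]
print(significant)
```

Here f_c and f_d would pass an uncorrected 0.05 threshold but do not survive the correction.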

Algorithm 1 Multivariable model selection procedure

Input: A list of unique candidate features ordered by p-value, f_i, where i ∈ [1, k].

Output: A list of features significant with confidence α on multivariable regression, g_j, where j ∈ [1, l] and l ≤ k.

Require: k ≥ 1
i ← 1
j ← 1
while i ≤ k do
    g_j ← f_i
    p ← significance(g)    (significance assessed by Cox regression)
    if p_j < α then
        j ← j + 1
    end if
    i ← i + 1
end while
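
Algorithm 1 may be sketched in Python as a greedy forward pass. The significance function is passed in by the caller; in the setting described here it would wrap a multivariable Cox regression, but the toy stand-in below, along with all names, is hypothetical and only mimics a collinear feature losing significance.

```python
def select_features(candidates, significance, alpha=0.05):
    """Greedy forward selection over candidates ordered by univariate p-value.

    significance(features) must return one p-value per feature, from a
    multivariable model fit on exactly those features.
    """
    selected = []
    for feature in candidates:
        trial = selected + [feature]
        p_values = significance(trial)
        # Keep the newest feature only if it is significant alongside the others.
        if p_values[-1] < alpha:
            selected = trial
    return selected


# Toy stand-in for Cox regression: f2 is redundant once f1 is in the model.
REDUNDANT_WITH = {"f2": "f1"}


def toy_significance(features):
    p_values = []
    for feature in features:
        partner = REDUNDANT_WITH.get(feature)
        p_values.append(0.50 if partner in features else 0.01)
    return p_values


signature = select_features(["f1", "f2", "f3"], toy_significance, alpha=0.05)
print(signature)
```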

[0071] The only modification to this procedure occurred for the ablation experiment to test the importance of learning from the partial information cases: a threshold of 0.31 is used for clinical features since none were significant with p<0.05, and multiple hypothesis testing is not corrected for in the omental radiomic features during the ablation experiment since none would be significant by this metric.

Survival modeling

[0072] Linear Cox Proportional Hazards models were used with L2 regularization (C=0.5) and no L1 regularization for all multimodal and unimodal models. No sub-model was fit for the genomic modality: patients assigned to the HRP subtype were designated high risk (risk score=1.0), and patients assigned to the HRD subtype were designated low risk (risk score=0.0). No interaction terms were used.

[0073] Kaplan-Meier analysis was used to determine whether each model stratified patients into clinically significant groups. To delineate group membership, percentile thresholds were tested in {0.33, 0.34, ..., 0.65, 0.66}, choosing the value that maximized significance of the separation in the training set by the log-rank test. This was performed individually for OS and PFS, where relevant. p-values for concordance indices were calculated using 1000-fold permutation tests. 95% confidence intervals for c-indices were calculated using 100-fold leave-one-out bootstrapping. All p-values for Kaplan-Meier analysis were calculated by the multivariate log-rank test. p-values for covariate significance in Cox Proportional Hazards models are reported for models fit with C=0.5. Fraction surviving was estimated using linear interpolation.
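
A permutation test for the concordance index may be sketched as follows. This is a simplified pure-Python illustration rather than the lifelines-based procedure described here: the c-index ignores tied event times and scores higher risk as earlier event, and the data, seed, and function names are assumptions.

```python
import random


def concordance_index(times, events, risks):
    """Simplified Harrell's c-index: higher risk should mean an earlier event."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


def permutation_p_value(times, events, risks, n_permutations=1000, seed=0):
    """Share of risk-score shufflings that match or beat the observed c-index."""
    rng = random.Random(seed)
    observed = concordance_index(times, events, risks)
    shuffled = list(risks)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if concordance_index(times, events, shuffled) >= observed:
            at_least_as_good += 1
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Hypothetical follow-up times, event indicators (1 = event), and risk scores.
times = [2, 4, 5, 7, 9, 12]
events = [1, 1, 0, 1, 1, 0]
risks = [0.9, 0.8, 0.5, 0.6, 0.3, 0.1]
c_index, p_value = permutation_p_value(times, events, risks)
print(c_index, p_value)
```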

Multimodal integration

[0074] A late fusion approach is chosen to increase unimodal sample sizes available for parameter estimation. Parameters for unimodal sub-models were estimated using all available unimodal data (e.g., radiomic parameters were estimated across the 251 training CT cases with omental lesions, and histopathologic parameters were estimated across the 243 training H&E cases), where each sub-model inferred a partial hazard for each patient. The negative partial hazard was used to enable compatibility with the concordance index as implemented in the lifelines Python package. For the second-stage late fusion model, parameters are estimated for a multivariate Cox model integrating the negative log partial hazards inferred by each modality using only the intersection set of patients.
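
The data flow of the late fusion stage may be sketched as follows: each modality contributes a score for the patients it covers, and the second-stage inputs are assembled only for patients present in every modality. Patient identifiers, scores, and function names are hypothetical, and the fitting of the second-stage Cox model itself is omitted.

```python
def unimodal_score(features, coefficients):
    """Negative log partial hazard of a linear Cox sub-model."""
    return -sum(coefficients[name] * features[name] for name in coefficients)


def intersection_design(scores_by_modality):
    """Collect per-modality scores for patients that have every modality."""
    common = set.intersection(*(set(scores) for scores in scores_by_modality.values()))
    modalities = sorted(scores_by_modality)
    design = {
        patient: [scores_by_modality[m][patient] for m in modalities]
        for patient in sorted(common)
    }
    return modalities, design


# Hypothetical negative log partial hazards; each modality covers different patients.
scores = {
    "radiomic": {"p1": -0.4, "p2": 0.3, "p3": 0.1},
    "histopathologic": {"p2": -0.2, "p3": 0.5, "p4": 0.0},
    "genomic": {"p1": -1.0, "p2": 0.0, "p3": -1.0},
}
modalities, design = intersection_design(scores)
example = unimodal_score({"x": 2.0}, {"x": 0.5})
print(modalities, design, example)
```

Estimating the unimodal parameters before taking the intersection is what lets each sub-model use all of its available cases; only the small second-stage model is limited to the intersection set.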

Statistics and reproducibility

[0075] No statistical method was used to predetermine sample size. Data were excluded from the analyses only for the reasons detailed above and prior to any machine learning modeling. The training and test sets were chosen at random from the patients with all four data modalities available. The investigators were not blinded to allocation during outcome assessment. Data distributions were not assumed to be normal for any tests. The hazards were assumed to be proportional for survival modeling, but this was not formally tested.

APPENDIX

Supplementary Table 1

Supplementary Table 2

Supplementary Table 3

Supplementary Table 4

Supplementary Table 5

Supplementary Table 6

Supplementary Table 7

B. Systems and Methods of Determining Risk Scores Using Multimodal Features

[0076] A diagnostics platform may evaluate a subject at risk of a certain condition (e.g., cancer, disease, or ailment) using prognostic information for the condition, such as genetic sequencing data for the subject. The reliance on prognostic information alone, however, may yield poor prognosis and variable response to treatment. This may also lead to wasted computer resources on the platform from calculating and providing poor results. To address these and other technical challenges, a computing system may combine features from disparate sources, such as histopathological data, radiomic data, and genomic data. The computing system may establish a multivariate model using these combined features to improve prediction of treatment response in accordance with machine learning (ML) techniques. In this manner, by providing more accurate and useful results, the computing system may reduce the consumption of computing resources.

[0077] Referring now to FIG. 17, depicted is a block diagram of a system 1700 for determining risk scores using multimodal feature sets. In overview, the system 1700 may include at least one data processing system 1705, at least one tomograph device 1710, at least one imaging device 1715, at least one genomic sequencing device 1720, and at least one display 1725, communicatively coupled via at least one network 1730. The data processing system 1705 may include at least one radiological feature extractor 1735, at least one histological feature acquirer 1740, at least one genomic feature obtainer 1745, at least one model trainer 1750, at least one model applier 1755, at least one output handler 1760, at least one risk prediction model 1765, and at least one database 1770, among others. Each of the components in the system 1700 as detailed herein may be implemented using hardware (e.g., one or more processors coupled with memory), or a combination of hardware and software as detailed herein in Section C. Each of the components in the system 1700 may implement or execute the functionalities detailed herein, such as those described in Section A.

[0078] Referring now to FIG. 18A, depicted is a block diagram of a process 1800 of extracting multimodal features in the system 1700 for determining risk scores. The process 1800 may correspond to or include operations in the system 1700 for identifying features in various modalities from subjects. Under the process 1800, one or more devices of the system 1700 may obtain or acquire data in multiple modalities from at least a portion of a subject 1805 (e.g., a human or animal). The subject 1805 may be at risk of a condition, or may be afflicted with the condition. The condition may include, for example, a type of cancer (e.g., breast cancer, bladder cancer, cervical cancer, colorectal cancer, kidney cancer, liver cancer, lung cancer, lymphoma, ovarian cancer, prostate cancer, skin cancer, or thyroid cancer), among others. The subject 1805 may be under evaluation for the progression or deterioration of the condition.

[0079] The tomograph device 1710 may produce, output, or otherwise generate at least one tomogram 1810 (sometimes herein referred to generally as a biomedical image or an image) of a section of the subject 1805. For example, the tomogram 1810 may be a scan of the sample corresponding to a tissue of the organ in the subject 1805. The tomogram 1810 may include a set of two-dimensional cross-sections (e.g., a front, a sagittal, a transverse, or an oblique plane) acquired from the three-dimensional volume. The tomogram 1810 may be defined in terms of pixels, in two-dimensions or three-dimensions. In some embodiments, the tomogram 1810 may be part of a video acquired of the sample over time. For example, the tomogram 1810 may correspond to a single frame of the video acquired of the sample over time at a frame rate.

[0080] The tomogram 1810 may be acquired using any number of imaging modalities or techniques. For example, the tomogram 1810 may be a tomogram acquired in accordance with a tomographic imaging technique, such as using a magnetic resonance imaging (MRI) scanner, a nuclear magnetic resonance (NMR) scanner, an X-ray computed tomography (CT) scanner, an ultrasound imaging scanner, a positron emission tomography (PET) scanner, or a photoacoustic spectroscopy scanner, among others. The tomogram 1810 may be a single instance of acquisition (e.g., X-ray) in accordance with the imaging modality, or may be part of a video (e.g., cardiac MRI) acquired using the imaging modality.

[0081] The tomogram 1810 may include or identify at least one region of interest (ROI) (also referred to herein as a structure of interest (SOI) or feature of interest (FOI)). The ROI may correspond to an area, section, or part of the tomogram 1810 that corresponds to the presence of the condition in the sample from which the tomogram 1810 is acquired. For example, the ROI may correspond to a portion of the tomogram 1810 depicting a tumorous growth in a CT scan of a brain of a human subject. With the acquisition of the tomogram 1810, the tomograph device 1710 may send, transmit, or otherwise provide the tomogram 1810 to the data processing system 1705. The tomogram 1810 may be maintained using one or more files in accordance with a format (e.g., single-file or multi-file DICOM format).

[0082] The imaging device 1715 may scan, obtain, or otherwise acquire a whole slide image (WSI) 1815 (sometimes herein referred to generally as a biomedical image or image) of a tissue sample of the subject 1805. The tissue sample may be obtained from the section of the subject 1805 used to generate the tomogram 1810, or may be taken from another portion associated with the condition within the subject 1805. The WSI 1815 itself may be acquired in accordance with microscopy techniques or a histopathological image preparer, such as using an optical microscope, a confocal microscope, a fluorescence microscope, a phosphorescence microscope, or an electron microscope, among others. The WSI 1815 may be for digital pathology of a tissue section in the sample from the subject 1805. The WSI 1815 may be, for example, a histological section with a hematoxylin and eosin (H&E) stain, immunostaining, a hemosiderin stain, a Sudan stain, a Schiff stain, a Congo red stain, a Gram stain, a Ziehl-Neelsen stain, an auramine-rhodamine stain, a trichrome stain, a silver stain, or Wright's stain, among others. The WSI 1815 may be maintained using one or more files in accordance with a format (e.g., DICOM whole slide imaging (WSI)).

[0083] The WSI 1815 may include one or more regions of interest (ROIs). Each ROI may correspond to areas, sections, or boundaries within the sample WSI 1815 that contain, encompass, or include conditions (e.g., features or objects within the image). The ROIs depicted in the WSI may correspond to areas with cell nuclei. The ROIs of the sample WSI 1815 may correspond to different subtype conditions. For example, when the WSI 1815 is a WSI of the sample tissue, the features may correspond to cell nuclei and the conditions may correspond to various cancer subtypes, such as carcinoma (e.g., adenocarcinoma and squamous cell carcinoma), sarcoma (e.g., osteosarcoma, chondrosarcoma, leiomyosarcoma, rhabdomyosarcoma, mesothelial sarcoma, and fibrosarcoma), myeloma, leukemia (e.g., myelogenous, lymphatic, and polycythemia), lymphoma, and mixed types, among others. Upon generation, the imaging device 1715 may send, transmit, or otherwise provide the WSI 1815 to the data processing system 1705.

[0084] The genomic sequencing device 1720 may carry out, execute, or otherwise perform genetic sequencing on a deoxyribonucleic acid (DNA) sample taken from the subject 1805 to generate gene sequencing data 1820. The genetic sequencing carried out may be a high-throughput, massively parallel sequencing technique (sometimes herein referred to as next generation sequencing), such as pyrosequencing, reversible dye-terminator sequencing, SOLiD sequencing, ion semiconductor sequencing, or HeliScope single molecule sequencing, among others. The genetic sequencing may be targeted to find biomarkers associated with or correlated with the condition of the subject 1805. For example, the genomic sequencing device 1720 may perform hybridization-capture based targeted sequencing to find tumor protein 53 (TP53), a BRCA panel (e.g., BRCA1 or BRCA2), G1/S-specific cyclin-E1 (CCNE1), or cyclin-dependent kinase 12 (CDK12), among others. Upon carrying out the sequencing, the genomic sequencing device 1720 may send, transmit, or otherwise provide the gene sequencing data 1820 to the data processing system 1705. The gene sequencing data 1820 may be maintained using one or more files according to a format (e.g., FASTQ, BCL, or VCF formats).

[0085] The radiological feature extractor 1735 executing on the data processing system 1705 may generate, determine, or otherwise identify a set of radiological features 1825A-N (hereinafter generally referred to as radiological features 1825) using the tomogram 1810. The radiological features 1825 may include or identify information derived from the tomogram 1810 of the section associated with the condition in the subject 1805, such as those described in Section A. To identify the radiological features 1825, the radiological feature extractor 1735 may apply a wavelet transform (e.g., a Coiflet wavelet transform) on the tomogram 1810. The radiological feature extractor 1735 may calculate, determine, or otherwise generate a matrix from the tomogram 1810 transformed using the wavelet function. The derived matrix for the radiological features 1825 may, for example, include any one or more of (i) a gray level co-occurrence matrix (GLCM), (ii) a gray level dependence matrix (GLDM), (iii) a gray level run length matrix (GLRLM), (iv) a gray level size zone matrix (GLSZM), or (v) a neighboring gray tone difference matrix, among others. The radiological features 1825 may include any of the features listed in Supplementary Table 4.
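
As one illustration of the matrices listed above, a gray level co-occurrence matrix may be computed from a quantized two-dimensional image as sketched below. The wavelet transform step is omitted, and the image, the single horizontal offset, the non-symmetric counting, and the function names are assumptions for the example.

```python
def glcm(image, levels, offset=(0, 1)):
    """Count co-occurrences of gray levels for pixel pairs separated by offset.

    image is a list of rows of integer gray levels in [0, levels).
    """
    d_row, d_col = offset
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def glcm_contrast(matrix):
    """Contrast: squared gray-level difference weighted by pair probability."""
    total = sum(sum(row) for row in matrix)
    return sum(
        (i - j) ** 2 * count / total
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )


# Hypothetical 4 x 4 image quantized to four gray levels.
quantized = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
cooccurrence = glcm(quantized, levels=4)
contrast = glcm_contrast(cooccurrence)
print(cooccurrence, contrast)
```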

[0086] The histological feature acquirer 1740 executing on the data processing system 1705 may generate, determine, or otherwise identify a set of histological features 1830A-N (hereinafter generally referred to as histological features 1830) using the WSI 1815. The histological features 1830 may include or identify information derived from the WSI 1815 associated with the condition in the subject 1805. The histological feature acquirer 1740 may use one or more machine learning (ML) models to recognize, detect, or otherwise identify the histological features 1830 from the WSI 1815. The ML models may include, for example: an image segmentation model to determine the ROI within the WSI 1815 associated with the condition; an image classification model to determine the condition type into which to classify the sample depicted in the WSI 1815; or an image localization model to determine a portion (e.g., a tile) within the WSI 1815 corresponding to the ROI, among others. The ML model for image segmentation, localization, or classification may be of any architecture, such as a deep learning artificial neural network (ANN), a regression model (e.g., linear or logistic regression), a clustering model (e.g., k-NN clustering or density-based clustering), a naive Bayesian classifier, a decision tree, a relevance vector machine (RVM), or a support vector machine (SVM), among others.

[0087] From applying the image segmentation or localization model, the histological feature acquirer 1740 may determine a portion of the WSI 1815 corresponding to the one or more ROI associated with the condition. The ROIs may correspond to types of tissue or cell nuclei associated with the condition, such as fat, necrosis, stroma lymphocyte, stroma nuclei, stroma, tumor lymphocyte, tumor nuclei, or tumorous tissue, among others. With the determination, the histological feature acquirer 1740 may calculate, determine, or identify one or more properties of the ROIs in the WSI 1815, such as: nuclei cell types within the sample; a mean area (e.g., percentage) of cell nuclei by type within sample; a dimension (e.g., length or width along a given axis) of cell nuclei by type; tissue types within the sample depicted in the WSI 1815; an area (e.g., percentage) of a given tissue type in the sample; a dimension (e.g., diameter, length, or width along a given axis) of the given tissue type in the sample; cells or tissues for a given cancer subtype; an area of the portion of the WSI 1815 corresponding to the cancer subtype; a dimension (e.g., diameter, length, or width along a given axis) of the portion for the cancer subtype; or a statistical measure (e.g., mean, median, standard deviation) in staining (e.g., H&E) indicative of the tissue type or cell nuclei type; among others. In some embodiments, from applying the image classification model, the histological feature acquirer 1740 may determine a classification of the sample in the WSI 1815. The classification may include, for example, a presence or an absence of the condition, such as the type of cancer. The histological feature acquirer 1740 may use the properties of the ROIs in the WSI 1815 and the classification as the histological features 1830. The histological features 1830 may also include any of the features listed in Supplementary Table 5. 
One or more of the histological features 1830 in the set may be used for training the risk prediction model 1765.

[0088] The genomic feature obtainer 1745 executing on the data processing system 1705 may generate, determine, or otherwise identify a set of genomic features 1835A-N (hereinafter generally referred to as genomic features 1835) using the gene sequencing data 1820. Using the gene sequencing data 1820, the genomic feature obtainer 1745 may identify or determine a homologous recombination deficiency (HRD) or homologous recombination proficiency (HRP) status of the subject 1805. The determination of the HRD or HRP status may be based on a presence or absence of one or more mutations within the gene sequencing data 1820 for the subject 1805. The genomic feature obtainer 1745 may identify variants associated with HRD DNA damage response (DDR), such as BRCA1, BRCA2, CCNE1, and CDK12, among others. The genomic feature obtainer 1745 may also identify mutational subtypes within the gene sequencing data 1820, such as HRD-Deletion (HRD-DEL), HRD-Duplication (HRD-DUP), Foldback Inversion (FBI), and Tandem Duplication (TD), among others. The variants for HRD DDR may have a correspondence with the mutational subtypes, such as: BRCA2 SNVs with HRD-DEL, BRCA1 SNVs with HRD-DUP, CCNE1 CNAs with FBI, and CDK12 SNVs with TD, among others.
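
The correspondence between variants and mutational subtypes stated above may be held in a simple lookup, as in the following sketch. The table contains only the four pairings named in the text; the input format (gene, variant type) and the function name are hypothetical, and real variant calling from sequencing files is not shown.

```python
# Pairings taken from the text: (gene, variant type) -> mutational subtype.
SUBTYPE_BY_VARIANT = {
    ("BRCA2", "SNV"): "HRD-DEL",
    ("BRCA1", "SNV"): "HRD-DUP",
    ("CCNE1", "CNA"): "FBI",
    ("CDK12", "SNV"): "TD",
}


def mutational_subtypes(variants):
    """Return the subtypes implied by a subject's variants, without duplicates."""
    found = []
    for gene, variant_type in variants:
        subtype = SUBTYPE_BY_VARIANT.get((gene, variant_type))
        if subtype is not None and subtype not in found:
            found.append(subtype)
    return found


# Hypothetical variant calls for one subject.
calls = [("TP53", "SNV"), ("BRCA1", "SNV")]
subtypes = mutational_subtypes(calls)
print(subtypes)
```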

[0089] With the identification, the radiological features 1825, the histological features 1830, and genomic features 1835 may form at least one feature set 1840 (sometimes herein referred to as a multimodal feature set). The feature set 1840 may include one or more features from a variety of modalities, as described herein. The feature set 1840 may be further processed by the data processing system 1705 to evaluate the subject 1805. At least some of the feature sets 1840 together with expected risk scores may be used for training the risk prediction model 1765 as explained below. At least some of the feature sets 1840 may be used at runtime to feed to the risk prediction model 1765 to determine predicted risk scores for subjects 1805.

[0090] Referring now to FIG. 18B, depicted is a block diagram of a process 1850 of applying risk prediction models to multimodal features. The process 1850 may correspond to or include operations in the system 1700 for establishing a multimodal model and determining risk scores for subjects. Under the process 1850, the model trainer 1750 executing on the data processing system 1705 may initialize or establish the risk prediction model 1765 (sometimes herein referred to as a multimodal or multivariate model). The model trainer 1750 may be invoked to establish the risk prediction model 1765 during training mode. The risk prediction model 1765 may be any machine learning (ML) model, such as: a regression model (e.g., linear or logistic regression), a clustering model (e.g., k-NN clustering or density-based clustering), a naive Bayesian classifier, an artificial neural network (ANN), a decision tree, a relevance vector machine (RVM), or a support vector machine (SVM), among others. The risk prediction model 1765 may be an instance of the Cox regression models discussed in Section A, such as the multivariate model generated using Algorithm 1. In general, the risk prediction model 1765 may have one or more inputs corresponding to the feature set 1840, one or more outputs for predicted risk scores, and one or more weights relating the inputs and the outputs, among others.

[0091] To establish the risk prediction model 1765, the model trainer 1750 may retrieve, receive, or identify training data. The training data may include one or more feature sets 1840 and corresponding expected risk scores, and may be maintained on the database 1770. Each feature set 1840 may identify or include the radiological features 1825, the histological features 1830, and genomic features 1835 for a given sample subject 1805 as discussed above. Each expected risk score may identify or correspond to a likelihood of an occurrence of an event (e.g., survival, hospitalization, injury, pain, treatment, or death) due to the condition in the subject 1805. The expected risk score may be manually created by a clinician (e.g., pathologist) examining the subject 1805 from which the feature set 1840 is obtained. In some embodiments, the training data may include a survival function for each feature set 1840 identifying expected risk scores over a period of time. The period of time may range, for example, from 3 days to 5 years. The model trainer 1750 may set the weights of the risk prediction model 1765 to initial values (e.g., zero or random) when initializing.

[0092] In some embodiments, the model trainer 1750 may identify or select features from the feature set 1840 of the training data to apply to the risk prediction model 1765. In selecting features for establishing the risk prediction model 1765, the model trainer 1750 may identify or select at least one radiological feature 1825 from the set of radiological features 1825. The selection of the at least one radiological feature 1825 may be performed using a model. The model may be any machine learning (ML) model, such as: a regression model (e.g., linear or logistic regression), a clustering model (e.g., k-NN clustering or density-based clustering), a naive Bayesian classifier, an artificial neural network (ANN), a decision tree, a relevance vector machine (RVM), or a support vector machine (SVM), among others. The model for selecting the radiological features 1825 may be, for example, an instance of the univariate Cox regression model discussed in Section A. The model trainer 1750 may establish the model by updating using the radiological features 1825 and the expected risk scores. The updating may include fitting and pruning the weights of the model for statistical significance of the types of features in the set of radiological features 1825 relative to the expected risk scores.

[0093] Upon fitting, the model trainer 1750 may calculate, generate, or otherwise determine a hazard ratio for each type of radiological feature 1825 in the set of radiological features 1825 from the model. The model trainer 1750 may also determine, calculate, or otherwise generate a confidence value for each hazard ratio. The hazard ratio may identify or correspond to a degree of effect that the corresponding radiological feature 1825 has on the expected risk score. In general, the lower the hazard ratio, the lower the contributory effect that the radiological feature 1825 has on the expected risk score. Conversely, the higher the hazard ratio, the higher the contributory effect that the radiological feature 1825 has on the expected risk score. Based on the hazard ratio, the model trainer 1750 may select at least one of the radiological features 1825 for training the risk prediction model 1765. For instance, the model trainer 1750 may select the n radiological features 1825 with the highest n hazard ratios with a threshold level of confidence (e.g., 95%).
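
The selection rule in the example above (the n features with the highest hazard ratios among those meeting a confidence threshold) may be sketched as follows. The feature names, hazard ratios, and confidence values are invented for illustration.

```python
def select_top_features(stats, n, min_confidence=0.95):
    """Pick the n features with the highest hazard ratio among confident ones.

    stats maps a feature name to a (hazard_ratio, confidence) pair.
    """
    eligible = [name for name, (_, confidence) in stats.items() if confidence >= min_confidence]
    eligible.sort(key=lambda name: stats[name][0], reverse=True)
    return eligible[:n]


# Hypothetical univariate results: (hazard ratio, confidence).
univariate = {
    "glcm_contrast": (1.80, 0.99),
    "glrlm_run_entropy": (1.40, 0.97),
    "glszm_zone_variance": (2.10, 0.80),
    "gldm_dependence_entropy": (1.10, 0.96),
}
selected = select_top_features(univariate, n=2)
print(selected)
```

The feature with the largest hazard ratio is excluded here because its confidence falls below the threshold.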

[0094] In addition, the model trainer 1750 may identify or select at least one histological feature 1830 from the set of histological features 1830. The selection of the at least one histological feature 1830 may be performed using a model. The model may be any machine learning (ML) model, such as: a regression model (e.g., linear or logistic regression), a clustering model (e.g., k-NN clustering or density-based clustering), a naive Bayesian classifier, an artificial neural network (ANN), a decision tree, a relevance vector machine (RVM), or a support vector machine (SVM), among others. The model for selecting the histological features 1830 may be, for example, an instance of the univariate Cox regression model discussed in Section A. The model trainer 1750 may establish the model by updating using the histological features 1830 and the expected risk scores. The updating may include fitting and pruning the weights of the model for statistical significance of the types of features in the set of histological features 1830 relative to the expected risk scores.

[0095] Upon fitting, the model trainer 1750 may calculate, generate, or otherwise determine a hazard ratio for each type of histological feature 1830 in the set of histological features 1830 from the model. The model trainer 1750 may also determine, calculate, or otherwise generate a confidence value for each hazard ratio. The hazard ratio may identify or correspond to a degree of effect that the corresponding histological feature 1830 has on the expected risk score. In general, the lower the hazard ratio, the lower the contributory effect that the histological feature 1830 has on the expected risk score. Conversely, the higher the hazard ratio, the higher the contributory effect that the histological feature 1830 has on the expected risk score. Based on the hazard ratio and the confidence value, the model trainer 1750 may select at least one of the histological features 1830 for training the risk prediction model 1765. For instance, the model trainer 1750 may select the n histological features 1830 with the highest n hazard ratios with a threshold level of confidence (e.g., 95%). In some embodiments, the model trainer 1750 may use the set of genomic features 1835 for training, without additional selection, as the gene sequencing data 1820 from which the genomic features 1835 are extracted may have been generated using targeted sequencing of DNA from the subject 1805.

[0096] From the training data, the model trainer 1750 may identify the feature set 1840 to apply to the risk prediction model 1765. The feature set 1840 may include at least one of the radiological features 1825, at least one of the histological features 1830, and at least one of the genomic features 1835, among others. In some embodiments, the feature set may include the radiological features 1825 and the histological features 1830 selected using the univariate models as discussed above, along with the genomic features 1835. The model trainer 1750 may traverse over the feature sets 1840 of the training data to identify each feature set 1840. To apply the feature set 1840, the model trainer 1750 may feed the feature set 1840 into the input of the risk prediction model 1765. Upon feeding, the model trainer 1750 may process the values of the feature set 1840 in accordance with the weights of the risk prediction model 1765 to output a predicted risk score for the feature set 1840. The predicted risk score may be similar to the expected risk score, and may identify or correspond to a likelihood of an occurrence of an event (e.g., survival, hospitalization, injury, pain, treatment, or death) due to the condition in the subject 1805 as calculated using the risk prediction model 1765. In some embodiments, the output may include the survival function identifying predicted risk scores over a period of time.

[0097] With the output, the model trainer 1750 may compare the predicted risk scores outputted by the risk prediction model 1765 and the corresponding expected risk scores from the training data. Using the comparison, the model trainer 1750 may update the weights of the risk prediction model 1765. In some embodiments, the model trainer 1750 may calculate, generate, or otherwise determine at least one loss metric (sometimes herein referred to as an error metric) based on the comparison. The loss metric may identify or correspond to a degree of deviation of the predicted risk score from the expected risk score. The loss metric may be calculated in accordance with any number of loss functions, such as a mean squared error (MSE), a mean absolute error (MAE), a hinge loss, a quantile loss, a quadratic loss, a smooth mean absolute loss, and a cross-entropy loss, among others. Using the loss metric, the model trainer 1750 may update the weights of the risk prediction model 1765. The updating (e.g., fitting and pruning) of the weights of the risk prediction model 1765 may be repeated until reaching convergence as defined for the model architecture.
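
The compare-and-update loop described above may be illustrated with a deliberately simple stand-in: a linear model trained by gradient descent on a mean squared error loss. This is a swapped-in technique chosen for brevity, not the Cox partial-likelihood fitting of Section A, and the data, learning rate, and function names are assumptions.

```python
def predict(weights, features):
    """Predicted risk score of a linear model."""
    return sum(w * x for w, x in zip(weights, features))


def mse(weights, feature_sets, expected):
    """Mean squared error between predicted and expected risk scores."""
    errors = [(predict(weights, x) - y) ** 2 for x, y in zip(feature_sets, expected)]
    return sum(errors) / len(errors)


def gradient_step(weights, feature_sets, expected, learning_rate=0.1):
    """One weight update along the negative gradient of the MSE loss."""
    n = len(expected)
    gradients = [0.0] * len(weights)
    for x, y in zip(feature_sets, expected):
        error = predict(weights, x) - y
        for k, x_k in enumerate(x):
            gradients[k] += 2 * error * x_k / n
    return [w - learning_rate * g for w, g in zip(weights, gradients)]


# Hypothetical training data: two features per subject and expected risk scores.
feature_sets = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
expected = [0.2, 0.6, 0.8]
weights = [0.0, 0.0]  # initial values
loss_before = mse(weights, feature_sets, expected)
for _ in range(200):
    weights = gradient_step(weights, feature_sets, expected)
loss_after = mse(weights, feature_sets, expected)
print(loss_before, loss_after, weights)
```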

[0098] In some embodiments, in updating the risk prediction model 1765, the model trainer 1750 may identify or select one or more features of the feature set 1840 for inputs of the risk prediction model 1765. The selected features may include at least one of the radiological features 1825, at least one of the histological features 1830, and at least one of the genomic features 1835. Upon fitting, the model trainer 1750 may calculate, generate, or otherwise determine a hazard ratio for each type of feature (e.g., the radiological feature 1825, the histological feature 1830, and the genomic feature 1835) in the feature set 1840 from the model. The model trainer 1750 may also determine, calculate, or otherwise generate a confidence value for each hazard ratio. The hazard ratio may identify or correspond to a degree of effect that the corresponding feature has on the expected risk score. In general, the lower the hazard ratio, the lower the contributory effect that the feature has on the expected risk score. Conversely, the higher the hazard ratio, the higher the contributory effect that the feature has on the expected risk score. Based on the hazard ratio and the confidence value, the model trainer 1750 may select features of each feature type for training the risk prediction model 1765. For instance, the model trainer 1750 may select the n histological features 1830 and the n radiological features 1825 with the highest n hazard ratios with a threshold level of confidence (e.g., 95%) in their respective feature type.

[0099] With the establishment of the risk prediction model 1765, the model applier 1755 executing on the data processing system 1705 may receive, retrieve, or otherwise identify the feature set 1840. The feature set 1840 may include at least one of the radiological features 1825, at least one of the histological features 1830, and at least one of the genomic features 1835. The feature set 1840 may be newly acquired, and differ from the feature sets 1840 of the training data as described above. Under runtime mode, the type of radiological features 1825, histological features 1830, and genomic features 1835 may correspond to those selected during training of the risk prediction model 1765. Upon the identification, the model applier 1755 may feed the feature set 1840 into the input of the risk prediction model 1765.

[0100] In feeding, the model applier 1755 may process the values of the feature set 1840 in accordance with the weights of the risk prediction model 1765 to output at least one predicted risk score 1850 for the feature set 1840. The predicted risk score 1850 may identify or correspond to a likelihood of an occurrence of an event (e.g., hospitalization, injury, pain, treatment, or death) due to the condition in the subject 1805 as calculated using the risk prediction model 1765. In some embodiments, the model applier 1755 may calculate, determine, or otherwise generate a survival function identifying predicted risk scores 1850 over a period of time using the risk prediction model 1765.
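
One way to picture a risk score and a survival function computed from model weights is a proportional-hazards form, sketched below. This is an assumption for illustration only: this section does not state the functional form of the risk prediction model, and the constant baseline hazard rate and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def risk_score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Relative risk exp(w . x) under an assumed proportional-hazards model."""
    return math.exp(sum(w * x for w, x in zip(weights, features)))


def survival_function(
    weights: Sequence[float],
    features: Sequence[float],
    baseline_hazard_rate: float,
    times: Sequence[float],
) -> List[float]:
    """S(t | x) = exp(-lambda0 * t * exp(w . x)), assuming a constant
    baseline hazard lambda0 (a simplification chosen for this sketch)."""
    relative_risk = risk_score(weights, features)
    return [math.exp(-baseline_hazard_rate * t * relative_risk) for t in times]
```

Under these assumptions, a feature set with a higher risk score yields a survival curve that lies below that of a lower-risk feature set at every positive time point.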

[0101] With the generation, the output handler 1760 executing on the data processing system 1705 may generate an association between the predicted risk score 1850 (or the survival function) and the feature set 1840 using one or more data structures, such as a linked list, a tree, an array, a table, a matrix, a stack, a queue, or a heap, among others. In some embodiments, the association may be among the predicted risk scores 1850, the subject 1805 (e.g., using an anonymized identifier), data used to generate the feature set 1840 (e.g., the tomogram 1810, the WSI 1815, and gene sequencing data 1820) and the feature set 1840. The data structures for the association may be stored and maintained on the database 1770.
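
The association described above can be sketched as a record keyed by an anonymized subject identifier and held in a table. The class names, field names, and the in-memory dictionary standing in for the database are all hypothetical choices for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RiskAssociation:
    subject_id: str                  # anonymized identifier for the subject
    feature_set: Dict[str, float]    # feature name -> value
    predicted_risk_score: float
    # References to the data used to derive the features (e.g. file locations).
    source_data: Dict[str, str] = field(default_factory=dict)


class AssociationStore:
    """In-memory table standing in for the database (an assumption)."""

    def __init__(self) -> None:
        self._table: Dict[str, RiskAssociation] = {}

    def put(self, record: RiskAssociation) -> None:
        """Store or overwrite the association for a subject."""
        self._table[record.subject_id] = record

    def get(self, subject_id: str) -> Optional[RiskAssociation]:
        """Return the stored association, or None if the subject is unknown."""
        return self._table.get(subject_id)
```

A table keyed by identifier is only one of the data structures the paragraph lists; it is used here because lookup by subject is the most direct access pattern.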

[0102] In some embodiments, the output handler 1760 may categorize, assign, or otherwise classify the subject 1805 into one of a set of risk level groups based on the predicted risk score 1850. The groups may be used to classify subjects 1805 by predicted risk score 1850. For example, one group may correspond to low risk of a particular cancer and another group may correspond to high risk for the same type of cancer. To classify, the output handler 1760 may compare the predicted risk score 1850 for the subject 1805 with a threshold for each risk level group. The threshold may delineate or define a value (or range) for the predicted risk scores 1850 above which the subject 1805 is to be classified into the associated risk level group. When the predicted risk score 1850 satisfies the threshold for at least one risk level group, the output handler 1760 may assign the subject 1805 (e.g., using the anonymized identifier) to the associated risk level group.
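
The threshold comparison above can be sketched as a small function. The group names, the threshold values, and the tie-breaking rule (when a score satisfies several thresholds, the group with the highest threshold wins) are assumptions for illustration; the disclosure does not fix them.

```python
from typing import Optional, Sequence, Tuple


def classify_risk(
    score: float, thresholds: Sequence[Tuple[str, float]]
) -> Optional[str]:
    """Return the risk level group whose threshold the score satisfies.

    thresholds holds (group_name, lower_bound) pairs. A score satisfies a
    threshold when it is at or above the bound. If several thresholds are
    satisfied, the group with the highest bound is returned; if none is
    satisfied, None is returned.
    """
    satisfied = [(bound, name) for name, bound in thresholds if score >= bound]
    if not satisfied:
        return None
    return max(satisfied)[1]


# Hypothetical groups and bounds, chosen only to exercise the function.
EXAMPLE_THRESHOLDS = [("low", 0.0), ("intermediate", 0.3), ("high", 0.7)]
```

With these example bounds, a score of 0.5 falls in the "intermediate" group and a score of 0.7 falls in the "high" group.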

[0103] In some embodiments, the output handler 1760 may generate information 1855 based on the predicted risk score 1850 (or the association). The information 1855 may include instructions for rendering, displaying, or otherwise presenting the predicted risk score 1850, along with the identifier for the subject 1805 and the feature set 1840, among others. Upon generation, the output handler 1760 may send, transmit, or otherwise provide the information 1855 to the display 1725 (or a computing device coupled with the display 1725). The provision of the information 1855 may be in response to a request from a user of the data processing system 1705 or the computing device. The display 1725 may render, display, or otherwise present the information 1855, such as the predicted risk score 1850, the feature set 1840, and the identifier for the subject 1805, among others. For instance, the display 1725 may display, render, or otherwise present the information 1855 via a graphical user interface of an application to display the predicted risk score 1850 and the classification into a risk level, adjacent to the tomogram 1810, the WSI 1815, and the gene sequencing data 1820, among others.

[0104] In this manner, the data processing system 1705 may be able to process features from various modalities (e.g., the tomogram 1810, the WSI 1815, and the gene sequencing data 1820) to more accurately generate the predicted risk scores 1850. The features from various modalities may be obtained from various portions of the treatment process of the subject 1805, thereby enriching the types of data used to apply to the risk prediction model 1765. By outputting more accurate risk scores 1850, the data processing system 1705 may save computing resources (e.g., processor and memory consumption) that would have been exhausted from providing inaccurate and thus less useful risk scores.

[0105] Referring now to FIG. 19, depicted is a flow diagram of a method 1900 of determining risk scores using multimodal feature sets. The method 1900 may be performed by or implemented using the system 1700 described herein in conjunction with FIGs. 17-18B or the system 2000 as described herein in conjunction with Section C. Under the method 1900, a computing system (e.g., the data processing system 1705) may identify a feature set (e.g., the feature set 1840 including the radiological feature 1825, the histological feature 1830, and the genomic feature 1835) (1905). The computing system may apply the feature set to a model (e.g., the risk prediction model 1765) (1910). The computing system may determine a predicted risk score (e.g., the predicted risk score 1850) from the application of the model (1915). The computing system may store an association between the predicted risk score and a subject (e.g., the subject 1805) (1920). The computing system may provide information (e.g., the information 1855) based on the predicted risk score (1925).

C. Computing and Network Environment

[0106] Various operations described herein can be implemented on computer systems. FIG. 20 shows a simplified block diagram of a representative server system 2000, client computer system 2014, and network 2026 usable to implement certain embodiments of the present disclosure. In various embodiments, server system 2000 or similar systems can implement services or servers described herein or portions thereof. Client computer system 2014 or similar systems can implement clients described herein. The systems 1700 described herein can be similar to the server system 2000. Server system 2000 can have a modular design that incorporates a number of modules 2002 (e.g., blades in a blade server embodiment); while two modules 2002 are shown, any number can be provided. Each module 2002 can include processing unit(s) 2004 and local storage 2006.

[0107] Processing unit(s) 2004 can include a single processor, which can have one or more cores, or multiple processors. In some embodiments, processing unit(s) 2004 can include a general-purpose primary processor as well as one or more special-purpose coprocessors such as graphics processors, digital signal processors, or the like. In some embodiments, some or all processing units 2004 can be implemented using customized circuits, such as application specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs). In some embodiments, such integrated circuits execute instructions that are stored on the circuit itself. In other embodiments, processing unit(s) 2004 can execute instructions stored in local storage 2006. Any type of processors in any combination can be included in processing unit(s) 2004.

[0108] Local storage 2006 can include volatile storage media (e.g., DRAM, SRAM, SDRAM, or the like) and/or non-volatile storage media (e.g., magnetic or optical disk, flash memory, or the like). Storage media incorporated in local storage 2006 can be fixed, removable, or upgradeable as desired. Local storage 2006 can be physically or logically divided into various subunits such as a system memory, a read-only memory (ROM), and a permanent storage device. The system memory can be a read-and-write memory device or a volatile read-and-write memory, such as dynamic random-access memory. The system memory can store some or all of the instructions and data that processing unit(s) 2004 need at runtime. The ROM can store static data and instructions that are needed by processing unit(s) 2004. The permanent storage device can be a non-volatile read-and-write memory device that can store instructions and data even when module 2002 is powered down. The term “storage medium” as used herein includes any medium in which data can be stored indefinitely (subject to overwriting, electrical disturbance, power loss, or the like) and does not include carrier waves and transitory electronic signals propagating wirelessly or over wired connections.

[0109] In some embodiments, local storage 2006 can store one or more software programs to be executed by processing unit(s) 2004, such as an operating system and/or programs implementing various server functions such as functions of the systems 1700 or any other system described herein, or any other server(s) associated with systems 1700 or any other system described herein.

[0110] “Software” refers generally to sequences of instructions that, when executed by processing unit(s) 2004, cause server system 2000 (or portions thereof) to perform various operations, thus defining one or more specific machine embodiments that execute and perform the operations of the software programs. The instructions can be stored as firmware residing in read-only memory and/or program code stored in non-volatile storage media that can be read into volatile working memory for execution by processing unit(s) 2004. Software can be implemented as a single program or a collection of separate programs or program modules that interact as desired. From local storage 2006 (or nonlocal storage described below), processing unit(s) 2004 can retrieve program instructions to execute and data to process in order to execute various operations described above.

[0111] In some server systems 2000, multiple modules 2002 can be interconnected via a bus or other interconnect 2008, forming a local area network that supports communication between modules 2002 and other components of server system 2000. Interconnect 2008 can be implemented using various technologies including server racks, hubs, routers, etc.

[0112] A wide area network (WAN) interface 2010 can provide data communication capability between the local area network (interconnect 2008) and the network 2026, such as the Internet. Various technologies can be used, including wired (e.g., Ethernet, IEEE 802.3 standards) and/or wireless technologies (e.g., Wi-Fi, IEEE 802.11 standards).

[0113] In some embodiments, local storage 2006 is intended to provide working memory for processing unit(s) 2004, providing fast access to programs and/or data to be processed while reducing traffic on interconnect 2008. Storage for larger quantities of data can be provided on the local area network by one or more mass storage subsystems 2012 that can be connected to interconnect 2008. Mass storage subsystem 2012 can be based on magnetic, optical, semiconductor, or other data storage media. Direct attached storage, storage area networks, network-attached storage, and the like can be used. Any data stores or other collections of data described herein as being produced, consumed, or maintained by a service or server can be stored in mass storage subsystem 2012. In some embodiments, additional data storage resources may be accessible via WAN interface 2010 (potentially with increased latency).

[0114] Server system 2000 can operate in response to requests received via WAN interface 2010. For example, one of the modules 2002 can implement a supervisory function and assign discrete tasks to other modules 2002 in response to received requests. Work allocation techniques can be used. As requests are processed, results can be returned to the requester via WAN interface 2010. Such operation can generally be automated. Further, in some embodiments, WAN interface 2010 can connect multiple server systems 2000 to each other, providing scalable systems capable of managing high volumes of activity. Other techniques for managing server systems and server farms (collections of server systems that cooperate) can be used, including dynamic resource allocation and reallocation.

[0115] Server system 2000 can interact with various user-owned or user-operated devices via a wide-area network such as the Internet. An example of a user-operated device is shown in FIG. 20 as client computing system 2014. Client computing system 2014 can be implemented, for example, as a consumer device such as a smartphone, other mobile phone, tablet computer, wearable computing device (e.g., smart watch, eyeglasses), desktop computer, laptop computer, and so on.

[0116] For example, client computing system 2014 can communicate via WAN interface 2010. Client computing system 2014 can include computer components such as processing unit(s) 2016, storage device 2018, network interface 2020, user input device 2022, and user output device 2037. Client computing system 2014 can be a computing device implemented in a variety of form factors, such as a desktop computer, laptop computer, tablet computer, smartphone, other mobile computing device, wearable computing device, or the like.

[0117] Processing unit(s) 2016 and storage device 2018 can be similar to processing unit(s) 2004 and local storage 2006 described above. Suitable devices can be selected based on the demands to be placed on client computing system 2014; for example, client computing system 2014 can be implemented as a “thin” client with limited processing capability or as a high-powered computing device. Client computing system 2014 can be provisioned with program code executable by processing unit(s) 2016 to enable various interactions with server system 2000.

[0118] Network interface 2020 can provide a connection to the network 2026, such as a wide area network (e.g., the Internet) to which WAN interface 2010 of server system 2000 is also connected. In various embodiments, network interface 2020 can include a wired interface (e.g., Ethernet) and/or a wireless interface implementing various RF data communication standards such as Wi-Fi, Bluetooth, or cellular data network standards (e.g., 3G, 4G, LTE, etc.).

[0119] User input device 2022 can include any device (or devices) via which a user can provide signals to client computing system 2014; client computing system 2014 can interpret the signals as indicative of particular user requests or information. In various embodiments, user input device 2022 can include any or all of a keyboard, touch pad, touch screen, mouse or other pointing device, scroll wheel, click wheel, dial, button, switch, keypad, microphone, and so on.

[0120] User output device 2037 can include any device via which client computing system 2014 can provide information to a user. For example, user output device 2037 can include a display to display images generated by or delivered to client computing system 2014. The display can incorporate various image generation technologies, e.g., a liquid crystal display (LCD), light-emitting diode (LED) including organic light-emitting diodes (OLED), projection system, cathode ray tube (CRT), or the like, together with supporting electronics (e.g., digital-to-analog or analog-to-digital converters, signal processors, or the like). Some embodiments can include a device such as a touchscreen that functions as both an input and an output device. In some embodiments, other user output devices 2037 can be provided in addition to or instead of a display. Examples include indicator lights, speakers, tactile “display” devices, printers, and so on.

[0121] Some embodiments include electronic components, such as microprocessors, storage, and memory that store computer program instructions in a computer readable storage medium. Many of the features described in this specification can be implemented as processes that are specified as a set of program instructions encoded on a computer readable storage medium. When these program instructions are executed by one or more processing units, they cause the processing unit(s) to perform various operations indicated in the program instructions. Examples of program instructions or computer code include machine code, such as is produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter. Through suitable programming, processing unit(s) 2004 and 2016 can provide various functionality for server system 2000 and client computing system 2014, including any of the functionality described herein as being performed by a server or client, or other functionality.

[0122] It will be appreciated that server system 2000 and client computing system 2014 are illustrative and that variations and modifications are possible. Computer systems used in connection with embodiments of the present disclosure can have other capabilities not specifically described here. Further, while server system 2000 and client computing system 2014 are described with reference to particular blocks, it is to be understood that these blocks are defined for convenience of description and are not intended to imply a particular physical arrangement of component parts. For instance, different blocks can be but need not be located in the same facility, in the same server rack, or on the same motherboard. Further, the blocks need not correspond to physically distinct components. Blocks can be configured to perform various operations, e.g., by programming a processor or providing appropriate control circuitry, and various blocks might or might not be reconfigurable depending on how the initial configuration is obtained. Embodiments of the present disclosure can be realized in a variety of apparatus including electronic devices implemented using any combination of circuitry and software.

[0123] While the disclosure has been described with respect to specific embodiments, one skilled in the art will recognize that numerous modifications are possible. Embodiments of the disclosure can be realized using a variety of computer systems and communication technologies, including, but not limited to, specific examples described herein. Embodiments of the present disclosure can be realized using any combination of dedicated components and/or programmable processors and/or other programmable devices. The various processes described herein can be implemented on the same processor or different processors in any combination. Where components are described as being configured to perform certain operations, such configuration can be accomplished, e.g., by designing electronic circuits to perform the operation, by programming programmable electronic circuits (such as microprocessors) to perform the operation, or any combination thereof. Further, while the embodiments described above may refer to specific hardware and software components, those skilled in the art will appreciate that different combinations of hardware and/or software components may also be used and that particular operations described as being implemented in hardware might also be implemented in software or vice versa.

[0124] Computer programs incorporating various features of the present disclosure may be encoded and stored on various computer readable storage media; suitable media include magnetic disk or tape, optical storage media such as compact disk (CD) or digital versatile disk (DVD), flash memory, and other non-transitory media. Computer readable media encoded with the program code may be packaged with a compatible electronic device, or the program code may be provided separately from electronic devices (e.g., via Internet download or as a separately packaged computer-readable storage medium).

[0125] Thus, although the disclosure has been described with respect to specific embodiments, it will be appreciated that the disclosure is intended to cover all modifications and equivalents within the scope of the following claims.