Title:
SYSTEMS, METHODS, AND MEDIA FOR AUTOMATICALLY DETECTING BLOOD ABNORMALITIES USING IMAGES OF INDIVIDUAL BLOOD CELLS
Document Type and Number:
WIPO Patent Application WO/2024/103004
Kind Code:
A1
Abstract:
In accordance with some embodiments, systems, methods, and media for automatically detecting blood abnormalities using images of individual blood cells are provided. In some embodiments, a system comprises: at least one hardware processor configured to: receive a plurality of images, each of the plurality of images representing at least one blood cell from a blood sample; provide, for each of the plurality of images, image data based on the image to a trained convolutional neural network (CNN); receive, for each of the plurality of images, an output from the trained CNN, the output indicative of a predicted classification of the blood cell represented in the image data from a plurality of blood cell classes; and output a report indicative of a prevalence of each blood cell class in the plurality of blood cell classes in the sample.

Inventors:
ZHAO HELONG (US)
FAERBER DOMINIK (US)
DEININGER MICHAEL (US)
Application Number:
PCT/US2023/079383
Publication Date:
May 16, 2024
Filing Date:
November 10, 2023
Assignee:
VERSITI BLOOD RES INSTITUTE FOUNDATION INC (US)
International Classes:
G16H30/40; G16H30/20; G06V10/764; G06V10/82
Attorney, Agent or Firm:
VOLBERDING, Peter, J. (US)
Claims:
CLAIMS

What is claimed is:

1. A system for automatically detecting blood abnormalities, the system comprising: at least one hardware processor configured to: receive a plurality of images, each of the plurality of images representing at least one blood cell from a blood sample; provide, for each of the plurality of images, image data based on the image to a trained convolutional neural network (CNN); receive, for each of the plurality of images, an output from the trained CNN, the output indicative of a predicted classification of the blood cell represented in the image data from a plurality of blood cell classes; and output a report indicative of a prevalence of each blood cell class in the plurality of blood cell classes in the sample.

2. The system of claim 1, wherein each of the plurality of images comprises pixels each associated with a pixel value, and wherein the at least one hardware processor is further configured to: generate, for each of the plurality of images, the image data based on the image such that each pixel of the image data is in a range of [0,1] and has a particular size corresponding to a size of an input layer of the trained CNN.

3. The system of claim 2, wherein the at least one hardware processor is further configured to: normalize, for each of the plurality of images, each pixel value to a range of [0,1]; and pad, for each of the plurality of images below a particular size, one or more margins of the image with black pixels to increase a size of the image to the particular size.

4. The system of claim 2, wherein the particular size is 100 x 100, and wherein the at least one hardware processor is further configured to: normalize, for each of the plurality of images, each pixel value to the range of [0,1] by converting the pixel value from a 16-bit brightness value.

5. The system of claim 1, wherein each of the plurality of images comprises a brightfield image captured by an imaging flow cytometer.

6. The system of claim 1, wherein the plurality of blood cell classes comprises two or more of the following: Normal, Acute Myeloid Leukemia (AML), Chronic Myelomonocytic Leukemia (CMML), Myeloproliferative Neoplasm (MPN), Myelodysplasia Syndrome (MDS), and Clonal Hematopoiesis of Indetermined Potential (CHIP).

7. The system of claim 1, further comprising an imaging flow cytometer, wherein the at least one hardware processor is further configured to: cause the imaging flow cytometer to capture the plurality of images.

8. The system of claim 1, wherein the at least one hardware processor is further configured to: provide, for each of the plurality of images, the image data based on the image to a second trained CNN; receive, for each of the plurality of images, an output from the second trained CNN, the output indicative of a predicted classification of the blood cell represented in the image data from the plurality of blood cell classes, wherein the report is based on the output from the trained CNN and the second trained CNN.

9. The system of claim 1, wherein the trained CNN comprises: a first convolutional layer; a first max pooling layer that receives an output of the first convolutional layer; a second convolutional layer that receives an output of the first max pooling layer; a second max pooling layer that receives an output of the second convolutional layer; a third convolutional layer that receives an output of the second max pooling layer; a third max pooling layer that receives an output of the third convolutional layer; a flatten layer that receives an output of the third max pooling layer; a first fully connected layer that receives an output of the flatten layer; a second fully connected layer that receives an output of the first fully connected layer; and a third fully connected layer that receives an output of the second fully connected layer, and outputs the output, wherein the third fully connected layer is implemented using a softmax activation function.

10. A method for automatically detecting blood abnormalities, the method comprising: receiving an image with a computer system, wherein the image represents at least one blood cell from a blood sample; inputting the image to a trained neural network using the computer system; receiving an output from the trained neural network with the computer system, the output indicative of a predicted classification of the at least one blood cell represented in the image from a plurality of blood cell classes; and outputting, with the computer system, a report indicative of a prevalence of each blood cell class in the plurality of blood cell classes in the sample.

11. The method of claim 10, wherein the image comprises pixels each associated with a pixel value, and wherein the method further comprises: generating image data based on the image such that each pixel of the image data is in a range of [0,1] and has a particular size corresponding to a size of an input layer of the trained neural network; and inputting the image data to the trained neural network.

12. The method of any one of claims 10 or 11, wherein the image comprises a brightfield image captured by an imaging flow cytometer.

13. The method of any one of claims 10-12, wherein the plurality of blood cell classes comprises two or more of the following: Normal, Acute Myeloid Leukemia (AML), Chronic Myelomonocytic Leukemia (CMML), Myeloproliferative Neoplasm (MPN), Myelodysplasia Syndrome (MDS), and Clonal Hematopoiesis of Indetermined Potential (CHIP).

14. The method of any one of claims 10-13, further comprising causing an imaging flow cytometer to capture the image.

15. The method of any one of claims 10-14, wherein the trained neural network is a trained convolutional neural network (CNN).

16. The method of any one of claims 10-15, further comprising: inputting the image to a second trained neural network; receiving an output from the second trained neural network, the output indicative of a predicted classification of the blood cell represented in the image from the plurality of blood cell classes, wherein the report is based on the output from the trained neural network and the second trained neural network.

17. The method of claim 16, wherein the second trained neural network is a second trained convolutional neural network (CNN).

Description:
SYSTEMS, METHODS, AND MEDIA FOR AUTOMATICALLY DETECTING BLOOD ABNORMALITIES USING IMAGES OF INDIVIDUAL BLOOD CELLS

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] The present application claims priority to U.S. Provisional Patent Application No. 63/424,381 that was filed November 10, 2022, the entire contents of which are hereby incorporated by reference.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

[0002] N/A

BACKGROUND

[0003] Blood smear tests are often used to inform diagnoses of various blood disorders. Such a test is typically performed by a hemato-pathologist examining a blood sample using a microscope to manually characterize blood cells collected from a subject. For example, the hemato-pathologist can characterize blood cells based on size, shape, color, general appearance, the relative number of various types of cells, etc. However, a blood smear can only reveal rare gross abnormalities of cell morphology. Most mutations in blood cells do not cause gross morphological changes and thus cannot be recognized by the human eye, even with the assistance of a microscope. To detect such mutations, patient samples need to be analyzed using costly next-generation genetic sequencing tests.

[0004] New systems, methods, and media for automatically detecting blood abnormalities using images of individual blood cells are desirable. Such systems, methods, and media can also be used in biomedical research to detect mutant blood cells in model organisms, such as mouse (Mus musculus).

SUMMARY

[0005] In accordance with some embodiments of the disclosed subject matter, systems, methods, and media for automatically detecting blood abnormalities using images of individual blood cells are provided.

[0006] In accordance with some embodiments, a system for automatically detecting blood abnormalities is provided, the system comprising: at least one hardware processor configured to: receive a plurality of images, each of the plurality of images representing at least one blood cell from a blood sample; provide, for each of the plurality of images, image data based on the image to a trained convolutional neural network (CNN); receive, for each of the plurality of images, an output from the trained CNN, the output indicative of a predicted classification of the blood cell represented in the image data from a plurality of blood cell classes; and output a report indicative of a prevalence of each blood cell class in the plurality of blood cell classes in the sample.

[0007] In some embodiments, each of the plurality of images comprises pixels each associated with a pixel value, and wherein the at least one hardware processor is further configured to: generate, for each of the plurality of images, the image data based on the image such that each pixel of the image data is in a range of [0,1] and has a particular size corresponding to a size of an input layer of the trained CNN.

[0008] In some embodiments, the at least one hardware processor is further configured to: normalize, for each of the plurality of images, each pixel value to a range of [0,1]; and pad, for each of the plurality of images below a particular size, one or more margins of the image with black pixels to increase a size of the image to the particular size.

[0009] In some embodiments, the particular size is 100 x 100, and wherein the at least one hardware processor is further configured to: normalize, for each of the plurality of images, each pixel value to the range of [0,1] by converting the pixel value from a 16-bit brightness value.
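To make the preprocessing described in the two preceding paragraphs concrete, the sketch below normalizes 16-bit brightness values into the range [0,1] and pads a smaller image to 100 x 100 with black pixels. This is only an illustrative sketch, not the patented implementation; the function names, the centered padding, and the use of NumPy are assumptions.

```python
import numpy as np

def normalize_16bit(image: np.ndarray) -> np.ndarray:
    """Map 16-bit brightness values into the range [0, 1]."""
    return image.astype(np.float64) / 65535.0

def pad_to_size(image: np.ndarray, size: int = 100) -> np.ndarray:
    """Pad the margins of a smaller image with black (0) pixels so
    it matches the size of the CNN input layer."""
    h, w = image.shape
    top = (size - h) // 2
    left = (size - w) // 2
    padded = np.zeros((size, size), dtype=image.dtype)
    padded[top:top + h, left:left + w] = image
    return padded

# A 60 x 80 brightfield crop becomes a 100 x 100 input in [0, 1].
raw = np.random.randint(0, 65536, (60, 80)).astype(np.uint16)
prepared = pad_to_size(normalize_16bit(raw))
```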

[0010] In some embodiments, each of the plurality of images comprises a brightfield image captured by an imaging flow cytometer.

[0011] In some embodiments, the plurality of blood cell classes comprises two or more of the following: Normal, Acute Myeloid Leukemia (AML), Chronic Myelomonocytic Leukemia (CMML), Myeloproliferative Neoplasm (MPN), Myelodysplasia Syndrome (MDS), Clonal Hematopoiesis of Indetermined Potential (CHIP), and mutant mouse blood cells carrying mutation(s) related to the above-mentioned human diseases. In some embodiments, the system further comprises an imaging flow cytometer, wherein the at least one hardware processor is further configured to: cause the imaging flow cytometer to capture the plurality of images.

[0012] In some embodiments, the at least one hardware processor is further configured to: provide, for each of the plurality of images, the image data based on the image to a second trained CNN; receive, for each of the plurality of images, an output from the second trained CNN, the output indicative of a predicted classification of the blood cell represented in the image data from the plurality of blood cell classes, wherein the report is based on the output from the trained CNN and the second trained CNN.

[0013] In some embodiments, the trained CNN comprises: a first convolutional layer; a first max pooling layer that receives an output of the first convolutional layer; a second convolutional layer that receives an output of the first max pooling layer; a second max pooling layer that receives an output of the second convolutional layer; a third convolutional layer that receives an output of the second max pooling layer; a third max pooling layer that receives an output of the third convolutional layer; a flatten layer that receives an output of the third max pooling layer; a first fully connected layer that receives an output of the flatten layer; a second fully connected layer that receives an output of the first fully connected layer; and a third fully connected layer that receives an output of the second fully connected layer, and outputs the output, wherein the third fully connected layer is implemented using a softmax activation function.
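The layer sequence in the preceding paragraph can be traced shape by shape. The sketch below assumes, purely for illustration, unpadded 3 x 3 convolution kernels, 2 x 2 non-overlapping max pooling, and 64 channels in the third convolutional layer; none of these sizes are fixed by the text above except the 100 x 100 input.

```python
def conv_out(size: int, kernel: int = 3) -> int:
    """Spatial size after an unpadded convolution."""
    return size - kernel + 1

def pool_out(size: int, window: int = 2) -> int:
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // window

# Trace a 100 x 100 input through the three conv + max-pool stages:
# 100 -> 98 -> 49 -> 47 -> 23 -> 21 -> 10.
size = 100
for stage in range(3):
    size = pool_out(conv_out(size))  # convolution, then max pooling

# The flatten layer turns the final feature map into a vector, which
# feeds the three fully connected layers; the last fully connected
# layer applies softmax over the blood cell classes.
channels = 64  # assumed channel count in the third conv layer
flattened_length = size * size * channels
```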

BRIEF DESCRIPTION OF THE DRAWINGS

[0014] Various objects, features, and advantages of the disclosed subject matter can be more fully appreciated with reference to the following detailed description of the disclosed subject matter when considered in connection with the following drawings, in which like reference numerals identify like elements.

[0015] FIG. 1 shows an example of a system for automatically detecting blood abnormalities using images of individual blood cells in accordance with some embodiments of the disclosed subject matter.

[0016] FIG. 2 shows an example of hardware that can be used to implement a blood cell image source, a computing device, and a server, shown in FIG. 1 in accordance with some embodiments of the disclosed subject matter.

[0017] FIG. 3 shows an example of a flow for training and using mechanisms for automatically detecting blood abnormalities using images of individual blood cells in accordance with some embodiments of the disclosed subject matter.

[0018] FIG. 4 shows an example of a process for training and using a convolutional neural network that can be used to implement mechanisms for automatically detecting blood abnormalities using images of individual blood cells in accordance with some embodiments of the disclosed subject matter.

[0019] FIG. 5 shows an example of a topology of a convolutional neural network that can be used to implement mechanisms for automatically detecting blood abnormalities using images of individual blood cells in accordance with some embodiments of the disclosed subject matter.

[0020] FIG. 6 shows an example of a user interface associated with a blood cell imaging source, and various blood cell images that can be used in connection with mechanisms for automatically detecting blood abnormalities using images of individual blood cells in accordance with some embodiments of the disclosed subject matter.

[0021] FIG. 7 shows an example of a raw image received from a blood cell imaging source, various blood cell images with normalized pixel values, and various blood cell images padded to a standardized size.

[0022] FIG. 8 shows an example of performance of various convolutional neural networks implemented using mechanisms described herein for automatically detecting blood abnormalities using images of individual blood cells during training.

[0023] FIG. 9 shows an example of a correlation between cell class predicted by trained convolutional neural networks implemented using mechanisms described herein for automatically detecting blood abnormalities using images of individual blood cells and test sample composition, and a Pearson's coefficient analysis showing overlap between diagnosis results and true sample composition.

DETAILED DESCRIPTION

[0024] In accordance with various embodiments, mechanisms (which can, for example, include systems, methods, and media) for automatically detecting blood abnormalities using images of individual blood cells are provided.

[0025] In some embodiments, mechanisms described herein can use a relatively small volume blood sample (e.g., as low as 10 microliters (μL)) to detect the presence of blood cell abnormalities that may be indicative of disease. For example, a blood sample can be collected from a finger prick using an ethylenediaminetetraacetic acid (EDTA)-coated capillary tube.

[0026] In some embodiments, a blood sample can be diluted (e.g., with buffered saline) and stained (e.g., with a nuclear dye, such as 1,5-bis{[2-(dimethylamino)ethyl]amino}-4,8-dihydroxyanthracene-9,10-dione, which has been marketed as DRAQ5).

[0027] In some embodiments, mechanisms described herein can capture images of cells from the stained sample using any suitable imaging system (e.g., using an imaging flow cytometer, such as an ImageStream imaging flow cytometer, marketed under the Amnis brand by Cytek Biosciences, headquartered in Fremont, California, United States). For example, mechanisms described herein can cause an imaging flow cytometer to capture images of individual blood cells in a sample and/or utilize images of individual blood cells in the sample captured by an imaging flow cytometer.

[0028] In some embodiments, mechanisms described herein can analyze data derived from the images of the blood cells in the sample (e.g., data generated via processing of image data captured by an imaging system, such as an imaging flow cytometer) to detect the presence of abnormal blood cells in the sample and/or to classify abnormal blood cells present in the sample. The image data may include, for example, pixel values in the images and/or parameters, metrics, or other quantities determined from or based on the pixel values in the images. As one non-limiting example, image data may include pixel values and/or normalized pixel values (e.g., pixel values that have been normalized to a range of [0,1]). In some embodiments, mechanisms described herein can be used to analyze white blood cells (e.g., labeled with a nuclear dye, as described above) and red blood cells separately.

[0029] In some embodiments, mechanisms described herein can be used to train a machine learning model (e.g., a convolutional neural network (CNN) or other neural network suitable for processing images) to detect the presence of abnormal blood cells in the sample and/or classify abnormal blood cells present in the sample as representing examples of a certain class of abnormal blood cell. For example, mechanisms described herein can use a set of labeled image data (e.g., processed image data) representing normal and abnormal cells as training data.

[0030] In some embodiments, image data generated from a blood sample collected from a subject can be provided to the trained machine learning model, and predictions by the trained machine learning model can be used to generate a diagnostic report that includes information indicative of whether the sample is normal, and if not, a likely type of disease(s) represented by the sample, and other information (e.g., the percentage of abnormal cells of one or more types).
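The aggregation step described above, from per-cell model outputs to a report of class prevalence, can be sketched as follows. The class list follows the classes named elsewhere in this disclosure; the function name, the argmax assignment rule, and the example probabilities are illustrative assumptions, not the patented method.

```python
import numpy as np

CLASSES = ["Normal", "AML", "CMML", "MPN", "MDS", "CHIP"]

def prevalence_report(probabilities: np.ndarray) -> dict:
    """Assign each cell to its highest-probability class and report
    the fraction of cells falling into each class."""
    predicted = probabilities.argmax(axis=1)
    counts = np.bincount(predicted, minlength=len(CLASSES))
    return {name: count / len(predicted)
            for name, count in zip(CLASSES, counts)}

# Three cells: two predicted Normal, one predicted AML.
probs = np.array([
    [0.90, 0.05, 0.01, 0.01, 0.02, 0.01],
    [0.70, 0.10, 0.05, 0.05, 0.05, 0.05],
    [0.10, 0.80, 0.03, 0.03, 0.02, 0.02],
])
report = prevalence_report(probs)
```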

[0031] In some embodiments, mechanisms described herein can be used to process blood samples at much higher rates than a trained medical practitioner (e.g., a hemato-pathologist), which can facilitate improvements in the diagnosis of blood disorders. For example, mechanisms described herein can improve automated blood analysis technologies (e.g., facilitating automation of a task currently performed by highly trained physicians, such as hemato-pathologists).

[0032] In some embodiments, mechanisms described herein can detect the presence of abnormal blood cells that make up a relatively small fraction of total blood cells, which may be missed by a human analyst (e.g., if the analyst happens to be viewing a portion of a sample that does not include any of the low abundance abnormal cells). In such an example, analysis using mechanisms described herein can facilitate diagnosis of disease in subjects that may otherwise be undiagnosed or diagnosed later.

[0033] In some embodiments, mechanisms described herein can detect blood cells with relatively subtle abnormalities that a human analyst may not recognize. In such an example, analysis using mechanisms described herein can facilitate diagnosis of disease in subjects that may otherwise be undiagnosed or diagnosed later (e.g., when abnormalities become more apparent).

[0034] The blood cells may comprise an abnormality that is caused by a genetic mutation.

[0035] The disclosed systems, methods, and media automatically detect blood abnormalities in blood samples from any suitable source organism. In one example, the blood sample may be from a vertebrate. The blood sample may be, for example, from a mammal. The blood sample may be from a laboratory animal, e.g., a mouse, a rat, a rabbit, a goat, a cow. The blood sample may be from a companion animal, e.g., a cat, a dog, a horse, a donkey. The blood sample may be from a human.

[0036] In some embodiments, mechanisms described herein can increase throughput of analysis of blood samples and/or lower costs associated with analysis of blood samples. For example, a system implemented in accordance with some embodiments of the disclosed subject matter can analyze a blood sample for abnormal blood cells and generate information indicative of diagnosis for a small fraction of the cost of a trained human analyst (e.g., as low as 1% of the cost of a hemato-pathologist). For example, using mechanisms described herein can facilitate analysis of blood samples more quickly and/or at lower cost, can facilitate analysis of blood samples on site in a clinical setting (e.g., not requiring transportation of a blood sample to a specialized lab for analysis), and/or can allow trained human analysts (e.g., hemato-pathologists) to analyze blood samples predicted to be indicative of disease (e.g., confirming an analysis performed by mechanisms described herein). In some embodiments, mechanisms described herein can be used to implement high-throughput and low-cost technologies that can be used to detect blood abnormalities, which can be especially useful for screening of blood abnormalities in large populations.

[0037] FIG. 1 shows an example 100 of a system for automatically detecting blood abnormalities using images of individual blood cells in accordance with some embodiments of the disclosed subject matter. As shown in FIG. 1, a computing device 110 can receive images of blood cells in a blood sample from blood cell image source 102.

[0038] In some embodiments, computing device 110 can execute at least a portion of a blood cell classification system 106 to detect the presence of abnormal cells in blood cell image data received from blood cell image source 102. As described below in connection with FIGS. 3 and 6, blood cell classification system 106 can generate information indicative of the presence of abnormal blood cells based on image data of individual blood cells received from blood cell image source 102.

[0039] Additionally or alternatively, in some embodiments, computing device 110 can communicate image data received from blood cell image source 102 and/or derived from blood cell image source 102 to a server 120 over a communication network 108, and server 120 can execute at least a portion of blood cell classification system 106. In such embodiments, server 120 can return information to computing device 110 (and/or any other suitable computing device) indicative of an output of blood cell classification system 106, such as information indicative of the presence of abnormal blood cells and/or a diagnosis of a disease(s). In some embodiments, blood cell classification system 106 can execute one or more portions of process 400 described below in connection with FIG. 4.

[0040] In some embodiments, computing device 110 and/or server 120 can be any suitable computing device or combination of devices, such as a desktop computer, a laptop computer, a smartphone, a tablet computer, a wearable computer, a server computer, a virtual machine being executed by a physical computing device, etc.

[0041] In some embodiments, blood cell imaging source 102 can be any suitable source of blood cell image data. For example, blood cell imaging source 102 can be an imaging flow cytometer. As another example, blood cell imaging source 102 can be a microscope with a controlled fluidic adapter/chip and image sensor. As another example, blood cell imaging source 102 can be another computing device (e.g., a server storing blood cell image data).

[0042] In some embodiments, blood cell imaging source 102 can be local to computing device 110. For example, blood cell imaging source 102 can be incorporated with computing device 110 (e.g., computing device 110 can be configured as part of a device for capturing and/or storing blood cell image data). As another example, blood cell imaging source 102 can be connected to computing device 110 by a cable, a direct wireless link, etc. Additionally or alternatively, in some embodiments, blood cell imaging source 102 can be located locally and/or remotely from computing device 110, and can communicate blood cell image data to computing device 110 (and/or server 120) via a communication network (e.g., communication network 108). Note that, in some embodiments, blood cell imaging source 102 can provide other image data (e.g., an image(s) that does not include a blood cell).

[0043] In some embodiments, communication network 108 can be any suitable communication network or combination of communication networks. For example, communication network 108 can include a Wi-Fi network (which can include one or more wireless routers, one or more switches, etc.), a peer-to-peer network (e.g., a Bluetooth network), a cellular network (e.g., a 3G network, a 4G network, a 5G network, etc., complying with any suitable standard, such as CDMA, GSM, LTE, LTE Advanced, NR, etc.), a wired network, etc. In some embodiments, communication network 108 can be a local area network, a wide area network, a public network (e.g., the Internet), a private or semi-private network (e.g., a corporate or university intranet), any other suitable type of network, or any suitable combination of networks. Communications links shown in FIG. 1 can each be any suitable communications link or combination of communications links, such as wired links, fiber optic links, Wi-Fi links, Bluetooth links, cellular links, etc.

[0044] FIG. 2 shows an example 200 of hardware that can be used to implement a blood cell image source 102, computing device 110, and/or server 120 in accordance with some embodiments of the disclosed subject matter. As shown in FIG. 2, in some embodiments, computing device 110 can include a processor 202, a display 204, one or more inputs 206, one or more communication systems 208, and/or memory 210. In some embodiments, processor 202 can be any suitable hardware processor or combination of processors, such as a central processing unit (CPU), a graphics processing unit (GPU), an accelerated processing unit (APU), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), etc. In some embodiments, display 204 can include any suitable display devices, such as a computer monitor, a touchscreen, a television, etc. In some embodiments, inputs 206 can include any suitable input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, etc.

[0045] In some embodiments, communications systems 208 can include any suitable hardware, firmware, and/or software for communicating information over communication network 108 and/or any other suitable communication networks. For example, communications systems 208 can include one or more transceivers, one or more communication chips and/or chip sets, etc. In a more particular example, communications systems 208 can include hardware, firmware and/or software that can be used to establish a Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, etc.

[0046] In some embodiments, memory 210 can include any suitable storage device or devices that can be used to store instructions, values, etc., that can be used, for example, by processor 202 to present content using display 204, to communicate with server 120 via communications system(s) 208, etc. Memory 210 can include any suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof. For example, memory 210 can include RAM, ROM, EEPROM, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, etc. In some embodiments, memory 210 can have encoded thereon a computer program for controlling operation of computing device 110. In such embodiments, processor 202 can execute at least a portion of the computer program to perform any suitable combination of: receiving blood cell image data; processing blood cell image data; providing processed blood cell image data to a trained machine learning algorithm; receiving output from the trained machine learning algorithm; processing and/or analyzing the output; generating diagnosis information; presenting content (e.g., blood cell images, diagnosis information, user interfaces, graphics, tables, etc.); receiving content and/or information from server 120; transmitting content and/or information to server 120; etc. For example, processor 202 can execute at least a portion of the computer program to perform one or more portions of process 400 described below in connection with FIG. 4.

[0047] In some embodiments, server 120 can include a processor 212, a display 214, one or more inputs 216, one or more communications systems 218, and/or memory 220. In some embodiments, processor 212 can be any suitable hardware processor or combination of processors, such as a CPU, a GPU, an APU, an ASIC, an FPGA, etc. In some embodiments, display 214 can include any suitable display devices, such as a computer monitor, a touchscreen, a television, etc. In some embodiments, inputs 216 can include any suitable input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, etc.

[0048] In some embodiments, communications systems 218 can include any suitable hardware, firmware, and/or software for communicating information over communication network 108 and/or any other suitable communication networks. For example, communications systems 218 can include one or more transceivers, one or more communication chips and/or chip sets, etc. In a more particular example, communications systems 218 can include hardware, firmware and/or software that can be used to establish a Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, etc.

[0049] In some embodiments, memory 220 can include any suitable storage device or devices that can be used to store instructions, values, etc., that can be used, for example, by processor 212 to present content using display 214, to communicate with one or more computing devices 110, etc. Memory 220 can include any suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof. For example, memory 220 can include RAM, ROM, EEPROM, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, etc. In some embodiments, memory 220 can have encoded thereon a server program for controlling operation of server 120. In such embodiments, processor 212 can execute at least a portion of the server program to perform any suitable combination of: transmitting information and/or content (e.g., blood cell images, diagnosis information, user interfaces, graphics, tables, etc.) to one or more computing devices 110; receiving information and/or content from one or more computing devices 110; receiving blood cell image data; processing blood cell image data; providing processed blood cell image data to a trained machine learning algorithm; receiving output from the trained machine learning algorithm; processing and/or analyzing the output; generating diagnosis information; etc. For example, processor 212 can execute at least a portion of the server program to perform one or more portions of process 400 described below in connection with FIG. 4.

[0050] In some embodiments, blood cell image source 102 can include a processor 222, imaging components 224, one or more communications systems 226, and/or memory 228. In some embodiments, processor 222 can be any suitable hardware processor or combination of processors, such as a CPU, a GPU, an APU, an ASIC, an FPGA, etc. In some embodiments, imaging components 224 can be any suitable components to generate blood cell images, such as a flow cell, one or more light sources (e.g., a source of transmitted light, one or more lasers, etc.), optics, an image sensor (e.g., implemented using CMOS pixels, CCD pixels, and/or any suitable pixels), a spectral decomposition element, one or more optical filters, etc.

[0051] Note that, although not shown, blood cell image source 102 can include any suitable inputs and/or outputs. For example, blood cell image source 102 can include input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, a trackpad, a trackball, hardware buttons, software buttons, etc. As another example, blood cell image source 102 can include any suitable display devices, such as a computer monitor, a touchscreen, a television, etc., one or more speakers, etc.

[0052] In some embodiments, communications systems 226 can include any suitable hardware, firmware, and/or software for communicating information to computing device 110 (and, in some embodiments, over communication network 108 and/or any other suitable communication networks). For example, communications systems 226 can include one or more transceivers, one or more communication chips and/or chip sets, etc. In a more particular example, communications systems 226 can include hardware, firmware, and/or software that can be used to establish a wired connection using any suitable port and/or communication standard (e.g., VGA, DVI video, USB, RS-232, etc.), a Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, etc.

[0053] In some embodiments, memory 228 can include any suitable storage device or devices that can be used to store instructions, values, image data, etc., that can be used, for example, by processor 222 to: control imaging components 224, and/or receive image data from imaging components 224; generate images; present content (e.g., blood cell images, a user interface, etc.) using a display; communicate with one or more computing devices 110 (which can, e.g., control one or more operations of blood cell image source 102); etc. Memory 228 can include any suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof. For example, memory 228 can include RAM, ROM, EEPROM, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, etc. In some embodiments, memory 228 can have encoded thereon a program for controlling operation of blood cell image source 102. In such embodiments, processor 222 can execute at least a portion of the program to generate image data, transmit information and/or content (e.g., image data depicting one or more blood cells) to one or more computing devices 110, receive information and/or content from one or more computing devices 110, transmit information and/or content (e.g., image data) to one or more servers 120, receive information and/or content from one or more servers 120, receive instructions from one or more computing devices (e.g., a personal computer, a laptop computer, a tablet computer, a smartphone, etc.), etc.

[0054] FIG. 3 shows an example 300 of a flow for training and using mechanisms for automatically detecting blood abnormalities using images of individual blood cells in accordance with some embodiments of the disclosed subject matter.

[0055] In some embodiments, mechanisms described herein can be used to train a convolutional neural network (CNN), or other neural network suitable for processing images, to detect the presence of blood cells indicative of disease using processed blood cell images as input. As described below in connection with FIG. 5, a CNN can be implemented using convolutional layers, pooling layers, and fully connected layers.

[0056] As shown in FIG. 3, labeled blood cell images 302 (e.g., images that have already been pre-processed using techniques described below in connection with 408 of FIG. 4) generated from blood samples collected from one or more subjects (e.g., human subjects, mouse subjects, other animal subjects, etc.) can be used as training data, and can include blood cells exhibiting various morphologies, and each blood cell can be associated with a label corresponding to the morphology. For example, some blood cell images that depict normal cells can be labeled using a normal label, and some blood cell images that depict abnormal cells can be labeled using an abnormal label. In such an example, there can be one or more different classes of normal cells and/or abnormal cells, which can each be associated with a different label. For example, image data depicting human cells in a human blood dataset can include, but is not limited to, any suitable combination of the following disease/label types: Normal (Normal), Acute Myeloid Leukemia (AML) (Abnormal), Chronic Myelomonocytic Leukemia (CMML) (Abnormal), Myeloproliferative Neoplasm (MPN) (Abnormal), Myelodysplastic Syndrome (MDS) (Abnormal), and Clonal Hematopoiesis of Indeterminate Potential (CHIP) (Abnormal). As another example, image data depicting mouse cells in a mouse blood dataset can include any suitable combination of the following label types: peripheral blood (PB)-Normal (Normal), PB-Mutant (Abnormal), Bone Marrow (BM)-Normal (Normal), and BM-Mutant (Abnormal).
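The label scheme described above can be illustrated with a short Python sketch (the dictionary and function names are illustrative assumptions, not part of the described systems):

```python
# Illustrative mapping from per-cell disease labels to Normal/Abnormal
# categories, following the label types listed above (names are hypothetical).
HUMAN_LABELS = {
    "Normal": "Normal",
    "AML": "Abnormal",   # Acute Myeloid Leukemia
    "CMML": "Abnormal",  # Chronic Myelomonocytic Leukemia
    "MPN": "Abnormal",   # Myeloproliferative Neoplasm
    "MDS": "Abnormal",   # Myelodysplastic Syndrome
    "CHIP": "Abnormal",  # Clonal Hematopoiesis of Indeterminate Potential
}

MOUSE_LABELS = {
    "PB-Normal": "Normal",
    "PB-Mutant": "Abnormal",
    "BM-Normal": "Normal",
    "BM-Mutant": "Abnormal",
}


def is_abnormal(label: str, table: dict) -> bool:
    """Return True when a per-cell label maps to the Abnormal category."""
    return table[label] == "Abnormal"
```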

[0057] In some embodiments, additional training data can be generated by augmenting images received from a blood image source (e.g., blood image source 102), for example, as described below in connection with 408 of FIG. 4.

[0058] In some embodiments, an untrained CNN 304 can be trained (e.g., by computing device 110, by server 120, by blood cell classification system 106) using labeled images of blood cells 302. In some embodiments, untrained CNN 304 can have any suitable topology, such as a topology described below in connection with FIG. 5.

[0059] In some embodiments, untrained CNN 304 can be trained using an Adam optimizer (e.g., based on an optimizer described in Kingma et al., "Adam: A Method for Stochastic Optimization," available at arxiv(dot)org, 2014). In some embodiments, a particular image 302 can be provided as input to untrained CNN 304, which can output a predicted blood cell classification for each input image 302.

[0060] In some embodiments, images 302 can be formatted in any suitable format. For example, images 302 can each be formatted as an array of values in which each element of the array corresponds to an image pixel and has a value in a particular range (e.g., a value in a range of [0,1]). In some embodiments, outputs 306 (a predicted label of the cell depicted in image data provided as input) of untrained CNN 304 can be formatted in any suitable format, for example, as an array (e.g., a vector) in which each element corresponds to a likelihood that the cell belongs to a particular class.
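The input and output formats described above can be sketched as follows (a minimal NumPy illustration assuming 16-bit input images and an argmax readout of the output vector; the function names are hypothetical):

```python
import numpy as np


def normalize_image(raw: np.ndarray, bit_depth: int = 16) -> np.ndarray:
    """Scale raw integer brightness values into the [0, 1] range."""
    return raw.astype(np.float64) / (2 ** bit_depth - 1)


def predicted_label(output: np.ndarray, labels: list) -> str:
    """Pick the label whose element of the output vector is largest."""
    return labels[int(np.argmax(output))]
```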

[0061] In some embodiments, predicted labels 306 can be compared to the corresponding label for the image 302 to evaluate the performance of untrained CNN 304. For example, a loss value can be calculated using a loss function L, which can be used to evaluate the performance of untrained CNN 304 in predicting to which class a blood cell depicted in an image belongs. In some embodiments, the loss value can be used to adjust weights of untrained CNN 304. For example, a loss calculation 308 can be performed (e.g., by computing device 110, by server 120, by blood cell classification system 106) to generate a loss value that can represent a performance of untrained CNN 304. The loss value generated by loss calculation 308 can be used to adjust weights of untrained CNN 304. For example, the loss value can be used with an Adam optimizer, a learning rate of 0.001, and for any suitable number of training epochs (e.g., fifteen epochs). From the pre-processed dataset, 95% can be used for training and 5% can be used for testing. From the training subset, 20% can be used for validation, and 80% can be used for training.
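The 95%/5% and 80%/20% splits described above can be sketched as follows (an illustrative index-level split; it does not model the subject-level selection that can also be used, and the function name is hypothetical):

```python
import numpy as np


def split_dataset(n_images: int, seed: int = 0):
    """Shuffle image indices, hold out 5% for testing, then divide the
    remainder 80/20 into training and validation subsets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_images)
    n_test = int(round(0.05 * n_images))
    test_idx, rest = idx[:n_test], idx[n_test:]
    n_val = int(round(0.20 * rest.size))
    val_idx, train_idx = rest[:n_val], rest[n_val:]
    return train_idx, val_idx, test_idx
```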

[0062] In some embodiments, after training has converged (and the untrained CNN 304 performs adequately on test data), untrained CNN 304 with final weights can be used as a trained CNN 312.

[0063] As shown in FIG. 3, unlabeled blood cell images 310 can be provided as input to trained CNN 312, which can output predicted blood cell labels 314 for each unlabeled image 310.

[0064] In some embodiments, the frequencies of abnormal cell types can be used to establish correlations between true sample composition and predicted sample composition, using linear regression (e.g., with 95% confidence intervals, with 90% confidence intervals, etc.), for example, as described below in connection with FIG. 9, panel (a). The linear correlation can characterize a formula that can be used to correct predicted sample composition to reflect the true sample composition.

[0065] In some embodiments, predicted blood cell labels 314 can be used (e.g., by computing device 110, by server 120, by blood cell classification system 106) to perform an analysis 316 to estimate prevalence of various normal and/or abnormal blood cells in the sample, and a report informative of a diagnosis 318 can be generated and/or output based on the analysis. For example, predicted blood cell labels 314 can be used to estimate a percentage of each type of cell, to predict a diagnosis (of a disease), and/or output a report informative of a diagnosis. In some embodiments, any suitable technique or combination of techniques can be used to analyze the predicted blood cell labels 314 (e.g., to perform analysis 316) and/or generate a diagnosis and/or report informative of a diagnosis, such as techniques described below in connection with 422 of FIG. 4.

[0066] In some embodiments, the analysis and/or report can include an absolute total number of cells used for prediction, an absolute total number of normal cells (for entire sample), an absolute total number of abnormal cells (with abnormal categories summarized as a whole or as individual types, for entire sample), a relative number of abnormal cells (for entire sample), and/or an absolute total number of cells (in sample).
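The report measures listed above can be tallied from a list of predicted labels, for example as in the following sketch (the function name and dictionary keys are illustrative assumptions):

```python
from collections import Counter


def summarize_predictions(labels, abnormal_types):
    """Tally per-class counts plus total/normal/abnormal cell numbers and
    the relative number of abnormal cells for a sample."""
    counts = Counter(labels)
    total = sum(counts.values())
    abnormal = sum(n for lbl, n in counts.items() if lbl in abnormal_types)
    return {
        "per_class": dict(counts),
        "total_cells": total,
        "normal_cells": total - abnormal,
        "abnormal_cells": abnormal,
        "abnormal_fraction": abnormal / total if total else 0.0,
    }
```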

[0067] In some embodiments, flow 300 can be used to train multiple CNNs (e.g., multiple trained CNNs 312). For example, different CNNs can be trained using different objectives, and can be used as an ensemble model. For example, each CNN can be trained using a different portion of the dataset selected as the training dataset, the validation dataset, and/or the test dataset.

[0068] FIG. 4 shows an example 400 of a process for training and using a convolutional neural network, or other neural network suitable for processing images, that can be used to implement mechanisms for automatically detecting blood abnormalities using images of individual blood cells in accordance with some embodiments of the disclosed subject matter.

[0069] At 402, process 400 can include obtaining and/or receiving various blood samples that each include normal blood cells or a combination of normal blood cells and abnormal blood cells. For example, blood samples can be obtained from subjects. In a more particular example, a peripheral blood sample (e.g., of 10-50 µL) can be collected by lancet finger prick from a human subject. As another more particular example, an aliquot of peripheral blood sample can be obtained from an EDTA tube collected blood for lab tests. As yet another example, a blood sample (e.g., of 20-50 µL) can be collected from a subject (e.g., an animal, such as a mouse) via submandibular bleeding.

[0070] At 404, process 400 can prepare the blood samples using any suitable technique or combination of techniques, and image blood cells within the prepared sample using an imaging device (e.g., blood cell image source 102).

[0071] In some embodiments, process 400 can include preparing the blood samples for imaging by a blood cell imaging system (e.g., blood cell image source 102). For example, 10 µL of a blood sample can be transferred to a sterile Eppendorf tube, and 100 µL of processing solution (e.g., including PBS with 5 micromolar (µM) nuclear dye, which can include DRAQ5, and 1 µg/mL dead cell dye, such as 7-Aminoactinomycin D) can be added to the tube to dilute and stain the sample.

[0072] In some embodiments, 5 minutes after adding processing solution, diluted samples can be loaded into an imaging device (e.g., the Amnis ImageStream imaging flow cytometer) to obtain bright field microscopic images, and any other suitable type of image (e.g., fluorescent microscopic images), of cells in their native morphology (e.g., as shown in FIG. 6). In some embodiments, the fluorescent microscopic images can be used to distinguish between white blood cells and red blood cells, and/or can be used to label cells for training.

[0073] In some embodiments, recorded events can be gated to remove multi-cell clusters, cells out of focus, cells without nuclei, and dead cells.

[0074] In some embodiments, bright field images of selected cells can be exported (e.g., as 16-bit raw pixel value .tif files, any other suitable RAW file, a .jpeg file, etc.).

[0075] At 406, process 400 can include labeling of blood cell images to identify the cell in the image as belonging to a particular class, such as a normal class (or particular normal class), or a particular type of abnormal class, based on cell morphology and/or a known condition of a subject from which the blood sample was taken. For example, each cell can be labeled with a label described above in connection with blood cell images 302, such as Normal (Normal), AML (Abnormal), CMML (Abnormal), MPN (Abnormal), MDS (Abnormal), and CHIP (Abnormal) for human cells, or PB-Normal (Normal), PB-Mutant (Abnormal), BM-Normal (Normal), BM-Mutant (Abnormal) for mouse cells.

[0076] In some embodiments, labels can be associated with each blood cell image based on an analysis by a human analyst or a panel of human analysts (e.g., a hematopathologist(s)) of the blood cell image.

[0077] At 408, process 400 can pre-process the images using any suitable technique or combination of techniques. For example, the labeled, one-color channel single-cell images from individual image files (e.g., .tif files, any other suitable RAW files, .jpeg files, etc.) can be converted into an array (e.g., a pixel matrix), and pixel values can be normalized from pixel values (e.g., 8-bit brightness values, 16-bit brightness values, etc.) into a range of [0,1].

[0078] In some embodiments, process 400 can standardize the size of each image to a standard size (e.g., 100 x 100 pixels), which can correspond to a size of an input layer of a CNN to be trained (e.g., untrained CNN 304). For example, blood cell images that are larger than 100 x 100 pixels can be filtered (e.g., removed from the dataset), as they may be likely to be cell clusters (e.g., rather than a single cell). As another example, images smaller than 100 x 100 pixels can be padded to be 100 x 100 pixels (e.g., by adding a black margin, with a pixel value = 0, around the original image), such as shown in FIG. 7, panels (b) and (c).
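The filtering and padding described above can be sketched as follows (an illustrative NumPy implementation assuming grayscale images and a centered black margin; the text does not specify where the margin is placed, and the function name is hypothetical):

```python
import numpy as np


def standardize(image: np.ndarray, size: int = 100):
    """Return None for over-sized images (likely cell clusters, which are
    filtered out), or a zero-padded (black-margin) size x size image."""
    h, w = image.shape
    if h > size or w > size:
        return None  # removed from the dataset
    top = (size - h) // 2
    left = (size - w) // 2
    out = np.zeros((size, size), dtype=image.dtype)
    out[top:top + h, left:left + w] = image
    return out
```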

[0079] In some embodiments, process 400 can generate augmented data from pre-processed images, which can increase the amount of training data, using any suitable augmentation technique and/or combination of augmentation techniques. For example, process 400 can rotate an image (e.g., clockwise rotation of an image of 90°, 180°, or 270°).

[0080] As another example, process 400 can change brightness of an image, creating a darker/brighter version respectively by subtracting/adding a value (e.g., a value of 0.2) from/to each pixel value (e.g., not exceeding a range of [0,1]).

[0081] As yet another example, process 400 can change a contrast of an image (e.g., contrasting the original image with default settings of the MATLAB imadjust()-function which saturates the bottom 1% and the top 1% of all pixel values).

[0082] As yet another example, process 400 can generate a reflected version of an image (e.g., by mirroring an image along the y-axis).
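The rotation, brightness, and mirroring augmentations described above can be sketched as follows (an illustrative NumPy generator; the contrast adjustment, described as using MATLAB's imadjust()-function, is omitted, and the function name is hypothetical):

```python
import numpy as np


def augment(image: np.ndarray):
    """Yield augmented variants of a [0, 1] image: clockwise 90/180/270
    degree rotations, darker/brighter copies (-/+ 0.2, clipped to [0, 1]),
    and a mirror image along the y-axis."""
    for k in (1, 2, 3):                   # clockwise 90, 180, 270 degrees
        yield np.rot90(image, k=-k)
    yield np.clip(image - 0.2, 0.0, 1.0)  # darker version
    yield np.clip(image + 0.2, 0.0, 1.0)  # brighter version
    yield image[:, ::-1]                  # mirrored along the y-axis
```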

[0083] In some embodiments, after augmentation, the batch of real data and augmented data can be merged as a pre-processed dataset for training.

[0084] At 410, process 400 can train a CNN using any suitable technique or combination of techniques, such as techniques described above in connection with FIG. 3. In some embodiments, process 400 can train multiple CNNs.

[0085] At 412, process 400 can include obtaining and/or receiving an unlabeled blood sample that includes normal blood cells or a combination of normal blood cells and abnormal blood cells. For example, the unlabeled blood sample can be obtained from a subject for which a diagnosis is sought. In a more particular example, a peripheral blood sample (e.g., of 10-50 µL) can be collected by lancet finger prick from a human subject. As another more particular example, an aliquot of peripheral blood sample can be obtained from an EDTA tube collected blood for lab tests. As yet another example, a blood sample (e.g., of 20-50 µL) can be collected from a subject (e.g., an animal, such as a mouse) via submandibular bleeding.

[0086] At 414, process 400 can prepare the unlabeled blood sample using any suitable technique or combination of techniques, and image blood cells within the unlabeled prepared sample using an imaging device (e.g., blood cell image source 102).

[0087] In some embodiments, process 400 can include preparing the unlabeled blood sample for imaging by a blood cell imaging system (e.g., blood cell image source 102). For example, 10 µL of a blood sample can be transferred to a sterile Eppendorf tube, and 100 µL of processing solution (e.g., including PBS with 5 micromolar (µM) nuclear dye, which can include DRAQ5, and 1 µg/mL dead cell dye, such as 7-Aminoactinomycin D) can be added to the tube to dilute and stain the sample.

[0088] In some embodiments, 5 minutes after adding processing solution to the unlabeled blood sample, the diluted unlabeled sample can be loaded into an imaging device (e.g., the Amnis ImageStream imaging flow cytometer) to obtain bright field microscopic images, and any other suitable type of image (e.g., fluorescent microscopic images), of cells in their native morphology (e.g., as shown in FIG. 6).

[0089] In some embodiments, recorded events can be gated to remove multi-cell clusters, cells out of focus, cells without nuclei, and dead cells.

[0090] In some embodiments, bright field images of selected cells can be exported (e.g., as 16-bit raw pixel value .tif files, any other suitable RAW file, a .jpeg file, etc.).

[0091] At 416, process 400 can pre-process the images generated from the unlabeled sample using any suitable technique or combination of techniques. For example, the unlabeled, one-color channel single-cell images from individual image files (e.g., .tif files, any other suitable RAW files, .jpeg files, etc.) can be converted into an array (e.g., a pixel matrix), and pixel values can be normalized from pixel values (e.g., 8-bit brightness values, 16-bit brightness values, etc.) into a range of [0,1].

[0092] In some embodiments, process 400 can standardize the size of each image to a standard size (e.g., 100 x 100 pixels), which can correspond to a size of an input layer of the trained CNN(s) (e.g., trained CNN 312). For example, blood cell images that are larger than 100 x 100 pixels can be filtered (e.g., removed from the dataset), as they may be likely to be cell clusters (e.g., rather than a single cell). As another example, images smaller than 100 x 100 pixels can be padded to be 100 x 100 pixels (e.g., by adding a black margin, with a pixel value = 0, around the original image), such as shown in FIG. 7, panels (b) and (c).

[0093] At 418, process 400 can provide the pre-processed images from the unlabeled blood sample to a trained CNN(s).

[0094] At 420, process 400 can receive classifications (e.g., predicted labels) for each of various blood cell images from the unlabeled blood sample as output from the trained CNN(s).

[0095] At 422, process 400 can analyze the classification results from the CNN(s), and/or generate a diagnosis based on the classification results. For example, in some embodiments, each output from the trained CNN(s) can be made under a certain probability (e.g., the likelihood that the CNN associates the given image with the predicted cell population).

[0096] For example, process 400 can determine an absolute total number of cells used for prediction, an absolute total number of normal cells (for entire sample), an absolute total number of abnormal cells (with abnormal categories summarized as a whole or as individual types, for entire sample), a relative number of abnormal cells (for entire sample), an absolute total number of cells (in sample), and/or any other suitable information characterizing the output of the CNN(s).

[0097] As another example, process 400 can use a linear regression model to revise an estimate of a composition of the sample based on the number of cells characterized as abnormal cells of a certain type.
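Such a linear regression correction can be sketched as follows (an illustrative least-squares fit of true versus predicted abnormal-cell frequencies; the confidence intervals mentioned above are not modeled, and the function name is hypothetical):

```python
import numpy as np


def fit_correction(predicted: np.ndarray, true: np.ndarray):
    """Fit true = a * predicted + b by least squares and return a function
    that corrects a predicted abnormal-cell frequency toward the true value."""
    a, b = np.polyfit(predicted, true, deg=1)
    return lambda p: a * p + b
```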

[0098] In some embodiments, process 400 can generate a diagnosis based on the predicted and/or revised estimated frequency of abnormal cells. The type of disease/abnormality of the sample can be estimated based on the most prevalent abnormal cell type in the sample.

[0099] At 424, process 400 can output a report indicative of the analysis performed at 422 and/or informative of a diagnosis based on the analysis generated at 422. For example, process 400 can cause the report to be presented (e.g., by computing device 110). As another example, process 400 can cause the report to be made available to a user (e.g., through an electronic message, such as email, through a portal, etc.).

[0100] In some embodiments, the report can include information generated using process 400 (e.g., prevalence of various types of normal and/or abnormal cells in the sample), and/or a likely diagnosis based on the information generated using process 400.

[0101] FIG. 5 shows an example of a topology of a convolutional neural network that can be used to implement mechanisms for automatically detecting blood abnormalities using images of individual blood cells in accordance with some embodiments of the disclosed subject matter.

[0102] In some embodiments, mechanisms described herein can use a topology that includes an input layer (e.g., a 100 x 100 x 1 input layer), pairs of convolutional layers each followed by a pooling layer (e.g., a max pooling layer) (e.g., three pairs of convolutional layers and pooling layers), and multiple fully connected layers (e.g., preceded by a flatten layer that formats an output of a pooling layer as a 1D vector that can be input to a fully connected layer).

[0103] In some embodiments, the CNN can be implemented using any suitable technique, such as using the Python library Keras.

[0104] In some embodiments, the model can include ten layers, listed from first until last, which can include a 2D-convolutional-layer (e.g., with kernel size 3*3, 32 filters, Rectified Linear Unit activation function), a 2D-max-pool-layer (e.g., with pool size 2*2, and a stride of two), a second 2D-convolutional-layer (e.g., with kernel size 3*3, 64 filters, Rectified Linear Unit activation function), a second 2D-max-pool-layer (e.g., with pool size 2*2, and a stride of two), a third 2D-convolutional-layer (e.g., with kernel size 3*3, 128 filters, Rectified Linear Unit activation function), a third 2D-max-pool-layer (e.g., with pool size 2*2, and a stride of two), a flatten-layer, a fully connected layer (e.g., which is sometimes referred to as a dense-layer, which can include 64 layer nodes, Rectified Linear Unit activation function), a second fully connected layer (e.g., which can include 184 layer nodes, Rectified Linear Unit activation function), and a third fully connected layer (which can include a number of layer nodes corresponding to the number of labels, such as 3 or 6 layer nodes, and a SoftMax activation function).
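Assuming unpadded ("valid") 3*3 convolutions and 2*2 max pooling with a stride of two (an assumption; the paragraph does not state the padding mode), the spatial size of the feature maps through the three conv/pool pairs can be traced with a short sketch (the function name is hypothetical):

```python
def feature_map_sizes(input_size: int = 100):
    """Trace the spatial size through three conv/pool pairs, assuming
    unpadded 3x3 convolutions and 2x2 max pooling with stride two."""
    sizes = []
    s = input_size
    for _ in range(3):
        s = s - 2      # 3x3 convolution without padding
        sizes.append(s)
        s = s // 2     # 2x2 max pool, stride 2
        sizes.append(s)
    return sizes
```

Under these assumptions the sizes run 100 → 98 → 49 → 47 → 23 → 21 → 10, so the flatten layer would feed 10 * 10 * 128 = 12,800 values into the first fully connected layer.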

[0105] FIG. 6 shows an example of a user interface associated with a blood cell imaging source, and various blood cell images that can be used in connection with mechanisms for automatically detecting blood abnormalities using images of individual blood cells in accordance with some embodiments of the disclosed subject matter.

[0106] FIG. 6, panel (a), shows a portion of a user interface that can be used in connection with capturing images of blood cells using a blood imaging system.

[0107] FIG. 6, panel (b), shows images of abnormal and normal cells captured using an Amnis ImageStream imaging flow cytometer using brightfield images and fluorescent images.

[0108] FIG. 7 shows an example of a raw image received from a blood cell imaging source, various blood cell images with normalized pixel values, and various blood cell images padded to a standardized size.

[0109] FIG. 7, panel (a) shows an example visualization of a .tif image file of a blood cell captured using an Amnis ImageStream imaging flow cytometer.

[0110] FIG. 7, panel (b) shows brightfield images of individual blood cells that have been normalized to values of [0,1].

[0111] FIG. 7, panel (c) shows the images shown in FIG. 7, panel (b) that have been padded with black pixels (having a value of 0) to cause each image to be 100 x 100 pixels.

[0112] FIG. 8 shows an example of performance of various convolutional neural networks implemented using mechanisms described herein for automatically detecting blood abnormalities using images of individual blood cells during training.

[0113] Mechanisms described herein were used to train CNNs to predict the presence of various types of blood cells in samples from mice. For an MLL-AF9 fusion gene-induced chimeric acute myeloid leukemia (AML) disease model, C57BL/6J mouse bone marrow (BM) was harvested and lentivirally transduced with an MLL-AF9-GFP cassette and transplanted into lethally irradiated C57BL/6J littermates. After bone marrow transplant (BMT), blood samples (20-50 µL) were obtained from mice by submandibular bleeding, and kept in refrigerated EDTA-coated sample tubes. Samples with detectable GFP+ cells were used for training and testing a CNN using mechanisms described herein. All mouse work was approved by the Medical College of Wisconsin (MCW) institutional animal care and use committee (IACUC) (protocol # AUA00007680).

[0114] Mechanisms described herein were used to train CNNs to predict the presence of various types of blood cells in samples from human subjects. For healthy donors, peripheral blood samples (10-50 µL) were collected by lancet finger prick, and kept in refrigerated EDTA-coated sample tubes. For patients seen at a specialty hematological clinic, an aliquot of peripheral blood sample was obtained from EDTA tube collected blood for lab tests. All blood donors consented to participate in medical research, and use of human samples for research was approved by the MCW institutional review board (IRB) committee (protocol # PR000042528).

[0115] Training of CNNs was performed in the Python environment, for both models trained with human blood cells (which are sometimes referred to herein as Human Cell Models, HuCMs) and models trained with mouse blood cells (which are sometimes referred to herein as Mouse Cell Models, MsCMs).

[0116] For HuCMs, real and augmented images were used to create three streams of models. The first stream of HuCMs used a dataset that included all six previously referenced Human Blood label types; via data augmentation, each label type contained 13,290 single-cell images. The second stream of HuCMs used the same label types; however, each type was up-sampled to 3,480 single-cell images (e.g., including fewer images generated through augmentation). The third stream contained only three label types, PB-Normal (Normal), CMML (Abnormal), and AML (Abnormal), and each label type was up-sampled to 2,500 single-cell images using data augmentation. For MsCMs, only one stream of models was used, which was designed to have a 2:1 ratio between Normal and Abnormal cells. All seven earlier Mouse blood label types described above were present in this dataset.

[0117] The same training processes were used for both HuCMs and MsCMs: First, the pre-processed, labeled, and up-sampled single-cell images were loaded in. The number of loaded label types depended on the model stream. Several conversion techniques were applied in order to use MATLAB objects in Python. All labeled images were combined into one dataframe, and the entries were shuffled to guarantee that there was no correlation between row order and label types. To train each model, 95% of the objects were randomly selected from each dataset as training input (e.g., by randomly selecting all images associated with certain subjects to include in the training data, rather than randomly selecting images) and the remaining 5% of the objects were reserved to test model consistency after training (this can ensure that the test data does not include data from the same subjects used to train the models). The training set was divided into a training portion (80%) and a validation portion (20%). The image matrices of both training & validation portions were converted to a multidimensional array in order to fit the input layer of the machine learning models. The models were based on Keras Sequential()-models, trained on 15 epochs, with optimizer Adam() and a learning rate of 0.001. The following architecture was used: input layer: shape [n, 100, 100, 1]; convolutional-(2D)-layer: filters=32, kernel size [3,3], activation function = relu; max-pooling-(2D)-layer: pool size = [2,2], strides = 2; convolutional-(2D)-layer: filters=64, kernel size [3,3], activation function = relu; max-pooling-(2D)-layer: pool size = [2,2], strides = 2; convolutional-(2D)-layer: filters=128, kernel size [3,3], activation function = relu; max-pooling-(2D)-layer: pool size = [2,2], strides = 2; flatten-layer I; dense-layer II; dense-layer III; and dense-layer IV with a softmax activation.

[0118] After training, the models were saved along with their training history (including data loss and model accuracy during training and validating phase). Under each stream a total of ten models was trained and sorted into five subgroups. Within each subgroup the random test-training-data split was the same, across groups the split was different (the data used for training was different for each subgroup). In some embodiments, using an ensemble of models can improve the overall classification accuracy, as error in each model can be random.
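One way to combine such an ensemble of trained models is to average their per-class probability vectors and select the class with the highest mean probability (the text does not specify the combination rule, so this sketch is an illustrative assumption):

```python
import numpy as np


def ensemble_predict(model_outputs: np.ndarray):
    """Average class-probability vectors across models (rows) and return
    the index of the class with the highest mean probability."""
    mean_probs = model_outputs.mean(axis=0)
    return int(np.argmax(mean_probs)), mean_probs
```

Averaging probabilities tends to cancel random, uncorrelated errors across the individual models, which is consistent with the observation above that ensemble use can improve overall classification accuracy.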

[0119] Cell type prediction and sample classification were performed in the Python environment. Several independent new datasets were used to evaluate the performance of trained models in a blinded fashion. These new datasets were composed of cells from additional mouse or human blood samples that were obtained and processed independently from the training datasets. Accordingly, the trained models are completely blinded to these independent new datasets.

[0120] The recorded images were pre-processed (e.g., as described above in connection with 408) and were loaded into Python along with the trained models. A total of 26 external samples was used, which included some of the human blood cell types. The trained models were used to analyze each individual cell from each sample and classify the input images with cell types. Each prediction was made under a certain probability (e.g., the likelihood, in a range of [0,1], that each model associates the given image with the predicted cell population).

[0121] The predictions were saved in a spreadsheet (specifically, an Excel spreadsheet with file extension .xlsx). These files included the following measures: the absolute total number of cells predicted (for each cell population); the absolute total number of normal cells (for the entire sample); the absolute total number of abnormal cells (for the entire sample); the relative number of abnormal cells (for the entire sample); and the absolute total number of cells (in the sample).
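The per-sample measures listed above can be derived from the per-cell predictions roughly as follows. This is a sketch; treating only the "Normal" class as normal is an illustrative assumption:

```python
from collections import Counter

def sample_measures(predicted_types, normal_types=frozenset({"Normal"})):
    """Compute per-sample summary measures from per-cell predictions."""
    counts = Counter(predicted_types)       # absolute number per cell population
    total = len(predicted_types)            # absolute total number of cells in sample
    normal = sum(n for t, n in counts.items() if t in normal_types)
    abnormal = total - normal
    return {
        "per_population": dict(counts),     # per-population absolute counts
        "normal_total": normal,             # absolute number of normal cells
        "abnormal_total": abnormal,         # absolute number of abnormal cells
        "abnormal_fraction": abnormal / total if total else 0.0,  # relative number
        "total": total,
    }

m = sample_measures(["Normal", "AML", "Normal", "MDS", "AML"])
# m["abnormal_total"] -> 3, m["abnormal_fraction"] -> 0.6
```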

[0122] FIG. 9 shows an example of a correlation between cell class predicted by trained convolutional neural networks implemented using mechanisms described herein for automatically detecting blood abnormalities using images of individual blood cells and test sample composition, and a Pearson's coefficient analysis showing overlap between diagnosis results and true sample composition.

[0123] The measures described above in connection with FIG. 8 were obtained for each model within each stream of HuCMs (i.e., 30 predictions were made for each blood cell image), and the frequencies of abnormal cell types were used to establish correlations between true sample composition and predicted sample composition, using linear regression with 95% confidence intervals. This linear correlation describes a model that can be used to correct predicted sample composition to reflect the true sample composition, as shown in FIG. 9, panel (a).

[0124] Additionally, multiple unknown samples were processed in the established pre-processing pipeline and all input cells were given a predicted cell type. Then the predicted frequency of abnormal cells was corrected using the linear regression model to estimate the true frequency of abnormal cells in the unknown sample. The type of disease/abnormality of the sample was determined by the most prevalent type of abnormal cell type in the sample.
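The correction step can be sketched as an ordinary least-squares fit of true composition against predicted composition, which is then applied to predictions on unknown samples. This is a sketch; the fit described above also used 95% confidence intervals, which are omitted here:

```python
def fit_correction(predicted, true):
    """Fit true = a * predicted + b by ordinary least squares and
    return a function that corrects a new predicted frequency."""
    n = len(predicted)
    mx = sum(predicted) / n
    my = sum(true) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(predicted, true))
    var = sum((x - mx) ** 2 for x in predicted)
    a = cov / var
    b = my - a * mx
    return lambda x: a * x + b

# illustrative data: the models systematically over-call abnormal cells
correct = fit_correction(predicted=[0.11, 0.33, 0.55],
                         true=[0.10, 0.30, 0.50])
estimated_true = correct(0.44)  # -> 0.4
```

The most prevalent abnormal cell type among the corrected counts then determines the predicted disease/abnormality of the sample.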

[0125] Results of the Pearson's coefficient analysis are shown in FIG. 9, panel (b). Overall, ten trained CNNs were used to make predictions on human blood samples. When analyzing the performance of the models, the relative and absolute occurrences of cells predicted as normal and as abnormal were counted. Afterwards, the predictions of the models were compared to the actual percentage of normal/abnormal cells in the sample. FIG. 9, panel (b) shows analysis of models trained using different hardware (e.g., a local CPU and a cloud-based GPU) and with different numbers of samples. As shown in FIG. 9, panel (b), training using different hardware can be expected to result in trained CNNs that produce similar results.

[0126] Further Examples Having a Variety of Features:

[0127] Implementation examples are described in the following numbered clauses:

[0128] 1. A method for automatically detecting blood abnormalities, the method comprising: receiving a plurality of images, each of the plurality of images representing at least one blood cell from a blood sample; providing, for each of the plurality of images, image data based on the image to a trained convolutional neural network (CNN); receiving, for each of the plurality of images, an output from the trained CNN, the output indicative of a predicted classification of the blood cell represented in the image data from a plurality of blood cell classes; and outputting a report indicative of a prevalence of each blood cell class in the plurality of blood cell classes in the sample.

[0129] 2. The method of clause 1, wherein each of the plurality of images comprises pixels each associated with a pixel value, and wherein the method further comprises: generating, for each of the plurality of images, the image data based on the image such that each pixel value of the image data is in a range of [0,1] and the image data has a particular size corresponding to a size of an input layer of the trained CNN.

[0130] 3. The method of clause 2, further comprising: normalizing, for each of the plurality of images, each pixel value to a range of [0,1]; and padding, for each of the plurality of images below a particular size, one or more margins of the image with black pixels to increase a size of the image to the particular size.

[0131] 4. The method of clause 2, wherein the particular size is 100 x 100, and wherein the method further comprises normalizing, for each of the plurality of images, each pixel value to the range of [0,1] by converting the pixel value from a 16-bit brightness value.
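The pre-processing of clauses 2 to 4 can be sketched as follows. This is a sketch assuming 16-bit brightness values (maximum 65535), black padding pixels with value zero, and padding applied to the bottom and right margins; the actual choice of margins may differ:

```python
def preprocess(image, size=100):
    """Normalize 16-bit pixel values to [0, 1] and pad the image
    with black (zero-valued) pixels up to size x size, matching the
    CNN input layer."""
    h, w = len(image), len(image[0])
    out = [[0.0] * size for _ in range(size)]  # black canvas of the target size
    for r in range(h):
        for c in range(w):
            out[r][c] = image[r][c] / 65535.0  # 16-bit brightness -> [0, 1]
    return out

img = preprocess([[0, 65535], [32768, 16384]])
# img is 100 x 100; img[0][1] -> 1.0; padded pixels remain 0.0
```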

[0132] 5. The method of any one of clauses 1 to 4, wherein each of the plurality of images comprises a brightfield image captured by an imaging flow cytometer.

[0133] 6. The method of any one of clauses 1 to 5, wherein the plurality of blood cell classes comprises two or more of the following: Normal, Acute Myeloid Leukemia (AML), Chronic Myelomonocytic Leukemia (CMML), Myeloproliferative Neoplasm (MPN), Myelodysplastic Syndrome (MDS), and Clonal Hematopoiesis of Indeterminate Potential (CHIP).

[0134] 7. The method of any one of clauses 1 to 6, further comprising causing an imaging flow cytometer to capture the plurality of images.

[0135] 8. The method of any one of clauses 1 to 7, further comprising: providing, for each of the plurality of images, the image data based on the image to a second trained CNN; receiving, for each of the plurality of images, an output from the second trained CNN, the output indicative of a predicted classification of the blood cell represented in the image data from the plurality of blood cell classes, wherein the report is based on the output from the trained CNN and the second trained CNN.

[0136] 9. The method of any one of clauses 1 to 8, wherein the trained CNN comprises: a first convolutional layer; a first max pooling layer that receives an output of the first convolutional layer; a second convolutional layer that receives an output of the first max pooling layer; a second max pooling layer that receives an output of the second convolutional layer; a third convolutional layer that receives an output of the second max pooling layer; a third max pooling layer that receives an output of the third convolutional layer; a flatten layer that receives an output of the third max pooling layer; a first fully connected layer that receives an output of the flatten layer; a second fully connected layer that receives an output of the first fully connected layer; and a third fully connected layer that receives an output of the second fully connected layer, and outputs the output, wherein the third fully connected layer is implemented using a softmax activation function.

[0137] 10. A system for automatically detecting blood abnormalities, comprising: at least one processor that is configured to: perform a method of any of clauses 1 to 9.

[0138] 11. The system of clause 10, wherein the system further comprises a flow cytometer.

[0139] 12. A non-transitory computer-readable medium storing computer-executable code, comprising code for causing a computer to cause a processor to: perform a method of any of clauses 1 to 9.

[0140] In some embodiments, any suitable computer readable media can be used for storing instructions for performing the functions and/or processes described herein. For example, in some embodiments, computer readable media can be transitory or non-transitory. For example, non-transitory computer readable media can include media such as magnetic media (such as hard disks, floppy disks, etc.), optical media (such as compact discs, digital video discs, Blu-ray discs, etc.), semiconductor media (such as RAM, Flash memory, electrically programmable read only memory (EPROM), electrically erasable programmable read only memory (EEPROM), etc.), any suitable media that is not fleeting or devoid of any semblance of permanence during transmission, and/or any suitable tangible media. As another example, transitory computer readable media can include signals on networks, in wires, conductors, optical fibers, circuits, or any suitable media that is fleeting and devoid of any semblance of permanence during transmission, and/or any suitable intangible media.

[0141] It should be noted that, as used herein, the term mechanism can encompass hardware, software, firmware, or any suitable combination thereof.

[0142] It should be understood that the above described steps of the processes of FIG. 4 can be executed or performed in any order or sequence not limited to the order and sequence shown and described in the figures. Also, some of the above steps of the processes of FIG. 4 can be executed or performed substantially simultaneously where appropriate or in parallel to reduce latency and processing times.

[0143] Although the invention has been described and illustrated in the foregoing illustrative embodiments, it is understood that the present disclosure has been made only by way of example, and that numerous changes in the details of implementation of the invention can be made without departing from the spirit and scope of the invention, which is limited only by the claims that follow. Features of the disclosed embodiments can be combined and rearranged in various ways.