Title:
DECISION FOR DOUBLE READER
Document Type and Number:
WIPO Patent Application WO/2022/112731
Kind Code:
A1
Abstract:
The present invention relates to deep learning implementations for medical imaging. More particularly, the present invention relates to a method and system for suggesting whether to obtain a two user manual review/analysis or a single user review/analysis of a set of medical images from an initial medical screening. Aspects and/or embodiments seek to provide a method and system for suggesting whether one or two radiologists review one or more cases/sets of medical images, based on the use of computer-aided analysis (for example using deep learning) on each case/set of medical images.

Inventors:
RIJKEN TOBIAS (GB)
KARPATI EDITH (HU)
O'NEILL MICHAEL (GB)
HEINDL ANDREAS (GB)
YEARSLEY JOSEPH ELLIOT (GB)
KORKINOF DIMITRIOS (GB)
KHARA GALVIN (GB)
Application Number:
PCT/GB2020/053016
Publication Date:
June 02, 2022
Filing Date:
November 26, 2020
Assignee:
KHEIRON MEDICAL TECH LTD (GB)
International Classes:
G06T7/00; A61B6/00
Domestic Patent References:
WO2019239155A1 2019-12-19
Foreign References:
US20060100507A1 2006-05-11
Other References:
CHRISTIANA BALTA ET AL: "Going from double to single reading for screening exams labeled as likely normal by AI: what is the impact?", 15TH INTERNATIONAL WORKSHOP ON BREAST IMAGING (IWBI2020), 22 May 2020 (2020-05-22), pages 66, XP055762421, ISBN: 978-1-5106-3832-7, DOI: 10.1117/12.2564179
FUJITA HIROSHI: "AI-based computer-aided diagnosis (AI-CAD): the latest review to read first", RADIOLOGICAL PHYSICS AND TECHNOLOGY, SPRINGER JAPAN KK, JP, vol. 13, no. 1, 2 January 2020 (2020-01-02), pages 6 - 19, XP037030006, ISSN: 1865-0333, [retrieved on 20200102], DOI: 10.1007/S12194-019-00552-4
ANONYMOUS: "Use of double reading in mammography screening", 28 May 2020 (2020-05-28), XP055762430, Retrieved from the Internet [retrieved on 20201223]
HE KAIMING ET AL: "Mask R-CNN", COMPUTER VISION (ICCV), 2017 IEEE INTERNATIONAL CONFERENCE ON. IEEE, 2017
PAUL F. JAEGER ET AL., RETINA U-NET: EMBARRASSINGLY SIMPLE EXPLOITATION OF SEGMENTATION SUPERVISION FOR MEDICAL OBJECT DETECTION, Retrieved from the Internet
Attorney, Agent or Firm:
BARNES, Philip Michael (GB)
Claims:
CLAIMS:

1. A computer-aided method of analysing medical images (1100), the method comprising the steps of: receiving one or more medical images (1100); using one or more trained machine learning models (520) to independently analyse said one or more medical images to determine one or more characteristics; generating output data based on the determined one or more characteristics; triggering at least two manual reviews (1108, 1109) of the one or more medical images (1100) when the determined characteristics indicate a positive malignancy classification of the one or more medical images (1100).

2. The method of claim 1 wherein the method instructs the at least two manual reviews to be performed by two or more independent users (1108, 1109, 1111).

3. The method of any preceding claim wherein the further analysis comprises any or any combination of: a computerised tomography (CT) scan; an ultrasound scan; a magnetic resonance imaging (MRI) scan; a tomosynthesis scan; and/or a biopsy.

4. The method of any preceding claim wherein the one or more medical images (1100) comprises one or more mammographic or X-ray scans.

5. The method of any preceding claim wherein the step of analysing and determining is performed using one or more trained machine learning models (401, 30, 1103).

6. The method of claim 5 wherein the trained machine learning models (104, 30, 1103) comprise convolutional neural networks.

7. The method of any preceding claim wherein the step of analysing and determining comprises segmenting one or more anatomical regions.

8. The method of any preceding claim wherein the output data (1100) further comprises overlay data indicating a segmentation outline and/or a probability mask showing one or more locations of one or more segmented regions.

9. The method of any preceding claim wherein the step of analysing and determining comprises identifying tissue type and density category.

10. The method of claims 3 and 9 wherein the further analysis comprises one or more additional medical tests (105a) dependent upon the density category (105b) determined based on the one or more medical images (1100).

11. The method of any preceding claim wherein the step of analysing and determining comprises automatically identifying one or more anomalous regions in the medical image (1100).

12. The method of any preceding claim wherein the step of analysing and determining comprises identifying and distinguishing between a malignant lesion and/or a benign lesion and/or typical lesion.

13. The method of claim 12 wherein the output data further comprises overlay data indicating a probability mask for the one or more lesions.

14. The method of any preceding claim wherein the step of analysing and determining comprises identifying architectural distortion.

15. The method of any preceding claim wherein the one or more medical images and the one or more additional medical images comprise the use of digital imaging and communications in medicine, DICOM, files.

16. A system for analysing sets of medical images (1100), the system comprising means for carrying out the method of any one of claims 1 to 15.

17. The system of claim 16 further comprising: a medical imaging device (101); a storage device; a user terminal operable to input diagnosis metadata for each set of medical images (1100); a processing unit operable to analyse one or more of each set of medical images on the storage device; and an output viewer (202) operable to display a requirement for or trigger a further analysis of the set of medical images (1100).

18. The system of claim 17 wherein the processing unit is integrated with the medical imaging device or wherein the processing unit is located remotely and is accessible via a communications channel.

19. The system of claim 17 wherein the storage device comprises a picture archiving communication system, PACS, and/or a vendor neutral archive, VNA.

20. A computer program product comprising instructions which, when the program is executed by a computer, cause the computer to carry out the method according to any one of claims 1 to 15.

Description:
DECISION FOR DOUBLE READER

Field

The present invention relates to deep learning implementations for medical imaging.

More particularly, the present invention relates to a method and system for suggesting whether to obtain a two user manual review/analysis or a single user review/analysis of a set of medical images from an initial medical screening.

Background

Mammography is an advanced method of scanning human breast tissue which makes use of low dose X-rays to produce images of the internal structure of the human breast. The screening of these images, called mammograms, aids early detection and diagnosis of breast abnormalities and diseases.

In order to obtain a more accurate scan, mammogram machines usually have two plates that compress the breast to spread the tissue apart and help radiologists examine the mammogram.

Assessment by human radiologists is believed to be the most accurate method of image evaluation, and refers to the task performed by a radiologist, or similar professional, of inspecting medical scans, section by section, in order to produce a comprehensive analysis. However, considering that a mammogram is a representation of three-dimensional information projected onto a two-dimensional image plane, there is often superimposition of tissues in the 2D medical scan images (mammograms) being inspected. As a result, tissues that appear superimposed within the image of the breast can reduce the visibility of malignant abnormalities or sometimes even simulate the appearance of an abnormality (false positive). This makes the task of analysing a mammogram more challenging and can cause difficulty when it comes to accurately and precisely detecting abnormalities. In some situations, only a single radiologist can review and diagnose the set of images produced from each set of mammogram image data. It is therefore possible that sometimes the single radiologist will not accurately diagnose a patient based on their review of mammogram image data.

While it is sometimes preferred to use two independent radiologists to review each patient’s mammogram image data independently, this is not always possible logistically or economically.

Summary of Invention

Aspects and/or embodiments seek to provide a method and system for suggesting whether one or two radiologists review one or more cases/sets of medical images, based on the use of computer-aided analysis (for example using deep learning) on each case/set of medical images.

According to a first aspect, there is provided a computer-aided method of analysing medical images, the method comprising the steps of: receiving one or more medical images; using one or more trained machine learning models to independently analyse said one or more medical images to determine one or more characteristics; generating output data based on the determined one or more characteristics; triggering at least two manual reviews (1108, 1109) of the one or more medical images (1100) when the determined characteristics indicate a positive malignancy classification of the one or more medical images.

Optionally, the method instructs the at least two manual reviews to be performed by two or more independent users. Optionally, the further analysis comprises further manual analysis. Optionally, the further analysis comprises further analysis by a computer aided diagnosis system.

Radiologists do not demonstrate consistent accuracy due to the manual nature of the task; for example, they may make errors due to superimposed breast tissues in the mammogram and/or details too fine for the human eye to detect. By comparing the manually determined one or more characteristics with computer-determined characteristics for the same data, the method can trigger a second manual review of the data where the computer-determined characteristics indicate that a malignancy is present, or where the computer-determined characteristics do not match the human assessment. The method can thus only ever make a single-radiologist approach safer: a second manual review is triggered for potentially positive cases, or where there is a significant mismatch between the user diagnosis and the computer-aided analysis of each set of medical images (where the computer-aided analysis indicates a negative case).
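The triggering logic described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function name `manual_reviews_required` and its boolean inputs are hypothetical, and the "significant mismatch" test is reduced here to a simple disagreement between two yes/no assessments.

```python
from typing import Optional


def manual_reviews_required(ai_malignant: bool,
                            reader_malignant: Optional[bool] = None) -> int:
    """Return how many manual reviews to request for one case.

    ai_malignant: computer-determined malignancy classification.
    reader_malignant: first reader's assessment, if already available.
    Both inputs are simplifying assumptions made for this sketch.
    """
    # A positive computer-determined classification triggers
    # at least two manual reviews.
    if ai_malignant:
        return 2
    # A mismatch between the first reader and a negative
    # computer-aided analysis also triggers a second review.
    if reader_malignant is not None and reader_malignant != ai_malignant:
        return 2
    # Otherwise a single manual review is suggested.
    return 1
```

Because the function can only raise the number of reviews from one to two, never lower it, it reflects the point made above that the approach can only make single reading safer.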

Optionally, the method is performed in substantially real-time. This can allow the second manual review to be triggered promptly, thus allowing the method to integrate with existing medical workflows more easily as it does not cause significant delay.

Optionally, the method can trigger or recommend one or more additional medical tests comprising any or any combination of: a computerised tomography (CT) scan; an ultrasound scan; a magnetic resonance imaging (MRI) scan; a tomosynthesis scan; and/or a biopsy.

A further medical test can be suggested based on the analysis of the preliminary screening. As an example, a more detailed tomosynthesis scan can be instantaneously recommended if the initial mammogram is unclear or features are superimposed or there might be a lesion worth investigating. In some cases, the analysis from the initial medical image may not require any further workup or medical tests. Optionally, the output data may also indicate a breast density or tissue classification type.

Optionally, the one or more medical images comprises one or more mammographic or X-ray scans.

In most medical screening programmes, X-ray or mammography is the first type of medical scan.

Optionally, the step of analysing and determining is performed using one or more trained machine learning models.

Trained machine learning models can analyse medical images far quicker than a human expert, and hence increase the number of medical images analysed overall. The accuracy is typically consistent when using a machine learning model. Thus, a problem, for example the growth of a cancerous tumour, can be detected more quickly than waiting for a human expert to become available and hence treatment may begin earlier, or an additional medical test may be requested sooner. The identification of regions of interest, which may include lesions, may therefore aid screening and clinical assessment of breast cancer among other medical issues. Earlier diagnosis and treatment can reduce psychological stress to a patient and also increase the chances of survival in the long term.

Optionally, the trained machine learning models comprise convolutional neural networks.

Convolutional networks are powerful tools inspired by biological neural processes, which can be trained to yield hierarchies of features and are particularly suited to image recognition. Convolutional layers apply a convolutional operation to an input and pass the results to a following layer. With training, convolutional networks can achieve expert-level accuracy or greater with regard to segmenting and localising anatomical and pathological regions in digital medical images such as mammograms.

Optionally, the step of analysing and determining comprises segmenting one or more anatomical regions. Optionally, the output data further comprises overlay data indicating a segmentation outline and/or a probability mask showing one or more locations of one or more segmented regions.

Providing a clear and accurate segmentation of regions can be very helpful when reviewing a medical image, such as a mammogram. This may be especially relevant if there is reason to suspect there is a medical issue with a patient, for example a swollen area which is larger than it was in previous scans. Such changes may be more easily detectable if the different regions are clearly segmented. In addition, the segmentation information can also be used to enrich the Picture Archiving Communication Systems (PACS) that radiology departments use in hospitals. The inclusion of this segmentation data on PACS advantageously improves future methods of flagging up similar cases, whether the methods are semi-automated, entirely automated or performed manually.

Optionally, the step of analysing and determining comprises identifying tissue type and density category. Optionally, the required type of the one or more additional medical tests is dependent upon the density category determined based on the one or more medical images. Optionally, this step may jointly estimate tissue type and density category.

Correctly classifying the tissue type and density category can enable the method to recommend an appropriate additional medical test or specific workup.

Optionally, the step of analysing and determining comprises automatically identifying one or more anomalous regions in the medical image.

Optionally, the step of analysing and determining comprises identifying and distinguishing between a malignant lesion and/or a benign lesion and/or typical lesion.

Optionally, the output data further comprises overlay data indicating a probability mask for the one or more lesions.

Optionally, the step of analysing and determining comprises identifying architectural distortion.

Optionally, the one or more medical images and the one or more additional medical images comprise the use of digital imaging and communications in medicine, DICOM, files.

As a DICOM file is conventionally used to store and share medical images, conforming to such a standard can allow for easier distribution and future analysis of the medical images and/or any overlays or other contributory data. The one or more binary masks may be stored as part of a DICOM image file, added to an image file, and/or otherwise stored and/or represented according to the DICOM standard or portion of the standard.

According to a further aspect, there is provided a system for analysing sets of medical images in substantially real-time, the system comprising: a medical imaging device; a storage device; a user terminal operable to input diagnosis metadata for each set of medical images; a processing unit operable to analyse one or more of each set of medical images on the storage device to determine one or more characteristics and determine a degree of similarity of the determined one or more characteristics and the input diagnosis metadata; and an output viewer operable to display a requirement for output data generated based on the determined one or more characteristics, wherein the output data is indicative of a requirement to obtain one or more additional medical images or trigger a further analysis of the set of medical images if the degree of similarity is below a predetermined threshold. Optionally, the storage device comprises a picture archiving communication system (or PACS) and/or a vendor neutral archive (or VNA).
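The threshold test described in this aspect can be illustrated with a short sketch. The text does not define how the degree of similarity is computed, so the Jaccard overlap between two sets of finding labels used here is purely an assumption, as are the function names, the label strings and the default threshold of 0.5.

```python
def degree_of_similarity(model_findings: set, reader_findings: set) -> float:
    """Jaccard overlap between two sets of finding labels.

    The choice of Jaccard overlap is an assumption for illustration;
    the source does not specify a similarity measure.
    """
    if not model_findings and not reader_findings:
        return 1.0  # neither side reports anything: treated as agreement
    union = model_findings | reader_findings
    return len(model_findings & reader_findings) / len(union)


def further_analysis_required(model_findings: set,
                              reader_findings: set,
                              threshold: float = 0.5) -> bool:
    """Flag the case when similarity falls below the (hypothetical) threshold."""
    return degree_of_similarity(model_findings, reader_findings) < threshold
```

For example, a model reporting a mass where the reader's diagnosis metadata reports nothing gives a similarity of zero, so the case would be flagged for further analysis.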

Such a system may be installed in or near hospitals, or connected to hospitals via a digital network, to reduce waiting times for medical images to be analysed. Patients may therefore be spared stress from not knowing the results of a medical scan and receive a decision more quickly.

Optionally, the processing unit is integrated with the medical imaging device.

In this way, the medical scanner can be coupled with a processing unit to analyse medical images as soon as they are scanned.

Optionally, the processing unit is located remotely and is accessible via a communications channel.

In this configuration, the processing unit can be deployed from a remote cloud system without need to replace and change existing scanning equipment.

According to a further aspect, there is provided a system operable to perform the method according to any other aspect.

According to a further aspect, there is provided a computer program operable to perform the method according to any other aspect.

Through the use of a computer or other digital technology, examination of medical images may be performed with greater accuracy, speed, and/or reliability than relying on a human expert. Therefore, a greater number of medical images may be reviewed at one time thereby reducing backlogs for experts and further reducing errors made when the medical images themselves are actually reviewed.

Brief Description of Drawings

Embodiments will now be described, by way of example only and with reference to the accompanying drawings having like-reference numerals, in which:

Figure 1 shows a flow diagram of an embodiment;

Figure 2 depicts a first deployment (for example, within a medical scanning device);

Figure 3 depicts a second deployment (for example, on the premises of a medical facility);

Figure 4 depicts a third deployment (for example, using a cloud system);

Figure 5 illustrates a method of an embodiment;

Figure 6 illustrates a flowchart showing an outline of the method of an embodiment;

Figure 7 illustrates the portion of the flowchart of Figure 6 focussed on providing a malignancy output based on the input image and the pre-trained malignancy detection neural network, optionally showing the pre-processing that can be applied to the input image;

Figure 8 illustrates the Mask-RCNN of the embodiment of Figure 6 in more detail;

Figure 9 illustrates the portion of the flowchart of Figure 6 showing the process of the mean and max operations performed by the embodiment;

Figure 10 illustrates how the final output of the embodiment of Figure 6 is determined;

Figure 11 illustrates a workflow method of an embodiment; and

Figure 12 illustrates a modified version of the workflow illustrated in Figure 11.

Specific Description

Referring to Figures 1 to 4, an embodiment will now be described in more detail.

Referring firstly to Figure 1, a medical imaging process 100 is shown. Having performed a medical scan of a patient (such as a mammography) using a medical imaging scanner 101, the scanned images are collated in DICOM format, which is a file format commonly used to store medical images, and stored on a Picture Archiving Communication Systems (PACS) database 102 that radiology departments use in hospitals. The method 100 can use pre-processed data that is stored on the PACS database 102. The output of this method can also enrich the PACS database 102 to improve future applications of analysing mammographic images. Image data is extracted from the DICOM file and an image is generated.

The image then undergoes a pre-processing stage 103. The image is loaded onto a 4D tensor of size [1, width, height, 1]. The pre-processing stage may also perform windowing on the image data to a predetermined windowing level. The windowing level defines the range of bit values considered in the image. Medical images are conventionally 16-bit images, wherein each pixel is represented as a 16-bit integer ranging from 0 to 2^16 - 1, i.e. [0, 1, 2, ..., 65535]. The information content is very high in these images, and generally comprises more information than what the human eye is capable of detecting. A set value for the windowing level is typically included within the DICOM file.
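The windowing and tensor-loading steps described above might be sketched as follows. This is an illustrative sketch only: it uses a simplified linear window defined by a centre and a width (not the exact DICOM windowing formula), it is written with nested Python lists so that it runs without third-party packages (an array library would normally be used), and the names `apply_window` and `to_4d_tensor` are hypothetical.

```python
def apply_window(pixels, center, width, out_max=255):
    """Clip pixel values to a window and rescale them to [0, out_max].

    pixels: 2D list of integer pixel values (for example 16-bit, 0..65535).
    center, width: window centre and width (simplified linear window).
    """
    if width <= 0:
        raise ValueError("window width must be positive")
    low = center - width / 2
    high = center + width / 2
    windowed = []
    for row in pixels:
        new_row = []
        for value in row:
            clipped = min(max(value, low), high)  # discard values outside the window
            new_row.append((clipped - low) / (high - low) * out_max)
        windowed.append(new_row)
    return windowed


def to_4d_tensor(image):
    """Wrap a 2D image as nested lists of shape [1, d1, d2, 1].

    The leading 1 is the batch dimension and the trailing 1 is the
    single (greyscale) channel.
    """
    return [[[[value] for value in row] for row in image]]
```

A call such as `to_4d_tensor(apply_window(image, center, width))` would then produce a single-image, single-channel batch, with the centre and width taken from the values stored in the DICOM file where available.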

In some cases, it can be important to maintain image resolution. Often, conventional graphics processing unit (GPU) constraints require that the image be divided into a plurality of patches in order to maintain resolution. Each patch can then be provided to a Fully Convolutional Network (FCN) 104. The larger the patch, the more context that can be provided, but some precision may be lost. For example, in the case of a large image comprising a small tumour, if the FCN 104 is instructed that somewhere in this patch there is a tumour, the network would need to learn how to find it first before it can be classified. In this embodiment patch sizes of 300x300 pixels are used, although larger and smaller patch sizes may be used.
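As an illustration of dividing an image into patches, the sketch below computes the bounding boxes of a simple tiling. It assumes non-overlapping patches and smaller patches at the right and bottom edges; the text does not say whether patches overlap or how edges are handled, and the name `patch_boxes` is hypothetical.

```python
def patch_boxes(width, height, patch=300):
    """Return (x, y, w, h) boxes that tile a width-by-height image.

    Patches are non-overlapping (an assumption); patches on the right
    and bottom edges are cropped to stay inside the image.
    """
    if patch <= 0:
        raise ValueError("patch size must be positive")
    boxes = []
    for y in range(0, height, patch):
        for x in range(0, width, patch):
            boxes.append((x, y, min(patch, width - x), min(patch, height - y)))
    return boxes
```

For an image of 3500x2500 pixels and the 300x300 patch size of this embodiment, the tiling gives 12 columns and 9 rows of patches, each of which could be passed to the FCN 104 in turn.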

A rescaling step may be included in the pre-processing stage 103 owing to the above-mentioned constraints of conventional hardware. Medical images are typically in the region of approximately 3500x2500 pixels. An FCN 104 applied to an image of this size does not fit in conventional graphics processing unit (GPU) memory. The image can be rescaled to a larger or smaller size during the pre-processing stage 103, or even not rescaled at all. Keeping a larger size would allow the FCN 104 to see a higher resolution, so that it may pick up finer detail. However, such an image is unlikely to fit in GPU memory, and could cause the method to become considerably slower. By rescaling the image to a smaller size during pre-processing 103, it is more likely to be able to fit in a GPU memory, and allow the FCN 104 processes to run at a faster speed. The FCN 104 may also generalise better owing to a smaller number of input parameters.

The method 100 may be used to identify and detect lesions in the mammograms as part of the output 105 of the FCN 104. The lesions which may be segmented may comprise one or more cancerous growths, masses, abscesses, lacerations, calcifications, and/or other irregularities within biological tissue.

The images are analysed by feeding them through a trained machine learning model 104, such as a Convolutional Neural Network. This embodiment utilises deep learning techniques to train and develop the convolutional network. The model is trained on a dataset with known workups and, hence, directly establishes a relationship between the images received and the known workups to estimate a required workup. In particular, the output 105 of the machine learning model is a binary vector, where the indices represent various types of workup. For example, the workups may be any, or any combination of: no further action needed, an ultrasound scan, a tomosynthesis scan, an MRI scan and/or a biopsy.

The dataset used for training the neural networks may also contain known density or tissue types. In that case, a multi-task learning approach can be taken to have the model also output density (A, B, C, D) or tissue type (1, 2, 3, 4, 5).
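The binary output vector described above can be decoded into workup names as in the following sketch. The order of the indices is an assumption made for illustration (the text lists the workup types but does not fix which index corresponds to which), and `WORKUPS` and `decode_workups` are hypothetical names.

```python
# Assumed index order; the source does not specify it.
WORKUPS = (
    "no further action",
    "ultrasound scan",
    "tomosynthesis scan",
    "MRI scan",
    "biopsy",
)


def decode_workups(binary_vector):
    """Map a binary output vector to the list of suggested workups."""
    if len(binary_vector) != len(WORKUPS):
        raise ValueError("expected one entry per workup type")
    return [name for name, flag in zip(WORKUPS, binary_vector) if flag]
```

With this ordering, a vector of `[0, 1, 0, 0, 1]` would be read as a suggestion for an ultrasound scan and a biopsy.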

There are different types of patterns in breast tissue that affect the detectability of breast cancers. Thus, it is important to know what kind of pattern is present. There are five mammography parenchymal patterns known as “Tabar patterns”, named after Professor Laszlo Tabar who developed this classification.

The Tabar patterns (or classification types) are based on a histologic-mammographic correlation with a three-dimensional, sub-gross (thick-slice) technique, and on the relative proportion of four “building blocks” (nodular densities, linear densities, homogeneous fibrous tissue, radiolucent fat tissue). The five classifications are as follows:

1. Balanced proportion of all components of breast tissue with a slight predominance of fibrous tissue

2. Predominance of fat tissue

3. Predominance of fat tissue with retro-areolar residual fibrous tissue

4. Predominantly nodular densities

5. Predominantly fibrous tissue (dense breast)

Classes 4 and 5 are considered high risk, meaning that it is difficult to detect cancers in the breast with those patterns, whereas classes 1, 2 and 3 are considered lower risk as it is easier to spot cancerous regions. Some therapies may alter the pattern by increasing parenchymal density, as in hormone replacement therapy (HRT), or reducing it, as in therapies with selective oestrogen-receptor modulators (SERM).

Similarly, breast density categories are classified by radiologists using the BI-RADS system. Again, this classification is used for quality control purposes. For example, it is very difficult to spot an anomaly in dense breasts. There are four categories in the BI-RADS system:

A. The breasts are almost entirely fatty

B. There are scattered areas of fibro-glandular density

C. The breasts are heterogeneously dense, which may obscure small masses

D. The breasts are extremely dense, which lowers the sensitivity of mammography
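The two classification schemes listed above can be represented as simple lookup tables, for example when labelling the density or tissue type output of a model. The sketch below is illustrative only: the dictionary and function names are hypothetical and the label strings are abbreviated from the lists above. The high-risk set follows the statement that Tabar classes 4 and 5 are considered high risk.

```python
# Tabar parenchymal patterns (labels abbreviated from the list above).
TABAR_PATTERNS = {
    1: "balanced proportion of all components, slight predominance of fibrous tissue",
    2: "predominance of fat tissue",
    3: "predominance of fat tissue with retro-areolar residual fibrous tissue",
    4: "predominantly nodular densities",
    5: "predominantly fibrous tissue (dense breast)",
}

# BI-RADS breast density categories (labels abbreviated from the list above).
BIRADS_DENSITY = {
    "A": "almost entirely fatty",
    "B": "scattered areas of fibro-glandular density",
    "C": "heterogeneously dense",
    "D": "extremely dense",
}

# Classes in which cancers are described as difficult to detect.
HIGH_RISK_TABAR = frozenset({4, 5})


def is_high_risk_pattern(tabar_class: int) -> bool:
    """Return True for Tabar classes described as high risk (4 and 5)."""
    if tabar_class not in TABAR_PATTERNS:
        raise ValueError("Tabar class must be between 1 and 5")
    return tabar_class in HIGH_RISK_TABAR
```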

Importantly, breast densities and tissue patterns are also known to have a mutual correlation to breast cancer development.

In some cases, the method can produce two types of output data 105. Whilst output data 105 can relate to a suggested workup or additional medical tests 105a, the output data 105 may also indicate the density or tissue classification 105b. The output data 105 can indicate a binary output as to the requirement for further tests. Optionally, the output data 105 can include data relating to how the binary output was reached, including any of: Tabar pattern; tissue classification types; breast density; nodular densities; linear densities; homogeneous fibrous tissue; radiolucent fat tissue; BI-RADS category; a measure of superimposed features within the images; probability and/or confidence rating.

Mammography is a medical imaging modality widely used for breast cancer detection. Mammography makes use of “soft” X-rays to produce detailed images of the internal structure of the human breast. These images are called mammograms, and this method is considered to be the gold standard in the early detection of breast abnormalities, providing a valid diagnosis of a cancer in a curable phase.

Unfortunately, the procedure of analysing mammograms is often challenging. The density and tissue type of the breasts are highly varied and in turn present a high variety of visual features due to patient genetics. These background visual patterns can obscure the often-tiny signs of malignancies which may then be easily overlooked by the human eye. Thus, the analyses of mammograms often lead to false-positive or false-negative diagnostic results which may cause missed treatment (in the case of false negatives) as well as unwanted psychological and sub-optimal downstream diagnostic and treatment consequences (in the case of false positives).

Referring now to Figure 2, the assessment 200 of image data obtained from a medical imaging scanner 101 and stored in a (PACS) database 102 will now be described in more detail. Most developed countries maintain a population-wide screening program, comprising a comprehensive system for “calling in” women of a certain age group (even if free of symptoms) to have regular breast screening. These screening programs require highly standardized protocols to be followed by experienced specialist trained doctors who can reliably analyse a large number of mammograms routinely. Most professional guidelines strongly suggest reading of each mammogram by two equally expert radiologists (also referred to as double-reading). Nowadays, when the number of available radiologists is insufficient and decreasing, the double-reading requirement is often impractical or impossible.

The imaging data from a mammogram performed using the medical imaging apparatus 101 is retrieved from the PACS database 102 and displayed on a user terminal 202. When analysing mammograms using the image displayed on the user terminal 202, which can be manipulated by the user to, for example, change contrast, brightness and/or magnify portions of the image, the reliable identification of anatomical structures is important for visual evaluation and especially for analytic assessment of visual features based on their anatomic location and their relation to anatomic structures, which may have profound implications on the final diagnostic results. In the case that anatomic structures appear distorted they may also indicate the presence of possible malignancies.

Conventional X-ray is a medical imaging modality widely used for the detection of structural abnormalities related to the air containing structures and bones, as well as those diseases which have an impact on them. Conventional X-ray is the most widely used imaging method and makes use of “hard” X-rays to produce detailed images of the internal structure of the lungs and the skeleton. These images are called roentgenograms or simply X-rays.

Unfortunately, the procedure of analysing X-rays is often challenging, especially when analysing lung X-rays in order to detect infectious disease (e.g. TB) or lung cancer at an early stage.

Cross-sectional medical imaging modalities are widely used for detection of structural or functional abnormalities and diseases which have a visually identifiable structural impact on the human internal organs. Generally, the images demonstrate the internal structures in multiple cross-sections of the body. The essence of the most widely used cross-sectional techniques is described below.

Computed tomography (CT) is a widely used imaging method which makes use of “hard” X-rays produced and detected by a specially rotating instrument; the resulting attenuation data (also referred to as raw data) are processed by computed analytic software to produce detailed images of the internal structure of the internal organs. The produced sets of images are called CT-scans, which may constitute multiple series with different settings and different contrast agent phases to present the internal anatomical structures in cross sections perpendicular to the axis of the human body (or synthesized sections in other angles).

Magnetic Resonance Imaging (MRI) is an advanced diagnostic technique which makes use of the effect that a magnetic field has on the movements of protons, which are the tiniest essential elements of every living tissue. In MRI machines the detectors are antennas, and the signals are analysed by a computer creating detailed images of the internal structures in any section of the human body. MRI can add useful functional information based on the signal intensity generated by the moving protons.

However, the procedure of analysing any kind of cross-sectional images is often challenging, especially in the case of oncologic disease, as the initial signs are often hidden and the appearance of the affected areas differs only minimally from normal. When analysing cross sectional scans, diagnosis is based on visual evaluation of anatomical structures. The reliable assessment, especially for analytic assessment, of visual appearance based on their anatomic location and their relation to anatomic structures, may have profound implications on final diagnostic results. In the case that anatomic structures appear distorted they may also indicate the presence of possible malignancies.

Generally, in the case of all diagnostic radiology methods (which include mammography, conventional X-ray, CT, MRI), the identification, localisation (registration), segmentation and classification of abnormalities and/or findings are important interlinked steps in the diagnostic workflow.

In the case of ordinary diagnostic workflows carried out by human radiologists, these steps may only be partially or sub-consciously performed but in the case of computer-based or computer-aided diagnoses and analyses the steps often need to be performed in a clear, concrete, descriptive and accurate manner.

Locality and classification may define and significantly influence diagnoses. Both locality and classification may be informed by segmentation in terms of the exact shape and extent of visual features (i.e. size and location of boundaries, distance from and relation to other features and/or anatomy). Segmentation may also provide important information regarding the change in status of disease (e.g., progression or recession).

To assist the user of the user terminal 202, a model 201 that has previously been trained on pre-labelled training data can be used to determine whether any of the images contain portions indicative of interest or concern to radiologists and the model can indicate these regions on the images retrieved from or stored in the PACS database 102. In Figure 2, the model 201 is able to be run on the images as they are acquired by the medical imaging device 101 and the model 201 output is stored as a layer in the PACS database 102 so that it can be shown or not shown to the user of the user terminal 202 or used to perform preliminary automated assessment of the images in parallel or prior to the user of the user terminal 202.

In other embodiments, the model 201 can be provided elsewhere. For example, in Figure 3, which shows a local server-based model example 300, the model 201 is provided on a local network and can read from and write to the PACS database 102. In another example, in Figure 4 which shows a cloud server-based model example 400, the model 201 is provided at a remote server or distributed computer system (a.k.a. the “cloud”) 401 and interfaces with the PACS database 102 to read from and write the output of the model 201 to the PACS database 102.

Referring to Figure 5, there is shown a second reader suggestion method 500 according to an embodiment which will now be described in more detail.

Mammography image data 510 is obtained for each patient and assessed by a radiologist as per standard clinical procedures. Once the assessment/diagnosis 530 has been completed by the radiologist, or in parallel, the mammography image data 510 is input into a model 520. The model 520 is arranged according to one of the embodiments described in this specification, for example according to the models 201 in the embodiments described in relation to Figures 1 to 4 or the embodiments described in accordance with Figures 6 to 10. The model 520 outputs an assessment of the input image data 510, for example highlighting portions of the image data 510 indicative of interest or concern to radiologists. The radiologist assessment 530 and the output of the model 520 are then compared 540 to determine if they do or do not overlap/agree. If there is not agreement between radiologist assessment 530 and the output of the model 520 then the output 550 triggers that a second reader is suggested 560, i.e., a second independent radiologist reviews the image data 510 and performs a second independent diagnosis. If the radiologist assessment 530 and the output of the model 520 agree, or overlap, then no further action needs to be taken 570.

The model 520 can be a machine learning (ML) model or system, for example a convolutional neural network.

The radiologist assessment 530 and the output of the model 520 can be determined to agree, or overlap, based on a threshold of similarity.
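As a minimal sketch of such a comparison, assume (this is not stated in the specification) that both the radiologist assessment 530 and the output of the model 520 are available as binary masks of the same shape marking regions of concern, and that similarity is measured as intersection over union. The function names and the default threshold of 0.5 are hypothetical.

```python
import numpy as np

def region_overlap(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection over union of two binary masks (assumed similarity measure)."""
    a = mask_a.astype(bool)
    b = mask_b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Neither assessment marked anything: treated here as full agreement.
        return 1.0
    return float(np.logical_and(a, b).sum() / union)

def suggest_second_reader(radiologist_mask, model_mask, similarity_threshold=0.5):
    """Return True when the two assessments fall below the similarity threshold."""
    return region_overlap(radiologist_mask, model_mask) < similarity_threshold
```

Under these assumptions, a return value of True corresponds to suggesting a second reader 560 and False to no further action 570.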

Alternatively, or in addition, this embodiment can have other information, such as the age of the patient, input into the model 520, with the model 520 configured to take this other information into account.

Another alternative is that, instead of a second independent radiologist being suggested to perform a second independent diagnosis, either the original radiologist can be alerted and it can be suggested that the original radiologist performs a second review, or a computer-aided diagnosis is performed on the image data 510.

Figure 6 depicts an example embodiment which will now be described in more detail below with reference to Figures 7 to 10 as appropriate.

Referring to Figure 6, there is shown a method 600 for receiving input mammography images 10 and outputting a malignancy output, for example a yes/no binary output or a more detailed output showing regions of interest along with a binary output. In a medical scan of a patient (mammography), the scanned images 10 are collated in DICOM format, which is a file format commonly used to store medical images. The method can also use pre-processed data that is stored on a Picture Archiving and Communication System (PACS) of the kind that radiology departments use in hospitals. The output of this method can also enrich the PACS database to improve future applications of analysing mammographic images.

In some instances, the images 10 can be pre-processed 20 using a variety of methods, including but not restricted to windowing, re-sampling and normalisation. The input images 10 may also undergo domain adaptation and/or style transfer techniques to further improve the results as part of the pre-processing 20.
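The three named pre-processing steps can be illustrated with a short sketch. It assumes a 2-D image held as a NumPy array, nearest-neighbour re-sampling and min-max normalisation; the specification does not state which variants are used, and all function names and parameters are hypothetical.

```python
import numpy as np

def window(image, low, high):
    """Windowing: clip intensities to the range [low, high]."""
    return np.clip(image, low, high)

def resample_nearest(image, out_shape):
    """Re-sampling: nearest-neighbour resize of a 2-D image to out_shape."""
    rows = (np.arange(out_shape[0]) * image.shape[0] / out_shape[0]).astype(int)
    cols = (np.arange(out_shape[1]) * image.shape[1] / out_shape[1]).astype(int)
    return image[np.ix_(rows, cols)]

def normalise(image):
    """Normalisation: rescale to [0, 1]; a constant image maps to all zeros."""
    image = image.astype(np.float64)
    span = image.max() - image.min()
    if span == 0:
        return np.zeros_like(image)
    return (image - image.min()) / span

def preprocess(image, low, high, out_shape):
    """Apply windowing, then re-sampling, then normalisation (assumed order)."""
    return normalise(resample_nearest(window(image, low, high), out_shape))
```

The order of the steps is itself an assumption; an implementation could equally normalise before re-sampling.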

The mammograms, pre-processed or not, are then fed into a convolutional neural network (CNN) classifier 30 which has been trained to analyse the images 10 and assess whether the image shows a malignant lesion. In some embodiments, there is use of more than one trained CNN as part of the classification step 30 to complete this task. Conventional methods of detecting malignant lesions in a mammogram may also be used. Alternatively, other machine learning implementations may be used in place of a convolutional neural network, for example a machine learning implementation that can perform feature extraction and classification either as a combined process or separate processes.

In order for a CNN to operate as a malignancy model the network 30 first needs to be trained. Similar to the pre-processing methods 20 mentioned above, input images for the purpose of training the network may undergo windowing, resampling, normalisation, etc., before the images are used. In some instances, the images used to train the network are either provided or sized to up to 4000 x 4000 pixels.

As the images are fed through the CNN, a number of stacked mathematical operations are performed. In doing so, the CNN applies variable tensors to the previous layer such that a “malignant or not” score is produced as a result of these operations. The variables are then updated based on the gradient of the cost function (cross-entropy), making use of the chain rule to work out the gradient updates to apply. In this way, multiple CNNs can be trained to be used with the described aspects/embodiments.

The output of the classifier 30 is a convolution layer output 30X and four types of mammography images 30Y,50. The convolutional layer output 30X is used by a Mask RCNN 40 to produce a Mask RCNN output 40X while the mammography images 30Y,50 are input into the malignancy model 60 which outputs a malignancy score 60Y. The Mask RCNN output 40X and malignancy score 60Y are used by a gating operation 70 to determine the output of the process 600.

Figure 7 shows the process of using previous image data 700 in an alternative embodiment to that of Figure 6. In this embodiment, the training of the CNNs may additionally include concatenating, as part of the pre-processing 20, a previous image(s) 10a taken of the same mammographic view as image(s) 10 and processing it using the classifier networks 30 together with the current image 10 being fed into the network 30. As with the other embodiment, the output of the classifier 30 is a convolution layer output 30X and four types of mammography images 30Y,50. This enables the fine tuning of the final few layers of the CNN 30 such that they can account for multiple images 10, 10a.

Referring now to Figure 8, an example Mask RCNN 40 is shown in more detail. Once the malignancy model(s) are trained, the network and its weights are frozen. One or more of the convolutional layer’s 20 outputs 30X is then taken and is fed into mask heads from a Mask RCNN 40. The Mask RCNN 40 includes a bounding box predictor 41, where the bounding boxes can be used to cut out a part of the original image. In addition to, or on top of, the cut-out patch, a malignant classifier 42 and segmentation 43 heads are also included in the Mask RCNN 40. As with the malignancy model 600, any conventional bounding box, malignancy classifier or segmentation models can be used with this system. He, Kaiming, et al., in “Mask R-CNN”, Computer Vision (ICCV), 2017 IEEE International Conference on, IEEE, 2017, describe a traditional RCNN that can be used in at least some embodiments, which is incorporated by reference.

There are various methods of training the RCNN(s) 40. Firstly, by connecting the malignancy model 600 to the Mask RCNN 40, the Mask RCNN heads 41, 42, 43 can be trained at the same time as the whole image malignancy model 600. Secondly, it is also possible to train the Mask RCNN 40 without freezing the malignancy model network 600. Finally, the Mask RCNN heads 41, 42, 43 may be trained with multiple malignancy models 600. Thus, the method of training the Mask RCNN heads 41, 42, 43 is not restricted to a certain type, which enables the approach to be tailored for specific uses.

Once the neural networks are trained, during use, or at inference time, the malignancy model 600 is frozen based on the training data.

Referring now to Figure 9, determining malignancy 900 and the malignancy model 60 will now be described in more detail. As an example, during run time, the system of the embodiment receives four types of mammography images: left cranial caudal view (L-CC) 51, right cranial caudal view (R-CC) 53, left medio-lateral-oblique (L-MLO) 52 and right medio-lateral-oblique (R-MLO) 54. This combination of images is referred to as a “case”. Upon passing through the malignancy model or models 60, the system of the embodiment produces an entire case of outputs 51, 52, 53, 54. These outputs are then averaged to generate a single output 60Y.

The L-CC 51 represents an average score of all left cranial caudal views, the L-MLO 52 represents an average score of all left medio-lateral-oblique views, the R-CC 53 represents an average score of all right cranial caudal views and the R-MLO 54 represents an average score of all right medio-lateral-oblique views. The system of the embodiment then calculates a mean of the respective left side views 61 and right side views 62. This results in a malignancy output for each side 61a, 62a. A max operation 63 is then performed for the average malignancy outputs for each side.

Although not depicted in the Figures, in the described embodiment the method then thresholds this result with a predetermined threshold which gives a binary “malignant or not” score 60Y.
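The mean-then-max aggregation and the final thresholding can be sketched as follows. The sketch assumes each of the four view scores is a probability-like value in [0, 1]; the function name, the default threshold of 0.5 and the use of "greater than or equal to" are illustrative assumptions, as the specification only states that the threshold is predetermined.

```python
from statistics import mean

def case_malignancy(l_cc, l_mlo, r_cc, r_mlo, threshold=0.5):
    """Combine four per-view scores into a case score and a binary decision."""
    left = mean([l_cc, l_mlo])     # malignancy output for the left side (61a)
    right = mean([r_cc, r_mlo])    # malignancy output for the right side (62a)
    case_score = max(left, right)  # max operation 63 over the two sides
    return case_score, case_score >= threshold
```

Taking the maximum over the two sides means that a single suspicious breast is enough to flag the case, while averaging within a side damps a single outlying view.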

Finally, with reference to Figure 10, which shows the gating process 1000, the score 60Y is used to gate 70 whether or not to show the Mask RCNN segmentations or bounding boxes 40X. In this way, instead of showing absolutely all lesions detected by the Mask RCNN 40 alone, which leads to numerous false-positives, the resulting Mask R-CNN outputs are only shown 70X if the binary malignant score is positive, i.e., indicating malignancy. When the single output/score 60Y does not indicate the case to be malignant, the Mask RCNN outputs 40X are ignored and no localisation data is produced 70Y as an output of the system.
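The gating operation 70 amounts to a simple conditional, sketched below. The function name and the representation of the Mask RCNN outputs 40X as a list are hypothetical.

```python
def gate_localisation(case_is_malignant, mask_rcnn_outputs):
    """Pass Mask RCNN outputs through only when the case-level score is positive."""
    if case_is_malignant:
        return list(mask_rcnn_outputs)  # localisation data shown (70X)
    return []                           # outputs ignored, nothing shown (70Y)
```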

In some cases, the Mask RCNN results 40X can be ensembled by interpolating between bounding box coordinates (of shape [N, M, x1, x2, y1, y2] where N represents the number of models and M the maximum number of bounding boxes) which have a sufficient intersection over union (IOU), which is predetermined. Any bounding box which does not have a sufficient IOU with the others is removed from consideration. With the resulting bounding boxes, the raw segmentation masks are then averaged before thresholding with a predetermined threshold, and the lesion scores for all of the sufficient bounding boxes are also averaged.

These operations result in a final set of bounding boxes of shape [1, M, x1, x2, y1, y2] along with a segmentation mask of shape [1, H, W] and lesion scores of shape [1, M]. A better way is to use weighted box clustering (WBC), which is described by Paul F. Jaeger et al. in “Retina U-Net: Embarrassingly Simple Exploitation of Segmentation Supervision for Medical Object Detection” (https://arxiv.org/pdf/1811.08661.pdf), which is incorporated by reference.
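A simplified sketch of the IOU-based ensembling follows. It assumes that the boxes passed in are candidate detections of the same lesion from different models, that boxes use the (x1, x2, y1, y2) coordinate order given above, and that "interpolating" is a plain average; the function names and the default IOU of 0.5 are hypothetical. Mask averaging and weighted box clustering are not shown.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, x2, y1, y2)."""
    ax1, ax2, ay1, ay2 = box_a
    bx1, bx2, by1, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def ensemble_boxes(boxes, scores, min_iou=0.5):
    """Average the boxes and lesion scores that overlap enough with another box."""
    keep = [
        i for i, box in enumerate(boxes)
        if any(iou(box, other) >= min_iou
               for j, other in enumerate(boxes) if j != i)
    ]
    if not keep:
        return None, None  # no box has sufficient IOU with any other
    merged_box = np.mean([boxes[i] for i in keep], axis=0)
    merged_score = float(np.mean([scores[i] for i in keep]))
    return merged_box, merged_score
```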

Referring to Figure 11, there is shown a decision for double reader arbitration workflow 1100 according to an embodiment which will now be described in more detail.

As with the embodiments mentioned above, or as with any existing methods, medical scans of patients can be obtained in one or more modalities 1100 when a scan, such as a mammogram, is performed.

In a medical scan of a patient (mammography), the scanned images are collated in DICOM format, which is a file format commonly used to store medical images. The method sometimes uses pre-processed data that is stored on a Picture Archiving and Communication System (PACS) of the kind that radiology departments use in hospitals. Alternatively, the medical images can be stored on a Vendor Neutral Archive (VNA), or any other local or remote (cloud) server either alone or in combination with other servers with medical image data. As above, the output of this method also enriches the PACS/VNA databases 1101 to improve future applications of analysing mammographic images.

In some instances, similar to the method of the previous embodiments, the images can be pre-processed 1102 using a variety of methods, including but not restricted to windowing, re-sampling, and normalisation. The input images may also undergo domain adaptation and/or style transfer techniques, either to increase the amount of data to assess or to convert the data to a format more compatible with the trained models to be used to assess the data.

The mammograms, pre-processed or not, are then fed into a convolutional neural network (CNN) 1103 which has been trained to analyse the images and assess whether the image shows a malignant lesion. This may include determining a number of characteristics relating to the medical images which are used to determine the classification. In some embodiments, there is use of more than one trained CNN to complete this task. Conventional methods of detecting malignant lesions in a mammogram may also be used.

Alternatively, other machine learning implementations may be used in place of a convolutional neural network 1103. In some embodiments, the machine learning implementation used will include a feature extraction module 1104 and a classifier 1105, either separate or combined, to assist in determining the aforementioned characteristics (as mentioned in the preceding embodiments, this may include bounding boxes, masks, etc.). Any known feature extraction methods or classifiers may be used with this method. The output of the CNN 1103 is a determination of whether the medical image scans indicate a positive (i.e. one or more malignancies are present in the images) or negative (i.e. no malignancies are present in the images) 1106 malignancy classification. If the initial assessment of the medical image is not negative, then no action 1107 is taken by the method and a traditional human double reader program or workflow is implemented.

A traditional human (manual) double reader workflow requires at least two radiologists (1108 and 1109) to provide an assessment/diagnosis on the medical images. Generally speaking, a double reader workflow is implemented by two separate radiologists performing independent assessments. Once each radiologist performs their analyses on the medical images, the two assessments/diagnoses will be compared 1110 to one another and if both radiologists agree on a diagnosis (i.e. a positive or negative malignancy classification) then this is accepted.

As mentioned above, double reading is the “gold standard” in breast cancer screening with mammography. In this scenario, two radiologists will assess a case. Arbitration 1111 will occur when the two readers are not in agreement over a malignancy assessment or about whether to recall a patient for further screening tests. Arbitration 1111 can involve a third radiologist performing an independent review of the data, or can involve the two radiologists meeting to review the data together to reach a reconciled conclusion. If (or once) the two radiologists reach matching assessments, then the patient will be called back 1112 if there is a matching positive malignancy classification, or will not be called back 1113 if there is a matching negative malignancy classification.

On the other hand, moving back to the initial assessment 1106, if it is determined that there is a negative classification, then the results of the initial automatic assessment (without assistance from a human radiologist) will be carried forward by the method and treated as the results of an independent reader 1108.

In a method similar to traditional double reader workflows, this is then supplemented by an independent assessment carried out by a radiologist 1114. Should the assessment of the human reader 1114 and the initial assessment correlate to each other, then the method will not call back the patient 1117. However, if the two assessments do not agree 1115, then the method will suggest going to arbitration 1116.

Alternatively, as depicted in Figure 12, which shows a double reader suggestion workflow 1200, if the assessments do not overlap or agree with each other, for example according to a predetermined threshold 1115a, the scans for the patient will be referred to a second independent radiologist to perform a second human reading 1118. This effectively provides a second human reading of the patient scans.

In the embodiments illustrated in Figures 11 and 12, the method forces a second human assessment of the medical images when the automatic initial assessment of the medical images makes a positive malignancy classification. In this way, the method efficiently decides which medical images need a double reading performed by a human (radiologist) and which do not. This frees up precious time and resources for medical professionals and institutions, as performing manual assessments twice for each patient is an arduous and expensive task.
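The branching described for Figures 11 and 12 can be summarised in a short sketch. It assumes each assessment is reduced to a boolean (positive or negative malignancy classification); the enum, the function name and the `figure_12_variant` flag are hypothetical and serve only to contrast the two workflows.

```python
from enum import Enum

class Outcome(Enum):
    RECALL = "call back patient"
    NO_RECALL = "do not call back patient"
    ARBITRATION = "go to arbitration"
    SECOND_HUMAN_READ = "refer to second human reader"

def double_reader_decision(model_positive, reader_1_positive,
                           reader_2_positive=None, figure_12_variant=False):
    """Route a case according to the workflows of Figures 11 and 12."""
    if model_positive:
        # Initial assessment not negative: traditional human double reading.
        if reader_2_positive is None:
            raise ValueError("two human reads are required when the model is positive")
        if reader_1_positive != reader_2_positive:
            return Outcome.ARBITRATION
        return Outcome.RECALL if reader_1_positive else Outcome.NO_RECALL
    # Initial assessment negative: the model stands in for one reader.
    if not reader_1_positive:
        return Outcome.NO_RECALL  # human reader and model agree
    # Human reader disagrees with the model's negative classification.
    return Outcome.SECOND_HUMAN_READ if figure_12_variant else Outcome.ARBITRATION
```

For example, `double_reader_decision(False, False)` yields `Outcome.NO_RECALL` with only one human read, which is the case where the method saves a second radiologist's time.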

In the present embodiment, the described system is able to operate as an independent second reader so can assess whether a first radiologist diagnosis has identified all detected possible irregularities, abnormalities and/or malignant features in a set of medical images of a patient when provided with the diagnosis of the first radiologist and optionally some further information about each patient such as age (among other data). In the past, computer aided diagnosis systems were not able to act as such due to a high false positive rate. Similar to a human radiologist, the described system of the embodiment can have a low false positive rate which means it can be used in at least the following ways:

(1) As a truly independent second reader: a first (human) radiologist looks at the case and the present system independently assesses the case. If the two disagree, the system of the embodiment shows the outlines for lesions of interest for the human radiologist to consider but requires a second human reader to review the data; if they agree, the radiologist does not see the outputs of the system, nor does a second reader need to be involved; or

(2) As a non-independent second reader, where the human radiologist and the system of the embodiment both analyse the case, in that the human radiologist is supported by the system of the embodiment. The radiologist can click to see the results generated by the system of the embodiment whenever they want.

(3) As a verification tool once a first radiologist has performed a manual review and diagnosis of a set of images for a patient, provided that the tool is provided with both the set of images and the diagnosis information from the radiologist. If the diagnosis diverges from what the tool would expect a radiologist to diagnose in the set of images (and optionally based on the further data too, such as for example the age of the patient), then the tool can suggest that a second radiologist performs an independent review of the set of images and makes a second diagnosis, or that a second radiologist performs an independent review if the tool makes a positive malignancy classification/output.

Many approaches that mimic the techniques used by human radiologists can be incorporated in the system in some embodiments, such as using a previous image as a reference to look for any changes since the last scan and also a mean then max operator to mimic the way human radiologists trade off calling back a case.

Machine learning is the field of study where a computer or computers learn to perform classes of tasks using the feedback generated from the experience or data gathered that the machine learning process acquires during computer performance of those tasks. Typically, machine learning can be broadly classed as supervised and unsupervised approaches, although there are particular approaches such as reinforcement learning and semi-supervised learning which have special rules, techniques and/or approaches. Supervised machine learning is concerned with a computer learning one or more rules or functions to map between example inputs and desired outputs as predetermined by an operator or programmer, usually where a data set containing the inputs is labelled.

Unsupervised learning is concerned with determining a structure for input data, for example when performing pattern recognition, and typically uses unlabelled data sets. Reinforcement learning is concerned with enabling a computer or computers to interact with a dynamic environment, for example when playing a game or driving a vehicle. Various hybrids of these categories are possible, such as "semi-supervised" machine learning where a training data set has only been partially labelled. For unsupervised machine learning, there is a range of possible applications such as, for example, the application of computer vision techniques to image processing or video enhancement. Unsupervised machine learning is typically applied to solve problems where an unknown data structure might be present in the data. As the data is unlabelled, the machine learning process is required to operate to identify implicit relationships between the data, for example by deriving a clustering metric based on internally derived information. For example, an unsupervised learning technique can be used to reduce the dimensionality of a data set and attempt to identify and model relationships between clusters in the data set, and can for example generate measures of cluster membership or identify hubs or nodes in or between clusters (for example using a technique referred to as weighted correlation network analysis, which can be applied to high-dimensional data sets, or using k-means clustering to cluster data by a measure of the Euclidean distance between each datum).

Semi-supervised learning is typically applied to solve problems where there is a partially labelled data set, for example where only a subset of the data is labelled. Semi-supervised machine learning makes use of externally provided labels and objective functions as well as any implicit data relationships. When initially configuring a machine learning system, particularly when using a supervised machine learning approach, the machine learning algorithm can be provided with some training data or a set of training examples, in which each example is typically a pair of an input signal/vector and a desired output value, label (or classification) or signal. The machine learning algorithm analyses the training data and produces a generalised function that can be used with unseen data sets to produce desired output values or signals for the unseen input vectors/signals. The user needs to decide what type of data is to be used as the training data, and to prepare a representative real-world set of data. The user must however take care to ensure that the training data contains enough information to accurately predict desired output values without providing too many features (which can result in too many dimensions being considered by the machine learning process during training and could also mean that the machine learning process does not converge to good solutions for all or specific examples). The user must also determine the desired structure of the learned or generalised function, for example whether to use support vector machines or decision trees. Unsupervised or semi-supervised machine learning approaches are sometimes used when labelled data is not readily available, or where the system generates new labelled data from unknown data given some initial seed labels.

Machine learning may be performed through the use of one or more of: a non-linear hierarchical algorithm; a neural network; a convolutional neural network; a recurrent neural network; a long short-term memory network; a multi-dimensional convolutional network; a memory network; a fully convolutional network; or a gated recurrent network. This allows a flexible approach when generating the predicted block of visual data. The use of an algorithm with a memory unit such as a long short-term memory network (LSTM), a memory network or a gated recurrent network can keep the state of the predicted blocks from motion compensation processes performed on the same original input frame. The use of these networks can improve computational efficiency and also improve temporal consistency in the motion compensation process across a number of frames, as the algorithm maintains some sort of state or memory of the changes in motion. This can additionally result in a reduction of error rates.

Developing a machine learning system typically consists of two stages: (1) training and (2) production. During training, the parameters of the machine learning model are iteratively changed to optimise a particular learning objective, known as the objective function or the loss. Once the model is trained, it can be used in production, where the model takes in an input and produces an output using the trained parameters.

During the training stage of neural networks, verified inputs are provided, and hence it is possible to compare the neural network’s calculated output with the correct output and then correct the network if need be. An error term or loss function for each node in the neural network can be established, and the weights adjusted, so that future outputs are closer to an expected result. Backpropagation techniques can also be used in the training schedule for the or each neural network.

The model can be trained using backpropagation and a forward pass through the network. The loss function is an objective that can be minimised; it is a measurement of the difference between the target value and the model’s output.

The cross-entropy loss may be used. The cross-entropy loss is defined as

$$L_{CE} = -\sum_{c=1}^{C} y_c \log(s_c)$$

where $C$ is the number of classes, $y_c \in \{0, 1\}$ is the binary indicator for class $c$, and $s_c$ is the score for class $c$.

In the multitask learning setting, the loss will consist of multiple parts, with a loss term for each task:

$$L(x) = \lambda_1 L_1 + \lambda_2 L_2$$

where $L_1$, $L_2$ are the loss terms for two different tasks and $\lambda_1$, $\lambda_2$ are weighting terms.
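A small numerical sketch of the two losses is given below. It assumes the standard cross-entropy form, the negative sum over classes of the indicator times the log of the score, with scores that are probabilities (for example softmax outputs), and a weighted sum of two task losses. The function names, the default weights of 1.0 and the clipping constant are hypothetical.

```python
import math

def cross_entropy(y, s, eps=1e-12):
    """Cross-entropy for one sample: y is a one-hot indicator, s the class scores."""
    # Scores are clipped at eps to avoid log(0); this is an implementation choice.
    return -sum(y_c * math.log(max(s_c, eps)) for y_c, s_c in zip(y, s))

def multitask_loss(loss_1, loss_2, lambda_1=1.0, lambda_2=1.0):
    """Weighted sum of the loss terms for two tasks."""
    return lambda_1 * loss_1 + lambda_2 * loss_2
```

With a one-hot indicator, only the score of the true class contributes, so `cross_entropy([0, 1], [0.2, 0.8])` equals `-log(0.8)`.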

Any system features as described herein may also be provided as method features, and vice versa. As used herein, means plus function features may be expressed alternatively in terms of their corresponding structure.

Any feature in one aspect may be applied to other aspects, in any appropriate combination. In particular, method aspects may be applied to system aspects, and vice versa. Furthermore, any, some and/or all features in one aspect can be applied to any, some and/or all features in any other aspect, in any appropriate combination. It should also be appreciated that particular combinations of the various features described and defined in any aspects of the invention can be implemented and/or supplied and/or used independently.